Vectorize DotFeatures covariance/mean calculation #3908

micmn · 2017-07-06T14:02:25Z

It would also be nice to parallelize the main covariance loop, but it is not obvious how to do that properly, since it works on one vector at a time. I tried a few variants with OpenMP, but none of them improved performance. The straightforward way to speed up get_cov would be to compute it as a matrix product (I tried this: approx. 4x on 4 threads), but that requires storing the centered data matrix, which the current algorithm does not need.
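To make the trade-off concrete, here is a minimal standalone sketch of the two variants. It is not Shogun's code: `Matrix`, `cov_streaming`, `cov_matrix_product`, and `max_abs_diff` are hypothetical names, rows are assumed to be feature vectors, and normalizing by n (rather than n - 1) is an assumption. The naive triple loop in the second variant only stands in for whatever optimized matrix-multiply routine would be used in practice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix; each row is one feature vector.
struct Matrix
{
    std::size_t rows, cols;
    std::vector<double> values;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), values(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return values[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return values[i * cols + j]; }
};

// Per-dimension mean over all rows.
std::vector<double> mean(const Matrix& data)
{
    assert(data.rows > 0);
    std::vector<double> mu(data.cols, 0.0);
    for (std::size_t i = 0; i < data.rows; ++i)
        for (std::size_t j = 0; j < data.cols; ++j)
            mu[j] += data(i, j);
    for (std::size_t j = 0; j < data.cols; ++j)
        mu[j] /= static_cast<double>(data.rows);
    return mu;
}

// Variant A: one vector at a time. Only a single centered vector is kept,
// so the extra memory is O(d).
Matrix cov_streaming(const Matrix& data)
{
    const std::vector<double> mu = mean(data);
    Matrix cov(data.cols, data.cols);
    std::vector<double> centered(data.cols);
    for (std::size_t i = 0; i < data.rows; ++i)
    {
        for (std::size_t j = 0; j < data.cols; ++j)
            centered[j] = data(i, j) - mu[j];
        // Accumulate the outer product of the centered vector with itself.
        for (std::size_t j = 0; j < data.cols; ++j)
            for (std::size_t k = 0; k < data.cols; ++k)
                cov(j, k) += centered[j] * centered[k];
    }
    for (double& v : cov.values)
        v /= static_cast<double>(data.rows);
    return cov;
}

// Variant B: materialize the centered matrix C (n x d), then cov = C^T C / n.
// This needs a second copy of the data, i.e. O(n * d) extra memory.
Matrix cov_matrix_product(const Matrix& data)
{
    const std::vector<double> mu = mean(data);
    Matrix centered(data.rows, data.cols);
    for (std::size_t i = 0; i < data.rows; ++i)
        for (std::size_t j = 0; j < data.cols; ++j)
            centered(i, j) = data(i, j) - mu[j];

    Matrix cov(data.cols, data.cols);
    for (std::size_t j = 0; j < data.cols; ++j)
        for (std::size_t k = 0; k < data.cols; ++k)
        {
            double sum = 0.0;
            for (std::size_t i = 0; i < data.rows; ++i)
                sum += centered(i, j) * centered(i, k);
            cov(j, k) = sum / static_cast<double>(data.rows);
        }
    return cov;
}

// Largest element-wise absolute difference between two equally sized matrices.
double max_abs_diff(const Matrix& a, const Matrix& b)
{
    assert(a.rows == b.rows && a.cols == b.cols);
    double worst = 0.0;
    for (std::size_t i = 0; i < a.values.size(); ++i)
        worst = std::fmax(worst, std::fabs(a.values[i] - b.values[i]));
    return worst;
}
```

Both functions return the same matrix up to rounding. The difference is that variant B expresses the whole computation as a single product over a stored copy of the centered data, which is the memory cost mentioned above.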

karlnapf · 2017-07-06T14:11:22Z

Maybe offer a flag for that? And do some clever auto-setting of it...

codecov · 2017-07-06T14:49:58Z

Codecov Report

Merging #3908 into develop will decrease coverage by 0.04%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           develop    #3908      +/-   ##
===========================================
- Coverage    53.29%   53.24%   -0.05%     
===========================================
  Files         1432     1432              
  Lines       104428   104422       -6     
===========================================
- Hits         55652    55597      -55     
- Misses       48776    48825      +49

| Impacted Files | Coverage Δ |
| --- | --- |
| src/shogun/features/DotFeatures.cpp | `78.86% <100%> (-0.64%)` ⬇️ |
| src/gpl/shogun/lib/external/libqp_gsmo.cpp | `19.73% <0%> (-65.79%)` ⬇️ |
| ...g/kernelselection/internals/OptimizationSolver.cpp | `84.21% <0%> (-8.78%)` ⬇️ |
| src/shogun/optimization/liblinear/tron.cpp | `86.71% <0%> (-0.7%)` ⬇️ |
| src/shogun/base/Parameter.cpp | `55.38% <0%> (+0.05%)` ⬆️ |
| src/shogun/machine/KernelMachine.cpp | `79.23% <0%> (+0.31%)` ⬆️ |
| src/shogun/mathematics/linalg/LinalgNamespace.h | `50.46% <0%> (+2.31%)` ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d5ab2f2...672d36f.

karlnapf · 2017-07-09T15:35:13Z

I'd be keen to have a flag that improves speed substantially here, trading off with storing the dataset twice. Most of the time that is fine anyways.

Thoughts?

micmn · 2017-07-09T15:52:00Z

Yeah, seems reasonable. What about the auto setting you mentioned? Any suggestions?

karlnapf · 2017-07-09T18:26:46Z

For now, I would just offer a flag "copy_data_for_speed", which would be appropriately documented and enabled by default.

Vectorize DotFeatures covariance/mean calculation.

672d36f

vigsterkr merged commit 41129f8 into shogun-toolbox:develop Jul 7, 2017
