Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorize DotFeatures covariance/mean calculation #3908

Merged
merged 1 commit into from Jul 7, 2017

Conversation

micmn
Copy link
Contributor

@micmn micmn commented Jul 6, 2017

Would be nice also to parallelize the covariance main loop but is not obvious how to do it properly since it works on one vector at a time, tried a few variants with openmp but it didn’t improve; the straightforward way to speed-up get_cov would be to compute it by matrix prod (tried, approx. 4X on 4 threads), but this requires the storage of the centered data matrix while the current algorithm doesn’t need that.

@karlnapf
Copy link
Member

karlnapf commented Jul 6, 2017

Maybe offer a flag for that? And so some clever auto settings of it....

@codecov
Copy link

codecov bot commented Jul 6, 2017

Codecov Report

Merging #3908 into develop will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3908      +/-   ##
===========================================
- Coverage    53.29%   53.24%   -0.05%     
===========================================
  Files         1432     1432              
  Lines       104428   104422       -6     
===========================================
- Hits         55652    55597      -55     
- Misses       48776    48825      +49
Impacted Files Coverage Δ
src/shogun/features/DotFeatures.cpp 78.86% <100%> (-0.64%) ⬇️
src/gpl/shogun/lib/external/libqp_gsmo.cpp 19.73% <0%> (-65.79%) ⬇️
...g/kernelselection/internals/OptimizationSolver.cpp 84.21% <0%> (-8.78%) ⬇️
src/shogun/optimization/liblinear/tron.cpp 86.71% <0%> (-0.7%) ⬇️
src/shogun/base/Parameter.cpp 55.38% <0%> (+0.05%) ⬆️
src/shogun/machine/KernelMachine.cpp 79.23% <0%> (+0.31%) ⬆️
src/shogun/mathematics/linalg/LinalgNamespace.h 50.46% <0%> (+2.31%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5ab2f2...672d36f. Read the comment docs.

@vigsterkr vigsterkr merged commit 41129f8 into shogun-toolbox:develop Jul 7, 2017
@karlnapf
Copy link
Member

karlnapf commented Jul 9, 2017

I'd be keen to have a flag that improves speed substantially here, trading off with storing the dataset twice. Most of the time that is fine anyways.

Thoughts?

@micmn
Copy link
Contributor Author

micmn commented Jul 9, 2017 via email

@karlnapf
Copy link
Member

karlnapf commented Jul 9, 2017

For now, I would just offer a flag "copy_data_for_speed" which would be appropriately documented, and enabled by default

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants