Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numba-based StdDevUDF #640

Merged
merged 14 commits into from Mar 2, 2020
Merged

Numba-based StdDevUDF #640

merged 14 commits into from Mar 2, 2020

Conversation

uellue
Copy link
Member

@uellue uellue commented Feb 22, 2020

This version is slightly about 2x faster compared to the version in #625 for the benign case of signal buffers that fit comfortably into the CPU caches.

For the malign case of signal buffers that don't fit the caches well,
it is about 10x faster since it eliminates large intermediate buffers and
calculates each pixel in one pass the result in cache-friendly blocks and CPU -friendly access patterns.

  • Added test for numerical stability.
  • Added citation in bibtex

Contributor Checklist:

  • I have added myself to the creators.json file
  • I have added/updated documentation for all user-facing changes
  • I have added/updated test cases

This version is slightly faster for the benign case of signal buffers
that fit comfortably into the CPU caches.

For the malign case of signal buffers that don't fit the caches well,
it is 5x-7x faster since it eliminates large intermediate buffers and
calculates each pixel in one pass.

* Added test for numerical stability.
* Added citation
@sk1p sk1p added this to the 0.5 milestone Feb 22, 2020
@sk1p sk1p added this to In progress in Performance via automation Feb 22, 2020
A second pass or citation in main text seem to fix the issue with undefined reference -- perhaps
BibTeX cache not generated when citing in docstring?
@codecov
Copy link

codecov bot commented Feb 24, 2020

Codecov Report

Merging #640 into master will increase coverage by 0.35%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #640      +/-   ##
==========================================
+ Coverage   58.13%   58.49%   +0.35%     
==========================================
  Files         210      210              
  Lines        8012     8068      +56     
  Branches     1049     1065      +16     
==========================================
+ Hits         4658     4719      +61     
+ Misses       3163     3160       -3     
+ Partials      191      189       -2     
Impacted Files Coverage Δ
src/libertem/web/messages.py 84.48% <0.00%> (-1.73%) ⬇️
src/libertem/web/jobs.py 65.83% <0.00%> (-0.84%) ⬇️
client/src/dataset/helpers.ts 0.00% <0.00%> (ø) ⬆️
client/src/dataset/components/SERParamsForm.tsx 0.00% <0.00%> (ø) ⬆️
client/src/dataset/components/FRMS6ParamsForm.tsx 0.00% <0.00%> (ø) ⬆️
...lient/src/dataset/components/RawFileParamsForm.tsx 0.00% <0.00%> (ø) ⬆️
src/libertem/api.py 87.78% <0.00%> (+0.09%) ⬆️
src/libertem/io/dataset/base.py 89.26% <0.00%> (+0.93%) ⬆️
src/libertem/io/dataset/raw.py 92.68% <0.00%> (+8.75%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 05edd76...8dfc2ec. Read the comment docs.

CI error not reproducible locally.
This reverts commit 5af3fca.
:cite:`Schubert2018` glitches in CI in docstrings and some RST files.

Hard to reproduce the CI behavior
Copy link
Member

@sk1p sk1p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments - most importantly I think the docstrings, variable naming and thus readability can be improved - the general structure and implementation looks good!

src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
tests/udf/test_stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Show resolved Hide resolved
Use permanent DOI instead of 0.1.0
Thx @sk1p for the feedback!
@uellue
Copy link
Member Author

uellue commented Feb 28, 2020

Thx @sk1p for the feedback! This should be ready for a second round of review.

@uellue
Copy link
Member Author

uellue commented Feb 28, 2020

The Numba coverage for process_tile was not counted for some reason. It did work before, with the exception of not passing through the blocked part of the loop since the input data was too small.

@sk1p sk1p mentioned this pull request Feb 29, 2020
Copy link
Member

@sk1p sk1p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done, and impressive speed-up! With the improved doc strings and variable names, the code is much better readable. Two more suggestions added, other than those this is ready to be merged.

src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
src/libertem/udf/stddev.py Outdated Show resolved Hide resolved
tests/udf/test_stddev.py Outdated Show resolved Hide resolved
@sk1p
Copy link
Member

sk1p commented Feb 29, 2020

The Numba coverage for process_tile was not counted for some reason. It did work before, with the exception of not passing through the blocked part of the loop since the input data was too small.

coverage issues are best debugged locally (I don't trust codecov... the travis job says coverage was uploaded, but apparently it was not correctly merged into the full coverage report, sigh.) - did you try tox -e numba_coverage and checking the result in the htmlcov directory?

@uellue
Copy link
Member Author

uellue commented Mar 2, 2020

did you try tox -e numba_coverage and checking the result in the htmlcov directory?

Good point! The function was actually never called since the partition did fit in a single tile. I've adapted the parameters accordingly. At least locally, the coverage is 100 % now.

@sk1p sk1p merged commit 425b5e2 into LiberTEM:master Mar 2, 2020
Performance automation moved this from In progress to Done Mar 2, 2020
@sk1p
Copy link
Member

sk1p commented Mar 2, 2020

Perfect, thanks, merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Performance
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants