Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: use lru_cache for polstr and variants #1250

Merged
merged 7 commits into from
Jan 31, 2023
Merged

Conversation

steven-murray
Copy link
Contributor

Description

Applies the lru_cache decorator to some utility functions, especially the polnum2str function and its variants. These are good functions for this decorator, as they have a very limited set of possible input parameters, so do not take much extra cache memory.

Motivation and Context

It turns out that these functions can take a LOT of the total time for some hera_cal scripts, because they are called every time you request a baseline-pol key from a HERAData object. Thus, if you loop over all baselines in a file to get their data, you end up calling polstr2num like 65k times. Since polstr2num does a deepcopy of a dict, it actually takes a non-negligible amount of time (for example, the delay filter was taking ~20min for a 2-integration file, of which about 7min was taken by the polstr2num function).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation change (documentation changes only)
  • Version change
  • Build or continuous integration change

Checklist:

Bug fix checklist:

  • My fix includes a new test that breaks as a result of the bug (if possible).
  • All new and existing tests pass.
  • I have updated the CHANGELOG.

@steven-murray
Copy link
Contributor Author

Ech. It's not working because the polarization numbers can be input as arrays, which I didn't realize. Currently this works fine for hera_cal because we never do this, but it won't work in general for pyuvdata. Unless you have a good idea about how to do this more generally, @bhazelton or @mkolopanis, I'll just close this PR and do it in hera_cal.

@bhazelton
Copy link
Member

can we ditch the deep copy operation?

@steven-murray
Copy link
Contributor Author

@bhazelton possibly, I'll have a check of the logic.

@bhazelton
Copy link
Member

@plaplant suggests creating a private function with the decorator to be called by the existing function which can convert from an array or list to a tuple as needed.

@codecov
Copy link

codecov bot commented Jan 19, 2023

Codecov Report

Merging #1250 (82113bf) into main (5c3cf17) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1250   +/-   ##
=======================================
  Coverage   99.91%   99.91%           
=======================================
  Files          33       33           
  Lines       18655    18677   +22     
=======================================
+ Hits        18639    18661   +22     
  Misses         16       16           
Impacted Files Coverage Δ
pyuvdata/utils.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5c3cf17...82113bf. Read the comment docs.

@steven-murray
Copy link
Contributor Author

@bhazelton I added @plaplant as a reviewer as well, since he had ideas about this :-) But in any case it should be ready to go.

Copy link
Member

@bhazelton bhazelton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me, not sure why pre-commit is so unhappy...

CHANGELOG.md Show resolved Hide resolved
setup.cfg Outdated
@@ -5,7 +5,7 @@ description_file = README.md
# B905 is for using zip without the `strict` argument, which was introduced in
# python 3.10. We should probably add this check (remove it from the ignore) when we
# require 3.10.
ignore = W503, E203, N806, B905, B907
ignore = W503, E203, N806, B905, B907, B028
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think B028 is the same as B907. It started out as B028 but then was converted to B907 in flake8-bugbear version 23.1.17 . We now require that version or newer, so I don't think we need B028 in this list. See: https://github.com/PyCQA/flake8-bugbear#23117

Copy link
Member

@bhazelton bhazelton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!!

@bhazelton bhazelton merged commit c2969e0 into main Jan 31, 2023
@bhazelton bhazelton deleted the cache-polnum2str branch January 31, 2023 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants