perf: use lru_cache for polstr and variants #1250

steven-murray · 2023-01-18T16:39:06Z

Description

Applies the lru_cache decorator to some utility functions, especially polnum2str and its variants. These functions are good candidates for caching: they accept only a small set of possible inputs, so the cache needs very little extra memory.
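For illustration, here is a minimal sketch of the idea, not the code in this PR. The lookup table is a small illustrative subset, and `polnum2str_cached` is a hypothetical name chosen for the example.

```python
from functools import lru_cache

# Illustrative subset of a polarization lookup table (an assumption for this
# sketch, not the full table used by pyuvdata).
_POL_NUM2STR = {-5: "xx", -6: "yy", -7: "xy", -8: "yx"}


@lru_cache(maxsize=None)
def polnum2str_cached(pol_num: int) -> str:
    """Return the polarization string for an integer polarization number."""
    return _POL_NUM2STR[pol_num]


# Repeated calls with the same argument are served from the cache.
for _ in range(1000):
    polnum2str_cached(-5)

info = polnum2str_cached.cache_info()
print(info.hits, info.misses)  # 999 hits, 1 miss
```

Because only a handful of distinct inputs exist, the cache stays tiny even with `maxsize=None`.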

Motivation and Context

It turns out that these functions can account for a large share of the total runtime of some hera_cal scripts, because they are called every time you request a baseline-pol key from a HERAData object. If you loop over all baselines in a file to get their data, you end up calling polstr2num roughly 65k times. Since polstr2num does a deepcopy of a dict on every call, the cost adds up: for example, the delay filter was taking ~20 min for a 2-integration file, of which about 7 min was spent in polstr2num.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation change (documentation changes only)
Version change
Build or continuous integration change

Checklist:

I have read the contribution guide.
My code follows the code style of this project.

Bug fix checklist:

My fix includes a new test that breaks as a result of the bug (if possible).
All new and existing tests pass.
I have updated the CHANGELOG.

steven-murray · 2023-01-18T21:14:09Z

Ech. It's not working because the polarization numbers can be input as arrays, which I didn't realize. Currently this works fine for hera_cal because we never do this, but it won't work in general for pyuvdata. Unless you have a good idea about how to do this more generally, @bhazelton or @mkolopanis, I'll just close this PR and do it in hera_cal.
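The failure mode can be reproduced with the standard library alone. This is a sketch with a hypothetical `lookup` function; a plain list stands in for a numpy array here, since both are unhashable and `lru_cache` needs hashable arguments to build its cache key.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def lookup(pol):
    """Hypothetical cached lookup; the table is illustrative only."""
    return {-5: "xx", -6: "yy"}[pol]


lookup(-5)  # hashable scalar input: works

try:
    lookup([-5, -6])  # unhashable input: lru_cache cannot build a key
except TypeError as err:
    message = str(err)
    print(message)
```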

bhazelton · 2023-01-18T21:34:17Z

Can we ditch the deep copy operation?

steven-murray · 2023-01-19T16:09:09Z

@bhazelton possibly, I'll have a check of the logic.

bhazelton · 2023-01-19T18:18:36Z

@plaplant suggests creating a private function that carries the decorator and having the existing public function call it, converting any array or list input to a tuple as needed.

codecov · 2023-01-19T22:15:02Z

Codecov Report

Merging #1250 (82113bf) into main (5c3cf17) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1250   +/-   ##
=======================================
  Coverage   99.91%   99.91%           
=======================================
  Files          33       33           
  Lines       18655    18677   +22     
=======================================
+ Hits        18639    18661   +22     
  Misses         16       16

Impacted Files	Coverage Δ
pyuvdata/utils.py	`100.00% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

steven-murray · 2023-01-30T20:26:38Z

@bhazelton I added @plaplant as a reviewer as well, since he had ideas about this :-) But in any case it should be ready to go.

bhazelton

Looks reasonable to me, not sure why pre-commit is so unhappy...

CHANGELOG.md

bhazelton · 2023-01-30T21:30:33Z

setup.cfg

@@ -5,7 +5,7 @@ description_file = README.md
 # B905 is for using zip without the `strict` argument, which was introduced in
 # python 3.10. We should probably add this check (remove it from the ignore) when we
 # require 3.10.
-ignore = W503, E203, N806, B905, B907
+ignore = W503, E203, N806, B905, B907, B028


I think B028 is the same as B907. It started out as B028 but was then renamed to B907 in flake8-bugbear version 23.1.17. We now require that version or newer, so I don't think we need B028 in this list. See: https://github.com/PyCQA/flake8-bugbear#23117

bhazelton

Looks good. Thanks!!

steven-murray requested a review from bhazelton January 18, 2023 16:39

steven-murray mentioned this pull request Jan 18, 2023

Delay filter speedup HERA-Team/hera_cal#863

Merged

bhazelton added enhancement UVData labels Jan 19, 2023

steven-murray force-pushed the cache-polnum2str branch from a45c41d to 0032ceb Compare January 30, 2023 20:25

steven-murray added 4 commits January 30, 2023 13:25

perf: use lru_cache for polstr and variants

5bab817

fix: use a new np-capable decorator instead of lru_cache

7e387f7

fix: import Iterable from typing

8021104

fix: import future annotations

b53faa0

steven-murray requested a review from plaplant January 30, 2023 20:26

style: no B028

0032ceb

bhazelton reviewed Jan 30, 2023

steven-murray added 2 commits January 30, 2023 14:39

maint: remove unnecessary B028

3ebf554

maint: use isort 5.12 to fix pre-commit

82113bf

bhazelton approved these changes Jan 31, 2023

bhazelton merged commit c2969e0 into main Jan 31, 2023

bhazelton deleted the cache-polnum2str branch January 31, 2023 22:11
