Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of timerange extraction in scraper #6101

Merged
merged 3 commits into from Apr 26, 2022

Conversation

dstansby
Copy link
Member

@dstansby dstansby commented Apr 24, 2022

I was digging into why some SUVI tests take so long to run (even with vcrpy, ie. ignoring the remote response times), and when there are a lot of URLs to parse, get_timerange_from_exdict is the bottleneck, as it is run on every URL. Most of the time was spent in astropy.time.Time logic, but this is unnesseary as this code doesn't use any special features of Time, so:

  • switch to just using datetime
  • pre-compute the deltas
  • re-order the many if statements so they're not all executed if seconds are specified in the timestamp

This leads to a ~50% improvement in running the scraper for me. On some SUVI searches in the current test suite, this reudces their runtime by ~3 seconds.

@dstansby dstansby force-pushed the scraper-perf branch 2 times, most recently from 4d48739 to a55f245 Compare April 24, 2022 09:09
@dstansby dstansby marked this pull request as ready for review April 25, 2022 11:29
@dstansby dstansby requested a review from a team as a code owner April 25, 2022 11:29
@samaloney
Copy link
Contributor

This looks great - I have a vague recollection of something about leap seconds which was why we use astropy.time.Time but I can't remember or see how it would make a difference.

@nabobalis nabobalis added net Affects the net submodule CodeFix labels Apr 26, 2022
@nabobalis
Copy link
Contributor

pre-commit.ci autofix

@nabobalis nabobalis added the Merge When CI Passes Hit that merge button when it's all green! label Apr 26, 2022
@nabobalis
Copy link
Contributor

The SRS remote tests are failing on main. Otherwise the other online fails are flaky ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merge When CI Passes Hit that merge button when it's all green! net Affects the net submodule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants