Skip to content

Consider which CI jobs run on each trigger and which notebooks each job executes #175

@troyraen

Description

@troyraen

Summary

Currently, we run 6-15+ jobs simultaneously for each CI trigger which all execute the notebooks. In addition, some jobs execute the notebooks in parallel, and some notebooks execute calls in parallel. I think this is resulting in too many simultaneous calls to IRSA. My sense is that an appropriate max number of simultaneous calls is around 10.

Details

Trigger Jobs Runs in Forks[*] Total
PR to main • 1x build-docs (CircleCI)
• 5x test matrix (GHA)
No 6
Push to main • 1x publish_html (GHA)
• 1x build-docs (CircleCI)
• 5x test matrix (GHA)
No 7
PR to any branch • 5x test matrix (GHA) No 5
Weekly cron • 5x test matrix (GHA) Yes 15+

[*] This is what runs in my fork and one other that I know of. Not sure if there are more.

Two things compounding the problem:

  1. Some jobs execute the notebooks in parallel -- is this all GHA jobs?
  2. Some notebooks make calls in parallel. For example, the spherex_cutouts.md notebook uses max_workers=10 to get cutouts in parallel. This means we're likely making between 60 - 150 simultaneous calls for data from that notebook alone, plus calls from the other notebooks.

So I'm concerned that we're being unfriendly at best, and may actually be causing problems for ourselves (execution failures), IRSA, and/or other users. This is especially true if IRSA is already experiencing heavy load from other users, as has been the case recently for SPHEREx data in particular.

Changes that could help

Sorted by impact. I'm looking for help with this table, both correcting what's there (esp. impact and implementation estimates) and adding new rows/ideas. @bsipocz and others, what do you think?

Done Trigger Job Change Impact (hi/med/low) Implementation (easy/hard)
A [x] #160 Weekly cron test matrix stop these from running in forks hi easy
B [ ] Weekly cron test matrix Spread out so one matrix cell runs each day med/hi easy
C [ ] PR to main, Push to main, and PR to any branch test matrix IMO, running the test matrix should be done once before pushing to main, but running every time a commit is pushed to a PR is too much. Also, it's not necessary to run again after merge (on push to main) if it's been run before merging. How can we accomplish that? Some ideas:
1. Remove from trigger "PR to main"; only run it on "Push to main". It wouldn't run until after merge and we'd have to make sure to monitor the results and respond to failures. So, non-ideal, but maybe an acceptable trade off?
2. Implement a GH label to skip the test matrix job. Auto-add it to PRs when opened and require it to be removed (and test matrix run) before merging. I think(?) all this is doable in the config files, so maybe a better option than the above?
med/hi ?
D [x] #176 Push to main build-docs remove from this trigger low/med easy
E [ ] #143 PR to main build-docs only execute notebooks with changes.
render the others with no output
low/med easy? (shell hack, diff with main and add non-changed notebooks to excluded list?)
F [ ] Push to main publish_html only execute notebooks with changes.
use cache for the rest
low/med ? (how to implement the caching?)

Metadata

Metadata

Labels

help wantedExtra attention is neededinfrastructureInfrastructure related issues/PRs.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions