ci: gate hpc by request #9198
d85cfc6 to e5ba597
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##             main    #9198     +/-   ##
==========================================
+ Coverage   44.56%   44.58%   +0.01%
==========================================
  Files        1275     1275
  Lines      156216   156216
  Branches     2451     2451
==========================================
+ Hits        69625    69647     +22
+ Misses      86351    86329     -22
  Partials      240      240
.circleci/real_config.yml
Outdated
@@ -4207,11 +4207,16 @@ workflows:
only:
- main

- request-e2e-hpc-tests:
These tests won't run on main automatically with this change, right?
I think we need to duplicate the jobs into test-e2e-longrunning and put request-e2e-hpc-tests there.
.circleci/real_config.yml
Outdated
@@ -4286,6 +4293,7 @@ workflows:
mark: ["e2e_slurm and not parallel and not gpu_required"]
requires:
- build-go
- request-e2e-hpc-tests
I think we have another layer of test skipping that only runs certain Slurm tests based on GitHub PR labels:
https://github.com/determined-ai/determined/blob/e5ba597858bcc6cf37d6db42d2720a21a9c3e10e/.circleci/real_config.yml#L2556-L2561C15
test-e2e-hpc-gcp will run every time though?
Personally, I'm not against just removing the label-based running mechanism.
I'll give this a look
.circleci/real_config.yml
Outdated
host: localhost
port: 5432
name: determined
password: launcher
we should use circleci env vars for passwords
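One way to follow this suggestion, sketched in Python: read the password from the environment at runtime instead of committing it in the config. The variable name `LAUNCHER_DB_PASSWORD` and the helper `launcher_db_config` are hypothetical and not part of this PR; the host, port, and database name are copied from the diff above. The sketch assumes the secret is defined as a CircleCI project or context environment variable so that the job's process can see it.

```python
import os


def launcher_db_config(env=os.environ):
    """Build the launcher DB settings, taking the password from the environment.

    LAUNCHER_DB_PASSWORD is a hypothetical variable name used for illustration.
    """
    password = env.get("LAUNCHER_DB_PASSWORD")
    if not password:
        # Fail loudly rather than falling back to a hardcoded default.
        raise RuntimeError("LAUNCHER_DB_PASSWORD is not set")
    return {
        "host": "localhost",
        "port": 5432,
        "name": "determined",
        "password": password,
    }
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real process environment.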
just one NIT.
- unless:
condition:
or:
- equal: [ true, <<parameters.always-run>> ]
I think always-run is unused now?
good point, will remove
mark: ["e2e_pbs and not parallel and not gpu_required"]
extra-pytest-flags: ["-k 'not test_slurm_verify_home'"]
requires:
- build-go
I think the dependencies can be fixed pretty easily, but I'm not sure what you meant by build-go would get run twice on a commit. Would it be an issue just to move it outside of the request?
For example, build-react runs in e2e-tests and also in test-e2e-longrunning. In a perfect world we would only build Go / React once per commit. These tests run on every commit, so rebuilding each time is wasteful.
I don't really know an easy solution besides duplicating build-go, so if you just want to move it outside of the request, that is fine too.
@NicholasBlaskey sorry got caught up in release stuff. I think I fixed the dependencies issue and updated
c56cb45 to 2fef7b2
LGTM
1c0c13e to 7358578
Ticket
Description
Quick solution so that Slurm tests aren't run by default on every push. Ideally we do not want to run HPC tests on every commit, to reduce spend and runtime, so for now we ask users to go into CircleCI and manually approve requests to run these tests. A future improvement is to decide automatically which tests to run based on the modified files.
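The future improvement mentioned here could look roughly like the following Python sketch, which decides whether HPC tests are needed from a list of changed file paths. This is an illustration only and not part of this PR: the function name `needs_hpc_tests` is hypothetical, and the caller has to supply the path prefixes, since the set of HPC-relevant directories is not stated here.

```python
from pathlib import PurePosixPath


def needs_hpc_tests(changed_files, prefixes):
    """Return True if any changed file lives under one of the given prefixes.

    Matching is done on whole path components, so the prefix "hpc" matches
    "hpc/launcher.go" but not "hpcx/file.go".
    """
    prefix_parts = [PurePosixPath(p).parts for p in prefixes]
    for path in changed_files:
        parts = PurePosixPath(path).parts
        for wanted in prefix_parts:
            if parts[:len(wanted)] == wanted:
                return True
    return False
```

A CI step could feed this the output of `git diff --name-only` against the base branch and skip the approval gate when it returns False; the prefixes used for that would need to be chosen by the maintainers.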
Test Plan
Check CircleCI and make sure that, under test-e2e, the Slurm and HPC tests are not running and are awaiting approval.
Checklist
docs/release-notes/. See Release Note for details.