-
Notifications
You must be signed in to change notification settings - Fork 2k
Refactor CI flow. #3081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor CI flow. #3081
Conversation
tafsiri
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Na, I think I have the same question for most of these cloudbuilds (i reviewed the first two-but the questions are pretty much the same below). More things waitFor yarn-common rather than yarn that I would have thought. But I might be misunderstanding something. I can continue reviewing after understanding that structure.
Reviewed 4 of 21 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @lina128, @nsthorat, @pyu10055, and @tafsiri)
tfjs-backend-cpu/cloudbuild.yml, line 22 at r1 (raw file):
entrypoint: 'yarn' args: ['build-deps-ci'] waitFor: ['yarn-common']
shouldn't this wait on yarn (and not yarn common) to finish? else the deps may not be present in not modules.
tfjs-backend-cpu/cloudbuild.yml, line 38 at r1 (raw file):
entrypoint: 'yarn' args: ['lint'] waitFor: ['build']
I don't think linting needs to wait on building as it just scans sources right?
tfjs-backend-cpu/cloudbuild.yml, line 46 at r1 (raw file):
id: 'test-backend-cpu' args: ['test-ci'] waitFor: ['lint']
shouldn't this directly wait for build and build-deps to be finished as well? (i think expressing all the needed dependencies is useful rather than relying on what lint wait for)
tfjs-backend-wasm/cloudbuild.yml, line 22 at r1 (raw file):
id: 'build-deps' args: ['build-deps-ci'] waitFor: ['yarn-common']
same as above should this be yarn?
tfjs-backend-wasm/cloudbuild.yml, line 38 at r1 (raw file):
id: 'lint' args: ['lint'] waitFor: ['build']
same as above
tfjs-backend-wasm/cloudbuild.yml, line 46 at r1 (raw file):
id: 'test-wasm' args: ['test-ci'] waitFor: ['lint']
same as above
tfjs-node-gpu/cloudbuild.yml, line 78 at r1 (raw file):
id: 'ensure-cpu-gpu-packages-align' args: ['ensure-cpu-gpu-packages-align'] waitFor: ['yarn-common']
why yarn common and not yarn?
lina128
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Yannick, I replied inline. Thank you for your review.
Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @lina128, @nsthorat, @pyu10055, and @tafsiri)
tfjs-backend-cpu/cloudbuild.yml, line 22 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
shouldn't this wait on
yarn(and not yarn common) to finish? else the deps may not be present in not modules.
yarn && yarn build-deps or yarn build-deps && yarn has the same effect. What yarn does is just symlink the package without checking whether the dist folder is there. yarn build-deps builds the package and generate dist, the symlink to the package (not the dist) can happen before it or after it. They can run completely in parallel.
tfjs-backend-cpu/cloudbuild.yml, line 38 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
I don't think linting needs to wait on building as it just scans sources right?
Changed to ['yarn', 'build-deps']
tfjs-backend-cpu/cloudbuild.yml, line 46 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
shouldn't this directly wait for build and build-deps to be finished as well? (i think expressing all the needed dependencies is useful rather than relying on what lint wait for)
This is a trick to save CI resource. lint is a lightweight test, whereas unit/integration tests are heavy tests, especially those involving browserstack. If lint fails, there's no reason to start the heavy tests. Each of the package has its own process, what I observed is that if one package has an error, the parent process that kicks off these children process gets cancelled, but the other processes don't know this cancellation, so they keep on going, although whether they pass or not is meaningless at that point. This trick doesn't help solving that problem, but at least it prevents the use of browserstack if we know that any of the lightweight tests will fail.
tfjs-node-gpu/cloudbuild.yml, line 78 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
why yarn common and not yarn?
It is just comparing package.json, no need to actually install anything. This again is a lightweight test, if it will fail, let it fail early, so that heavy tests don't need to run unnecessarily.
tafsiri
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 21 files at r1, 11 of 11 files at r2.
Reviewable status:complete! 1 of 1 approvals obtained (waiting on @dsmilkov, @lina128, @nsthorat, and @pyu10055)
tfjs-backend-cpu/cloudbuild.yml, line 22 at r1 (raw file):
Previously, lina128 (Na Li) wrote…
yarn && yarn build-depsoryarn build-deps && yarnhas the same effect. What yarn does is just symlink the package without checking whether the dist folder is there. yarn build-deps builds the package and generate dist, the symlink to the package (not the dist) can happen before it or after it. They can run completely in parallel.
Discussed offline. Understood.
tfjs-backend-cpu/cloudbuild.yml, line 46 at r1 (raw file):
Previously, lina128 (Na Li) wrote…
This is a trick to save CI resource. lint is a lightweight test, whereas unit/integration tests are heavy tests, especially those involving browserstack. If lint fails, there's no reason to start the heavy tests. Each of the package has its own process, what I observed is that if one package has an error, the parent process that kicks off these children process gets cancelled, but the other processes don't know this cancellation, so they keep on going, although whether they pass or not is meaningless at that point. This trick doesn't help solving that problem, but at least it prevents the use of browserstack if we know that any of the lightweight tests will fail.
Cool. Talked about it offline, i think adding build and build-deps is good for clarity, but i agree with you that keeping it lightweight and getting early failure is good. So as long as adding those doesn't make it slower then go ahead. Else feel free to keep it this way.
lina128
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @dsmilkov, @nsthorat, @pyu10055, and @tafsiri)
tfjs-backend-cpu/cloudbuild.yml, line 46 at r1 (raw file):
Previously, tafsiri (Yannick Assogba) wrote…
Cool. Talked about it offline, i think adding build and build-deps is good for clarity, but i agree with you that keeping it lightweight and getting early failure is good. So as long as adding those doesn't make it slower then go ahead. Else feel free to keep it this way.
Done. Added yarn && yarn build-deps for clarity.
Move flows from test-ci to cloudbuild for each package.
Background: Much of the flows were originally in cloudbuild, due to dependency structure change migration, some of the flows need to be moved to test-ci so that integration test can work again, for detail, see #2913. Now the unnecessary integration test is removed, flows can be moved back to cloudbuild. This will allow more parallel runs, therefore reducing test time.
This change is