Fix native builds on GitHub Actions using Bazel cache #240

Merged Mar 12, 2021 (26 commits)

karllessard
Collaborator

This PR fixes most of the issues we are facing when building/releasing on GitHub Actions. Important updates are:

  • Enable Bazel caching on a GCP bucket
  • Better cleanup of CUDA resources

The Bazel cache reuses compiled binaries (at the file level) that were built in a previous workflow when the whole native tree (TF) is detected as unchanged (it basically hashes the content of the files plus the build configuration). Multiple runs might be required after modifying the TF source tree, since a build can still take over 6 hours in some cases, but the second run will resume from where the previous one left off.
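To make the "hash the files plus the configuration" idea concrete, here is a minimal sketch of a content-derived cache key. It is only an illustration, not how Bazel computes its keys internally (Bazel works per build action and is far more fine-grained). The function name `cache_key` and its arguments are hypothetical, and the sketch assumes `sha256sum` from GNU coreutils is available.

```shell
#!/usr/bin/env bash
set -eu

# cache_key DIR CONFIG (hypothetical helper, for illustration only)
# Prints one SHA-256 digest derived from the path and content of every file
# under DIR, plus the CONFIG string. Any change to a file or to the
# configuration yields a different key; unchanged inputs yield the same key.
cache_key() {
  {
    # Hash each file; sort by path so the result does not depend on
    # the order in which find happens to list the files.
    (cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      sha256sum "$f"
    done)
    # Mix the build configuration into the key as well.
    printf 'config=%s\n' "$2"
  } | sha256sum | cut -d' ' -f1
}

# Example usage (paths and config string are placeholders):
#   cache_key ./native-src "linux-gpu"
```

A build that finds an existing entry under the same key can reuse the stored output instead of recompiling, which is why an unchanged source tree makes the second run so much faster.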

After the cache is set up, a full build on Linux with GPU takes about 25 minutes instead of 6 hours. See this run

Collaborator

@Craigacp Craigacp left a comment

LGTM.

@karllessard
Collaborator Author

Thanks @Craigacp, merging. Let's see how it goes with the deployment.

@karllessard karllessard merged commit 1bc3aa6 into master Mar 12, 2021
@karllessard
Collaborator Author

Bingo :) https://github.com/tensorflow/java/runs/2096897445?check_suite_focus=true

@rnett
Contributor

rnett commented Mar 12, 2021

This broke local builds with the error: build.sh: line 36: BUILD_EXTRA_FLAGS: unbound variable. BUILD_EXTRA_FLAGS doesn't seem to be used by the CI anywhere, though. Can we delete it?

@karllessard
Collaborator Author

It's issued by the Maven plugin executing that script. Normally, you shouldn't run build.sh manually.

@rnett
Contributor

rnett commented Mar 13, 2021

Yeah, I get that error when running maven install. Not sure why though, as that seems correct.

@karllessard
Collaborator Author

Mmh, maybe the plugin doesn't expose the environment variable if it's empty? Possible, since it always has a value in the CI. Anyway, I'm done building the artifacts of 0.3.0, which is not impacted by this, other than that users might have trouble if they want to rebuild 0.3.0 on their own... mmmh... OK, I'll take a look this weekend. I might do a fix and rebuild everything again; I haven't published the artifacts, so it's fine. Thanks.
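For reference, "unbound variable" is the message the shell prints when a script running with `set -u` (nounset) expands a variable that was never set. A minimal sketch of one common guard follows, assuming build.sh enables `set -u` (the thread does not confirm this): expanding with `${VAR:-}` substitutes an empty string when the variable is missing. The helper `extra_flags` and the flag value are hypothetical and not taken from the actual script.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: prints the extra build flags, or an empty string when
# BUILD_EXTRA_FLAGS was not exported by the caller. The ":-" default keeps
# "set -u" from aborting the script with "unbound variable".
extra_flags() {
  echo "${BUILD_EXTRA_FLAGS:-}"
}

# Case 1: the variable is not set at all (as in the reported local build).
unset BUILD_EXTRA_FLAGS
echo "flags when unset: '$(extra_flags)'"

# Case 2: the caller provides a value (placeholder flag for illustration).
BUILD_EXTRA_FLAGS="--some-hypothetical-flag"
echo "flags when set: '$(extra_flags)'"
```

With this guard, the script behaves the same whether the caller omits the variable or sets it to an empty value.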

@karllessard karllessard deleted the bazel-cache branch March 13, 2021 15:27

4 participants