Feature: Hermetic GraalVM #35

Merged
merged 110 commits into from
Aug 23, 2023

Conversation

sgammon
Owner

@sgammon sgammon commented Aug 18, 2023

Summary

This changeset applies a few combined feature branches to fix a suite of issues ahead of 1.0.0. Altogether, the native-image build process should soon be fully hermetic.

Features:

  • Toolchain for cc now resolved via Bazel Toolchains
  • Ability to resolve native-image toolchain
  • New mapping generator alleviates need to manually keep mappings up to date
  • Support for component dependencies
  • Hermetic component downloads

Additional cleanup/fixes:

  • Fixes for some path issues on Windows
  • Locally-available strict/hermetic Bazel configurations
  • Fewer transitive module dependencies
  • Builds under most --strict mode flags; a flag migration was performed
  • Support for projects on Bazel 6
  • Support for projects on Bazel 5
  • Support for projects on Bazel 4 (via legacy rules)
  • Inline split legacy rules
  • Integration tests for hermetic compiler (note: coming soon)
  • Light integration/unit tests for mapper

Known issues:

  • SDKROOT and VSINSTALLDIR included in env (hermeticity violation)
  • Windows builds still fail due to Visual Studio resolution issues within the Bazel sandbox

How it Works

The mapping generator

There is a new Python script at //tools/scripts/mapping_generator. When run from the command line, it generates the file internal/graalvm_bindist_map.bzl. Roughly, the script works as follows:

Tool awareness:

  • Platforms (OS and arch pairs as they relate to GraalVM)
  • JDKs (for example, "Java 20")
  • Components (GraalWasm, GraalPython, Sulong, etc)
  • Distributions (Oracle GraalVM, GraalVM CE, EE, etc)
  • Versions (aligned GraalVM versions and specific releases, more on this later)

Script logic:

  • If needed, resolve available versions or latest version of GraalVM (via GitHub API)
  • Given inputs of (platforms, jdks, components, distributions, versions), generate artifact/hash URL pairs
  • Validate the artifact and hash URL liveness by performing a HEAD request against each
  • Download the resulting set of known-live artifact hash files
  • For each pair of (artifact, hash):
    • Generate a set of standard and extended Bazel constraint tags for each artifact
    • Render a Starlark mapping for the URL, hash, and compatible_with tags
  • Place rendered output in a generated file
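
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the generator's actual code: the URL template, the `.sha256` suffix, the `Artifact` type, the `GENERATED_MAP` name, and all function names are assumptions made for the example. Only the standard `@platforms` constraints are derived here, and the liveness filter accepts an injectable `check` so it can be exercised without network access.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Iterator, List, Tuple
import urllib.request

# Hypothetical URL template; the real generator's URL scheme may differ.
URL_TEMPLATE = (
    "https://example.invalid/graalvm/{dist}-{version}/"
    "{dist}-{jdk}-{platform}.tar.gz"
)


@dataclass(frozen=True)
class Artifact:
    platform: str  # e.g. "linux-x64"
    jdk: str       # e.g. "java17"
    dist: str      # e.g. "ce"
    version: str   # e.g. "17.0.8"

    @property
    def url(self) -> str:
        return URL_TEMPLATE.format(
            dist=self.dist, version=self.version,
            jdk=self.jdk, platform=self.platform,
        )

    @property
    def hash_url(self) -> str:
        # Assumption: the hash file sits next to the artifact.
        return self.url + ".sha256"


def candidates(platforms: Iterable[str], jdks: Iterable[str],
               dists: Iterable[str], versions: Iterable[str]) -> Iterator[Artifact]:
    """Enumerate every (platform, jdk, distribution, version) combination."""
    for platform, jdk, dist, version in product(platforms, jdks, dists, versions):
        yield Artifact(platform, jdk, dist, version)


def head_ok(url: str) -> bool:
    """Return True if a HEAD request against `url` succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


def live(artifacts: Iterable[Artifact],
         check: Callable[[str], bool] = head_ok) -> List[Artifact]:
    """Keep only artifacts whose artifact URL and hash URL both respond."""
    return [a for a in artifacts if check(a.url) and check(a.hash_url)]


def constraint_tags(artifact: Artifact) -> List[str]:
    """Derive standard Bazel platform constraints from e.g. 'linux-x64'."""
    os_name, arch = artifact.platform.split("-", 1)
    cpu = {"x64": "x86_64", "aarch64": "aarch64"}[arch]
    return [f"@platforms//os:{os_name}", f"@platforms//cpu:{cpu}"]


def render(entries: Iterable[Tuple[Artifact, str]]) -> str:
    """Render (artifact, sha256) pairs as a Starlark dict literal."""
    lines = ["GENERATED_MAP = {"]
    for artifact, sha256 in entries:
        key = f"{artifact.dist}_{artifact.version}_{artifact.jdk}_{artifact.platform}"
        tags = ", ".join(f'"{t}"' for t in constraint_tags(artifact))
        lines.append(f'    "{key}": {{')
        lines.append(f'        "url": "{artifact.url}",')
        lines.append(f'        "sha256": "{sha256}",')
        lines.append(f'        "compatible_with": [{tags}],')
        lines.append("    },")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

In this sketch, the downloaded hash value would be paired with each surviving artifact before calling `render`, and the returned string would be written to the generated `.bzl` file.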

But why?
The biggest maintenance burden on these rules is obtaining the artifact hashes for a new GraalVM release and updating the rules to know about it. With this script, we can at least keep these up to date in a fully repeatable manner. Later, we may be able to automate generation of this file entirely in GitHub Actions, in response to a GraalVM release.

Running the generator
Pass -h to the tool to see flags. Example invocation:

bazel run -- //:generator --tags 20.0.2 17.0.8 -o - | pbcopy

This command will generate URLs for:

- Platforms: macos-x64, macos-aarch64, linux-x64, linux-aarch64, windows-x64
- JDKs: java8, java11, java17, java20
- Components: native-image, js, wasm, python, llvm, ruby
- Distributions: ce, oracle
- Versions: ce-20.0.2, ce-17.0.8, oracle-20.0.2, oracle-17.0.8
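
How the `--tags` values fan out into the version labels listed above can be illustrated with a short sketch. The function name and the `{distribution}-{tag}` label format are inferred from the list above, not taken from the generator's source.

```python
from typing import Iterable, List


def expand_versions(distributions: Iterable[str], tags: Iterable[str]) -> List[str]:
    """Pair every distribution with every requested tag."""
    tags = list(tags)  # allow re-iteration for each distribution
    return [f"{dist}-{tag}" for dist in distributions for tag in tags]


versions = expand_versions(["ce", "oracle"], ["20.0.2", "17.0.8"])
print(versions)
# ['ce-20.0.2', 'ce-17.0.8', 'oracle-20.0.2', 'oracle-17.0.8']
```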

Hermetic compiler / toolchain support

Toolchains are now registered automatically when you follow the latest setup instructions, and can be registered easily from MODULE.bazel when using Bzlmod. The new GraalVM toolchain type (@rules_graalvm//graalvm/toolchain:toolchain_type) enables Bazel to properly resolve GraalVM tooling without resorting to hard-coded repositories or implicit dependencies.

With this step complete, most of the Native Image compilation process can be considered "hermetic," in the sense that inputs are largely controlled by Bazel and accessible to the end-user. There is one place where this remains a challenge, and that is the env for compiling native images. Read on for more.

Compilation environment

On Linux this is generally not a problem, but on macOS and Windows, building native images requires some environment setup:

  • On macOS:
    • Either BAZEL_USE_CPP_ONLY_TOOLCHAIN=1 must be set, or
    • The following env variables are consulted and made available to native-image, which uses them to resolve toolchains:
      • DEVELOPER_DIR
      • SDKROOT
  • On Windows:
    • Bazel must be configured to properly connect to Visual Studio's toolchain
    • The following env variables are consulted and made available to native-image, which uses them to resolve toolchains:
      • VSINSTALLDIR
      • MSVC
      • LIB
      • INCLUDE

In some cases, it may be necessary to make these values available to the action execution environment via .bazelrc settings:

# Need these env vars on Windows
build --action_env=INCLUDE
build --action_env=MSVC
build --action_env=LIB

Generally speaking, Bazel is designed to resolve the Xcode toolchain without the BAZEL_USE_CPP_ONLY_TOOLCHAIN env var set, so you may not need to provide action_env flags for, say, DEVELOPER_DIR.
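
As a rough illustration of the environment handling described in this section, the sketch below selects the variables named above for a given host OS and renders matching `.bazelrc` lines. The table and both function names are hypothetical helpers for this example; they are not part of the rules.

```python
import os
from typing import Dict, Iterable, List, Mapping

# Variables named in this description, keyed by host OS.
TOOLCHAIN_ENV = {
    "linux": (),
    "macos": ("DEVELOPER_DIR", "SDKROOT"),
    "windows": ("VSINSTALLDIR", "MSVC", "LIB", "INCLUDE"),
}


def passthrough_env(host_os: str,
                    environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return only the toolchain-related variables that are actually set."""
    return {
        name: environ[name]
        for name in TOOLCHAIN_ENV[host_os]
        if name in environ
    }


def action_env_lines(names: Iterable[str]) -> List[str]:
    """Render one `build --action_env=NAME` line per variable name."""
    return [f"build --action_env={name}" for name in names]


print("\n".join(action_env_lines(["INCLUDE", "MSVC", "LIB"])))
```

Passing a variable name without a value to `--action_env` tells Bazel to take the value from the invoking environment, which is why the rendered lines carry names only.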

Use of PATH

The native-image tool sometimes has issues invoking the wrappers provided by Bazel for native compilers. This is particularly true on Windows, where Bazel provides a batch file that may be stubbed to fail unconditionally if Bazel cannot locate the Visual Studio toolchain properly.

Ultimately this means that configuring Bazel to speak properly to Visual Studio can be quite difficult, when in fact users of these rules may not care how their native binaries are compiled. Bazel doesn't need to invoke Visual Studio directly, but native-image does, and it expects to find it on the PATH.

For these reasons, the rules by default (this is adjustable) unconditionally provide a full PATH to the native-image tool and withhold the --native-compiler-path flag, which normally points to Bazel's wrapper. This conveniently skips the wrapper script even if it is stubbed to fail. If you would prefer to use a fully hermetic toolchain, this behavior can be overridden with the native_image_tool or pass_compiler_path attributes of the native_image rule (both the legacy and modern rules support the new attributes).
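
The default described above can be modeled with a small Python sketch. This is not the rule's Starlark implementation: the function name and its parameters are hypothetical, and withholding PATH when `pass_compiler_path` is enabled is an assumption made here to illustrate the hermetic case.

```python
from typing import Dict, List, Mapping, Sequence, Tuple


def native_image_invocation(
    base_args: Sequence[str],
    host_env: Mapping[str, str],
    compiler_wrapper: str,
    pass_compiler_path: bool = False,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (args, env) for a hypothetical native-image action."""
    args = list(base_args)
    env: Dict[str, str] = {}
    if pass_compiler_path:
        # Hermetic mode: point native-image at Bazel's compiler wrapper.
        # Assumption: the host PATH is not forwarded in this mode.
        args.append(f"--native-compiler-path={compiler_wrapper}")
    elif "PATH" in host_env:
        # Default: skip the wrapper and let native-image find the
        # compiler on the full host PATH.
        env["PATH"] = host_env["PATH"]
    return args, env


args, env = native_image_invocation(
    ["-cp", "app.jar"], {"PATH": "/usr/bin"}, "/wrappers/cc"
)
print(args, env)
```

With the default, the wrapper path is never passed, so a wrapper stubbed to fail has no effect on the build.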

@sgammon sgammon added feature Mainline feature work native-image Features and issues relating to the Native Image tool labels Aug 18, 2023
@sgammon sgammon added this to the 1.0.0 milestone Aug 18, 2023
@sgammon sgammon self-assigned this Aug 18, 2023
@sgammon sgammon added the 🚧 WIP Work-in-progress, do not merge label Aug 18, 2023
This was linked to issues Aug 18, 2023
@sgammon sgammon force-pushed the feat/mapping_generator branch 7 times, most recently from b96469a to 9e8be03 Compare August 20, 2023 23:38
@sgammon sgammon linked an issue Aug 20, 2023 that may be closed by this pull request
3 tasks
relates to #28

Signed-off-by: Sam Gammon <sam@elide.ventures>
- feat: add mapping generator in python
- feat: re-organize for split new/legacy mappings
- feat: hermetic control of compiler
- fix: fewer non-dev dependencies
- fix: resolution of native image toolchain
- chore: general cleanup and doc
- chore: strict flags, hermetic flags

Signed-off-by: Sam Gammon <sam@elide.ventures>
- feat: calculate set of Bazel platform tags for artifacts
- feat: include platform tags in generated mappings
- feat: generate set of `rules_graalvm` tags and include
- fix: output mappings file write bug
- chore: general cleanup

Signed-off-by: Sam Gammon <sam@elide.ventures>
- feat: register gvm toolchains automatically in workspaces
- feat: use gvm toolchain to resolve native image bin
- feat: map entire set of gvm sdk files as deps

Signed-off-by: Sam Gammon <sam@elide.ventures>
@sgammon sgammon marked this pull request as ready for review August 22, 2023 23:46
Signed-off-by: Sam Gammon <sam@elide.ventures>
- doc: add notes to component doc about dependencies
- doc: add `hermeticity` doc
- doc: add `windows` doc

Signed-off-by: Sam Gammon <sam@elide.ventures>
@codecov

codecov bot commented Aug 23, 2023

Codecov Report

❗ No coverage uploaded for pull request base (main@406fc98).
Patch has no changes to coverable lines.

❗ Current head 659efb3 differs from pull request most recent head 9497084. Consider uploading reports for the commit 9497084 to get more accurate results

Additional details and impacted files
@@         Coverage Diff          @@
##             main   #35   +/-   ##
====================================
  Coverage        ?     0           
====================================
  Files           ?     0           
  Lines           ?     0           
  Branches        ?     0           
====================================
  Hits            ?     0           
  Misses          ?     0           
  Partials        ?     0           


@sonarcloud

sonarcloud bot commented Aug 23, 2023

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information
No Duplication information

@sgammon sgammon merged commit 13084e3 into main Aug 23, 2023
27 checks passed
@sgammon sgammon linked an issue Aug 23, 2023 that may be closed by this pull request
@johnynek
Collaborator

Really exciting!

I'll try to upgrade our usage to this today!

@sgammon sgammon removed the 🚧 WIP Work-in-progress, do not merge label Sep 4, 2023
Labels
feature Mainline feature work native-image Features and issues relating to the Native Image tool
Projects
2 participants