add script to generate zinc native-images, with example usage (#8036)
### Problem

A script to automatically create zinc native-images from pants for JVM code is currently being created in #7506. The script will infer macro usages from the compiler in an initial run with the native-image java agent, described in https://github.com/oracle/graal/blob/master/substratevm/CONFIGURE.md. This requires providing a library built from the graal repo on the runtime library search path.

### Solution

- Pipe `LD_LIBRARY_PATH` or `DYLD_LIBRARY_PATH` into the hermetic zinc invocation.
- Pull down graal and other dependencies by `git clone` for now.
- Use `./pants classmap` to get the targets providing the classes corresponding to macros, and generate a `BUILD` file with a fake `jvm_app()` bundling all those targets.

### Result

The script now generates zinc native-images for arbitrary Scala code!
cosmicexplorer committed Jul 24, 2019
1 parent 81993f4 commit b76a9d2
Showing 7 changed files with 606 additions and 51 deletions.
72 changes: 72 additions & 0 deletions build-support/native-image/README.md
@@ -0,0 +1,72 @@
native-image `META-INF/`
===========================

**NOTE:** Currently, these instructions only work when used from the script in this repo[^1]! Image building in general is blocked on https://github.com/oracle/graal/issues/1448.

*This directory contains special configuration files recognized by the `native-image` tool when embedded in a jar. This embedded configuration allows any user with some version of the Graal VM to fetch the `org.pantsbuild:zinc-compiler` jar and run `native-image -jar` to produce a native executable of the pants zinc wrapper without any additional arguments.*

The Graal VM's `native-image` tool[^2] converts JVM bytecode to a native compiled executable[^3] executing via the Substrate VM. The `native-image` tool is not yet immediately compatible with all JVM code[^4], but the tool is stable and featureful enough to successfully build many codebases with a mixture of (mostly) automated and (some) manual configuration.

The `native-image` tool accepts configuration files (mostly JSON) which overcome some of the current limitations[^4]:
1. `reflect-config.json`: The largest file; this would typically be generated automatically[^6].
   - Some manual edits may be necessary[^7].
2. `resource-config.json`: Should also be generated automatically[^6].
   - Some manual edits may be necessary[^8].
3. `substitutions.json`: Always written by hand; mocks out code that otherwise won't compile or run via `native-image`[^9].
4. `native-image.properties`: Read by the `native-image` tool to get command-line arguments to use when executing on the jar. It accepts a `${.}` template syntax[^5].
   - `native-image --help` describes many high-level command-line options which are converted to longer-form options when `native-image` is executing. `native-image --expert-options-all` describes all options.
   - The arguments `--enable-all-security-services`, `--allow-incomplete-classpath`, and `--report-unsupported-elements-at-runtime` are needed for almost all builds.
   - The argument `--delay-class-initialization-to-runtime` delays initialization of classes until runtime (the `native-image` tool otherwise executes all static initializers at build time). Determining the appropriate classes to mark in this way can sometimes be a manual process[^10].

All JSON config files can be inspected and transformed with the `jq` command-line tool[^11], which is what the script[^1] does.
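
As an illustration of the kind of inspection `jq` allows, the sketch below lists the distinct class names recorded in a reflection config file. It relies only on the reflection config being a JSON array of objects with a `name` field. The function name `list_reflected_classes` and the file path in the usage comment are hypothetical, and this is not necessarily how the script[^1] uses `jq`.

```bash
# Hypothetical helper: print each distinct class name found in a reflect-config.json file.
function list_reflected_classes {
  local config_file="$1"
  # Each top-level array entry is expected to be an object with a "name" key.
  jq -r '.[].name' "$config_file" | sort -u
}

# Usage (the path is an assumption about where the config file lives):
# list_reflected_classes generated-reflect-config/reflect-config.json
```

Piping the names through `sort -u` makes the output stable across runs, which is convenient when diffing the config before and after a regeneration.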

# Automatically generating a zinc native-image for your repo

**NOTE:** This will allow creating a zinc native-image of code containing macros, but the image currently has to be manually regenerated whenever adding or modifying macros!

The script in [^1] will run on OSX or Linux (the `ubuntu:latest` container on docker hub is known to work). The script can be run to test out native-image zinc compiles as follows:

``` bash
$ cd /your/pants/codebase
$ /path/to/your/pants/checkout/build-support/native-image/generate-native-image-for-pants-targets.bash ::
```

After a long bootstrap process, the arguments are forwarded to a pants invocation which runs with reflection tracing. The `NATIVE_IMAGE_EXTRA_ARGS` environment variable can be used to add any necessary arguments to the `native-image` invocation (the scalactic resource bundle is necessary for any repo using scalatest). The above will build an image suitable for all targets in the repo (`::`) for the current platform. The script will generate a different output file `zinc-pants-native-{Darwin,Linux}` depending upon whether it is run on OSX or Linux.
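
For a repo using scalatest, an invocation might look like the following sketch. It assumes that the scalactic resource bundle is named `org.scalactic.ScalacticBundle` and that it is included via the `-H:IncludeResourceBundles` option of `native-image`; check both against your scalatest and Graal versions. The script path is the same placeholder used in the example above.

```bash
# Placeholder path; adjust to your pants checkout.
script=/path/to/your/pants/checkout/build-support/native-image/generate-native-image-for-pants-targets.bash

# Assumption: the bundle name and the -H:IncludeResourceBundles option are correct for your setup.
export NATIVE_IMAGE_EXTRA_ARGS='-H:IncludeResourceBundles=org.scalactic.ScalacticBundle'

if [[ -x "$script" ]]; then
  # Build an image covering every target in the repo.
  "$script" ::
else
  echo "script not found at ${script}" >&2
fi
```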

*Note:* if the native-image build fails, and you see the following in the output:
``` bash
Caused by: java.lang.VerifyError: class scala.tools.nsc.Global overrides final method isDeveloper.()Z
```

Please re-run the script, up to two more times. This failure currently occurs nondeterministically, for reasons that are not yet understood. https://github.com/pantsbuild/pants/issues/7955 tracks solving this issue, along with others.
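
The re-run advice can be automated with a small wrapper. The following is a minimal sketch, not part of the script: `retry_up_to` is a hypothetical helper, and it retries on any failure rather than only on the `VerifyError` shown above.

```bash
# Hypothetical helper: run a command until it succeeds, for at most $1 attempts.
function retry_up_to {
  local max_attempts="$1"
  shift
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Stop as soon as one attempt succeeds.
    "$@" && return 0
    echo "attempt ${attempt}/${max_attempts} failed: $*" >&2
  done
  return 1
}

# Usage (one initial run plus at most two re-runs; the path is a placeholder):
# retry_up_to 3 /path/to/your/pants/checkout/build-support/native-image/generate-native-image-for-pants-targets.bash ::
```

Because the retry is unconditional, a genuine configuration error will be retried as well, so it is worth reading the output of the first failure before relying on the wrapper.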


## Updating the zinc native-image

This is a developing story. Currently, the script will idempotently create or update a directory in the pwd named `generated-reflect-config/` with the results of the reflection tracing. This directory will contain 5 files -- 4 json config files, and one `BUILD` file. This directory can be checked in and updated over time -- subsequent runs of the script will never remove information from previous runs.

*Note:* the script does *not* need to be run over the whole repo (`::`) at once! Since the compile run with reflection tracing has parallelism set to 1, this initial run can take a long time. Initially, it's possible to run the script over batched sections of your repo (e.g. using `./pants list` and `--target-spec-file`), until all targets are covered.
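
One way to batch the runs is sketched below, under the assumption that the script forwards `--target-spec-file` to pants unchanged. `run_in_batches` is a hypothetical helper, and the batch size is arbitrary.

```bash
# Hypothetical helper: split the output of `./pants list ::` into spec files of
# $1 targets each, then run the script given as $2 once per spec file.
function run_in_batches {
  local batch_size="$1" script="$2"
  local batch_dir spec_file
  batch_dir="$(mktemp -d)" || return "$?"
  # `split` reads the target list from stdin and writes batch-aa, batch-ab, ...
  ./pants list :: | split -l "$batch_size" - "${batch_dir}/batch-" || return "$?"
  for spec_file in "${batch_dir}"/batch-*; do
    "$script" --target-spec-file="$spec_file" || return "$?"
  done
}

# Usage, from the root of your pants codebase (the path is a placeholder):
# run_in_batches 50 /path/to/your/pants/checkout/build-support/native-image/generate-native-image-for-pants-targets.bash
```

Since subsequent runs never remove information from `generated-reflect-config/`, the batches can be run in any order, and a failed batch can be re-run on its own.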

The image may begin failing to build. For example, if your repo used to use a macro but no longer does, the build will fail when scanning the stale reflect config entries. If this happens, you can always `rm -rfv generated-reflect-config/` and run the script again (although you will then have to regenerate the directory for your whole repo).

[^1]: ./generate-native-image-for-pants-targets.bash

[^2]: https://github.com/oracle/graal/tree/master/substratevm

[^3]: https://www.graalvm.org/docs/reference-manual/aot-compilation/

[^4]: https://github.com/oracle/graal/blob/master/substratevm/LIMITATIONS.md

[^5]: https://medium.com/graalvm/simplifying-native-image-generation-with-maven-plugin-and-embeddable-configuration-d5b283b92f57

[^6]: https://github.com/oracle/graal/blob/master/substratevm/CONFIGURE.md

[^7]: https://github.com/oracle/graal/blob/master/substratevm/REFLECTION.md

[^8]: https://github.com/oracle/graal/blob/master/substratevm/RESOURCES.md

[^9]: https://github.com/pantsbuild/pants/tree/master/src/scala/org/pantsbuild/zinc/compiler/native-image-substitutions

[^10]: https://medium.com/graalvm/understanding-class-initialization-in-graalvm-native-image-generation-d765b7e4d6ed

[^11]: https://stedolan.github.io/jq/
132 changes: 132 additions & 0 deletions build-support/native-image/build-zinc-native-image.bash
@@ -0,0 +1,132 @@
# Functions to simplify the multiple complex bootstrapping techniques currently necessary to build a
# zinc native-image.
# TODO: This will be made automatic in pants via https://github.com/pantsbuild/pants/pull/6893.

# TODO: build off of a more recent graal sha (more recent ones fail to build scalac and more): see
# https://github.com/oracle/graal/issues/1448.

# shellcheck source=build-support/native-image/utils.bash
source "${SCRIPT_DIR}/utils.bash"

# FUNCTIONS

function _get_coursier_impl {
  if [[ ! -f ./coursier ]]; then
    curl -Lo coursier https://git.io/coursier-cli || return "$?"
    chmod +x coursier
    ./coursier --help || return "$?"
  fi >&2
  normalize_path_check_file coursier
}

function get_coursier {
  do_within_cache_dir _get_coursier_impl
}

function bootstrap_environment {
  if is_osx; then
    # Install `realpath`.
    if ! hash realpath 2>/dev/null; then
      ensure_has_executable \
        'brew' \
        "homebrew must be installed to obtain the 'coreutils' package, which contains 'realpath'." \
        "Please see https://brew.sh/."
      brew install coreutils
    fi
  else
    # Install necessary tools for the ubuntu:latest container on docker hub.
    apt-get update
    apt-get -y install \
      g{cc,++} git curl aptitude zlib1g-dev make python{,3} python3-pip \
      pkg-config libssl-dev libpython-dev openjdk-8-jdk

    # FIXME: Otherwise pants will fail at bootstrap with an unrecognized symbol `distutils.spawn`.
    pip3 install setuptools==40.0.0
  fi
  mkdir -pv "$NATIVE_IMAGE_BUILD_CACHE_DIR"
}

function get_base_native_image_build_script_graal_checkout {
  # TODO(#7955): See https://github.com/oracle/graal/issues/1448 and
  # https://github.com/pantsbuild/pants/issues/7955 to cover using a released graal instead of a
  # fork!
  # From https://github.com/cosmicexplorer/graal/tree/graal-make-zinc-again!
  do_within_cache_dir clone_repo_somewhat_idempotently \
    graal/ \
    https://github.com/cosmicexplorer/graal \
    ac6f6dd4783cece28f1696b413f02c3776753890
}

function clone_mx {
  # From https://github.com/graalvm/mx/tree/master!
  do_within_cache_dir clone_repo_somewhat_idempotently \
    mx/ \
    https://github.com/graalvm/mx \
    c01eef6e31cd5655b1f0682c445f4ed50aa5c05e
}

function extract_openjdk_jvmci {
  # Keep these variables out of the global scope.
  local outdir url
  if is_osx; then
    outdir="openjdk1.8.0_202-jvmci-0.59/Contents/Home"
    url='https://github.com/graalvm/openjdk8-jvmci-builder/releases/download/jvmci-0.59/openjdk-8u202-jvmci-0.59-darwin-amd64.tar.gz'
  else
    outdir="openjdk1.8.0_202-jvmci-0.59"
    url='https://github.com/graalvm/openjdk8-jvmci-builder/releases/download/jvmci-0.59/openjdk-8u202-jvmci-0.59-linux-amd64.tar.gz'
  fi
  do_within_cache_dir extract_tgz \
    "$outdir" \
    "$url"
}

function get_substratevm_dir {
  echo "$(get_base_native_image_build_script_graal_checkout)/substratevm"
}

function build_native_image_tool {
  get_substratevm_dir \
    | pushd_into_command_line_with_side_effect \
      mx build
}

function fetch_scala_compiler_jars {
  # TODO: the scala version used to build the pants zinc wrapper must also be changed if this is!
  local version='2.12.8'
  "$(get_coursier)" fetch \
    org.scala-lang:scala-{compiler,library,reflect}:"$version"
}

function fetch_pants_zinc_wrapper_jars {
  local pants_zinc_compiler_version='0.0.15'
  local pants_underlying_zinc_dependency_version='1.1.7'
  # TODO: `native-image` emits a warning on later protobuf versions, which the pantsbuild
  # `zinc-compiler` artifact will pull in unless we exclude them here and also explicitly add a
  # protobuf artifact. We should fix this by making the change to the org.pantsbuild:zinc-compiler
  # artifact!
  "$(get_coursier)" fetch \
    "org.pantsbuild:zinc-compiler_2.12:${pants_zinc_compiler_version}" \
    "org.scala-sbt:compiler-bridge_2.12:${pants_underlying_zinc_dependency_version}" \
    --exclude com.google.protobuf:protobuf-java \
    com.google.protobuf:protobuf-java:2.5.0
}

function create_zinc_image {
  # Declare separately from assignment, so that a failing command substitution is not masked by
  # the exit status of `local`.
  local scala_compiler_jars native_image_suite pants_zinc_wrapper_jars expected_output
  scala_compiler_jars="$(fetch_scala_compiler_jars | merge_jars)" || return "$?"
  native_image_suite="$(do_within_cache_dir build_native_image_tool)" || return "$?"
  pants_zinc_wrapper_jars="$(fetch_pants_zinc_wrapper_jars | merge_jars)" || return "$?"

  # The output file name depends on the current platform, e.g. `zinc-pants-native-Linux`.
  expected_output="zinc-pants-native-$(uname)"

  >&2 time mx -p "$native_image_suite" native-image \
    -cp "${scala_compiler_jars}:${pants_zinc_wrapper_jars}" \
    org.pantsbuild.zinc.compiler.Main \
    -H:Name="$expected_output" \
    -J-Xmx7g -O9 \
    --verbose -H:+ReportExceptionStackTraces \
    --no-fallback \
    -Djava.io.tmpdir=/tmp \
    "$@" \
    || return "$?"

  normalize_path_check_file "$expected_output" \
    'pants zinc native-image failed to generate!'
}
