New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fall back to most recent known osx version for bootstrap binaries #6681

Merged
merged 5 commits into from Oct 28, 2018

Conversation

Projects
None yet
3 participants
@cosmicexplorer
Contributor

cosmicexplorer commented Oct 25, 2018

Problem

See the end of #6591 -- while we fall back to the most recent OSX version to download binaries for use with e.g. NativeTool, this fix does not apply for the binaries we fetch in shell scripts to bootstrap the pants rust code, leading to the following on OSX Mojave (10.14) in the pants repo:

Downloading https://binaries.pantsbuild.org/bin/cmake/mac/10.14/3.9.5/cmake.tar.gz ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404

Solution

  • Append a main() method at the bottom of binary_util.py which can be invoked in the bootstrap phase by download_binary.sh.

Result

I can run ./pants goals on my OSX 10.14 laptop in the pants repo after an rm -rf ~/.cache/pants.

@cosmicexplorer

This comment has been minimized.

Contributor

cosmicexplorer commented Oct 25, 2018

There's an open question of how we test this fix (and anything around the rust bootstrap scripts) in CI.

@jsirois

This comment has been minimized.

Member

jsirois commented Oct 25, 2018

There's an open question of how we test this fix (and anything around the rust bootstrap scripts) in CI.

It's crude, but nuke the appropriate Travis caches, which include the Rust bootstrapping cache dirs.

@jsirois

There is too much code in the heredoc, should be lifted into python source code and possibly tested, but this quick POC works and leverages all existing BinaryUtil tests:

diff --git a/build-support/bin/download_binary.sh b/build-support/bin/download_binary.sh
index 268b29113..c226ec7c2 100755
--- a/build-support/bin/download_binary.sh
+++ b/build-support/bin/download_binary.sh
@@ -1,4 +1,4 @@
-#!/bin/bash -eu
+#!/bin/bash -e
 
 # Downloads the specified binary at the specified version from the specified binary host if it's not already present.
 # Uses a reimplementation of the BinaryUtils mechanism.
@@ -11,40 +11,42 @@ REPO_ROOT=$(cd $(dirname "${BASH_SOURCE[0]}") && cd ../.. && pwd -P)
 
 # Defines:
 # + CACHE_ROOT: The pants cache directory, ie: ~/.cache/pants.
-source "${REPO_ROOT}/build-support/common.sh"
+source "${REPO_ROOT}/build-support/pants_venv"
 
-if (( $# != 3 && $# != 4 )); then
+if (( $# != 2 && $# != 3 )); then
   die "$(cat << USAGE
 Usage: $0 host util_name version [filename]
 Example: $0 binaries.pantsbuild.org go 1.7.3 go.tar.gz
 USAGE
 )"
 fi
-readonly host="$1"
-readonly util_name="$2"
-readonly version="$3"
-readonly filename="${4:-${util_name}}"
-
-os=$("${REPO_ROOT}/build-support/bin/get_os.sh")
-readonly path="bin/${util_name}/${os}/${version}/${filename}"
-readonly cache_path="${CACHE_ROOT}/${path}"
-
-if [[ ! -f "${cache_path}" ]]; then
-  mkdir -p "$(dirname "${cache_path}")"
-
-  readonly binary_url="https://${host}/${path}"
-  echo >&2 "Downloading ${binary_url} ..."
-  curl --fail "${binary_url}" > "${cache_path}".tmp
-  mv "${cache_path}"{.tmp,}
-  if [[ "${filename}" == *tar.gz ]]; then
-    tar -C "$(dirname "${cache_path}")" -xzf "${cache_path}"
-  else
-    chmod 0755 "${cache_path}"
-  fi
-fi
-
-to_output="${cache_path}"
-if [[ "${filename}" == *tar.gz ]]; then
-  to_output="$(dirname "${to_output}")"
-fi
-echo "${to_output}"
+readonly util_name="$1"
+readonly version="$2"
+readonly filename="${3:-${util_name}}"
+
+activate_pants_venv 1>&2
+PYTHONPATH="${REPO_ROOT}/src/python" python << SCRIPT
+import os
+from functools import reduce
+
+from pants.binaries.binary_util import BinaryUtil
+from pants.option.global_options import GlobalOptionsRegistrar
+from pants.option.options_bootstrapper import OptionsBootstrapper
+from pants.subsystem.subsystem import Subsystem
+
+
+options_bootstrapper = OptionsBootstrapper()
+subsystems = (GlobalOptionsRegistrar, BinaryUtil.Factory)
+known_scope_infos = reduce(set.union, (ss.known_scope_infos() for ss in subsystems), set())
+options = options_bootstrapper.get_full_options(known_scope_infos)
+for subsystem in subsystems:
+  subsystem.register_options_on_scope(options)
+Subsystem.set_options(options)
+
+binary_util = BinaryUtil.Factory.create()
+path = binary_util.select_binary(supportdir='bin/${util_name}',
+                                 version='${version}',
+                                 name='${filename}')
+
+print(os.path.dirname(path) if path.endswith('.tar.gz') else path)
+SCRIPT
diff --git a/build-support/bin/native/cargo.sh b/build-support/bin/native/cargo.sh
index 2bf4dcfb5..69bd03493 100755
--- a/build-support/bin/native/cargo.sh
+++ b/build-support/bin/native/cargo.sh
@@ -16,11 +16,11 @@ download_binary="${REPO_ROOT}/build-support/bin/download_binary.sh"
 
 # The following is needed by grpcio-sys and we have no better way to hook its build.rs than this;
 # ie: wrapping cargo.
-cmakeroot="$("${download_binary}" "binaries.pantsbuild.org" "cmake" "3.9.5" "cmake.tar.gz")"
-goroot="$("${download_binary}" "binaries.pantsbuild.org" "go" "1.7.3" "go.tar.gz")/go"
+cmakeroot="$("${download_binary}" "cmake" "3.9.5" "cmake.tar.gz")"
+goroot="$("${download_binary}" "go" "1.7.3" "go.tar.gz")/go"
 
 # Code generation in the bazel_protos crate needs to be able to find protoc on the PATH.
-protoc="$("${download_binary}" "binaries.pantsbuild.org" "protobuf" "3.4.1" "protoc")"
+protoc="$("${download_binary}" "protobuf" "3.4.1" "protoc")"
 
 export GOROOT="${goroot}"
 export PATH="${cmakeroot}/bin:${goroot}/bin:${CARGO_HOME}/bin:$(dirname "${protoc}"):${PATH}"
@cosmicexplorer

This comment has been minimized.

Contributor

cosmicexplorer commented Oct 25, 2018

Oh now this is much much better. I'll drop the python into a separate file and look at possible testing.

@Eric-Arellano

Thank you for fixing this 🙌

@@ -4,6 +4,11 @@
# Uses a reimplementation of the BinaryUtils mechanism.
# Outputs an absolute path to the binary, whether fetched or already present, to stdout.
# Use this version as a fallback for platform-specific binaries to download if we are on a version
# of OSX that we don't explicitly have binaries for yet.
# TODO: Keep this up to date automatically somehow!

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Should probably link to a GitHub issue here

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

Yes! This is obviated with the new python approach but this is correct.

if [[ ! -f "${cache_path}" ]]; then
mkdir -p "$(dirname "${cache_path}")"
readonly binary_url="https://${host}/${path}"
echo >&2 "Downloading ${binary_url} ..."
curl --fail "${binary_url}" > "${cache_path}".tmp
readonly output_tmp_file="${cache_path}.tmp"

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Woah didn't know about readonly #FunctionalBashScripts 😎

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

There are some downsides which actually make it incredibly hard to use readonly correctly because bash is a horrible piece of software -- see build-support/bin/check_shell.sh, which runs in the pre-commit hook. zsh has never had any of these issues.

curl_output_file_with_fail "${binary_url}" "$output_tmp_file" \
2> >(tee >&2 "$curl_err_tmp_file") \
|| if grep -q -F 'The requested URL returned error: 404' "$curl_err_tmp_file" \

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Do -q and -F have long flags?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

They do (I think --quiet/--silent and --fixed-strings?). They should absolutely have been used here, as in any script in this repo.

|| if grep -q -F 'The requested URL returned error: 404' "$curl_err_tmp_file" \
&& [[ -n "$maybe_alternate_os" ]]; then
echo >&2 "Failed initial curl, trying an alternate url..."
alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")"

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Should this also be readonly?

Suggested change Beta
alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")"
readonly alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")"

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

This is one of the things check_shell.sh disallows and leads to very strange correctness issues if you do. Like I said, bash is a horrible piece of software because it doesn't care about bugs like this (and there are 1300 versions in play at once).

readonly cache_path="${CACHE_ROOT}/${path}"
maybe_alternate_os="$([[ "$(uname)" == 'Darwin' ]] \

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

This function only adds compatibility for macOS, right? It wouldn't help with Windows or some Linux distro? If so, consider renaming to something like maybe_macos to be more explicit.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

This is absolutely a correct comment. In the commit I just pushed, we use BinaryUtil to avoid having to duplicate this regardless, but this is definitely an edit I would have made.

@cosmicexplorer

This comment has been minimized.

Contributor

cosmicexplorer commented Oct 25, 2018

@Eric-Arellano I'll reply in full to your review because bash is strange and nasty but will be pushing a commit which uses @jsirois's approach in his earlier review which delegates to BinaryUtil, obviating some of the issues you pointed out.

@Eric-Arellano

This python code makes me much happier haha 🐍

path = binary_util.select(binary_request)
print(path)

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Is this supposed to be here, or leftover from debugging? Maybe print a message before the path to explain what it’s for.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

This is invoked by a shell script to read the output! There could be a comment here -- I will add that to make it clear.

def main():
util_name, version, filename = tuple(sys.argv[1:])

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

argparse is much more robust and has the benefit of allowing —help. It’s fine for what we have now, but will make it harder to do things like extend this script down the road

Have you worked with it before? If not, I can give the code to get it working! (I’m on phone atm)

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 25, 2018

Contributor

I have used it once years ago and not long after I ended up fixing the coffeescript argument parsing because I had been thinking about argument parsing so much: jashkenas/coffeescript#3946. I would love the code to do so -- that feels like it's probably half of the point of making this a python script. No rush to do that.

This comment has been minimized.

@jsirois

jsirois Oct 25, 2018

Member

Talk to your team-mates, in particular Yi: Fire.fire(a_method_with_the_obvious_signature)

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 25, 2018

Contributor

Argparse is one of my favorite parts of Python. It's so easy to make high quality scripts with it.

Below is a demo of how it works so that you can see things like optional arguments, which it looks like we don't actually use. Should be easy to convert to this actual program, which I'm happy to help with also.

import argparse

def create_parser():
  parser = argparse.create_parser(description="My script now has fancy CLI options!")
  parser.add_argument("util_name", help="...")
  parser.add_argument("-o", "--optional", help="...")
  return parser

def main():
  args = create_parser().parse_args()
  supportdir = 'bin/{}'.format(args.util_name)

If you don't like this line args = create_parser().parse_args(), you can have create_parser() instead be parse_args() and simply return the args. I only set it up this way to make create_parse() a pure function & keep side effects in main().

@jsirois

If bootstrap_binary_util.py was housed under src/python/pants/binaries/ - perhaps just at the bottom of binary_util.py - actually testing the script would be that much easier.

def main():
util_name, version, filename = tuple(sys.argv[1:])

This comment has been minimized.

@jsirois

jsirois Oct 25, 2018

Member

Talk to your team-mates, in particular Yi: Fire.fire(a_method_with_the_obvious_signature)

# BinaryRequest requires the `name` field to be provided without an extension, as it appends the
# archiver's extension if one is provided, so we have to remove it here.
filename = re.sub(
r'\.{}\Z'.format(re.escape(archiver_for_current_binary.extension)),

This comment has been minimized.

@jsirois

jsirois Oct 25, 2018

Member

I had to look up \Z, which was fun, I did not know $ had a synonym or vice-versa - but, if you got an acrhiver back, you know you can just lop off the filename verbatim - perhaps filename[:-(len(extension)+1)] or filename.rsplit('.' + extension, 1)[0]?

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 28, 2018

Contributor

Defaults for multiline modes differ across languages and other languages such as emacs lisp have even more customized extensions for e.g. symbol boundaries so it's easier for me to be explicit than to wonder whether a stray $ is going to slurp a whole file in, or worse, make repeated regex applications start at increasingly more incorrect positions (realizing now that's also more commonly performed in elisp than elsewhere).

This particular regex replace, however, was definitely a 3-4 second solution -- either of what you suggested seem more appropriate and I will make that fix.

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Oct 28, 2018

Contributor

I don’t fully follow this conversation, but if you go with regex consider a very brief comment explaining what Z means

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Oct 28, 2018

Contributor

Sorry -- "being explicit" refers to using \Z compulsively instead of the more recognizable $ (although if we wish to be truly pedantic, filenames can have newlines, so this should really also have a TODO to replace all such filename manipulations with methods in util or something so we can contain the pedantry).

cosmicexplorer added some commits Oct 28, 2018

@cosmicexplorer

This comment has been minimized.

Contributor

cosmicexplorer commented Oct 28, 2018

Moved the main method to the bottom of binary_util.py, then added an integration test.

@Eric-Arellano

Argparse code looks great! I like the nargs part. If you haven’t already, run the script with —help 😎

@cosmicexplorer

This comment has been minimized.

Contributor

cosmicexplorer commented Oct 28, 2018

I like the nargs part.

I would take credit but it's straight from https://docs.python.org/3/library/argparse.html#usage! Love python.

@cosmicexplorer cosmicexplorer merged commit f2e335a into pantsbuild:master Oct 28, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

cosmicexplorer added a commit that referenced this pull request Nov 16, 2018

fall back to most recent known osx version for bootstrap binaries (#6681
)

### Problem

See the end of #6591 -- while we fall back to the most recent OSX version to download binaries for use with e.g. NativeTool, this fix does not apply for the binaries we fetch in shell scripts to bootstrap the pants rust code, leading to the following on OSX Mojave (10.14) in the pants repo:

```
Downloading https://binaries.pantsbuild.org/bin/cmake/mac/10.14/3.9.5/cmake.tar.gz ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404
```

### Solution

- Append a `main()` method at the bottom of `binary_util.py` which can be invoked in the bootstrap phase by `download_binary.sh`.

### Result

I can run `./pants goals` on my OSX 10.14 laptop in the pants repo after an `rm -rf ~/.cache/pants`.

@cosmicexplorer cosmicexplorer changed the title from fall back to most recent known osx version for bootstrap binaries to Fall back to most recent known osx version for bootstrap binaries Nov 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment