Skip to content

Commit

Permalink
Implement bin/report (#1597)
Browse files Browse the repository at this point in the history
Implements the `bin/report` build report API, as a replacement for log
based metrics (which stopped working some time ago with the change in
internal observability vendor).

This will help with varies upcoming Python buildpack changes, such as the
EOL of Python 3.8, assessing usage levels of different package managers
(wrt Poetry addition, and future of Pipenv for CNB), removal of the legacy
sqlite feature etc.

This implementation is based on that used by the Node.js buildpack, which
is currently the only buildpack that implements `bin/report`. There are a
number of rough edges with the Node.js implementation that in an ideal
world we'd improve upon. However, given that that implementation works,
is actively used in production, and CNBs are our priority moving forwards,
I've chosen not to make many changes to that design.

See:
https://github.com/heroku/heroku-buildpack-nodejs/blob/main/bin/report
https://github.com/heroku/heroku-buildpack-nodejs/blob/main/lib/metadata.sh
https://github.com/heroku/heroku-buildpack-nodejs/blob/main/lib/kvstore.sh

GUS-W-8047826.
  • Loading branch information
edmorley committed Jun 17, 2024
1 parent beb3328 commit 8c46490
Show file tree
Hide file tree
Showing 19 changed files with 387 additions and 126 deletions.
17 changes: 17 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -54,3 +54,20 @@ jobs:
- name: Run Hatchet integration tests
# parallel_split_test runs rspec in parallel, with concurrency equal to PARALLEL_SPLIT_TEST_PROCESSES.
run: bundle exec parallel_split_test spec/hatchet/

container-test:
runs-on: ubuntu-latest
defaults:
run:
# Work around lack of TTY causing errors when using `docker run -it`:
# https://github.com/actions/runner/issues/241
shell: 'script -q -e -c "bash -eo pipefail {0}"'
steps:
- name: Checkout
uses: actions/checkout@v4
# These test both the local development `make run` workflow and that `bin/report` completes successfully
# for both passing and failing builds (since `bin/report` can't easily be tested via Hatchet tests).
- name: Run buildpack using default app fixture
run: make run
- name: Run buildpack using an app fixture that's expected to fail
run: make run FIXTURE=spec/fixtures/python_version_invalid/
3 changes: 2 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@

- Removed export of `Pipfile.lock` to `requirements.txt` during the build. ([#1593](https://github.com/heroku/heroku-buildpack-python/pull/1593))
- Removed internal `pipenv-to-pip` script that was unintentionally exposed onto `PATH`. ([#1593](https://github.com/heroku/heroku-buildpack-python/pull/1593))
- Stopped exposing the internal `BIN_DIR`, `EXPORT_PATH` and `PROFILE_PATH` environment variables to `bin/{pre,post}_compile` and other subprocesses. ([#1595](https://github.com/heroku/heroku-buildpack-python/pull/1595)).
- Stopped exposing the internal `BIN_DIR`, `BPLOG_PREFIX`, `EXPORT_PATH` and `PROFILE_PATH` environment variables to `bin/{pre,post}_compile` and other subprocesses. ([#1595](https://github.com/heroku/heroku-buildpack-python/pull/1595) and [#1597](https://github.com/heroku/heroku-buildpack-python/pull/1597))
- Implemented the `bin/report` build report API and removed log based metrics. ([#1597](https://github.com/heroku/heroku-buildpack-python/pull/1597))

## [v251] - 2024-06-07

Expand Down
19 changes: 13 additions & 6 deletions Makefile
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# These targets are not files
.PHONY: lint lint-scripts lint-ruby compile publish
.PHONY: lint lint-scripts lint-ruby run publish

STACK ?= heroku-24
FIXTURE ?= spec/fixtures/python_version_unspecified
Expand All @@ -15,11 +15,18 @@ lint-scripts:
lint-ruby:
@bundle exec rubocop

compile:
@echo "Running compile using: STACK=$(STACK) FIXTURE=$(FIXTURE)"
@echo
@docker run --rm -it --user root -v $(PWD):/src:ro -e "STACK=$(STACK)" -w /buildpack "$(STACK_IMAGE_TAG)" \
bash -c 'cp -r /src/{bin,requirements,vendor} /buildpack && cp -r /src/$(FIXTURE) /build && mkdir /cache /env && bin/compile /build /cache /env'
run:
@echo "Running buildpack using: STACK=$(STACK) FIXTURE=$(FIXTURE)"
@docker run --rm -it -v $(PWD):/src:ro --tmpfs /app -e "HOME=/app" -e "STACK=$(STACK)" "$(STACK_IMAGE_TAG)" \
bash -eo pipefail -c '\
mkdir /tmp/buildpack /tmp/build /tmp/cache /tmp/env \
&& cp -r /src/{bin,lib,requirements,vendor} /tmp/buildpack \
&& cp -rT /src/$(FIXTURE) /tmp/build \
&& cd /tmp/buildpack \
&& echo -e "\n~ Detect:" && bash ./bin/detect /tmp/build /tmp/cache /tmp/env \
&& echo -e "\n~ Compile:" && { ./bin/compile /tmp/build /tmp/cache /tmp/env || true; } \
&& echo -e "\n~ Report:" && ./bin/report /tmp/build /tmp/cache /tmp/env \
'
@echo

publish:
Expand Down
42 changes: 22 additions & 20 deletions bin/compile
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,6 @@

set -eo pipefail

# Used by buildpack-stdlib's metrics features.
export BPLOG_PREFIX="buildpack.python"
export BUILDPACK_LOG_FILE=${BUILDPACK_LOG_FILE:-/dev/null}

[[ -n "$BUILDPACK_XTRACE" ]] && set -o xtrace

BUILD_DIR="${1}"
Expand All @@ -18,6 +14,13 @@ ENV_DIR="${3}"
BUILDPACK_DIR=$(cd "$(dirname "$(dirname "${BASH_SOURCE[0]}")")" && pwd)

source "${BUILDPACK_DIR}/bin/utils"
source "${BUILDPACK_DIR}/lib/metadata.sh"

compile_start_time=$(nowms)

# Initialise metadata store.
meta_init "${CACHE_DIR}" "python"
meta_setup

# Prepend proper path for old-school virtualenv hackery.
# This may not be necessary.
Expand Down Expand Up @@ -141,16 +144,16 @@ if [[ -f runtime.txt ]]; then
# TODO: Refactor this and stop pipenv-python-version using runtime.txt as an API.
PYTHON_VERSION_SOURCE=${PYTHON_VERSION_SOURCE:-"runtime.txt"}
puts-step "Using Python version specified in ${PYTHON_VERSION_SOURCE}"
mcount "version.reason.python.specified"
meta_set "python_version_reason" "specified"
elif [[ -n "${CACHED_PYTHON_VERSION:-}" ]]; then
puts-step "No Python version was specified. Using the same version as the last build: ${CACHED_PYTHON_VERSION}"
echo " To use a different version, see: https://devcenter.heroku.com/articles/python-runtimes"
mcount "version.reason.python.cached"
meta_set "python_version_reason" "cached"
echo "${CACHED_PYTHON_VERSION}" > runtime.txt
else
puts-step "No Python version was specified. Using the buildpack default: ${DEFAULT_PYTHON_VERSION}"
echo " To use a different version, see: https://devcenter.heroku.com/articles/python-runtimes"
mcount "version.reason.python.default"
meta_set "python_version_reason" "default"
echo "${DEFAULT_PYTHON_VERSION}" > runtime.txt
fi

Expand All @@ -174,9 +177,9 @@ fi

# Download / Install Python, from pre-build binaries available on Amazon S3.
# This step also bootstraps pip / setuptools.
(( start=$(nowms) ))
install_python_start_time=$(nowms)
source "${BUILDPACK_DIR}/bin/steps/python"
mtime "python.install.time" "${start}"
meta_time "python_install_duration" "${install_python_start_time}"

# Install Pipenv dependencies, if a Pipfile was provided.
source "${BUILDPACK_DIR}/bin/steps/pipenv"
Expand All @@ -185,29 +188,30 @@ source "${BUILDPACK_DIR}/bin/steps/pipenv"
# This allows for people to ship a setup.py application to Heroku

if [[ ! -f requirements.txt ]] && [[ ! -f Pipfile ]]; then
meta_set "setup_py_only" "true"
echo "-e ." > requirements.txt
else
meta_set "setup_py_only" "false"
fi

# SQLite3 support.
# This sets up and installs sqlite3 dev headers and the sqlite3 binary but not the
# libsqlite3-0 library since that exists on the stack image.
(( start=$(nowms) ))
install_sqlite_start_time=$(nowms)
source "${BUILDPACK_DIR}/bin/steps/sqlite3"
buildpack_sqlite3_install
mtime "sqlite3.install.time" "${start}"
meta_time "sqlite_install_duration" "${install_sqlite_start_time}"

# pip install
# -----------

# Install dependencies with pip (where the magic happens).
(( start=$(nowms) ))
source "${BUILDPACK_DIR}/bin/steps/pip-install"
mtime "pip.install.time" "${start}"

# Support for NLTK corpora.
(( start=$(nowms) ))
nltk_downloader_start_time=$(nowms)
sub_env "${BUILDPACK_DIR}/bin/steps/nltk"
mtime "nltk.download.time" "${start}"
meta_time "nltk_downloader_duration" "${nltk_downloader_start_time}"

# Support for editable installations. Here, we are copying pip–created src directory,
# and copying it into the proper place (the logical place to do this was early, but it must be done here).
Expand All @@ -224,9 +228,9 @@ fi
# This is the cause for the majority of build failures on the Python platform.
# These failures are intentional — if collectstatic (which can be tricky, at times) fails,
# your build fails.
(( start=$(nowms) ))
collectstatic_start_time=$(nowms)
sub_env "${BUILDPACK_DIR}/bin/steps/collectstatic"
mtime "collectstatic.time" "${start}"
meta_time "django_collectstatic_duration" "${collectstatic_start_time}"


# Programmatically create .profile.d script for application runtime environment variables.
Expand Down Expand Up @@ -288,6 +292,4 @@ if [[ -d .heroku/src ]]; then
cp -R .heroku/src "$CACHE_DIR/.heroku/" &> /dev/null || true
fi

# Measure the size of the Python installation.
# shellcheck disable=SC2119
mmeasure 'python.size' "$(measure-size)"
meta_time "total_duration" "${compile_start_time}"
117 changes: 117 additions & 0 deletions bin/report
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
#!/usr/bin/env bash
# Usage: bin/report <build-dir> <cache-dir> <env-dir>

# Produces a build report containing metadata about the build, that's consumed by the build system.
# This script is run for both successful and failing builds, so it should not assume the build ran
# to completion (e.g. Python or other tools may not even have been installed).
#
# Metadata must be emitted to stdout as valid YAML key-value pairs. Any fields that should always
# be typed as a string must be explicitly quoted.
#
# Example valid stdout:
# python_version: 'X.Y.Z'
# python_install_duration: 1.234
#
# Failures in this script don't cause the overall build to fail (and won't appear in user
# facing build logs) to avoid breaking builds unnecessarily / causing confusion. To debug
# issues check the internal build system logs for `buildpack.report.failed` events, or
# when developing run `make compile` in this repo locally, which runs `bin/report` too.

set -euo pipefail

CACHE_DIR="${2}"

# The absolute path to the root of the buildpack.
BUILDPACK_DIR=$(cd "$(dirname "$(dirname "${BASH_SOURCE[0]}")")" && pwd)

# The build system doesn't source the `export` script before running this script, so we have to do
# so manually (if it exists) so that buildpack Python/Pip can be found (if the build succeeded).
# We have to disable Bash error on undefined variables for now, since not all env vars used in the
# export script will be set by default (eg `LIBRARY_PATH`).
EXPORT_FILE="${BUILDPACK_DIR}/export"
if [[ -f "${EXPORT_FILE}" ]]; then
set +u
# shellcheck source=/dev/null
source "${EXPORT_FILE}"
set -u
fi

source "${BUILDPACK_DIR}/lib/metadata.sh"
meta_init "${CACHE_DIR}" "python"

# Emit the key / value pair unquoted to stdout. Skips if the value is empty.
# Based on: https://github.com/heroku/heroku-buildpack-nodejs/blob/main/bin/report
kv_pair() {
local key="${1}"
local value="${2}"
if [[ -n "${value}" ]]; then
echo "${key}: ${value}"
fi
}

# Emit the key / value pair to stdout, safely quoting the string. Skips if the value is empty.
# Based on: https://github.com/heroku/heroku-buildpack-nodejs/blob/main/bin/report
# (Though instead uses single quotes instead of double to avoid escaping issues.)
kv_pair_string() {
local key="${1}"
local value="${2}"
if [[ -n "${value}" ]]; then
# Escape any existing single quotes, which for YAML means replacing `'` with `''`.
value="${value//\'/\'\'}"
echo "${key}: '${value}'"
fi
}

STRING_FIELDS=(
django_collectstatic
failure_reason
nltk_downloader
package_manager
pip_version
pipenv_version
python_version_major
python_version_reason
python_version
setuptools_version
wheel_version
)

# We don't want to quote numeric or boolean fields.
ALL_OTHER_FIELDS=(
dependencies_install_duration
django_collectstatic_duration
nltk_downloader_duration
pipenv_has_lockfile
post_compile_hook
post_compile_hook_duration
pre_compile_hook
pre_compile_hook_duration
python_install_duration
setup_py_only
sqlite_install_duration
total_duration
)

for field in "${STRING_FIELDS[@]}"; do
kv_pair_string "${field}" "$(meta_get "${field}")"
done

for field in "${ALL_OTHER_FIELDS[@]}"; do
kv_pair "${field}" "$(meta_get "${field}")"
done

# If the build failed, pip might not have been installed yet.
if command -v pip >/dev/null; then
# Determine pysqlite3 usage since it's the only package that requires the sqlite3 headers.
if pip show pysqlite3 &>/dev/null; then
kv_pair pysqlite3_installed true
else
kv_pair pysqlite3_installed false
fi

if pip show pysqlite3-binary &>/dev/null; then
kv_pair pysqlite3_binary_installed true
else
kv_pair pysqlite3_binary_installed false
fi
fi
24 changes: 14 additions & 10 deletions bin/steps/collectstatic
Original file line number Diff line number Diff line change
Expand Up @@ -16,17 +16,21 @@ set -eo pipefail
BUILDPACK_DIR=$(cd "$(dirname "$(dirname "$(dirname "${BASH_SOURCE[0]}")")")" && pwd)
source "${BUILDPACK_DIR}/bin/utils"

# Required for `meta_set`.
source "${BUILDPACK_DIR}/lib/metadata.sh"
# shellcheck disable=SC2153
meta_init "${CACHE_DIR}" "python"

if [[ -f .heroku/collectstatic_disabled ]]; then
puts-step "Skipping Django collectstatic since the file '.heroku/collectstatic_disabled' exists."
puts-warn "This approach is deprecated, please set the env var DISABLE_COLLECTSTATIC=1 instead."
mcount "warnings.collectstatic_disabled_file"
mcount "django.collectstatic.disabled"
meta_set "django_collectstatic" "disabled-file"
exit 0
fi

if [[ "${DISABLE_COLLECTSTATIC:-0}" != "0" ]]; then
puts-step "Skipping Django collectstatic since the env var DISABLE_COLLECTSTATIC is set."
mcount "django.collectstatic.disabled"
meta_set "django_collectstatic" "disabled-env-var"
exit 0
fi

Expand All @@ -41,11 +45,11 @@ MANAGE_FILE=${MANAGE_FILE:-fakepath}

if [[ ! -f "${MANAGE_FILE}" ]]; then
puts-step "Skipping Django collectstatic since no manage.py file found."
mcount "django.collectstatic.no_manage_py"
meta_set "django_collectstatic" "skipped-no-manage-py"
exit 0
fi

mcount "django.collectstatic.enabled"
meta_set "django_collectstatic" "enabled"

puts-step "$ python $MANAGE_FILE collectstatic --noinput"

Expand All @@ -66,15 +70,15 @@ fi

# Display a warning if collectstatic failed.
if grep -q 'SyntaxError' "$COLLECTSTATIC_LOG"; then
mcount "failure.collectstatic.syntax-error"
meta_set "failure_reason" "collectstatic-syntax-error"
elif grep -q 'ImproperlyConfigured' "$COLLECTSTATIC_LOG"; then
mcount "failure.collectstatic.improper-configuration"
meta_set "failure_reason" "collectstatic-improper-configuration"
elif grep -q 'The CSS file' "$COLLECTSTATIC_LOG"; then
mcount "failure.collectstatic.fancy-references"
meta_set "failure_reason" "collectstatic-fancy-references"
elif grep -q 'OSError' "$COLLECTSTATIC_LOG"; then
mcount "failure.collectstatic.missing-file"
meta_set "failure_reason" "collectstatic-missing-file"
else
mcount "failure.collectstatic.other"
meta_set "failure_reason" "collectstatic-other"
fi

echo " ! Error while running '$ python $MANAGE_FILE collectstatic --noinput'."
Expand Down
7 changes: 6 additions & 1 deletion bin/steps/hooks/post_compile
Original file line number Diff line number Diff line change
@@ -1,7 +1,12 @@
#!/usr/bin/env bash

if [[ -f bin/post_compile ]]; then
post_compile_hook_start_time=$(nowms)
meta_set "post_compile_hook" "true"
echo "-----> Running post-compile hook"
chmod +x bin/post_compile
sub_env bin/post_compile
fi
meta_time "post_compile_hook_duration" "${post_compile_hook_start_time}"
else
meta_set "post_compile_hook" "false"
fi
7 changes: 6 additions & 1 deletion bin/steps/hooks/pre_compile
Original file line number Diff line number Diff line change
@@ -1,7 +1,12 @@
#!/usr/bin/env bash

if [[ -f bin/pre_compile ]]; then
pre_compile_hook_start_time=$(nowms)
meta_set "pre_compile_hook" "true"
echo "-----> Running pre-compile hook"
chmod +x bin/pre_compile
sub_env bin/pre_compile
fi
meta_time "pre_compile_hook_duration" "${pre_compile_hook_start_time}"
else
meta_set "pre_compile_hook" "false"
fi
Loading

0 comments on commit 8c46490

Please sign in to comment.