Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new artifact sgr-osx-x86_64.tgz to releases, compiled with pyinstaller --onedir, and default to it in install.sh on Darwin #656

Merged
merged 3 commits into from Mar 17, 2022

Conversation

milesrichardson
Copy link
Contributor

@milesrichardson milesrichardson commented Mar 17, 2022

todo:

  • Rebase and cleanup the commits slightly - changes will stay the same though

follow-up PR:

  • Make sure sgr upgrade works with multi-file executable mode (preferably in a backwards compatible way so existing OS X installations can move seamlessly from single-file to multi-file exec)
  • Verify install.sh script works end to end

Summary

  • New releases will include an additional artifact sgr-osx-x86_64.tgz
  • Archive is a directory including executable sgr, shared libraries, and resources
  • For Darwin OS, install.sh will:
    • download sgr-osx-x86_64.tgz unless $FORCE_ONEFILE is non-empty
    • extract the multi-file directory to ~/.splitgraph/pkg/sgr (overwrite if exist)
    • create symlink ~/.splitgraph/sgr -> ~/.splitgraph/pkg/sgr/sgr
  • The recommended install method continues to be running install.sh
  • Add new commit message pragma [artifacts] to build all artifacts but not release them

Why

The single-file executable performs poorly on Mac OS. Most noticeably,
every invocation of a single-file sgr starts with a 6-10 second delay
while Apple verifies the binary (I think with gatekeeper). Whereas
with a multi-file sgr, only the first invocation has this delay,
and all subsequent invocations are < 1000ms without any verification
overhead.

The reason for this is because the act of verifying the binary
changes its headers in such a way that it needs to be re-verified
before the next invocation. At least, that's what I read in a GitHub
issue in one of the browser tabs that I've since closed.

How

  • In osx_binary workflow stage, add steps to build artifact sgr-osx/sgr.tgz
  • In upload_release workflow stage, release sgr-osx/sgr.tgz artifact as sgr-osx-x86_64.tgz
  • In splitgraph.spec, when --onedir is specified, use the COLLECTION() target

Also

  • Add NO_PUBLISH flag to .ci/build_wheel.sh to not configure any PyPi parameters
  • In build_and_test workflow stage, add step to build wheel without publish

Note on using pyinstaller locally on Mac with pyenv

If developing splitgraph.spec and iterating with pyinstaller, it helps
to have a local environment setup to closely simulate the CI environment,
so in general, try to match whatever steps the workflow is doing. Basically,
this is a summary of the gotchas:

  • Make sure to use two virtual environments
    • Use the normal splitgraph dev environment with Poetry to build the Splitgraph wheel and bin/sgr
    • Create a new virtualenv just for pyinstaller to create exe from the wheel and bin/sgr
  • Create the virtual environments from a build of Python with --enable-framework
    • See PyInstaller docs
      • Note: This page is wrong for OS X. On Darwin, flag is --enable-framework, not --enable-shared

Here is a rough set of post-hoc notes of the commands I used to get
this working locally. It might be missing some details – make sure
after each time switching a virtualenv, that you're in the right one:

# Build Python with the --enable-framework option, and create the dev venv from it
env PYTHON_CONFIGURE_OPTS="--enable-framework" pyenv install 3.7.12
pyenv shell 3.7.12
pyenv virtualenv splitgraph-dev-3.7.12
pyenv shell splitgraph-dev-3.7.12

# <<make sure poetry is installed>> (can be done out of band)
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
. ~/.poetry/env

# Run commands that are basically copied from CI:
poetry export --dev -f requirements.txt --without-hashes -o /tmp/requirements.txt -E pandas
# note: doesn't appear to be needed (no @ in requirements.txt last checked).
# if needed, will need to change -i to be correct mac syntax (I think switch to -e)
# sed -i "/ @ \//d" /tmp/requirements.txt
python -m pip install -U pip
pip install --no-deps -r /tmp/requirements.txt
poetry install -E pandas

# Build the wheel
poetry build

# Create a separate venv for pyinstaller
pyenv shell 3.7.12
pyenv virtualenv sgr-pyinstaller-3.7.12
pyenv shell sgr-pyinstaller-3.7.12

# Install it (note: unsure if installing pip is required. But I omitted it once, and install
# took 100x more time due to building packages after since it was "missing package `wheel`")
python -m pip install -U pip
pip install dist/splitgraph-*-py3-none-any.whl
pip install pyinstaller

# Compile a single-file executable into `dist/sgr`:
pyinstaller --clean --noconfirm splitgraph.spec

# Compile a multi-file executable into `dist/sgr-pkg`
pyinstaller --clean --noconfirm --onedir splitgraph.spec

You can make sure that your Python installation was compiled with
the correct configuration flags by running the correct python3-config
for your environment (unfortunately not exposed as python3-config by pyenv):

~/.pyenv/versions/3.7.12/bin/python3.7-config --cflags

The output should contain an include path that looks like:

-I/Users/johnnyappleseed/.pyenv/versions/3.7.12/Library/Frameworks/Python.framework/Versions/3.7/include/python3.7m -I/Users/johnnyappleseed/.pyenv/versions/3.7.12/include/python3.7m -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include -I/Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include

Note on gatekeeper

Problem

If you download an artifact from Mac using a web browser, then
you will not be able to execute the artifact, whether downloaded
directly as sgr-osx-x86_64 or extracted from sgr-osx-x86_64.tgz.
This restriction does not apply when downloading a file with curl.

Resolution

Since it only affects files downloaded from web browsers, it does not
affect the install.sh script, so that continues to be the recommended
installation method.

We could change install.sh to remove the com.apple.quarantine flag
from the downloaded binary if it exists, but we don't expect it to
exist when downloading via curl, so it wouldn't make sense.

We could join the Apple Developer Program and pay for the privilege
of signing executables with a certificate they will sign too.

We should add a note to each release, and should update the installation
documentation to warn about downloading binaries from a web browser.

Background

Background: On Mac, some files have "extra attributes" set. The xattr
command can inspect and modify them. One of these attributes
is com.apple.quarantine, which web browsers set for any downloaded file.

So if you use your web browser to download sgr-osx-x86_64 from
the releases page, then even after chmod +x, you will not be
able to execute it. Attempts to execute the file will cause the OS
to render a pop-up window refusing to allow it.

Troubleshooting com.apple.quarantine and gatekeeper

To check if a file has the com.apple.quarantine bit set, run xattr $filename,
for example, to check the sgr executable in the local directory. If you
see com.apple.quarantine, it's set:

❯ xattr sgr
com.apple.quarantine

You can remove it with the -d flag. If you remove the flag from a .tgz archive,
then none of the extracted files will have it set. If you do not remove it, then
all of the extracted files will each have it set:

xattr sgr -d com.apple.quarantine sgr

You can recursively remove the flag on all files in a directory:

xattr sgr -dr com.apple.quarantine sgr-pkg

You can also inspect it further:

# See value of specific flag
❯ xattr -p com.apple.quarantine sgr
0081;6232d9e5;Chrome;418DB2BC-FCB9-4806-852A-1144DFFC6CA0

# See value of all flags (note the metadata about original download location) (truncated by me)
❯ xattr -l com.apple.quarantine sgr
com.apple.metadata:kMDItemWhereFroms: bplist00_(https://objects.githubusercontent.com/github-production-release-asset-2e65be/153455875/08645532-6041-4acb-be59-40a5545b2b43?X-Amz-Algorithm=xxx&X-Amz-Credential=xxx&X-Amz-Date=xx&X-Amz-Expires=300&X-Amz-Signature=xx&X-Amz-SignedHeaders=host&actor_id=xx&key_id=0&repo_id=153455875&response-content-disposition=attachment%3B%20filename%3Dsgr-osx-x86_64&response-content-type=application%2Foctet-stream_<https://github.com/splitgraph/splitgraph/releases/tag/v0.3.7
com.apple.quarantine: 0081;6232d9e5;Chrome;418DB2BC-FCB9-4806-852A-1144DFFC6CA0

- Pass `--onedir` argument intead of `-F` when calling pyinstaller

Since spec file is just a Python file, add conditional logic
to use Python to create the appropriate `EXE()` and `COLLECT()`
TOCs in the spec file based on presence of `--onedir` flag.

Existing behavior is retained by default and will always build
a single-file executable, except when the `--onedir` flag is
contained in `sys.argv` of the `pyinstaller` command that is
processing the spec.
- Add steps to `osx_binary` stage to build multi-file executable
after also building the single-file executable, and also
upload that new artifact to the release

To support debugging, add pragma to workflow to build artifacts even if no tag

This will build all artifacts, but unless the commit is a tag,
it will not create a release or otherwise publish anything.

To enable this, also change `.ci/build_wheel.sh` to add support
for a `NO_PUBLISH` flag. When the `NO_PUBLISH` environment
variable is a non-empty value, do not error if PyPi
credentials are not included, and just run `poetry build`
- Default to downloading release at `sgr-osx-x86_64.tgz`
when on OS X. This downloads a tarball of the multi-file
executable creatd with `pyinstaller --onedir`, which means
a directory that includes the `sgr` executable and its
dependencies. This performs better than executables generated
with `--onefile` on OS X.
- Allow overriding and reverting to previous behavior (i.e.,
downloading the single-file executable which is still addressable
at `sgr-osx-x86_64` in the release bundle) with `FORCE_ONEFILE` flag
- Unzip the package to `~/.splitgraph/pkg/sgr` (overwrite if exists),
then symlink `~/.splitgraph/sgr -> ~/.splitgraph/pkg/sgr/sgr`, in
order to keep compatibility with existing scripts
- Note: The self-upgrade (`sgr upgrade`) code might still need
to be changed.
@milesrichardson
Copy link
Contributor Author

I rebased and cleaned up the commits. If it passes, I'd like to merge it into master. Then I'll follow-up with a separate PR for any changes that might be required for install.sh if it doesn't work e2e, and sgr upgrade to make it support multi-file executables.

@milesrichardson milesrichardson changed the title Draft: Add new artifact sgr-osx-x86_64.tgz to releases, compiled with pyinstaller --onedir, and default to it in install.sh on Darwin Add new artifact sgr-osx-x86_64.tgz to releases, compiled with pyinstaller --onedir, and default to it in install.sh on Darwin Mar 17, 2022
@milesrichardson
Copy link
Contributor Author

The build has passed. You may download the generated artifacts from the actions summary page. Note that all issues described in the description with com.apple.quarantine apply when downloading these artifacts from the browser, which you may feel forced to do, since curl cannot download them without a GitHub personal access token. You can grab the request from dev tools (cancel it before starting the download), and then execute it with curl if you do it within 60 seconds.

I have removed the Draft: prefix from this pull request, in order to signify it is ready to merge.

I will now continue with the operation of merging this pull request into the master branch.

@milesrichardson milesrichardson merged commit c96712b into master Mar 17, 2022
@milesrichardson milesrichardson deleted the bugfix/osx-pyinstaller-fixes branch March 17, 2022 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant