Reproducible build is not working. #7808

Closed
ngortheone opened this issue May 28, 2019 · 6 comments

@ngortheone

commented May 28, 2019

@Eric-Arellano @stuhood
I am using version 1.17.0.dev1 (5/22/2019).
https://www.pantsbuild.org/notes-master.html

[GLOBAL]
pants_version: 1.17.0.dev1

The release mentions the change I am interested in:

  • Turn on reproducible PEX builds, e.g. for ./pants binary command (#7734) PR #7734

But my python_binary PEX archives still contain .pyc files, and their SHA is different on every build.

I have cleaned all caches and even re-installed pants. Please let me know whether any additional changes are required to activate reproducible builds.
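The two symptoms described above (leftover `.pyc` files and a changing hash) can be checked with a short script. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it relies on a `.pex` file being readable as a zip archive.

```python
import hashlib
import zipfile


def pex_fingerprint(pex_path):
    """Return the SHA-256 hex digest of the PEX file's raw bytes."""
    digest = hashlib.sha256()
    with open(pex_path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bytecode_entries(pex_path):
    """List archive members that are compiled bytecode (.pyc)."""
    with zipfile.ZipFile(pex_path) as zf:
        return [name for name in zf.namelist() if name.endswith(".pyc")]
```

Running both functions on the output of two consecutive builds shows whether the digests match and whether any bytecode was packaged.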

@ngortheone

Author

commented May 28, 2019

I should mention that all files have Jan 1 1980 timestamp, which is another part of the change.

self._builder.build(safe_path, bytecode_compile=False, deterministic_timestamp=True)

It looks like bytecode_compile=False has no effect
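To confirm the timestamp half of the change independently of the bytecode half, the archive metadata can be inspected directly. A minimal sketch, assuming the PEX opens as a zip archive; the helper name is hypothetical, and only the date (not the time of day) is compared because the observation above mentions only "Jan 1 1980".

```python
import zipfile

# The date reported in the comment above.
EXPECTED_DATE = (1980, 1, 1)


def entries_with_other_dates(pex_path, expected=EXPECTED_DATE):
    """Return names of archive members whose date is not the expected fixed date."""
    with zipfile.ZipFile(pex_path) as zf:
        # ZipInfo.date_time is (year, month, day, hour, minute, second).
        return [
            info.filename
            for info in zf.infolist()
            if info.date_time[:3] != expected
        ]
```

An empty list means every member carries the fixed date, so any remaining difference between builds must come from file contents.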

@Eric-Arellano

Contributor

commented May 29, 2019

Thank you @ngortheone for opening this. I have been out of the office this week due to a family death, but will return to work tomorrow and will take a look at this.

@Eric-Arellano

Contributor

commented Jun 3, 2019

Should be fixed by #7841.

@Eric-Arellano

Contributor

commented Jun 3, 2019

Actually, the PEXes will still not be reproducible due to PEX-INFO containing runtime information, such as the timestamp.

{"always_write_cache": false, "build_properties": {"branch": "no-pyc", "buildroot": "/Users/eric/DocsLocal/code/projects/pants", "class": "CPython", "cmd_line": "pants --cache-ignore binary build-support/bin:shellcheck", "datetime": "Monday Jun 03, 2019 12:27:55", "default_report": "/Users/eric/DocsLocal/code/projects/pants/.pants.d/reports/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824/html/build.html", "id": "pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "machine": "Erics-MacBook-Pro.local", "path": "/Users/eric/DocsLocal/code/projects/pants", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "report_url": "http://localhost:49468/run/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "revision": "4948b9f87d4cf3856c89e3b456da3ce22c8c93f7", "timestamp": "1559590075.249843", "user": "eric", "version": "1.17.0rc0"}, "code_hash": "9e1954cf92346bc349859b93971d5be200a063f4", "distributions": {}, "emit_warnings": true, "entry_point": "shellcheck", "ignore_errors": false, "inherit_path": "false", "interpreter_constraints": ["CPython>=3.6"], "requirements": [], "zip_safe": true}

https://github.com/pantsbuild/pants/blob/d72fb81e0a2b8be81d22846047aefd480016dfa9/src/python/pants/backend/python/tasks/python_binary_create.py#L128-132

We have had that information since first introducing ./pants binary, so I think we need to introduce an option that allows us to toggle on and off the runtime information. Maybe --binary-py-include-runtime-information? Default to False, as I don't imagine many people are exploding their PEXes and inspecting the PEX-INFO for the runtime info like where the report was generated.
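For anyone who wants to see what their own PEX carries, here is a rough sketch that reads `PEX-INFO` from the archive and picks out the run-specific values. The function names are hypothetical, and the `RUN_SPECIFIC_KEYS` grouping is an assumption based on reading the dump above, not a list defined by Pants or PEX.

```python
import json
import zipfile

# Assumption: keys from the dump above that describe the run, machine, or checkout.
RUN_SPECIFIC_KEYS = {
    "branch", "buildroot", "cmd_line", "datetime", "default_report", "id",
    "machine", "path", "report_url", "revision", "timestamp", "user",
}


def read_build_properties(pex_path):
    """Load PEX-INFO (a JSON file at the archive root) and return build_properties."""
    with zipfile.ZipFile(pex_path) as zf:
        pex_info = json.loads(zf.read("PEX-INFO").decode("utf-8"))
    return pex_info.get("build_properties", {})


def run_specific_properties(build_properties):
    """Keep only the entries that are expected to vary between runs."""
    return {k: v for k, v in build_properties.items() if k in RUN_SPECIFIC_KEYS}
```

If `run_specific_properties` returns anything, two builds of the same target are unlikely to hash identically, since values such as `timestamp` and `id` change on every run.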

cc @stuhood @jsirois

@Eric-Arellano

Contributor

commented Jun 3, 2019

Opened #7843. Let's move the conversation there.

Eric-Arellano added a commit that referenced this issue Jun 4, 2019

Fix .pyc files being included to partially get reproducible PEX builds (#7841)

#7734 partially turned on reproducible PEX builds, but did not properly update every call site to avoid including `.pyc` files.

As decided there, we do not (at the moment) provide an opt-out to still include `.pyc` files at the expense of reproducible PEXes, because we deem this to be a sensible default.

Fixes part of #7808.

Eric-Arellano added a commit that referenced this issue Jun 4, 2019

No longer default to saving non-deterministic run information to PEX in `./pants binary` (#7843)

### Problem
Since it was introduced, `./pants binary` has injected run information into `build_properties`, including values such as the `timestamp` and `report_url`.

For example, a `PEX-INFO` would include these build properties:
 
```python
"build_properties": {"branch": "no-pyc", "buildroot": "/Users/eric/DocsLocal/code/projects/pants", "class": "CPython", "cmd_line": "pants --cache-ignore binary build-support/bin:shellcheck", "datetime": "Monday Jun 03, 2019 12:27:55", "default_report": "/Users/eric/DocsLocal/code/projects/pants/.pants.d/reports/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824/html/build.html", "id": "pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "machine": "Erics-MacBook-Pro.local", "path": "/Users/eric/DocsLocal/code/projects/pants", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "report_url": "http://localhost:49468/run/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "revision": "4948b9f87d4cf3856c89e3b456da3ce22c8c93f7", "timestamp": "1559590075.249843", "user": "eric", "version": "1.17.0rc0"}
```

Not only are most of these irrelevant when shipping a PEX, but they also prevent any PEX created via `./pants binary` from being reproducible, meaning that #7734 and #7841 will not fully fix reproducible builds (#7808).

### Solution
Default to not including this run information, as it is generally not relevant to built PEXes and we do not anticipate the average user exploding the `.pex` file to inspect this data.

However, we do introduce a new option `--binary-py-include-run-information` that allows the original behavior. Because we have included this information in `PEX-INFO` from the start, some users may have come to depend on it and should have the option to keep the behavior.

### Result
With #7841 also included, two built PEXes will be byte-for-byte identical, e.g.

```bash
$ ./pants --cache-ignore binary build-support/bin:shellcheck; mv dist/shellcheck.pex 1.pex
$ ./pants --cache-ignore binary build-support/bin:shellcheck; mv dist/shellcheck.pex 2.pex
$ cmp 1.pex 2.pex
```

[`--cache-ignore` is used here to ensure that a new PEX is generated each time, rather than reusing the prior cached result.]
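When `cmp` reports a difference, it helps to know which archive members are responsible. The following is a sketch of one way to do that with the standard library; the function names are hypothetical, and `differing_members` compares member contents only, so metadata-only differences such as timestamps show up in `identical` but not in the member list.

```python
import filecmp
import zipfile


def identical(pex_a, pex_b):
    """Byte-for-byte comparison of two files, similar in spirit to `cmp`."""
    # shallow=False forces a content comparison instead of trusting os.stat() data.
    return filecmp.cmp(pex_a, pex_b, shallow=False)


def differing_members(pex_a, pex_b):
    """Return sorted names of members that are missing on one side or differ in content."""
    with zipfile.ZipFile(pex_a) as a, zipfile.ZipFile(pex_b) as b:
        names_a, names_b = set(a.namelist()), set(b.namelist())
        only_in_one = names_a ^ names_b
        changed = {n for n in names_a & names_b if a.read(n) != b.read(n)}
    return sorted(only_in_one | changed)
```

Called on the `1.pex` and `2.pex` files from the commands above, `differing_members` would be expected to point at `PEX-INFO` when run information is still being written.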

The `PEX-INFO` will now only include this much saner set of values for `build_properties`:

```python
 "build_properties": {"class": "CPython", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "version": [3, 6, 8]}
```
@ngortheone

Author

commented Jun 10, 2019

@Eric-Arellano when can I expect these changes in a dev release?
