Reproducible build is not working. #7808

Closed
ngortheone opened this issue May 28, 2019 · 6 comments
@ngortheone

@Eric-Arellano @stuhood
I am using version 1.17.0.dev1 (5/22/2019):
https://www.pantsbuild.org/notes-master.html

[GLOBAL]
pants_version: 1.17.0.dev1

The release mentions the change I am interested in:

But my python_binary PEX archives still contain .pyc files and have a different SHA every time I build.

I have cleaned all caches and even reinstalled Pants. Are any additional changes required to activate reproducible builds?
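
Both symptoms can be checked directly on the built file. The following is a minimal sketch, assuming the PEX can be opened as an ordinary zip archive; the helper names `pyc_entries` and `sha256_of` are made up for illustration and are not part of Pants or pex.

```python
import hashlib
import zipfile


def pyc_entries(pex_path):
    """Return the names of all .pyc members in the archive at pex_path."""
    with zipfile.ZipFile(pex_path) as zf:
        return [name for name in zf.namelist() if name.endswith(".pyc")]


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For a reproducible build, `pyc_entries` should return an empty list and `sha256_of` should return the same digest for two consecutive builds of the same target.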

@ngortheone
Author

I should mention that all files have a Jan 1, 1980 timestamp, which is another part of the change.

self._builder.build(safe_path, bytecode_compile=False, deterministic_timestamp=True)

It looks like bytecode_compile=False has no effect.
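
For context, Jan 1, 1980 is the earliest date the zip format can represent, which is why it is a common choice for a fixed timestamp. The sketch below shows the general technique of writing a zip archive with stable member order, timestamps, and permissions. It is an illustration using only the standard library, not the code that Pants or pex actually runs, and `write_deterministic_zip` is a hypothetical name.

```python
import io
import zipfile

# Earliest timestamp the zip format can represent.
DETERMINISTIC_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def write_deterministic_zip(files):
    """Build a zip in memory from a {name: bytes} mapping with stable output."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Sort names so member order does not depend on insertion order.
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=DETERMINISTIC_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed file permissions
            zf.writestr(info, files[name])
    return buf.getvalue()
```

Fixing the timestamps alone is not enough: any member whose content changes between builds, such as a compiled `.pyc` file or embedded run metadata, still changes the archive bytes.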

@Eric-Arellano
Contributor

Thank you @ngortheone for opening this. I have been out of the office this week due to a family death, but will return to work tomorrow and will take a look at this.

@Eric-Arellano
Contributor

Should be fixed by #7841.

@Eric-Arellano
Contributor

Eric-Arellano commented Jun 3, 2019

Actually, the PEXes will still not be reproducible due to PEX-INFO containing runtime information, such as the timestamp.

{"always_write_cache": false, "build_properties": {"branch": "no-pyc", "buildroot": "/Users/eric/DocsLocal/code/projects/pants", "class": "CPython", "cmd_line": "pants --cache-ignore binary build-support/bin:shellcheck", "datetime": "Monday Jun 03, 2019 12:27:55", "default_report": "/Users/eric/DocsLocal/code/projects/pants/.pants.d/reports/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824/html/build.html", "id": "pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "machine": "Erics-MacBook-Pro.local", "path": "/Users/eric/DocsLocal/code/projects/pants", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "report_url": "http://localhost:49468/run/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "revision": "4948b9f87d4cf3856c89e3b456da3ce22c8c93f7", "timestamp": "1559590075.249843", "user": "eric", "version": "1.17.0rc0"}, "code_hash": "9e1954cf92346bc349859b93971d5be200a063f4", "distributions": {}, "emit_warnings": true, "entry_point": "shellcheck", "ignore_errors": false, "inherit_path": "false", "interpreter_constraints": ["CPython>=3.6"], "requirements": [], "zip_safe": true}

https://github.com/pantsbuild/pants/blob/d72fb81e0a2b8be81d22846047aefd480016dfa9/src/python/pants/backend/python/tasks/python_binary_create.py#L128-132

We have had that information since first introducing ./pants binary, so I think we need to introduce an option that allows us to toggle on and off the runtime information. Maybe --binary-py-include-runtime-information? Default to False, as I don't imagine many people are exploding their PEXes and inspecting the PEX-INFO for the runtime info like where the report was generated.
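
To see which fields actually vary between two builds, the `PEX-INFO` member can be read out of each archive and compared. This is a rough sketch, assuming `PEX-INFO` is a JSON file at the root of the archive as in the dump above; `read_build_properties` and `differing_keys` are hypothetical helper names.

```python
import json
import zipfile


def read_build_properties(pex_path):
    """Return the build_properties dict from the PEX-INFO member of a PEX."""
    with zipfile.ZipFile(pex_path) as zf:
        pex_info = json.loads(zf.read("PEX-INFO").decode("utf-8"))
    return pex_info.get("build_properties", {})


def differing_keys(props_a, props_b):
    """Return the sorted keys whose values differ between two property dicts."""
    keys = set(props_a) | set(props_b)
    return sorted(k for k in keys if props_a.get(k) != props_b.get(k))
```

Running `differing_keys` on the properties of two consecutive builds lists the entries that would need to be dropped or fixed for the archives to match.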

cc @stuhood @jsirois

@Eric-Arellano
Contributor

Opened #7843. Let's move the conversation there.

Eric-Arellano added a commit that referenced this issue Jun 4, 2019
#7841)

#7734 partially turned on reproducible PEX builds, but did not properly update every call site to avoid including `.pyc` files.

As decided there, we do not (at the moment) provide an opt-out to still include `.pyc` files at the expense of reproducible PEXes, because we deem this to be a sensible default.

Fixes part of #7808.
Eric-Arellano added a commit that referenced this issue Jun 4, 2019
…in `./pants binary` (#7843)

### Problem
With `./pants binary`, from the start we have injected our own information into `build_properties` to include the run information, which includes things like the `timestamp` and `report_url`.

For example, a `PEX-INFO` would include these build properties:
 
```python
"build_properties": {"branch": "no-pyc", "buildroot": "/Users/eric/DocsLocal/code/projects/pants", "class": "CPython", "cmd_line": "pants --cache-ignore binary build-support/bin:shellcheck", "datetime": "Monday Jun 03, 2019 12:27:55", "default_report": "/Users/eric/DocsLocal/code/projects/pants/.pants.d/reports/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824/html/build.html", "id": "pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "machine": "Erics-MacBook-Pro.local", "path": "/Users/eric/DocsLocal/code/projects/pants", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "report_url": "http://localhost:49468/run/pants_run_2019_06_03_12_27_55_249_2fd25ab784514e618dc2f05af46dd824", "revision": "4948b9f87d4cf3856c89e3b456da3ce22c8c93f7", "timestamp": "1559590075.249843", "user": "eric", "version": "1.17.0rc0"}
```

Not only are most of these irrelevant when shipping a PEX, but they also prevent any PEX created via `./pants binary` from being reproducible, meaning that #7734 and #7841 alone will not fully fix reproducible builds (#7808).

### Solution
Default to not including this run information, as it is generally not relevant to built PEXes and we do not anticipate the average user exploding the `.pex` file to inspect this data.

However, we do introduce a new option `--binary-py-include-run-information` that allows the original behavior. Because we have included this information in `PEX-INFO` from the start, some users may have come to depend on it and should have the option to keep the behavior.
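
The intended behavior can be sketched as a merge that only happens on request. This is an illustrative sketch, not the actual change in this commit: `make_build_properties` and its parameters are hypothetical, with `include_run_information` mirroring the new option.

```python
def make_build_properties(base_properties, run_info, include_run_information=False):
    """Merge run information into the base properties only when asked to.

    base_properties: stable values such as the interpreter class and platform.
    run_info: per-run values such as the timestamp and report URL.
    """
    properties = dict(base_properties)  # copy so the caller's dict is untouched
    if include_run_information:
        # Run information overrides base values that share a key.
        properties.update(run_info)
    return properties
```

With the default of `False`, the result depends only on the stable inputs, so two runs produce the same `build_properties`.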

### Result
With #7841 also included, two built PEXes will be byte-for-byte identical, e.g.

```bash
$ ./pants --cache-ignore binary build-support/bin:shellcheck; mv dist/shellcheck.pex 1.pex
$ ./pants --cache-ignore binary build-support/bin:shellcheck; mv dist/shellcheck.pex 2.pex
$ cmp 1.pex 2.pex
```

[`--cache-ignore` is used here to ensure that a new PEX is generated each time, rather than relying on the prior cached result.]

The `PEX-INFO` will now only include this much saner set of values for `build_properties`:

```python
 "build_properties": {"class": "CPython", "pex_version": "1.6.7", "platform": "macosx_10_14_x86_64", "version": [3, 6, 8]}
```
@ngortheone
Author

@Eric-Arellano when can I expect these changes in a dev release?
