Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardise on pip_parse #807

Merged
merged 16 commits into from
Sep 24, 2022
Merged

Standardise on pip_parse #807

merged 16 commits into from
Sep 24, 2022

Conversation

groodt
Copy link
Collaborator

@groodt groodt commented Aug 27, 2022

Less extreme version of the proposal outlined here:
#757


What is changing?

This PR standardises on pip_parse behaviour as the only mechanism for installation of 3rd-party python packages. The internals of pip_install have been changed to use pip_parse behind the scenes, with a small compatibility shim over rule attribute names to ensure minimal disruption to any existing users of pip_install.

Why is it changing?

Historically, we had 2 rules to install packages from PyPI (pip_install and pip_parse), with pip_install being the original. It worked well, but had a significant flaw in that it would eagerly fetch and install all packages. This was slow and frustrating when there were a large number of dependencies, so pip_parse was born!

pip_parse (even though it is still a repository rule) lazily fetched and installed packages. This was adopted by most of the community based on evidence from issues raised, the bazel Slack channels and the rule maintainers.

Maintaining both versions should no longer be necessary. There is a lot of duplication in both versions and it requires extra effort and time. This increases maintenance burden and significantly slows progress to improve the rules.

Breaking Changes:

  • A small breaking change in this PR, will be for users (probably a small number) who still use a pip_install version prior to this PR and who DO NOT use the requirement() macro to reference dependencies. The naming style of the external repositories has changed, so this will require a sed across any .bazel files if these users upgrade. This isn't a big deal in my opinion.
  • Users of pip_install will need to add the calls as you do with pip_parse e.g.
load("@pip//:requirements.bzl", "install_deps")

install_deps()

@groodt
Copy link
Collaborator Author

groodt commented Aug 29, 2022

@f0rmiga for early review before opening up to other maintainers.

@f0rmiga
Copy link
Collaborator

f0rmiga commented Aug 30, 2022

The breaking change is acceptable. Gazelle can produce the right label for users who rely on it. I have not reviewed the code yet, thought. Will do next.

docs/pip.md Outdated Show resolved Hide resolved
python/pip.bzl Outdated Show resolved Hide resolved
python/pip.bzl Outdated Show resolved Hide resolved
# wheel, by default, enables debug symbols in GCC. This incidentally captures the build path in the .so file
# We can override this behavior by disabling debug symbols entirely.
# https://github.com/pypa/pip/issues/6505
if "CFLAGS" in os.environ:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be careful with environment variables for the C and C++ compilers. Ideally, we should instruct users to set these instead of doing them ourselves. Setting -g0 is only part of making them deterministic.

If we determine that we actually want to mess with them, then setting CPPFLAGS is where we want to start.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I'm not really confident in these either. My personal preference would be to not have any sdist being built in repository rules at all and encourage "binary-only" wheel installations instead.

All I've done here is to bring over the existing function that was already being called in both pip_install and pip_parse that was previously in extract_wheels.py (which will be deleted in this PR).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to keep it this way for now. I thought I had commented on this already. We can make improvements to this part on a follow-up PR.

@groodt groodt requested a review from f0rmiga August 31, 2022 10:39
@groodt
Copy link
Collaborator Author

groodt commented Sep 9, 2022

Any further comments @f0rmiga ?

@alexeagle
Copy link
Collaborator

Will users run into a problem with passing uncompiled requirements to pip_parse? Do you get an error if your requirements.txt just contains "pep8" or something?

@groodt
Copy link
Collaborator Author

groodt commented Sep 13, 2022

Will users run into a problem with passing uncompiled requirements to pip_parse? Do you get an error if your requirements.txt just contains "pep8" or something?

Yes, folks will get a clear error message instructing them what to do:

The `requirements_lock` file must be fully pinned. See `compile_pip_requirements`.
Alternatively, use `pip-tools` or a similar mechanism to produce a pinned lockfile.

The following requirements were not pinned:

)
# buildifier: disable=print
print("pip_install is deprecated. Please switch to pip_parse. pip_install will be removed in a future release.")
pip_parse(requirements = requirements, name = name, **kwargs)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do something like pip_parse(requirements_lock = requirements, ... here so that you don't need to add a requirements parameter to pip_parse?

I'm not at all involved with rules_python, just excited about the feature!

@hrfuller
Copy link
Contributor

Yes, folks will get a clear error message instructing them what to do:

The requirements_lock file must be fully pinned. See compile_pip_requirements.
Alternatively, use pip-tools or a similar mechanism to produce a pinned lockfile.

The following requirements were not pinned:

The pinned-ness isn't the only issue. pip_parse does an intransitive resolve. So its almost certain a requirements.txt that was used with pip_install won't work with pip_parse, if pip_install is an alias of pip_parse.

I'm not necessarily opposed to the change but it is as breaking as the original proposal AFAICT.

@groodt
Copy link
Collaborator Author

groodt commented Sep 16, 2022

So its almost certain a requirements.txt that was used with pip_install won't work with pip_parse, if pip_install is an alias of pip_parse.

Can you explain what you mean? This is literally what we did at $dayjob and it worked fine for us. The recommendation has always been to use a fully pinned requirements.txt and I was the person who added the checks for it in pip_parse.

It's unlikely somebody will have a fully-pinned, but not fully-locked requirements.txt and we can easily add that to the release notes or error message. Whats your recommendation?

@hrfuller
Copy link
Contributor

hrfuller commented Sep 16, 2022

My point has to do with fully-transitively resolved and pinned lock file (what we expect in pip_parse) vs a fully-pinned requirements.txt that only has top level distributions. pip_install would run pip to resolve the transitive set, where pip_parse expects the requirements file to already be fully transitively resolved, hence the compile-pip-requirements thing. See https://github.com/bazelbuild/rules_python/blob/main/python/pip_install/extract_wheels/extract_single_wheel.py#L41 --no-deps is the difference.

My recommendation is that we consider that this change will not be transparent to most pip_install users unless we also plumb up some compile_pip_requirements to convert the top level requirements.txt into a real requirements_lock.txt. Builds will fail at analysis time with undefined labels pointing to other external repos for transitive deps otherwise.

@groodt
Copy link
Collaborator Author

groodt commented Sep 16, 2022

a fully-pinned requirements.txt that only has top level distributions

Got it. Do you expect those to be common? It's quite unlikely those individuals will have a good bazel experience because their results would be non-deterministic as resolution results would change when different package versions are published (of the non-direct dependencies). I would go so far as to say that it's a bug if pip_install doesn't require a transitively locked file at the moment.

compile-pip-requirements isn't only for pip_parse. It was added as a convenience for both pip_parse and pip_install.

Seems like a possible, but unlikely (and generally unsupportable?) situation to me.

@groodt
Copy link
Collaborator Author

groodt commented Sep 16, 2022

Is the suggestion you are making @hrfuller to rather NOT try to reuse the existing requirements.txt (even though many will be transitively locked)? I can get behind that. I don't mind. I just thought it could help users with existing requirements.txt that are locked (which has also been the recommendation and most should have or they would see lots of cache misses and non-determinism).

@hrfuller
Copy link
Contributor

hrfuller commented Sep 17, 2022

Is the suggestion you are making @hrfuller to rather NOT try to reuse the existing requirements.txt

Not really, just pointing out that the current workaround won't work for some users. Maybe we should just document that fact and try to give the nice helpful message about compile-pip-requirements to them, and move on. I'm sure it's not common at large companies to use an unlocked requirements file, but it is much more tenable for small repos.

@groodt
Copy link
Collaborator Author

groodt commented Sep 20, 2022

Maybe we should just document that fact

@hrfuller Ok, that works for me. Happy if I just add a note to the deprecation paragraph in README.md?

@groodt
Copy link
Collaborator Author

groodt commented Sep 22, 2022

@hrfuller I've added a small note to the README. LGTY?

The maintainers have taken all reasonable efforts to faciliate a smooth transition, but some users of `pip_install` will
need to replace their existing `requirements.txt` with a fully resolved set of dependencies using a tool such as
`pip-tools` or the `compile_pip_requirements` repository rule.

@groodt groodt merged commit f0efec5 into main Sep 24, 2022
@groodt groodt deleted the groodt-only-pip-parse branch September 24, 2022 20:04
@alexeagle
Copy link
Collaborator

Cool, @groodt we should make sure the next release includes a breaking change section in the release notes as well to make this discoverable

oncilla added a commit to scionproto/scion that referenced this pull request Oct 3, 2022
Update the rules_python to the latest release and register the hermetic python interpreter with version 3.10.
This also removes the usage of the deprecated pip_install (see bazelbuild/rules_python#807)
Note that rules_python is still requires an interpreter on the host to bootstrap (see bazelbuild/rules_python#691).

This should remove our reliance on the host interpreter as much as possible if the python tools are executed with bazel.
E.g., when running an acceptance test, when linting, or when running the topology generator with bazel run.
benthor pushed a commit to benthor/scion that referenced this pull request Nov 24, 2022
Update the rules_python to the latest release and register the hermetic python interpreter with version 3.10.
This also removes the usage of the deprecated pip_install (see bazelbuild/rules_python#807)
Note that rules_python is still requires an interpreter on the host to bootstrap (see bazelbuild/rules_python#691).

This should remove our reliance on the host interpreter as much as possible if the python tools are executed with bazel.
E.g., when running an acceptance test, when linting, or when running the topology generator with bazel run.
ianpegg-bc pushed a commit to ianpegg-bc/rules_python that referenced this pull request May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants