Ability to Cross Compile #585

Closed
cancan101 opened this issue Oct 24, 2017 · 23 comments · Fixed by #651
Labels: enhancement (Improvements to functionality)

@cancan101

Support the ability to run pip-compile while specifying the OS/architecture that should be used for resolving dependencies. Currently it uses the OS where pip-compile is run. This causes issues such as #333. It also means that if a package does not exist for the current OS (e.g. tensorflow-gpu on MacOS), the compile fails.

Environment Versions
  1. OS Type
    MacOS
  2. Python version: $ python -V
    Python 3.5.3
  3. pip version: $ pip --version
    pip 9.0.1
  4. pip-tools version: $ pip-compile --version
    pip-compile, version 1.9.0
Steps to replicate
  1. Add tensorflow-gpu>=1.2 to requirements.in
  2. pip-compile
Expected result

A requirements.txt file with pinned deps (assuming that a proposed --arch manylinux1_x86_64 option was set).

Actual result
Could not find a version that matches tensorflow-gpu>=1.2
Tried: 0.12.0rc1, 0.12.0, 0.12.1, 1.0.0, 1.0.1, 1.1.0rc1, 1.1.0rc2, 1.1.0
@vphilippon (Member) commented Oct 24, 2017

It also means that if a package does not exist for the current OS (e.g. tensorflow-gpu on MacOS), the compile fails.

For the record, it's the responsibility of the package requiring tensorflow-gpu>=1.2 to specify that it's a Linux/Windows-only dependency if it doesn't exist on MacOS (assuming the package itself supports MacOS), and pip-compile would respect that (except in 1.10.0 and 1.10.1, where it's broken. It's fixed on master and should be part of 1.10.2, once a release is possible).

About having the ability to compile for a specific environment: it's interesting, but really hard to do well. It likely means having to trick pip into believing it's running in a given environment. And then we have the case of sdist packages (.zip, .tar.gz, etc.) that need to be built and could be unbuildable on the current OS (as in, running the setup.py could be impossible on the current OS).

In other words, I wouldn't expect this to be done soon. Contributions are always welcome, but I would point toward supporting the upcoming pip 10 first 😄.

@taion commented Oct 26, 2017

Yeah, TensorFlow's packaging is a little weird. What this ends up looking like is that we logically want to specify something in requirements.in like:

tensorflow-gpu==1.3.0; 'linux' in sys_platform
tensorflow==1.3.0; 'linux' not in sys_platform

But pip-compile then fails on OS X, because there's no tensorflow-gpu==1.3.0 there.

@vphilippon (Member) commented Nov 22, 2017

You should be able to do that currently (or something alike); environment markers are allowed in requirements.in. I'm not familiar with OS X: what's its sys_platform value?
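
It's easy to check from a REPL (a quick sketch; on macOS it reports 'darwin'):

import sys

# Prints the interpreter's platform tag: 'darwin' on macOS, 'linux' on Linux.
# This is the value that markers like "'linux' in sys_platform" match against.
print(sys.platform)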

@taion commented Nov 22, 2017

There is no tensorflow-gpu==1.3.0 at all for OS X on PyPI, so something weird happens. The second line works, though (at least with Pipenv).

From this and other issues on Pipenv, this isn't really addressable without some even more invasive hackery, so this is probably a CANTFIX.

I might poke around at this a bit on my own but it's not immediately obvious that there's a solution here that isn't ridiculously gnarly.

@vphilippon (Member)

But, if pip-compile respects the environment marker here (first line), then it shouldn't try to install that tensorflow-gpu==1.3.0 package on OS X.

pip-tools is supposed to respect the environment markers explicitly given in the requirements.in, so this really strikes me as odd.

Would you give me the pip-compile --rebuild --verbose output of that?

(Am I "fighting" to keep an issue open? I think I need to consult a professional....)

@taion commented Nov 23, 2017

Hah, not a problem. Here's what happens:

$ cat requirements.in
tensorflow-gpu==1.3.0; 'linux' in sys_platform
$ pip-compile --version
pip-compile, version 1.10.2
$ pip-compile --rebuild --verbose
Using indexes:
  https://pypi.python.org/simple

                          ROUND 1
Current constraints:
  tensorflow-gpu==1.3.0

Finding the best candidates:
  found candidate tensorflow-gpu==1.3.0 (constraint was ==1.3.0)

Finding secondary dependencies:
  tensorflow-gpu==1.3.0 not in cache, need to check index
Could not find a version that satisfies the requirement tensorflow-gpu==1.3.0 (from versions: 0.12.1, 1.0.0, 1.0.1, 1.1.0rc0, 1.1.0rc1, 1.1.0rc2, 1.1.0)
Traceback (most recent call last):
  File "/usr/local/bin/pip-compile", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/piptools/scripts/compile.py", line 184, in cli
    results = resolver.resolve(max_rounds=max_rounds)
  File "/usr/local/lib/python3.6/site-packages/piptools/resolver.py", line 102, in resolve
    has_changed, best_matches = self._resolve_one_round()
  File "/usr/local/lib/python3.6/site-packages/piptools/resolver.py", line 199, in _resolve_one_round
    for dep in self._iter_dependencies(best_match):
  File "/usr/local/lib/python3.6/site-packages/piptools/resolver.py", line 285, in _iter_dependencies
    dependencies = self.repository.get_dependencies(ireq)
  File "/usr/local/lib/python3.6/site-packages/piptools/repositories/pypi.py", line 152, in get_dependencies
    self._dependencies_cache[ireq] = reqset._prepare_file(self.finder, ireq)
  File "/usr/local/lib/python3.6/site-packages/pip/req/req_set.py", line 554, in _prepare_file
    require_hashes
  File "/usr/local/lib/python3.6/site-packages/pip/req/req_install.py", line 278, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/usr/local/lib/python3.6/site-packages/pip/index.py", line 514, in find_requirement
    'No matching distribution found for %s' % req
pip.exceptions.DistributionNotFound: No matching distribution found for tensorflow-gpu==1.3.0

pip-tools can deal with the package itself just fine, but it fails when it tries to grab the package to resolve dependencies.

@taion commented Nov 23, 2017

It's the same sort of problem as https://github.com/kennethreitz/pipenv/issues/857, though some of the problems there don't come up, given that pip-tools itself runs in the virtualenv rather than outside of it.

One mitigation in this case could be that, for packages that do upload their dependencies to PyPI (are these the packages that use twine?), we just use the stated dependencies from PyPI rather than download the package to resolve it.

This wouldn't solve the problem in full generality, but it would fix things for e.g. tensorflow-gpu. This would also fix @mpolden's specific problem in https://github.com/kennethreitz/pipenv/issues/857, actually, since APScheduler does in fact publish its install requirements to PyPI, though again it wouldn't fix the general case.

Though frankly Pipenv is a bit of a no-go for us anyway due to https://github.com/kennethreitz/pipenv/issues/966.

@vphilippon (Member)

My bad: the environment markers are simply copied over to the resulting requirements.txt, so it looks like it will still do the lookup and fail here. I have a hunch about how this could be fixed; maybe it wouldn't be so hard (famous last words) in our case. Although, don't hold your breath.

I would need to check whether PyPI actually provides an API to get those dependencies, but I doubt it.

@taion commented Nov 23, 2017

It's in the JSON payload; see info.requires_dist.
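
For example, roughly (a sketch; this uses the documented /pypi/<name>/<version>/json endpoint):

import json
from urllib.request import urlopen

# Fetch the declared metadata straight from the PyPI JSON API,
# without downloading or building the package itself.
url = "https://pypi.org/pypi/tensorflow-gpu/1.3.0/json"
with urlopen(url) as resp:
    info = json.load(resp)["info"]

# requires_dist holds PEP 508 requirement strings, markers included;
# it can be null when a package didn't publish that metadata.
for req in info["requires_dist"] or []:
    print(req)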

@taion commented Nov 23, 2017

I'm not sure if this API really lets you distinguish between "no dependencies" and "dependencies not published to PyPI", though. Maybe not that important in practice.

@vphilippon (Member)

Note to self: stop making "guesses" past 1:00 AM.
Thank you for the info, it's good to know; maybe we can make something out of this.

@taion commented Nov 23, 2017

Ah, I see it's not so straightforward in the code given how you hook into RequirementSet from Pip to do the lookup.

@flaub commented Nov 30, 2017

Actually, there's an evaluate() method on the markers attribute of an InstallRequirement. I don't know the best place for this call to be made, but my best guess is that in scripts/compile.py you could add a line like:

constraints = [x for x in constraints if not x.markers or x.markers.evaluate()]

This line could go just after collecting all the constraints from parsing the requirements, and just before Resolver.check_constraints(constraints).

Here?

Additionally, the evaluate() method takes an environment arg, which presumably means that a command-line arg to pip-compile could be used to specify the target environment (I don't know exactly what form the environment arg takes at this time).
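
For illustration, with the packaging library (which pip vendors), marker evaluation against an overridden environment looks roughly like this; the environment keys are PEP 508 variable names, and any CLI wiring around it is hypothetical:

from packaging.markers import Marker

marker = Marker("'linux' in sys_platform")

# Evaluated against the running interpreter's environment by default...
print(marker.evaluate())

# ...or against an explicit target environment, e.g. pretending to be macOS.
print(marker.evaluate(environment={"sys_platform": "darwin"}))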

@taion commented Nov 30, 2017

I think there's still the problem that we literally can't evaluate the transitive dependencies for a package that we can't install/download, though.

The bottleneck here isn't really the evaluation – it's that unless we try to read the deps from the PyPI API (instead of using pip's approach), we don't have a way to get transitive deps at all for non-installable packages.

@flaub commented Nov 30, 2017

No, check this out. Say I have a requirements.txt like this:

cairocffi
editdistance
h5py>=2.7.0
keras==2.0.8
pillow; platform_machine == 'armv7l'
pillow-simd; platform_machine != 'armv7l'
requests==2.18.4
scikit-learn[alldeps]
sklearn
tensorflow-gpu==1.3.0; platform_machine != 'armv7l' and platform_system != 'Darwin'
theano

Now let's say I use the following code snippet (which is pieced together from a REPL session and basically emulates what pip-compile is doing):

import optparse

import pip
from pip.req import parse_requirements

from piptools.repositories.pypi import PyPIRepository
from piptools.resolver import Resolver


class PipCommand(pip.basecommand.Command):
    name = 'PipCommand'


def main():
    pip_command = get_pip_command()
    pip_args = []
    pip_options, _ = pip_command.parse_args(pip_args)

    session = pip_command._build_session(pip_options)
    repository = PyPIRepository(pip_options, session)

    constraints = list(
        parse_requirements(
            'requirements.txt',
            finder=repository.finder,
            session=repository.session,
            options=pip_options))

    Resolver.check_constraints(constraints)
    resolver = Resolver(constraints, repository)
    results = resolver.resolve()
    
    import pprint
    pprint.pprint(results)


def get_pip_command():
    # Use pip's parser for pip.conf management and defaults.
    # General options (find_links, index_url, extra_index_url, trusted_host,
    # and pre) are deferred to pip.
    pip_command = PipCommand()
    index_opts = pip.cmdoptions.make_option_group(
        pip.cmdoptions.index_group,
        pip_command.parser,
    )
    pip_command.parser.insert_option_group(0, index_opts)
    pip_command.parser.add_option(optparse.Option('--pre', action='store_true', default=False))

    return pip_command


if __name__ == '__main__':
    main()

If you run this, you get the exact same error as reported. However, if we now filter out the constraints that don't match the specified markers in requirements.txt, the resolver is happy (and so is the repository). This is accomplished by adding the following line just before the call to Resolver.check_constraints(constraints):

constraints = [x for x in constraints if not x.markers or x.markers.evaluate()]

We are telling pip-compile to honor the markers specified in the top-level requirements passed in. This doesn't handle markers on transitive dependencies that might not match the platform, but that doesn't matter if the top-level ones are properly specified.

@taion commented Nov 30, 2017

Oh, hey, scikit-learn[alldeps]! Adding that was probably among the least favorite PRs I've ever made 🤣

So, that does work, but it's not exactly what I want. Ideally, I'd like for this package (and its exclusive dependencies) to show up in my generated requirements.txt, with the appropriate markers.

Imagine I started with:

tensorflow-gpu; 'linux' in sys_platform
tensorflow; 'linux' not in sys_platform

I'd want something like:

bleach==1.5.0             # via bleach, tensorflow-tensorboard
enum34==1.1.6             # via enum34, tensorflow
html5lib==0.9999999       # via bleach, html5lib, tensorflow-tensorboard
markdown==2.6.9           # via markdown, tensorflow-tensorboard
numpy==1.13.3             # via numpy, tensorflow, tensorflow-tensorboard
protobuf==3.5.0.post1     # via protobuf, tensorflow, tensorflow-tensorboard
six==1.11.0               # via bleach, html5lib, protobuf, six, tensorflow, tensorflow-tensorboard
tensorflow-gpu==1.4.0; 'linux' in sys_platform
tensorflow-tensorboard==0.4.0rc3  # via tensorflow, tensorflow-tensorboard
tensorflow==1.4.0; 'linux' not in sys_platform
werkzeug==0.12.2          # via tensorflow-tensorboard, werkzeug
wheel==0.30.0             # via tensorflow, tensorflow-tensorboard, wheel

For carrying through dependencies transitively, suppose I had:

six
tensorflow; 'linux' not in sys_platform

Then I would want something like:

bleach==1.5.0; 'linux' not in sys_platform
enum34==1.1.6; 'linux' not in sys_platform
html5lib==0.9999999; 'linux' not in sys_platform
markdown==2.6.9; 'linux' not in sys_platform
numpy==1.13.3; 'linux' not in sys_platform
protobuf==3.5.0.post1; 'linux' not in sys_platform
six==1.11.0
tensorflow-tensorboard==0.4.0rc3; 'linux' not in sys_platform
tensorflow==1.4.0; 'linux' not in sys_platform
werkzeug==0.12.2; 'linux' not in sys_platform
wheel==0.30.0; 'linux' not in sys_platform

@flaub commented Nov 30, 2017

I see, sorry for the rat hole, carry on :)

@vphilippon (Member)

Proper environment marker handling from the requirements.in was added in 2.0.0, and I'm currently documenting the "official" stance of pip-tools regarding cross-environment usage.

In short, pip-compile must be executed in each target environment. We have the same issues described in this article about PyPI regarding the execution of setup.py. We cannot safely and consistently know the dependencies required for a Linux installation while on a Windows installation, for example.

So in the current state of things, it's a dead end. If someday there's a deterministic way to know the dependencies of any package without ever having to execute possibly environment-dependent code, then it'll be doable.

@altendky commented Jul 2, 2019

I decided to solve this for myself by just dumping the locking into Azure Pipelines and keeping a per-platform requirements.txt output. I also happen to have multiple groupings (base, testing, dev, for example). boots takes care of the pip-syncing from the proper platform/group file and also delegates the remote locking to romp, which basically allows arbitrary execution in Azure without building, committing, and pushing a custom CI config.

Obviously it would be nice for packages to be processable on all platforms but I decided not to wait for that to happen.

https://github.com/altendky/boots
https://github.com/altendky/romp

@karypid commented Nov 30, 2019

Hi all.

I came across this issue while looking for info on how to use pip-tools across mac/win32/linux. I started following the approach of running pip-compile on each platform and maintaining separate .txt files, for example:

pip-compile --allow-unsafe --upgrade --build-isolation --generate-hashes --output-file .\requirements\win32-py3.7-main.txt .\requirements\main.in

and

pip-compile --allow-unsafe --upgrade --build-isolation --generate-hashes --output-file ./requirements/linux-py3.7-main.txt ./requirements/main.in

What is the suggestion for compiling an add-on dev.in and constraining it to the set of main.in requirements for the applicable platform? I am currently forced to use multiple files, as in:

# This is: linux-py3.7-dev.in:
-c linux-py3.7-main.txt
pylint

# This is: win32-py3.7-dev.in:
-c win32-py3.7-main.txt
pylint

I have resorted to having a single dev.in file WITHOUT the -c {platform}-{python_ver}-main.txt line, and to using a script that detects the running platform and creates a temporary file (linux-py3.7-dev.in, win32-py3.7-dev.in, ...) containing the appropriate line for referencing the proper main.txt file.
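
A rough sketch of what that wrapper might look like (hypothetical; the actual script isn't shown here, and file names follow the pattern above):

import subprocess
import sys
from pathlib import Path

# Hypothetical wrapper: synthesize the platform-specific dev.in on the fly,
# pointing -c at the matching pre-compiled main.txt for the running platform.
tag = "{}-py{}.{}".format(sys.platform, *sys.version_info[:2])

dev_in = Path("{}-dev.in".format(tag))
dev_in.write_text("-c {}-main.txt\n".format(tag) + Path("dev.in").read_text())

subprocess.run(
    ["pip-compile", "--allow-unsafe", "--generate-hashes",
     "--output-file={}-dev.txt".format(tag), str(dev_in)],
    check=True,
)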

Any suggestions/plans on how to best approach this?

@AndydeCleyre (Contributor)

I'm inviting anyone still interested in multi-environment compilation and sync workflows to pick up discussion @ #826.

@cheind commented May 5, 2022

@karypid I'm using the following approach to get cross-platform compatible requirements files with constraint support:

Assuming the following requirements files:

# requirements.in
django

# dev-requirements.in
-c {python}-{platform}-{machine}-requirements.txt
django-debug-toolbar

(Note the special constraint placeholder syntax.)

I then invoke my platform_generate.py script on each platform as follows

python platform_generate.py requirements.in dev-requirements.in

to get platform-specific .in files, e.g.

py3.9-linux-x86_64-requirements.in
py3.9-linux-x86_64-dev-requirements.in

Inspecting py3.9-linux-x86_64-dev-requirements.in reveals

-c py3.9-linux-x86_64-requirements.txt
django-debug-toolbar

From here on, use pip-compile/pip-sync as usual. Alternatively, platform_generate.py also supports a --compile switch to automatically call pip-compile once the platform-specific .in files are generated.
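
The expansion step itself could be as small as this (a sketch; the real platform_generate.py isn't reproduced in the thread):

import platform
import sys
from pathlib import Path

# Replace the {python}-{platform}-{machine} placeholder with a tag such as
# "py3.9-linux-x86_64" and write a prefixed copy of each given .in file.
tag = "py{}.{}-{}-{}".format(
    sys.version_info.major,
    sys.version_info.minor,
    sys.platform,
    platform.machine().lower(),
)

for name in sys.argv[1:]:
    text = Path(name).read_text()
    text = text.replace("{python}-{platform}-{machine}", tag)
    Path("{}-{}".format(tag, name)).write_text(text)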

@asparagusbeef commented Aug 4, 2024

We are developing on Windows but deploying in a Linux Docker environment. My solution was to create a container that imitates our production environment, mount the working directory into it, and generate the requirements.txt inside it. Basically:
Dockerfile:

FROM python:3.11-slim-bullseye

RUN pip install pip-tools

WORKDIR /app

COPY requirements.in /app/

CMD ["pip-compile", "--output-file=requirements.txt", "--strip-extras", "requirements.in"]

Makefile:

.PHONY: dev compile

dev:
	pip-compile --output-file=requirements-dev.txt --strip-extras requirements-dev.in requirements.in && \
	pip-sync requirements-dev.txt && \
	black . && \
	isort . --profile black

compile:
	docker build -t pip-compile-env -f ../../setup/Dockerfile.compile .
	powershell -command "docker run --rm -v \"$$(Get-Location):/app\" pip-compile-env"
	docker rmi pip-compile-env
	@echo "requirements.txt has been generated in the current directory."

To avoid rebuilding the image every time:

compile:
	@powershell -Command "if (-Not (docker images -q pip-compile-env)) { \
		Write-Output 'Image pip-compile-env not found. Building...'; \
		docker build -t pip-compile-env -f ../../setup/Dockerfile.compile .; \
	} else { \
		Write-Output 'Image pip-compile-env already exists. Skipping build...'; \
	}"
	@powershell -command "docker run --rm -v \"$$(Get-Location):/app\" pip-compile-env"
	@echo "requirements.txt has been generated in the current directory."
