Fix sdist installation #3080

ddelange · 2023-06-22T07:05:05Z

This PR ~~removes~~ amends the 'pip inside pip' anti-pattern and updates the PyPI description:

To install, please execute the following:
pip install tensorrt --extra-index-url https://pypi.nvidia.com
Or add the index URL to the (space-separated) PIP_EXTRA_INDEX_URL environment variable:
export PIP_EXTRA_INDEX_URL='https://pypi.nvidia.com'
pip install tensorrt
When the extra index url does not contain https://pypi.nvidia.com, a nested pip install will run with the proper extra index url hard-coded.

I tried to find the source code of ERROR.txt inside tensorrt_libs sdist on official PyPI (so I could update the message with the PIP_EXTRA_INDEX_URL env var), but couldn't find it. Can you update that error message accordingly? Similarly this one will also need an update: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

zeroepoch · 2023-06-24T05:20:15Z

@ddelange, this is going backwards in time to what we did before we published to pypi.org and the opposite of what we want. It would be good to know why it fails to install in your CI/CD environment, but this change now breaks everyone who has tensorrt in their requirements.txt file without an extra index URL or anyone who is simply using pip install tensorrt. There were other "magic" ways to force certain dependencies to install from another server, but the version specifier with URL included is not allowed on pypi.org. This is really an issue with the Python Package Index not allowing external hosting to support larger packages, or a way to reference another server without adding new indexes for all packages. They really need something like Docker has with its pull syntax. This is why PyTorch has to use their complex install procedure to work around the package size limit of 1 GB. I 100% agree this is a total hack to work around the limits of pypi.org, but for the common case where a user just uses pip install tensorrt for their system environment or python3 venv, it magically selects the right servers for them.

zeroepoch · 2023-06-24T05:26:16Z

If the issue is that this pip-inception flow is breaking due to the internal pip invocation, then maybe we can try to detect whether you have already set the correct environment variables to the right values and, if so, skip this hackiness?

ddelange · 2023-06-24T07:46:37Z

Ah, if that's the issue: I think you can simply request they increase the file size cap for your specific package: pypa/packaging-problems#109

torch also has 600+ MB wheels on pypi: https://pypi.org/project/torch/2.0.1/#files

ddelange · 2023-06-24T10:39:32Z

@zeroepoch please see the latest commit: I switched to PEP440 direct references based on the wheels I can see at https://pypi.nvidia.com/tensorrt-bindings/ and https://pypi.nvidia.com/tensorrt-libs/.

Like this, there is no more need for the index url, as it basically mimics pip's mechanics of finding the right wheel.

How is that?
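For readers unfamiliar with the syntax, a direct reference pins a requirement to an explicit URL using the form `name @ url`. The minimal sketch below only builds such a requirement string. The helper `direct_reference` and the wheel URL are hypothetical placeholders for illustration, not the actual files on pypi.nvidia.com or the code from the commit.

```python
def direct_reference(name, url):
    """Format a direct-reference requirement string: 'name @ url'."""
    return "{} @ {}".format(name, url)


# Hypothetical placeholder URL, for illustration only (not a real wheel).
EXAMPLE_WHEEL_URL = "https://example.invalid/tensorrt_libs-0.0.0-py2.py3-none-any.whl"

# A setup.py would pass a list like this to setup(install_requires=...).
install_requires = [direct_reference("tensorrt_libs", EXAMPLE_WHEEL_URL)]
print(install_requires[0])
```

Because the URL is part of the requirement itself, no extra index URL is needed to locate the file.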

zeroepoch · 2023-06-24T13:35:45Z

@ddelange this was one of the approaches we first considered, but then another team member mentioned PyPI would reject it. See this paragraph from PEP 440:

> Public index servers SHOULD NOT allow the use of direct references in uploaded distributions. Direct references are intended as a tool for software integrators rather than publishers.

It's annoying because this would be perfect for this problem.

We did request an increase for our file size and it was approved. Then we ran into the project limit, which was 10 GB. We requested an increase for that and it was approved. Then we ran into the project size limit again. That wasn't really a sustainable solution, and the libraries unfortunately keep getting bigger each release. So we took the next step, which was to break out the libraries to avoid duplication and reduce storage size, but then we have these hacky sdist files to piece it all together depending on Python version.

I appreciate you looking into alternatives since we're not happy with this solution either.

ddelange · 2023-06-24T13:50:08Z

wow, that's a journey for the books... have you considered hosting a zip attached to each GitHub Release?

and then for instance employ a mechanic similar to nltk, it auto-downloads on import (by default kwarg), or you can run their entry_points to nltk download after install

zeroepoch · 2023-06-24T13:54:52Z

If we had some way to prevent the internal pip call so you could manage the index URL or installation of dependencies yourself, would that resolve the issue for you? Normal users who just want to install TensorRT using the system Python would see a simple experience. Advanced users who have issues with the internal pip call breaking their installation or CI/CD can apply some workaround to skip this internal step and then you can just pip install the 3 packages yourself.

ddelange · 2023-06-24T19:04:18Z

I had a deeper look at the logs linked in the issue. The environment variables look good, pip should (and does) pick up the pip 'symlink' from the venv. Maybe the env vars get dropped in your subprocess, and it would help if the invocation explicitly propagates the current env to sys.executable?

zeroepoch · 2023-06-24T19:25:45Z

Normally a subprocess should inherit its parent's environment but maybe it's filtering certain things out or it's the way pip is invoked. Maybe we could explicitly copy the environment and set it for the sub-pip call.
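A minimal sketch of what "explicitly copy the environment" could look like. It runs a child Python process rather than pip so that it stays self-contained; in the real setup.py the command would presumably be something like `[sys.executable, "-m", "pip", "install", ...]`. The helper name `run_child_with_env` is an assumption made for illustration.

```python
import os
import subprocess
import sys


def run_child_with_env(code, extra_env=None):
    """Run `python -c code` with an explicit copy of the parent's environment."""
    env = os.environ.copy()  # explicit copy instead of relying on inheritance
    env.update(extra_env or {})
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return out.decode().strip()


# The child sees the variable because it is passed explicitly via `env`.
seen = run_child_with_env(
    "import os; print(os.environ.get('PIP_EXTRA_INDEX_URL', ''))",
    {"PIP_EXTRA_INDEX_URL": "https://pypi.nvidia.com"},
)
print(seen)
```

Passing `env=` makes the propagation explicit, which helps rule out environment filtering as the cause of a failing nested call.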

ddelange · 2023-06-24T23:09:50Z

> Normally a subprocess should inherit its parent's environment

yes, it should and it does. I even rely on it in pipgrip and invoke the same way as TensorRT: https://github.com/ddelange/pipgrip/blob/0.10.4/src/pipgrip/pipper.py#L106

the logs from autogluon CI simply don't make sense to me. food for thought...

ddelange · 2023-06-24T23:17:03Z

and what a catch 22 by the way pypa/pip#6301

zeroepoch · 2023-06-25T00:46:50Z

> and what a catch 22 by the way pypa/pip#6301

Not surprised to see others wanting this feature as well and I also don't understand the security argument. I guess if you only trust pypi.org then maybe you could say that, but what TensorRT is doing as an alternative shows it's not really that secure to begin with. I feel like this could properly be supported with a pedantic or strict flag if you didn't want any additional sources. If the URLs are directly in the requirements it's easier to audit and restrict.

FYI pypi/support#2609 (comment). We could have taken an alternative approach and downloaded only the libraries, but doing it with pip allows more flexibility. You can actually install just the bindings from the NVIDIA index if you already have the libraries installed another way. Previously, getting those wheels required the tar package.


ddelange · 2023-06-26T05:13:30Z

I've reverted the direct references commit per your comment.

Maybe you want to keep this PR open as reference PR / proper implementation for the issue, or for the next major version release (where some action of the user may be expected).

I've notified the aws/autogluon team about the current impasse, still no idea why 'python says no'...

zeroepoch · 2023-06-26T14:18:55Z

@pranavm-nvidia any ideas on better ways to solve this problem for autogluon? I was thinking maybe an environment variable to disable the internal pip call and then users can use --extra-index-url to install the 3 modules themselves if they run into problems.

pranavm-nvidia · 2023-06-26T16:51:29Z

An alternative is to install with --no-deps and then install tensorrt-libs and tensorrt-bindings separately with the --extra-index-url option. That wouldn't require any changes in the TRT wheels.

ddelange · 2023-06-26T17:50:18Z

does the current hack respect --no-deps? I thought that flag only leads to ignoring install_requires, but I haven't tested that on tensorrt 8.6.1 tbh

pranavm-nvidia · 2023-06-26T17:55:30Z

Ah you're right, my mistake. An environment variable would work then, though it would be a bit nicer if we could somehow intercept the --no-deps option and respect that.

ddelange · 2023-06-26T18:00:32Z

if the user has exported PIP_EXTRA_INDEX_URL='https://pypi.abc.de https://pypi.nvidia.com' (space-separated), and setup.py detects that, it could have install_requires like in this PR. if not, the pip hack could kick in?

ddelange · 2023-06-26T18:07:59Z

if "pypi.nvidia.com" in os.environ.get("PIP_EXTRA_INDEX_URL", ""):
    install_requires = [...]
    cmdclass = {}
else:
    install_requires = []
    cmdclass = {"install": ...}
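A runnable variant of the idea above, as a sketch only. The function name `choose_install_strategy`, the unpinned dependency names, and the boolean flag standing in for the custom `cmdclass` are illustrative assumptions, not the code that was merged.

```python
NVIDIA_INDEX_HOST = "pypi.nvidia.com"


def choose_install_strategy(environ):
    """Return (install_requires, use_internal_pip) based on PIP_EXTRA_INDEX_URL."""
    extra_index = environ.get("PIP_EXTRA_INDEX_URL", "")
    if NVIDIA_INDEX_HOST in extra_index:
        # Index already configured: let pip resolve the dependencies normally.
        # The unpinned names below are placeholders for illustration.
        return ["tensorrt_libs", "tensorrt_bindings"], False
    # Index not configured: keep install_requires empty and use the nested pip call.
    return [], True


print(choose_install_strategy({"PIP_EXTRA_INDEX_URL": "https://pypi.nvidia.com"}))
print(choose_install_strategy({}))
```

Taking the environment as a parameter (for example `os.environ`) keeps the decision easy to test without touching the real process environment.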

python/packaging/frontend_sdist/setup.py


ddelange · 2023-07-04T21:32:06Z

@zeroepoch anything from your side?

zeroepoch · 2023-07-06T04:20:18Z

python/packaging/frontend_sdist/setup.py

+    pid = os.getppid()
+    # try retrieval using psutil
+    try:
+        import psutil


I don't know what the install base looks like for this Python module, but it does look like the best solution if available.

it has zero dependencies, but not all platform specific wheels are available and building it from source will fail e.g. on graviton processors ref giampaolo/psutil#2103

zeroepoch · 2023-07-06T04:27:33Z

python/packaging/frontend_sdist/setup.py

+    # fall back to shell
+    try:
+        return (
+            subprocess.check_output(["ps", "-p", str(pid), "-o", "command"])


Adding --no-headers avoids the need for the split(). I checked both CentOS 7 and Ubuntu 18.04 containers: this option has been supported since at least those releases, and TRT supports those OSes and newer.
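Putting the two review snippets together, a best-effort lookup of the parent command line might look like the sketch below. It tries psutil first and falls back to `ps` with `--no-headers`. The error handling and the empty-string fallback are assumptions made here for illustration, not necessarily what the PR does.

```python
import os
import subprocess


def parent_command_line():
    """Best-effort command line of the parent process, or '' if unavailable."""
    pid = os.getppid()
    # Try retrieval using psutil (optional third-party package).
    try:
        import psutil

        return " ".join(psutil.Process(pid).cmdline())
    except Exception:
        pass  # psutil is missing or the process is not accessible
    # Fall back to the shell; --no-headers is a procps (Linux) option.
    try:
        out = subprocess.check_output(
            ["ps", "-p", str(pid), "-o", "command", "--no-headers"],
            stderr=subprocess.DEVNULL,
        )
        return out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        return ""  # no usable `ps` on this platform


print("pypi.nvidia.com" in parent_command_line())
```

Returning an empty string on failure means the `in` check simply evaluates to False, so the caller falls through to the nested pip path.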

python/packaging/frontend_sdist/setup.py

zeroepoch · 2023-07-06T04:42:49Z

python/packaging/frontend_sdist/setup.py

+
+
+# use pip-inside-pip hack only if the nvidia index is not set in the environment
+if "pypi.nvidia.com" in pip_config_list() or "pypi.nvidia.com" in parent_command_line():


A recent internal (and unrelated) discussion revealed we may want one additional condition here. In order to test different versions or install from staging servers (other than pypi.nvidia.com), a full override may be desired: basically a "manage the source and installation of dependencies yourself" option. I think one way to handle this would be to check an environment variable that, when set, assumes you have set --extra-index-url to something you know works, which could be something other than pypi.nvidia.com. This may be useful externally as well if someone is running their own internal PyPI server and mirroring our wheel/sdist files as-is. @pranavm-nvidia what do you think about this approach and calling this environment variable something like NVIDIA_TENSORRT_DISABLE_INTERNAL_PIP?

how about we make NVIDIA_PIP_INDEX_URL configurable? defaulting to https://pypi.nvidia.com, but it could be overwritten with custom index urls or set it to an empty string to always pass the test?

I think NVIDIA_TENSORRT_DISABLE_INTERNAL_PIP is still useful even if we make the NVIDIA pip index URL configurable. There could be a case where there are no extra indexes used (not sure how it would work, but I imagine it would be possible somehow).
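The combined behaviour of the two environment variables discussed here can be sketched as follows. The function `use_internal_pip`, its parameters, and the rule that any non-empty value disables the nested call are assumptions for illustration; the released setup.py may differ.

```python
DEFAULT_INDEX_URL = "https://pypi.nvidia.com"


def use_internal_pip(environ, pip_config, parent_cmd):
    """Decide whether the nested pip call should run (illustrative semantics)."""
    # Assumption: any non-empty value disables the nested call entirely.
    if environ.get("NVIDIA_TENSORRT_DISABLE_INTERNAL_PIP"):
        return False
    index_url = environ.get("NVIDIA_PIP_INDEX_URL", DEFAULT_INDEX_URL)
    # An empty string is a substring of every string, so setting
    # NVIDIA_PIP_INDEX_URL="" makes this check always pass.
    already_configured = index_url in pip_config or index_url in parent_cmd
    return not already_configured


print(use_internal_pip({}, "", "pip install tensorrt"))
print(use_internal_pip({"NVIDIA_PIP_INDEX_URL": ""}, "", "pip install tensorrt"))
```

Here `pip_config` stands for the output of `pip config list` and `parent_cmd` for the parent process command line, both passed in as plain strings so the decision is testable in isolation.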

zeroepoch · 2023-07-06T04:45:12Z

> @zeroepoch anything from your side?

Sorry, I was out of the office last week so I was replying from my phone. I finally got a chance to fully review the changes and leave some suggested changes.


ddelange · 2023-07-06T05:11:05Z

no worries! diff since your review: https://github.com/ddelange/TensorRT/compare/fc53814..fix-sdist


zeroepoch

The usage of the environment variables LGTM. @rajeevsrao can you help to merge this PR?

rajeevsrao · 2023-07-07T07:42:02Z

@ttyio can you please cherry-pick this change internally and include for the next release?

ddelange · 2023-07-07T12:35:15Z

oops write /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_cuda_cpp.so: no space left on device

ttyio · 2023-07-10T04:52:12Z

/blossom-ci

ttyio · 2023-07-12T17:53:44Z

python/packaging/frontend_sdist/setup.py

+        super().run()
+
+
+def pip_config_list():


Could we replace the function with the code below, to solve an issue we see in internal CI/CD? Thanks!

def run_pip_command(args, call_func):
    try:
        return call_func([sys.executable, "-m", "pip"] + args)
    except subprocess.CalledProcessError:
        return call_func([os.path.join(sys.exec_prefix, "bin", "pip")] + args)


def pip_config_list():
    """Get the current pip config (env vars, config file, etc)."""
    return run_pip_command(["config", "list"], subprocess.check_output).decode()
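To illustrate how the fallback in this suggestion behaves, the sketch below repeats `run_pip_command` and drives it with a stub `call_func`. The stub `fake_call` and its canned output are hypothetical: it fails for the `python -m pip` form and succeeds for the direct `pip` executable.

```python
import os
import subprocess
import sys


def run_pip_command(args, call_func):
    try:
        return call_func([sys.executable, "-m", "pip"] + args)
    except subprocess.CalledProcessError:
        return call_func([os.path.join(sys.exec_prefix, "bin", "pip")] + args)


calls = []


def fake_call(cmd):
    """Stub standing in for subprocess.check_output; records each command."""
    calls.append(cmd)
    if cmd[:3] == [sys.executable, "-m", "pip"]:
        # Simulate `python -m pip` failing with a non-zero exit status.
        raise subprocess.CalledProcessError(1, cmd)
    return b"global.extra-index-url='https://pypi.nvidia.com'\n"


output = run_pip_command(["config", "list"], fake_call).decode()
print(len(calls), "pypi.nvidia.com" in output)
```

Only `CalledProcessError` triggers the fallback, so a missing interpreter or other `OSError` would still propagate. The `bin` subdirectory also assumes a POSIX layout.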


zeroepoch · 2023-08-26T06:27:01Z

@ddelange, just to follow up on this PR, these changes + some more that were required are now live with version 8.6.1.post1. If you're curious what the final changes were you can pull out the setup.py file.

Fix sdist installation

8f412dd

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange force-pushed the fix-sdist branch from c4cdbfa to 8f412dd June 22, 2023 07:05

ddelange mentioned this pull request Jun 22, 2023

Add support for python 3.11 autogluon/autogluon#3190

Merged

Unify project description with tensorrt_libs

c89ff0e

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange mentioned this pull request Jun 23, 2023

tensorrt sdist install fails #3078

Closed

ddelange force-pushed the fix-sdist branch from f8d16ac to 6e0d24e June 24, 2023 10:36

ddelange closed this Jun 24, 2023

ddelange reopened this Jun 24, 2023

ddelange added 2 commits June 26, 2023 07:07

Switch requirements to PEP440 direct references

5334f0d

https://peps.python.org/pep-0440/#direct-references Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

Revert "Switch requirements to PEP440 direct references"

92791a9

This reverts commit 5334f0d. ref NVIDIA#3080 (comment) Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange force-pushed the fix-sdist branch from 4809249 to 92791a9 June 26, 2023 05:07

pranavm-nvidia reviewed Jul 3, 2023

python/packaging/frontend_sdist/setup.py (outdated)

pranavm-nvidia reviewed Jul 3, 2023

python/packaging/frontend_sdist/setup.py (outdated)

Use pip config list to detect index url

fc53814

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange force-pushed the fix-sdist branch from f98a98f to fc53814 July 3, 2023 18:36

pranavm-nvidia approved these changes Jul 3, 2023


zeroepoch reviewed Jul 6, 2023


PR Suggestion

6458be2

Co-authored-by: Eric Work <work.eric@gmail.com> Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange force-pushed the fix-sdist branch from 5fa73e1 to 6458be2 July 6, 2023 05:01

ddelange added 3 commits July 6, 2023 07:01

Add --no-headers to ps command

e89d6b5

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

Introduce NVIDIA_PIP_INDEX_URL env var

0f13ecd

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

Typo in description

49641c8

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange added 2 commits July 6, 2023 07:13

Run black

1bd83cd

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

Replace f-string with .format, add python_requires

9a8e0ce

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ddelange force-pushed the fix-sdist branch from ee74d62 to 9a8e0ce July 6, 2023 05:17

Introduce NVIDIA_TENSORRT_DISABLE_INTERNAL_PIP env var

2c382eb

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

zeroepoch approved these changes Jul 7, 2023


ttyio reviewed Jul 12, 2023


PR Suggestion

c0c7e57

Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>

ttyio merged commit ba459b4 into NVIDIA:main Jul 14, 2023
1 of 2 checks passed

ddelange added a commit to ddelange/TensorRT that referenced this pull request Jul 14, 2023

Merge branch 'main' of https://github.com/NVIDIA/TensorRT into verbos…

d0b6a09

…e-sdist * 'main' of https://github.com/NVIDIA/TensorRT: Fix sdist installation (NVIDIA#3080)

ddelange mentioned this pull request Aug 24, 2023

[python] pip install fails starting with tensorrt 8.6.1 release #2933

Closed

pranavm-nvidia mentioned this pull request Oct 4, 2023

Improve frontend Python packaging #3362

Closed

prateekdesai04 mentioned this pull request Mar 27, 2024

[DRAFT][CI][Fix] tenssort installation fix, capping pip autogluon/autogluon#4011

Closed


