The fast-deps feature is not a fast way to obtain dependencies #8670

Open
McSinyx opened this issue Jul 31, 2020 · 20 comments · May be fixed by #12208
Labels
state: needs discussion This needs some more discussion

Comments

McSinyx (Contributor) commented Jul 31, 2020

Originally posted on Python's Discourse forum:

Despite its name, at the time of writing, in most cases (where the wheels are small) it does not make the pip install/pip download process any faster. The only case where it might be an optimization is when pip runs into a lot of dependency conflicts and has to perform a series of backtracks.

Moreover, unlike the normal wheel download, the lazy implementation performs multiple requests. On unstable networks like mine, this makes it a lot slower than downloading the same amount of data in one single response (citation needed -- I know this is generally believed, but I'd love to read an article explaining the reason in detail). The first step to tackle this that I have in mind is to refuse to use range requests when a wheel is smaller than a certain size (a few multiples of the chunk size, perhaps?). More experiments may be needed to optimize this further, but this is the first thing I can think of. I'd really appreciate any suggestion, even mere ideas of what to explore.

If possible, please assign this issue to me so I can better keep track of my GSoC project.
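
For concreteness, here is a minimal sketch of the size cutoff mentioned above, under stated assumptions: the helper name and the threshold are hypothetical, not pip internals, and the size comes from a HEAD request like the one lazy_wheel already issues.

# Hypothetical sketch of the size-cutoff idea; the function name and
# the threshold are illustrative, not part of pip.
import requests

CONTENT_CHUNK_SIZE = 10 * 1024               # requests' default chunk size
LAZY_WHEEL_MINIMUM = 4 * CONTENT_CHUNK_SIZE  # "a few multiples", to be tuned

def should_use_range_requests(session: requests.Session, url: str) -> bool:
    """Use lazy (range-request) fetching only when the wheel is big
    enough that one plain GET would cost more than a few ranges."""
    head = session.head(url, allow_redirects=True)
    head.raise_for_status()
    length = int(head.headers.get("Content-Length", 0))
    return length > LAZY_WHEEL_MINIMUM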

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Jul 31, 2020
@McSinyx McSinyx changed the title The fast-deps feature may take longer to The fast-deps feature is not a fast way to obtain dependencies Jul 31, 2020
McSinyx (Contributor, Author) commented Aug 2, 2020

By applying this patch

diff --git a/src/pip/_internal/network/lazy_wheel.py b/src/pip/_internal/network/lazy_wheel.py
index c2371bf5..c9244bb5 100644
--- a/src/pip/_internal/network/lazy_wheel.py
+++ b/src/pip/_internal/network/lazy_wheel.py
@@ -60,6 +60,7 @@ class LazyZipOverHTTP(object):
 
     def __init__(self, url, session, chunk_size=CONTENT_CHUNK_SIZE):
         # type: (str, PipSession, int) -> None
+        self._count = 0
         head = session.head(url, headers=HEADERS)
         raise_for_status(head)
         assert head.status_code == 200
@@ -158,6 +159,7 @@ class LazyZipOverHTTP(object):
 
     def __exit__(self, *exc):
         # type: (*Any) -> Optional[bool]
+        print(self._count, 'requests to fetch metadata from', self._url[107:])
         return self._file.__exit__(*exc)
 
     @contextmanager
@@ -192,6 +194,7 @@ class LazyZipOverHTTP(object):
     def _stream_response(self, start, end, base_headers=HEADERS):
         # type: (int, int, Dict[str, str]) -> Response
         """Return HTTP response to a range request from start to end."""
+        self._count += 1
         headers = {'Range': 'bytes={}-{}'.format(start, end)}
         headers.update(base_headers)
         return self._session.get(self._url, headers=headers, stream=True)

I obtained the number of requests to fetch the metadata from each wheel:

$ pip install tensorflow --no-cache-dir | grep 'requests to fetch metadata from' | sort -k7
10 requests to fetch metadata from astunparse-1.6.3-py2.py3-none-any.whl
17 requests to fetch metadata from cachetools-4.1.1-py3-none-any.whl
1 requests to fetch metadata from gast-0.3.3-py2.py3-none-any.whl
4 requests to fetch metadata from google_auth-1.20.0-py2.py3-none-any.whl
1 requests to fetch metadata from google_auth_oauthlib-0.4.1-py2.py3-none-any.whl
4 requests to fetch metadata from google_pasta-0.2.0-py3-none-any.whl
13 requests to fetch metadata from grpcio-1.30.0-cp38-cp38-manylinux2010_x86_64.whl
14 requests to fetch metadata from h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl
16 requests to fetch metadata from Keras_Preprocessing-1.1.2-py2.py3-none-any.whl
4 requests to fetch metadata from Markdown-3.2.2-py3-none-any.whl
23 requests to fetch metadata from numpy-1.18.5-cp38-cp38-manylinux1_x86_64.whl
17 requests to fetch metadata from oauthlib-3.1.0-py2.py3-none-any.whl
10 requests to fetch metadata from opt_einsum-3.3.0-py3-none-any.whl
16 requests to fetch metadata from protobuf-3.12.4-cp38-cp38-manylinux1_x86_64.whl
7 requests to fetch metadata from pyasn1-0.4.8-py2.py3-none-any.whl
20 requests to fetch metadata from pyasn1_modules-0.2.8-py2.py3-none-any.whl
7 requests to fetch metadata from requests_oauthlib-1.3.0-py2.py3-none-any.whl
1 requests to fetch metadata from rsa-4.6-py3-none-any.whl
14 requests to fetch metadata from scipy-1.4.1-cp38-cp38-manylinux1_x86_64.whl
1 requests to fetch metadata from six-1.15.0-py2.py3-none-any.whl
20 requests to fetch metadata from tensorboard-2.3.0-py3-none-any.whl
17 requests to fetch metadata from tensorboard_plugin_wit-1.7.0-py3-none-any.whl
20 requests to fetch metadata from tensorflow-2.3.0-cp38-cp38-manylinux2010_x86_64.whl
14 requests to fetch metadata from tensorflow_estimator-2.3.0-py2.py3-none-any.whl
1 requests to fetch metadata from Werkzeug-1.0.1-py2.py3-none-any.whl
1 requests to fetch metadata from wheel-0.34.2-py2.py3-none-any.whl

I've yet to figure out what to do with this information, but it might help in deciding the multiplier (of the chunk size) below which to avoid using range requests.

Edit: attempting to download up to the chunk size at once seems to work better:

diff --git a/src/pip/_internal/network/lazy_wheel.py b/src/pip/_internal/network/lazy_wheel.py
index c2371bf5..d5967057 100644
--- a/src/pip/_internal/network/lazy_wheel.py
+++ b/src/pip/_internal/network/lazy_wheel.py
@@ -60,6 +60,7 @@ class LazyZipOverHTTP(object):
 
     def __init__(self, url, session, chunk_size=CONTENT_CHUNK_SIZE):
         # type: (str, PipSession, int) -> None
+        self._count = 0
         head = session.head(url, headers=HEADERS)
         raise_for_status(head)
         assert head.status_code == 200
@@ -109,8 +110,10 @@ class LazyZipOverHTTP(object):
         all bytes until EOF are returned.  Fewer than
         size bytes may be returned if EOF is reached.
         """
+        download_size = max(size, self._chunk_size)
         start, length = self.tell(), self._length
-        stop = start + size if 0 <= size <= length-start else length
+        stop = length if size < 0 else min(start+download_size, length)
+        start = max(0, stop-download_size)
         self._download(start, stop-1)
         return self._file.read(size)
 
@@ -158,6 +161,7 @@ class LazyZipOverHTTP(object):
 
     def __exit__(self, *exc):
         # type: (*Any) -> Optional[bool]
+        print(self._count, 'requests to fetch metadata from', self._url[107:])
         return self._file.__exit__(*exc)
 
     @contextmanager
@@ -192,6 +196,7 @@ class LazyZipOverHTTP(object):
     def _stream_response(self, start, end, base_headers=HEADERS):
         # type: (int, int, Dict[str, str]) -> Response
         """Return HTTP response to a range request from start to end."""
+        self._count += 1
         headers = {'Range': 'bytes={}-{}'.format(start, end)}
         headers.update(base_headers)
         return self._session.get(self._url, headers=headers, stream=True)

$ pip install tensorflow --no-cache-dir | grep 'requests to fetch metadata from' | sort -k7
2 requests to fetch metadata from astunparse-1.6.3-py2.py3-none-any.whl
2 requests to fetch metadata from cachetools-4.1.1-py3-none-any.whl
1 requests to fetch metadata from gast-0.3.3-py2.py3-none-any.whl
3 requests to fetch metadata from google_auth-1.20.0-py2.py3-none-any.whl
2 requests to fetch metadata from google_auth_oauthlib-0.4.1-py2.py3-none-any.whl
2 requests to fetch metadata from google_pasta-0.2.0-py3-none-any.whl
2 requests to fetch metadata from grpcio-1.30.0-cp38-cp38-manylinux2010_x86_64.whl
6 requests to fetch metadata from h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl
2 requests to fetch metadata from Keras_Preprocessing-1.1.2-py2.py3-none-any.whl
2 requests to fetch metadata from Markdown-3.2.2-py3-none-any.whl
19 requests to fetch metadata from numpy-1.18.5-cp38-cp38-manylinux1_x86_64.whl
3 requests to fetch metadata from oauthlib-3.1.0-py2.py3-none-any.whl
2 requests to fetch metadata from opt_einsum-3.3.0-py3-none-any.whl
12 requests to fetch metadata from protobuf-3.12.4-cp38-cp38-manylinux1_x86_64.whl
2 requests to fetch metadata from pyasn1-0.4.8-py2.py3-none-any.whl
4 requests to fetch metadata from pyasn1_modules-0.2.8-py2.py3-none-any.whl
2 requests to fetch metadata from requests_oauthlib-1.3.0-py2.py3-none-any.whl
2 requests to fetch metadata from rsa-4.6-py3-none-any.whl
15 requests to fetch metadata from scipy-1.4.1-cp38-cp38-manylinux1_x86_64.whl
2 requests to fetch metadata from six-1.15.0-py2.py3-none-any.whl
16 requests to fetch metadata from tensorboard-2.3.0-py3-none-any.whl
2 requests to fetch metadata from tensorboard_plugin_wit-1.7.0-py3-none-any.whl
11 requests to fetch metadata from tensorflow-2.3.0-cp38-cp38-manylinux2010_x86_64.whl
5 requests to fetch metadata from tensorflow_estimator-2.3.0-py2.py3-none-any.whl
2 requests to fetch metadata from Werkzeug-1.0.1-py2.py3-none-any.whl
2 requests to fetch metadata from wheel-0.34.2-py2.py3-none-any.whl

Exiting right after resolution completes, the patched version takes only 20s while the current implementation takes 30s. I'm filing a PR for this.

$ git diff
diff --git a/src/pip/_internal/resolution/resolvelib/resolver.py b/src/pip/_internal/resolution/resolvelib/resolver.py
index 43ea2486..aad532df 100644
--- a/src/pip/_internal/resolution/resolvelib/resolver.py
+++ b/src/pip/_internal/resolution/resolvelib/resolver.py
@@ -125,6 +125,7 @@ class Resolver(BaseResolver):
             error = self.factory.get_installation_error(e)
             six.raise_from(error, e)
 
+        exit()
         req_set = RequirementSet(check_supported_wheels=check_supported_wheels)
         for candidate in self._result.mapping.values():
             ireq = candidate.get_install_requirement()

McSinyx (Contributor, Author) commented Aug 3, 2020

Noting that the (redesigned) JSON API and the simple API (pypi/warehouse#8254) may provide a more efficient and performance-wise deterministic way to fetch metadata.
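
For illustration (not pip code), the per-release JSON endpoint can return dependency data without any range requests; whether requires_dist is populated for a given release is not guaranteed:

# Sketch: fetch dependency metadata via PyPI's JSON API.
# requires_dist may be null for some releases, hence the fallback.
import requests

resp = requests.get("https://pypi.org/pypi/tensorflow/2.3.0/json", timeout=10)
resp.raise_for_status()
deps = resp.json()["info"].get("requires_dist") or []
print(deps[:3])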

ofek (Sponsor, Contributor) commented Aug 13, 2020

Closed by #8681?

pradyunsg (Member)

Should be!

McSinyx (Contributor, Author) commented Aug 13, 2020

I'm not exactly sure about closing this: whilst GH-8681 does make fast-deps faster, I don't have any benchmark proving (at least to myself) that it's always or mostly faster than downloading the whole wheels, especially given that most pure-Python packages are rather small. I'd prefer to keep this one open and close it myself when I'm more certain about the performance.

@pradyunsg pradyunsg reopened this Aug 30, 2020
pradyunsg (Member)

I'd prefer to keep this one open and close it myself when I'm more certain about the performance.

Here ya go. :)

McSinyx (Contributor, Author) commented Aug 30, 2020

Yay! I've been running some benchmarks lately, and funnily enough, dependency resolution with fast-deps is not much faster than without it, which makes the overall process (even with parallel download) much slower 😞

pradyunsg (Member) commented Aug 30, 2020

How does it perform in cases with backtracking, though? Try pip install pyrax==1.9.8 --log log.txt (and wait for a long time) -- also, consider sharing the log.txt here once that is done. :)

pradyunsg (Member)

the overall process (even with parallel download) much slower 😞

How much slower?

McSinyx (Contributor, Author) commented Aug 30, 2020

Please check out the benchmark here. I couldn't get pyrax==1.9.8 to finish (it got stuck after downloading PyYAML 3.12, with or without fast-deps), though.

dholth (Member) commented Apr 20, 2022

@McSinyx are you still interested in this?

I modified lazy-wheel in pip 22.0.3 to print range requests. For a particular zip I wanted one file but got a suspicious number of range requests.

fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=184320-190434'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=180195-184319'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=74-10313'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=10314-10343'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=10344-10392'}

In lazy_wheel.py, read() has a max() preventing it from downloading more than CHUNK_SIZE at once. It seems like CHUNK_SIZE should be the minimum size?

https://github.com/pypa/pip/blob/main/src/pip/_internal/network/lazy_wheel.py#L95

I changed it to min():

    def read(self, size: int = -1) -> bytes:
        """Read up to size bytes from the object and return them.

        As a convenience, if size is unspecified or -1,
        all bytes until EOF are returned.  Fewer than
        size bytes may be returned if EOF is reached.
        """
        download_size = min(size, self._chunk_size) # was max()
        start, length = self.tell(), self._length
        stop = length if size < 0 else min(start + download_size, length)
        start = max(0, stop - download_size)
        self._download(start, stop - 1)
        return self._file.read(size)

This gets us down to four range requests:

fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=184320-190434'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=74-103'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=104-152'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=153-8117'}

As a further optimization, I read all the bytes between the header offset of the file I want and the next header offset, instead of letting ZipFile make a small read for the file header and a second read for the file contents. This is even possible with a second ZipFile object on the same LazyZipOverHTTP. Note this code is broken if the file we want is the last file:

# Find the target member and the member that follows it.
info, after = next(
    (inf, n)
    for (inf, n) in zip(zf.infolist(), zf.infolist()[1:])
    if inf.filename.startswith("info-")
)
# One contiguous read covers the local file header and the contents.
zf.fp.seek(info.header_offset)
zf.fp.read(after.header_offset - info.header_offset)

This gets us down to the optimal two requests. (We should emit up to two Range requests for the footer, depending on whether the footer is smaller than the chunk size, and then we can emit one per compressed file.)

fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=184320-190434'}
fetch range {'Accept-Encoding': 'identity', 'Range': 'bytes=74-8117'}
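
For reference, here is a sketch of the same prefetch that also covers the last-member case, using the central directory offset as the end bound; note that start_dir is an internal ZipFile attribute in CPython, so this is an assumption about the standard library, not a public API.

from itertools import chain
from zipfile import ZipFile

def prefetch_member(zf: ZipFile, prefix: str) -> bytes:
    # Pair each member with its successor; the last member pairs with None.
    infos = zf.infolist()
    for info, after in zip(infos, chain(infos[1:], [None])):
        if info.filename.startswith(prefix):
            # End bound: the next local header, or the central directory
            # offset (internal start_dir attribute) for the last member.
            end = after.header_offset if after is not None else zf.start_dir
            zf.fp.seek(info.header_offset)
            return zf.fp.read(end - info.header_offset)
    raise KeyError(f"no member starting with {prefix!r}")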

dholth (Member) commented Apr 20, 2022

The feature also makes an unnecessary HEAD request: it could get the total Content-Length from the first Range request by preemptively fetching a certain number of bytes from the end of the file.
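
A two-line illustration of that idea (the header value is the example quoted in the code in a later comment):

# Total size from a suffix Range response: the part after '/' in
# Content-Range, e.g. 'bytes 12824-23063/23064'.
content_range = "bytes 12824-23063/23064"
total_length = int(content_range.partition("/")[-1])  # 23064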

dholth (Member) commented Apr 21, 2022

Unfortunately, download_size = min(size, self._chunk_size) (was max()) breaks the download for some reason (the downloaded file differs from the original).

McSinyx (Contributor, Author) commented Apr 21, 2022

I think that checks out, because at least size bytes need to be downloaded for the read, and min() gives the chunk size in case it is smaller. That doesn't explain why no byte within 8118-10392 or 180195-184319 is requested after patching, though. I am rather curious which segments are read.
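
To spell the failure mode out with hypothetical numbers (an illustration, not pip code):

# With min(), a read() larger than the chunk size downloads fewer
# bytes than it returns, so the tail comes from the sparse temp file.
size, chunk_size, length = 16384, 10240, 200000   # hypothetical values
download_size = min(size, chunk_size)             # 10240, less than requested
start = 0
stop = min(start + download_size, length)         # only bytes 0-10239 fetched
# self._file.read(size) still returns 16384 bytes; the last 6144 were
# never downloaded, which is why the result differs from the original.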

dholth (Member) commented Apr 21, 2022

I'm re-using this code to look at .conda zip files. The zip directory is at the end, and the metadata I want is a single file near the beginning of the file, so it is expected that no other data is fetched.

It looks like this code limits Range requests to no more than CHUNK_SIZE bytes? Since metadata tends to be small it doesn't usually matter, but it would be nice if there were no upper limit.

dholth (Member) commented Apr 21, 2022

Code to eliminate the HEAD request:

headers["Range"] = f"bytes=-{CONTENT_CHUNK_SIZE}"

# if CONTENT_CHUNK_SIZE is bigger than the file:
# In [8]: response.headers["Content-Range"]
# Out[8]: 'bytes 0-3133374/3133375'

tail = session.get(url, headers=headers, stream=True)
# e.g. {'accept-ranges': 'bytes', 'content-length': '10240',
# 'content-range': 'bytes 12824-23063/23064', 'last-modified': 'Sat, 16
# Apr 2022 13:03:02 GMT', 'date': 'Thu, 21 Apr 2022 11:34:04 GMT'}

if tail.status_code != 206:
    raise HTTPRangeRequestUnsupported("range request is not supported")

self._session, self._url, self._chunk_size = session, url, chunk_size
self._length = int(tail.headers["Content-Range"].partition("/")[-1])
self._file = NamedTemporaryFile()
self.truncate(self._length)

# length is also in Content-Length and Content-Range header
with self._stay():
    self.seek(self._length - len(tail.content))
    self._file.write(tail.content)
self._left: List[int] = [self._length - len(tail.content)]
self._right: List[int] = [self._length - 1]

dholth (Member) commented Apr 21, 2022

CONTENT_CHUNK_SIZE comes from requests; it happens to be 10 KiB, which is a great guess for fetching wheel .zip directories in a single request.

@pradyunsg pradyunsg added state: needs discussion This needs some more discussion and removed S: needs triage Issues/PRs that need to be triaged labels Jun 21, 2022
dholth (Member) commented Oct 3, 2022

Lazier wheel is here. It should be compatible but I haven't tested it yet. It avoids the HEAD request.

If you find that the last 10 KiB of the file isn't a good enough heuristic, you can also look up the byte range of the desired METADATA file in the zip index and guarantee 2 or 3 requests maximum.

#11447 #11481
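
A sketch of that lookup, assuming lazy_file is a LazyZipOverHTTP whose cached tail already holds the central directory (the names and the 512-byte slack are illustrative):

from zipfile import ZipFile

with ZipFile(lazy_file) as zf:
    info = next(i for i in zf.infolist()
                if i.filename.endswith(".dist-info/METADATA"))
    zf.fp.seek(info.header_offset)
    # One Range request covering the 30-byte local header, the file
    # name, any extra field, and the compressed data (512 B of slack).
    zf.fp.read(30 + len(info.filename) + info.compress_size + 512)
    metadata = zf.read(info)  # served from the local cache, no new request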

dholth (Member) commented Oct 4, 2022

I had a chance to test it. I had to handle 416 errors for wheels that were smaller than the chunk size, and I added an eager fetch of the entire .dist-info section of the wheel (but we could fetch only METADATA and WHEEL; this would help if they are in a single convenient range).

This reduces the number of requests.

% python -m pip install --no-cache-dir tensorflow --use-feature=fast-deps --dry-run
WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.
Collecting tensorflow
  Obtaining dependency information from tensorflow 2.10.0
prefetch dist-info 239935539-240305560
3 requests to fetch metadata from tensorflow-2.10.0-cp39-cp39-macosx_10_14_x86_64.whl
Collecting google-pasta>=0.1.1
  Obtaining dependency information from google-pasta 0.2.0
prefetch dist-info 49328-55329
1 requests to fetch metadata from google_pasta-0.2.0-py3-none-any.whl
Collecting astunparse>=1.6.0
  Obtaining dependency information from astunparse 1.6.3
prefetch dist-info 7309-11951
1 requests to fetch metadata from astunparse-1.6.3-py2.py3-none-any.whl
Collecting wrapt>=1.11.0
  Obtaining dependency information from wrapt 1.14.1
prefetch dist-info 30540-34428
1 requests to fetch metadata from wrapt-1.14.1-cp39-cp39-macosx_10_9_x86_64.whl
Collecting keras-preprocessing>=1.1.1
  Obtaining dependency information from keras-preprocessing 1.1.2
prefetch dist-info 38292-41164
1 requests to fetch metadata from Keras_Preprocessing-1.1.2-py2.py3-none-any.whl
Collecting keras<2.11,>=2.10.0
  Obtaining dependency information from keras 2.10.0
prefetch dist-info 1606097-1634571
3 requests to fetch metadata from keras-2.10.0-py2.py3-none-any.whl
Collecting tensorboard<2.11,>=2.10
  Obtaining dependency information from tensorboard 2.10.1
prefetch dist-info 5827781-5847436
3 requests to fetch metadata from tensorboard-2.10.1-py3-none-any.whl
Collecting packaging
  Obtaining dependency information from packaging 21.3
prefetch dist-info 28518-39361
2 requests to fetch metadata from packaging-21.3-py3-none-any.whl
Collecting gast<=0.4.0,>=0.2.1
  Obtaining dependency information from gast 0.4.0
prefetch dist-info 6988-9134
2 requests to fetch metadata from gast-0.4.0-py3-none-any.whl
Collecting absl-py>=1.0.0
  Obtaining dependency information from absl-py 1.2.0
prefetch dist-info 114391-121307
1 requests to fetch metadata from absl_py-1.2.0-py3-none-any.whl
Collecting grpcio<2.0,>=1.24.3
  Obtaining dependency information from grpcio 1.49.1
prefetch dist-info 4524206-4538044
2 requests to fetch metadata from grpcio-1.49.1-cp39-cp39-macosx_10_10_x86_64.whl
Collecting six>=1.12.0
  Obtaining dependency information from six 1.16.0
prefetch dist-info 8485-10605
1 requests to fetch metadata from six-1.16.0-py2.py3-none-any.whl
Collecting tensorflow-io-gcs-filesystem>=0.23.1
  Obtaining dependency information from tensorflow-io-gcs-filesystem 0.27.0
prefetch dist-info 1629750-1639868
2 requests to fetch metadata from tensorflow_io_gcs_filesystem-0.27.0-cp39-cp39-macosx_10_14_x86_64.whl
Collecting typing-extensions>=3.6.6
  Obtaining dependency information from typing-extensions 4.3.0
prefetch dist-info 18021-25162
1 requests to fetch metadata from typing_extensions-4.3.0-py3-none-any.whl
Collecting termcolor>=1.1.0
  Obtaining dependency information from termcolor 2.0.1
prefetch dist-info 2196-4877
2 requests to fetch metadata from termcolor-2.0.1-py3-none-any.whl
Collecting numpy>=1.20
  Obtaining dependency information from numpy 1.23.3
prefetch dist-info 0-18055764
3 requests to fetch metadata from numpy-1.23.3-cp39-cp39-macosx_10_9_x86_64.whl
Collecting h5py>=2.9.0
  Obtaining dependency information from h5py 3.7.0
prefetch dist-info 3175427-3181569
2 requests to fetch metadata from h5py-3.7.0-cp39-cp39-macosx_10_9_x86_64.whl
Collecting tensorflow-estimator<2.11,>=2.10.0
  Obtaining dependency information from tensorflow-estimator 2.10.0
prefetch dist-info 420686-426115
3 requests to fetch metadata from tensorflow_estimator-2.10.0-py2.py3-none-any.whl
Requirement already satisfied: setuptools in /Users/dholth/opt/py3x86/lib/python3.9/site-packages (from tensorflow) (58.1.0)
Collecting protobuf<3.20,>=3.9.2
  Obtaining dependency information from protobuf 3.19.6
prefetch dist-info 973147-976250
1 requests to fetch metadata from protobuf-3.19.6-cp39-cp39-macosx_10_9_x86_64.whl
Collecting libclang>=13.0.0
  Obtaining dependency information from libclang 14.0.6
prefetch dist-info 13223909-13231552
1 requests to fetch metadata from libclang-14.0.6-py2.py3-none-macosx_10_9_x86_64.whl
Collecting opt-einsum>=2.3.2
  Obtaining dependency information from opt-einsum 3.3.0
prefetch dist-info 58104-63148
1 requests to fetch metadata from opt_einsum-3.3.0-py3-none-any.whl
Collecting flatbuffers>=2.0
  Obtaining dependency information from flatbuffers 22.9.24
prefetch dist-info 24140-25598
1 requests to fetch metadata from flatbuffers-22.9.24-py2.py3-none-any.whl
Collecting wheel<1.0,>=0.23.0
  Obtaining dependency information from wheel 0.37.1
prefetch dist-info 30516-33723
1 requests to fetch metadata from wheel-0.37.1-py2.py3-none-any.whl
Collecting markdown>=2.6.8
  Obtaining dependency information from markdown 3.4.1
prefetch dist-info 85305-90374
1 requests to fetch metadata from Markdown-3.4.1-py3-none-any.whl
Collecting google-auth-oauthlib<0.5,>=0.4.1
  Obtaining dependency information from google-auth-oauthlib 0.4.6
prefetch dist-info 11092-17255
1 requests to fetch metadata from google_auth_oauthlib-0.4.6-py2.py3-none-any.whl
Collecting tensorboard-plugin-wit>=1.6.0
  Obtaining dependency information from tensorboard-plugin-wit 1.8.1
prefetch dist-info 773914-776703
1 requests to fetch metadata from tensorboard_plugin_wit-1.8.1-py3-none-any.whl
Collecting google-auth<3,>=1.6.3
  Obtaining dependency information from google-auth 2.12.0
prefetch dist-info 156009-164848
2 requests to fetch metadata from google_auth-2.12.0-py2.py3-none-any.whl
Collecting requests<3,>=2.21.0
  Obtaining dependency information from requests 2.28.1
prefetch dist-info 54220-61243
1 requests to fetch metadata from requests-2.28.1-py3-none-any.whl
Collecting werkzeug>=1.0.1
  Obtaining dependency information from werkzeug 2.2.2
prefetch dist-info 223166-228709
1 requests to fetch metadata from Werkzeug-2.2.2-py3-none-any.whl
Collecting tensorboard-data-server<0.7.0,>=0.6.0
  Obtaining dependency information from tensorboard-data-server 0.6.1
prefetch dist-info 3544582-3545791
1 requests to fetch metadata from tensorboard_data_server-0.6.1-py3-none-macosx_10_9_x86_64.whl
Collecting pyparsing!=3.0.5,>=2.0.2
  Obtaining dependency information from pyparsing 3.0.9
prefetch dist-info 93808-97206
1 requests to fetch metadata from pyparsing-3.0.9-py3-none-any.whl
Collecting cachetools<6.0,>=2.0.0
  Obtaining dependency information from cachetools 5.2.0
prefetch dist-info 5427-8670
2 requests to fetch metadata from cachetools-5.2.0-py3-none-any.whl
Collecting pyasn1-modules>=0.2.1
  Obtaining dependency information from pyasn1-modules 0.2.8
prefetch dist-info 140434-147136
2 requests to fetch metadata from pyasn1_modules-0.2.8-py2.py3-none-any.whl
Collecting rsa<5,>=3.1.4
  Obtaining dependency information from rsa 4.9
prefetch dist-info 29553-33054
1 requests to fetch metadata from rsa-4.9-py3-none-any.whl
Collecting requests-oauthlib>=0.7.0
  Obtaining dependency information from requests-oauthlib 1.3.1
prefetch dist-info 16437-22125
1 requests to fetch metadata from requests_oauthlib-1.3.1-py2.py3-none-any.whl
Collecting importlib-metadata>=4.4
  Obtaining dependency information from importlib-metadata 5.0.0
prefetch dist-info 13730-20583
1 requests to fetch metadata from importlib_metadata-5.0.0-py3-none-any.whl
Collecting urllib3<1.27,>=1.21.1
  Obtaining dependency information from urllib3 1.26.12
prefetch dist-info 117715-137216
2 requests to fetch metadata from urllib3-1.26.12-py2.py3-none-any.whl
Collecting charset-normalizer<3,>=2
  Obtaining dependency information from charset-normalizer 2.1.1
prefetch dist-info 31240-38208
1 requests to fetch metadata from charset_normalizer-2.1.1-py3-none-any.whl
Collecting idna<4,>=2.5
  Obtaining dependency information from idna 3.4
prefetch dist-info 55146-60675
1 requests to fetch metadata from idna-3.4-py3-none-any.whl
Collecting certifi>=2017.4.17
  Obtaining dependency information from certifi 2022.9.24
prefetch dist-info 157623-160372
1 requests to fetch metadata from certifi-2022.9.24-py3-none-any.whl
Collecting MarkupSafe>=2.1.1
  Obtaining dependency information from markupsafe 2.1.1
prefetch dist-info 9668-12762
1 requests to fetch metadata from MarkupSafe-2.1.1-cp39-cp39-macosx_10_9_x86_64.whl
Collecting zipp>=0.5
  Obtaining dependency information from zipp 3.8.1
prefetch dist-info 2551-5196
2 requests to fetch metadata from zipp-3.8.1-py3-none-any.whl
Collecting pyasn1<0.5.0,>=0.4.6
  Obtaining dependency information from pyasn1 0.4.8
prefetch dist-info 70813-74145
1 requests to fetch metadata from pyasn1-0.4.8-py2.py3-none-any.whl
Collecting oauthlib>=3.0.0
  Obtaining dependency information from oauthlib 3.2.1
prefetch dist-info 137993-145152
2 requests to fetch metadata from oauthlib-3.2.1-py3-none-any.whl

dholth (Member) commented Oct 4, 2022

I added the exit() call right before req_set = RequirementSet(check_supported_wheels=check_supported_wheels), to time dependency resolution only.

With minimal requests (prefetch): python -m pip install --no-cache-dir tensorflow --use-feature=fast-deps 3.91s user 0.34s system 50% cpu 8.480 total
With more requests (no prefetch): python -m pip install --no-cache-dir tensorflow --use-feature=fast-deps 4.61s user 0.33s system 41% cpu 12.007 total

% ping files.pythonhosted.org
PING dualstack.r.ssl.global.fastly.net (151.101.1.63): 56 data bytes
64 bytes from 151.101.1.63: icmp_seq=0 ttl=59 time=28.996 ms
64 bytes from 151.101.1.63: icmp_seq=1 ttl=59 time=21.039 ms
64 bytes from 151.101.1.63: icmp_seq=2 ttl=59 time=21.088 ms
