Skip to content

fix(pypi): do not fail on indexes without root index#3799

Open
aignas wants to merge 5 commits into
bazel-contrib:mainfrom
aignas:aignas.fix.pyipi-single-index-perf
Open

fix(pypi): do not fail on indexes without root index#3799
aignas wants to merge 5 commits into
bazel-contrib:mainfrom
aignas:aignas.fix.pyipi-single-index-perf

Conversation

@aignas
Copy link
Copy Markdown
Collaborator

@aignas aignas commented May 28, 2026

With this PR we are changing the strategy of assuming that
all of the packages that we have here will be available
through the index and we stop calling the root index if
only a single index is specified. This will improve the
performance for public-only hubs (like the twine deps)
and should in generally reflect better the pre-2.0 behaviour.

Fixes #3769
Closes #3770

With this PR we are changing the strategy of assuming that
all of the packages that we have here will be available
through the index and we stop calling the root index if
only a single index is specified. This will improve the
performance for public-only hubs (like the twine deps)
and should in generally reflect better the pre-2.0 behaviour.

Fixes bazel-contrib#3769
Closes bazel-contrib#3770
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request optimizes _get_dist_urls by skipping index downloads when there is only a single index. The reviewer pointed out that this change will cause existing unit tests to fail because they assert the root index is downloaded, and provided instructions to update them. Additionally, the reviewer highlighted an existing design limitation where specifying overrides alongside multiple indexes causes extra indexes to be ignored, and suggested a refactoring approach to handle this correctly.

index_urls.append(default_index)

index_url_overrides = index_url_overrides or {}
if index_url_overrides or len(index_urls) == 1:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Unit Test Failures

This change will cause existing unit tests in tests/pypi/simpleapi_download/simpleapi_download_tests.bzl to fail. Specifically, _test_download_url and _test_download_envsubst_url both assert that the root index (e.g., https://example.com/main/simple/) is downloaded. Since this PR successfully skips querying the root index when there is only a single index, those assertions will fail.

Please update those tests to remove the assertions for the root index download. For example, in _test_download_url, the expected downloads dict should be updated to:

    env.expect.that_dict(downloads).contains_exactly({
        "https://example.com/main/simple/bar/": "path/for/https___example_com_main_simple_bar.html",
        "https://example.com/main/simple/baz/": "path/for/https___example_com_main_simple_baz.html",
        "https://example.com/main/simple/foo/": "path/for/https___example_com_main_simple_foo.html",
    })

And similarly for _test_download_envsubst_url.

if default_index not in index_urls:
index_urls.append(default_index)

index_url_overrides = index_url_overrides or {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Design Limitation: Overrides and Multiple Indexes

There is an existing design limitation/bug that is preserved and made more prominent by this change.

When index_url_overrides is specified (non-empty) and there are multiple index_urls (i.e., extra_index_urls is not empty), the condition if index_url_overrides or len(index_urls) == 1: will evaluate to True. This causes the function to immediately return and map all non-overridden packages to the default_index, completely ignoring the extra_index_urls.

Ideally, we should only skip querying if there is only a single index or if all packages in sources are explicitly overridden. Otherwise, we should query the indexes for the remaining (non-overridden) packages and merge the results with the overrides.

Since the rest of the function is not in the diff, we cannot easily fix this here without a larger refactoring, but we should consider refactoring _get_dist_urls in a future change to support both overrides and multiple indexes correctly. For example:

def _get_dist_urls(ctx, *, default_index, index_urls, index_url_overrides, sources, read_simpleapi, attr, block, _fail = fail, **kwargs):
    index_urls = [] + (index_urls or [])
    if default_index not in index_urls:
        index_urls.append(default_index)

    index_url_overrides = index_url_overrides or {}
    remaining_sources = {pkg: sources[pkg] for pkg in sources if pkg not in index_url_overrides}

    if len(index_urls) == 1 or not remaining_sources:
        return {
            pkg: _normalize_url("{}/{}/".format(
                index_url_overrides.get(pkg, default_index),
                pkg.replace("_", "-"),
            ))
            for pkg in sources
        }

    # ... query indexes only for remaining_sources ...
    # ... and then merge the results with index_url_overrides ...

aignas added 4 commits May 28, 2026 23:03
When there is only a single index URL (no overrides, no extra_index_urls),
_get_dist_urls skips downloading the index page since all packages must
come from that single index. Update the test to reflect this.
…zation

Same as previous fix: when there is only a single index URL,
_get_dist_urls skips downloading the index page. Update this test
to match the optimized behavior.
@aignas aignas marked this pull request as ready for review May 28, 2026 14:10
@aignas aignas requested a review from rickeylev as a code owner May 28, 2026 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[pypi] Module extension fails if an extra index returns 404 on root path, even if simpleapi_skip is set

1 participant