oidc: Refactor lookup strategies into single functions #18169

facutuesca · 2025-05-21T22:03:47Z

This PR refactors the Trusted Publishing "lookup strategies" pattern into a single lookup_by_claims() method for each of the publishers.

Context

A strategy is a way to query the database for a matching Trusted Publisher, given a set of OIDC claims. Concretely, a strategy is a function that takes a set of claims and returns a Query object. Each publisher has a list of these strategies, ordered from specific to general. When trying to find a Trusted Publisher, the more specific strategies are tried first, and if they fail the more general ones are tried.

For example, for the given set of claims for a GitHub OIDC token:

{
    "repository": "foo/bar",
    "job_workflow_ref": "foo/bar/.github/workflows/release.yml@refs/heads/main",
    "environment": "my_environment",
}

first we ran a strategy that tried to find publishers with exactly those values (in particular, environment==my_environment). If that strategy failed, we tried the second strategy, where we tried to find publishers with environment==None (that is, allowing any environment).
The "specific to general" order in this case meant going from "Publishers that only allow my_environment as an environment" to "Publishers that allow any environment".
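To make the old pattern concrete, here is a minimal in-memory sketch of an ordered strategy list. It is not the Warehouse code: the `Publisher` dataclass, the `PUBLISHERS` list standing in for the database table, and the function names are all hypothetical, and the real strategies return SQLAlchemy Query objects rather than lists.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Publisher:
    # Hypothetical, simplified model: the real publisher has more columns.
    repository: str
    job_workflow_ref: str
    environment: Optional[str]  # None means "any environment is allowed"


# Stand-in for the database table.
PUBLISHERS = [
    Publisher("foo/bar", "foo/bar/.github/workflows/release.yml@refs/heads/main", None),
    Publisher("foo/bar", "foo/bar/.github/workflows/release.yml@refs/heads/main", "my_environment"),
]


def _base_match(p: Publisher, claims: dict) -> bool:
    return (
        p.repository == claims["repository"]
        and p.job_workflow_ref == claims["job_workflow_ref"]
    )


def lookup_with_environment(claims: dict) -> list[Publisher]:
    # Most specific strategy: the environment must match exactly.
    env = claims.get("environment")
    if env is None:
        return []
    return [p for p in PUBLISHERS if _base_match(p, claims) and p.environment == env]


def lookup_any_environment(claims: dict) -> list[Publisher]:
    # More general strategy: publishers that allow any environment.
    return [p for p in PUBLISHERS if _base_match(p, claims) and p.environment is None]


# Ordered from specific to general, one "query" per strategy.
STRATEGIES: list[Callable[[dict], list[Publisher]]] = [
    lookup_with_environment,
    lookup_any_environment,
]


def find_publisher(claims: dict) -> Optional[Publisher]:
    for strategy in STRATEGIES:
        results = strategy(claims)
        if results:
            return results[0]
    return None
```

With the example claims above, `find_publisher()` stops at the first strategy; with a different environment claim it falls through to the second one.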

New implementation

This PR changes the above approach to a single function per provider called lookup_by_claims() which takes a set of claims and returns a Publisher.

The multiple strategies are collapsed into a single query: we query for all publishers where the non-optional fields match the claims. In the example above, this means our single query looks for all publishers that match repository and job_workflow_ref, ignoring the environment value.

We then look at the resulting Python Publisher objects, and select the most specific one.
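As a rough illustration of the new flow, the sketch below runs one general "query" and then picks the most specific match in Python. It is a simplified model under stated assumptions, not the actual implementation: the `Publisher` dataclass and the `publishers` argument (standing in for the database) are hypothetical, and the real `lookup_by_claims()` is a method on each publisher class that queries the database session.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Publisher:
    # Hypothetical, simplified model of a GitHub publisher row.
    repository: str
    job_workflow_ref: str
    environment: Optional[str]  # None means "any environment is allowed"


def lookup_by_claims(claims: dict, publishers: list[Publisher]) -> Optional[Publisher]:
    # Single general query: match only the non-optional fields.
    candidates = [
        p
        for p in publishers
        if p.repository == claims["repository"]
        and p.job_workflow_ref == claims["job_workflow_ref"]
    ]

    # Most specific first: a publisher restricted to the claimed environment.
    environment = claims.get("environment")
    if environment is not None:
        for p in candidates:
            if p.environment == environment:
                return p

    # Otherwise fall back to a publisher that allows any environment.
    for p in candidates:
        if p.environment is None:
            return p

    # Candidates restricted to a different environment never match.
    return None
```

Note that a candidate restricted to some other environment is returned by the general query but rejected by the selection step, so the selection logic has to filter as well as rank.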

Rationale

The reasons for this change are:

Removing the multiple lookup strategies, which are unnecessary since a single (general) query is enough
Preparing for adding support for GitHub reusable workflows (as sketched in Trusted publishing: Support for GitHub reusable workflows #11096 (comment)).

cc @woodruffw @miketheman @di

Signed-off-by: Facundo Tuesca <facundo.tuesca@trailofbits.com>

miketheman

I like the move from using a list-in-subclass to a function-that-contains-the-complexity, as that simplifies the overall design and allows individual publishers to decide their own lookup method.

I'm also a little wary of how well-tested these publishers are. There's one functional test for provenance that creates a GitHub publisher, but everything else in the OIDC publishers stack seems to be heavily unit/mocked tests, so would it make sense to add some more functional tests around these interactions? The number of mixins and abstractions makes me a little cautious, being less familiar with the code behaviors.

miketheman · 2025-06-12T20:25:33Z

warehouse/oidc/models/_core.py

@@ -174,20 +174,10 @@ class OIDCPublisherMixin:
    # but there are a few problems: those claim sets don't map to their
    # "equivalent" column (only to an instantiated property), and may not
    # even have an "equivalent" column.


I think this comment is now invalid and should probably be rewritten to express the new design of "implement the checks in a subclass' lookup_by_claims()".

good catch, fixed

miketheman · 2025-06-12T20:51:35Z

warehouse/oidc/models/google.py

+            specific_publishers = [p for p in publishers if p.sub == sub]
+            if specific_publishers:
+                return specific_publishers[0]


Optimize: This might be cleaner with more_itertools.first_true, like so:

Suggested change

-            specific_publishers = [p for p in publishers if p.sub == sub]
-            if specific_publishers:
-                return specific_publishers[0]
+            if specific_publisher := first_true(publishers, pred=lambda p: p.sub == sub):
+                return specific_publisher

And in other places that are using this iteration pattern. Definitely worth thinking about other selector patterns from more-itertools, since we have it available for use.

Not a sticking point, but wanted to call out the pattern while I noticed it.
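For readers unfamiliar with `first_true`, here is a small standalone comparison of the two patterns. The `Publisher` dataclass and the sample data are hypothetical; the fallback definition mirrors the documented itertools recipe and is only there so the snippet runs without more-itertools installed.

```python
from dataclasses import dataclass
from typing import Optional

try:
    from more_itertools import first_true
except ImportError:
    # Fallback equivalent to the itertools "first_true" recipe.
    def first_true(iterable, default=None, pred=None):
        return next(filter(pred, iterable), default)


@dataclass
class Publisher:
    # Hypothetical, simplified model of a Google publisher row.
    email: str
    sub: Optional[str]


def pick_with_list(publishers, sub):
    # Original pattern: build the full filtered list, then take the head.
    specific_publishers = [p for p in publishers if p.sub == sub]
    if specific_publishers:
        return specific_publishers[0]
    return None


def pick_with_first_true(publishers, sub):
    # Suggested pattern: stop at the first match, None if there is none.
    if specific_publisher := first_true(publishers, pred=lambda p: p.sub == sub):
        return specific_publisher
    return None


publishers = [
    Publisher("ci@example.com", None),
    Publisher("ci@example.com", "1234"),
]
```

Both helpers return the same object for the same input. The walrus form relies on the matched object being truthy, which holds for plain model instances like these.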

facutuesca · 2025-06-18T11:51:41Z

@miketheman

so would it make sense to add some more functional tests around these interactions

I have a question that came up while trying to write the functional tests: I started the tests by POSTing to /_/oidc/mint-token, since the token exchange is where the trusted publisher lookup happens. However, since I'm not mocking the request object (like the unit tests do), I get a failure here:

warehouse/warehouse/oidc/services.py

Lines 343 to 346 in 79691b4

    
    def __call__(self, _context, request):
        cache_url = request.registry.settings["oidc.jwk_cache_url"]
        audience = request.registry.settings["warehouse.oidc.audience"]
        metrics = request.find_service(IMetricsService, context=None)

since the oidc.jwk_cache_url settings key does not exist:

>       cache_url = request.registry.settings["oidc.jwk_cache_url"]
E       KeyError: 'oidc.jwk_cache_url'

Do we have a way to set it from the tests, without mocking the whole request object?

oidc: Refactor lookup strategies into single functions

5ea66ff

Signed-off-by: Facundo Tuesca <facundo.tuesca@trailofbits.com>

facutuesca requested a review from a team as a code owner May 21, 2025 22:03

woodruffw added the trusted-publishing label May 21, 2025

views: Test adding duplicated GH TP after normalization

6f12521

Signed-off-by: Facundo Tuesca <facundo.tuesca@trailofbits.com>

miketheman reviewed Jun 12, 2025

facutuesca added 2 commits June 18, 2025 12:54

Fix outdated comment

d158b26

Use more_itertools.first_true in publisher lookup

00be88b
