@c-thiel c-thiel commented May 15, 2025

Rationale for this change

Currently, PyIceberg remote signing only works if the sign endpoint is shared by all tables in a REST catalog.
However, some catalogs use table-specific endpoints.

If table-specific endpoints are used, PyIceberg sends the sign request for the second table that is queried to the sign endpoint of the first table.
The reason is that, although we call register() again with a new signer that has different properties, the call has no effect the second time it runs: botocore skips registration when a handler with the same unique_id already exists, even if the properties differ.
https://github.com/boto/botocore/blob/8c517320c6a40cd91e8e7fbb05e27183ba2f6dce/botocore/hooks.py#L310-L312

This PR unregisters the old handler before registering the new one.
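
To make the failure mode concrete, here is a minimal sketch of the behavior described above. `TinyEmitter` is a hypothetical, simplified stand-in written for this example; it is not botocore's implementation and only models the "same unique_id means skip" rule. The event name, the `"signer"` id, and the example URLs are likewise assumptions for illustration.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[], Any]


class TinyEmitter:
    """Hypothetical stand-in that models only unique_id de-duplication."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}
        self._by_unique_id: Dict[str, Handler] = {}

    def register(self, event: str, handler: Handler, unique_id: str) -> None:
        if unique_id in self._by_unique_id:
            # A handler already exists under this id, so the new one is
            # silently ignored, even if it was built from other properties.
            return
        self._by_unique_id[unique_id] = handler
        self._handlers.setdefault(event, []).append(handler)

    def unregister(self, event: str, unique_id: str) -> None:
        handler = self._by_unique_id.pop(unique_id, None)
        if handler is not None:
            self._handlers[event].remove(handler)

    def emit(self, event: str) -> List[Any]:
        return [handler() for handler in self._handlers.get(event, [])]


def make_signer(endpoint: str) -> Handler:
    # Stands in for a signer bound to one table's properties.
    return lambda: endpoint


def setup_without_unregister(emitter: TinyEmitter, endpoint: str) -> None:
    # Old behavior: register only.
    emitter.register("before-sign.s3", make_signer(endpoint), unique_id="signer")


def setup_with_unregister(emitter: TinyEmitter, endpoint: str) -> None:
    # Approach of this PR: drop the old handler, then add the new one.
    emitter.unregister("before-sign.s3", unique_id="signer")
    emitter.register("before-sign.s3", make_signer(endpoint), unique_id="signer")


TABLE_A = "https://catalog.example/table-a/sign"
TABLE_B = "https://catalog.example/table-b/sign"

old = TinyEmitter()
setup_without_unregister(old, TABLE_A)
setup_without_unregister(old, TABLE_B)
old_result = old.emit("before-sign.s3")

new = TinyEmitter()
setup_with_unregister(new, TABLE_A)
setup_with_unregister(new, TABLE_B)
new_result = new.emit("before-sign.s3")
```

In this model, `old_result` still contains only the first table's endpoint, while `new_result` contains only the second table's endpoint, which mirrors the bug and the fix described above.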

Are these changes tested?

No. Any idea how we could test them?
So far, this has only been tested manually against LAKEKEEPER in a notebook while implementing table-specific endpoints.

Are there any user-facing changes?

It works now! Remote signing works with catalogs that use table-specific sign endpoints.

@Fokko Fokko left a comment

Thanks @c-thiel for fixing this. Unregistering the old handler seems like the right thing to do 🚀

@Fokko Fokko merged commit 6456a8d into apache:main May 15, 2025
10 checks passed
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025
# Rationale for this change
Currently, PyIceberg remote signing only works if the sign endpoint is
shared by all tables in a REST catalog.
However, some catalogs use table-specific endpoints.

If table-specific endpoints are used, PyIceberg sends the sign request
for the second table that is queried to the sign endpoint of the first
table.
The reason is that, although we [call register() again with a new signer
that has different
properties](https://github.com/apache/iceberg-python/blob/996a7ba4dbf4afdb3d46689f1715206b1c355f2a/pyiceberg/io/fsspec.py#L166),
the call has no effect the second time it runs: botocore skips
registration when a handler with the same unique_id already exists,
even if the properties differ.

https://github.com/boto/botocore/blob/8c517320c6a40cd91e8e7fbb05e27183ba2f6dce/botocore/hooks.py#L310-L312

This PR unregisters the old handler before registering the new one.

# Are these changes tested?
No. Any idea how we could test them?
So far, this has only been tested manually against LAKEKEEPER in a
notebook while implementing table-specific endpoints.

# Are there any user-facing changes?
It works now! Remote signing works with catalogs that use table-specific
sign endpoints.