GH-39865: [C++] Strip extension metadata when importing a registered extension #39866

Merged
pitrou merged 2 commits into apache:main from gh39865-c-data-interface-extensions on Feb 5, 2024

Conversation

pitrou (Member) commented Jan 31, 2024

Rationale for this change

When importing an extension type from the C Data Interface and the extension type is registered, we would still leave the extension-related metadata on the storage type.

What changes are included in this PR?

Strip extension-related metadata on the storage type if we succeed in recreating the extension type.
This matches the behavior of the IPC layer and allows for more exact roundtripping.
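
To make the intended behavior concrete, here is a minimal self-contained sketch of the import rule described above. It is a toy model, not the code in this PR: `Metadata`, `ImportedField`, `ImportField` and the `registry` set are hypothetical stand-ins for illustration. Only the two reserved key names, `ARROW:extension:name` and `ARROW:extension:metadata`, are taken from Arrow itself.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not Arrow classes.
using Metadata = std::vector<std::pair<std::string, std::string>>;

struct ImportedField {
  std::string extension_name;  // empty when no extension type was recreated
  Metadata metadata;           // metadata left on the storage type
};

// The two reserved keys Arrow uses to carry extension information.
constexpr const char* kExtensionName = "ARROW:extension:name";
constexpr const char* kExtensionMetadata = "ARROW:extension:metadata";

// Hypothetical import step: if the extension named in the metadata is
// registered, recreate it and drop the extension keys; otherwise keep the
// metadata untouched so that no information is lost.
ImportedField ImportField(const Metadata& incoming,
                          const std::set<std::string>& registry) {
  ImportedField out;
  out.metadata = incoming;
  auto it = std::find_if(
      incoming.begin(), incoming.end(),
      [](const auto& kv) { return kv.first == kExtensionName; });
  if (it == incoming.end() || registry.count(it->second) == 0) {
    return out;  // not an extension, or unregistered: leave metadata as-is
  }
  out.extension_name = it->second;
  // Remove only the extension-related keys; other metadata is preserved.
  out.metadata.erase(
      std::remove_if(out.metadata.begin(), out.metadata.end(),
                     [](const auto& kv) {
                       return kv.first == kExtensionName ||
                              kv.first == kExtensionMetadata;
                     }),
      out.metadata.end());
  return out;
}
```

In this model, importing with the extension registered yields a field whose metadata no longer contains the two extension keys, while importing with an empty registry returns the metadata unchanged. That asymmetry is what lets an unregistered extension still be recovered later.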

Are these changes tested?

Yes.

Are there any user-facing changes?

No, unless people mistakenly rely on the presence of said metadata.

bkietz (Member) left a comment

Mostly LGTM, just one nit

cpp/src/arrow/c/bridge_test.cc (resolved)
@@ -188,7 +186,7 @@ std::vector<std::pair<std::string, std::string>> KeyValueMetadata::sorted_pairs(
   return pairs;
 }
 
-int KeyValueMetadata::FindKey(const std::string& key) const {
+int KeyValueMetadata::FindKey(std::string_view key) const {

👍
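
The signature change above is probably approved because a `std::string_view` parameter accepts string literals, `std::string` objects and substrings without forcing a temporary `std::string`. The following sketch illustrates that with a hypothetical `KeyList` struct. It is not Arrow's `KeyValueMetadata`, and returning -1 for a missing key is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a key list; not Arrow's KeyValueMetadata.
struct KeyList {
  std::vector<std::string> keys;

  // Returns the index of the first matching key, or -1 if absent
  // (the -1 convention is an assumption of this sketch).
  int FindKey(std::string_view key) const {
    for (std::size_t i = 0; i < keys.size(); ++i) {
      if (keys[i] == key) return static_cast<int>(i);
    }
    return -1;
  }
};
```

A caller can then look up a slice of a larger buffer directly, for example `list.FindKey(std::string_view(buffer).substr(0, 20))`, without copying the slice into a new string first.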

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Feb 2, 2024
@pitrou pitrou force-pushed the gh39865-c-data-interface-extensions branch from 3c3033d to 7d025df Compare February 5, 2024 15:32
pitrou (Member, Author) commented Feb 5, 2024

CI failures are unrelated, will merge.

@pitrou pitrou merged commit 56951fe into apache:main Feb 5, 2024
29 of 32 checks passed
@pitrou pitrou removed the awaiting merge Awaiting merge label Feb 5, 2024
@pitrou pitrou deleted the gh39865-c-data-interface-extensions branch February 5, 2024 16:16
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 56951fe.

There were 2 benchmark results indicating a performance regression:

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…tered extension (apache#39866)

* Closes: apache#39865

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
raulcd pushed a commit that referenced this pull request Feb 20, 2024
…extension (#39866)
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
…tered extension (apache#39866)
thisisnic pushed a commit to thisisnic/arrow that referenced this pull request Mar 8, 2024
…tered extension (apache#39866)
Successfully merging this pull request may close these issues.

[C++] Schema C Data Interface import keeps metadata for registered extension type
2 participants