awswrangler.s3.read_parquet_metadata does not support fixed_size_binary datatype #2773

Closed
simonprydden opened this issue Apr 13, 2024 · 0 comments · Fixed by #2775
Labels: bug, needs-triage

Describe the bug

When a Parquet file stored in S3 contains a column of the fixed_size_binary datatype, calling awswrangler.s3.read_parquet_metadata(..) on that file fails with the error Unsupported Pyarrow type: fixed_size_binary[16].

The data type is documented in the pyarrow API reference: https://arrow.apache.org/docs/python/generated/pyarrow.FixedSizeBinaryType.html

How to Reproduce

import awswrangler

# Simple Parquet file on S3 with a fixed_size_binary column
awswrangler.s3.read_parquet_metadata('s3://<bucket>/path/to/file.parquet')

Expected behavior

The expected behaviour is that fixed_size_binary is mapped to UUID instead of raising an error.
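The link between fixed_size_binary[16] and UUID is presumably that a UUID is exactly 16 bytes long. The standard-library sketch below shows that round trip. It is an illustration of the reasoning only: it says nothing about how awswrangler maps the type, and not every 16-byte fixed-size column necessarily holds UUIDs.

```python
import uuid

# A UUID serialises to exactly 16 raw bytes, which fits fixed_size_binary[16].
value = uuid.uuid4()
raw = value.bytes

# The same 16 bytes can be turned back into the original UUID.
restored = uuid.UUID(bytes=raw)
print(len(raw), restored == value)
```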

Your project

No response

Screenshots

No response

OS

Windows

Python version

3.8

AWS SDK for pandas version

3.7.2

Additional context

No response
