Parquet read : Invalid decimal encoding in Parquet file #12621

Closed
2 tasks done
Jbcot opened this issue Jun 20, 2024 · 2 comments · Fixed by #12655

Comments

Jbcot commented Jun 20, 2024

What happens?

I get the following error when reading a Parquet file that contains a DECIMAL(17, 4) column from a Python script running on Linux.
The same error occurs on Windows when using DBeaver to read the Parquet file through a DuckDB connection.
Note that the file was extracted from a DB2/AS400 system with the Talend ETL tool.
I can open the file without any problem in another Parquet viewer.

duckdb.duckdb.InvalidInputException: Invalid Input Error: Attempting to execute an unsuccessful or closed pending query result
Error: Invalid Input Error: Invalid decimal encoding in Parquet file

The file to test is attached to the issue.
sample.zip
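
As background on what a "decimal encoding" means here: the Parquet format specification allows a DECIMAL value to be stored as an unscaled integer, either in an INT32/INT64 column or as big-endian two's-complement bytes in a BYTE_ARRAY or FIXED_LEN_BYTE_ARRAY column. The sketch below, in Python since that is the client used in this report, decodes such bytes with a scale of 4. The helper name `decode_parquet_decimal` and the sample bytes are made up for illustration. This is not DuckDB's reader code, and the report does not say which physical type the attached file uses or why the reader rejects it.

```python
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a byte-array DECIMAL value (hypothetical helper).

    Assumes `raw` holds the unscaled value as a big-endian
    two's-complement integer, as the Parquet spec describes.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# Illustrative bytes only: 123.4567 as DECIMAL(17, 4) has unscaled value 1234567.
eight_bytes = (1234567).to_bytes(8, "big", signed=True)
# The same number written with extra leading sign-extension bytes.
sixteen_bytes = (1234567).to_bytes(16, "big", signed=True)

print(decode_parquet_decimal(eight_bytes, 4))    # 123.4567
print(decode_parquet_decimal(sixteen_bytes, 4))  # 123.4567
```

Both byte strings decode to the same number, so different writers may produce different byte lengths for the same DECIMAL(17, 4) value.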

To Reproduce

FROM 'sample.parquet';

OS:

Debian GNU/Linux 11

DuckDB Version:

1.0.0

DuckDB Client:

Python

Full Name:

Jean-Blaise Cottenceau

Affiliation:

Manitou Group

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have

szarnyasg (Collaborator) commented

Hi @Jbcot, thanks! I could reproduce the issue with plain SQL.

fedefrancescon (Contributor) commented

Hi, I've attempted a fix for this in #12655.
I hope it looks good. It would be nice to add a smaller file to the test, since just a few rows should be sufficient.
Unfortunately, I'm not sure how to reduce the file size without importing and re-exporting the data.

As I'm pretty new here, any suggestions and improvements would be much appreciated.
