Parquet stats_min_value/stats_max_value do not respect timestamp configuration #5533

Closed
2 tasks done
greaka opened this issue Nov 29, 2022 · 1 comment · Fixed by #5540

greaka commented Nov 29, 2022

What happens?

When reading Parquet files or their metadata, the timestamp column statistics stats_min_value and stats_max_value do not seem to respect the column's configured timestamp unit of milliseconds (ms). As a result, WHERE filters on the timestamp column skip entire chunks.
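To illustrate why a unit mismatch would make a filter drop a whole chunk, here is a minimal Python sketch. It is not DuckDB's code: the raw value, the `decode` helper, and the pruning check are all hypothetical, and it assumes (which the report does not confirm) that the millisecond statistic was read as microseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode(raw: int, unit: str) -> datetime:
    """Decode a raw integer timestamp using the given unit ('ms' or 'us')."""
    if unit == "ms":
        return EPOCH + timedelta(milliseconds=raw)
    if unit == "us":
        return EPOCH + timedelta(microseconds=raw)
    raise ValueError(f"unsupported unit: {unit}")


def chunk_may_match_greater_than(stats_max: datetime, cutoff: datetime) -> bool:
    """Statistics-based pruning: a chunk can only contain rows with
    timestamp > cutoff if its max statistic is above the cutoff."""
    return stats_max > cutoff


# Hypothetical raw max statistic, stored in milliseconds since the epoch.
raw_max = 1_669_570_955_480

correct = decode(raw_max, "ms")   # decoded with the column's actual unit
misread = decode(raw_max, "us")   # assumed mismatch: treated as microseconds

cutoff = datetime(2022, 1, 1, tzinfo=timezone.utc)

print("correct max:", correct.isoformat())
print("misread max:", misread.isoformat())
print("chunk kept with correct unit:", chunk_may_match_greater_than(correct, cutoff))
print("chunk kept with misread unit:", chunk_may_match_greater_than(misread, cutoff))
```

Under that assumption the misread maximum lands in January 1970, so a filter such as `timestamp > '2022-01-01 00:00:00.000'` would discard the chunk even though its rows match.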

To Reproduce

Example file using ms precision in stats:
2022-11-27T17 42 35.480941703Z.parquet.txt
select path_in_schema, stats_min_value, stats_max_value from parquet_metadata('file.parquet');
select * from 'file.parquet';
select * from 'file.parquet' where timestamp < '2022-01-01 00:00:00.000';
select * from 'file.parquet' where timestamp > '2022-01-01 00:00:00.000';
I've verified the raw values in the parquet file before filing this issue.

OS:

Linux-x64 & shell.duckdb.org

DuckDB Version:

0.6.0

DuckDB Client:

rust & wasm

Full Name:

Michl Steglich

Affiliation:

drf.rs

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
@Mytherin
Collaborator

Thanks for the report! I have pushed a fix in #5540

hannes added a commit that referenced this issue Nov 30, 2022
Fix #5533: correctly use timestamp logical type unit in Parquet stats reader