check_for_invalid_stats_in_isnull #16776

Draft
wants to merge 1 commit into
base: main

Conversation

deanm0000
Collaborator

This is intended to work in conjunction with #16766 to fix #16683 and #15323.

There are really two things to fix with those issues. The first, which @nameexhaustion addressed, is to stop writing bad stats. The other is to have the reader recognize bad stats (i.e., when min > max) and ignore them, since files written with those stats will continue to exist.
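To make the reader-side idea concrete, here is a minimal sketch. It is not the code in this PR and does not use the polars statistics types; `MinMaxStats` and `can_skip_for_eq` are hypothetical names, and the column is assumed to hold `i64` values. The point is only that stats with min > max are treated as untrustworthy, so the chunk is read instead of skipped.

```rust
/// Hypothetical stand-in for the min/max statistics of one column chunk.
/// (Assumption: not the actual polars statistics type.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MinMaxStats {
    pub min: i64,
    pub max: i64,
}

impl MinMaxStats {
    /// Stats are considered corrupt when the recorded minimum exceeds the maximum.
    pub fn is_plausible(&self) -> bool {
        self.min <= self.max
    }
}

/// Decide whether a chunk can be skipped for the predicate `col == value`.
/// Returns `false` (the chunk must be read) when stats are missing or implausible.
pub fn can_skip_for_eq(stats: Option<MinMaxStats>, value: i64) -> bool {
    match stats {
        // Trustworthy stats: skip only if the value lies outside [min, max].
        Some(s) if s.is_plausible() => value < s.min || value > s.max,
        // No stats, or min > max: ignore the stats and read the chunk.
        _ => false,
    }
}
```

With this shape, a file whose stats say `min = 10, max = 0` is always read, at the cost of losing the skip optimization for that chunk.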

I only did one boolean function because I'm hoping there's a way to hoist the check higher up rather than having to copy/paste it into every single function. Also, I probably didn't do it optimally and wanted to get some pointers before doing any more. I don't think I'll be able to put any more work into this until at least next week, so if someone else wants to take this over, that's fine with me.
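One possible way to hoist the check, sketched under assumptions: validate the stats once where they are loaded and hand only validated stats to the per-function logic. `RawStats`, `ValidStats`, and `should_read` are hypothetical names for illustration, not existing polars APIs, and the column is assumed to hold `i64` values.

```rust
/// Hypothetical raw statistics as read from file metadata.
#[derive(Debug, Clone, Copy)]
pub struct RawStats {
    pub min: i64,
    pub max: i64,
}

/// Wrapper that can only be built from stats that passed validation,
/// so individual predicate functions never need to re-check min <= max.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ValidStats {
    min: i64,
    max: i64,
}

impl ValidStats {
    /// Returns `None` when the raw stats are implausible (min > max).
    pub fn new(raw: RawStats) -> Option<Self> {
        (raw.min <= raw.max).then_some(ValidStats {
            min: raw.min,
            max: raw.max,
        })
    }

    /// True if `value` could be present given the validated range.
    pub fn may_contain(&self, value: i64) -> bool {
        self.min <= value && value <= self.max
    }
}

/// Single entry point: the validity check lives here, once.
/// Missing or invalid stats mean the chunk is read unconditionally.
pub fn should_read(raw: Option<RawStats>, pred: impl Fn(&ValidStats) -> bool) -> bool {
    match raw.and_then(ValidStats::new) {
        Some(stats) => pred(&stats),
        None => true,
    }
}
```

A caller would then write something like `should_read(raw, |s| s.may_contain(5))`, and each predicate closure can assume the range is sane.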


codecov bot commented Jun 6, 2024

Codecov Report

Attention: Patch coverage is 76.92308% with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.32%. Comparing base (8370e3c) to head (07a301c).
Report is 16 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| crates/polars-expr/src/expressions/apply.rs | 76.92% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16776      +/-   ##
==========================================
- Coverage   81.45%   81.32%   -0.13%     
==========================================
  Files        1413     1424      +11     
  Lines      185954   187215    +1261     
  Branches     2729     2714      -15     
==========================================
+ Hits       151464   152254     +790     
- Misses      33999    34464     +465     
- Partials      491      497       +6     

Development

Successfully merging this pull request may close these issues.

Handle parquet files with incorrect statistics in scan_parquet