fix: use pd.to_numeric in df_metrics_to_num to handle string-encoded numerics from ClickHouse#40190
Code Review Agent Run #27c151
Actionable Suggestions - 0
Additional Suggestions - 1
✅ Deploy Preview for superset-docs-preview ready!
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #40190 +/- ##
==========================================
- Coverage 64.58% 63.73% -0.86%
==========================================
Files 2564 2586 +22
Lines 133576 137510 +3934
Branches 31033 31631 +598
==========================================
+ Hits 86271 87636 +1365
- Misses 45813 48338 +2525
- Partials 1492 1536 +44
56b724d to 7e3cb89
The flagged issue is valid. Using superset/common/utils/dataframe_utils.py
Code Review Agent Run #8c3520
Actionable Suggestions - 0
Additional Suggestions - 1
7e3cb89 to dd2a0d2
Code Review Agent Run #bbb707
Actionable Suggestions - 0
Additional Suggestions - 1
Hi @hainenber, all checks are passing and the fix has been updated. Thank you!
hainenber left a comment
This fix is a bit broad and can potentially affect other DB sources as well. Is there a way to limit it to ClickHouse instead?
Hi @hainenber, updated the fix to be safer, now it only applies
dd2a0d2 to 4379aea
Code Review Agent Run #6844e8
Actionable Suggestions - 0
Hi @hainenber, I've updated the fix based on your feedback. The implementation now only converts a column when ALL non-null values successfully convert to numeric, making it safe for mixed-type columns from any backend:

```python
converted = pd.to_numeric(df[col], errors="coerce")
if converted.notna().eq(df[col].notna()).all():
    df[col] = converted
```

This is safer than the original
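To make the check above concrete, here is a minimal sketch of how it might be applied across several columns. The helper name `convert_numeric_strings`, its signature, and the sample column names are hypothetical and are not taken from the Superset code; only the three-line coerce-and-compare pattern comes from the comment above.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def convert_numeric_strings(df: pd.DataFrame, columns: list[str]) -> None:
    """Convert string-encoded numeric columns in place (hypothetical helper)."""
    for col in columns:
        # Skip missing columns and columns that are already numeric.
        if col not in df.columns or is_numeric_dtype(df[col]):
            continue
        # Unparseable values become NaN instead of raising.
        converted = pd.to_numeric(df[col], errors="coerce")
        # Keep the result only if coercion introduced no new nulls,
        # i.e. every non-null value parsed as a number.
        if converted.notna().eq(df[col].notna()).all():
            df[col] = converted


# Illustrative data: one all-numeric string column, one mixed column.
df = pd.DataFrame(
    {
        "sum_sales": pd.Series(["655", "12", None], dtype=object),
        "label": pd.Series(["a", "7", None], dtype=object),
    }
)
convert_numeric_strings(df, ["sum_sales", "label"])
```

In this sketch `sum_sales` becomes a float column, while `label` is left as it was because `"a"` would have been coerced to a null.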
hainenber left a comment
Let's add a unit test to prevent regression since this is quite impactful.
4379aea to a650c9f
Hi @hainenber, added a unit test to
Could you re-review? Thank you!
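For readers who want a sense of what such a regression test could cover, the sketch below exercises the three cases discussed in this thread: numeric strings, a mixed column, and an all-null column. It is not the test added in this PR; `safe_to_numeric` is a hypothetical stand-in for the conversion under test, and the test names are illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def safe_to_numeric(series: pd.Series) -> pd.Series:
    """Stand-in for the conversion under test (hypothetical name)."""
    converted = pd.to_numeric(series, errors="coerce")
    # Accept the result only if coercion introduced no new nulls.
    if converted.notna().eq(series.notna()).all():
        return converted
    return series


def test_numeric_strings_are_converted() -> None:
    result = safe_to_numeric(pd.Series(["655", "12.5", None], dtype=object))
    assert is_numeric_dtype(result)
    assert result.sum() == 667.5


def test_mixed_column_is_left_unchanged() -> None:
    result = safe_to_numeric(pd.Series(["655", "n/a"], dtype=object))
    assert result.tolist() == ["655", "n/a"]


def test_all_null_column_does_not_raise() -> None:
    result = safe_to_numeric(pd.Series([None, None], dtype=object))
    assert result.isna().all()
```

The mixed-column case is the one that guards against the concern raised earlier in the review, that a blanket conversion could affect other database backends.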
Code Review Agent Run #fd79c1
Actionable Suggestions - 0
Additional Suggestions - 1
a650c9f to e6c1532
Code Review Agent Run #1b821b
Actionable Suggestions - 0
Filtered by Review Rules
Bito filtered these suggestions based on rules created automatically for your feedback.
48435ee to 12c9a4a
Hi @hainenber, the only failing check is All other required checks are passing. Could you re-review and approve? Thank you!
Code Review Agent Run #fc4dc1
Actionable Suggestions - 1
- tests/unit_tests/common/test_dataframe_utils.py - 1
  - Incorrect test data construction · Line 60-63
Review Details
- Files reviewed - 2 · Commit Range: 12c9a4a..12c9a4a
  - superset/common/utils/dataframe_utils.py
  - tests/unit_tests/common/test_dataframe_utils.py
- Files skipped - 0
- Tools
  - Whispers (Secret Scanner) - ✔︎ Successful
  - Detect-secrets (Secret Scanner) - ✔︎ Successful
  - MyPy (Static Code Analysis) - ✔︎ Successful
  - Astral Ruff (Static Code Analysis) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
- /review - Manually triggers a full AI review.
- /pause - Pauses automatic reviews on this pull request.
- /resume - Resumes automatic reviews.
- /resolve - Marks all Bito-posted review comments as resolved.
- /abort - Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
Fixes apache#39951
The exclude_mount_points patterns used glob syntax (e.g. /dev/*) with match_type: regexp. Replace with properly anchored regexp patterns.
Co-authored-by: Đỗ Trọng Hải <41283691+hainenber@users.noreply.github.com>
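The difference between glob and regexp semantics mentioned in this commit message can be illustrated with a short sketch. Python's `re` module is used here only as a stand-in for whatever regexp engine the configuration is evaluated with, unanchored search semantics are assumed, and both the anchored pattern and the mount point list are hypothetical examples rather than the values used in the commit.

```python
import re

# Glob syntax mistakenly evaluated as a regexp: "/dev" followed by zero or more "/".
glob_style = "/dev/*"
# Hypothetical anchored replacement: "/dev" itself or anything beneath it.
anchored = r"^/dev(/.*)?$"

# Illustrative mount points, not taken from the commit.
mount_points = ["/dev", "/dev/shm", "/devices", "/home/dev"]

# With search semantics the glob-style pattern matches any path containing "/dev".
glob_hits = [m for m in mount_points if re.search(glob_style, m)]
# The anchored pattern only matches "/dev" and paths under it.
anchored_hits = [m for m in mount_points if re.search(anchored, m)]
```

Under these assumptions the glob-style pattern matches all four paths, while the anchored pattern matches only `/dev` and `/dev/shm`.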
12c9a4a to ebdcd9c
Code Review Agent Run #86e44d
Actionable Suggestions - 0
Fixes #39951
Problem
ClickHouse (and some other backends) return numeric SUM() columns as
strings. The existing `infer_objects()` call does not convert these to
numeric types, causing `pandas/core/nanops.py` to raise:

```
TypeError: Could not convert string '655' to numeric
```

Fix
Replace `infer_objects()` with `pd.to_numeric(errors="ignore")`, which
correctly converts string-encoded numerics while leaving truly
non-numeric columns unchanged.
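A small sketch of the behaviour described above, using made-up data: `infer_objects()` leaves a column of numeric strings non-numeric, whereas `pd.to_numeric` parses it. The column name `sum_value` is illustrative. Note that `errors="ignore"` is deprecated as of pandas 2.2, so this sketch calls `pd.to_numeric` in its default strict mode on clean data rather than reproducing that argument.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Illustrative data: a SUM() metric that arrived as strings (object dtype).
df = pd.DataFrame({"sum_value": pd.Series(["655", "12"], dtype=object)})

# infer_objects() only soft-converts objects that already hold numeric
# Python values; it does not parse strings into numbers.
inferred = df.infer_objects()
still_text = not is_numeric_dtype(inferred["sum_value"])

# pd.to_numeric parses the strings, so aggregations work afterwards.
parsed = pd.to_numeric(df["sum_value"])
total = parsed.sum()
```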
Testing
The existing `df_metrics_to_num` tests cover this path.