-
-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: properly coerce dtypes for columns with regex=True #1602
fix: properly coerce dtypes for columns with regex=True #1602
Conversation
59d0f45
to
46096bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! Looks good to me, see comment below
tests/core/test_schema_components.py
Outdated
@@ -53,6 +56,35 @@ def test_column_coerce() -> None: | |||
assert Engine.dtype(validated.a.dtype) == Engine.dtype(int) | |||
|
|||
|
|||
def test_config_coerce() -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One recommendation I have is to convert these into DataFrameSchema
s instead of using DataFrameModel
because the latter is essentially converted into the former when validation happens, i.e. DataFrameSchema
the "true" representation of a schema.
They can live in test_schemas.py
if you decide to change this. Also happy to accept these tests using DataFrameModel
, in which case they ought to be in test_model.py
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1602 +/- ##
===========================================
- Coverage 94.29% 83.11% -11.18%
===========================================
Files 91 116 +25
Lines 7024 8536 +1512
===========================================
+ Hits 6623 7095 +472
- Misses 401 1441 +1040 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Tess Linden <tess.linden@gmail.com>
Signed-off-by: Tess Linden <tess.linden@gmail.com>
Signed-off-by: Tess Linden <tess.linden@gmail.com>
0b00a29
to
ccc70f2
Compare
Signed-off-by: Tess Linden <tess.linden@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @tesslinden 🚀 and congrats on your first PR to pandera 🎉
Awesome! Thank you! |
Fixes #1182. I ran into this bug myself, then found it was previously reported.
See the included tests for a minimal example of the bug:
test_config_coerce()
passes onmain
;test_config_coerce_with_regex()
fails onmain
, but passes with this fix.The change I've submitted here is the minimal change necessary to fix the bug. With this fix, some code is duplicated between the regex and non-regex blocks of the
_coerce_dtype_helper()
function. I considered separating it into helper functions like_should_coerce()
or_override_and_try_coercion()
, but there are several ways one could split it up, so I figured reviewers can decide which of those would be preferred.Also, I wasn't sure which file the tests should go in -- let me know if they should be moved.