-
-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix/1616: Polars data container validation #1623
Conversation
Signed-off-by: Connor Stabnick <cstabnick@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! See inline comments.
You can also set up pre-commit
to make sure these pass:
https://pandera--1618.org.readthedocs.build/en/1618/CONTRIBUTING.html#set-up-pre-commit
def check( | ||
self, | ||
pandera_dtype: DataType, | ||
data_container: Optional[polars_engine.PolarsDataContainer] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the type needs to be PolarsData
no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, was following how I saw other dtypes take the parameter in.
pandera/pandera/engines/polars_engine.py
Lines 370 to 371 in 612d25c
data_container: Optional[PolarsDataContainer] = None, | |
) -> Union[bool, Iterable[bool]]: |
It looks like when it needs to be treated as PolarsData, it may be wrapped in the class. Think I'm going to handle mypy error similar to
pandera/pandera/engines/polars_engine.py
Lines 632 to 635 in 612d25c
def coerce(self, data_container: PolarsDataContainer) -> pl.LazyFrame: | |
"""Coerce data container to the data type.""" | |
if isinstance(data_container, pl.LazyFrame): | |
data_container = PolarsData(data_container) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decided not to take this approach due to it just being for a specific test, updated to PolarsData
data_container: Optional[polars_engine.PolarsDataContainer] = None, | ||
) -> Union[bool, Iterable[bool]]: | ||
if key := data_container.key: | ||
ldf = data_container.lazyframe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy error here should be fixed if data_container
type is fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, precommit passes now
pandera_dtype: DataType, | ||
data_container: Optional[polars_engine.PolarsDataContainer] = None, | ||
) -> Union[bool, Iterable[bool]]: | ||
if key := data_container.key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy error:
tests/polars/test_polars_components.py:206: error: Missing return statement [return]
due to lack of return statement in the outer if statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, pre commit now passes mypy and pylint
pandera_dtype: DataType, | ||
data_container: Optional[polars_engine.PolarsDataContainer] = None, | ||
) -> Union[bool, Iterable[bool]]: | ||
if key := data_container.key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key
is really only used once in the function body: I'd recommend just using data_container.key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1623 +/- ##
===========================================
- Coverage 94.29% 83.09% -11.20%
===========================================
Files 91 116 +25
Lines 7024 8538 +1514
===========================================
+ Hits 6623 7095 +472
- Misses 401 1443 +1042 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Connor Stabnick <cstabnick@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! codecov error looks like a false positive.
thanks for the contribution, and congrats on your first PR to pandera @cstabnick !
Adds support for polars data container validation in custom dtypes
Adds test demonstrating positive and negative case