Extends Pandera validator to handle more DF types #596

Merged
skrawcz merged 1 commit into main from add_dask_pandera_type_support
Dec 21, 2023

Conversation

skrawcz
Collaborator

@skrawcz skrawcz commented Dec 19, 2023

The Pandera validator assumed that functions were annotated with pandas types only. This change removes that assumption and ensures that functions annotated with a dask datatype will work.

Note: added pyspark without adding a test.

Changes

  • pandera validator

How I tested this

  • locally via unit tests

Notes

  • this uses the dataframe and column types registered in the registry, filtered against an explicit list of the types that pandera supports.
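
A minimal sketch of the filtering idea described in the note above, not the code in this PR. Every name here is hypothetical and only for illustration: the `DF_TYPE_REGISTRY` dict, the `PANDERA_SUPPORTED_EXTENSIONS` list, the stand-in dataframe classes, and both helper functions. The actual registry and validator in the project may be structured differently.

```python
from typing import Dict, List, Tuple


# Hypothetical stand-ins for real dataframe classes, so the sketch runs
# without pandas, dask, or pyspark installed.
class PandasDataFrame: ...
class DaskDataFrame: ...
class PolarsDataFrame: ...


# Hypothetical registry: extension name -> registered dataframe type.
DF_TYPE_REGISTRY: Dict[str, type] = {
    "pandas": PandasDataFrame,
    "dask": DaskDataFrame,
    "polars": PolarsDataFrame,
}

# Hypothetical explicit allow-list of extensions the validator should handle.
# "pyspark_pandas" is listed but not registered here, mirroring an optional
# dependency that is not installed.
PANDERA_SUPPORTED_EXTENSIONS: List[str] = ["pandas", "dask", "pyspark_pandas"]


def pandera_supported_types(
    registry: Dict[str, type], supported: List[str]
) -> Tuple[type, ...]:
    """Return the registered types whose extension is on the allow-list."""
    return tuple(registry[name] for name in supported if name in registry)


def applies_to(annotated_type: object) -> bool:
    """Check whether a function's annotated return type can be validated."""
    candidates = pandera_supported_types(
        DF_TYPE_REGISTRY, PANDERA_SUPPORTED_EXTENSIONS
    )
    # Only classes can be passed to issubclass; anything else is rejected.
    return isinstance(annotated_type, type) and issubclass(
        annotated_type, candidates
    )
```

With this shape, `applies_to(DaskDataFrame)` is true and `applies_to(PolarsDataFrame)` is false, even though both types are registered. Extensions that are on the allow-list but not installed are skipped rather than raising.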

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

Contributor

sweep-ai bot commented Dec 19, 2023

Apply Sweep Rules to your PR?

  • Apply: All new business logic should have corresponding unit tests.
  • Apply: Refactor large functions to be more modular.
  • Apply: Add docstrings to all functions and file headers.

@skrawcz skrawcz force-pushed the add_dask_pandera_type_support branch from 4c34d89 to feb5954 Compare December 19, 2023 19:54
We were assuming only pandas annotated functions. This changes
that and ensures that functions annotated with a dask datatype will work.

Note, added pyspark without adding a test. Don't want to require having pyspark
for unit tests just yet...
Collaborator

@elijahbenizzy elijahbenizzy left a comment

LGTM

@skrawcz skrawcz merged commit 71b4f2d into main Dec 21, 2023
22 checks passed
@skrawcz skrawcz deleted the add_dask_pandera_type_support branch December 21, 2023 00:18
2 participants