Allow converting the Dataframe according to the defined schema #63

Open
mjspier opened this issue May 8, 2024 · 0 comments

mjspier (Contributor) commented May 8, 2024

Description

Pandera can be used not only to validate a DataFrame but also to convert its dtypes according to the schema.

The schema.validate function returns the validated DataFrame with the converted dtypes. We can replace the input DataFrame with the validated one, so that the nodes receive a DataFrame that is both validated and converted according to the schema.

Context

Possible Implementation

Add a configuration parameter that lets each dataset define whether it should only be validated or also converted.
If a dataset is configured to be converted as well, the hook can forward the converted dataset.

A global parameter can define the default behaviour for all datasets that use a pandera schema.
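One way the proposal could look, as a rough sketch rather than the project's actual hook: the `convert` key, `DEFAULT_CONVERT`, `should_convert` and `process_inputs` are all hypothetical names, and a plain callable stands in for `schema.validate` so the example runs without any framework.

```python
from typing import Any, Callable, Dict, Mapping

# Hypothetical global default: validate only, keep the original data.
DEFAULT_CONVERT = False


def should_convert(
    dataset_config: Mapping[str, Any],
    global_default: bool = DEFAULT_CONVERT,
) -> bool:
    """A per-dataset `convert` setting wins; otherwise use the global default."""
    value = dataset_config.get("convert")
    return global_default if value is None else bool(value)


def process_inputs(
    inputs: Mapping[str, Any],
    validators: Mapping[str, Callable[[Any], Any]],
    dataset_configs: Mapping[str, Mapping[str, Any]],
    global_default: bool = DEFAULT_CONVERT,
) -> Dict[str, Any]:
    """Validate every input that has a validator and forward the
    converted data only for datasets configured to convert."""
    outputs = dict(inputs)
    for name, validate in validators.items():
        if name not in inputs:
            continue
        # In practice this would be schema.validate, which raises on invalid data.
        validated = validate(inputs[name])
        if should_convert(dataset_configs.get(name, {}), global_default):
            outputs[name] = validated
    return outputs


def to_int(rows):
    """Stand-in validator that converts every value to int."""
    return [int(x) for x in rows]


inputs = {"a": ["1", "2"], "b": ["3"]}
validators = {"a": to_int, "b": to_int}

# Only dataset "a" opts in to conversion; "b" falls back to the default.
result = process_inputs(inputs, validators, {"a": {"convert": True}})
print(result)
```

Keeping the decision in a small helper means the per-dataset setting and the global default are resolved in one place, whatever the final configuration format turns out to be.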

@mjspier mjspier mentioned this issue May 10, 2024
6 tasks