add support for dataframe schema transformations: add_column, remove_column #6

Closed
cosmicBboy opened this issue Nov 19, 2018 · 1 comment

Comments

@cosmicBboy

These should be schema methods that correspond to pandas DataFrame operations.

For example, if the user adds a column to a dataframe, also support changing the corresponding schema to account for that change:

import pandas as pd
from pandera import Column, DataFrameSchema, PandasDtype

df = pd.DataFrame({"a": [1, 2, 3]})

schema = DataFrameSchema([Column("a", PandasDtype.Int)])
df = schema.validate(df)

# add a column to the dataframe
df["b"] = ["x", "y", "z"]

# add column to the dataframe schema
schema = schema.add_column(Column("b", PandasDtype.String))
df = schema.validate(df)

# same with removing columns
df = df.drop("a", axis=1)
schema = schema.remove_column("a")

df = schema.validate(df)


# or reflecting changes in an existing column (assuming "a" is still present)
df["a"] = df["a"].astype(float)
schema = schema.change_column(Column("a", PandasDtype.Float))

df = schema.validate(df)
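
To make the proposed semantics concrete, here is a minimal sketch of how `add_column`, `remove_column`, and `change_column` could behave if each returned a new schema rather than mutating the existing one. This is not pandera's implementation: `ToySchema` is a hypothetical stand-in that maps column names to plain Python types and validates a dict of lists instead of a dataframe, and the method signatures are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySchema:
    """Hypothetical stand-in for a dataframe schema (not pandera's API)."""

    # Immutable tuple of (column name, Python type) pairs.
    columns: tuple

    def add_column(self, name, dtype):
        """Return a new schema with an extra column; the original is untouched."""
        cols = dict(self.columns)
        if name in cols:
            raise ValueError(f"column {name!r} already in schema")
        cols[name] = dtype
        return ToySchema(tuple(cols.items()))

    def remove_column(self, name):
        """Return a new schema without the named column."""
        cols = dict(self.columns)
        if name not in cols:
            raise KeyError(f"column {name!r} not in schema")
        del cols[name]
        return ToySchema(tuple(cols.items()))

    def change_column(self, name, dtype):
        """Return a new schema with the named column's type replaced."""
        cols = dict(self.columns)
        if name not in cols:
            raise KeyError(f"column {name!r} not in schema")
        cols[name] = dtype
        return ToySchema(tuple(cols.items()))

    def validate(self, table):
        """Check a dict of column name -> list of values; return it if valid."""
        for name, dtype in self.columns:
            if name not in table:
                raise ValueError(f"missing column {name!r}")
            if not all(isinstance(value, dtype) for value in table[name]):
                raise TypeError(f"column {name!r} is not all {dtype.__name__}")
        return table


# Mirror the workflow from the example above on a plain dict of lists.
schema = ToySchema((("a", int),))
table = schema.validate({"a": [1, 2, 3]})

# Add a column to the data, then derive a matching schema.
table["b"] = ["x", "y", "z"]
schema_ab = schema.add_column("b", str)
table = schema_ab.validate(table)

# Remove a column from the data, then derive a matching schema.
del table["a"]
schema_b = schema_ab.remove_column("a")
table = schema_b.validate(table)
```

Returning a new schema from each method keeps the earlier schema usable for validating earlier stages of a pipeline, at the cost of having several schema objects in play.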
@cosmicBboy

This may obfuscate the code and be counterproductive to the entire point of pandera, which is to make code more readable.

cosmicBboy pushed a commit that referenced this issue May 12, 2023