Releases: unionai-oss/pandera

0.6.1: coercion and required column bugfixes

07 Jan 01:03
bfdb118

Bugfix Release

This release contains two bugfixes:

  • coercing a nullable str column now handles the case where all values are NA (#366)
  • non-required columns that are not in dataframe are not coerced (#368)

0.6.0: Data Synthesis Strategies, Schema Enhancements

17 Dec 20:07

🎉🎉🎉 Thanks to @jeffzi, @ktroutman, @m1so for your contributions! 🎉🎉🎉

Enhancements

  • Improve memory efficiency of validation process (#360)
  • Add column order validation (#352)
  • Implement data synthesis strategies using hypothesis (#344)
  • Add support for aliases in SchemaModel (#329)
  • Add support for optional name validation of single-index (#326)
  • Move columns to multiindex: add reset_index and set_index methods to DataFrameSchema (#319)
  • Add support for Python 3.9 (#307)

Bugfixes

  • typing.DataFrame should expect annotation input (#318)

Deprecations

  • SchemaErrors.schema_errors has been changed to failure_cases, and the schema_errors attribute now contains a list of dicts containing schema errors and reason codes. This is a breaking change, but is a minor part of the API and is fairly straightforward to fix (#360).

Documentation Improvements

  • Add required columns documentation for schema models (#362)
  • Fix docs: schema examples (#347)
  • Add documentation for dataframeschema transformations (#333)
  • Fix deprecated SchemaErrorReport references in docs (#310)
  • Fix SchemaModel dtype example (#309)

Repo Improvements

  • Update logo 69c6e56
  • Add flynt to pre-commit hooks (#325)
  • Use generic zenodo link for citation information c4f4fe7

0.5.1: bugfix - add packaging dependency

28 Nov 21:15

pandera relied on the packaging package to get version information to determine pandas legacy status. This was an implicit sub-dependency of one of pandera's dependencies, which was apparently dropped and led to a bug: #335. This bugfix version explicitly adds packaging.
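
For context, a version check of this kind might look like the sketch below. The 1.0.0 threshold and the PANDAS_IS_LEGACY name are assumptions made for illustration; these notes do not state how pandera defines legacy status.

```python
import pandas as pd
from packaging import version

# Assumption: "legacy" means a pandas release older than 1.0.0.
# The threshold actually used by pandera is not given in these notes.
PANDAS_IS_LEGACY = version.parse(pd.__version__) < version.parse("1.0.0")

print(f"pandas {pd.__version__} treated as legacy: {PANDAS_IS_LEGACY}")
```

Because the check imports packaging directly, the package has to be declared as an explicit dependency rather than relied on transitively.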

0.5.0: Class-based API for DataFrame Typing

25 Oct 13:24
db31b10

Enhancements

  • Implement class-based API for pydantic-style schema definitions 786b504. Big thanks to @jeffzi 🎉
  • Add inplace=False argument to schema.validate method to prevent mutation of original dataframe 586ebf3.
  • Make pandera optional extensions [hypothesis], [io], [all] available c4716a0. Thanks @amitripshtos and @jeffzi 🎉
  • Add support for complex number data types 50e86e4 thanks @ferhah 🎉
  • Add support for numpy scalar types a519db5
  • Add check_io decorator for checking inputs and outputs of a function 913cbd7
  • Throw SchemaError with column name instead of ValueError for nulls in int series f7b03e3 thanks @TheCleric 🎉

Bugfixes

  • Fix io.to_script and to_yaml: Checks with lambda functions are ignored during serialization da9c3a5 thanks @ferhah 🎉

Deprecations

  • Drop support for Python 3.5 91e21a2
  • Deprecate transformers argument in DataFrameSchema init 89c3c91

Documentation Improvements

Repo Improvements

0.4.5: additional type support, SeriesSchema index support, built-in check Aliases, bugfixes

22 Sep 00:53
038876d

Enhancements

  • make failure case reporting more intuitive #232
  • rename internal decorator for setting check statistics #235 thanks @Aditya1001001
  • from_yaml supports all column properties #240 thanks @d33bs
  • support for nullable integer string aliases and dtypes #244
  • add check_output to the CheckResult namedtuple #251
  • built-in python scalar types are supported: int, float, str, bool #263
  • Use Check.name in Check.repr #265 thanks @JacobHayes
  • add comparison operator aliases to built-in checks #269
  • add support for SeriesSchema index specification #270

Bugfixes

  • io serialization can handle Index.name = None #248
  • pandas_dtype can be correctly set in Column object #256
  • fix check_input decorator when df passed in kwargs #257 thanks @vshulyak

Documentation Improvements

0.4.4: bugfixes in yaml serialization, error reporting, refactor internals

01 Jul 19:15
8efc536

New Features

  • DataFrameSchema provides rename_columns method #226 @baskervilski
  • Failure case reporting is more intuitive as a tidy dataframe #232

Bugfixes

0.4.3: bugfixes handle scalar False check_fn, yaml schema supports strict kwarg

16 Jun 22:10
a34f4fd

Bugfixes

  • lazy validation handles check returning scalar False value 3bf8e72
  • add strict keyword to yaml 3bd2655 - first contribution by @staylorx 🎉🎉🎉

bugfix: conda build failure, use version.py file

05 Jun 14:41

The package now uses a version.py file as the single source of truth for the package version.

add check ignore_na arg, bugfixes

05 Jun 04:31

New Feature

  • ignore_na keyword argument to Check class 180072e
    drops null values within the check function before passing the data to
    check_fn. The SeriesSchemaBase.validate method no longer does this.

Bugfixes

  • one sample hypothesis tests in the context of Column shouldn't need samples arg 0b0d6cc
  • add lazy kwarg to check_input and check_output 923a197
  • fix lazy validation with nullable columns f5051fa

Schema inference, serialization, and lazy validation

15 May 20:25

New Features

  • schema inference #202 #209
  • schema yaml and script serialization #203
  • lazy validation on schema validation #207

Minor Improvements

  • updated documentation
  • define _CheckBase class #195
  • simplify type-matching logic #194