Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add title/description fields #754

Merged
merged 9 commits into from Feb 5, 2022
Merged

Conversation

smackesey
Copy link
Contributor

@smackesey smackesey commented Jan 31, 2022

This PR addresses #728

Right now it's just a start, but my manual (not very thorough testing) indicates that it's working.

Other things that should probably be added/modified:

  • docs
  • tests
  • error messages should use descriptions

Also, I had some failing tests but I don't think these are related to this PR:

FAILED tests/koalas/test_schemas_on_koalas.py::test_nullable[dtype5] - TypeError: field field: LongType can not accept object <NA> in type <class 'pandas._libs.missing.NAType'>
FAILED tests/koalas/test_schemas_on_koalas.py::test_nullable[dtype10] - TypeError: field field: ShortType can not accept object <NA> in type <class 'pandas._libs.missing.NAT...
FAILED tests/koalas/test_schemas_on_koalas.py::test_nullable[dtype11] - TypeError: field field: ByteType can not accept object <NA> in type <class 'pandas._libs.missing.NATy...
FAILED tests/koalas/test_schemas_on_koalas.py::test_nullable[dtype13] - TypeError: field field: IntegerType can not accept object <NA> in type <class 'pandas._libs.missing.N...
FAILED tests/modin/test_schemas_on_modin.py::test_nullable[ray-dtype2] - ValueError: StringArray requires a sequence of strings or pandas.NA
FAILED tests/strategies/test_strategies.py::test_check_nullable_field_strategy[True-index_strategy-data_type24] - ValueError: StringArray requires a sequence of strings or p...
FAILED tests/strategies/test_strategies.py::test_check_nullable_field_strategy[True-series_strategy-data_type24] - ValueError: StringArray requires a sequence of strings or ...

@codecov
Copy link

codecov bot commented Jan 31, 2022

Codecov Report

Merging #754 (0a9c904) into dev (1b90834) will decrease coverage by 0.46%.
The diff coverage is 95.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #754      +/-   ##
==========================================
- Coverage   98.16%   97.70%   -0.47%     
==========================================
  Files          45       45              
  Lines        3975     4006      +31     
==========================================
+ Hits         3902     3914      +12     
- Misses         73       92      +19     
Impacted Files Coverage Δ
pandera/model.py 95.01% <ø> (-3.45%) ⬇️
pandera/schema_components.py 99.54% <ø> (ø)
pandera/schemas.py 99.22% <93.75%> (-0.14%) ⬇️
pandera/checks.py 98.51% <100.00%> (+0.01%) ⬆️
pandera/model_components.py 95.57% <100.00%> (+0.16%) ⬆️
pandera/typing/config.py 100.00% <100.00%> (ø)
pandera/typing/modin.py 85.71% <0.00%> (-14.29%) ⬇️
pandera/__init__.py 94.28% <0.00%> (-5.72%) ⬇️
pandera/typing/common.py 94.44% <0.00%> (-4.45%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1b90834...0a9c904. Read the comment docs.

@cosmicBboy
Copy link
Collaborator

thanks @smackesey ! would you mind rebasing your changes onto dev and changing the PR target to it? that should address at least some of the errors.

@smackesey smackesey changed the base branch from master to dev February 2, 2022 14:02
@cosmicBboy
Copy link
Collaborator

Looks great!

The BaseConfig class needs to be extended with title and description properties here, which is then used by SchemaModel when converting to a DataFrameSchema here.

pandera/checks.py Outdated Show resolved Hide resolved
@cosmicBboy
Copy link
Collaborator

unit tests are looking good!

Other things that should probably be added/modified:

  • docs
  • tests
  • error messages should use descriptions

Are you planning to address this in another PR?

@smackesey
Copy link
Contributor Author

Are you planning to address this in another PR?

Up to you how to handle this. I posted this PR without adding these "secondary" items because I didn't really know whether they were necessary and/or the right approach. Projects have different approaches to testing/docs and it can be difficult to know where to add without a global understanding. Sometimes it makes sense for a contributor to add the core code for a feature and a maintainer to fill in this secondary materials.

But, I can add them in this PR if you like. Let me get more specific with questions:

  • Docs: Is it necessary to add any additional material, or are the updated docstrings (and subsequent updates to API docs sufficient)?
  • Tests: the fact that adding these fields did not cause any tests to fail suggests to me that there's no obvious place to add/modify tests for these fields. Am I wrong? Happy to add/adjust if you can give me some pointers as to where.
  • Error messages leveraging this metadata is a separate feature and IMO does not need to be in this PR, though if you disagree happy to work with you.

@cosmicBboy
Copy link
Collaborator

I think the auto-docs in the API reference should be enough for this.

re: tests, the only thing to add here would be to update test_config to just make sure the conversion from SchemaModel to DataFrameSchema preserves the title and description metadata.

Error messages leveraging this metadata is a separate feature and IMO does not need to be in this PR, though if you disagree happy to work with you.

Agreed, #695 should address this

tests/core/test_model.py Outdated Show resolved Hide resolved
@cosmicBboy
Copy link
Collaborator

merging this now, thanks @smackesey !

@cosmicBboy cosmicBboy merged commit df1e826 into unionai-oss:dev Feb 5, 2022
cosmicBboy added a commit that referenced this pull request Feb 8, 2022
* Geopandas (#732)

* Add support for geopandas GeometryDtype (#698)

* Add geopandas support

* small tweak

* linting

* requirements

* fixes

* import variable

* fin

* too many ancestors

* ok actually please be done

* Update geopandas.py

* Update geopandas.py

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* geopandas docs (#711)

* geopandas docs

* links

* geopandas unit tests, more docs (#731)

* ci doesn't install geopandas on windows

* update strategies tests

* update strategies tests

* update tests

* update docs

* pin sphinx-autodoc-typehints=1.14.1

error in importing Literal in python 3.7

* exclude check docs from windows os build

Co-authored-by: Roshan A <roshan.agrawal95@gmail.com>

* typed descriptors and setup.py only includes pandera (#739)

* typed descriptors and setup.py only includes pandera

* black

* Update common.py

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* Bugfix/734 (#735)

* fix error-reporting bug for pandas==1.1.5

* add tests, add pandas 1.1.5 to CI

* fix ci conditional on modin

* simplify error formatting logic

* Fastapi initial integration (#741)

* fastapi support

* wip

* [wip] add pre/post_format

* implement pre/post-format semantics on check_types

* rename from/to format config and methods

* pa.check_types uses pydantic.validate_arguments

* update tests

* add check_types(use_pydantic:bool) flag

* rename arg to with_pydantic

* add tests for format conversion

* add happy path fastapi tests

* fix Literal import

* fix deps

* remove prototyping files

* fix lint

* fix typing.fastapi module imports

* ignore mypy errors

* fix set_index with MultiIndex (#751)

* strategies: correctly handle StringArray null values (#748)

* strategies: correctly handle StringArray null values

* handle pandas 1.1.5 and 1.4.0 errors

* Add Python 3.10 to CI matrix (#724)

* Add Python 3.10 to CI

* Fix CI

* Exclude modin tests on Python 3.10

* Exclude ray from Python 3.10 build

* Exclude modin-ray from Python 3.10 build

* Fix environment.yml version of ray

* exclude ray from base deps installation

* Update .github/workflows/ci-tests.yml

* Update .github/workflows/ci-tests.yml

* fix ci and setup.py

* update deps

* update deps, nox

* update ci, noxfile

* update ci

* simplify nox linters

* exclude geopandas in py310

* update nox

* update black version

* update reqs

* update readthedocs py version

* dont install koalas for py310

* update readthedocs py version

* update deps

* test readthedocs

* add back ci tests

* windows

* update pandas version in nox

* remove koalas + windows

* fix ci file

* ignore windows doc checks

* add back other os

* dont build docs for py3.10

* test ubuntu

* set max parallel

* try no-env

* try no-env

* testing

* testing

* get python version from sys

* upgrade pip

* pytest verbosity

* add back cache

* bump cache version

* testing

* testing

* update

* try using conda

* update conda setup

* update conda setup

* use conda

* unpin numpy

* update reqs

* only use conda-forge

* update

* update

* use pip

* use pip

* testing

* install numpy

* increase cache

* update

* test nox

* cache pip

* use nox

* use nox

* update

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* add matrix

* disable koalas in ci

* update

* testing

* testing

* testing

* testing

* testing

* testing

* use nox

* use pytest

* testing

* clean up

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* fastapi docs, add to ci (#753)

* fastapi docs, add to ci

* add data conversion docs page, sponsor button

* fix lint

* update requirements file

* fix mypy test

* update deps

* fix tests

* fix docs deps

* update docs copy

* more docs copy updates

* formatting

* fix yml

* add title/description fields (#754)

* add title/description fields

* modify base config with title/desc

* fix docstring

* black

* let SchemaModel use docstring as desc

* update test_config

* fix black

* Update tests/core/test_model.py

* fix black

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* Pull request: Nullable float (#721)

* New datatype: Panda's (ver 1.2.0+) nullable float.

Documentation: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.2.0.html?highlight=float64dtype#experimental-nullable-data-types-for-float-data

* PyLint expects uppercase snake case.

* fix unit test

* fix docs

* fix fastapi tests

Co-authored-by: Vova <vladimirvilimaitis@gmail.com>
Co-authored-by: cosmicBboy <niels.bantilan@gmail.com>

Co-authored-by: Roshan A <roshan.agrawal95@gmail.com>
Co-authored-by: Cristian Matache <cristianmatache@hotmail.com>
Co-authored-by: James Myatt <james@jamesmyatt.co.uk>
Co-authored-by: Sean Mackesey <s.mackesey@gmail.com>
Co-authored-by: vovavili <64227274+vovavili@users.noreply.github.com>
Co-authored-by: Vova <vladimirvilimaitis@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants