Moved optional packages to extras_require with "all" key (#284)
* Moved optional packages to extras_require with "all" key

* Update setup.py

* Updated index.rst with installation options

* Updated README with installation options

* update readme

* update docs

* Update setup.py

* Added catches for import errors that raise an error explaining how to install the missing pandera dependencies.

* Fixed pylint issues regarding raise-missing-from

* Changed RuntimeError to ImportError when there are missing packages.

* Made error_formatters.py not depend on hypotheses.py, and also raise exception on import issues only when using hypotheses and not when importing pandera.

* Use condition instead of assert when using scipy

* Added docs related to installations

* Changed _has_scipy parameter to be upper case to match linting

* tests have hypothesis/io dependency, update travis

* test pandera-core deps

* fix travis, update docs

* fix travis script

* add tests for Hypothesis import

* update documentation

* ignore coverage

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: amitripshtos <amit@noogata.com>
3 people committed Oct 14, 2020
1 parent be23c4e commit c4716a0
Showing 13 changed files with 155 additions and 31 deletions.
23 changes: 13 additions & 10 deletions .travis.yml
@@ -6,7 +6,8 @@ cache:

before_cache:
- rm -rf $CONDA_DIR/pkgs/cache
- rm -rf $CONDA_DIR/envs/hosts
- rm -rf $CONDA_DIR/envs/pandera
- rm -rf $CONDA_DIR/envs/pandera-core
- rm -rf $CONDA_DIR/conda-meta/history
- touch $CONDA_DIR/conda-meta/history

@@ -40,7 +41,7 @@ install:
if [ -d "$CONDA_DIR" ] && [ -e "$CONDA_BIN_DIR/conda" ]; then
echo "Miniconda install already present from cache: $CONDA_DIR"
rm -rf $CONDA_DIR/envs/hosts # Just in case...
rm -rf $CONDA_DIR/envs/* # Just in case...
else
echo "Installing Miniconda..."
rm -rf $CONDA_DIR # Just in case...
@@ -66,19 +67,21 @@ install:
conda update conda
conda update --all
conda info -a || exit 1
# Setup Conda Env
# Setup Conda Envs
- |
conda create -n hosts python=$PYTHON_VERSION || exit 1
conda env update -n hosts -f environment.yml
source activate hosts
python setup.py install
conda create -n pandera python=$PYTHON_VERSION || exit 1
conda env update -n pandera -f environment.yml
conda create -n pandera-core python=$PYTHON_VERSION pytest
conda list
- source activate pandera && pip install .[all]
- source activate pandera-core && pip install .

script:
# Use the conda environment
- source activate hosts
# Dependencies
# Test minimal installation
- source activate pandera-core && pytest
# Test full installation
- source activate pandera
# Check that requirements-dev.txt is generated exclusively by environment.yml
- python ./scripts/generate_pip_deps_from_conda.py --compare
# Linting
9 changes: 9 additions & 0 deletions README.md
@@ -39,6 +39,8 @@ readable and robust.

The official documentation is hosted on ReadTheDocs: https://pandera.readthedocs.io

.. installation:

## Install

Using pip:
@@ -47,6 +49,13 @@ Using pip:
pip install pandera
```

Installing optional functionality:
```
pip install pandera[hypotheses] # hypothesis checks
pip install pandera[io] # yaml/script schema io utilities
pip install pandera[all] # all packages
```

Using conda:

```
4 changes: 4 additions & 0 deletions docs/source/API_reference.rst
@@ -5,6 +5,10 @@
API Reference
=============

The ``io`` module and built-in ``Hypothesis`` checks require a pandera
installation with the corresponding extension. See the
:ref:`installation<installation>` instructions for more details.

Schemas
-------

10 changes: 10 additions & 0 deletions docs/source/hypothesis.rst
@@ -9,6 +9,16 @@ Hypothesis Testing

``pandera`` enables you to perform statistical hypothesis tests on your data.


.. note::

    The hypothesis feature requires a pandera installation with the ``hypotheses``
    dependency set. See the :ref:`installation<installation>` instructions for
    more details.

Overview
--------

The :py:class:`Hypothesis` class defines built in methods, which can be called
as in this example of a two-sample t-test:

12 changes: 12 additions & 0 deletions docs/source/index.rst
@@ -23,6 +23,9 @@ production-critical data pipelines or reproducible research settings. With
via :ref:`function decorators<decorators>`.


.. _installation:


Install
-------

@@ -32,6 +35,14 @@ Install with `pip`:

    pip install pandera

Installing optional functionality:

.. code:: bash

    pip install pandera[hypotheses]  # hypothesis checks
    pip install pandera[io]          # yaml/script schema io utilities
    pip install pandera[all]         # all packages

Or conda:

@@ -40,6 +51,7 @@ Or conda:

    conda install -c conda-forge pandera

Quick Start
-----------

4 changes: 4 additions & 0 deletions docs/source/schema_inference.rst
@@ -73,6 +73,10 @@ inferred schema.
Schema Persistence
------------------

The schema persistence feature requires a pandera installation with the ``io``
extension. See the :ref:`installation<installation>` instructions for more
details.

There are two ways of persisting schemas, inferred or otherwise.

Write to a Python script
7 changes: 3 additions & 4 deletions pandera/error_formatters.py
@@ -4,13 +4,12 @@

import pandas as pd

from .checks import Check
from .hypotheses import Hypothesis
from .checks import _CheckBase


def format_generic_error_message(
parent_schema,
check: Union[Check, Hypothesis],
check: _CheckBase,
check_index: int,
) -> str:
"""Construct an error message when a check validator fails.
@@ -25,7 +24,7 @@ def format_generic_error_message(

def format_vectorized_error_message(
parent_schema,
check: Union[Check, Hypothesis],
check: _CheckBase,
check_index: int,
reshaped_failure_cases: pd.DataFrame) -> str:
"""Construct an error message when a validator fails.
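The change above replaces the `Union[Check, Hypothesis]` annotation with the shared base class, so the formatter module no longer has to import `hypotheses.py` (and, through it, scipy). A minimal sketch of the idea, using hypothetical stand-in classes `CheckBase`, `Check`, and `Hypothesis` and a hypothetical `format_error` function rather than pandera's real ones:

```python
class CheckBase:
    """Hypothetical stand-in for a shared base class such as _CheckBase."""

    def __init__(self, name):
        self.name = name


class Check(CheckBase):
    """Ordinary check; needs no optional dependency."""


class Hypothesis(CheckBase):
    """In a real package this would live in a module that needs scipy."""


def format_error(check: CheckBase, check_index: int) -> str:
    # Annotating with the base class means this function's module only
    # has to import CheckBase, never the optional Hypothesis module.
    return f"<Check {check.name}> failed at index {check_index}"


message = format_error(Hypothesis("two_sample_ttest"), 0)
```

Any subclass instance is accepted, so callers that do have the optional dependency installed can still pass a `Hypothesis` object.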
25 changes: 23 additions & 2 deletions pandera/hypotheses.py
@@ -4,12 +4,17 @@
from typing import Callable, Union, Optional, List, Dict

import pandas as pd
from scipy import stats

try:
    from scipy import stats
except ImportError:  # pragma: no cover
    HAS_SCIPY = False
else:
    HAS_SCIPY = True

from . import errors
from .checks import _CheckBase, SeriesCheckObj, DataFrameCheckObj


DEFAULT_ALPHA = 0.01


@@ -310,6 +315,14 @@ def two_sample_ttest(
4 4.0 B
"""
        if not HAS_SCIPY:  # pragma: no cover
            raise ImportError(
                'Hypothesis checks require "scipy" to be installed.\n'
                "You can install pandera together with the Hypothesis "
                "dependencies with:\n"
                "pip install pandera[hypotheses]\n"
            )

if relationship not in cls.RELATIONSHIPS:
raise errors.SchemaInitError(
"relationship must be one of %s" % set(cls.RELATIONSHIPS))
@@ -401,6 +414,14 @@ def one_sample_ttest(
"""
        if not HAS_SCIPY:  # pragma: no cover
            raise ImportError(
                'Hypothesis checks require "scipy" to be installed.\n'
                "You can install pandera together with the hypothesis "
                "dependencies with:\n"
                "pip install pandera[hypotheses]"
            )

if relationship not in cls.RELATIONSHIPS:
raise errors.SchemaInitError(
"relationship must be one of %s" % set(cls.RELATIONSHIPS))
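The guards added to `two_sample_ttest` and `one_sample_ttest` move the failure from import time to call time: the module always imports, and the error only appears when a scipy-backed method is actually used. A minimal sketch of that pattern, assuming a hypothetical optional package `not_a_real_stats_lib`, a hypothetical function `one_sample_mean_test`, and a hypothetical distribution name `mypackage` (none of these are part of pandera):

```python
try:
    # Hypothetical optional dependency; the import is expected to fail here.
    import not_a_real_stats_lib as stats
except ImportError:
    HAS_STATS = False
else:
    HAS_STATS = True


def one_sample_mean_test(values, popmean):
    """Run a test that needs the optional dependency."""
    if not HAS_STATS:
        # Raised only when the feature is used, not when the module is imported.
        raise ImportError(
            'This check requires "not_a_real_stats_lib" to be installed.\n'
            "Install it with: pip install mypackage[hypotheses]"
        )
    return stats.ttest_1samp(values, popmean)
```

One thing to watch with this pattern is that the extra named in the message has to match a key in `extras_require`; pip does not treat an unknown extra as an error, so a misspelled name would not install the missing dependency.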
11 changes: 9 additions & 2 deletions pandera/io.py
@@ -4,7 +4,15 @@
from pathlib import Path

import pandas as pd
import yaml
try:
    import black
    import yaml
except ImportError as exc:  # pragma: no cover
    raise ImportError(
        'IO and formatting require "pyyaml" and "black" to be installed.\n'
        "You can install pandera together with the IO dependencies with:\n"
        "pip install pandera[io]\n"
    ) from exc

from .dtypes import PandasDtype
from .schema_statistics import get_dataframe_schema_statistics
@@ -286,7 +294,6 @@ def _format_index(index_statistics):


def _format_script(script):
import black # pylint: disable=import-outside-toplevel
formatter = partial(
black.format_str, mode=black.FileMode(line_length=80)
)
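Unlike `hypotheses.py`, `io.py` fails at import time and re-raises with `from exc`, which keeps the original error attached as `__cause__`. A rough sketch of that behaviour, assuming a hypothetical helper `require` and a hypothetical missing package `not_a_real_yaml_lib` (neither exists in pandera):

```python
import importlib


def require(module_name, extra):
    """Import module_name, or raise an ImportError that names the pip extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # "from exc" chains the original error so the traceback shows both.
        raise ImportError(
            f'"{module_name}" is required for this feature.\n'
            f"Install it with: pip install pandera[{extra}]"
        ) from exc


try:
    require("not_a_real_yaml_lib", "io")
except ImportError as err:
    message = str(err)
    cause = err.__cause__
```

Chaining is what the `raise-missing-from` pylint fix in the commit message refers to: without `from exc`, the underlying "No module named ..." error would only appear as an implicit context.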
15 changes: 11 additions & 4 deletions setup.py
@@ -1,3 +1,4 @@

from setuptools import setup

with open('README.md') as f:
@@ -7,6 +8,14 @@
with open("pandera/version.py") as fp:
exec(fp.read(), version)

_extras_require = {
    "hypotheses": ["scipy"],
    "io": ["pyyaml >= 5.1", "black"],
}
extras_require = {
    **_extras_require,
    "all": list(set(x for l in _extras_require.values() for x in l)),
}

setup(
name="pandera",
@@ -31,11 +40,9 @@
install_requires=[
"numpy >= 1.9.0",
"pandas >= 0.23.0",
"wrapt",
"pyyaml >= 5.1",
"scipy",
"black",
"wrapt"
],
extras_require=extras_require,
python_requires='>=3.6',
platforms='any',
classifiers=[
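The `"all"` key in this hunk is the union of every other extra. Because it goes through `set`, the order of the resulting list can vary between runs. A small variant that yields a stable order is sketched below; the use of `sorted` is my suggestion, not something the commit does:

```python
_extras_require = {
    "hypotheses": ["scipy"],
    "io": ["pyyaml >= 5.1", "black"],
}

# Build "all" as the de-duplicated union of every extra; sorted() makes
# the list deterministic so generated metadata does not change between builds.
extras_require = {
    **_extras_require,
    "all": sorted(
        {req for reqs in _extras_require.values() for req in reqs}
    ),
}
```

With the two extras above, `extras_require["all"]` is `["black", "pyyaml >= 5.1", "scipy"]`.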
16 changes: 16 additions & 0 deletions tests/test_extension_modules.py
@@ -0,0 +1,16 @@
"""Tests for extension module imports."""

import pytest


from pandera.hypotheses import Hypothesis, HAS_SCIPY


def test_hypotheses_module_import():
    """Test that Hypothesis built-in methods raise import error."""
    if not HAS_SCIPY:
        for fn in [
                lambda: Hypothesis.two_sample_ttest("sample1", "sample2"),
                lambda: Hypothesis.one_sample_ttest(popmean=10)]:
            with pytest.raises(ImportError):
                fn()
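The test above only asserts anything in an environment where scipy is absent, which is why the Travis config gains a `pandera-core` environment. As a possible complement, a missing module can be simulated in a single environment by placing `None` in `sys.modules`, which makes the import machinery raise `ModuleNotFoundError`. A stdlib-only sketch, with `import_fails` as a hypothetical helper name:

```python
import importlib
import sys
from unittest import mock


def import_fails(module_name):
    """Return True if importing module_name raises ImportError while masked."""
    # A None entry in sys.modules tells the import system the module is
    # unavailable; patch.dict restores the original entry on exit.
    with mock.patch.dict(sys.modules, {module_name: None}):
        try:
            importlib.import_module(module_name)
        except ImportError:
            return True
        return False


masked = import_fails("json")

import json  # outside the context manager the import works again
```

This only affects imports executed while the patch is active. A module that already evaluated its `try: import ...` block would need to be reloaded inside the patch for a flag such as `HAS_SCIPY` to change.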
15 changes: 12 additions & 3 deletions tests/test_hypotheses.py
@@ -2,11 +2,20 @@

import pandas as pd
import pytest
from scipy import stats

from pandera import errors
from pandera import (
Column, DataFrameSchema, Float, Int, String, Hypothesis)
from pandera import Column, DataFrameSchema, Float, Int, String, Hypothesis
from pandera.hypotheses import HAS_SCIPY


if HAS_SCIPY:
    from scipy import stats


# skip all tests in module if "hypotheses" dependencies aren't installed
pytestmark = pytest.mark.skipif(
    not HAS_SCIPY, reason='needs "hypotheses" module dependencies'
)


def test_dataframe_hypothesis_checks():
35 changes: 29 additions & 6 deletions tests/test_io.py
@@ -7,12 +7,35 @@

import pandas as pd
import pytest
import yaml
import pandera as pa
from pandera import io


PYYAML_VERSION = version.parse(yaml.__version__) # type: ignore
try:
    from pandera import io
except ImportError:
    HAS_IO = False
else:
    HAS_IO = True


try:
    import yaml
except ImportError:  # pragma: no cover
    PYYAML_VERSION = None
else:
    PYYAML_VERSION = version.parse(yaml.__version__)  # type: ignore


SKIP_YAML_TESTS = (
    PYYAML_VERSION is None or
    PYYAML_VERSION.release < (5, 1, 0)  # type: ignore
)


# skip all tests in module if "io" dependencies aren't installed
pytestmark = pytest.mark.skipif(
    not HAS_IO, reason='needs "io" module dependencies'
)
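One detail of the `SKIP_YAML_TESTS` condition may deserve a second look. Python compares tuples element by element, and a tuple that is a strict prefix of another sorts before it, so `(5, 1) < (5, 1, 0)` is true. If `PYYAML_VERSION.release` returns only the segments written in the version string (my understanding of `packaging`, stated here as an assumption), a pyyaml reporting `5.1` would be skipped even though the reason text says `>= 5.1.0`. The sketch below shows the effect with plain tuples; `release_tuple` and `older_than` are hypothetical helpers, not part of the test suite:

```python
def release_tuple(version_string):
    """Parse a dotted numeric version such as "5.1" into a tuple of ints."""
    return tuple(int(part) for part in version_string.split("."))


def older_than(version_string, minimum):
    """Compare versions after zero-padding both tuples to the same length."""
    release = release_tuple(version_string)
    width = max(len(release), len(minimum))

    def pad(parts):
        return parts + (0,) * (width - len(parts))

    return pad(release) < pad(minimum)


# Naive comparison: (5, 1) is a strict prefix of (5, 1, 0), so it sorts first.
naive = release_tuple("5.1") < (5, 1, 0)

# Padded comparison treats 5.1 and 5.1.0 as the same release.
padded = older_than("5.1", (5, 1, 0))
```

Here `naive` is `True` and `padded` is `False`. Comparing against `(5, 1)` instead of `(5, 1, 0)` would be another way to avoid the mismatch.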


def _create_schema(index="single"):
@@ -228,7 +251,7 @@ def _create_schema_null_index():


@pytest.mark.skipif(
PYYAML_VERSION.release < (5, 1, 0), # type: ignore
SKIP_YAML_TESTS,
reason="pyyaml >= 5.1.0 required",
)
def test_inferred_schema_io():
@@ -245,7 +268,7 @@ def test_inferred_schema_io():


@pytest.mark.skipif(
PYYAML_VERSION.release < (5, 1, 0), # type: ignore
SKIP_YAML_TESTS,
reason="pyyaml >= 5.1.0 required",
)
def test_to_yaml():
@@ -259,7 +282,7 @@ def test_to_yaml():


@pytest.mark.skipif(
PYYAML_VERSION.release < (5, 1, 0), # type: ignore
SKIP_YAML_TESTS,
reason="pyyaml >= 5.1.0 required",
)
def test_from_yaml():