
Skip tests for optional dependencies if they are not installed #115

Closed
szuckerman opened this issue Jan 30, 2019 · 11 comments · Fixed by #698
Labels: available for hacking · good intermediate issue · testing

Comments

@szuckerman
Collaborator

I just tried to run the test suite and had some failures because I don't have some of the Chemistry packages installed. I realized that those packages are only available through conda.

I don't use conda and am finding out that it's not so easy to just "install" conda packages into a previously created virtual environment (I mean, I'll figure it out eventually, but this is just at first glance).

In any event, I think it raises an interesting question: does this create issues for people who don't use conda? Meaning, if I try to use something from a submodule but don't have a dependency like rdkit installed, will it tell me to remedy that with a conda install that isn't going to work for my setup?

I'm not really sure what the answer is, but I could see either putting functions that rely on conda dependencies in their own package for conda, or just keeping it how it is and let people deal with it (assuming that people using these modules most likely already have conda installed).

Just wanted to throw this out there before the submodules get bigger.

@ericmjl
Member

ericmjl commented Jan 30, 2019

@szuckerman that's a really great catch!

In general, I have been dependent on conda since grad school - only because conda helps with packaging of non-Python dependencies of Python and other language data science packages. Admittedly, venv has been a second-class citizen in my mind - but only out of convenience, really.

I think what I'll do is go in and mark the tests with their appropriate submodule name, so we can do things like:

$ pytest -m biology

or

$ pytest -m functions

That way, they can be selectively run if necessary.
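For `-m` selection to work cleanly (and to avoid unknown-mark warnings on newer pytest), the marker names would also need to be registered. A minimal sketch of a `conftest.py` hook, assuming the marker names from the commands above:

```python
# conftest.py (sketch) -- register the custom markers so that
# `pytest -m biology` selects tests without unknown-mark warnings.
def pytest_configure(config):
    config.addinivalue_line("markers", "biology: tests for the biology submodule")
    config.addinivalue_line("markers", "functions: tests for the general functions")
```

The same registration could live in a `[pytest]` section of `pytest.ini` instead.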

@ericmjl ericmjl mentioned this issue Jan 30, 2019
@ericmjl ericmjl added the question and docfix labels Feb 5, 2019
@hectormz
Collaborator

hectormz commented Sep 2, 2019

Are the concerns raised here still outstanding @ericmjl, @szuckerman? I suppose this is related in a way to #551.

I'm going through the oldest issues to see what can be done.

@ericmjl
Member

ericmjl commented Sep 3, 2019

I think we might need to update this issue - perhaps it should be renamed, "Mark tests with skipif for optional dependencies".

Doing so would be pretty rad - we could skip all of the biology/chemistry/spark functions if they can't be imported successfully.

One thing I would prefer to remain as is, though, is environment.yml - using it keeps the Azure pipelines builds cleaner by abstracting out the exact packages we want installed during CI builds.

@zjpoh
Collaborator

zjpoh commented Sep 3, 2019

I agree. Having them as optional for users but required in CI is important.

@hectormz
Collaborator

@ericmjl I'm revisiting this one again.
So currently in some of the optional tests we have this pattern:

try:
    import pyspark

    import janitor.spark  # noqa: F401
except ImportError:
    pyspark = None


@pytest.mark.spark_functions
@pytest.mark.skipif(pyspark is None, reason="pyspark tests only required for CI")
def test_clean_names_method_chain(spark_df):
...

There are some other options I learned recently. We can instead skip the whole test file if an import is unsuccessful:

import pytest
pyspark = pytest.importorskip("pyspark")

I think it is very useful to:

  • run certain suites (biology, chemistry, etc...) you are working on
  • run all tests locally without errors for missing packages
  • ensure that CI doesn't ignore skipped tests if a package doesn't get installed somehow

@ericmjl
Member

ericmjl commented Mar 22, 2020

@hectormz, this looks like a very good idea. I can definitely see how it’s useful for those who don’t want to install the entire suite of optional dependencies.

If you could help me a bit here, I am not quite following the 3rd point:

ensure that CI doesn't ignore skipped tests if a package doesn't get installed somehow

If I understand correctly, on the CI system, if a package doesn’t get installed somehow, we want the tests to fail, rather than skip them. Is that part of the point you’re making with point 3? Or am I simply overthinking things here? 😸

@hectormz
Collaborator

@ericmjl Exactly! This is very related to a PR I'm working on for arviz arviz-devs/arviz#1113

We're still figuring out the best approach, either monkeypatching or writing our own importorskip, but both involve creating an environment variable in CI. Let me know your thoughts, or we can see how that PR is resolved and do something similar here. 😉
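As a sketch of what that environment-variable guard could look like here (the variable name `JANITOR_CI_MACHINE` is hypothetical, not something the project has settled on):

```python
import os


def running_on_ci() -> bool:
    """Return True when the CI pipeline exports the marker variable.

    JANITOR_CI_MACHINE is a hypothetical name; the CI config would need to
    export it so that missing-package skips become hard failures on CI.
    """
    return os.environ.get("JANITOR_CI_MACHINE") is not None
```

A custom importorskip could then consult this guard and refuse to skip when it returns True.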

@hectormz
Collaborator

hectormz commented Jun 7, 2020

@ericmjl With arviz-devs/arviz#1113 having been merged, do you want to try a similar approach to what was eventually executed in that PR?

@ericmjl
Member

ericmjl commented Jun 7, 2020

Yes! Let's do that. This would be good for the pyspark and domain-specific modules, right?

I looked briefly at the code, and I think this is the block that would help, am I right about it?

pytestmark = pytest.mark.skipif(  # pylint: disable=invalid-name
    (importlib.util.find_spec("numba") is None) & ~running_on_ci(),
    reason="test requires numba which is not installed",
)
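One subtlety in that block: `&` and `~` are bitwise operators, but on Python bools they coerce to ints, and the truthiness still matches the logical "module missing and not on CI" intent, which a quick check confirms:

```python
def skip_condition(module_missing: bool, on_ci: bool) -> bool:
    # Mirrors the quoted expression:
    # (importlib.util.find_spec("numba") is None) & ~running_on_ci()
    return bool(module_missing & ~on_ci)


# Equivalent to the plain-English rule "skip iff missing and not on CI".
for module_missing in (True, False):
    for on_ci in (True, False):
        assert skip_condition(module_missing, on_ci) == (module_missing and not on_ci)
```

Using `and`/`not` directly would arguably be clearer, but the quoted form from the arviz PR behaves correctly.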

@ericmjl ericmjl changed the title Conda dependencies Skip tests for optional dependencies if they are not installed Jun 7, 2020
@ericmjl ericmjl added the available for hacking, testing, and good intermediate issue labels and removed the question and docfix labels Jun 7, 2020
@hectormz
Collaborator

hectormz commented Jun 7, 2020

Right. That skipif mark will skip the test if the module is not installed (but not when running on CI).

There's also

from ..helper import importorskip
numba = importorskip("numba")  # pylint: disable=invalid-name

which would skip the whole file of tests (useful for tests that are grouped in the same file, like the pyspark ones, IIRC).

This relies on a custom importorskip that checks for an environment variable we'd set in the Azure CI:

import os
import sys
import warnings
from typing import Any, Optional

import pytest
from _pytest.outcomes import Skipped
from packaging.version import Version


def running_on_ci() -> bool:
    """Return True if running on CI machine."""
    return os.environ.get("ARVIZ_CI_MACHINE") is not None


def importorskip(
    modname: str, minversion: Optional[str] = None, reason: Optional[str] = None
) -> Any:
    """Import and return the requested module ``modname``.
        Doesn't allow skips on CI machine.
        Borrowed and modified from ``pytest.importorskip``.
    :param str modname: the name of the module to import
    :param str minversion: if given, the imported module's ``__version__``
        attribute must be at least this minimal version, otherwise the test is
        still skipped.
    :param str reason: if given, this reason is shown as the message when the
        module cannot be imported.
    :returns: The imported module. This should be assigned to its canonical
        name.
    Example::
        docutils = pytest.importorskip("docutils")
    """
    # ARVIZ_CI_MACHINE is True if tests run on CI, where ARVIZ_CI_MACHINE env variable exists
    ARVIZ_CI_MACHINE = running_on_ci()
    if ARVIZ_CI_MACHINE:
        import warnings

        compile(modname, "", "eval")  # to catch syntaxerrors

        with warnings.catch_warnings():
            # make sure to ignore ImportWarnings that might happen because
            # of existing directories with the same name we're trying to
            # import but without a __init__.py file
            warnings.simplefilter("ignore")
            __import__(modname)
        mod = sys.modules[modname]
        if minversion is None:
            return mod
        verattr = getattr(mod, "__version__", None)
        if minversion is not None:
            if verattr is None or Version(verattr) < Version(minversion):
                raise Skipped(
                    "module %r has __version__ %r, required is: %r"
                    % (modname, verattr, minversion),
                    allow_module_level=True,
                )
        return mod
    else:
        return pytest.importorskip(modname=modname, minversion=minversion, reason=reason)

@hectormz
Collaborator

hectormz commented Jun 7, 2020

This second option makes sense when it is very clear that a test .py file is dedicated to pyspark, for example.
