Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duckarray tests for constructors and properties #6903

Open
wants to merge 157 commits into
base: main
Choose a base branch
from

Conversation

TomNicholas
Copy link
Contributor

@TomNicholas TomNicholas commented Aug 9, 2022

Builds on top of #4972 to add tests for Variable/DataArray/Dataset constructors and properties when wrapping duck arrays.

Adds a file xarray/tests/duckarrays/base/constructors.py which contains new test base classes.

Also uses those new base classes to test Sparse array integration (not yet tried for pint integration).

@TomNicholas
Copy link
Contributor Author

TomNicholas commented Dec 14, 2023

@keewis I've come back to this PR now that #8404 is merged. I have:

  • Rewritten the reduce tests to use the new variables strategy
  • Used the array_strategy_fn pattern everywhere
  • Added a new file for testing xarray compatibility with the minimal implementation of the array API standard numpy.array_api
  • Rewritten the sparse tests to match
  • Removed the pint tests / DataArray tests for now
  • Reorganised the files to put all the "base" test classes in xarray/testing, and all the actual concrete tests that get run in xarray/tests. This follows the pattern I used in Hypothesis strategy for generating Variable objects #8404.

I just wanted to show you to see what you thought of this approach now.

Comment on lines +56 to +63
@staticmethod
@abstractmethod
def array_strategy_fn(
*, shape: "_ShapeLike", dtype: "_DTypeLikeNested"
) -> st.SearchStrategy[T_DuckArray]:
# TODO can we just make this an attribute?
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to just do this in the subclassed tests:

class VariableConstructorTests

    array_strategy_fn = ...

but then would I need to make array_strategy_fn an abstract property?

Comment on lines 105 to 119
@pytest.mark.parametrize(
"method",
(
"all",
"any",
# "cumprod", # not in array API
# "cumsum", # not in array API
# "max", # only in array API for real numeric dtypes
# "max", # only in array API for real floating point dtypes
# "median", # not in array API
# "min", # only in array API for real numeric dtypes
# "prod", # only in array API for numeric dtypes
# "std", # TypeError: std() got an unexpected keyword argument 'ddof'
# "sum", # only in array API for numeric dtypes
# "var", # TypeError: std() got an unexpected keyword argument 'ddof'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly we need to be able to test certain reductions with only certain dtypes

def test_construct(self, data) -> None:
shape = data.draw(self.shapes)
dtype = data.draw(self.dtypes)
arr = data.draw(self.array_strategy_fn(shape=shape, dtype=dtype))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't use strategies.variables to test the constructor because strategies.variables has a call to the constructor inside it, so you end up not being able to access the original raw array that you want to use as the "expected" value.

@TomNicholas
Copy link
Contributor Author

Okay I'm a bit stumped by how to do this. I want some test setup where:

  1. We define sets of tests that a downstream library can import and run using just a couple of lines of code,
  2. The downstream library can override the generated array type and the exact way equality is checked (e.g. to test sparse type arrays),
  3. Certain tests are parametrized over many cases, which we ideally set upstream (e.g. over sum, mean, std, etc.),
  4. But the downstream library can choose to xfail certain cases (e.g. because they haven't implemented std yet),
  5. Different cases are tested for different dtypes automatically (e.g. because the array API standard supports string dtypes in any but not in max),
  6. But the downstream library can also override those dtypes too (e.g. because they support more/less than what's currently in the array API).

I was hoping to do this using some combination of upstream test classes, pytest.parametrize on the test method, and xfail, but I can't really see how to get (4) and (6) to work with that approach. I might have to resort to using globals or something...

@keewis
Copy link
Collaborator

keewis commented Dec 31, 2023

If you look at #4972, there's a new mark that applies other marks, like skip. The API is a bit awkward, and the implementation might be out of date, but it demonstrates that this is possible. Similarly, if you pass the dtypes as a fixture you should be able to override it on a per-module level (pytest has a whole bunch of docs on overriding fixtures).

Edit: although fixtures are not that different from glorified globals

@TomNicholas
Copy link
Contributor Author

If you look at #4972, there's a new mark that applies other marks, like skip.

Interesting, thank you!

Similarly, if you pass the dtypes as a fixture you should be able to override it on a per-module level (pytest has a whole bunch of docs on overriding fixtures).

Also a good idea, but I think I need to override the dtype fixtures on a per-parametrization level...

@keewis
Copy link
Collaborator

keewis commented Dec 31, 2023

Depends on what you want to parametrize, I guess. In general, though, I'm starting to lean towards not parametrizing the functions to test: pytest.mark.parametrize creates "variants" of the same test (i.e. it should test the same thing with different inputs), where having one function per "test" makes the summaries a bit more useful. Plus, running the tests of a single function are much easier that way compared to if they were variants. The downside is code duplication, but I don't think that's too much of an issue, especially for tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-arrays related to flexible array support topic-hypothesis Strategies or tests using the hypothesis library topic-testing
Projects
Duck Array Wrapping
  
In progress
Development

Successfully merging this pull request may close these issues.

None yet

4 participants