DOC: update random and asserts in test guidelines #18777

tupui · 2021-04-14T18:06:39Z

I propose to update the testing guide.

~~Change np.random.seed to np.random.default_rng.~~
Change assert_ to plain assert, as the statement about Python stripping them out is not relevant in this context.
Hide the sections about assert_almost_equal and the like, as better alternatives are recommended.

There was a discussion over at SciPy which suggested that this guide needed updates.
xref scipy/scipy#8583

rkern · 2021-04-14T18:29:50Z

Using np.random and np.random.seed() is still OK for unit tests. In fact, depending on the use case, it still might be preferred due to the stability requirements we have kept on RandomState. default_rng() is actually probably not recommended, except for feeding into code under test that uses Generators. But for pseudorandomly generating data to pass to the code under test, np.random functions or RandomState are typically preferred.

tupui · 2021-04-14T18:31:44Z

> Using np.random and np.random.seed() is still OK for unit tests. In fact, depending on the use case, it still might be preferred due to the stability requirements we have kept on RandomState. default_rng() is actually probably not recommended, except for feeding into code under test that uses Generators. But for pseudorandomly generating data to pass to the code under test, np.random functions or RandomState are typically preferred.

Oh sorry yes, I got confused with the other PR on SciPy which is about the examples and not the tests. I will remove this. Is the rest fine?

tupui · 2021-04-14T18:34:48Z

Apparently there is an issue with using :hidden:. If this is not possible, I would suggest adding a dedicated section with a preface saying that the other methods are preferred?

tupui · 2021-04-14T19:05:24Z

I am not sure about the global state though. Shouldn't it at least advertise rng = np.random.RandomState? I am not a fan of global state settings. I reverted to the original. Let me know if I should change anything else.

rkern · 2021-04-14T19:13:39Z

In the confines of a unit test, the global doesn't really present a problem. It even makes it a smidge easier to do the seeding in a setUp() and then just use np.random in each test. I'm not particularly zealous about recommending people away from it in this context (and I'm super-zealous about it in most contexts). If you want to document the pattern using self.rng = RandomState(seed) as a primary recommendation, but then also document that np.random.seed() will work for tests, that's fine.
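A minimal sketch of the setUp() pattern described above, assuming a unittest-style test case. The class name, test names, seed value, and array sizes are made up for illustration and do not come from the NumPy guide.

```python
import unittest

import numpy as np
from numpy.random import RandomState


class TestWithLocalRng(unittest.TestCase):
    def setUp(self):
        # A fresh generator per test method; the seed value is arbitrary.
        self.rng = RandomState(1234)

    def test_values_in_unit_interval(self):
        data = self.rng.random(100)
        self.assertTrue(np.all((data >= 0.0) & (data < 1.0)))

    def test_same_seed_gives_same_data(self):
        # Rebuilding a generator from the same seed reproduces the stream.
        expected = RandomState(1234).random(5)
        np.testing.assert_array_equal(self.rng.random(5), expected)
```

Because each test method gets its own generator from setUp(), neither test depends on what the other one drew. Calling np.random.seed(1234) in setUp() and using np.random functions in the tests would also work, as noted above.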

eric-wieser · 2021-04-15T09:20:40Z

doc/source/reference/routines.testing.rst

+   assert_almost_equal
+   assert_approx_equal
+   assert_array_almost_equal


If these are not recommended, can you add a `.. note::` in the docstring for each method that explains why, and what the alternative is?

tupui · 2021-04-15

There is already a note in these functions pointing to assert_allclose and the like. That's why I am listing these specifically.

Are you talking about something else?

tupui · 2021-04-15

Also, I am not sure how to hide these. If I am not mistaken, they are only defined here, so I would need to do the autosummary. But then I don't know how to hide them.

eric-wieser · 2021-04-15

Ah, I was not aware that we already had those notes.

Perhaps instead of hiding these you should just put them under a different heading, along with a remark saying that they are not recommended?

tupui · 2021-04-15

OK sure. And also I am still bugged by how to do it otherwise 😅. So this? Asserts (not recommended)
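For context, a short sketch of what the recommended alternative looks like in a test, next to a plain assert. The arrays and the tolerance are illustration values only, not taken from the guide.

```python
import numpy as np
from numpy.testing import assert_allclose

desired = np.array([1.0, 2.0, 3.0])
actual = desired * (1 + 1e-9)  # tiny relative perturbation

# Passes: the relative error (1e-9) is well inside rtol=1e-7.
assert_allclose(actual, desired, rtol=1e-7)

# A plain assert is enough for simple boolean conditions.
assert actual.shape == desired.shape
```

assert_allclose takes explicit rtol and atol arguments, which makes the tolerance being tested visible in the test itself.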

eric-wieser · 2021-04-15T09:48:44Z

> It even makes it a smidge easier to do the seeding in a setUp() and then just use np.random in each test.

Doesn't this result in non-deterministic tests if pytest decides to run tests in multiple threads, such as with pytest-parallel?

tupui · 2021-04-15T09:55:09Z

> > It even makes it a smidge easier to do the seeding in a setUp() and then just use np.random in each test.

> Doesn't this result in non-deterministic tests if pytest decides to run tests in multiple threads, such as with pytest-parallel?

My 2 cents. For testing, most people just need to draw numbers from a uniform distribution, or at most from a normal one. So instead of fighting with seeds and complicated algorithms, I would recommend using simple deterministic sequences such as Halton. These are fixed, and if someone wants a different "seed", they can just advance the sequence. This way you are always guaranteed to get the exact same sequence.
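To make the suggestion concrete, here is a small pure-Python sketch of an unscrambled Halton sequence. The function names and the `skip` parameter are hypothetical helpers written for this illustration, not an existing NumPy API.

```python
def van_der_corput(index, base):
    """Return the radical inverse of `index` in `base`, a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3), skip=0):
    """Return `n_points` Halton points with one coordinate per base.

    `skip` advances the sequence, which plays the role of a different "seed".
    Indexing starts at 1 so that the all-zero point is left out.
    """
    start = 1 + skip
    return [
        tuple(van_der_corput(i, b) for b in bases)
        for i in range(start, start + n_points)
    ]
```

There is no hidden state here: `halton(3)` always yields the same three points, and `halton(2, skip=1)` yields the second and third of them.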

tupui · 2021-04-15T11:52:50Z

@rkern @eric-wieser I updated the PR with your suggestions. As for the seed, I could add the following if wanted:

Instead of setting the seed globally, it is recommended to use NumPy's API::

    rng = np.random.RandomState(some_number)
    random_numbers = rng.random(...)

mattip · 2021-04-15T12:52:03Z

> In the confines of a unit test, the global doesn't really present a problem. It even makes it a smidge easier to do the seeding in a setUp() and then just use np.random in each test. I'm not particularly zealous about recommending people away from it in this context (and I'm super-zealous about it in most contexts).

@rkern I am curious about this. Why would you be hesitant to recommend using an rng in tests to get people used to the more correct API? I think

- using them in tests should be convenient and natural. If not, we have a UX problem.
- changing the order tests are run can change results, whether by running only specific tests or by using multithreaded frameworks. These types of errors are quite hard to debug.
- the sooner we transition people off the older APIs, the sooner we can deprecate them, reducing the NumPy API surface.
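The order-dependence concern in the second point can be sketched as follows. The helper names and seed values are made up for illustration; the scenario assumes a single seed call shared by several tests.

```python
import numpy as np
from numpy.random import RandomState


def draw_global():
    # Reads from the process-wide generator behind the np.random functions.
    return np.random.random_sample()


def draw_local(rng):
    # Reads only from the generator it is handed.
    return rng.random_sample()


# One shared seed: what the second caller sees depends on earlier draws.
np.random.seed(42)
first, second = draw_global(), draw_global()

# Run the "second" caller alone after the same seed call.
np.random.seed(42)
alone = draw_global()

assert alone == first   # it now receives the first draw
assert alone != second  # not the value it saw when it ran second

# With one generator per test, the value depends only on that test's seed.
assert draw_local(RandomState(7)) == draw_local(RandomState(7))
```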

tupui · 2021-04-15T13:05:42Z

> using them in tests should be convenient and natural. If not we have a UX problem

I agree. If we must stick with RandomState, as the doc suggests, I have another proposal. We could add a parameter to default_rng to specify that we want to ensure future reproducibility, e.g. default_rng(mode='test'). This way, it could call RandomState or anything else behind the scenes, but we would have a single entry point.

> the sooner we transition people off the older APIs the sooner we can deprecate them, reducing the NumPy API surface.

I couldn't agree more, as we currently have two cases to handle and it leads to confusion.

rkern · 2021-04-15T13:38:56Z

> Why would you be hesitant to recommend using an rng in tests to get people used to the more correct API?

I'm not. I said to recommend it as the primary method. I definitely would prefer that people do that in new code; their tests will be a little bit better for it. I'm saying not to disrecommend np.random.seed(). Using np.random.seed() in tests is still an official use case, and I don't want our documentation to chide people with mounds of existing code into doing a lot of churn. They're not getting any benefit from Generator and might not be able (or need) to use multithreading for their test suite. Churn is a quick way to generate resentment against the changes that we've made.

By all means, document RandomState(seed) as the primary way to do it, and document the limitation that using np.random functions prevents speeding up your test suite via multithreading. Provide the carrots to move away from np.random.seed().

rkern · 2021-04-15T13:41:27Z

> changing the order tests are run can change results

Not really. People do know enough to set the seed in the setUp() method or at the top of each function that is using np.random, not relying on one call at the beginning to cross multiple test methods.
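A sketch of the per-function seeding described above, written as pytest-style test functions. The test names, seed, and sizes are illustrative assumptions.

```python
import numpy as np


def test_sum_is_nonnegative():
    np.random.seed(1234)  # reseed first, so earlier tests cannot matter
    data = np.random.random_sample(50)
    assert data.sum() >= 0.0


def test_sorted_is_monotonic():
    np.random.seed(1234)  # same idea: this test starts from a known state
    data = np.sort(np.random.random_sample(50))
    assert np.all(np.diff(data) >= 0.0)
```

Each function sees the same data no matter which tests ran before it, as long as the tests run sequentially in one thread.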

tupui · 2021-04-15T13:44:20Z

> By all means, document RandomState(seed) as the primary way to do it, and document the limitation that using np.random functions prevents speeding up your test suite via multithreading. Provide the carrots to move away from np.random.seed().

My reading of this is that I should add the text I proposed. Is this OK?

Instead of setting the seed globally, it is recommended to use NumPy's API::

    rng = np.random.RandomState(some_number)
    random_numbers = rng.random(...)

mattip · 2021-04-15T13:52:07Z

> By all means, document RandomState(seed) as the primary way to do it,

If they already are getting a rng, why not recommend rng = np.random.default_rng(seed) ?

rkern · 2021-04-15T14:00:29Z

Relevant NEP 19 section.

mattip · 2021-04-16T16:11:15Z

@rkern would it be acceptable to recommend

    rng = np.random.default_rng(seed)
    random_numbers = rng.random(...)

seberg · 2021-04-16T16:23:19Z

Sorry didn't notice this one. The changes in doc/TESTS.rst.txt were now already made in gh-18787

tupui · 2021-04-16T16:37:15Z

> Sorry didn't notice this one. The changes in doc/TESTS.rst.txt were now already made in gh-18787

Ah no problem. What matters is that this is in 😃

cc @larsoner then.

larsoner · 2021-04-16T17:03:05Z

Ahh sorry didn't see you already had a PR open for some related changes @tupui . I think my PR had a bit more explanatory text and content related to pytest so if you rebase and just keep my changes you should be good I think.

tupui · 2021-04-16T17:04:19Z

> Ahh sorry didn't see you already had a PR open for some related changes @tupui . I think my PR had a bit more explanatory text and content related to pytest so if you rebase and just keep my changes you should be good I think.

No worries @larsoner 😄 , I already merged master with your changes.

rkern · 2021-04-16T17:20:23Z

@mattip No. Use RandomState.

tupui · 2021-04-17T09:03:30Z

Anything else? I believe this is ready.

rgommers

The assert-related changes LGTM.

+1 to all @rkern said about unit testing. The current version of this PR doesn't include any random-related changes, which is right.

tupui · 2021-04-21T07:38:09Z

@mattip @seberg is there anything I should do here?

charris · 2021-04-21T13:29:52Z

Thanks @tupui , I was waiting to see if someone else wanted to comment.

tupui · 2021-04-21T13:44:15Z

> Thanks @tupui , I was waiting to see if someone else wanted to comment.

Thanks, no problem 😃 My first commits in NumPy, hopefully not the last ones!

DOC: remove legacy global seed, assert_almost_equal and assert_

04c97a6

github-actions bot added the 04 - Documentation label Apr 14, 2021

DOC: revert global seed

64bb06f

eric-wieser reviewed Apr 15, 2021

tupui force-pushed the test_guidelines_random_asserts branch from a86919e to 2cbaaba on April 15, 2021 09:38

DOC: not recommended section

f47f64e

tupui force-pushed the test_guidelines_random_asserts branch from 2cbaaba to f47f64e on April 15, 2021 10:00

Merge branch 'main' into test_guidelines_random_asserts

2ba1cde

Carreau reviewed Apr 17, 2021

Update doc/source/reference/routines.testing.rst

ff72d21

Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>

rgommers approved these changes Apr 17, 2021

charris merged commit ab24563 into numpy:main Apr 21, 2021

tupui deleted the test_guidelines_random_asserts branch April 21, 2021 13:42
