Conversation

@anntzer (Contributor) commented Feb 7, 2020

E.g. check_figures_equal tests should typically work regardless of the random dataset used. If anything, this may slightly help catch brittle tests that only work for certain seed values.

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@tacaswell (Member)

I'm a little worried about not seeding the random numbers, as it could expose some value-dependent bugs that we have no way of reproducing.

@anntzer (Contributor, Author) commented Feb 7, 2020

But then at least we can pin the numbers back while investigating.

@pharshalp (Contributor)

> I'm a little worried about not seeding the random numbers as it could expose some value dependent bugs, but we have no way of reproducing.

Would a parametrized test work better, to cover both a random case and a reproducible case?

@pytest.mark.parametrize('np_rand_seed', [1234, None])
def test_long_path(np_rand_seed):
    np.random.seed(np_rand_seed)
    ...  # rest of the existing test body, unchanged

@anntzer (Contributor, Author) commented Feb 7, 2020

That feels like overengineering it, imho.

@tacaswell (Member)

@anntzer Fair, and knowing it has a possible value-dependent bug is better than not knowing it has one.

@pharshalp That is starting to go down the path of rewriting Hypothesis (https://hypothesis.readthedocs.io/en/latest/), and while useful, that is not a route I think we want to go down in our test suite.

y = np.random.random(npoints)
x[duplicate] = x[duplicate_of]
y[duplicate] = y[duplicate_of]
xy = np.random.random((npoints, 2))
A Member commented on this diff:

https://dev.azure.com/matplotlib/matplotlib/_build/results?buildId=8800&view=logs&j=a00bdc14-80a3-51a4-3934-86a314523526&t=1906b723-82f5-57cd-ebf1-a6ef0c620a3f&l=138

It looks like sometimes it picks 7 to drop and sometimes it picks 3 to drop. Should confirm with @ianthomas23 that this is OK behavior.

The PR author (@anntzer) replied:

That's exactly the kind of thing that I was hoping to catch...

A Member replied:

Short answer:
The seed call should stay.

Long answer:
If we are using truly random (i.e. non-seeded) points, the qhull-generated Delaunay triangulation will use just one of the two duplicated points, and we don't know in advance which one it will be. So there are two options:

  1. Keep the seed call.

  2. Remove the seed call and change the assert at the end of the test because the indices of the points used in the triangulation that are returned by qhull, np.unique(triang.triangles), will either be np.delete(np.arange(npoints), duplicate) (the existing code) or np.delete(np.arange(npoints), duplicate_of).
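For illustration only (with made-up index values, not the test's real data), option 2's order-independent assertion could be sketched as:

```python
import numpy as np

# Hypothetical values for illustration; the real test derives these
# from its random point set.
npoints, duplicate, duplicate_of = 10, 7, 3

# qhull keeps exactly one of the two duplicated points, but which one is
# not known in advance; pretend here that it dropped `duplicate`:
used = np.delete(np.arange(npoints), duplicate)

# The relaxed assertion accepts either outcome:
expected_a = np.delete(np.arange(npoints), duplicate)
expected_b = np.delete(np.arange(npoints), duplicate_of)
assert np.array_equal(used, expected_a) or np.array_equal(used, expected_b)
```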

The first option is better for 3 reasons:

  • It is simpler (less code).
  • It is what the code currently does and existing working code shouldn't be changed without good reason.
  • It is far better defensive programming to keep all tests deterministic. There is an exceedingly small but finite chance that a truly random (non-seeded) point will be close enough to the real duplicate points that the triangulation algorithm identifies this as the same point as the duplicate points and the test will fail with a different point index. The failure will happen at some point in the future on an entirely unrelated PR and will be totally unreproducible.

Hence keep the seed call. Then there is no reason to change any other code in the same function in a PR about seed calls.

There is a wider issue here that all tests should be deterministic and hence no seed calls should be removed without the explicit permission of each person who wrote the initial code in each case. Otherwise we are risking a number of extremely unlikely possible failures that are avoided with the existing code. The use of random numbers here is just shorthand for a small number of hard-coded points, we don't actually want to run the test with truly random numbers. If there is fundamental opposition to the use of seed in tests then the random values can be replaced with hard-coded ones. That is easily done for this test, but what about in test_agg that needs 70000 values?!
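The "seed as shorthand for hard-coded values" point is easy to demonstrate: reseeding replays the exact same stream, so the data is effectively hard-coded without writing out tens of thousands of literals (a minimal sketch, not the actual test code):

```python
import numpy as np

# With the same seed, the legacy global RandomState stream is fully
# deterministic, so these two draws are identical arrays.
np.random.seed(0)
a = np.random.random(70000)
np.random.seed(0)
b = np.random.random(70000)
assert np.array_equal(a, b)
```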

The PR author (@anntzer) replied:

I'm fine reverting the change here (which I will do) -- especially as you point out a real possible failure mode -- but for most other cases, I am strongly convinced removing the seed is correct. For example, test_agg tests that you can draw a 70,000-point line without raising an exception. If there is any 70,000-point line which causes an exception, this is clearly a bug on our side. It may be tricky to investigate, so we may patch the forced seed back in while investigating, but we can't just say "oh, that's an unlikely case, let's ignore it". Likewise, tests involving check_figures_equal are testing logical equivalences -- two series of calls should result in the same backend-level draw operations, regardless of the input.

np.testing.assert_allclose(cb.get_ticks(), [0.2, 0.4, 0.6, 0.8])
plt.rcParams["_internal.classic_mode"] = False
fig, ax = plt.subplots()
pc = ax.pcolormesh([[.05, .95]])
The PR author (@anntzer) commented on this diff:

Actually this is nicer (and more robust) without using random data...

@timhoffm (Member) commented May 8, 2020

I don't think we should randomize regular test runs to try and find bugs.

If a random failure hits, it may not be at a time when you want to investigate it. More generally, getting tests to pass is a significant step for a contribution. It is frustrating for new as well as experienced contributors when your PR tests fail and you don't know why.

If we really care about this, we should collect the relevant tests and run them locally without seed. This is about 30 tests, so you can run them quite often in a reasonable amount of time.
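One way to do such an occasional unseeded local run (a hedged sketch, not an existing matplotlib facility) is to neutralize np.random.seed from a local conftest.py so the relevant tests see truly random data:

```python
# Hypothetical conftest.py fragment for a local, unseeded test run;
# this is a sketch, not part of matplotlib's test suite.
import numpy as np

_real_seed = np.random.seed

def _ignore_seed(*args, **kwargs):
    # Swallow the requested seed so every run draws fresh random data.
    pass

np.random.seed = _ignore_seed

# Restore determinism afterwards with: np.random.seed = _real_seed
```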

@anntzer (Contributor, Author) commented May 8, 2020

OK, extracted one of the changes to #17361 (as I think that one is certainly more correct without the randomness), and closing the rest.

@anntzer anntzer closed this May 8, 2020
@anntzer anntzer deleted the unseed branch May 8, 2020 09:31
@QuLogic QuLogic modified the milestones: v3.3.0, unassigned Jul 9, 2020