[MRG] Refactor test_hashing.py as per pytest design. #454

kdexd · 2016-12-12T15:11:18Z

Third Phase PR on #411 (Succeeding PR #453)

This PR deals with refactor of tests in test_hashing.py to adopt the pytest flavor.

Parametrization of certain tests have been done by an accompanying fixture to protect numpy specific code using @with_numpy. Refer Move to py.test? #411 (comment) for more details.
Regex checking of tests has been done using pytest now.
tmpdir_path fixture has been removed and pytest's own tmpdir is brought back to used to keep uniformity across modules. Some older occurences (if any) will be taken care later.
Some cosmetic changes like reordering of imports has been included.

…ure.

kdexd · 2016-12-12T15:11:52Z

@lesteve I had kept this ready for you today ! Please review as per convenience 😄

lesteve · 2016-12-12T15:27:38Z

The travis error indicates that with_numpy does not seem to work as expected, not sure why.

lesteve · 2016-12-12T15:28:11Z

Note that the failing build is the one where numpy is not installed.

kdexd · 2016-12-12T15:28:45Z

@lesteve I knew there was a problem with skipif marker. I guess the old with_numpy cannot be changed. Old with_numpy works perfectly fine.

kdexd · 2016-12-12T15:32:48Z

joblib/test/test_hashing.py

+             # of same name due to 'indirect' keyword used here
+             # Expected results have been generated with joblib 0.9.0
+             [(0, {'py2': '80f2387e7752abbda2658aafed49e086',
+                   'py3': '10a6afc379ca2708acfbaef0ab676eab'}),


@lesteve I cannot write it as:

@with_numpy @parametrize(['to_hash', 'expected'], [(np.arange(100, dtype='<i8').reshape((10, 10))[:, :2], {'py2': '80f2387e7752abbda2658aafed49e086', 'py3': '10a6afc379ca2708acfbaef0ab676eab'}), .... #others]) def method_name(to_hash, expected): # method body pass

I cannot use np in @parametrize args... They are still exposed.

Can you not use a function that returns the parameters i.e. something like:

def input_params(): return [(np.arange(100, dtype='<i8').reshape((10, 10))[:, :2], {'py2': '80f2387e7752abbda2658aafed49e086', 'py3': '10a6afc379ca2708acfbaef0ab676eab'}), .... #others] @with_numpy @parametrize(['to_hash', 'expected'], input_params()) def method_name: ...

Alternatively it seems like the pytest issue you pointed at was skipping parameters inside the input_params function (or at least that is what I wildly guessed from reading it).

Alternatively it seems like the pytest issue you pointed at was skipping parameters inside the input_params function (or at least that is what I wildly guessed from reading it).

I am reasonably confident the first option is not going to work but the latter one will. Here is a snippet that seems to work for me:

import pytest try: import bla except ImportError: bla = None def params(): skip = pytest.mark.skipif(bla is None, reason="bla not installed") return [skip('a'), skip('b')] @pytest.mark.parametrize("param", params()) def test_foo(param): assert False

There may be better ways to do it though.

Well yes, the first one did not work, I tested it. It is because @parametrize runs at setup and the function call also happens at setup. But skipping does not happen. Which is why the second snippet you gave works, because pytest allows providing skipif as parameters to parametrize.

Here is a snippet that seems to work for me

Shall we go this way ? Giving out skipif markers ? Just because we have got one more alternative, out of the two I will vote the already done implementation because I am assuming it would look familiar to eyes of people familair with pytest (maybe ?)

Here is a link to standard example in documentation:

http://doc.pytest.org/en/latest/example/parametrize.html#apply-indirect-on-particular-arguments

As discussed I feel using parametrize in this case is too convoluted. So just use a dumb for loop.

kdexd · 2016-12-12T16:07:54Z

The travis error indicates that with_numpy does not seem to work as expected, not sure why.

@lesteve: Although I fixed that error, there are yield based tests in test_memory.py which keep on misbehaving. So with_numpy is still the old standard decorator for this PR. I will convert it to skipif marker later.

So for now, with_numpy is unchanged and everything is green now...

lesteve · 2016-12-13T06:20:53Z

joblib/test/test_hashing.py

-import tempfile
-import os
-import sys
+import collections


Don't modify the imports order. If you have a plugin that orders imports alphabetically, disable it.

lesteve · 2016-12-13T06:22:30Z

joblib/test/test_hashing.py

+    """Smoke test hash on various types."""
+    # Check that 2 objects have the same hash only if they are the same.
+    if obj1 is obj2:
+        assert hash(obj1) == hash(obj2)


Nope you are not testing the same as the test as it was previously this should be:

is_hash_equal = (hash(obj1) == hash(obj2)) assert is_hash_equal == ob1 is obj2

lesteve · 2016-12-13T06:22:40Z

joblib/test/test_hashing.py

-    """ Test hashing with numpy arrays.
+def three_np_arrays():
+    """
+    Returns three numpy arrays where first two are same and third is a bit


Docstring not PEP0258.

But it seems like having a fixture with with_numpy is a good work-around.

lesteve · 2016-12-13T06:24:11Z

joblib/test/test_hashing.py

+    arr1, arr2, arr3 = three_np_arrays
+
+    # Only same arrays will have same hash
+    assert hash(arr1) == hash(arr2)


You modified the test here too, keep it as it was:

for obj1, obj2 in itertools.product(obj_list, obj_list): is_hash_equal = hash(obj1) == hash(obj2) is_array_equal = np.all(obj1 == obj2) assert is_hash_equal == is_array_equal

lesteve · 2016-12-13T06:25:23Z

joblib/test/test_hashing.py

+             [('This is a string to hash',
+                 {'py2': '80436ada343b0d79a99bfd8883a96e45',
+                  'py3': '71b3f47df22cb19431d85d92d0b230b2'}),
+


Do not skip new lines, it seems a bit weird to me.

lesteve · 2016-12-13T07:53:19Z

joblib/test/test_hashing.py

-                             '108f6ee98e7db19ea2006ffd208f4bf1',
-                             'bd48ccaaff28e16e6badee81041b7180']}
+    # Return member of list requested by parametrize marker in test below
+    return to_hash_list[request.param]


Great try but I think this makes the test quite hard to understand. I spent 10 minutes trying to put together something using pytest features but did not get anywhere.

I would be in favour of not using parametrize in this case and do something very close to what was originally here:

@with_numpy def test_hashes_stay_the_same_with_numpy_objects: to_hash_list = ... expected_dict = ... for to_hash, expected in zip(to_hash_list, expected_list): assert hash(to_hash), expected

To be honest, my general feeling so far is that parametrize can make the test more obscure than it was. The only advantage compared to having a loop in the test is to create one test for each input argument, so I feel parametrize should be used sparingly.

But won't this for loop stop if the first test case itself is incorrect ?

That's fine. Running a separate test for each parameter is nice but not that crucial. If it makes the test hard to understand then we should give up on using parametrize.

lesteve · 2016-12-13T07:57:21Z

joblib/test/test_hashing.py

-                   hash(a, coerce_mmap=coerce_mmap) ==
-                   hash(m, coerce_mmap=coerce_mmap),
-                   coerce_mmap)
+        assert ((hash(a, coerce_mmap=coerce_mmap) ==


Define a temporary variable to make this less obscure to read:

is_hash_equal = hash(a, coerce_mmap=coerce_mmap) == hash(m, coerce_mmap==coerce_mmap) assert is_hash_equal == coerce_mmap

kdexd · 2016-12-13T09:06:56Z

joblib/test/test_hashing.py

@@ -93,8 +93,8 @@ def f(self, x):
 def test_trival_hash(obj1, obj2):
    """Smoke test hash on various types."""
    # Check that 2 objects have the same hash only if they are the same.
-    is_hash_equal = (hash(obj1) == hash(obj2))
-    assert is_hash_equal == (obj1 is obj2)


I think this second paranthesis is needed. Not having them failed tests for me. Because probably is_hash_equal == obj1 is obj2 will be true if all are not None.

lesteve · 2016-12-13T09:13:40Z

Oops I broke the tests by removing the parentheses, sorry about that, I'll fix them.

and rename variables.

lesteve · 2016-12-13T09:23:48Z

Test should be fixed now. Turns out I was wrong about operators priority ..

kdexd · 2016-12-13T09:37:33Z

I am removing the parametrization from test_hashes_stay_the_same_with_numpy_objects but I insist to keep expected in a dictionary form per test case ( {'py2': hash, 'py3': hash} as it is in @parametrize of other tests, as I will locally try to refactor this test in some acceptable and good looking manner.

lesteve · 2016-12-13T09:56:35Z

joblib/test/test_hashing.py

-    py_version_str = 'py3' if PY3_OR_LATER else 'py2'
-    expected_list = expected_dict[py_version_str]
+    # Expected results have been generated with joblib 0.9.0
+    expected_list = [{'py2': '80f2387e7752abbda2658aafed49e086',


Can you leave the expected_dict definition as it was. Remember: look at your diff and do not add changes if there is not a good reason for it.

I am experimenting ways to parametrize this in a good way so I kept this diff. Nevermind, I shifted this to a separate branch.

lesteve · 2016-12-13T09:57:08Z

joblib/test/test_hashing.py

-def test_hash_numpy():
-    """ Test hashing with numpy arrays.
+def three_np_arrays():
+    """Returns three numpy arrays where first two are same and third is a bit


I would remove the docstring, I don't think it adds anything.

lesteve · 2016-12-13T09:57:33Z

joblib/test/test_hashing.py

+def test_hash_numpy_arrays(three_np_arrays):
+    arr1, arr2, arr3 = three_np_arrays
+
+    # Only same arrays will have same hash


I would remove this comment the code is clear enough and it does not say much.

lesteve · 2016-12-13T09:57:41Z

joblib/test/test_hashing.py


-    yield assert_not_equal, hash(arr1), hash(arr1.T)
+    # Two dicts will have same hash if they have exact same key value pairs


Remove this comment too.

lesteve · 2016-12-13T10:19:13Z

joblib/test/test_hashing.py

-            # Check that 2 objects have the same hash only if they are
-            # the same.
-            yield assert_equal, hash(obj1) == hash(obj2), obj1 is obj2
+@parametrize('obj1, obj2', list(itertools.product(


Just a comment I found out while reading the doc that you can do:

input_list = [...] @parametrize('obj1', input_list) @parametrize('obj2', input_list) def test_trivial_hash(obj1, obj2): ...

and this will have the same effect, i.e. iterating on the cartesian product of input_list with itself.

I thought of doing this, but then I stopped, thinking that for the first look I should not demonstrate my diff which introduces a not so special list in global scope.

But now we're on the same boat, so modifying it...

lesteve · 2016-12-13T10:36:19Z

joblib/test/test_hashing.py

@@ -460,13 +444,13 @@ def test_hashes_stay_the_same_with_numpy_objects():
    expected_list = expected_dict[py_version_str]

    for to_hash, expected in zip(to_hash_list, expected_list):
-        yield assert_equal, hash(to_hash), expected
+        assert hash(to_hash) == expected[py_version_str]


You need to remove [py_version_str] to get the tests to pass I believe.

lesteve · 2016-12-13T12:30:18Z

joblib/test/test_hashing.py



 @with_numpy
+@skipif(sys.platform == 'win32', reason='This test is not stable under windows'


Hmmm I missed that is that a new thing?

OK it's not I see it in the diff below, sorry.

Okay, cool 😄

lesteve · 2016-12-13T12:31:59Z

LGTM, merging, thanks a lot!

Karan Desai added 8 commits December 12, 2016 20:34

Parametrize test_trival_hash and reorder imports alphabetically.

b2273dc

Paramterize test_hashes_stay_the_same.

9491393

Parametrize test_hashes_stay_the_same_with_numpy_objects through fixt…

5579e27

…ure.

Split test_hash_numpy into two tests and use fixture.

e5eceb5

Use tmpdir fixture, remove tmpdir_path, parametrize test_hash_memmep.

6369427

Use skipif marker instead of SkipTest.

11432f4

Parametrize test_numpy_datetime_array.

6d59d84

Replace assert_raises_regex with pytest_assert_raises.

a5cb2c2

kdexd commented Dec 12, 2016

View reviewed changes

kdexd force-pushed the pytestify-test-hashing branch 2 times, most recently from ab8f008 to a5cb2c2 Compare December 12, 2016 16:06

lesteve reviewed Dec 13, 2016

View reviewed changes

Rectify changed tests and do some cosmetic changes.

aef7097

kdexd force-pushed the pytestify-test-hashing branch from baf36fb to aef7097 Compare December 13, 2016 08:53

kdexd mentioned this pull request Dec 13, 2016

Use comma-separated string rather than tuple in pytest parametrize arguments #455

Closed

lesteve added 2 commits December 13, 2016 10:05

Remove unnecessary parentheses

f8bae6d

Fix typo trival -> trivial

d7ee804

kdexd commented Dec 13, 2016

View reviewed changes

Remove redundant docstrings

65bacd4

Fix tests

28b2794

and rename variables.

Remove parametrization of test_hashes_stay_the_same_with_numpy_objects.

869b461

kdexd force-pushed the pytestify-test-hashing branch from 419757f to 869b461 Compare December 13, 2016 09:40

Use str for argnames in parametrization instead of list.

8999046

lesteve reviewed Dec 13, 2016

View reviewed changes

Remove redundant docstrings.

2a8c754

lesteve reviewed Dec 13, 2016

View reviewed changes

Refactor parametrization of test_trivial_hash.

4609bd3

kdexd force-pushed the pytestify-test-hashing branch from d9ebd1f to 4609bd3 Compare December 13, 2016 10:53

lesteve reviewed Dec 13, 2016

View reviewed changes

lesteve merged commit 117009d into joblib:master Dec 13, 2016

kdexd deleted the pytestify-test-hashing branch December 13, 2016 12:32

kdexd mentioned this pull request Dec 13, 2016

[MRG] Refactor test_testing.py as per pytest design. #458

Merged


		yield assert_not_equal, hash(arr1), hash(arr1.T)
		# Two dicts will have same hash if they have exact same key value pairs



		@with_numpy
		@skipif(sys.platform == 'win32', reason='This test is not stable under windows'

[MRG] Refactor test_hashing.py as per pytest design. #454

[MRG] Refactor test_hashing.py as per pytest design. #454

Conversation

kdexd commented Dec 12, 2016

Third Phase PR on #411 (Succeeding PR #453)

kdexd commented Dec 12, 2016

lesteve commented Dec 12, 2016

lesteve commented Dec 12, 2016

kdexd commented Dec 12, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kdexd commented Dec 12, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lesteve commented Dec 13, 2016

lesteve commented Dec 13, 2016

kdexd commented Dec 13, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lesteve Dec 13, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lesteve Dec 13, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lesteve commented Dec 13, 2016

kdexd commented Dec 12, 2016 •

edited

lesteve Dec 13, 2016 •

edited

lesteve Dec 13, 2016 •

edited