-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Refactor test_hashing.py as per pytest design. #454
Conversation
@lesteve I had kept this ready for you today ! Please review as per convenience 😄 |
The travis error indicates that |
Note that the failing build is the one where numpy is not installed. |
@lesteve I knew there was a problem with skipif marker. I guess the old |
# of same name due to 'indirect' keyword used here | ||
# Expected results have been generated with joblib 0.9.0 | ||
[(0, {'py2': '80f2387e7752abbda2658aafed49e086', | ||
'py3': '10a6afc379ca2708acfbaef0ab676eab'}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lesteve I cannot write it as:
@with_numpy
@parametrize(['to_hash', 'expected'],
[(np.arange(100, dtype='<i8').reshape((10, 10))[:, :2],
{'py2': '80f2387e7752abbda2658aafed49e086',
'py3': '10a6afc379ca2708acfbaef0ab676eab'}), .... #others])
def method_name(to_hash, expected):
# method body
pass
I cannot use np
in @parametrize args... They are still exposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you not use a function that returns the parameters i.e. something like:
def input_params():
return [(np.arange(100, dtype='<i8').reshape((10, 10))[:, :2],
{'py2': '80f2387e7752abbda2658aafed49e086',
'py3': '10a6afc379ca2708acfbaef0ab676eab'}), .... #others]
@with_numpy
@parametrize(['to_hash', 'expected'], input_params())
def method_name:
...
Alternatively it seems like the pytest issue you pointed at was skipping parameters inside the input_params function (or at least that is what I wildly guessed from reading it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively it seems like the pytest issue you pointed at was skipping parameters inside the input_params function (or at least that is what I wildly guessed from reading it).
I am reasonably confident the first option is not going to work but the latter one will. Here is a snippet that seems to work for me:
import pytest
try:
import bla
except ImportError:
bla = None
def params():
skip = pytest.mark.skipif(bla is None, reason="bla not installed")
return [skip('a'), skip('b')]
@pytest.mark.parametrize("param", params())
def test_foo(param):
assert False
There may be better ways to do it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well yes, the first one did not work, I tested it. It is because @parametrize
runs at setup and the function call also happens at setup. But skipping does not happen. Which is why the second snippet you gave works, because pytest allows providing skipif as parameters to parametrize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a snippet that seems to work for me
Shall we go this way ? Giving out skipif markers ? Just because we have got one more alternative, out of the two I will vote the already done implementation because I am assuming it would look familiar to eyes of people familair with pytest (maybe ?)
Here is a link to standard example in documentation:
http://doc.pytest.org/en/latest/example/parametrize.html#apply-indirect-on-particular-arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed I feel using parametrize in this case is too convoluted. So just use a dumb for loop.
ab8f008
to
a5cb2c2
Compare
@lesteve: Although I fixed that error, there are yield based tests in test_memory.py which keep on misbehaving. So So for now, |
import tempfile | ||
import os | ||
import sys | ||
import collections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't modify the imports order. If you have a plugin that orders imports alphabetically, disable it.
"""Smoke test hash on various types.""" | ||
# Check that 2 objects have the same hash only if they are the same. | ||
if obj1 is obj2: | ||
assert hash(obj1) == hash(obj2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope you are not testing the same as the test as it was previously this should be:
is_hash_equal = (hash(obj1) == hash(obj2))
assert is_hash_equal == ob1 is obj2
""" Test hashing with numpy arrays. | ||
def three_np_arrays(): | ||
""" | ||
Returns three numpy arrays where first two are same and third is a bit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring not PEP0258.
But it seems like having a fixture with with_numpy
is a good work-around.
arr1, arr2, arr3 = three_np_arrays | ||
|
||
# Only same arrays will have same hash | ||
assert hash(arr1) == hash(arr2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You modified the test here too, keep it as it was:
for obj1, obj2 in itertools.product(obj_list, obj_list):
is_hash_equal = hash(obj1) == hash(obj2)
is_array_equal = np.all(obj1 == obj2)
assert is_hash_equal == is_array_equal
[('This is a string to hash', | ||
{'py2': '80436ada343b0d79a99bfd8883a96e45', | ||
'py3': '71b3f47df22cb19431d85d92d0b230b2'}), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not skip new lines, it seems a bit weird to me.
'108f6ee98e7db19ea2006ffd208f4bf1', | ||
'bd48ccaaff28e16e6badee81041b7180']} | ||
# Return member of list requested by parametrize marker in test below | ||
return to_hash_list[request.param] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great try but I think this makes the test quite hard to understand. I spent 10 minutes trying to put together something using pytest features but did not get anywhere.
I would be in favour of not using parametrize in this case and do something very close to what was originally here:
@with_numpy
def test_hashes_stay_the_same_with_numpy_objects:
to_hash_list = ...
expected_dict = ...
for to_hash, expected in zip(to_hash_list, expected_list):
assert hash(to_hash), expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, my general feeling so far is that parametrize can make the test more obscure than it was. The only advantage compared to having a loop in the test is to create one test for each input argument, so I feel parametrize
should be used sparingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But won't this for
loop stop if the first test case itself is incorrect ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. Running a separate test for each parameter is nice but not that crucial. If it makes the test hard to understand then we should give up on using parametrize.
hash(a, coerce_mmap=coerce_mmap) == | ||
hash(m, coerce_mmap=coerce_mmap), | ||
coerce_mmap) | ||
assert ((hash(a, coerce_mmap=coerce_mmap) == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define a temporary variable to make this less obscure to read:
is_hash_equal = hash(a, coerce_mmap=coerce_mmap) == hash(m, coerce_mmap==coerce_mmap)
assert is_hash_equal == coerce_mmap
baf36fb
to
aef7097
Compare
@@ -93,8 +93,8 @@ def f(self, x): | |||
def test_trival_hash(obj1, obj2): | |||
"""Smoke test hash on various types.""" | |||
# Check that 2 objects have the same hash only if they are the same. | |||
is_hash_equal = (hash(obj1) == hash(obj2)) | |||
assert is_hash_equal == (obj1 is obj2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this second paranthesis is needed. Not having them failed tests for me. Because probably is_hash_equal == obj1 is obj2
will be true if all are not None
.
Oops I broke the tests by removing the parentheses, sorry about that, I'll fix them. |
Test should be fixed now. Turns out I was wrong about operators priority .. |
I am removing the parametrization from |
419757f
to
869b461
Compare
py_version_str = 'py3' if PY3_OR_LATER else 'py2' | ||
expected_list = expected_dict[py_version_str] | ||
# Expected results have been generated with joblib 0.9.0 | ||
expected_list = [{'py2': '80f2387e7752abbda2658aafed49e086', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you leave the expected_dict definition as it was. Remember: look at your diff and do not add changes if there is not a good reason for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am experimenting ways to parametrize this in a good way so I kept this diff. Nevermind, I shifted this to a separate branch.
def test_hash_numpy(): | ||
""" Test hashing with numpy arrays. | ||
def three_np_arrays(): | ||
"""Returns three numpy arrays where first two are same and third is a bit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the docstring, I don't think it adds anything.
def test_hash_numpy_arrays(three_np_arrays): | ||
arr1, arr2, arr3 = three_np_arrays | ||
|
||
# Only same arrays will have same hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this comment the code is clear enough and it does not say much.
|
||
yield assert_not_equal, hash(arr1), hash(arr1.T) | ||
# Two dicts will have same hash if they have exact same key value pairs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this comment too.
# Check that 2 objects have the same hash only if they are | ||
# the same. | ||
yield assert_equal, hash(obj1) == hash(obj2), obj1 is obj2 | ||
@parametrize('obj1, obj2', list(itertools.product( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment I found out while reading the doc that you can do:
input_list = [...]
@parametrize('obj1', input_list)
@parametrize('obj2', input_list)
def test_trivial_hash(obj1, obj2):
...
and this will have the same effect, i.e. iterating on the cartesian product of input_list
with itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought of doing this, but then I stopped, thinking that for the first look I should not demonstrate my diff which introduces a not so special list in global scope.
But now we're on the same boat, so modifying it...
@@ -460,13 +444,13 @@ def test_hashes_stay_the_same_with_numpy_objects(): | |||
expected_list = expected_dict[py_version_str] | |||
|
|||
for to_hash, expected in zip(to_hash_list, expected_list): | |||
yield assert_equal, hash(to_hash), expected | |||
assert hash(to_hash) == expected[py_version_str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to remove [py_version_str]
to get the tests to pass I believe.
d9ebd1f
to
4609bd3
Compare
|
||
|
||
@with_numpy | ||
@skipif(sys.platform == 'win32', reason='This test is not stable under windows' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I missed that is that a new thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK it's not I see it in the diff below, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, cool 😄
LGTM, merging, thanks a lot! |
Third Phase PR on #411 (Succeeding PR #453)
This PR deals with refactor of tests in test_hashing.py to adopt the pytest flavor.
Parametrization of certain tests have been done by an accompanying fixture to protect numpy specific code using
@with_numpy
. Refer Move to py.test? #411 (comment) for more details.Regex checking of tests has been done using pytest now.
tmpdir_path
fixture has been removed and pytest's owntmpdir
is brought back to used to keep uniformity across modules. Some older occurences (if any) will be taken care later.Some cosmetic changes like reordering of imports has been included.