Fix test reordering for indirect parameterization #9350

haxtibal · 2021-11-29T23:49:20Z

reorder_items groups tests so that each group can share indirectly parameterized fixture instances. This optimizes for fewer fixture setup/teardown cycles. Prior to this PR, grouping considered an parameter's index, not the parameter value itself, to determine whether a parameter is "the same" and therefore can be shared.

Relying on indexes is fast, however only works when there's exactly one parameter list for one fixture. If we provide multiple (possibly different) lists for the same fixture, e.g. by decorating test items, one index can no longer refer to "the same" parameter. In order to still be able to group for fixture reuse, we have to group by parameter value.

Caution: The value ends up inside the key of another dict, and therefore must be hashable. This was true for indexes, but no longer is guaranteed for user provided values. A user may e.g. provide dicts or numpy arrays. The SafeHashWrapper ensures a fallback to id() in such a case.

Fixes #8914.

nicoddemus · 2021-11-30T20:18:47Z

testing/example_scripts/issue_519.py

@@ -22,13 +22,13 @@ def checked_order():
    assert order == [
        ("issue_519.py", "fix1", "arg1v1"),
        ("test_one[arg1v1-arg2v1]", "fix2", "arg2v1"),
-        ("test_two[arg1v1-arg2v1]", "fix2", "arg2v1"),


Hmm, this is odd, because we have increased the number of setups/teardowns for fixture 2 with this change.

On main, we have 2 s/t for fix1, and 4 s/t for fix2.

Here we have the same 2 s/t for fix1, but now we have 8 s/t for fix2. Seems like a regression for this case. 🤔

now we have 8 s/t for fix2

Yes, but we also had 8 s/t for fix2 even before this PR. fix2 is always (and correctly) recreated, because it has function scope. It's just not obvious from the assert statement. Tested it by adding print output pre and post the yield in foo{1,2}. But please double check...

I think the same holds for fixture arg2, it also has function scope and therefore shall always be recreated.

fix1 and arg1 have module scope, and both are successfully reused even after the PR, are they?

Ahh right my bad, missed the fact that fix2 is function scoped, so its order there doesn't matter as it will be recreated each time. 👍

nicoddemus · 2021-11-30T20:19:50Z

Thanks @haxtibal!

The PR looks great, but I left a comment on one of the tests you changed, please take a look?

nicoddemus

LGTM thanks!

It would be nice if another maintainer also reviewed this (@RonnyPfannschmidt @bluetech)

bluetech

Thanks @haxtibal, interesting. This does seem to give better behavior.

To give a proper review, I would need to dig into the details here (like understand the surrounding code, and figure out why the param index was even used in the first place). That might take some time, so I wouldn't want to hold the PR for too long.

One possible problem with using the param value is that some params might be large (like large numpy arrays and what not) and comparing them might be slow. See for example 80d4dd6. We will need to consider if this is a problem for this case or not.

bluetech · 2021-12-08T13:00:17Z

src/_pytest/fixtures.py

+@attr.s(auto_attribs=True, eq=False)
+class SafeHashWrapper:
+    obj: Any
+
+    def __eq__(self, other) -> Any:
+        try:
+            res = self.obj == other
+            bool(res)
+            return res
+        except Exception:
+            return id(self.obj) == id(other)
+
+    def __hash__(self) -> Any:
+        if isinstance(self.obj, Hashable):
+            return hash(self.obj)
+        return hash(id(self.obj))


Some suggestions here (I haven't checked that they all work):

Suggested change

@attr.s(auto_attribs=True, eq=False)

class SafeHashWrapper:

obj: Any

def __eq__(self, other) -> Any:

try:

res = self.obj == other

bool(res)

return res

except Exception:

return id(self.obj) == id(other)

def __hash__(self) -> Any:

if isinstance(self.obj, Hashable):

return hash(self.obj)

return hash(id(self.obj))

@attr.s(auto_attribs=True, eq=False, slots=True)

class SafeHashWrapper:

obj: object

def __eq__(self, other: object) -> bool:

try:

res = self.obj == other

return bool(res)

except Exception:

return id(self.obj) == id(other)

def __hash__(self) -> int:

if isinstance(self.obj, Hashable):

return hash(self.obj)

return hash(id(self.obj))

Thank's, applied all except obj: object, because it makes mypy think obj is always hashable and therefore we have an unreachable statement in SafeHashWrapper.__hash__.

I think the type constraint I'd like to express for obj is "type supports identity" and "type is not NoneType". But I don't know a way how, do you? EDIT: See python/mypy#11799.

A bit more OT, if isinstance(object(), Hashable) and isinstance(dict(), object) are True, doesn't isinstance(dict(), Hashable) == False break Liskov?

src/_pytest/fixtures.py

testing/python/fixtures.py

src/_pytest/fixtures.py

bluetech · 2021-12-08T13:16:39Z

testing/python/fixtures.py

@@ -1312,6 +1312,36 @@ def test2(no_eq):
        result = pytester.runpytest()
        result.stdout.fnmatch_lines(["*4 passed*"])

+    def test_reorder_indirect(self, pytester: Pytester) -> None:


A docstring with a reference to the PR (since there is no issue) would be good.

BTW, if you explicitly want to test the reordering here and not the runtime behavior, I suggest running the pytester with --collect-only --quiet which just spits the test order, and checking that, instead of doing prints and all that.

explicitly want to test the reordering here

Afaik pytest doesn't give guarantees about absolute test order, but only about certain constraints. So I intended to assert for the number of setups/teardowns to be minimal, not for a specific sequence. I think that's only possible if we actually let the fixture requests happen. They happen after collect.

But --collect-only could still be good for kind of unit testing the reorder function, would such an additional test be useful?

haxtibal · 2021-12-08T17:19:23Z

Thanks for the review @nicoddemus and @bluetech

figure out why the param index was even used in the first place

0cfd873 and #244 give some insight. Seems we switched from value to index in 2013 because unhashable params caused failure. This PR works around the unhashable problem by falling back to id() via the SafeHashWrapper.

some params might be large (like large numpy arrays and what not) and comparing them might be slow.

Agreed. Comparing here means hashing, but that could be slow too. It's probably not too bad, but still... A way out could be to consider a user provided id, instead of calculating the hash (such a user id would also be better than the id() fallback for unhashable types). Maybe even pytest.param(id="use_this_one_for_reorder") could be good enough? Give me some time, I'll evaluate that.

numpy arrays

They're special, as they're unhashable and won't compare to bool. It's a case where the SafeHashWrapper falls back to id().

bluetech · 2021-12-08T21:13:52Z

0cfd873 and #244 give some insight.

Indeed, knowing it was originally value based and only switched to indexes for the hashability reassures me that there's not something we're missing.

haxtibal · 2021-12-09T22:43:20Z

Maybe even pytest.param(id="use_this_one_for_reorder") could be good enough?

They could :) dfc773e demonstrates how. Not yet code complete, but all tests pass. Just wanted to hear your opinion about the approach early.

The benefit should be obvious from this example:

@pytest.mark.parametrize("fix", [1, pytest.param({"data": 2}, id="2")], indirect=True)
def test1(fix):
    pass

@pytest.mark.parametrize("fix", [pytest.param({"data": 2}, id="2"), 1], indirect=True)
def test2(fix):
    pass

Instead of subtle surprises triggered by implicit fallback to identity compare, a user can now explicitly control which
parameters are "the same". The id arg seems like the natural choice for that purpose. Otoh, the changes are less
trivial compared to the first commit, so I can understand if you don't like it.

How shall we proceed from here?

EDIT: @nicoddemus Sorry, just realized you've already approved, and my experimental commit totally invalidates that approve. Now I reverted this PR to the conservative fix, and created a new branch feature/param_id_is_key for the experimental one.

bluetech · 2021-12-11T20:51:35Z

@haxtibal Using ids sounds interesting. I'm not sure if it can work, but if does, it would be an elegant solution to the problem.

Your commit is hard to understand directly (because the underlying code is not the prettiest). What would be great is if you could:

Open a separate PR for the id approach, based directly on main.
Add comments (can be code comments or github comments) explaining what each non-trivial change is doing.

Add a test to assert pytest-dev#8914 is fixed. The test assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)

haxtibal · 2021-12-16T21:59:06Z

@bluetech

Open a separate PR for the id approach, based directly on main.

Done, see #9420. It's now basically a combination of using parameter ids from API plus a fallback to the behavior introduced in #9350.

Add comments (can be code comments or github comments) explaining what each non-trivial change is doing.

Did it as series of smaller commits, each with reasoning in commit message, hope that helps a bit.

if it can work

All tests pass, "can work" is "yes" until proven otherwise😏

Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)

reorder_items groups tests so that each group can share indirectly parameterized fixture instances. This optimizes for fewer fixture setup/teardown cycles. Prior to this commit, grouping considered an parameter's index, not the parameter value itself, to determine whether a parameter is "the same" and therefore can be shared. Relying on indexes is fast, however only works when there's exactly one parameter list for one fixture. If we provide multiple (possibly different) lists for the same fixture, e.g. by decorating test items, one index can no longer refer to "the same" parameter. In order to still be able to group for fixture reuse, we have to group by parameter value. Caution: The value ends up inside the key of another dict, and therefore must be hashable. This was true for indexes, but no longer is guaranteed for user provided values. A user may e.g. provide dicts or numpy arrays. The SafeHashWrapper ensures a fallback to id() in such a case. Fixes pytest-dev#8914.

Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)

haxtibal mentioned this pull request Nov 29, 2021

Indirect session fixture reordering not working as expected #8914

Open

nicoddemus reviewed Nov 30, 2021

View reviewed changes

nicoddemus approved these changes Dec 3, 2021

View reviewed changes

bluetech reviewed Dec 8, 2021

View reviewed changes

haxtibal force-pushed the bugfix/reorder_indirect_params branch from 5a50852 to dfc773e Compare December 9, 2021 22:38

haxtibal force-pushed the bugfix/reorder_indirect_params branch from dfc773e to 2fdbc3e Compare December 10, 2021 12:13

haxtibal mentioned this pull request Dec 16, 2021

Improved fixture reuse by new param keys that can be derived from API ids #9420

Open

nicoddemus self-requested a review December 17, 2021 10:46

haxtibal force-pushed the bugfix/reorder_indirect_params branch from 2fdbc3e to 4a314bf Compare December 19, 2021 13:47

bluetech mentioned this pull request Jul 15, 2023

Make high-scope fixtures teardown at last dependent test #10771

Closed

bluetech mentioned this pull request Sep 9, 2023

Improvement: Base FixtureArgKeys on param values if possible, not param indices #11271

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix test reordering for indirect parameterization #9350

Fix test reordering for indirect parameterization #9350

haxtibal commented Nov 29, 2021

nicoddemus Nov 30, 2021

haxtibal Nov 30, 2021

haxtibal Nov 30, 2021

nicoddemus Dec 3, 2021

nicoddemus commented Nov 30, 2021

nicoddemus left a comment

bluetech left a comment

bluetech Dec 8, 2021

haxtibal Dec 19, 2021 •

edited

bluetech Dec 8, 2021

haxtibal Dec 10, 2021

haxtibal commented Dec 8, 2021

bluetech commented Dec 8, 2021

haxtibal commented Dec 9, 2021 •

edited

bluetech commented Dec 11, 2021

haxtibal commented Dec 16, 2021

Fix test reordering for indirect parameterization #9350

Are you sure you want to change the base?

Fix test reordering for indirect parameterization #9350

Conversation

haxtibal commented Nov 29, 2021

nicoddemus Nov 30, 2021

Choose a reason for hiding this comment

haxtibal Nov 30, 2021

Choose a reason for hiding this comment

haxtibal Nov 30, 2021

Choose a reason for hiding this comment

nicoddemus Dec 3, 2021

Choose a reason for hiding this comment

nicoddemus commented Nov 30, 2021

nicoddemus left a comment

Choose a reason for hiding this comment

bluetech left a comment

Choose a reason for hiding this comment

bluetech Dec 8, 2021

Choose a reason for hiding this comment

haxtibal Dec 19, 2021 • edited

Choose a reason for hiding this comment

bluetech Dec 8, 2021

Choose a reason for hiding this comment

haxtibal Dec 10, 2021

Choose a reason for hiding this comment

haxtibal commented Dec 8, 2021

bluetech commented Dec 8, 2021

haxtibal commented Dec 9, 2021 • edited

bluetech commented Dec 11, 2021

haxtibal commented Dec 16, 2021

haxtibal Dec 19, 2021 •

edited

haxtibal commented Dec 9, 2021 •

edited