New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve idmaker name selection in case of duplicate ids in parametrize #1474

Merged
merged 4 commits into from Mar 23, 2016

Conversation

Projects
None yet
3 participants
@palaviv
Contributor

palaviv commented Mar 22, 2016

As @RonnyPfannschmidt mentioned in pull request #1468 the name selection in case of duplicate ids had some problems.
Let's take for example:

import pytest
@pytest.mark.parametrize("a", [1,2,3,4,5,6], ids=["a", "a", "b", "c", "b", "l"])
def test_a(a,b):
    pass

Before this change the tests names would be:

  • test_a[0a]
  • test_a[1a]
  • test_a[2b]
  • test_a[3c]
  • test_a[4b]
  • test_a[5l]

After this change:

  • test_a[a0]
  • test_a[a1]
  • test_a[b0]
  • test_a[c]
  • test_a[b1]
  • test_a[l]
@@ -26,7 +26,8 @@
* parametrize ids can accept None as specific test id. The
automatically generated id for that argument will be used.
*
* improved idmaker name selection in case of duplicate ids in

This comment has been minimized.

@nicoddemus

nicoddemus Mar 23, 2016

Member

Please include a "thanks palaviv" and mention #1474. You might want to do the same for the item just above this one. 😉

if len(set(ids)) < len(ids):
# user may have provided a bad idfn which means the ids are not unique
ids = [str(i) + testid for i, testid in enumerate(ids)]
duplicates = [testid for testid in ids if ids.count(testid) > 1]

This comment has been minimized.

@nicoddemus

nicoddemus Mar 23, 2016

Member

A faster way to check for duplicates:

has_duplicates = len(ids) != len(set(ids))
if has_duplicates:

While usually it won't matter, for a large list of test ids the difference might be significant:

$python -m timeit -s "ids = list(range(1000))" "[testid for testid in ids if ids.count(testid) > 1]"
100 loops, best of 3: 14.2 msec per loop

$python -m timeit -s "ids = list(range(1000))" "len(ids) != len(set(ids))"
10000 loops, best of 3: 22.2 usec per loop

If I increase the number of ids to 10000 I have to kill python after several seconds using the first snippet, while the second finishes in just 262 usec. 😉

This comment has been minimized.

@palaviv

palaviv Mar 23, 2016

Contributor

At first i used:

duplicates =  [testid for testid, count in collections.Counter(ids).items() if count > 1]

which is more efficient but Counter is 2.7+.
I will change the code to first check if there are duplicates.

This comment has been minimized.

@nicoddemus

nicoddemus Mar 23, 2016

Member

... but Counter is 2.7+.

I figured. 😉

Thanks! 😄

@palaviv

This comment has been minimized.

Contributor

palaviv commented Mar 23, 2016

It seems that the travis-ci failed because of some error unrelated to the tests:

The command "curl -s -o python-3.5.tar.bz2 https://s3.amazonaws.com/travis-python archives/binaries/$(lsb_release -is | tr "A-Z" "a-z")/$(lsb_release -rs)/$(uname -m)/python-3.5.tar.bz2" failed and exited with 18 during .

Is there a way to rerun the build?

@The-Compiler

This comment has been minimized.

Member

The-Compiler commented Mar 23, 2016

I just restarted the failing build.

@nicoddemus nicoddemus merged commit fed89ef into pytest-dev:features Mar 23, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@nicoddemus

This comment has been minimized.

Member

nicoddemus commented Mar 23, 2016

Thanks again @palaviv! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment