Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unittest subTest failure causes result to be omitted from listing #70082

Closed
zware opened this issue Dec 17, 2015 · 12 comments
Closed

unittest subTest failure causes result to be omitted from listing #70082

zware opened this issue Dec 17, 2015 · 12 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@zware
Copy link
Member

zware commented Dec 17, 2015

BPO 25894
Nosy @pitrou, @rbtcollins, @ezio-melotti, @bitdancer, @voidspace, @ambv, @vadmium, @zware, @serhiy-storchaka
PRs
  • bpo-25894: Always report skipped and failed subtests separately #28082
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-10.16:00:39.878>
    created_at = <Date 2015-12-17.05:49:35.850>
    labels = ['type-bug', 'library', '3.11']
    title = 'unittest subTest failure causes result to be omitted from listing'
    updated_at = <Date 2021-09-10.16:00:39.877>
    user = 'https://github.com/zware'

    bugs.python.org fields:

    activity = <Date 2021-09-10.16:00:39.877>
    actor = 'lukasz.langa'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-10.16:00:39.878>
    closer = 'lukasz.langa'
    components = ['Library (Lib)']
    creation = <Date 2015-12-17.05:49:35.850>
    creator = 'zach.ware'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 25894
    keywords = ['patch']
    message_count = 12.0
    messages = ['256580', '256588', '256590', '256604', '256631', '257146', '261834', '400573', '400700', '401101', '401586', '401587']
    nosy_count = 9.0
    nosy_names = ['pitrou', 'rbcollins', 'ezio.melotti', 'r.david.murray', 'michael.foord', 'lukasz.langa', 'martin.panter', 'zach.ware', 'serhiy.storchaka']
    pr_nums = ['28082']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25894'
    versions = ['Python 3.11']

    @zware
    Copy link
    Member Author

    zware commented Dec 17, 2015

    The title can barely be called accurate; the description of the problem isn't easy to condense to title length. Here's the issue:

    $ cat subtest_test.py 
    import os
    import unittest
    class TestClass(unittest.TestCase):
    
        def test_subTest(self):
            for t in map(int, os.environ.get('tests', '1')):
                with self.subTest(t):
                    if t > 1:
                        raise unittest.SkipTest('skipped')
                    self.assertTrue(t)
    
    if __name__ == '__main__':
        unittest.main()
    $ ./python.exe subtest_test.py 
    .

    Ran 1 test in 0.000s

    OK
    $ tests=01 ./python.exe subtest_test.py

    ======================================================================
    FAIL: test_subTest (main.TestClass) (<subtest>)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "subtest_test.py", line 12, in test_subTest
        self.assertTrue(t)
    AssertionError: 0 is not true

    Ran 1 test in 0.001s

    FAILED (failures=1)
    $ tests=012 ./python.exe subtest_test.py
    s
    ======================================================================
    FAIL: test_subTest (main.TestClass) (<subtest>)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "subtest_test.py", line 12, in test_subTest
        self.assertTrue(t)
    AssertionError: 0 is not true

    Ran 1 test in 0.001s

    FAILED (failures=1, skipped=1)

    Note that on the first run, the short summary is ".", as expected. The second is "", when one of the subTests fails, but then the third is "s", when one subtest fails but another is skipped. This also extends to verbose mode:

    $ ./python.exe subtest_test.py -v
    test_subTest (__main__.TestClass) ... ok

    Ran 1 test in 0.001s

    OK
    $ tests=01 ./python.exe subtest_test.py -v
    test_subTest (main.TestClass) ...
    ======================================================================
    FAIL: test_subTest (main.TestClass) (<subtest>)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "subtest_test.py", line 12, in test_subTest
        self.assertTrue(t)
    AssertionError: 0 is not true

    Ran 1 test in 0.001s

    FAILED (failures=1)
    $ tests=012 ./python.exe subtest_test.py -v
    test_subTest (main.TestClass) ... skipped 'skipped'

    ======================================================================
    FAIL: test_subTest (main.TestClass) (<subtest>)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "subtest_test.py", line 12, in test_subTest
        self.assertTrue(t)
    AssertionError: 0 is not true

    Ran 1 test in 0.001s

    FAILED (failures=1, skipped=1)

    Note the first run shows "... ok", the second "... ", and the third "... skipped 'skipped'"

    I'm unsure what the solution should be. There should at least be some indication that the test finished, but should mixed results be reported as 'm' ("mixed results" in verbose mode), or should failure/error take precedence, or should every different result be represented?

    @zware zware added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Dec 17, 2015
    @vadmium
    Copy link
    Member

    vadmium commented Dec 17, 2015

    Okay, so you have a test with subtests. You have presented three cases:

    1. Single subtest which passes. No problem I assume.

    2. Two subtests: 1st fails, 2nd passes. This is how subtests are normally used, so I guess there is no problem. Is that right?

    3. After two subtests have already run (one of which failed), SkipTest is raised. I guess you want the test results to be reported better in this case.

    What is the use case? Why not skip the test before any subtests are started?

    @zware
    Copy link
    Member Author

    zware commented Dec 17, 2015

    Martin Panter added the comment:

    Okay, so you have a test with subtests. You have presented three cases:

    1. Single subtest which passes. No problem I assume.

    Or several subtests which pass. No problems.

    1. Two subtests: 1st fails, 2nd passes. This is how subtests are normally used, so I guess there is no problem. Is that right?

    Any of multiple subtests fail, and there is no indication in the
    "summary line" (the line that is usually "..........................",
    a dot for each successful test). When a a regular test fails, an F
    (or an E, if the raised exception was anything but
    self.failureException) is added to the line; when any subtests fail,
    nothing is added. If you have 10 tests methods that use subtests, and
    any subtest in each method fails, your summary line will be blank. In
    verbose mode, you'd get "test_one ... test_two ... test_three ... ..."
    (note lack of newlines) instead of the expected "test_one ...
    FAILURE\ntest_two ... FAILURE\ntest_three ... FAILURE\n..." (note the
    newlines).

    1. After two subtests have already run (one of which failed), SkipTest is raised. I guess you want the test results to be reported better in this case.

    What is the use case? Why not skip the test before any subtests are started?

    Only the subtest is skipped (which should be valid, or documented as
    not valid), and the order of the subtests doesn't matter:

    $ tests=210 ./python.exe subtest_test.py -v
    test_subTest (__main__.TestClass) ... skipped 'skipped'

    ======================================================================
    FAIL: test_subTest (main.TestClass) (<subtest>)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "subtest_test.py", line 14, in test_subTest
        self.assertTrue(t)
    AssertionError: 0 is not true

    Ran 1 test in 0.001s

    FAILED (failures=1, skipped=1)

    But, the summary makes it seem as though the entire test was skipped.

    Hopefully this makes it a bit clearer :)

    @bitdancer
    Copy link
    Member

    I believe this was discussed at the time subTest was added and deemed an acceptable tradeoff for a simpler implementation. I'm not sure it is, but I'm not prepared to write code to fix it :) I'm bothered every time I see this, but I have to admit that the tracebacks are the most important feedback and you do get those.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 18, 2015

    Yes now I understand. If a subtest fails, there is no status update (not even a newline in verbose mode), and each subtest skip triggers a separate status update.

    My gut feeling is that any subtest failure should be counted as the whole test failing. I’m not sure how the failure vs error cases should be handled. Maybe error should trump failure.

    Judging by <https://bugs.python.org/issue16997#msg180259\>, Antoine intended for SkipTest to skip subtests. But I’m not sure that be reported as the whole test being skipped.

    @ezio-melotti
    Copy link
    Member

    My gut feeling is that any subtest failure should be counted as the
    whole test failing. I’m not sure how the failure vs error cases should
    be handled. Maybe error should trump failure.

    I think the priority should be error > failure > skip > pass.
    IOW, pass should never be reported if any of the other 3 happen, skip should be reported only if there are no errors/failures, and errors should trump failures.

    @rbtcollins
    Copy link
    Member

    The basic model is this:

    • a test can have a single outcome [yes, the api is ambiguous, but there it is]
    • subtests let you identify multiple variations of a single test (note the id tweak etc) and *may* be reported differently

    We certainly must not report the test as a whole passing if any subtest did not pass.

    Long term I want to remove the error/failure partitioning of exceptions; its not actually useful.

    The summary for the test, when subtests are used, should probably enumerate the states.

    test_foo (3 passed, 2 skipped, 1 failure)

    in much the same way the run as a whole is enumerated.

    @serhiy-storchaka
    Copy link
    Member

    Were not subtests proposed as a more flexible replacement of parametrized tests? I think that every subtest should be counted as a separate test case: in verbose mode it should output a separate line and in non-verbose mode it should output a separate character.

    @serhiy-storchaka
    Copy link
    Member

    PR 28082 is a draft that implements this idea. Skipped and failed (but not successfully passed) subtests are now reported separately, as a character (sFE) or a line ("skipped", "FAIL", "ERROR"). The description of the subtest is included in the line. For example:

    $ tests=.sFE ./python test_issue25894.py 
    sFE

    ======================================================================
    ERROR: test_subTest (main.TestClass) [3] (t='E')
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython/test_issue25894.py", line 15, in test_subTest
        raise Exception('error')
        ^^^^^^^^^^^^^^^^^^^^^^^^
    Exception: error

    ======================================================================
    FAIL: test_subTest (main.TestClass) [2] (t='F')
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython/test_issue25894.py", line 13, in test_subTest
        self.fail('failed')
        ^^^^^^^^^^^^^^^^^^^
    AssertionError: failed

    Ran 1 test in 0.001s

    FAILED (failures=1, errors=1, skipped=1)

    $ tests=.sFE ./python test_issue25894.py -v
    test_subTest (__main__.TestClass) ... 
      test_subTest (__main__.TestClass) [1] (t='s') ... skipped 'skipped'
      test_subTest (__main__.TestClass) [2] (t='F') ... FAIL
      test_subTest (__main__.TestClass) [3] (t='E') ... ERROR

    ======================================================================
    ERROR: test_subTest (main.TestClass) [3] (t='E')
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython/test_issue25894.py", line 15, in test_subTest
        raise Exception('error')
        ^^^^^^^^^^^^^^^^^^^^^^^^
    Exception: error

    ======================================================================
    FAIL: test_subTest (main.TestClass) [2] (t='F')
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython/test_issue25894.py", line 13, in test_subTest
        self.fail('failed')
        ^^^^^^^^^^^^^^^^^^^
    AssertionError: failed

    Ran 1 test in 0.001s

    FAILED (failures=1, errors=1, skipped=1)

    As a side effect, the test description is also repeated for every error in the test cleanup code (in tearDown() and doCleanup()).

    Similar changes should be added also in RegressionTestResult. If apply bpo-45057 first they will be much simpler.

    bpo-29152 can be related. If call addError() and addFailure() from addSubTest(), PR 28082 should be rewritten.

    @serhiy-storchaka
    Copy link
    Member

    Any suggestions for the format of output? Currently PR 28082 formats lines for subtest skipping, failure or error with a 2-space identation. Lines for skipping, failure or error in tearDown() or functions registered with addCallback() do not differ from a line skipping, failure or error in the test method itself.

    I am not sure about backporting this change. On one hand, it fixes an old flaw in the unittest output. On other hand, the change affects not only subtests and can confuse programs which parse the unittest output because the test descriptions can occur multiple times.

    @serhiy-storchaka serhiy-storchaka added the 3.11 only security fixes label Sep 5, 2021
    @ambv
    Copy link
    Contributor

    ambv commented Sep 10, 2021

    New changeset f0f29f3 by Serhiy Storchaka in branch 'main':
    bpo-25894: Always report skipped and failed subtests separately (GH-28082)
    f0f29f3

    @ambv
    Copy link
    Contributor

    ambv commented Sep 10, 2021

    Any suggestions for the format of output? Currently PR 28082 formats lines for subtest skipping, failure or error with a 2-space identation. Lines for skipping, failure or error in tearDown() or functions registered with addCallback() do not differ from a line skipping, failure or error in the test method itself.

    I'm fine with that. Ultimately I don't think differentiating subtest status from method status is that important.

    I am not sure about backporting this change.

    Since we're too late for 3.10.0 and it isn't a trivially small change that changes test output, I think we shouldn't be backporting it. There are tools that parse this output in editors. I'm afraid that it's too risky since we haven't given the community enough time to test for output difference.

    I'm marking this as resolved. Thanks for your patch, Serhiy! If you feel strongly about backporting, we'd have to reopen and mark this as release blocker.

    @ambv ambv closed this as completed Sep 10, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    This issue was closed.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants