Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unittest output drives developers to avoid docstrings #90284

Closed
jaraco opened this issue Dec 18, 2021 · 23 comments · Fixed by #32288
Closed

Unittest output drives developers to avoid docstrings #90284

jaraco opened this issue Dec 18, 2021 · 23 comments · Fixed by #32288
Assignees
Labels
3.11 only security fixes tests Tests in the Lib/test dir

Comments

@jaraco
Copy link
Member

jaraco commented Dec 18, 2021

BPO 46126
Nosy @terryjreedy, @jaraco, @merwok, @methane, @ethanfurman, @miss-islington
PRs
  • bpo-46126: Disable 'descriptions' when running tests internally. #30194
  • bpo-46126: Restore 'descriptions' when running tests internally. #32128
  • gh-90284: Restore docstrings in importlib.metadata tests. #32288
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/jaraco'
    closed_at = None
    created_at = <Date 2021-12-18.23:08:24.144>
    labels = ['tests', '3.11']
    title = 'Unittest output drives developers to avoid docstrings'
    updated_at = <Date 2022-04-03.19:33:37.676>
    user = 'https://github.com/jaraco'

    bugs.python.org fields:

    activity = <Date 2022-04-03.19:33:37.676>
    actor = 'miss-islington'
    assignee = 'jaraco'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2021-12-18.23:08:24.144>
    creator = 'jaraco'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46126
    keywords = ['patch']
    message_count = 23.0
    messages = ['408876', '408878', '408880', '408881', '408882', '408883', '408884', '408885', '409159', '409161', '409478', '411279', '411280', '415645', '415649', '415826', '415834', '415870', '416068', '416081', '416088', '416091', '416641']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'jaraco', 'eric.araujo', 'methane', 'ethan.furman', 'miss-islington']
    pr_nums = ['30194', '32128', '32288']
    priority = 'normal'
    resolution = 'works for me'
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46126'
    versions = ['Python 3.11']

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 18, 2021

    In python/importlib_metadata#302, I learned that the way unittest reports failures in tests is incentivizing the replacement of docstrings with comments in order not to make resolution of the relevant failing test more difficult to locate.

    I presume I don't need to explain why docstrings are nice and preferable over comments.

    Better would be for unittest to provide an option to ignore the docstrings or to emit the test path regardless of whether a docstring was present and to employ that option in CPython, allowing for docstrings in tests.

    @jaraco jaraco added 3.11 only security fixes tests Tests in the Lib/test dir labels Dec 18, 2021
    @ethanfurman
    Copy link
    Member

    Could you give a couple examples for those of us not familiar with the problem?

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    Absolutely. I was thinking to do just that.

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    I created this diff:

    diff --git a/Lib/test/test_importlib/test_metadata_api.py b/Lib/test/test_importlib/test_metadata_api.py
    index e16773a7e8..92aacd5ad5 100644
    --- a/Lib/test/test_importlib/test_metadata_api.py
    +++ b/Lib/test/test_importlib/test_metadata_api.py
    @@ -90,8 +90,11 @@ def test_entry_points_distribution(self):
                 self.assertEqual(ep.dist.version, "1.0.0")
     
         def test_entry_points_unique_packages(self):
    -        # Entry points should only be exposed for the first package
    -        # on sys.path with a given name.
    +        """
    +        Entry points should only be exposed for the first package
    +        on sys.path with a given name.
    +        """
    +        raise ValueError("Failing on purpose")
             alt_site_dir = self.fixtures.enter_context(fixtures.tempdir())
             self.fixtures.enter_context(self.add_sys_path(alt_site_dir))
             alt_pkg = {

    And then ran the tests, but the output is easy to totally scrutable:

    cpython bpo-46126/bad-error-message $ ./python.exe -m test.test_importlib
    ........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................s............................E.................................................x......................................................................................................s....................................................................................................................................................................Trying 20 threads ... 44.7 ms OK.
    Trying 50 threads ... 36.8 ms OK.
    Trying 20 threads ... 27.7 ms OK.
    Trying 50 threads ... 28.0 ms OK.
    Trying 20 threads ... 27.9 ms OK.
    Trying 50 threads ... 31.1 ms OK.
    .Trying 20 threads ... 7.1 ms OK.
    Trying 50 threads ... 7.6 ms OK.
    Trying 20 threads ... 3.2 ms OK.
    Trying 50 threads ... 8.5 ms OK.
    Trying 20 threads ... 3.4 ms OK.
    Trying 50 threads ... 8.7 ms OK.
    .Trying 20 threads ... 40.3 ms OK.
    Trying 50 threads ... 8.7 ms OK.
    Trying 20 threads ... 3.5 ms OK.
    Trying 50 threads ... 6.5 ms OK.
    Trying 20 threads ... 3.2 ms OK.
    Trying 50 threads ... 6.5 ms OK.
    ..................................................s............................s................................................s............................s...............
    ======================================================================
    ERROR: test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests)
    Entry points should only be exposed for the first package
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/jaraco/code/public/cpython/Lib/test/test_importlib/test_metadata_api.py", line 97, in test_entry_points_unique_packages
        raise ValueError("Failing on purpose")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ValueError: Failing on purpose
    
    ----------------------------------------------------------------------
    Ran 1426 tests in 2.377s
    
    FAILED (errors=1, skipped=6, expected failures=1)
    

    So there must be some other test invocation that doesn't provide the clarity of which test is failing.

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    My guess is the tests were run with something like this:

    ./python.exe -E -We -m test -v test_importlib | grep '... ERROR'
    Entry points should only be exposed for the first package ... ERROR
    test test_importlib failed
    

    In that case, the location of the test is replaced with the first line of the docstring. If there's no docstring, however, the location of the failure is easier to detect:

    $ ./python.exe -E -We -m test -v test_importlib | grep '... ERROR'
    test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests) ... ERROR
    test test_importlib failed
    

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    Looks like the code, I believe here's where the reporting happens:

    if self.descriptions and doc_first_line:
    return '\n'.join((str(test), doc_first_line))
    else:
    return str(test)

    I see a "description" property there that may already provide the feature.

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    It looks like that 'descriptions' attribute is passed through from TextTestRunner, which sets it to True by default (

    def __init__(self, stream=None, descriptions=True, verbosity=1,
    ). It seems that behavior isn't controllable through any command-line options or environment variables.

    @jaraco
    Copy link
    Member Author

    jaraco commented Dec 19, 2021

    After some investigation (and tracing calls through a dozen or more layers), I found that Python's own regression tests can disable this behavior thus:

    diff --git a/Lib/test/support/testresult.py b/Lib/test/support/testresult.py
    index 2cd1366cd8..328ca8760e 100644
    --- a/Lib/test/support/testresult.py
    +++ b/Lib/test/support/testresult.py
    @@ -145,7 +145,8 @@ def get_test_runner_class(verbosity, buffer=False):
             return functools.partial(unittest.TextTestRunner,
                                      resultclass=RegressionTestResult,
                                      buffer=buffer,
    -                                 verbosity=verbosity)
    +                                 verbosity=verbosity,
    +                                 descriptions=False,)
         return functools.partial(QuietRegressionTestRunner, buffer=buffer)
     
     def get_test_runner(stream, verbosity, capture_output=False):
    

    Combined with the above diff (with a docstring), the output is now as desired:

    $ ./python.exe -m test -v test_importlib | grep '... ERROR'
    test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests) ... ERROR
    test test_importlib failed
    

    It seems it would be wise to enable this setting by default, given that without it, every test module is forced to replace docstrings with comments.

    @merwok
    Copy link
    Member

    merwok commented Dec 24, 2021

    I presume I don't need to explain why docstrings are nice and preferable over comments.

    Actually, can you?

    Docstrings to document regular modules, classes or functions are a fine tool, especially with tools that extract them (pydoc, sphinx). I don’t see the same need for test functions.

    @terryjreedy
    Copy link
    Member

    I also use comments in lieu of docstrings for text_xyx methods because I find the addition of the first line of the docstring in error reports to be useless, distracting, and sometimes, depending on the content, confusing. If the error is in the test method, the full comment is available when looking back at the method code. If the error is in the tested code, the comment/docstring is inapplicable.

    When I edit test files, I run them directly from an IDLE editor via the 'if __name__' clause. When I run all IDLE tests from Command Prompt, I usually run 'python -m test.test_idle'. (Similar but not necessarily identical to 'python -m -ugui test_idle'.) So fiddling with regrtest will not help in either case.

    Based on the above, the following seems to work.
    runner = unittest.TextTestRunner(descriptions=False, verbosity=2)
    unittest.main(testRunner=runner)
    I am not sure what passing a runner does, versus leaving the default None, but the verbosity passed to the runner overrides and verbosity argument passed to main. What I would like is 'descriptions' added as a parameter to main, so that
    unittest.main(descriptions=False, verbosity=2)
    would work.

    @jaraco
    Copy link
    Member Author

    jaraco commented Jan 1, 2022

    > I presume I don't need to explain why docstrings are nice and preferable over comments.

    Actually, can you?

    I searched around and didn't find any good treatise or thorough overview of reasons _why_ docstrings should be preferred, so I performed an internal deconstruction of my motivations and published this article on my blog.

    @miss-islington
    Copy link
    Contributor

    New changeset a941e59 by Jason R. Coombs in branch 'main':
    bpo-46126: Disable 'descriptions' when running tests internally. (GH-30194)
    a941e59

    @jaraco
    Copy link
    Member Author

    jaraco commented Jan 22, 2022

    I've merged the fix for regrtest and I'll explore Terry's concerns and see what I can devise for those concerns as well.

    @jaraco
    Copy link
    Member Author

    jaraco commented Mar 21, 2022

    In an attempt to replicate Terry's usage, I added the following patch:

    diff --git a/Lib/idlelib/idle_test/test_text.py b/Lib/idlelib/idle_test/test_text.py
    index 0f31179e04..be977bbfff 100644
    --- a/Lib/idlelib/idle_test/test_text.py
    +++ b/Lib/idlelib/idle_test/test_text.py
    @@ -20,6 +20,10 @@ def test_init(self):
             self.assertEqual(self.text.get('end'), '')
     
         def test_index_empty(self):
    +        """
    +        Failing test with bad description.
    +        """
    +        raise ValueError()
             index = self.text.index
     
             for dex in (-1.0, 0.3, '1.-1', '1.0', '1.0 lineend', '1.end', '1.33',
    

    When I did, I then ran the tests using the -m launcher:

    $ ./python.exe -m test.test_idle -v -k test_index_empty
    idlelib.idle_test.test_configdialog (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_debugger (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_editmenu (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_help (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_parenmatch (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_percolator (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_replace (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_scrolledlist (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_search (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    test_index_empty (idlelib.idle_test.test_text.MockTextTest)
    Failing test with bad description. ... ERROR
    setUpClass (idlelib.idle_test.test_text.TkTextTest) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_textview (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_tooltip (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_tree (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    idlelib.idle_test.test_undo (unittest.loader.ModuleSkipped) ... skipped 'cannot run without OS X gui process'
    
    ======================================================================
    ERROR: test_index_empty (idlelib.idle_test.test_text.MockTextTest)
    Failing test with bad description.
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/jaraco/code/public/cpython/Lib/idlelib/idle_test/test_text.py", line 26, in test_index_empty
        raise ValueError()
        ^^^^^^^^^^^^^^^^^^
    ValueError
    
    ----------------------------------------------------------------------
    Ran 14 tests in 0.001s
    
    FAILED (errors=1, skipped=14)
    

    As you can see, the location of the failing test in the log is masked, and instead the description is present.

    I've pushed https://github.com/python/cpython/tree/bpo-46126/disable-descriptions-unittest-m, which when applied disables the descriptions.

    This approach works as a surgical intervention, wrapping unittest.main in such a way that it could be replaced across the test suite to disable descriptions. By subclassing the TextTestRunner, it continues to honor other parameters from the command-line (such as verbosity).

    Terry, would you review the concept? Does it meet your needs? If so, I can apply it more broadly and prepare a PR.

    @jaraco jaraco assigned terryjreedy and unassigned jaraco Mar 21, 2022
    @methane
    Copy link
    Member

    methane commented Mar 21, 2022

    As you can see, the location of the failing test in the log is masked, and instead the description is present.

    Could you elaborate?

    test_index_empty (idlelib.idle_test.test_text.MockTextTest)
    Failing test with bad description. ... ERROR
    (snip)
    ======================================================================
    ERROR: test_index_empty (idlelib.idle_test.test_text.MockTextTest)
    Failing test with bad description.
    ----------------------------------------------------------------------
    

    I can see test_index_empty (idlelib.idle_test.test_text.MockTextTest) in both places. What is masked?

    @jaraco
    Copy link
    Member Author

    jaraco commented Mar 22, 2022

    Inada, you're right. Good catch. I think I missed that behavior from before because I used grep ERROR and when I ran it this time, I just assumed the output would be the same, but it's clear that even with descriptions enabled, the location of the test is clearly reported in the case of launching with -m unittest.

    I'm now suspicious that my report in https://bugs.python.org/issue46126#msg408882 was also flawed as it uses grep ERROR, which could easily mask lines that adjacent to the error.

    I now believe that there are no concerns relating to Terry's concern, and likely not Brett's concern despite my best efforts to replicate. I'll bring back the original example and see if it in fact has the reported defect.

    @jaraco jaraco assigned jaraco and unassigned terryjreedy Mar 22, 2022
    @jaraco
    Copy link
    Member Author

    jaraco commented Mar 22, 2022

    And indeed, after removing the grep ERROR part of the repro, even the repro seems to be invalid:

    cpython main $ ./python.exe -m test.test_importlib -v -k test_entry_points_unique
    test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests)
    Entry points should only be exposed for the first package ... ERROR
    test.test_importlib.test_windows (unittest.loader.ModuleSkipped) ... skipped "No module named 'winreg'"
    
    ======================================================================
    ERROR: test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests)
    Entry points should only be exposed for the first package
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/jaraco/code/public/cpython/Lib/test/test_importlib/test_metadata_api.py", line 97, in test_entry_points_unique_packages
        raise ValueError("Failing on purpose")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ValueError: Failing on purpose
    
    ----------------------------------------------------------------------
    Ran 2 tests in 0.006s
    
    FAILED (errors=1, skipped=1)
    

    I'm effectively unable to replicate the behavior reported by Brett in the other issue (unless | grep ERROR is consider an important use-case to support).

    At this point, I propose one of two options:

    1. Back out PR 30194 and restore the prior behavior with descriptions.
    2. Leave PR 30194 to suppress descriptions for the default runner.

    And then simply declare that docstrings are acceptable for CPython tests.

    Given that descriptions are on by default, I'm slightly inclined toward (1).

    Thoughts?

    @merwok
    Copy link
    Member

    merwok commented Mar 23, 2022

    I think the situation and the discussion should be summarized on python-dev!

    @jaraco
    Copy link
    Member Author

    jaraco commented Mar 26, 2022

    I've confirmed that prior to the patch in PR 30194, the location of the failure was indeed reported. It was just not reported on the same line:

    cpython main $ git log -1
    commit 3e93af0b06cada874c4a16868b6f863b599919f2 (HEAD -> main)
    Author: Jason R. Coombs <jaraco@jaraco.com>
    Date:   Sat Mar 26 10:45:58 2022 -0400
    
        Revert "bpo-46126: Disable 'descriptions' when running tests internally. (GH-30194)"
        
        This reverts commit a941e5927f7f2540946813606c61c6aea38db426.
    cpython main $ git diff
    diff --git a/Lib/test/test_importlib/test_metadata_api.py b/Lib/test/test_importlib/test_metadata_api.py
    index b3627cbb75..7e3bb09cf2 100644
    --- a/Lib/test/test_importlib/test_metadata_api.py
    +++ b/Lib/test/test_importlib/test_metadata_api.py
    @@ -90,8 +90,11 @@ def test_entry_points_distribution(self):
                 self.assertEqual(ep.dist.version, "1.0.0")
     
         def test_entry_points_unique_packages(self):
    -        # Entry points should only be exposed for the first package
    -        # on sys.path with a given name.
    +        """
    +        Entry points should only be exposed for the first package
    +        on sys.path with a given name.
    +        """
    +        raise ValueError("Failing on purpose")
             alt_site_dir = self.fixtures.enter_context(fixtures.tempdir())
             self.fixtures.enter_context(self.add_sys_path(alt_site_dir))
             alt_pkg = {
    cpython main $ ./python.exe -E -We -m test -v test_importlib | grep entry_points_unique_packages
    test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests)
    ERROR: test_entry_points_unique_packages (test.test_importlib.test_metadata_api.APITests)
      File "/Users/jaraco/code/public/cpython/Lib/test/test_importlib/test_metadata_api.py", line 97, in test_entry_points_unique_packages
    test test_importlib failed
    

    The description is given _in addition_ to the location of the test.

    That means the concern reported in #302 was actually invalid.

    I also was under the false impression that the description was masking the test location, but instead, it's just enhancing it (while also injecting a newline).

    Given this renewed understanding, I believe it's appropriate to back out the PR.

    I think the situation and the discussion should be summarized on python-dev!

    Great suggestion. Will do.

    @ethanfurman
    Copy link
    Member

    Perhaps we could make an enhancement then? Having the extra information on a separate line is, at least for me, very jarring -- as in, I hadn't figured out that that was the way it was done until Inadasan pointed it out.

    Perhaps a command-line switch to enable having it all on one line? That would also help when grepping. (I would propose it to be the default behavior, but I can see that that would result in very long lines.)

    @jaraco
    Copy link
    Member Author

    jaraco commented Mar 26, 2022

    My goal with this issue is to simply unblock the use of docstrings in Python tests. I believe with the work above and the proposed published post, I've accomplished that goal. I've already spent more time than I'd hoped on this issue and would like to resolve it at its original scope.

    I do see how this issue reveals some deficiencies with the current UI/UX, and I'd welcome a separate feature request to address those deficiencies. I'll even help draft the bug report on request.

    @ethanfurman
    Copy link
    Member

    That makes sense.

    bpo-47133 created.

    @miss-islington
    Copy link
    Contributor

    New changeset 84acb5c by Jason R. Coombs in branch 'main':
    bpo-46126: Restore 'descriptions' when running tests internally. (GH-32128)
    84acb5c

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes tests Tests in the Lib/test dir
    Projects
    None yet
    Development

    Successfully merging a pull request may close this issue.

    6 participants