Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-117225: Add color to doctest output #117583

Merged
merged 27 commits into from
Apr 24, 2024

Conversation

hugovk
Copy link
Member

@hugovk hugovk commented Apr 6, 2024

The doctest output looks like:

image

I moved the existing _ANSIColors class and _can_colorize function from traceback.py added in #112732, to re-use them, cc @pablogsal.

  • Is _colorize an okay name for this module?

  • If the module has leading underscore, do we also want leading underscores in _ANSIColors and _can_colorize?

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! A few nits below -- it's a little hard to read the overall diff because of all the blank lines being added. Would you mind maybe leaving those out for now?

Lib/_colorize.py Outdated Show resolved Hide resolved
Lib/test/test_doctest/test_doctest.py Outdated Show resolved Hide resolved
Lib/test/test_doctest/test_doctest.py Outdated Show resolved Hide resolved
Lib/doctest.py Outdated Show resolved Hide resolved
Lib/doctest.py Outdated Show resolved Hide resolved
@aisk
Copy link
Member

aisk commented Apr 6, 2024

I think we should also change the British spelling 'colour' to 'color' in the PR's title, which will become the commit message after merging.

@hugovk hugovk changed the title gh-117225: Add colour to doctest output gh-117225: Add color to doctest output Apr 6, 2024
Lib/traceback.py Outdated Show resolved Hide resolved
Lib/doctest.py Outdated Show resolved Hide resolved
Lib/doctest.py Outdated Show resolved Hide resolved
Lib/doctest.py Outdated Show resolved Hide resolved
@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 6, 2024

What about also colorising the divider line that's used to separate failed examples? So it would look like this? It might help to see more clearly where one failed example ends and another begins.

image

This patch does it:

diff --git a/Lib/doctest.py b/Lib/doctest.py
index 00ff1a107c..80d0b70592 100644
--- a/Lib/doctest.py
+++ b/Lib/doctest.py
@@ -1307,7 +1307,10 @@ def report_unexpected_exception(self, out, test, example, exc_info):
             'Exception raised:\n' + _indent(_exception_traceback(exc_info)))
 
     def _failure_header(self, test, example):
-        out = [self.DIVIDER]
+        red, reset = (
+            (ANSIColors.RED, ANSIColors.RESET) if can_colorize() else ("", "")
+        )
+        out = [f"{red}{self.DIVIDER}{reset}"]
         if test.filename:
             if test.lineno is not None and example.lineno is not None:
                 lineno = test.lineno + example.lineno + 1
@@ -1621,7 +1624,7 @@ def summarize(self, verbose=None):
                     print(f" {green}{count:3d} test{s} in {name}{reset}")
 
         if failed:
-            print(self.DIVIDER)
+            print(f"{red}{self.DIVIDER}{reset}")
             print(f"{red}{_n_items(failed)} had failures:{reset}")

@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 6, 2024

Of these three summary lines at the end:

1 item had failures:
   2 of   4 in typing._should_unflatten_callable_args
***Test Failed*** 2 failures.

You currently have all three colored in red if there are errors. I think I might be inclined to keep the last line colored in red (as you have it), but not colorize the first two of those lines. I feel like having everything in red makes it harder to read, and having the last line in red is enough to draw your attention to it and communicate "bad!". What do you think?

@hugovk
Copy link
Member Author

hugovk commented Apr 7, 2024

I've added red divider lines.

Edit: need to update tests to disable colour locally.

Of these three summary lines at the end:

1 item had failures:
   2 of   4 in typing._should_unflatten_callable_args
***Test Failed*** 2 failures.

You currently have all three colored in red if there are errors. I think I might be inclined to keep the last line colored in red (as you have it), but not colorize the first two of those lines. I feel like having everything in red makes it harder to read, and having the last line in red is enough to draw your attention to it and communicate "bad!". What do you think?

Can you share a screenshot and a diff to trigger it?

I don't get all in red:

Screenshots image image

@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 7, 2024

Can you share a screenshot and a diff to trigger it?

Screenshot in #117583 (comment). For how to trigger it — I was testing by doing some silly things to some doctests in Lib/typing.py, then running ./python.exe -m doctest Lib/typing.py directly!

@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 7, 2024

It looks like many tests fail if you run tests with FORCE_COLOR, e.g. FORCE_COLOR=1 ./python.exe -m test (see #117605), and test_doctest is one of them. This patch does seem to make a few more parts of test_doctest fail with FORCE_COLOR, though:

image

Since there are so many that already fail with that environment variable set, I'm not sure it's really an issue, but thought I'd mention it anyway.

Lib/test/test__colorize.py Outdated Show resolved Hide resolved
Lib/test/test__colorize.py Outdated Show resolved Hide resolved
hugovk and others added 2 commits April 7, 2024 11:50
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! You might want to wait for a second review though (preferably from somebody who uses a light theme for their terminal 😄)

@Privat33r-dev
Copy link
Contributor

Privat33r-dev commented Apr 8, 2024

I was pleased to see that colorize now follows the ISP. I love the idea of colorization, and I think that we can continue this trend with unittests as well. Thank you for your contribution!

The minor improvement that could be done is to use stub/other way to return empty colors so instead of this:

        if can_colorize():
            bold_green = ANSIColors.BOLD_GREEN
            bold_red = ANSIColors.BOLD_RED
            green = ANSIColors.GREEN
            red = ANSIColors.RED
            reset = ANSIColors.RESET
            yellow = ANSIColors.YELLOW
        else:
            bold_green = ""
            bold_red = ""
            green = ""
            red = ""
            reset = ""
            yellow = ""

it could be something like this:

        ANSIColors = get_ansi_colors()
        bold_green = ANSIColors.BOLD_GREEN
        bold_red = ANSIColors.BOLD_RED
        green = ANSIColors.GREEN
        red = ANSIColors.RED
        reset = ANSIColors.RESET
        yellow = ANSIColors.YELLOW

Lib/_colorize.py Outdated
RED = "\x1b[31m"
RESET = "\x1b[0m"
YELLOW = "\x1b[33m"

Copy link
Contributor

@Privat33r-dev Privat33r-dev Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ANSIColorsStub = type('ANSIColorsStub', (ANSIColors,), {x: "" for x in dir(ANSIColors) if not x.startswith('__')})
def get_ansi_colors() -> ANSIColors:
return ANSIColors if can_colorize() else ANSIColorsStub

Using get_ansi_colors() preserves IDE suggestions (ctrl+space) and centralizes, as well as abstracts color availability logic, improving code's clarity and flexibility.

Lib/_colorize.py Outdated Show resolved Hide resolved
@hugovk
Copy link
Member Author

hugovk commented Apr 10, 2024

Thanks for the suggestion, something like that sounds good.

I'll wait to give #117672 a change to be merged first so as not to cause conflicts there, then will update this. I might include the refactoring as part of this PR, but mainly want to make sure this is merged before the beta cutoff on 2024-05-07 so it can be in 3.13.

I'll also add some docs, but they can be another PR and can be after the beta cutoff, if necessary.

@hugovk
Copy link
Member Author

hugovk commented Apr 17, 2024

Plan C: it's been a week and #117672 isn't merged yet. So in this PR, I've moved the colourise functionality back to the traceback module. We can refactor after that PR has been merged, and we might even be able to do that after the beta cutoff? doctest already imports traceback so there's no additional performance hit there. And it makes this patch smaller.

Comment on lines +451 to +452
GREEN = "\x1b[32m"
BOLD_GREEN = "\x1b[1;32m"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, is there a reason why these colours might have been omitted from the original changes to colourise Python's tracebacks in general? Maybe to avoid accessibility issues for people who are red-green colourblind? Or to avoid issues for people who have light themes on their terminals?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because when there's a traceback error there's nothing "good" to report in green :)

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch still looks good to me, but I'd still love it if we could get a review from somebody who uses a light theme in their terminal and/or somebody who's colourblind before we merge 👍

@hugovk
Copy link
Member Author

hugovk commented Apr 17, 2024

The widely-used pytest (116m monthly downloads) uses the same ANSI colours:

https://github.com/pytest-dev/pytest/blob/58844247f7ae4a7a213a73c8af2462253b3d8fc7/src/_pytest/_io/terminalwriter.py#L44-L66

@encukou made a good point in python/cherry-picker#120 (comment):

The 16 named colours are determined by the terminal. If there's e.g. not enough contrast between background and red in the default settings, you should probably report it to the terminal devs.

That also means you shouldn't use direct RGB values, or the 256-colour palette. (Unless you also paint the background using RGB, which looks bad if it's not a “full-screen” app.)

So, my suggestion: stick to 16 named colours, implement NO_COLOR & --color/--no-color to opt out, but don't over-analyze how your terminal renders things.

We're sticking to the 16 ANSI colours and have PYTHON_COLORS and NO_COLOR to disable.

@pablogsal How does the following look for you? Is the text still legible?

image

@hugovk hugovk merged commit 975081b into python:main Apr 24, 2024
33 checks passed
@hugovk hugovk deleted the doctest-tidy-output-colour branch April 24, 2024 11:27
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 RHEL7 LTO + PGO 3.x has failed when building commit 975081b.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/96/builds/7341) and take a look at the build logs.
  4. Check if the failure is related to this commit (975081b) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/96/builds/7341

Failed tests:

  • test_capi

Summary of the results of the build (if available):

==

Click to see traceback logs
remote: Enumerating objects: 28, done.        
remote: Counting objects:   3% (1/28)        
remote: Counting objects:   7% (2/28)        
remote: Counting objects:  10% (3/28)        
remote: Counting objects:  14% (4/28)        
remote: Counting objects:  17% (5/28)        
remote: Counting objects:  21% (6/28)        
remote: Counting objects:  25% (7/28)        
remote: Counting objects:  28% (8/28)        
remote: Counting objects:  32% (9/28)        
remote: Counting objects:  35% (10/28)        
remote: Counting objects:  39% (11/28)        
remote: Counting objects:  42% (12/28)        
remote: Counting objects:  46% (13/28)        
remote: Counting objects:  50% (14/28)        
remote: Counting objects:  53% (15/28)        
remote: Counting objects:  57% (16/28)        
remote: Counting objects:  60% (17/28)        
remote: Counting objects:  64% (18/28)        
remote: Counting objects:  67% (19/28)        
remote: Counting objects:  71% (20/28)        
remote: Counting objects:  75% (21/28)        
remote: Counting objects:  78% (22/28)        
remote: Counting objects:  82% (23/28)        
remote: Counting objects:  85% (24/28)        
remote: Counting objects:  89% (25/28)        
remote: Counting objects:  92% (26/28)        
remote: Counting objects:  96% (27/28)        
remote: Counting objects: 100% (28/28)        
remote: Counting objects: 100% (28/28), done.        
remote: Compressing objects:   6% (1/15)        
remote: Compressing objects:  13% (2/15)        
remote: Compressing objects:  20% (3/15)        
remote: Compressing objects:  26% (4/15)        
remote: Compressing objects:  33% (5/15)        
remote: Compressing objects:  40% (6/15)        
remote: Compressing objects:  46% (7/15)        
remote: Compressing objects:  53% (8/15)        
remote: Compressing objects:  60% (9/15)        
remote: Compressing objects:  66% (10/15)        
remote: Compressing objects:  73% (11/15)        
remote: Compressing objects:  80% (12/15)        
remote: Compressing objects:  86% (13/15)        
remote: Compressing objects:  93% (14/15)        
remote: Compressing objects: 100% (15/15)        
remote: Compressing objects: 100% (15/15), done.        
remote: Total 15 (delta 13), reused 2 (delta 0), pack-reused 0        
From https://github.com/python/cpython
 * branch            main       -> FETCH_HEAD
Note: checking out '975081b11e052c9f8deb42c5876104651736302e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 975081b... gh-117225: Add color to doctest output (#117583)
Switched to and reset branch 'main'

find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
find: ‘build’: No such file or directory
make[2]: [Makefile:3101: clean-retain-profile] Error 1 (ignored)

make: *** [Makefile:2232: buildbottest] Error 2

@hugovk
Copy link
Member Author

hugovk commented Apr 25, 2024

Docs PR: #118268

Refactoring PR: #118283

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants