Change (regression?) in v3.8.0a3 doctest output after capturing the stderr output from a raised warning #80876
In this project of mine, I have a tox matrix set up with Pythons from 3.3 to 3.8. I have pytest set up to run doctest on my
Under Python 3.8.0a3, though, it fails (actual local paths elided):
If I change the doctest in README to the following, where the expected output is surrounded by single-quotes instead of double-quotes, and the internal single quotes are escaped, it passes fine in 3.8.0a3:
But, naturally, it fails in 3.7 and below. It *looks* like this is probably a glitch somewhere in 3.8.0a3, where this string containing single quotes is rendered (at the REPL?) using enclosing single quotes and escaped internal single quotes, rather than enclosing double quotes and non-escaped internal single quotes?
Can you please attach a single and standalone file without dependencies like attrs so that it would help in bisecting the issue? |
I tried bisecting and arrived at commit 11a8966 (bpo-33375). It seems this commit changes the warning reporting output to add the filename. I guess it's better to change the doctest to accommodate this change. I have added the devs to the issue for confirmation.
commit 11a8966 (HEAD)
$ cat ../backups/bpo36695_1.py
def foo():
    '''
    >>> import warnings, io
    >>> from contextlib import redirect_stderr
    >>> f = io.StringIO()
    >>> with redirect_stderr(f):
    ...     warnings.warn("'foo' has no 'bar'")
    ...     err_cap = f.getvalue()
    >>> print(err_cap)
    '''
    pass

➜ cpython git:(11a8966) ./python.exe -m doctest ../backups/bpo36695_1.py
File "../backups/bpo36695_1.py", line 9, in bpo36695_1.foo
1 items had failures:

# Before 11a8966
➜ cpython git:(11a8966) git checkout 11a8966~1
File "../backups/bpo36695_1.py", line 9, in bpo36695_1.foo
1 items had failures:

I can replicate the test failure as below with 11a8966, and it passes with the commit before it.

README.rst F [ 20%]
============================================= FAILURES ==============================================
_______________________________________ [doctest] README.rst ________________________________________
077
078 **Mock** ``stderr``\ **:**
079
080 .. code ::
081
082 >>> import warnings
083 >>> with stdio_mgr() as (in_, out_, err_):
084 ... warnings.warn("'foo' has no 'bar'")
085 ... err_cap = err_.getvalue()
086 >>> err_cap
Expected:
"...UserWarning: 'foo' has no 'bar'\n..."
Got:
'<doctest README.rst[4]>:2: UserWarning: \'foo\' has no \'bar\'\n warnings.warn("\'foo\' has no \'bar\'")\n'

/home/karthi/stdio-mgr/README.rst:86: DocTestFailure
It's not obvious to me why that change to finding the source file related to the warning should affect the format of the warning message printed. It might be something that could be fixed in the warning module. But I don't understand where it's going wrong at present. |
Karthikeyan, my apologies for the slow reply -- I posted this right before I went to bed.
To emphasize: I don't think the change to the string contents from adding the filename is problematic, because I'm using ellipses to elide everything before and after my custom "warning" message. Rather, I think the problem is that the string is being rendered as a '' string instead of as a "" string; IOW:
'Test string with \'enclosing\' single quotes'
vs
"Test string with 'enclosing' double quotes"
In the interim, as you suggest, Karthikeyan, I can just conditionally skip the doctests on 3.8 with a suitable pytest -k flag.
Here is warn.py, a minimal no-dependency repro script:
The problem appears to be centered around *doctest*, as the following script DOES NOT raise AssertionError under either 3.7 or 3.8:
No problem, thanks for the simplified program. I wrote a similar one based on doctest that fails with the commit and passes before it. I am still confused about the commit's impact, and warnings also uses C code, so I hope someone else has some idea about this scenario. As you mentioned, it seems to be about doctest, which uses exec and compile. I can see the change in output, since doctest has its own internal stdout wrapper like contextlib, but using a similar exec and compile statement standalone doesn't reproduce this.
<nod>, it seems like the problem must somehow stem from the new commit using frame.f_code.co_filename (or the C equivalent), instead of using __file__ as previously. Consider this warn2.py, similar to the other but with no single quotes in the warning message:
This doctest PASSES for me in both 3.7 and 3.8; note that the expected doctest output from Why 11a8966 would break this in this way is ... baffling to me. --- Unfortunately, I don't think it will work to fix the doctest on my end simply by using |
If you look at that commit that Thomas made all it did was change where the string was grabbed from, not what type of object was used. So it doesn't make any sense as to why that would cause any specific change, so I think this may be doctest's doing. Probably the next step is for someone to find in doctest where the string representation is being printed out to understand what would potentially shift its representation (and how it's even generating that representation). |
TBH, now that I've tweaked tox and CI just not to run the doctests on 3.8, I don't really need this to be fixed. This seems like such an edge case -- a doctest catching a warning with a message containing single quotes -- it might not really be worth the effort to figure out. Unless someone is really invested in tracking this down, I would be content to close. |
I did some more debugging. doctest patches linecache, which does some regex matching when the filename is of the form <doctest <filename>[examplenumber]> to return the example source. Before the commit, it seems the absolute path was present in the warning, and hence this regex didn't match. With the commit, the filename has this format and matches the regex, so the example line is returned again. This happens with warnings inside doctest because doctest patches linecache, which warnings.py uses when formatting the warning.
In CPython, for some reason, the presence of both a single quote and a double quote inside a triple-quoted string causes the single quote to be escaped. Any concatenation with the escaped triple-quoted string also escapes the resulting text. doctest seems to store the examples as single-quoted strings that are escaped, and escaping them during _formatwarnmsg_impl causes the other one to be escaped as well. It also happens with a normal string that has an escaped double quote.
>>> a = """Test '' b""" # Two single quotes
>>> a
"Test '' b"
>>> a = """Test " b'""" # One single and double quote
>>> a
'Test " b\''
>>> a + "'c'"
'Test " b\'\'c\''
>>> a = """Test ' b""" # Only single quote
>>> a
"Test ' b"
>>> a + "'c'"
"Test ' b'c'"
>>> a = "Test ' b\"" # Escaped double quote
>>> a
'Test \' b"'
>>> a + "'a'"
'Test \' b"\'a\''
Does anyone know why single quotes get escaped when escaped quotes are present? Is this expected, and is it part of the spec how single and double quotes are swapped in the representation?
Longer explanation: take the sample doctest file below.
$ cat ../backups/bpo36695.rst
>>> import warnings # line 0
>>> warnings.warn("Test 'a'") # line 1
doctest patches linecache.getlines to a custom function:
linecache.getlines = __patched_linecache_getlines
__LINECACHE_FILENAME_RE = re.compile(r'<doctest '
                                     r'(?P<name>.+)'
                                     r'\[(?P<examplenum>\d+)\]>$')
def __patched_linecache_getlines(self, filename, module_globals=None):
    m = self.__LINECACHE_FILENAME_RE.match(filename)
    if m and m.group('name') == self.test.name:
        example = self.test.examples[int(m.group('examplenum'))]
        return example.source.splitlines(keepends=True)
    else:
        return self.save_linecache_getlines(filename, module_globals)
doctest forms a special filename, as below, that is passed to exec(compile()), and hence, as per the commit, the warning is now reported with the filename "<doctest bpo36695.rst[1]>". doctest also mocks sys.stdout internally to capture the output in a StringIO buffer. [1]
# Use a special filename for compile(), so we can retrieve
# Before commit
cpython git:(3b0b90c) ./python.exe -m doctest ../backups/bpo36695.rst
# After commit
$ cpython git:(11a896652e) ./python.exe -m doctest ../backups/bpo36695.rst
<doctest bpo36695.rst[1]>:1: UserWarning: Test 'a'
  warnings.warn("Test 'a'")
Formatting the warning message [2] calls linecache.getline with the filename "<doctest bpo36695.rst[1]>" after the commit, which in turn calls linecache.getlines. That function is patched by doctest as shown above, so the filename matches the regex and the example source warnings.warn("Test 'a'") is returned. It seems to be a triple-quoted string that is already escaped, and hence, in the code below, the line s += " %s\n" % line causes both the actual warning message and the example source line to be escaped.
def _formatwarnmsg_impl(msg):
    s = ("%s:%s: %s: %s\n"
         % (msg.filename, msg.lineno, msg.category.__name__,
            msg.message))
    if msg.line is None:
        try:
            import linecache
            line = linecache.getline(msg.filename, msg.lineno)
        except Exception:
            # When a warning is logged during Python shutdown, linecache
            # and the import machinery don't work anymore
            line = None
            linecache = None
    else:
        line = msg.line
    if line:
        line = line.strip()
        s += " %s\n" % line
[0] Line 1468 in 29d018a
[1] Line 1452 in 29d018a
[2] Line 35 in 29d018a
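The chain described above can be sketched in a few lines. This is only an illustration, not doctest's actual implementation: EXAMPLE_SOURCES, lookup_source and format_like_warnings are hypothetical names, and the source line is handed to warnings.formatwarning explicitly through its line argument rather than through a patched linecache. Only the regex is copied from the snippet quoted above.

```python
import re
import warnings

# Pattern copied from the doctest snippet quoted in this thread.
LINECACHE_FILENAME_RE = re.compile(r'<doctest '
                                   r'(?P<name>.+)'
                                   r'\[(?P<examplenum>\d+)\]>$')

# Hypothetical stand-in for doctest's list of parsed example sources.
EXAMPLE_SOURCES = ["import warnings\n", "warnings.warn(\"Test 'a'\")\n"]


def lookup_source(filename, lineno):
    """Hypothetical helper that mimics the patched linecache lookup."""
    m = LINECACHE_FILENAME_RE.match(filename)
    if m is None:
        return ""  # ordinary path: no doctest example source to return
    lines = EXAMPLE_SOURCES[int(m.group('examplenum'))].splitlines(keepends=True)
    return lines[lineno - 1] if 1 <= lineno <= len(lines) else ""


def format_like_warnings(filename, lineno):
    """Format a UserWarning, passing the looked-up source line explicitly."""
    line = lookup_source(filename, lineno)
    return warnings.formatwarning("Test 'a'", UserWarning, filename, lineno,
                                  line=line)


# Ordinary path (regex does not match) versus doctest-style filename.
before = format_like_warnings("../backups/bpo36695.rst", 1)
after = format_like_warnings("<doctest bpo36695.rst[1]>", 1)
print(repr(before))
print(repr(after))
```

With an ordinary path the formatted text contains only single quotes. With the doctest-style filename the appended source line brings in double quotes as well, which is what changes how the captured string is later displayed.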
It looks to me like it's a standard feature of the CPython string rendering routines: if both single and double quotes are present in a string, the preferred rendering is enclosure in single quotes with escaped internal single quotes. On 3.6.6, regardless of how I enter the following, it is always returned enclosed in single quotes:
>>> """ ' " """
' \' " '
>>> ''' ' " '''
' \' " '
>>> ' \' " '
' \' " '
>>> " ' \" "
' \' " '
For my particular situation, then, the problem is that my warning message, as it sits in the source, consists of a double-quoted string that contains single quotes. Then, when 3.8 doctest goes to print the source line, it has to print a string containing both single and double quotes, so the above default rendering rule kicks in and it gets printed with enclosing single quotes. For 3.7 doctest, where the regex doesn't match, the source line doesn't get printed, so the resulting string contains no double quotes and gets printed with enclosing double quotes.
Clearly, the solution is just for me to change the warning message! And indeed, changing to
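The rendering rule described above can be checked directly. A minimal sketch, assuming CPython's repr() for str; predicted_quote is a hypothetical helper written only for this illustration:

```python
def predicted_quote(text):
    """Predict which quote character repr() wraps around text."""
    # Single quotes are preferred; double quotes are used only when the
    # text contains a single quote and no double quote.
    if "'" in text and '"' not in text:
        return '"'
    return "'"


samples = [
    "plain text",                      # no quotes at all
    "'foo' has no 'bar'",              # single quotes only
    'say "hi"',                        # double quotes only
    "'foo' has no 'bar'\n warn(\"x\")",  # both kinds of quote
]

for sample in samples:
    rendered = repr(sample)
    print(rendered, "->", predicted_quote(sample))
    assert rendered[0] == predicted_quote(sample)
```

The second and fourth samples correspond to the 3.7 and 3.8 cases in this thread: once the echoed source line adds a double quote, the enclosing quote flips from double to single.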
Thanks for the update and report, Brian. |
Thank you for taking the time to dig into it so deeply! |
I'm still a bit confused why it gets escaped - as far as I know, the escaping only happens when you repr() a string, as the displayhook does automatically:
>>> a = """ a ' single and " double quote """
>>> a
' a \' single and " double quote '
>>> print(repr(a))
' a \' single and " double quote '
>>> print("%r" % a)
' a \' single and " double quote '
>>> print(a)
 a ' single and " double quote
The warnings code doesn't appear to ever repr() the message. So I guess it's some further bit of interaction with doctest. But unfortunately I don't have time to dig through doctest to try and understand it.
The application of repr() (or a repr()-equivalent) appears to occur as some part of the exec(compile(...)) call within doctest (lines 1328 to 1329 in 4f5a349).
On 3.6.6, in REPL:
Also 3.6.6, at Win cmd:
It *looks* like exec() executes the compile()'d source as if it were typed into a REPL -- IOW, any unassigned non-None return value X gets pushed to stdout as repr(X). This is then what the doctest self._fakeout captures for comparison to the 'want' of the example. |
The 'single' option to compile() means it's run like at a REPL, calling displayhook if it's an expression returning a value. But warnings shouldn't go through the displayhook, as far as I know:
>>> from contextlib import redirect_stdout, redirect_stderr
>>> from io import StringIO
>>> sio = StringIO()
>>> with redirect_stderr(sio):
...     exec(compile('import warnings; warnings.warn(""" \' " """)', 'dummyfile', 'single'))
...
>>> print(sio.getvalue())
__main__:1: UserWarning: ' "
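To make the role of the 'single' mode concrete, here is a small sketch that evaluates the same bare expression under 'single' and under 'exec'. It assumes the default sys.displayhook is installed; err_cap, namespace and run are hypothetical names, and the captured text is a made-up stand-in for a warning message.

```python
import io
from contextlib import redirect_stdout

# Stand-in for text captured from stderr; it contains both kinds of quote.
err_cap = "UserWarning: 'foo' has no 'bar'\n  warn(\"x\")\n"
namespace = {"err_cap": err_cap}


def run(mode):
    """Evaluate the bare expression `err_cap` in the given compile mode."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(compile("err_cap", "<sketch>", mode), namespace)
    return buf.getvalue()


single_out = run("single")  # expression value goes through the displayhook
exec_out = run("exec")      # expression value is discarded

print(repr(single_out))
print(repr(exec_out))
```

Under 'single' the output is the repr() of the string followed by a newline, so the quoting rule applies; under 'exec' nothing is written at all.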
Well, the warning content *itself* may not get passed through the displayhook at raise-time, in the process of being run through stderr and displayed by the REPL. But when you capture the warning content with redirect_stderr(sio) and then run ">>> sio.getvalue()", the contents of the capture from stderr, as produced by .getvalue(), *will* get passed through the displayhook, and thus be escaped.
In theory, I could have obtained a consistent 'want' by using print() as you've done. For my particular example (see OP), though, I wanted to elide the first part of the warning message, which is messy, irrelevant to my code example, and can change from Python version to Python version. As doctest is currently implemented, a 'want' can't start with an ellipsis, because it collides with the regex that detects PS2 prompts (lines 583 to 586 in 4f5a349).
See bpo-36714 (https://bugs.python.org/issue36714) for more information and a proposed enhancement/fix. |
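The collision between a leading ellipsis and the PS2 prompt can be observed with doctest's own parser. A minimal sketch, assuming the parser behaves as it did when this thread was written; the example texts are made up for illustration:

```python
import doctest

parser = doctest.DocTestParser()

# A 'want' line that starts with "... " is read as a PS2 continuation line,
# so it ends up in the example's source and the 'want' is empty.
text_with_space = '>>> print("abc def")\n... def\n'
example = parser.get_examples(text_with_space)[0]
print(repr(example.source))
print(repr(example.want))

# Without a blank after the dots, the parser rejects the line outright.
text_without_space = '>>> print("xUserWarning")\n...UserWarning\n'
try:
    parser.get_examples(text_without_space)
    error = None
except ValueError as exc:
    error = exc
print(error)
```

Either way, the intended expected output never reaches the comparison step, which is why the ELLIPSIS option cannot be used to elide the start of a printed warning.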
D'oh, yes. I missed that the failing example was displaying the captured string through displayhook. It makes sense now. Thanks for patiently explaining. :-) |
LOL. No special thanks necessary, that last post only turned into something coherent (and possibly correct, it seems...) after a LOT of diving into the source, fiddling with the code, and (((re-)re-)re-)writing! Believe me, it reads as a lot more knowledgeable and confident than I actually felt while writing it. :-D Thanks to all of you for coming along with me on this dive into the CPython internals! |