explicitly passed parametrize ids do not get escaped to ascii #1351

Closed
davehunt opened this Issue Feb 1, 2016 · 16 comments

@davehunt
Contributor
davehunt commented Feb 1, 2016

If values in parametrize contain non-ASCII bytes, they are escaped when the IDs are generated. However, if the same objects are provided via the ids argument, they are not escaped. As a result, the IDs can't be used with pylib's XML/HTML generation.

This was discovered and previously discussed in https://github.com/davehunt/pytest-html/pull/30

To reproduce, create a conftest.py with the following contents:

# -*- coding: utf-8 -*-
def pytest_runtest_logreport(report):
    # print report.nodeid  # test_unicode.py::test_unicode[〈]
    print unicode(report.nodeid)  # UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)
    # print report.nodeid.encode('string-escape')  # test_unicode.py::test_unicode[〈]
    # print unicode(report.nodeid.encode('string-escape'))  # test_unicode[\xe3\x80\x88]

Then, create a test file with the following contents:

# -*- coding: utf-8 -*-
import pytest
@pytest.mark.parametrize('val', ['foo'], ids=['〈'])
def test_unicode(val):
    pass

Run the test, and you will get the following traceback:

test_unicode.py 
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 90, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 121, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 146, in pytest_runtestloop
INTERNALERROR>     item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 595, in execute
INTERNALERROR>     return _wrapped_call(hook_impl.function(*args), self.execute)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 253, in _wrapped_call
INTERNALERROR>     return call_outcome.get_result()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 279, in get_result
INTERNALERROR>     _reraise(*ex)  # noqa
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 264, in __init__
INTERNALERROR>     self.result = func()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 65, in pytest_runtest_protocol
INTERNALERROR>     runtestprotocol(item, nextitem=nextitem)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 72, in runtestprotocol
INTERNALERROR>     rep = call_and_report(item, "setup", log)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 123, in call_and_report
INTERNALERROR>     hook.pytest_runtest_logreport(report=report)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/workspace/pytest-scratchpad/conftest.py", line 6, in pytest_runtest_logreport
INTERNALERROR>     print unicode(report.nodeid)  # UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)
INTERNALERROR> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)
@RonnyPfannschmidt
Contributor

at second glance, this is breaking as expected, print on python2 will break on unicode

please verify

@The-Compiler
Member

No, the unicode() call is what breaks. In the case of pytest-html, it happens when trying to add the element to the HTML.

The question is whether pytest should always make sure that ids are valid unicode objects, by applying _escape_bytes if they are bytes as well.

@RonnyPfannschmidt
Contributor

my understanding is that node ids are currently native strings, so bytes on python2
if we want to make them unicode in all cases, then we need a major release

@The-Compiler
Member

Sorry, my comment above is wrong - it should ensure they're always ascii-only native strings (i.e. bytes on Python 2), like it does with the autogenerated IDs.

For example, with this conftest.py:

from __future__ import print_function

def pytest_runtest_logreport(report):
    if report.when == 'call':
        print('\n  nodeid: {!r}'.format(report.nodeid))

and this test_foo.py:

# encoding: utf-8

import pytest

@pytest.mark.parametrize('foo', ['büpf'])
def test_args(foo):
    pass

@pytest.mark.parametrize('foo', ['a'], ids=['büpf'])
def test_ids(foo):
    pass

The autogenerated ID gets escaped to be ascii only, but not the ids one:

$ python2 -m py.test  -v
[...]

test_foo.py::test_args[b\xc3\xbcpf] PASSED
  nodeid: 'test_foo.py::test_args[b\\xc3\\xbcpf]'

test_foo.py::test_ids[büpf] PASSED
  nodeid: 'test_foo.py::test_ids[b\xc3\xbcpf]'
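
The difference between the two node ids can be checked mechanically. The following is a minimal sketch only: the helper name `is_ascii_only` is made up for illustration and is not part of pytest. It assumes a node id counts as acceptable when every byte (or character) falls in the ASCII range.

```python
def is_ascii_only(nodeid):
    """Return True if nodeid (bytes or text) contains only ASCII."""
    if isinstance(nodeid, bytes):
        # bytearray yields ints on both Python 2 and Python 3
        return all(byte < 128 for byte in bytearray(nodeid))
    try:
        nodeid.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


# The two node ids from the output above, written as byte strings.
escaped = b'test_foo.py::test_args[b\\xc3\\xbcpf]'
unescaped = b'test_foo.py::test_ids[b\xc3\xbcpf]'

print(is_ascii_only(escaped))    # True: the autogenerated id was escaped
print(is_ascii_only(unescaped))  # False: the explicit id still holds raw UTF-8 bytes
```

The second id is the one that trips up `unicode()` on Python 2, since the implicit ASCII decode fails on byte 0xc3.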

@RonnyPfannschmidt changed the title from "bytes objects in parametrize ID lists are not encoded" to "explicitly passed parametrize ids do not get escaped to ascii" on Feb 11, 2016

@RonnyPfannschmidt
Contributor

@hpk42 ping?

i'm under the impression we should either ensure the ids are always ascii,
and warn if the given ids do not match canonical ascii, or support unicode

since python is growing more international and identifiers are unicode in 3.x, i think we should support unicode node ids simply because that is what we will get in future
for non-ascii byte string node ids we should issue a pytest-warning and escape them as if they were byte strings

@ceridwen
Contributor

I'd like to try this bug for my Outreachy application for pytest-html. What's the final conclusion on what parametrize should do with the ids arguments? I'd assume it's probably better to support Unicode ids. Should I call the same _escape_bytes() that's being used in _idval?

@The-Compiler
Member

FWIW, @RonnyPfannschmidt's proposal makes sense from my POV.

@RonnyPfannschmidt
Contributor

a simplistic implementation needs to ensure a few details

  1. ensure non-ascii byte strings are escaped to something ascii-ish
  2. ensure unicode/utf8 is mapped correctly to text

we should always byte-escape non-alphanumeric id values that are byte strings
we should always unicode-escape unicode id values
the target is that node ids are always ascii text in the end
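
The two rules above could look roughly like this. This is a sketch under stated assumptions, not the code that ended up in pytest: it is written for Python 3.5+, the function name `ascii_escape_id` is hypothetical, and the bytes branch only escapes non-ASCII bytes rather than everything non-alphanumeric.

```python
def ascii_escape_id(val):
    """Return an ASCII-only text version of an id value (illustrative only)."""
    if isinstance(val, bytes):
        # Byte strings: keep ASCII bytes, turn every other byte into \xNN.
        # 'backslashreplace' is accepted for decoding since Python 3.5.
        return val.decode('ascii', 'backslashreplace')
    # Text: non-ASCII characters become \xNN, \uNNNN or \UNNNNNNNN escapes.
    return val.encode('unicode_escape').decode('ascii')


print(ascii_escape_id(b'b\xc3\xbcpf'))  # b\xc3\xbcpf
print(ascii_escape_id('b\u00fcpf'))     # b\xfcpf
```

Either way the result is plain ASCII text, so the same id can be printed, written to a report, or used as a lookup key without hitting an encoding boundary.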

@nicoddemus
Member

@RonnyPfannschmidt in your proposal do you mean Python 2 and 3, or only Python 2? Should we strive to support full unicode on Python 3?

@RonnyPfannschmidt
Contributor

for both - even if we support full unicode id's we still will run into problems on windows, osx and other things

@nicoddemus
Member

Displaying them in the terminal for example?

@RonnyPfannschmidt
Contributor

yes, also they are sometimes used for filenames/lookup keys - making them consistent ascii text in all cases seems sensible simply to avoid platform issues on unicode boundaries

@nicoddemus
Member

Fair enough, sounds reasonable. Thanks! 😁

@ceridwen
Contributor

A draft pull request is up at #1463.

@ceridwen
Contributor

New pull request targeted at features is up at #1470.

@The-Compiler
Member

Closing as the PR was merged
