Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

pretext

Write doctests with strings that work in Python 2.x and Python 3.x.

Just import pretext and call pretext.activate(). By default Python 3.x repr() behaviour is used

>>> import pretext; pretext.activate()
>>> b'Byte strings now have the same repr on Python 2.x & 3.x'
b'Byte strings now have the same repr on Python 2.x & 3.x'
>>> u'Unicode strings & nested strings work too'.split()
['Unicode', 'strings', '&', 'nested', 'strings', 'work', 'too']

The problem

Suppose you have the following functions and doctests

def textfunc():
    """
    >>> textfunc()
    u'A unicode string'
    """
    return u'A unicode string'

def bytesfunc()
    """
    >>> bytesfunc()
    b'A byte string'
    """
    return b'A byte string'

Without pretext

  • textfunc() will pass on Python 2.x, but fail on 3.x.
  • bytesfunc() will fail on Python 2.x, but pass on 3.x.

This is because doctest compares the expected result (from the doc-string) with the repr() of the returned value.

  • repr(u'foo') returns u'foo' on Python 2.x, but 'foo' on 3.x
  • repr(b'bar') returns 'bar' on Python 2.x, but b'bar' on 3.x

If the doctests are editted to remove the prefixes

def textfunc():
    """
    >>> textfunc()
    'A unicode string'
    """
    return u'A unicode string'

def bytesfunc()
    """
    >>> bytesfunc()
    'A byte string'
    """
    return b'A byte string'

then the failures will be reversed

  • textfunc() will now fail on Python 2.x, but pass on 3.x.
  • bytesfunc() will now pass on Python 2.x, but fail on 3.x.

The hack

Replace repr() and sys.displayhook with versions that always prefix string literals, regardless of the Python version. Now the doctests can

  • directly show the string values returned by functions/methods, without resorting to print(), or .encode() etc
  • successfully test the examples on all Python versions

Proof of concept:

r"""
>>> import sys
>>> import pretext
>>> myrepr = bar.PrefixRepr()
>>> repr = myrepr.repr
>>> def _displayhook(value):
...     if value is not None:
...         sys.stdout.write(myrepr.repr(value))
>>> sys.displayhook = _displayhook
>>> u''
u''
>>> b''
b''
>>> bytes()
b''
>>> b'\0'
b'\x00'
>>> b"'"
b"'"
"""

Alternatives

If you're ready to run screaming at the above, there are alternatives

  • pytest provides #doctest: ALLOW_UNICODE and (from 2.9.0) #doctest: ALLOW_BYTES directives

  • lxml includes lxml.html.usedoctest and lxml.usedoctest modules for HTML and XML.

  • Wrap byte-string returns in bytearray(). repr(bytearray(b'abc')) == "bytearray(b'abc'))" on all versions of python that have bytearray() (2.6 onward) e.g.

    >>> bytearray(bytesfunc())
    bytearray(b'I return a byte (binary) string')
  • Support Python 3.x exclusively

  • Use print(bytesfunc().decode('ascii')) and choose your input values carefully

  • Use #doctest: +SKIP

  • Use #doctest: +ELLIPSIS

About

Experiment in doctests for strings on Python 2.x and 3.x

Resources

License

Packages

No packages published

Languages