Skip to content
This repository has been archived by the owner on Jul 23, 2022. It is now read-only.

moreati/b-prefix-all-the-doctests

Repository files navigation

pretext

Attention!

pretext 0.0.5 will be the final release of this module. I've agreed to transfer the package name to the PreTeXt books project. To continue using this module pin you version to pretext<0.1.

Write doctests with strings that work in Python 2.x and Python 3.x.

Just import pretext and call pretext.activate(). By default Python 3.x repr() behaviour is used

>>> import pretext; pretext.activate()
>>> b'Byte strings now have the same repr on Python 2.x & 3.x'
b'Byte strings now have the same repr on Python 2.x & 3.x'
>>> u'Unicode strings & nested strings work too'.split()
['Unicode', 'strings', '&', 'nested', 'strings', 'work', 'too']

The problem

Suppose you have the following functions and doctests

def textfunc():
    """
    >>> textfunc()
    u'A unicode string'
    """
    return u'A unicode string'

def bytesfunc()
    """
    >>> bytesfunc()
    b'A byte string'
    """
    return b'A byte string'

Without pretext

  • textfunc() will pass on Python 2.x, but fail on 3.x.
  • bytesfunc() will fail on Python 2.x, but pass on 3.x.

This is because doctest compares the expected result (from the doc-string) with the repr() of the returned value.

  • repr(u'foo') returns u'foo' on Python 2.x, but 'foo' on 3.x
  • repr(b'bar') returns 'bar' on Python 2.x, but b'bar' on 3.x

If the doctests are editted to remove the prefixes

def textfunc():
    """
    >>> textfunc()
    'A unicode string'
    """
    return u'A unicode string'

def bytesfunc()
    """
    >>> bytesfunc()
    'A byte string'
    """
    return b'A byte string'

then the failures will be reversed

  • textfunc() will now fail on Python 2.x, but pass on 3.x.
  • bytesfunc() will now pass on Python 2.x, but fail on 3.x.

The hack

Replace repr() and sys.displayhook with versions that always prefix string literals, regardless of the Python version. Now the doctests can

  • directly show the string values returned by functions/methods, without resorting to print(), or .encode() etc
  • successfully test the examples on all Python versions

Proof of concept:

r"""
>>> import sys
>>> import pretext
>>> myrepr = bar.PrefixRepr()
>>> repr = myrepr.repr
>>> def _displayhook(value):
...     if value is not None:
...         sys.stdout.write(myrepr.repr(value))
>>> sys.displayhook = _displayhook
>>> u''
u''
>>> b''
b''
>>> bytes()
b''
>>> b'\0'
b'\x00'
>>> b"'"
b"'"
"""

Alternatives

If you're ready to run screaming at the above, there are alternatives

  • pytest provides #doctest: ALLOW_UNICODE and (from 2.9.0) #doctest: ALLOW_BYTES directives

  • lxml includes lxml.html.usedoctest and lxml.usedoctest modules for HTML and XML.

  • Wrap byte-string returns in bytearray(). repr(bytearray(b'abc')) == "bytearray(b'abc'))" on all versions of python that have bytearray() (2.6 onward) e.g.

    >>> bytearray(bytesfunc())
    bytearray(b'I return a byte (binary) string')
  • Support Python 3.x exclusively

  • Use print(bytesfunc().decode('ascii')) and choose your input values carefully

  • Use #doctest: +SKIP

  • Use #doctest: +ELLIPSIS

About

Experiment in doctests for strings on Python 2.x and 3.x

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages