Add an assertBytesEqual to unittest and use it for bytes assertEqual #54373

Closed
bitdancer opened this issue Oct 21, 2010 · 7 comments
Assignees
Labels
type-feature A feature request or enhancement

Comments

@bitdancer
Member

BPO 10164
Nosy @rhettinger, @bitdancer, @voidspace
Files
  • bytes_multi_line_equal.diff
  • assertBytesEqual.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2010-11-22.00:07:13.576>
    created_at = <Date 2010-10-21.12:48:15.647>
    labels = ['type-feature']
    title = 'Add an assertBytesEqual to unittest and use it for bytes assertEqual'
    updated_at = <Date 2010-11-22.02:45:04.490>
    user = 'https://github.com/bitdancer'

    bugs.python.org fields:

    activity = <Date 2010-11-22.02:45:04.490>
    actor = 'r.david.murray'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2010-11-22.00:07:13.576>
    closer = 'rhettinger'
    components = []
    creation = <Date 2010-10-21.12:48:15.647>
    creator = 'r.david.murray'
    dependencies = []
    files = ['19320', '19331']
    hgrepos = []
    issue_num = 10164
    keywords = ['patch']
    message_count = 7.0
    messages = ['119280', '119360', '119484', '120034', '120105', '122030', '122074']
    nosy_count = 3.0
    nosy_names = ['rhettinger', 'r.david.murray', 'michael.foord']
    pr_nums = []
    priority = 'low'
    resolution = 'rejected'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue10164'
    versions = ['Python 3.2']

    @bitdancer
    Member Author

    Just as with regular strings, when comparing failed output involving byte strings it is really helpful to have a diff showing which bytes have changed. The attached patch adds an assertBytesMultiLineEqual method to unittest and uses it for comparing bytes in assertEqual.

    The diff output is created by first breaking the byte strings into lines using 'splitlines(True)', just as in assertMultiLineEqual, but then each line is converted to a string using 'repr' before the two sets of lines are passed to ndiff. This results in output containing escaped bytes, making it easier to compare the differences in the byte strings.
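
A minimal sketch of the approach described above, not the code from the attached patch: the helper name `bytes_line_diff` is hypothetical, and only `bytes.splitlines`, `repr` and `difflib.ndiff` are taken from the description.

```python
import difflib


def bytes_line_diff(first: bytes, second: bytes) -> str:
    """Return an ndiff-style diff of two byte strings, line by line."""
    # splitlines(True) keeps the line endings, as assertMultiLineEqual does.
    first_lines = [repr(line) for line in first.splitlines(True)]
    second_lines = [repr(line) for line in second.splitlines(True)]
    # repr() turns each bytes line into a str with non-printable bytes
    # escaped, which is what ndiff needs to work on.
    return "\n".join(difflib.ndiff(first_lines, second_lines))


if __name__ == "__main__":
    print(bytes_line_diff(b"a\nb\xff\n", b"a\nc\xff\n"))
```

Because every line goes through `repr`, bytes such as `\xff` appear as escapes in the diff rather than as raw data.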

    (NB: this patch is the end result of several attempts to make the unittest output more useful in the email package when testing bytes handling).

    @bitdancer bitdancer added the type-feature A feature request or enhancement label Oct 21, 2010
    @bitdancer
    Member Author

    After talking with Michael on #python-dev, I've revised the patch to make it a real assertBytesEqual method rather than a pretend-the-bytes-are-strings method. This version allows the byte strings to be split on an arbitrary byte string, which makes it more useful for getting diffs of structured binary data if that data has a convenient break point. And the default is no splitting, which is what assertEqual uses when comparing bytes.
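
One way the splitting step might look, as a rough sketch only: the function `split_keep_sep` and its `sep` parameter are illustrative names, not the revised patch's API, and keeping the separator on each chunk is an assumption modelled on 'splitlines(True)'.

```python
from typing import List, Optional


def split_keep_sep(data: bytes, sep: Optional[bytes] = None) -> List[bytes]:
    """Split data on sep, keeping sep at the end of each chunk."""
    if sep is None:
        # Default: no splitting, the whole byte string is a single chunk.
        return [data]
    parts = data.split(sep)  # raises ValueError if sep is empty
    chunks = [part + sep for part in parts[:-1]]
    if parts[-1]:
        # Trailing data that is not terminated by the separator.
        chunks.append(parts[-1])
    return chunks
```

The resulting chunks could then be passed through `repr` and diffed in the same way as lines of text.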

    One of the tests is failing, and it's late and I can't figure out where the extra space is coming from. Maybe someone else will check it out and spot the problem before I get back to it.

    @bitdancer bitdancer changed the title Add an assertBytesMultiLineEqual to unittest and use it for bytes assertEqual Add an assertBytesEqual to unittest and use it for bytes assertEqual Oct 22, 2010
    @bitdancer
    Member Author

    My best guess currently is that the failing test is a bug in difflib, but I haven't dug into that code yet to prove it. It's still possible it's something stupid in my code, but as far as I can see there's no character over where that - is in the difflib output (that is, the two lines that it says are different appear identical when viewed with cat -A).

    @rhettinger rhettinger self-assigned this Oct 30, 2010
    @rhettinger
    Contributor

    Am discussing this with the OP on IRC and tabling it for a while so we can better think out the API.

    The goal is to let assertEqual(a, b) do straight comparisons of raw bytes, but to give a nice-looking diff (possibly translated with line breaks or somesuch) when the test fails. This will be helpful in testing the email module.

    The current patch requires that assertBytesEqual be exposed and called directly so that a user can specify a split-at argument. The purpose of that argument is to approximate the line breaking that occurs naturally in text. The OP does not want to decode the bytes prior to the equality test, but does want a readable diff whenever the bytes represent ASCII text.

    @voidspace
    Contributor

    David - would you get a good approximation of what you want simply with:

    self.assertEqual(ascii(first), ascii(second))
    

    (This actually compares the strings "b'first'" and "b'second'", so you may want a convenience function that chops the leading b' and the trailing '.)

    As ascii returns unicode, it would automatically delegate to assertMultiLineEqual.

    The obvious way to hook this up by default for assertEqual is having the split-character as '\n'. This would not be meaningful for using assertEqual to compare bytes that *aren't* text.
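
The ascii() suggestion, together with the convenience function it mentions, could be sketched like this. The helper name `strip_bytes_repr` and the test class are hypothetical; only `ascii` and `assertEqual` come from the comment.

```python
import unittest


def strip_bytes_repr(value: bytes) -> str:
    """Return ascii(value) without the b prefix and surrounding quotes."""
    text = ascii(value)  # e.g. "b'first'"
    # Drop the leading b and opening quote, and the closing quote.
    return text[2:-1]


class AsciiCompareExample(unittest.TestCase):
    def test_same_bytes(self):
        first = b"Subject: hi\n"
        second = b"Subject: hi\n"
        # Both arguments are str, so a failure here would be reported
        # through assertMultiLineEqual.
        self.assertEqual(strip_bytes_repr(first), strip_bytes_repr(second))
```

Note that ascii() escapes newlines, so each value becomes a single line of text and any diff is shown for that one line as a whole.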

    @rhettinger
    Contributor

    Rejecting this one for reasons we discussed earlier. The assertEqual() method needs to be the primary interface. Everything else is starting to mix content and presentation (i.e. passing in separators). The existing repr() works fine with bytes and Michael's suggested ascii() cast would be the preferred technique in the common cases.

    What might be useful is a less specialized patch letting assertEqual() take an argument pointing to some repr or pre-processing function that would be called after an equality test fails but before it is diffed. That would support a clear separation of concerns and be easily extendable by users who need something more than an ascii() cast.
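
No such argument exists on unittest's assertEqual(); the following is only a sketch of what the idea could look like in a user's own test base class. The names `PreDiffTestCase`, `assertEqualPreDiff`, `prediff` and `bytes_lines` are all hypothetical.

```python
import difflib
import unittest


def bytes_lines(value: bytes) -> list:
    """Example pre-diff function: one ascii() string per line of bytes."""
    return [ascii(line) for line in value.splitlines(True)]


class PreDiffTestCase(unittest.TestCase):
    def assertEqualPreDiff(self, first, second, prediff):
        # The equality test runs on the raw values.
        if first == second:
            return
        # Only after it fails is prediff called to prepare the diff input.
        diff = "\n".join(difflib.ndiff(prediff(first), prediff(second)))
        self.fail("values differ:\n" + diff)
```

The comparison itself never sees the pre-processed values, so presentation stays separate from content.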

    @bitdancer
    Member Author

    Agreed on the closing. The pre-diff processing function would be a great addition. For the record, I am currently satisfying my use case by doing this:

    self.assertEqual(bstr1.split(b'\n'), bstr2.split(b'\n'))

    which produces a very readable diff.
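
To see what that diff looks like, the split-based comparison can be run outside a test suite. In this sketch the helper `list_diff_message` and the sample byte strings are made up for illustration; the assertEqual call is the one quoted above.

```python
import unittest


def list_diff_message(bstr1: bytes, bstr2: bytes):
    """Return the failure message assertEqual produces, or None if equal."""
    case = unittest.TestCase()
    try:
        # Comparing lists of lines makes unittest report the first
        # differing element and a diff of the two lists.
        case.assertEqual(bstr1.split(b"\n"), bstr2.split(b"\n"))
    except AssertionError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(list_diff_message(b"Subject: hi\nTo: a@example.com\n",
                            b"Subject: hi\nTo: b@example.com\n"))
```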

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022

    3 participants