Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test framework output improvement. #3357

Closed
wants to merge 1 commit into from

Conversation

paulidale
Copy link
Contributor

  • tests are added or updated

Format the test failure output more nicely. More vertical space is used to make things a little clearer. Tests are expected to pass so this doesn't impact the normal case. Strings and memory comparisons highlight differences and break on sixteen character boundaries.

This is a first stab at better output, it is mutable.

@paulidale
Copy link
Contributor Author

paulidale commented May 2, 2017

@dot-asm is this more along the lines you would like?

The output from the new failure test case test_long_output:

static int test_long_output(void)
{
    const char *p = "1234567890123456789012345678901234567890123456789012";
    const char *q = "1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ";
    const char *r = "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY+"
                    "12345678901234567890123ABC78901234567890123456789012";
    const char *s = "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY-"
                    "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    return TEST(0, TEST_str_eq(p,  q))
           & TEST(0, TEST_mem_eq(r, strlen(r), s, strlen(s)));
}

is:

    # ERROR: (string) p == q @ test/test_test.c:293
     Left   0: 1234567890123456789012345678901234567890123456789012
     Right  0: 1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ
                         ^^^^^^^^^           ^^^^^^^^^          ^^^

    # ERROR: (memory) r == s @ test/test_test.c:294
     Left   0: 3132333435363738393031323334353637383930313233343536373839303132
     Right  0: 3132333435363738393031323334353637383930313233343536373839303132
     Left   1: 33343536373839303132333435363738393031326162636465666768696a6b6c
     Right  1: 33343536373839303132333435363738393031326162636465666768696a6b6c
     Left   2: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
     Right  2: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
     Left   3: 535455565758592b313233343536373839303132333435363738393031323341
     Right  3: 535455565758592d313233343536373839303132333435363738393031323334
                              ^                                              ^^
     Left   4: 42433738393031323334353637383930313233343536373839303132
     Right  4: 3536373839303132333435363738393031323334353637383930313261626364
               ^^^^                                                    
     Right  5: 65666768696a6b6c6d6e6f707172737475767778797a4142434445464748494a
     Right  6: 4b4c4d4e4f505152535455565758595a

It isn't a great example but should indicate the kind of output.

@levitte
Copy link
Member

levitte commented May 2, 2017

Three things:

  1. (actually, this is a fourth) Great initiative!
  2. Try to make sure all such diagnostic lines start with #, for consistency and to be nice to TAP.
  3. The output could be condensed a bit. Lines that are exactly the same don't need to be written twice, and we could use diff-like output
  4. The line counter could be an offset counter (like od)

Your example remade:

    # ERROR: (string) p == q @ test/test_test.c:293
    # 00000000:-1234567890123456789012345678901234567890123456789012
    # 00000000:+1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ
                          ^^^^^^^^^           ^^^^^^^^^          ^^^

    # ERROR: (memory) r == s @ test/test_test.c:294
    # 00000000: 3132333435363738393031323334353637383930313233343536373839303132
    # 00000020: 33343536373839303132333435363738393031326162636465666768696a6b6c
    # 00000040: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
    # 00000060:-535455565758592b313233343536373839303132333435363738393031323341
    # 00000060:+535455565758592d313233343536373839303132333435363738393031323334
    #                          ^                                              ^^
    # 00000080:-42433738393031323334353637383930313233343536373839303132
    # 00000080:+3536373839303132333435363738393031323334353637383930313261626364
    #           ^^^^                                                    
    # 000000a0:+65666768696a6b6c6d6e6f707172737475767778797a4142434445464748494a
    # 000000c0:+4b4c4d4e4f505152535455565758595a

@paulidale
Copy link
Contributor Author

Nice suggestions, I'll implement them tomorrow.

Somethings to think about in the meantime...

Would < and > make more sense than - and +? I don't mind which characters are used but I'd like to use something that can be understood immediately. - and + don't relate to the expressions printed in the first line.

Do we really want hex offsets for strings? For memory it makes sense and will be done.

Would spaces between the bytes in memory dumps aid readability? I think they might. @dot-asm ?

How many bytes per line for memory dumps? The code currently uses the largest multiple of 8 bytes (16 characters) that fits into an 80 character line. Would using a multiple of 16 bytes be better here? The indentation varies, so a fixed value isn't a great idea.

I don't see a need to make the output as compact as possible. Better for it to be easy to understand. These messages are only going to appear when a test fails which should be rare.

@levitte
Copy link
Member

levitte commented May 2, 2017

Whichever character you use to denote where the "left" and "right" expression differ, there's always going to be someone not understanding. If you choose - / + or < / > doesn't matter to me, as long as you're consistent.

I would say that hex offsets are usefull either way, ... and I dunno, I find it easier to read and to see how many bytes are displayed in each line that way.

Spaces between bytes would certainly make it more readable. A really classic display is to have two sets of 8 bytes per line with a little extra space between the two sets, like this:

00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f

(that's 48 characters, spaces included)

If you want to go really classic, there's the display of characters right after (and display everything non-ascii with a period):

53 54 55 56 57 58 59 0a  31 32 33 34 35 36 37 38  STUVWXY.12345678

(that's 68 characters, spaces included)

@paulidale
Copy link
Contributor Author

There are at least sixteen characters at the start of the line too (at least four indentation spaces, #, space, eight hex digits, colon, space/+/-).

I'll think of something.

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

What's TAP?

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

Would spaces between the bytes in memory dumps aid readability?

It's a tricky question. From my viewpoint it's not only about readability, but even utility. For example, how do you add additional longer vector to evptests.txt? You just make up one (by adding something to shorter input), add some junk to expected value (if/as necessary), run evp_test and copy "got" to expected. From this viewpoint denser printout is better (note that original is just single line ready to be copied-n-pasted), because it takes lesser tweaking. I'd also say that lack of spaces between bytes is well compensated by emphasis. So with these two in mind I for one would vote against space-separated bytes. [Nor do I see much value in ASCII transcript, because in most cases we are looking at cipher texts or hash values, i.e. nothing ASCII-able.]

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

BTW, +1 for offsets and not printing equivalent chunks twice.

@levitte
Copy link
Member

levitte commented May 2, 2017

What's TAP?

http://testanything.org

Basically, it's the output format that Test::More uses, and that I recently integrated into the C test framework as well, see commits 208d721 and 7531b3a

@levitte
Copy link
Member

levitte commented May 2, 2017

It's a tricky question. From my viewpoint it's not only about readability, but even utility. For example, how do you add additional longer vector to evptests.txt? You just make up one (by adding something to shorter input), add some junk to expected value (if/as necessary), run evp_test and copy "got" to expected. From this viewpoint denser printout is better (note that original is just single line ready to be copied-n-pasted), because it takes lesser tweaking. I'd also say that lack of spaces between bytes is well compensated by emphasis. So with these two in mind I for one would vote against space-separated bytes. [Nor do I see much value in ASCII transcript, because in most cases we are looking at cipher texts or hash values, i.e. nothing ASCII-able.]

Good points.

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

lack of spaces between bytes is well compensated by emphasis

I'd like to argue that it would be more appropriate to emphasize bytes in memory comparison. I mean above we can see that nibbles are emphasized, 2b vs. 2d with only b vs. d emphasized, rather than complete bytes.

@paulidale
Copy link
Contributor Author

The current nibble emphasis is due to the way I put the output together -- memory is converted to hex and then displayed as a string. I did this because it was faster to implement and I'm seeking feedback at the moment (which has been great BTW, thanks). Given that the memory and string display/diff code will have to be separated, emphasising bytes in the former will be done properly.

I do agree about the ASCII transcript not being required.

As for readability spacing, I'm less convinced. A short script could convert a pasted block of output removing the leadins, the spaces and newlines and either the '+' or '-' lines to provide the required evptests.txt line. It would be an additional step, but that seems like an okay compromise for readability. If we do want the ability to paste directly into evptests.txt, the output will have to go back to long lines with the highlights below. I'm less keen on this but will go with it if it is the consensus.

@levitte
Copy link
Member

levitte commented May 2, 2017

I'm not sure why string and memory display should be different... but also, I'm missing information on how they should be different.

As for the space / no space debate, I can propose the following compromise:

0001020304050607 08090a0b0c0d0e0f

i.e. an added space every 8th byte (16th char).

When it comes to the highlight of differences, how about doing what you do now, but consider the hex characters in pairs?

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

compromise ...

But still 32 bytes per line, right?

consider the hex characters in pairs

I.e. emphasize bytes, not nibbles.

@levitte
Copy link
Member

levitte commented May 2, 2017

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Either way, I don't really care that much, just as long as it's consistent and recognisable. However, I would recommend to keep each space separated chunk relatively short, as keeping track of the bytes makes most people blurry eyed after a bit (I'm noticing that after about 16 hex char, I'm starting to lose track of nibble pairs, for example).

So, we need to decide on chunk size. 8 or 16 chars (i.e. 4 or 8 bytes, or 32-bit and 64-bit values, where that's a valuable way to look at it) is fairly common in places where space separation of chunks is done.

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Yes.

@levitte
Copy link
Member

levitte commented May 2, 2017

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Yes.

... and are you still talking chunked lines, i.e. something like this?

0001020304050607 08090a0b0c0d0e0f 1011121314151617 18191a1b1c1d1e1f

or are you talking about an undivided line?

000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f

As mentioned already, I prefer the former, for readability.

@dot-asm
Copy link
Contributor

dot-asm commented May 2, 2017

... and are you still talking chunked lines, or undivided line?

As already said, to me emphasizing differences compensates for lack of spaces. Which means I don't have special opinion... Well, one can argue that it might be sensible to break at least at 16 bytes to kind of reflect the fact that most ciphers have 128-bit block size...

[going off-line...]

@paulidale
Copy link
Contributor Author

32 bytes (64 chars) will spill over 80 columns at times. A sub test will indent 8 not 4 e.g which pushes to the 80 and deeper nesting might be possible. Will Is this okay?

Strings will be different from memory because they don't require hex conversion, highlight single characters and don't have formatting spaces added. I assume they'll be line wrapped and diffed still.

@paulidale paulidale force-pushed the test-pretty branch 2 times, most recently from 4fb4400 to 0cbac73 Compare May 3, 2017 03:46
@paulidale
Copy link
Contributor Author

I've updated the output code for strings and memory. I think I've caught all of the suggestions thus far. + and - looked better so I stuck with them. I've added the # to the start of all output.

Strings are wrapped in sixteen byte lumps and fitted to an 80 column display. They are shown with single quotes which distinguishes the empty string. A NULL pointer is displayed as NULL without the quotes. Characters differences are highlighted.

    # ERROR: (string) r == s @ test/test_test.c:303
    #    0:  '1234567890123456789012345678901234567890123456789012abcdefghijkl'
    #   64:- 'mnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY+12345678901234567890123A'
    #   64:+ 'mnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY-123456789012345678901234'
    #                                                ^                       ^
    #  128:- 'BC78901234567890123456789012'
    #  128:+ '5678901234567890123456789012abcdefghijklmnopqrstuvwxyzABCDEFGHIJ'
    #         ^^                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #  192:+ 'KLMNOPQRSTUVWXYZ'

Memory dumps are blocked per eight bytes with a space between blocks and fitted to an 80 column display. Byte differences are highlighted. NULL pointers are shown as NULL and zero length regions as empty.

    # ERROR: (memory) r == s @ test/test_test.c:304
    # 0000: 3132333435363738 3930313233343536 3738393031323334 3536373839303132
    # 0020: 3334353637383930 3132333435363738 3930313261626364 65666768696a6b6c
    # 0040: 6d6e6f7071727374 75767778797a4142 434445464748494a 4b4c4d4e4f505152
    # 0060:-535455565758592b 3132333435363738 3930313233343536 3738393031323341
    # 0060:+535455565758592d 3132333435363738 3930313233343536 3738393031323334
    #                     ^^                                                 ^^
    # 0080:-4243373839303132 3334353637383930 3132333435363738 39303132
    # 0080:+3536373839303132 3334353637383930 3132333435363738 3930313261626364
    #       ^^^^                                                       ^^^^^^^^
    # 00a0:+65666768696a6b6c 6d6e6f7071727374 75767778797a4142 434445464748494a
    # 00c0:+4b4c4d4e4f505152 535455565758595a

I haven't used full width fields for the size/offset numbers to allow more horizontal space. This means the difference highlight carets will go out of alignment for huge outputs.

static void test_fail_message_va(const char *prefix, const char *file, int line,
const char *type, const char *fmt, va_list ap)
const char *type, const char *, const char *,
const char *, const char *fmt, ...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the names left, right and op here. They do serve as documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, tomorrow.

@levitte
Copy link
Member

levitte commented May 3, 2017

Have you given any thought to possible control characters in strings? Such as \n?

@paulidale
Copy link
Contributor Author

I did think about control characters and non-printables. They'll lead to the display of characters that aren't in the strings. Displaying a\nb and abc would be something along the lines of:

a\nb
ab c
 ^^

The space in the second string isn't real. Moreover, this is indistinguishable from a comparison of a\\nb and ab c. It gets worse with hex/octal character display e.g. \0x83 or \203.

I'm fine to implement something so long as the shortcomings and loss of regular formatting are acceptable.

Do non-printable strings appear in the tests? On success, I can't remember seeing any (not that that means much). On failure, further investigation will be required. @richsalz do you know?

@levitte
Copy link
Member

levitte commented May 3, 2017

I haven't seen any control characters in tests, yet... but I prefer to be prepared. Glad you thought of it, I'm satisfied with that.

@richsalz
Copy link
Contributor

richsalz commented May 3, 2017

Let's really fix it if/when it becomes an issue.

@paulidale
Copy link
Contributor Author

I've added the argument names to the forward declaration of test_fail_message.
This forward declaration isn't needed from the compiler's viewpoint but it seems worthwhile along with the comment above it.

@paulidale
Copy link
Contributor Author

I could replace unprintable characters with a . as was mentioned earlier. I think this might be the least confusing of the available options. Or we can wait until it is a problem.

@levitte
Copy link
Member

levitte commented May 3, 2017

Sounds reasonable

@paulidale
Copy link
Contributor Author

Okay, the update which displays non-printable characters as . is in.

Format the test failure output more nicely.

More vertical space is used to make things a little clearer.  Tests are expected
to pass so this doesn't impact the normal case.

Strings and memory comparisons highlight differences.
@paulidale
Copy link
Contributor Author

The Travis failure is unrelated to this commit.

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough for me

Copy link
Contributor

@richsalz richsalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me know if you want me to merge.

@levitte
Copy link
Member

levitte commented May 4, 2017

Merged. 516deca

@levitte levitte closed this May 4, 2017
levitte pushed a commit that referenced this pull request May 4, 2017
Format the test failure output more nicely.

More vertical space is used to make things a little clearer.  Tests are expected
to pass so this doesn't impact the normal case.

Strings and memory comparisons highlight differences.

Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from #3357)
@paulidale paulidale deleted the test-pretty branch May 4, 2017 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants