Test framework output improvement. #3357

paulidale · 2017-05-02T04:49:31Z

tests are added or updated

Format the test failure output more nicely. More vertical space is used to make things a little clearer. Tests are expected to pass so this doesn't impact the normal case. Strings and memory comparisons highlight differences and break on sixteen character boundaries.

This is a first stab at better output, it is mutable.

paulidale · 2017-05-02T04:53:05Z

@dot-asm is this more along the lines you would like?

The output from the new failure test case test_long_output:

static int test_long_output(void)
{
    const char *p = "1234567890123456789012345678901234567890123456789012";
    const char *q = "1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ";
    const char *r = "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY+"
                    "12345678901234567890123ABC78901234567890123456789012";
    const char *s = "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY-"
                    "1234567890123456789012345678901234567890123456789012"
                    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    return TEST(0, TEST_str_eq(p,  q))
           & TEST(0, TEST_mem_eq(r, strlen(r), s, strlen(s)));
}

is:

    # ERROR: (string) p == q @ test/test_test.c:293
     Left   0: 1234567890123456789012345678901234567890123456789012
     Right  0: 1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ
                         ^^^^^^^^^           ^^^^^^^^^          ^^^

    # ERROR: (memory) r == s @ test/test_test.c:294
     Left   0: 3132333435363738393031323334353637383930313233343536373839303132
     Right  0: 3132333435363738393031323334353637383930313233343536373839303132
     Left   1: 33343536373839303132333435363738393031326162636465666768696a6b6c
     Right  1: 33343536373839303132333435363738393031326162636465666768696a6b6c
     Left   2: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
     Right  2: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
     Left   3: 535455565758592b313233343536373839303132333435363738393031323341
     Right  3: 535455565758592d313233343536373839303132333435363738393031323334
                              ^                                              ^^
     Left   4: 42433738393031323334353637383930313233343536373839303132
     Right  4: 3536373839303132333435363738393031323334353637383930313261626364
               ^^^^                                                    
     Right  5: 65666768696a6b6c6d6e6f707172737475767778797a4142434445464748494a
     Right  6: 4b4c4d4e4f505152535455565758595a

It isn't a great example but should indicate the kind of output.

levitte · 2017-05-02T06:57:40Z

Three things:

(actually, this is a fourth) Great initiative!
Try to make sure all such diagnostic lines start with #, for consistency and to be nice to TAP.
The output could be condensed a bit. Lines that are exactly the same don't need to be written twice, and we could use diff-like output
The line counter could be an offset counter (like od)

Your example remade:

    # ERROR: (string) p == q @ test/test_test.c:293
    # 00000000:-1234567890123456789012345678901234567890123456789012
    # 00000000:+1234567890klmnopqrs01234567890EFGHIJKLM0123456789XYZ
                          ^^^^^^^^^           ^^^^^^^^^          ^^^

    # ERROR: (memory) r == s @ test/test_test.c:294
    # 00000000: 3132333435363738393031323334353637383930313233343536373839303132
    # 00000020: 33343536373839303132333435363738393031326162636465666768696a6b6c
    # 00000040: 6d6e6f707172737475767778797a4142434445464748494a4b4c4d4e4f505152
    # 00000060:-535455565758592b313233343536373839303132333435363738393031323341
    # 00000060:+535455565758592d313233343536373839303132333435363738393031323334
    #                          ^                                              ^^
    # 00000080:-42433738393031323334353637383930313233343536373839303132
    # 00000080:+3536373839303132333435363738393031323334353637383930313261626364
    #           ^^^^                                                    
    # 000000a0:+65666768696a6b6c6d6e6f707172737475767778797a4142434445464748494a
    # 000000c0:+4b4c4d4e4f505152535455565758595a

paulidale · 2017-05-02T07:59:39Z

Nice suggestions, I'll implement them tomorrow.

Somethings to think about in the meantime...

Would < and > make more sense than - and +? I don't mind which characters are used but I'd like to use something that can be understood immediately. - and + don't relate to the expressions printed in the first line.

Do we really want hex offsets for strings? For memory it makes sense and will be done.

Would spaces between the bytes in memory dumps aid readability? I think they might. @dot-asm ?

How many bytes per line for memory dumps? The code currently uses the largest multiple of 8 bytes (16 characters) that fits into an 80 character line. Would using a multiple of 16 bytes be better here? The indentation varies, so a fixed value isn't a great idea.

I don't see a need to make the output as compact as possible. Better for it to be easy to understand. These messages are only going to appear when a test fails which should be rare.

levitte · 2017-05-02T08:35:36Z

Whichever character you use to denote where the "left" and "right" expression differ, there's always going to be someone not understanding. If you choose - / + or < / > doesn't matter to me, as long as you're consistent.

I would say that hex offsets are usefull either way, ... and I dunno, I find it easier to read and to see how many bytes are displayed in each line that way.

Spaces between bytes would certainly make it more readable. A really classic display is to have two sets of 8 bytes per line with a little extra space between the two sets, like this:

00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f

(that's 48 characters, spaces included)

If you want to go really classic, there's the display of characters right after (and display everything non-ascii with a period):

53 54 55 56 57 58 59 0a  31 32 33 34 35 36 37 38  STUVWXY.12345678

(that's 68 characters, spaces included)

paulidale · 2017-05-02T09:10:38Z

There are at least sixteen characters at the start of the line too (at least four indentation spaces, #, space, eight hex digits, colon, space/+/-).

I'll think of something.

dot-asm · 2017-05-02T10:51:50Z

What's TAP?

dot-asm · 2017-05-02T11:20:15Z

Would spaces between the bytes in memory dumps aid readability?

It's a tricky question. From my viewpoint it's not only about readability, but even utility. For example, how do you add additional longer vector to evptests.txt? You just make up one (by adding something to shorter input), add some junk to expected value (if/as necessary), run evp_test and copy "got" to expected. From this viewpoint denser printout is better (note that original is just single line ready to be copied-n-pasted), because it takes lesser tweaking. I'd also say that lack of spaces between bytes is well compensated by emphasis. So with these two in mind I for one would vote against space-separated bytes. [Nor do I see much value in ASCII transcript, because in most cases we are looking at cipher texts or hash values, i.e. nothing ASCII-able.]

dot-asm · 2017-05-02T11:22:40Z

BTW, +1 for offsets and not printing equivalent chunks twice.

levitte · 2017-05-02T11:24:58Z

What's TAP?

http://testanything.org

Basically, it's the output format that Test::More uses, and that I recently integrated into the C test framework as well, see commits 208d721 and 7531b3a

levitte · 2017-05-02T11:26:30Z

It's a tricky question. From my viewpoint it's not only about readability, but even utility. For example, how do you add additional longer vector to evptests.txt? You just make up one (by adding something to shorter input), add some junk to expected value (if/as necessary), run evp_test and copy "got" to expected. From this viewpoint denser printout is better (note that original is just single line ready to be copied-n-pasted), because it takes lesser tweaking. I'd also say that lack of spaces between bytes is well compensated by emphasis. So with these two in mind I for one would vote against space-separated bytes. [Nor do I see much value in ASCII transcript, because in most cases we are looking at cipher texts or hash values, i.e. nothing ASCII-able.]

Good points.

dot-asm · 2017-05-02T11:31:53Z

lack of spaces between bytes is well compensated by emphasis

I'd like to argue that it would be more appropriate to emphasize bytes in memory comparison. I mean above we can see that nibbles are emphasized, 2b vs. 2d with only b vs. d emphasized, rather than complete bytes.

paulidale · 2017-05-02T12:05:04Z

The current nibble emphasis is due to the way I put the output together -- memory is converted to hex and then displayed as a string. I did this because it was faster to implement and I'm seeking feedback at the moment (which has been great BTW, thanks). Given that the memory and string display/diff code will have to be separated, emphasising bytes in the former will be done properly.

I do agree about the ASCII transcript not being required.

As for readability spacing, I'm less convinced. A short script could convert a pasted block of output removing the leadins, the spaces and newlines and either the '+' or '-' lines to provide the required evptests.txt line. It would be an additional step, but that seems like an okay compromise for readability. If we do want the ability to paste directly into evptests.txt, the output will have to go back to long lines with the highlights below. I'm less keen on this but will go with it if it is the consensus.

levitte · 2017-05-02T12:13:21Z

I'm not sure why string and memory display should be different... but also, I'm missing information on how they should be different.

As for the space / no space debate, I can propose the following compromise:

0001020304050607 08090a0b0c0d0e0f

i.e. an added space every 8th byte (16th char).

When it comes to the highlight of differences, how about doing what you do now, but consider the hex characters in pairs?

dot-asm · 2017-05-02T12:31:45Z

compromise ...

But still 32 bytes per line, right?

consider the hex characters in pairs

I.e. emphasize bytes, not nibbles.

levitte · 2017-05-02T12:58:49Z

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Either way, I don't really care that much, just as long as it's consistent and recognisable. However, I would recommend to keep each space separated chunk relatively short, as keeping track of the bytes makes most people blurry eyed after a bit (I'm noticing that after about 16 hex char, I'm starting to lose track of nibble pairs, for example).

So, we need to decide on chunk size. 8 or 16 chars (i.e. 4 or 8 bytes, or 32-bit and 64-bit values, where that's a valuable way to look at it) is fairly common in places where space separation of chunks is done.

dot-asm · 2017-05-02T13:32:23Z

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Yes.

levitte · 2017-05-02T13:37:27Z

But still 32 bytes per line, right?

Do you mean that, i.e. 64 chars?

Yes.

... and are you still talking chunked lines, i.e. something like this?

0001020304050607 08090a0b0c0d0e0f 1011121314151617 18191a1b1c1d1e1f

or are you talking about an undivided line?

000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f

As mentioned already, I prefer the former, for readability.

dot-asm · 2017-05-02T13:49:00Z

... and are you still talking chunked lines, or undivided line?

As already said, to me emphasizing differences compensates for lack of spaces. Which means I don't have special opinion... Well, one can argue that it might be sensible to break at least at 16 bytes to kind of reflect the fact that most ciphers have 128-bit block size...

[going off-line...]

paulidale · 2017-05-02T23:59:14Z

32 bytes (64 chars) will spill over 80 columns at times. A sub test will indent 8 not 4 e.g which pushes to the 80 and deeper nesting might be possible. Will Is this okay?

Strings will be different from memory because they don't require hex conversion, highlight single characters and don't have formatting spaces added. I assume they'll be line wrapped and diffed still.

paulidale · 2017-05-03T03:52:17Z

I've updated the output code for strings and memory. I think I've caught all of the suggestions thus far. + and - looked better so I stuck with them. I've added the # to the start of all output.

Strings are wrapped in sixteen byte lumps and fitted to an 80 column display. They are shown with single quotes which distinguishes the empty string. A NULL pointer is displayed as NULL without the quotes. Characters differences are highlighted.

    # ERROR: (string) r == s @ test/test_test.c:303
    #    0:  '1234567890123456789012345678901234567890123456789012abcdefghijkl'
    #   64:- 'mnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY+12345678901234567890123A'
    #   64:+ 'mnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY-123456789012345678901234'
    #                                                ^                       ^
    #  128:- 'BC78901234567890123456789012'
    #  128:+ '5678901234567890123456789012abcdefghijklmnopqrstuvwxyzABCDEFGHIJ'
    #         ^^                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #  192:+ 'KLMNOPQRSTUVWXYZ'

Memory dumps are blocked per eight bytes with a space between blocks and fitted to an 80 column display. Byte differences are highlighted. NULL pointers are shown as NULL and zero length regions as empty.

    # ERROR: (memory) r == s @ test/test_test.c:304
    # 0000: 3132333435363738 3930313233343536 3738393031323334 3536373839303132
    # 0020: 3334353637383930 3132333435363738 3930313261626364 65666768696a6b6c
    # 0040: 6d6e6f7071727374 75767778797a4142 434445464748494a 4b4c4d4e4f505152
    # 0060:-535455565758592b 3132333435363738 3930313233343536 3738393031323341
    # 0060:+535455565758592d 3132333435363738 3930313233343536 3738393031323334
    #                     ^^                                                 ^^
    # 0080:-4243373839303132 3334353637383930 3132333435363738 39303132
    # 0080:+3536373839303132 3334353637383930 3132333435363738 3930313261626364
    #       ^^^^                                                       ^^^^^^^^
    # 00a0:+65666768696a6b6c 6d6e6f7071727374 75767778797a4142 434445464748494a
    # 00c0:+4b4c4d4e4f505152 535455565758595a

I haven't used full width fields for the size/offset numbers to allow more horizontal space. This means the difference highlight carets will go out of alignment for huge outputs.

levitte · 2017-05-03T06:48:41Z

test/testutil/tests.c

-static void test_fail_message_va(const char *prefix, const char *file, int line,
-                                 const char *type, const char *fmt, va_list ap)
+                              const char *type, const char *, const char *,
+                              const char *, const char *fmt, ...)


Please keep the names left, right and op here. They do serve as documentation.

Okay, tomorrow.

levitte · 2017-05-03T06:53:12Z

Have you given any thought to possible control characters in strings? Such as \n?

paulidale · 2017-05-03T07:36:12Z

I did think about control characters and non-printables. They'll lead to the display of characters that aren't in the strings. Displaying a\nb and abc would be something along the lines of:

a\nb
ab c
 ^^

The space in the second string isn't real. Moreover, this is indistinguishable from a comparison of a\\nb and ab c. It gets worse with hex/octal character display e.g. \0x83 or \203.

I'm fine to implement something so long as the shortcomings and loss of regular formatting are acceptable.

Do non-printable strings appear in the tests? On success, I can't remember seeing any (not that that means much). On failure, further investigation will be required. @richsalz do you know?

levitte · 2017-05-03T08:17:35Z

I haven't seen any control characters in tests, yet... but I prefer to be prepared. Glad you thought of it, I'm satisfied with that.

richsalz · 2017-05-03T12:26:54Z

Let's really fix it if/when it becomes an issue.

paulidale · 2017-05-03T21:11:03Z

I've added the argument names to the forward declaration of test_fail_message.
This forward declaration isn't needed from the compiler's viewpoint but it seems worthwhile along with the comment above it.

paulidale · 2017-05-03T21:25:38Z

I could replace unprintable characters with a . as was mentioned earlier. I think this might be the least confusing of the available options. Or we can wait until it is a problem.

levitte · 2017-05-03T21:48:07Z

Sounds reasonable

paulidale · 2017-05-03T23:01:42Z

Okay, the update which displays non-printable characters as . is in.

Format the test failure output more nicely. More vertical space is used to make things a little clearer. Tests are expected to pass so this doesn't impact the normal case. Strings and memory comparisons highlight differences.

paulidale · 2017-05-04T00:19:00Z

The Travis failure is unrelated to this commit.

levitte

Good enough for me

richsalz

let me know if you want me to merge.

levitte · 2017-05-04T15:22:02Z

Merged. 516deca

Format the test failure output more nicely. More vertical space is used to make things a little clearer. Tests are expected to pass so this doesn't impact the normal case. Strings and memory comparisons highlight differences. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from #3357)

paulidale mentioned this pull request May 2, 2017

Test infrastructure items #3341

Closed

paulidale force-pushed the test-pretty branch 2 times, most recently from 4fb4400 to 0cbac73 Compare May 3, 2017 03:46

levitte requested changes May 3, 2017

View reviewed changes

paulidale force-pushed the test-pretty branch from 0cbac73 to 23464d1 Compare May 3, 2017 21:08

paulidale force-pushed the test-pretty branch from 23464d1 to 59baab8 Compare May 3, 2017 23:01

Test framework output improvement.

59baab8

Format the test failure output more nicely. More vertical space is used to make things a little clearer. Tests are expected to pass so this doesn't impact the normal case. Strings and memory comparisons highlight differences.

levitte approved these changes May 4, 2017

View reviewed changes

richsalz approved these changes May 4, 2017

View reviewed changes

levitte closed this May 4, 2017

paulidale deleted the test-pretty branch May 4, 2017 21:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test framework output improvement. #3357

Test framework output improvement. #3357

paulidale commented May 2, 2017

paulidale commented May 2, 2017 •

edited

levitte commented May 2, 2017

paulidale commented May 2, 2017

levitte commented May 2, 2017 •

edited

paulidale commented May 2, 2017

dot-asm commented May 2, 2017

dot-asm commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

levitte commented May 2, 2017 •

edited

dot-asm commented May 2, 2017

paulidale commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

paulidale commented May 2, 2017

paulidale commented May 3, 2017

levitte May 3, 2017

paulidale May 3, 2017

levitte commented May 3, 2017

paulidale commented May 3, 2017

levitte commented May 3, 2017

richsalz commented May 3, 2017

paulidale commented May 3, 2017

paulidale commented May 3, 2017

levitte commented May 3, 2017

paulidale commented May 3, 2017

paulidale commented May 4, 2017

levitte left a comment

richsalz left a comment

levitte commented May 4, 2017

Test framework output improvement. #3357

Test framework output improvement. #3357

Conversation

paulidale commented May 2, 2017

paulidale commented May 2, 2017 • edited

levitte commented May 2, 2017

paulidale commented May 2, 2017

levitte commented May 2, 2017 • edited

paulidale commented May 2, 2017

dot-asm commented May 2, 2017

dot-asm commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

levitte commented May 2, 2017 • edited

dot-asm commented May 2, 2017

paulidale commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

levitte commented May 2, 2017

dot-asm commented May 2, 2017

paulidale commented May 2, 2017

paulidale commented May 3, 2017

levitte May 3, 2017

Choose a reason for hiding this comment

paulidale May 3, 2017

Choose a reason for hiding this comment

levitte commented May 3, 2017

paulidale commented May 3, 2017

levitte commented May 3, 2017

richsalz commented May 3, 2017

paulidale commented May 3, 2017

paulidale commented May 3, 2017

levitte commented May 3, 2017

paulidale commented May 3, 2017

paulidale commented May 4, 2017

levitte left a comment

Choose a reason for hiding this comment

richsalz left a comment

Choose a reason for hiding this comment

levitte commented May 4, 2017

paulidale commented May 2, 2017 •

edited

levitte commented May 2, 2017 •

edited

levitte commented May 2, 2017 •

edited