[Testing] Add Travis Validation Tests for P4C. #2458

fruffy · 2020-06-30T18:17:08Z

This pull request add translation validation to the CI workflow of p4c. Translation validation certifies that a particular compiler pass has correctly transformed a given input program without changing its semantics. To perform translation validation for p4c, we use a symbolic interpreter (Gauntlet) to transform P4 programs into Z3 formulas. The Gauntlet workflow requires two external dependencies, which are described in this source organization. You can also find a detailed description of the tool and basic usage instructions there. I currently maintain this organization.

To validate a P4 program, the symbolic interpreter converts the program into a Z3 formula capturing its input-output semantics. An equivalence checker then submits the Z3 formulas of a program before and after a compiler pass to the Z3 SMT solver. The solver tries to find an input that violates equivalence of these two formulas. If it finds such an input, a validation error is reported. Translation validation has two advantages over fuzz testing. First, it can accurately detect subtle differences in program semantics without any knowledge about expected input packets or table entries. Second, by accessing intermediate P4 programs after each compiler pass, we can pinpoint the erroneous pass. This tool specifically tests the front and mid end passes of p4c using p4test.

The interpreter is written as an extension to p4c and uses the IR generated by the p4c parser to determine the semantics of the P4 program. Each programmable block of a P4 package represents an independent Z3 formula. For example, the v1model package has 6 different independent programmable blocks: Parser, VerifyChecksum, Ingress, Egress, ComputeChecksum, and Deparser. For each block, a new Z3 formula is generated.

Concretely, the semantic representation of the Ingress pipe of the program key-bmv2.p4 looks like this:

INFO:PIPE ig:                                                                                                   
P4 PROGRAM FRAGMENT          INPUT                        OUTPUT                                                
---------------------------  ---------------------------  ---------------------------------------------------   
h.h.a                        If(0_valid, 0(ig), invalid)  If(0_valid, 0(ig), invalid)                           
h.h.b                        If(0_valid, 1(ig), invalid)  If(0_valid,                                           
                                                             If(And(2*0(ig) == t_table_key_0, 1 == t_action),   
                                                                0(ig),                                          
                                                                1(ig)),                                         
                                                             invalid)                                           
sm.ingress_port              2(ig)                        2(ig)                                                 
sm.egress_spec               3(ig)                        0                                                     
sm.egress_port               4(ig)                        4(ig)                                                 
sm.instance_type             5(ig)                        5(ig)                                                 
sm.packet_length             6(ig)                        6(ig)                                                 
sm.enq_timestamp             7(ig)                        7(ig)                                                 
sm.enq_qdepth                8(ig)                        8(ig)                                                 
sm.deq_timedelta             9(ig)                        9(ig)                                                 
sm.deq_qdepth                10(ig)                       10(ig)                                                
sm.ingress_global_timestamp  11(ig)                       11(ig)                                                
sm.egress_global_timestamp   12(ig)                       12(ig)                                                
sm.mcast_grp                 13(ig)                       13(ig)                                                
sm.egress_rid                14(ig)                       14(ig)                                                
sm.checksum_error            15(ig)                       15(ig)                                                
sm.parser_error              16(ig)                       16(ig)                                                
sm.priority                  17(ig)                       17(ig)

This representation can be obtained by running python3 get_semantics.py -i p4c/testdata/p4_16_samples/key-bmv2.p4.

Here we set h.h.b to h.h.a, only if there is a match-action pair in the table that has the key h.h.a + h.h.a and executes the action a(). ig corresponds to a symbolic input packet, whereas 0(ig), 1(ig), 2(ig),... represent the flattened members of the parsed packet in canonical form (e.g., 0(ig) is the member h.h.a of inout Headers h). We use this semantic representation in the INPUT/OUTPUT columns for equality comparison.

Specific usage in the context of continuous integration

After this pull request, Gauntlet will run on every commit and catch new semantic compiler bugs as they are introduced. Gauntlet will run translation validation of all the passes emitted by p4test on every program in p4c/testdata/p4_16_samples/. If all emitted program versions are equal, the test passes. Otherwise, the test fails. When the validation tool finds a failing test, it will emit VIOLATION and provide the two passes that differ. At this moment, only information about which two passes are different is available.

The Travis check will fail if Gauntlet detects a problematic pass and the log will show the pass that has failed. A developer who has Gauntlet installed can run python3 validate_p4_translation.py -i [buggy_program.p4] to identify the defective pass. There may be false positives or interpreter crashes, which will require fixing by me. In that case, please file an issue at https://github.com/p4gauntlet/p4_tv.

A word of caution: Just because Gauntlet does not detect an issue with a particular program compilation, there is no guarantee that the compiler is completely bug free. The interpreter may produce incorrect semantics or have bugs itself. This tool is intended to find issues on a best-effort basis.

Unimplemented language constructs and current limitations

The interpreter passes the majority of P4-16 tests in the compiler suite. Failing tests can be found here. There are still several open problems that need fixing, but they are being worked on:

Semantics for Runtime Header Indices, Annotations, ~~Range, String Literals, and LPM/Ternary matches~~ are missing.
Semantics for table properties such as counters, meters, or action profiles are not implemented.
Symbolic outputs of extern methods do not respect call order or input values yet (they always return the same symbolic output).
Method overloading in externs is not yet supported.
~~Type inference for generics does not work for control and parser bodies yet.~~
Variable scoping does not work in all cases (e.g., shadowing).
Parser loops are currently bounded using a fixed value.
Some specific P4 specification method semantics such as extract or lookahead are missing.

mihaibudiu

This actually looks pretty good.
Can you write a longer description of what this tool does, so it can be reviewed by other people as well?
I have talked to @fruffy about this offline.

mihaibudiu · 2020-06-30T20:36:13Z

tools/travis-build

+ln -s /p4c /gauntlet/p4c
+
+# Install Gauntlet Python dependencies locally
+# Unfortunately we need Python 3.6, which Xenial does not have


what's so special about 3.6?

f-strings. I have been using them a lot in the code because they are nicer to read and use. Python 3.6 is available with 16.10 but apparently not 16.04.

fruffy · 2020-07-11T10:20:54Z

Made a bunch of changes to the Z3 representation of a P4 program. The validation now covers all passes, including ExpandLookahead. The tests also include fabric.p4 now.
A detailed description can be found in the readme of https://github.com/p4gauntlet/p4_tv. Should I add an extra readme here, too? I can also try to provide an estimate on the language constructs that are not fully implemented yet.

hesingh · 2020-07-11T21:36:39Z

Instead of adding any new readme to p4_tv, maybe edit p4c/README.md for a mention of this tool. After all the README.md mentions make check.

Also, since p4c/docs/compiler-design.pptx includes a slide or two regarding debugging a failed test with make check, the slides should be updated to mention this tool.

It would be very helpful to get an estimate on the language constructs not fully implemented yet.

mihaibudiu · 2020-07-13T16:46:18Z

My request was for a comment attached to this PR that describes what your tool does.
Note that if we merge this tool you will be responsible for keeping it up-to-date, since our build will depend on it.

fruffy · 2020-07-14T14:26:41Z

I updated the description.

mihaibudiu · 2020-08-04T23:08:23Z

Does anyone object to merging this tool? It may lengthen the compilation process, but it may find bugs early in the compiler.

mihaibudiu · 2020-08-04T23:08:53Z

Correction, it may lengthen the testing process.

antoninbas · 2020-08-04T23:55:52Z

@mbudiu-vmw it's one more job but it's executed in parallel. It seems that the new job doesn't take more time than the existing ones. The issues is that Travis limits open-source projects to 5 concurrent builds (which I believe apply across the entire p4lang organization?). p4c alone runs 6 jobs as part of CI (7 with this change). Since we were already above the limit of 5, and we are still under 10 with this new job, there isn't much of an impact, at least when p4c is the only p4lang project running CI at a given time.

Edit: actually there is a small impact since the jobs don't all take the same amount of time, (4mins, 8 mins, and 18mins for all the other ones), but I wouldn't worry about it... If it becomes an issue, we can rethink the CI testing, and move some jobs to Github actions (no limit on concurrent jobs IIRC). It's more important to have better coverage.

antoninbas

LGTM

fruffy added 2 commits June 30, 2020 17:42

Add validation tests to Travis.

40ad49c

Comments.

9afb120

mihaibudiu approved these changes Jun 30, 2020

View reviewed changes

fruffy added 3 commits July 1, 2020 13:14

Run validation in parallel to normal tests.

7798f21

Parallel tests

87ac810

Dollar sign.

6850a85

fruffy marked this pull request as ready for review July 11, 2020 10:21

fruffy added 3 commits July 23, 2020 16:05

Update p4c link.

9f73edb

Try to depend on main repository submodule

d4a3345

Flip order.

aacbb78

antoninbas approved these changes Aug 5, 2020

View reviewed changes

mihaibudiu merged commit a5c6c5b into p4lang:master Aug 5, 2020

fruffy deleted the travis_validation_tests branch August 21, 2020 09:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Testing] Add Travis Validation Tests for P4C. #2458

[Testing] Add Travis Validation Tests for P4C. #2458

fruffy commented Jun 30, 2020 •

edited

mihaibudiu left a comment

mihaibudiu Jun 30, 2020

fruffy Jul 1, 2020

fruffy commented Jul 11, 2020

hesingh commented Jul 11, 2020

mihaibudiu commented Jul 13, 2020

fruffy commented Jul 14, 2020

mihaibudiu commented Aug 4, 2020

mihaibudiu commented Aug 4, 2020

antoninbas commented Aug 4, 2020 •

edited

antoninbas left a comment

[Testing] Add Travis Validation Tests for P4C. #2458

[Testing] Add Travis Validation Tests for P4C. #2458

Conversation

fruffy commented Jun 30, 2020 • edited

Specific usage in the context of continuous integration

Unimplemented language constructs and current limitations

mihaibudiu left a comment

Choose a reason for hiding this comment

mihaibudiu Jun 30, 2020

Choose a reason for hiding this comment

fruffy Jul 1, 2020

Choose a reason for hiding this comment

fruffy commented Jul 11, 2020

hesingh commented Jul 11, 2020

mihaibudiu commented Jul 13, 2020

fruffy commented Jul 14, 2020

mihaibudiu commented Aug 4, 2020

mihaibudiu commented Aug 4, 2020

antoninbas commented Aug 4, 2020 • edited

antoninbas left a comment

Choose a reason for hiding this comment

fruffy commented Jun 30, 2020 •

edited

antoninbas commented Aug 4, 2020 •

edited