Skip unsupported regression tests #615
base: dev
Conversation
When you run a test suite with skipped tests, can you infer from the test suite result alone which tests were skipped? For example by looking into the …
@@ -0,0 +1,18 @@
// Timeout
Can we handle them by increasing the timeout?
I just tried it, but I am not sure if it succeeds then.
cbmc_float4_PART1.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml TIMEOUT

// MathSAT does not support quantifiers
ctrans-float-rounding.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
Wouldn't it then be better to change the directory structure here?
We could do it that way, but if we just mark them as skipped, they still run and we can still detect any changes.
array10_pattern_simplified.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT

// Unable to decide satisfiability of path constraint
array10_pattern_simplified.c AutomizerC-nestedInterpolants.epf AutomizerC.xml UNKNOWN
Looks like changes in SMTInterpol, right?
I don't know. This is a fairly new test from me, and of course I did not check that all settings pass.
BitwiseOperations02.c KojakC-Reach-32Bit-Default.epf KojakC.xml UNKNOWN

// Timeout
BitwiseOperations01.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT
These worked with MaxSaneBE at some point. Does increasing the timeout help here?
@@ -183,7 +184,12 @@ public void test() {
}
if (th != null) {
	message += " (Ultimate threw an Exception: " + th.getMessage() + ")";
if (result == TestResult.IGNORE) {
	skipTest(message, th);
Are you sure it's a good idea to throw a new exception and not keep the old one, at least as a wrapped exception?
@maul-esel wrote this 🙂
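For illustration, the wrapping suggested in the review comment could look roughly like the following sketch (hypothetical code, not taken from the actual Ultimate sources):

```java
// Hypothetical sketch: a skip exception that wraps the original throwable
// as its cause, so the underlying stack trace is not lost when skipping.
public class SkipTestException extends RuntimeException {
    public SkipTestException(final String message, final Throwable cause) {
        // super(message, cause) keeps the original exception as the wrapped cause
        super(message, cause);
    }

    public static void main(final String[] args) {
        final Throwable original = new IllegalStateException("solver ran out of time");
        final SkipTestException skip =
                new SkipTestException("skipped: known timeout", original);
        // The wrapped cause stays available for later inspection.
        System.out.println(skip.getMessage() + " / cause: " + skip.getCause().getMessage());
    }
}
```

With this constructor, `skipTest(message, th)` can pass the original throwable through instead of dropping it.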
} catch (final AssumptionViolatedException e) {
	notifier.fireTestAssumptionFailed(new Failure(description, e));
} catch (final SkipTestException e) {
	notifier.fireTestIgnored(description);
👍
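The dispatch in the diff above can be sketched in plain Java (a simplified illustration with the JUnit `RunNotifier` machinery omitted; class and method names are hypothetical):

```java
// Simplified sketch of the runner dispatch: the thrown exception type
// decides whether the test counts as passed, skipped, or failed.
public class RunnerDispatchSketch {
    static final class SkipTestException extends RuntimeException {
        SkipTestException(final String message) {
            super(message);
        }
    }

    enum Status { PASSED, SKIPPED, FAILED }

    static Status run(final Runnable testBody) {
        try {
            testBody.run();
            return Status.PASSED;
        } catch (final SkipTestException e) {
            // corresponds to notifier.fireTestIgnored(description) in the diff
            return Status.SKIPPED;
        } catch (final RuntimeException e) {
            // any other exception is a real failure
            return Status.FAILED;
        }
    }

    public static void main(final String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> { throw new SkipTestException("known limitation"); }));
        System.out.println(run(() -> { throw new IllegalStateException("unexpected bug"); }));
    }
}
```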
@schuessf Is there a reason why this isn't merged?
Currently many regression tests still fail (see #611). Therefore the Jenkins status ("unstable") is not a good indicator. However, some of the tests are expected to fail (and always have), because they only pass for some settings/toolchains (e.g. due to overapproximation).
This PR provides the possibility to mark those tests as skipped after running them. To that end, we store all tests (consisting of file, settings, and toolchain) along with a verdict in a separate file. If a test fails with this verdict, it is marked as skipped; otherwise it is marked as failed.
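The lookup described above can be sketched as follows (a hypothetical illustration of the mechanism, not the PR's actual implementation; the file format and class name are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the skipped-file lookup: each known failure is
// keyed by (file, settings, toolchain) and mapped to an expected verdict.
// A failing test is skipped only if it fails with exactly the stored
// verdict; any other failure is reported as a real regression.
public class SkippedVerdicts {
    private final Map<String, String> mVerdicts = new HashMap<>();

    void add(final String file, final String settings, final String toolchain,
            final String verdict) {
        mVerdicts.put(file + "|" + settings + "|" + toolchain, verdict);
    }

    boolean shouldSkip(final String file, final String settings, final String toolchain,
            final String actualVerdict) {
        return actualVerdict.equals(mVerdicts.get(file + "|" + settings + "|" + toolchain));
    }

    public static void main(final String[] args) {
        final SkippedVerdicts known = new SkippedVerdicts();
        known.add("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "EXCEPTION_OR_ERROR");
        // Same verdict as stored -> skip the test.
        System.out.println(known.shouldSkip("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "EXCEPTION_OR_ERROR"));
        // Different verdict -> report as a real failure.
        System.out.println(known.shouldSkip("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "TIMEOUT"));
    }
}
```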
There are still some open points to discuss:
- … (e.g. `Unable to prove ... Reason: overapproximation of ...`) as verdict in this skipped-file.