Skip unsupported regression tests #615
base: dev
Conversation
When you run a test suite with skipped tests, can you infer from the test suite result alone which tests were skipped? For example by looking into the …
@@ -0,0 +1,18 @@
// Timeout
Can we handle them by increasing the timeout?
I just tried it, but I am not sure if it succeeds then.
cbmc_float4_PART1.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml TIMEOUT

// MathSAT does not support quantifiers
ctrans-float-rounding.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
Wouldn't it then be better to change the directory structure here?
We could do it that way, but if we just mark them as skipped, they still run and we can still detect any changes.
array10_pattern_simplified.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT

// Unable to decide satisfiability of path constraint
array10_pattern_simplified.c AutomizerC-nestedInterpolants.epf AutomizerC.xml UNKNOWN
Looks like changes in SMTInterpol, right?
I don't know. This is a fairly new test from me, and of course I did not check that all settings pass.
BitwiseOperations02.c KojakC-Reach-32Bit-Default.epf KojakC.xml UNKNOWN

// Timeout
BitwiseOperations01.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT
These worked with MaxSaneBE at some point. Does increasing the timeout help here?
@@ -183,7 +184,12 @@ public void test() {
}
if (th != null) {
	message += " (Ultimate threw an Exception: " + th.getMessage() + ")";
if (result == TestResult.IGNORE) {
	skipTest(message, th);
Are you sure it's a good idea to throw a new exception and not keep the old one, at least as a wrapped exception?
@maul-esel wrote this 🙂
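For illustration, the wrapping suggested in the review comment could look roughly like the following sketch (hypothetical code, not taken from the actual Ultimate sources):

```java
// Hypothetical sketch: a skip exception that wraps the original throwable
// as its cause, so the underlying stack trace is not lost when skipping.
public class SkipTestException extends RuntimeException {
    public SkipTestException(final String message, final Throwable cause) {
        // super(message, cause) keeps the original exception as the wrapped cause
        super(message, cause);
    }

    public static void main(final String[] args) {
        final Throwable original = new IllegalStateException("solver ran out of time");
        final SkipTestException skip =
                new SkipTestException("skipped: known timeout", original);
        // The wrapped cause stays available for later inspection.
        System.out.println(skip.getMessage() + " / cause: " + skip.getCause().getMessage());
    }
}
```

With this constructor, `skipTest(message, th)` can pass the original throwable through instead of dropping it.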
} catch (final AssumptionViolatedException e) {
	notifier.fireTestAssumptionFailed(new Failure(description, e));
} catch (final SkipTestException e) {
	notifier.fireTestIgnored(description);
👍
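The dispatch in the diff above can be sketched in plain Java (a simplified illustration with the JUnit `RunNotifier` machinery omitted; class and method names are hypothetical):

```java
// Simplified sketch of the runner dispatch: the thrown exception type
// decides whether the test counts as passed, skipped, or failed.
public class RunnerDispatchSketch {
    static final class SkipTestException extends RuntimeException {
        SkipTestException(final String message) {
            super(message);
        }
    }

    enum Status { PASSED, SKIPPED, FAILED }

    static Status run(final Runnable testBody) {
        try {
            testBody.run();
            return Status.PASSED;
        } catch (final SkipTestException e) {
            // corresponds to notifier.fireTestIgnored(description) in the diff
            return Status.SKIPPED;
        } catch (final RuntimeException e) {
            // any other exception is a real failure
            return Status.FAILED;
        }
    }

    public static void main(final String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> { throw new SkipTestException("known limitation"); }));
        System.out.println(run(() -> { throw new IllegalStateException("unexpected bug"); }));
    }
}
```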
@schuessf Is there a reason why this isn't merged?
Currently many regression tests still fail (see #611). Therefore the Jenkins status ("unstable") is not a good indicator. However, some of the tests are expected to fail (and always have), because they only pass for some settings/toolchains (e.g. due to overapproximation).
This PR provides the possibility to mark those tests as skipped after running them. To that end, we store all tests (consisting of file, settings, and toolchain) along with a verdict in a separate file. If a test fails with this verdict, it is marked as skipped; otherwise it is marked as failed.
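The lookup described above can be sketched as follows (a hypothetical illustration of the mechanism, not the PR's actual implementation; the file format and class name are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the skipped-file lookup: each known failure is
// keyed by (file, settings, toolchain) and mapped to an expected verdict.
// A failing test is skipped only if it fails with exactly the stored
// verdict; any other failure is reported as a real regression.
public class SkippedVerdicts {
    private final Map<String, String> mVerdicts = new HashMap<>();

    void add(final String file, final String settings, final String toolchain,
            final String verdict) {
        mVerdicts.put(file + "|" + settings + "|" + toolchain, verdict);
    }

    boolean shouldSkip(final String file, final String settings, final String toolchain,
            final String actualVerdict) {
        return actualVerdict.equals(mVerdicts.get(file + "|" + settings + "|" + toolchain));
    }

    public static void main(final String[] args) {
        final SkippedVerdicts known = new SkippedVerdicts();
        known.add("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "EXCEPTION_OR_ERROR");
        // Same verdict as stored -> skip the test.
        System.out.println(known.shouldSkip("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "EXCEPTION_OR_ERROR"));
        // Different verdict -> report as a real failure.
        System.out.println(known.shouldSkip("ctrans-float-rounding.c",
                "AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf",
                "AutomizerC.xml", "TIMEOUT"));
    }
}
```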
There are still some open points to discuss:
- … (e.g. `Unable to prove ... Reason: overapproximation of ...`) as verdict in this skipped-file.