Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip unsupported regression tests #615

Open
wants to merge 19 commits into
base: dev
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions trunk/examples/concurrent/bpl/regression/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
// Timeout with Automizer, GemCutter succeeds
MirabelleConcurrentIncrement.bpl ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
concurrent_increment.bpl ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
equal-array-sums.bpl ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
loop-lockstep-example.bpl ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
sum_invariant.bpl ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
2 changes: 2 additions & 0 deletions trunk/examples/concurrent/pthreads/regression/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
// Timeout with Automizer, GemCutter succeeds
NonAtomicIncrement_2Threads.c ReachSafety-32bit-Automizer.epf ReachSafety.xml TIMEOUT
18 changes: 18 additions & 0 deletions trunk/examples/programs/FloatingPoint/regression/c/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
// Timeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we handle them by increasing the timeout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried it, but I am not sure, if it succeeds then.

BvaddWithThreeArguments.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml TIMEOUT
SAS09-Float.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml TIMEOUT
SAS09-Float.c AutomizerC-Reach-32bit-Z3-IcWpLv-Float_Const.epf AutomizerC.xml TIMEOUT
ctrans-float-rounding.c AutomizerC-Reach-32bit-Z3-IcWpLv-Float_Const.epf AutomizerC.xml TIMEOUT
cbmc_float4_PART1.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml TIMEOUT

// MathSAT does not support quantifiers
ctrans-float-rounding.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldnt it then be better to change the directory structure here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do it that way, but if we simply exclude them, they still run and we could still find any changes.

cbmc_float-no-simp1.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float-to-double2.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float12.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float13.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float19.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float20_Part1.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float20_Part3.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
cbmc_float6.i AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
QuantifiedFloatInvariant.c AutomizerCInline-Reach-32bit-MathSAT-IcSpLv-Float_Const.epf AutomizerC.xml EXCEPTION_OR_ERROR
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
// Fails due to overapproximation of bitwise operators in the integer translation
ShiftLeft06.c AutomizerC-SignedIntegerOverflowCheck.epf AutomizerC.xml UNKNOWN
ShiftLeft07.c AutomizerC-SignedIntegerOverflowCheck.epf AutomizerC.xml UNKNOWN
2 changes: 2 additions & 0 deletions trunk/examples/programs/quantifier/regression/bpl/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
// Overapproximation
AuxVarInCall.bpl AutomizerBpl-forwardPredicates.epf AutomizerBpl.xml UNKNOWN
14 changes: 14 additions & 0 deletions trunk/examples/programs/regression/bpl/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
// Timeout
BugLiveVariables03-Safe.bpl AutomizerBpl-FPandBP.epf AutomizerBpl.xml TIMEOUT
BugNestedSsaWithPendingContexts.bpl AutomizerBpl-forwardPredicates.epf AutomizerBpl.xml TIMEOUT
BugNoPredecessorOfReturnTransition.bpl PdrAutomizerBpl.epf PdrAutomizerBpl.xml TIMEOUT
nestedWhileSimple-Safe.bpl PdrAutomizerBpl.epf PdrAutomizerBpl.xml TIMEOUT

// Overapproximation
Overapproximation.bpl AutomizerBpl-nestedInterpolants.epf AutomizerBpl.xml UNKNOWN
Overapproximation.bpl AutomizerBpl-forwardPredicates.epf AutomizerBpl.xml UNKNOWN
Overapproximation.bpl AutomizerBpl-FPandBP.epf AutomizerBpl.xml UNKNOWN
Overapproximation.bpl PdrAutomizerBpl.epf PdrAutomizerBpl.xml UNKNOWN

// The solver is unable to evaluate the formula, PDR cannot handle this
BugLiveVariables04-Safe.bpl PdrAutomizerBpl.epf PdrAutomizerBpl.xml EXCEPTION_OR_ERROR
23 changes: 23 additions & 0 deletions trunk/examples/programs/regression/c/.skip
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
// Fails due to overapproximation of bitwise operators in the integer translation
BitwiseOperations02.c AutomizerC-forwardPredicates.epf AutomizerC.xml UNKNOWN
BitwiseOperations02.c AutomizerC-forwardPredicates_const.epf AutomizerC.xml UNKNOWN
BitwiseOperations02.c AutomizerC-nestedInterpolants.epf AutomizerC.xml UNKNOWN
BitwiseOperations02.c AutomizerC-nestedInterpolants_const.epf AutomizerC.xml UNKNOWN
BitwiseOperations02.c BlockEncodingV2AutomizerC-forwardPredicates.epf BlockEncodingV2AutomizerC.xml UNKNOWN
BitwiseOperations02.c KojakC-Reach-32Bit-Default.epf KojakC.xml UNKNOWN

// Timeout
BitwiseOperations01.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these worked with MaxSaneBE at some point. Does increasing the timeout help here?

BitwiseOperations02.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT
ShortCircuit-SideEffect-DoStatement-Safe.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT
ShortCircuit-SideEffect-ForStatement-Safe.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT
StructPositiveSum-Safe.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT
StructPositiveSum-Safe.c AutomizerC-forwardPredicates.epf AutomizerC.xml TIMEOUT
StructPositiveSum-Safe.c AutomizerC-forwardPredicates_const.epf AutomizerC.xml TIMEOUT
StructPositiveSum-Safe.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT
StructPositiveSum-Safe.c BlockEncodingV2AutomizerC-forwardPredicates.epf BlockEncodingV2AutomizerC.xml TIMEOUT
array10_pattern_simplified.c AutomizerC-BitvectorTranslation.epf AutomizerC.xml TIMEOUT

// Unable to decide satisfiability of path constraint
array10_pattern_simplified.c AutomizerC-nestedInterpolants.epf AutomizerC.xml UNKNOWN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like changes in smtinterpol, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. This is a fairly new test from me and I of course did not check that all settings pass,

array10_pattern_simplified.c AutomizerC-nestedInterpolants_const.epf AutomizerC.xml UNKNOWN
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@
import de.uni_freiburg.informatik.ultimate.test.decider.ITestResultDecider;
import de.uni_freiburg.informatik.ultimate.test.decider.ITestResultDecider.TestResult;
import de.uni_freiburg.informatik.ultimate.test.junitextension.testfactory.FactoryTestMethod;
import de.uni_freiburg.informatik.ultimate.test.junitextension.testfactory.FactoryTestRunner.SkipTestException;
import de.uni_freiburg.informatik.ultimate.test.mocks.ConsoleLogger;
import de.uni_freiburg.informatik.ultimate.test.reporting.IIncrementalLog;
import de.uni_freiburg.informatik.ultimate.test.reporting.ITestLogfile;
Expand Down Expand Up @@ -124,7 +125,7 @@ public void test() {
final ILogger logger = getLoggerFromStarter(starter);
logger.info("Deciding this test: " + deciderName);
result = mDecider.getTestResult(starter.getServices());
if (!returnCode.isOK() && result != TestResult.FAIL) {
if (!returnCode.isOK() && result != TestResult.FAIL && result != TestResult.IGNORE) {
logger.fatal("#################### Overwriting decision of " + deciderName
+ " and setting test status to FAIL ####################");
logger.fatal("Ultimate returned an unexpected status:");
Expand Down Expand Up @@ -183,7 +184,12 @@ public void test() {
}
if (th != null) {
message += " (Ultimate threw an Exception: " + th.getMessage() + ")";
if (result == TestResult.IGNORE) {
skipTest(message, th);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure its a good idea to throw a new exception and dont keep the old one? at least as wrapped exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maul-esel wrote this 🙂

}
failTest(message, th);
} else if (result == TestResult.IGNORE) {
skipTest(message);
} else {
failTest(message);
}
Expand Down Expand Up @@ -266,6 +272,14 @@ private static void failTest(final String message, final Throwable th) throws Ul
throw exception;
}

private static void skipTest(final String message) {
throw new SkipTestException(message);
}

private static void skipTest(final String message, final Throwable cause) {
throw new SkipTestException(message, cause);
}

@Override
public int compareTo(final UltimateTestCase o) {
return mUltimateRunDefinition.compareTo(o.mUltimateRunDefinition);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,10 @@ public enum TestResult {
/**
* Success should always represent a JUnit test success.
*/
SUCCESS
SUCCESS,
/**
* Marks a test as ignored
*/
IGNORE
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,22 +2,22 @@
* Copyright (C) 2014-2015 Daniel Dietsch (dietsch@informatik.uni-freiburg.de)
* Copyright (C) 2014-2015 Matthias Heizmann (heizmann@informatik.uni-freiburg.de)
* Copyright (C) 2015 University of Freiburg
*
*
* This file is part of the ULTIMATE UnitTest Library.
*
*
* The ULTIMATE UnitTest Library is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
*
* The ULTIMATE UnitTest Library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
*
* You should have received a copy of the GNU Lesser General Public License
* along with the ULTIMATE UnitTest Library. If not, see <http://www.gnu.org/licenses/>.
*
*
* Additional permission under GNU GPL version 3 section 7:
* If you modify the ULTIMATE UnitTest Library, or any covered work, by linking
* or combining it with Eclipse RCP (or a modified version of Eclipse RCP),
Expand All @@ -38,31 +38,46 @@
import de.uni_freiburg.informatik.ultimate.test.util.TestUtil;

/**
* Use keywords in filename and first line to decide correctness of safety
* checker results.
*
* Use keywords in filename and first line to decide correctness of safety checker results.
*
* @author heizmann@informatik.uni-freiburg.de
* @author Daniel Dietsch (dietsch@informatik.uni-freiburg.de)
*
*
*/
public class SafetyCheckTestResultDecider extends ThreeTierTestResultDecider<SafetyCheckerOverallResult> {

/**
*
*
* @param ultimateRunDefinition
*
*
* @param unknownIsJUnitSuccess
* if true the TestResult UNKNOWN is a success for JUnit, if
* false, the TestResult UNKNOWN is a failure for JUnit.
* if true the TestResult UNKNOWN is a success for JUnit, if false, the TestResult UNKNOWN is a failure
* for JUnit.
*
* @param overridenExpectedVerdict
* The expected verdict overridden in a separate file, null if not present.
*/
public SafetyCheckTestResultDecider(final UltimateRunDefinition ultimateRunDefinition, final boolean unknownIsJUnitSuccess) {
public SafetyCheckTestResultDecider(final UltimateRunDefinition ultimateRunDefinition,
final boolean unknownIsJUnitSuccess, final String overridenExpectedVerdict) {
super(ultimateRunDefinition, unknownIsJUnitSuccess, overridenExpectedVerdict);
}

/**
*
* @param ultimateRunDefinition
*
* @param unknownIsJUnitSuccess
* if true the TestResult UNKNOWN is a success for JUnit, if false, the TestResult UNKNOWN is a failure
* for JUnit.
*/
public SafetyCheckTestResultDecider(final UltimateRunDefinition ultimateRunDefinition,
final boolean unknownIsJUnitSuccess) {
super(ultimateRunDefinition, unknownIsJUnitSuccess);
}

@Override
public IExpectedResultFinder<SafetyCheckerOverallResult> constructExpectedResultFinder() {
return new KeywordBasedExpectedResultFinder<>(
TestUtil.constructFilenameKeywordMap_AllSafetyChecker(), null,
return new KeywordBasedExpectedResultFinder<>(TestUtil.constructFilenameKeywordMap_AllSafetyChecker(), null,
TestUtil.constructFirstlineKeywordMap_SafetyChecker());
}

Expand All @@ -76,6 +91,18 @@ public ITestResultEvaluation<SafetyCheckerOverallResult> constructTestResultEval
return new SafetyCheckerTestResultEvaluation();
}

private SafetyCheckerOverallResult getOverridenExpectedVerdict() {
if (mOverridenExpectedVerdict == null) {
return null;
}
try {
return SafetyCheckerOverallResult.valueOf(mOverridenExpectedVerdict);
} catch (final IllegalArgumentException e) {
// Cannot convert to SafetyCheckerOverallResult, fall back to provided expected result
return null;
}
}

public class SafetyCheckerTestResultEvaluation implements ITestResultEvaluation<SafetyCheckerOverallResult> {
private String mCategory;
private String mMessage;
Expand All @@ -84,13 +111,19 @@ public class SafetyCheckerTestResultEvaluation implements ITestResultEvaluation<
@Override
public void evaluateTestResult(final IExpectedResultFinder<SafetyCheckerOverallResult> expectedResultFinder,
final IOverallResultEvaluator<SafetyCheckerOverallResult> overallResultDeterminer) {
final SafetyCheckerOverallResult expectedVerdict = getOverridenExpectedVerdict();
if (expectedVerdict != null) {
mMessage = "ExpectedResult (overriden): " + expectedVerdict;
compareToOverallResult(expectedVerdict, overallResultDeterminer, true);
return;
}
evaluateExpectedResult(expectedResultFinder);
switch (expectedResultFinder.getExpectedResultFinderStatus()) {
case ERROR:
// we will not evaluate overall result;
return;
case EXPECTED_RESULT_FOUND:
compareToOverallResult(expectedResultFinder.getExpectedResult(), overallResultDeterminer);
compareToOverallResult(expectedResultFinder.getExpectedResult(), overallResultDeterminer, false);
return;
case NO_EXPECTED_RESULT_FOUND:
evaluateOverallResultWithoutExpectedResult(overallResultDeterminer);
Expand Down Expand Up @@ -130,7 +163,8 @@ private void evaluateOverallResultWithoutExpectedResult(
}

private void compareToOverallResult(final SafetyCheckerOverallResult expectedResult,
final IOverallResultEvaluator<SafetyCheckerOverallResult> overallResultDeterminer) {
final IOverallResultEvaluator<SafetyCheckerOverallResult> overallResultDeterminer,
final boolean skipOnSuccess) {
final SafetyCheckerOverallResult overallResult = overallResultDeterminer.getOverallResult();
final String overallResultMsg = overallResultDeterminer.generateOverallResultMessage();

Expand All @@ -139,7 +173,7 @@ private void compareToOverallResult(final SafetyCheckerOverallResult expectedRes
switch (overallResult) {
case EXCEPTION_OR_ERROR:
mCategory = overallResult + " (Expected:" + expectedResult + ") " + overallResultMsg;
mTestResult = TestResult.FAIL;
mTestResult = skipOnSuccess && expectedResult == overallResult ? TestResult.IGNORE : TestResult.FAIL;
break;
case SAFE:
if (expectedResult == SafetyCheckerOverallResult.SAFE) {
Expand All @@ -162,17 +196,20 @@ private void compareToOverallResult(final SafetyCheckerOverallResult expectedRes
break;
case UNSAFE_OVERAPPROXIMATED:
if (expectedResult == overallResult) {
mTestResult = TestResult.SUCCESS;
mTestResult = skipOnSuccess ? TestResult.IGNORE : TestResult.SUCCESS;
} else if (expectedResult == SafetyCheckerOverallResult.UNSAFE) {
mTestResult = TestResult.SUCCESS;
} else {
mTestResult = TestResult.UNKNOWN;
}
break;
case UNKNOWN:
case TIMEOUT:
// syntax error should always have been found
if (expectedResult == SafetyCheckerOverallResult.SYNTAX_ERROR) {
mTestResult = TestResult.FAIL;
} else if (skipOnSuccess && expectedResult == overallResult) {
mTestResult = TestResult.IGNORE;
} else {
mTestResult = TestResult.UNKNOWN;
}
Expand All @@ -184,14 +221,6 @@ private void compareToOverallResult(final SafetyCheckerOverallResult expectedRes
mTestResult = TestResult.FAIL;
}
break;
case TIMEOUT:
// syntax error should always have been found
if (expectedResult == SafetyCheckerOverallResult.SYNTAX_ERROR) {
mTestResult = TestResult.FAIL;
} else {
mTestResult = TestResult.UNKNOWN;
}
break;
case UNSUPPORTED_SYNTAX:
mTestResult = TestResult.FAIL;
break;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@ public abstract class ThreeTierTestResultDecider<OVERALL_RESULT> implements ITes
private final IExpectedResultFinder<OVERALL_RESULT> mExpectedResultEvaluation;
private IOverallResultEvaluator<OVERALL_RESULT> mUltimateResultEvaluation;
private ITestResultEvaluation<OVERALL_RESULT> mTestResultEvaluation;
protected final String mOverridenExpectedVerdict;

/**
*
Expand All @@ -67,15 +68,32 @@ public abstract class ThreeTierTestResultDecider<OVERALL_RESULT> implements ITes
* @param unknownIsJUnitSuccess
* if true the TestResult UNKNOWN is a success for JUnit, if false, the TestResult UNKNOWN is a failure
* for JUnit.
*
* @param overridenExpectedVerdict
* The expected verdict overridden in a separate file, null if not present.
*/
public ThreeTierTestResultDecider(final UltimateRunDefinition ultimateRunDefinition,
final boolean unknownIsJUnitSuccess) {
final boolean unknownIsJUnitSuccess, final String overridenExpectedVerdict) {
mUnknownIsJUnitSuccess = unknownIsJUnitSuccess;
mOverridenExpectedVerdict = overridenExpectedVerdict;
mUltimateRunDefinition = ultimateRunDefinition;
mExpectedResultEvaluation = constructExpectedResultFinder();
mExpectedResultEvaluation.findExpectedResult(ultimateRunDefinition);
}

/**
*
* @param ultimateRunDefinition
*
* @param unknownIsJUnitSuccess
* if true the TestResult UNKNOWN is a success for JUnit, if false, the TestResult UNKNOWN is a failure
* for JUnit.
*/
public ThreeTierTestResultDecider(final UltimateRunDefinition ultimateRunDefinition,
final boolean unknownIsJUnitSuccess) {
this(ultimateRunDefinition, unknownIsJUnitSuccess, null);
}

@Override
public final TestResult getTestResult(final IUltimateServiceProvider services) {
mUltimateResultEvaluation = constructUltimateResultEvaluation();
Expand Down Expand Up @@ -123,8 +141,10 @@ public boolean getJUnitSuccess(final TestResult testResult) {
return mUnknownIsJUnitSuccess;
case FAIL:
return false;
case IGNORE:
return false;
default:
throw new AssertionError("unknown actualResult");
throw new AssertionError("unknown actualResult: " + testResult);
}

}
Expand Down
Loading