Skip to content
This repository has been archived by the owner on Oct 3, 2021. It is now read-only.

Excluding 'ntdrivers-todo/diskperf_true-unreach-call.i.cil.c' from SV-COMP-2018 due to undefined behaviour. #498

Closed

Conversation

marek-trtik
Copy link
Contributor

Moving the benchmark diskperf_true-unreach-call.i.cil.c from ntdrivers directory to ntdrivers-todo in order to exclude it from the SV-COMP-2018 due to presence of an undefined behaviour.

The benchmark was extended by comments of the form '@UNDEFINED_BEHAVIOUR: ' describing a feasible path to an undefined behaviour (dereference of invalid poiter), which shows the incorrect classification of the benchmark as true-unreach-call.

  • programs added to new and appropriately named directory

  • license present and acceptable (either in separate file or as comment at beginning of program)

  • contributed-by present (either in README file or as comment at beginning of program)

  • programs added to a .set file of an existing category, or new sub-category established (if justified)

  • intended property matches the corresponding .prp file

  • expected answer in file names according to convention

  • architecture (32 bit vs. 64 bit) matches the corresponding .cfg file
  • original sources present
  • preprocessed files present
  • preprocessed files generated with correct architecture
  • Makefile added with correct content and without overly broad suppression of warnings

Adding comments of the form '@UNDEFINED_BEHAVIOUR: <message>'
describing the feasible path to the undefined behaviour (dereference
of invalid poiter).
@mutilin
Copy link
Contributor

mutilin commented Nov 14, 2017

@marek-trtik No chance to fix undefined behaviour?

@mutilin
Copy link
Contributor

mutilin commented Nov 14, 2017

Please look at the failing check of Travis CI build

@dbeyer
Copy link
Member

dbeyer commented Nov 14, 2017

@marek-trtik The goal is not to delete, but to fix.

@dbeyer dbeyer closed this Nov 14, 2017
@marek-trtik
Copy link
Contributor Author

@mutilin : I do not see any easy way how to fix that.

@marek-trtik
Copy link
Contributor Author

marek-trtik commented Nov 14, 2017

@mutilin Regarding Travis issue: The original version of the benchmark does NOT compile and link (unresolved externals). And I only added few C comment lines describing the feasible execution path leading to an undefined behavior.

@marek-trtik
Copy link
Contributor Author

@dbeyer : First of all, I did NOT remove the benchmark from the repository. I only moved it to a TODO directory, so that it is not used in the SV-COMP 2018. That gives the benchmark a space to be fixed later, ideally for the next competition. The reason for that is, fixing the benchmarks is NOT easy. In all cases, the fact is that the benchmark has incorrect classification due to the proven occurrence of an undefined behavior. So, keeping the benchmark for this year competition would certainly compromise confidence of the results.

@marek-trtik
Copy link
Contributor Author

@mutilin : Regarding Travis issue: I put the missing Makefile into the directory. So, this issue should be ok now.

@lucasccordeiro
Copy link
Contributor

@dbeyer: Clearly, there is an undefined behaviour in this benchmark, as pointed by @marek-trtik. Our suggestion is to move this benchmark to a TODO list so that the benchmark creator can fix it later (hopefully for this year or next year competition). If we keep benchmarks with undefined behaviour, we will certainly compromise the relevance of the results.

@mutilin
Copy link
Contributor

mutilin commented Nov 14, 2017

@marek-trtik You already know which memory is not initialized. Could you fix it by initialization and run your tool again?

@marek-trtik
Copy link
Contributor Author

@mutilin : I fixed the initialisation, see:

https://github.com/marek-trtik/sv-benchmarks/blob/exclude_ntdrivers_diskperf_true/c/ntdrivers-todo/diskperf_true-unreach-call.i.cil.c#L3172

However, with respect to this year's competition the fix solves almost nothing, because there are literally tons of other issue. For example, I marked another path to another undefined behaviour. This path is almost the same as the original one, only 4 lines longer. Please, look into the benchmark and you will see in how bad state the benchmark is and how much work it would require to fully fix it.

To sum up, it is not possible for me to fix all issues in this benchmark till the deadline of this year competition. Therefore, rather than compromising results of the competition I proposed to exclude it until it is fully fixed.

@mutilin
Copy link
Contributor

mutilin commented Nov 15, 2017

I fixed the initialisation

I have looked at it and found that you have initialized some inner object, not the root one.
DEVICE_OBJECT devobj ;
You need to initialize devobj of struct type DEVICE_OBJECT in a recursive manner nondeterministically.

struct _IO_STACK_LOCATION io_stack_location; // HOWEVER, HOW TO INITIALISE FIELDS? THIS WOULD REQUIRE UNDERSTANDING INTENDED FUNCTIONALITY OF THIS BENCHMARK - NONTRIVIAL MANUAL TASK!!!!

You are writing verification tools, which is much more nontrivial manual task)
I suspect that it is enough to initialize the fields in a recursive manner nondeterministically.

@marek-trtik
Copy link
Contributor Author

marek-trtik commented Nov 15, 2017

@mutilin : The response to your note could be the same as for the issue #500 . So, let me quote it here:
"My intention was to fix the issue I proven to be present, while preserving the original program's behaviour as much of as possible.

Changing the benchmark in anyhow more invasive manner without a proof that the changes are nothing but minimal necessary prevention/elimination of real issues in the benchmark may lead to eligible disagreement amongst contributors to the competition in acceptance of the changes."

@mikhailramalho
Copy link
Contributor

Just so you know,

The (original version of the) benchmark has a division by zero in line 2590, confirmed by both ESBMC and CBMC.

@mutilin
Copy link
Contributor

mutilin commented Nov 17, 2017

I think that for the purpose of

keep a copy of the original benchmark and reclassify it to false-def-behavior.

suggested by @peterschrammel we need the new pull request.
This one had different purpose. Now it has conflicts

@dbeyer
Copy link
Member

dbeyer commented Nov 21, 2017

Further fixes to this verification task should be done via a separate pull request.

@dbeyer dbeyer closed this Nov 21, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

Successfully merging this pull request may close these issues.

8 participants