Added Quint specification #163
Conversation
Here is how to start experimenting with Quint and the model.

Installation

Basic resources

REPL

After installing the quint tool, run the REPL.

Inspecting the NMT model

As a first step, examine the model operators from within the REPL (a small example follows these instructions). After getting acquainted with all the operators, you can simulate the model behavior by running the simulator.

Running tests

Once a test file is generated, it is read by the Go adapter (simulation_test.go) and executed as part of the standard test suite.
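For instance, once the spec is loaded in the REPL, its operators can be evaluated directly. The snippet below is a hypothetical exploration session (no outputs are shown, since they depend on the spec's definitions); `binary` and `listToSet` are operators that appear in `formal_spec/nmt.qnt`:

```quint
// Hypothetical REPL exploration, assuming the nmt spec has been loaded.
// `binary` is defined in formal_spec/nmt.qnt and returns a BINARY value,
// i.e. a List[int]; evaluating it shows how the spec encodes numbers.
binary(6)

// Helpers used by the spec's invariants can be explored the same way.
listToSet([1, 2, 2, 3])
```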
Thank you for providing instructions on using Quint and running tests. I am currently reviewing the PR and will provide my comments as I go. However, I have an immediate question:
Another question:
Thanks @ivan-gavran for addressing my previous comments.
Hi @staheri14 , thanks for the questions and for digging deeper into this PR.
You are right that the model version and the Go version of the computation here are at a similar level of abstraction and complexity. Where we do gain is in the readability of the NMT test cases: using the model, we treat test cases explicitly as data (whereas in the original Go code they are intertwined with test execution).
The current test generation (as present in this PR) is the most similar to the existing fuzz tests (fuzz_test.go).
This touches on the three modes I was describing: a random simulation, symbolic simulation, and full verification. I'd like to emphasise that Quint tests are really Go tests. The difference comes from separating the test-case generation and execution. In the "traditional testing approach", the test cases (such as this tabular sequence) are written by hand; in the "Quint testing", the test cases are generated by the tool.
The point I wanted to make was that as the codebase evolves (e.g., parameters of a certain function are changed, or the logic changes so that some test cases behave differently), we do not have to re-write a large number of manually written test cases. Instead, we change the model and the Go function that reads the generated test cases and executes them, and a huge number of test cases is re-generated. Finally, it is fair to say that Quint is a new technology, so best testing practices are still being established and evaluated.
Thanks @ivan-gavran for your thorough response!
Thanks for clarifying your point. Based on your explanation, the Quint model for nmt simplifies and automates the process of generating test data, which has two primary benefits (echoing what you said, essentially): 1) it allows for the identification of hidden corner cases that would otherwise be difficult to discover manually; and 2) it reduces the workload of reproducing test data and refactoring tests when new features are added. However, in the current state of the PR, the test data generation follows a random simulation; hence, coverage-wise (referring to the first point above) the generated test data is similar to the data generated by the existing fuzz tests.

I would also appreciate it if you could clarify whether Quint can be used as a specification for nmt; I would be curious to learn more about how that would work. For example, could the Quint model (compared to the current Go implementation) be leveraged to facilitate the implementation of nmt in a different language? My understanding is that it cannot replace a formal English specification, and we should not rely on Quint as a replacement for specifications. I would like to know your point of view on this.

I'm interested to know how to generate test data that intentionally fails for specific reasons using Quint, and how to incorporate that data into the tests of the nmt repository. Also, I think Quint-generated test data is not meant for unit testing of the functions? For example, tests such as the one at Line 67 in 05017fc.
I see your point, thanks for the clarification!
Codecov Report
```
@@           Coverage Diff           @@
##           master     #163   +/-   ##
=======================================
  Coverage   95.39%   95.39%
=======================================
  Files           5        5
  Lines         565      565
=======================================
  Hits          539      539
  Misses         15       15
  Partials       11       11
```
I like this discussion :)
That is correct!
I believe Quint could be used as a specification for nmt. The most promising approach would be to interleave English descriptions (for intuition) with Quint expressions (for precision). More generally, I see Quint code as a replacement for pseudocode: you now get specs with pseudocode that you can simulate, automatically reason about, or connect to test cases.
In the latest commit, I wrote the spec and the Go adapter that mimic the existing tests. For example, for the test case described by

```go
{
    "three leaves and not found but with range", 2,
    append(repeat(generateLeafData(2, 0, 1, []byte("_data")), 2), newNamespaceDataPair([]byte{1, 1}, []byte("_data"))),
    []byte{0, 1},
    2, 3,
    false,
},
```

the invariant is defined by

```quint
val fiveLeavesAndNotFoundButWithinRange =
  val leavesSet = listToSet(leaves_namespace_idx_v)
  and {
    length(leaves_namespace_idx_v) == 5,
    not(leavesSet.contains(namespace_v)),
    min(leavesSet) < namespace_v,
    max(leavesSet) > namespace_v
  }
```

The difference between the original test description and the Quint description is that the original description simultaneously describes a scenario and technically achieves it. In the commit, the Quint part only describes what a test scenario should be about, while the technical details (e.g., byte representation) are left to the adapter, and the exact data values are generated (and stored in a json file).

After negating the invariant, a test case is found as a violation of that (negated) invariant. This example illustrates that one does not have to explicitly write the desired corruption cases. What we do instead is describe the data by an invariant. You are right that Quint is not meant primarily for unit tests, but it can be used for them, too (as the example illustrates).
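Concretely, the negation step could be written next to the invariant (a minimal sketch; the name of the negated value is illustrative, and it would live inside the same Quint module):

```quint
// Negate the scenario description: any state violating this "invariant"
// is exactly a state satisfying fiveLeavesAndNotFoundButWithinRange,
// i.e. a test case we want to extract.
val fiveLeavesAndNotFoundButWithinRangeNegated =
  not(fiveLeavesAndNotFoundButWithinRange)
```

The simulator is then asked to check the negated value as an invariant, and the counterexample trace it reports is the generated test case.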
Yes, the test data is generated using random simulation. The other two modes are coming soon: they depend on connecting Quint to the Apalache model checker (there is a tracking issue for this).
In the previous comment, I mentioned the new commit, which mimics the existing tests. For example, where the original Go code defines the test case

```go
{
    "two leaves and found", 1,
    append(generateLeafData(1, 0, 1, []byte("_data")), generateLeafData(1, 1, 2, []byte("_data"))...),
    []byte{1},
    1, 2,
    true,
},
```

Quint defines it by

```quint
val twoLeavesAndFound =
  and {
    length(leaves_namespace_idx_v) == 2,
    val leavesSet = listToSet(leaves_namespace_idx_v)
    leavesSet.contains(namespace_v)
  }
```

Finally, we negate the property and use the negated property as an invariant, whose violation is then a test case.
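As in the first example, the negation might be just a one-line definition next to the property (an illustrative sketch; the chosen name is hypothetical):

```quint
// Violations of this negated property are precisely "two leaves and found" scenarios.
val twoLeavesAndFoundNegated = not(twoLeavesAndFound)
```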
Would the current Quint model for nmt be applicable to the other two simulation modes, or would a different Quint model be required for those modes?
formal_spec/nmt.qnt (outdated)

```quint
type BINARY = List[int]

def binary(n: int) : BINARY =
```
It would be helpful if all functions in this .qnt file were accompanied by a description.
will do!
The same model can be used for all three modes. The difference is only in how they look for an invariant violation (and, therefore, in what guarantees they can provide).
Added further comments and questions.
@ivan-gavran Thanks a lot for addressing the comments, they look good to me! The issue with gobencher is currently being looked into. As soon as it's resolved, I'll let you know. At that point, I'll kindly ask you to merge the master branch into your branch one last time. After that, we should be ready to merge your PR :)
@ivan-gavran Would you mind updating your branch with the master branch one last time? Then we are good to merge this PR (given that the CIs are fine, excluding gobencher). Thanks!
thanks @ivan-gavran !!
LGTM 👍 👍
The merge-base changed after approval.
Overview

This PR introduces a formal specification of the NMT proof/verification logic in Quint.

The benefits of having a Quint specification are threefold:

- Properties can be checked against the specification: verificationAlwaysCorrect is an example of a property against which the specification can be checked.
- Tests can be generated from it: nmtTest iteratively generates proofs and non-deterministically corrupts them. These generated test runs are exported in a json format (e.g., the file ITF_traces/runTest.itf.json). To be executed as part of the standard suite, an adapter (simulation_test.go) is necessary. (The adapter iterates through the json-represented execution states and translates them to function calls.) The generation of the tests happens through simulation; in that sense, it is similar to the existing fuzz_test.go tests, except that it also adds corruption of the data. A generic sketch of such a simulated run is given after this list.

Current limitations:

- The model does not capture the ignoreMaxNamespace option. Modelling it correctly depends on the decision of what the desired behaviour is (issue "IgnoreMaxNS leads to false computing maxNs", #148).
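The sketch below is a minimal, self-contained illustration (not taken from nmt.qnt; all names in this module are hypothetical) of the kind of artifact the simulator works with: a small state machine plus a run whose visited states can be exported as an ITF json trace and replayed by a Go adapter.

```quint
// A generic sketch, not the nmt spec itself: a state machine and a run
// that the Quint simulator can execute and export state by state.
module trace_sketch {
  var counter: int

  action init = counter' = 0
  action step = counter' = counter + 1

  // Executing this run visits the states counter = 0, 1, 2; exporting those
  // states as an ITF json trace gives a Go adapter something to replay.
  run threeStatesTest = init.then(step).then(step)
}
```

In the actual specification, nmtTest plays the role of such a run, with the additional non-deterministic corruption step described above.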