Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translation from Ivy to mypyvy #76

Open
wants to merge 47 commits into
base: master
Choose a base branch
from

Conversation

dranov
Copy link

@dranov dranov commented Dec 27, 2023

This PR adds a method to automatically translate Ivy specifications to mypyvy.

This works by retrofitting the SMT-LIB generated by Ivy into mypyvy syntax, so the resulting mypyvy specifications are quite a bit uglier than those a human would write.

You invoke the method by adding attribute method = convert_to_mypyvy in the isolate that needs to be translated (or at the top-level of the Ivy file), and then invoke ivy_check on the file. When invoked with the convert_to_mypyvy method, ivy_check takes two (optional, default to false) arguments:

  • unfold_macros=true/false – determines whether macro definitions in the SMT-LIB generated by Ivy are unfolded (unfolding macros results in larger mypyvy transitions, but with fewer variables; these are more similar to what humans would write and might be easier for the SMT solver in certain cases)
  • simplify=true/false – determines whether Z3's ctx-solver-simplify tactic is applied; this uses Z3 to remove redundant clauses in the SMT-LIB generated by Ivy, but this can take on the order of hours for larger Ivy specifications; when simplify=false, the much faster (but less effective) ctx-simplify tactic is used

The converted mypyvy specification is generated in the directory in which ivy_check is invoked and takes the name of the Ivy specification, e.g. tpc.ivy becomes tpc.pyv.

In terms of implementation:

  • I have ported two files from mypyvy, syntax.py and utils.py so we generate mypyvy specifications directly in mypyvy's internal representation and use mypyvy's own pretty-printer. Both Ivy and mypyvy are MIT licensed. Would it be preferable to depend on mypyvy as package rather than copy the files?

  • The only change I have made to Ivy proper is in ivy_solver.py, to disable some trivial optimizations of the generated SMT-LIB. As part of the translation (more specifically, the simplifications via Z3), I have added code to translate SMT-LIB generated by Ivy back into Ivy formulas and added a round-tripping test (Ivy -> SMT-LIB -> Ivy) to ensure my translation was faithful. I've disabled those simplifications in ivy_solver.py because they prevented my round-tripping tests from working.

Due to how it works (by leveraging Ivy's translation to SMT-LIB), the translation should support all Ivy features and it should require little to no maintenance if/when the language evolves. I have tested this only with Ivy specifications with #lang ivy1.7 and #lang ivy1.8.

To give you a sense of what the output looks like, this Ivy specification is translated into this .pyv file.

This seems to mostly work, but there are two issues:

1) mypyvy has a distinction between mutable and immutable
relations, and only immutable relations can appear in axioms

Currently we have no way to figure out whether a relation is
mutable or immutable. I guess we can add a constraint: if it shows
up in an axiom, it must be immutable.

2) You get axioms like these, which are not valid mypyvy because
`<` is not a valid relation name in mypyvy:

```
axiom [ledger_spec_transitivity] <(T, U) & <(U, V) -> <(T, V)
axiom [ledger_spec_antisymmetry] !(<(T, U) & <(U, T))
```
Fixed previous issue with immutable symbols.
We can also automatically translate toy_consensus.ivy - the
only remaining thing is to properly generate new() expressions.
action.update() also returns a precondition which restricts the
existentially quantified variables. I previously did not include
this precondition in the translated transition. Now it's included.
This seems to mostly work, but two things remain:

1) define an action which clobbers/arbitrarily sets
all these intermediary relations/functions

2) properly handle initializers, which also seem to
refer to intermediate relations/functions
I realised we cannot translate Ivy initializers separately. At least
in Ivy 1.8 (and I think 1.7 as well), initializers are now stateful
and semanticaly they execute one after the other, such that the changes
made by former initializers are visible to the latter.

To account for this, we create an 'artificial' action that Sequences
all the `after init` block in the order they show up in
`mod.initializers`, and translate this action as a single initializer.

Whereas actions have parameters, for initializers we explicitly
add an outer existential quantifier. Moreover, we use `make_vc()`
rather than `update()` to generate a one-state formula, as opposed
to the two-state formulas normally used for actions.
It's not clear yet whether these work.

I've tried translating Mark's Suzuki Kasami spec, and the invariant
that passes Ivy's checks fails in the mypyvy translated version, so
there is definitely something wrong with the translation...
Things like `new(env_historical_auth_required(O, t__this, _approve))`
appear within transitions and mypyvy complains that t__this does not
show up in the transition's `modifies` clause.

This seems like a bug in mypyvy, since t__this is not actually modified.

In any case, we now traverse the generated Mypyvy formula and generate
the modifies clause from that rather than simply translating Ivy's
(correct) view of the modified symbols.
The previous commit didn't produce correct specs, since
the not-actually-modified symbols were implicitly havoc'ed.

We need to explicitly assert they are unchanged.
My handling of de Bruijn indices is broken :-(

This formula:

```
(~(__new_fml:x = env.current_ledger & new_env.current_ledger = __fml:y &
(forall L,M,K. (t.balance.map(K,L) & t.balance.map(K,M)) -> L = M) &
(forall V,K. t.balance.map(K,V) -> t.balance.pre(K)) &
~env.tx_has_started & ledger.lt(__new_fml:x,__fml:y) & (forall Y.
ledger.lt(__new_fml:x,Y) -> ledger.lte(__fml:y,Y)) &
ledger.succ(__new_fml:x,__fml:y) & env.tx_has_started) & (__new_fml:x =
env.current_ledger & new_env.current_ledger = __fml:y & (forall L,M,K.
(t.balance.map(K,L) & t.balance.map(K,M)) -> L = M) & (forall V,K.
t.balance.map(K,V) -> t.balance.pre(K)) & ~env.tx_has_started &
ledger.lt(__new_fml:x,__fml:y) & (forall Y. ledger.lt(__new_fml:x,Y) ->
ledger.lte(__fml:y,Y)) & ledger.succ(__new_fml:x,__fml:y)))
```

Gets round-tripped to this:

```
(__new_fml:x = env.current_ledger & new_env.current_ledger = __fml:y &
(forall K:address,L:integer,M:integer.
(t.balance.map(K:address,L:integer) &
t.balance.map(K:address,M:integer)) -> L:integer = M:integer) & (forall
K:address,V:integer. t.balance.map(K:address,V:integer) ->
t.balance.pre(K:address)) & ~env.tx_has_started &
ledger.lt(__new_fml:x,__fml:y) & (forall Y:ledger.
ledger.lt(__new_fml:x,Y:ledger) -> ledger.lte(__fml:y,Y:ledger)) &
ledger.succ(__new_fml:x,__fml:y))
```

Notice that the quantifiers get messed up.
Still have issues with quantified variables over EnumeratedSort...
Round-tripping still fails. I spot-check two issues:

(1) `a <-> b` gets transformed into `a = b`
(2) binder order for quantifiers gets messed up
This might be useful or make the solver faster, but makes it a
bit of a pain for us to check that our Ivy -> SMT -> Ivy pipeline
is correct.
It seems like the the issue with de Bruijn indices was trivial
to fix.

I am now having issues with mypyvy's modifies again. Since we
now simplify formulas, we cannot rely on Ivy's 'modified' to generate
the mypyvy one (even with the previous caveats).

I had removed the code that distinguishes between "actually modified"
and "only perceived by mypyvy to be modified" fields, but we cannot
actually do this. In the below, t__this is considered modified, so
it's havoc'ed if we don't explicitly set it:

```
new(env_historical_auth_required(O, t__this, _approve)
```

I'll need to reimplement this logic, which is now a bit trickier to
do because of the simplification.
`token.ivy` is now simplified and translated, and the invariants
that hold on the Ivy side hold on the mypyvy side as well, and traces
can be satisfied :-)
Giulano alerted me that the following spec does not translate
correctly; it has unresolved symbols: https://github.com/nano-o/soroban-decidable-verification/blob/6205e8bbf302dbac018a816f10c5cd1edd69660c/models/simplified/pool.ivy

The issue is that `mod.sig.symbols` for the 'pool' isolate does not
actually contain all symbols that show up when one generates SMT-LIB
from pool's actions. In the SMT that's generate, symbols that refer
to tokens appear as well, since the pool's actions call token actions.

From Ivy's perspective, the 'pool' isolate really does NOT need to know
anything about token internals, but in mypyvy (and SMT), since we do
not have isolates, we DO need to know, since those symbols will show
up when we generate the full definition of actions.

It seems Ivy stores the "parent" isolate's signature in `mod.old_sig`
upon creating/entering a new isolate
(`ivy_module.py:Module:__enter__()`, called implicitly by Python's
context managing `with` statement).

Rather than using an isolate's own signature for translation, we use
the parent signature.

IMPORTANT: I am not sure this always makes sense/will do the right
thing, but it seems to work for our current examples. In any case,
this will produce a type/sort error in mypyvy if it doesn't do the right
thing, so there is no risk of silent corruption.
TODO: we should refactor this code to identify macros on the Ivy
side of things. I wanted to do everything SMT-side because there
were already tactics for macro identification, but ran into
issues with substitution.
This is enough to simplify `mint` for the LP contract (modulo the
FIXME in `is_temp_equality`), but fails on `deposit` and other
transitions because it tries to unfold:

```
[_macro_139] __fml:depositor = pool._depositor_addr
unfolding macro __fml:depositor = pool._depositor_addr...
```

This should NOT happen because `depositor` is a formal parameter
of the `deposit` transition.

This also seems to originate from `if depositor = _depositor_addr` in the
GHOST portion of the spec, so really shouldn't be identified as a macro even
if `depositor` wasn't a parameter.

Will investigate.
Still need to implement the optimisation from the previous commit
message.

Also need to do the equivalence check once at the end rather than
after every unfold.
Now equalities like `__fml_c_r = token_a_balance_pre(__fml_c_addr)`
are unfolded, i.e. `__fml_c_r` is replaced with RHS (to eliminate
the Skolem altogether).
Macro unfolding should now be slightly faster than before, as some
unecessary traversals of the formula are avoided.
@@ -583,11 +583,11 @@ def clauses_to_z3(clauses):

def formula_to_z3_int(fmla):
# print "formula_to_z3_int: {} : {}".format(fmla,type(fmla))
if isinstance(fmla,ivy_logic.Definition or ivy_logic.is_eq(fmla) or isinstance(fmla,ivy_logic.Iff)):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this code even do? It's passing an or to isinstance, but that or should always trivially be ivy_logic.Definition. Should the close paren be earlier?

Copy link
Author

@dranov dranov Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. This is always ivy_logic.Definition, so this portion of the code only handles Definitions.

Equalities are, strangely enough, handled by the ivy_logic.is_atom(fmla) branch at the beginning of the function. The definition of is_atom(term) is return (isinstance(term,App) or isinstance(term,Symbol)) and term.sort == lg.Boolean or isinstance(term,lg.Eq).

return args[0]
if ivy_logic.is_false(fmla.args[1]):
return z3.Not(args[0])
if isinstance(fmla,ivy_logic.Definition or ivy_logic.is_eq(fmla)):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to remain a parenthesis issue here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this in a new commit.

@dranov
Copy link
Author

dranov commented Dec 28, 2023

This doesn't yet handle native integers in the Ivy spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants