Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ProvenTransaction object #38

Closed
Tracked by #29
bobbinth opened this issue Feb 28, 2023 · 9 comments · Fixed by #40
Closed
Tracked by #29

Implement ProvenTransaction object #38

bobbinth opened this issue Feb 28, 2023 · 9 comments · Fixed by #40
Assignees

Comments

@bobbinth
Copy link
Contributor

bobbinth commented Feb 28, 2023

A ProvenTransaction object is the result of executing and proving a transaction. It should contain the minimal amount of data needed to verify that a transaction was executed correctly. The object should consist of the following:

Component Size Description
Account ID $1$ element The identifier of the account involved in the transaction.
Initial account hash $4$ elements Hash of the account state at the beginning of the transaction.
Final account hash $4$ elements Hash of the account state at the end of the transaction.
Consumed note info $8n$ elements A list of tuples (nullifier, script_root) for all notes consumed by the transaction.
Created note info $6n$ elements A list of tuples (note_hash, note_meta) for all notes created during the transaction.
tx script root $4$ elements MAST root of the transaction script for the transaction (if any).
Block reference $4$ elements Hash of the last known block at the time the transaction was created.
Proof variable STARK proof attesting to the correct execution of the transaction program.

A verifier would use the above information as follows:

  1. Compute a MAST root of the transaction program using note script_roots, tx_script_root, and components of transaction kernel (i.e., prologue, epilogue, note setup script etc.).
  2. Verify that the above program was executed correctly against the following stack inputs/outputs.
Inputs:  [block_ref, acct_id, init_acct_hash, input_notes_hash, ...]
Outputs: [final_acct_hash, created_notes_hash, ...]

In the above, input_notes_hash is a sequential hash of all (nullifier, script_root) tuples of consumed notes, and created_notes_hash is a sequential hash of all (note_hash, note_meta) tuples of created notes.

@frisitano
Copy link
Contributor

frisitano commented Mar 1, 2023

Whats the reason for requiring the script_root in the consumed notes hash. Couldn't we make a commitment exclusively from a sequential hash of nullifiers. I just wonder if this partialy leaks information related to the nature of the transaction? @bobbinth I suspect it's because we need to construct the MAST root but I wonder if we can delegate that to the client?

@bobbinth
Copy link
Contributor Author

bobbinth commented Mar 1, 2023

We need script_root's to construct tx_program (which is basically script_root's + tx_script + prologue/epilogue/note_setup_script from kernel). If we don't know script roots we have no way of verifying whether the right programs were executed for each note (at least not without modifying some aspects of Miden VM).

It could leak information in some cases (e.g., if script root is a for a well-known script) - though, there might be some ways to mitigate this (one way could be to push/pop some random values somewhere in the script).

@frisitano
Copy link
Contributor

We need script_root's to construct tx_program (which is basically script_root's + tx_script + prologue/epilogue/note_setup_script from kernel). If we don't know script roots we have no way of verifying whether the right programs were executed for each note (at least not without modifying some aspects of Miden VM).

It could leak information in some cases (e.g., if script root is a for a well-known script) - though, there might be some ways to mitigate this (one way could be to push/pop some random values somewhere in the script).

Couldn't we delegate the construction of the tx_program to the client? It looks like the prologue expect a sequential hash of nullifiers as the consumed notes commitment.

@bobbinth
Copy link
Contributor Author

bobbinth commented Mar 1, 2023

Couldn't we delegate the construction of the tx_program to the client? It looks like the prologue expect a sequential hash of nullifiers as the consumed notes commitment.

On the verifier side, we need to make sure that tx_program was constructed correctly. This means:

  • The prologue was executed first.
  • For each note, we first executed note_setup_script and then executed call.note_script.
  • Then we executed tx_script (if it was present).
  • And finally we executed epilogue.

The challenging part is ensuring that nothing else was executed (otherwise, for example, a malicious prover could execute some code in between note scripts which modifies the state of kernel memory and breaks some invariants).

So, if I as the verifier get a MAST root which I wrap it in prologue/epilogue - I can't really tell what happened in between these (i.e., were all note scripts called? were they called with right inputs? was there some other code executed which shouldn't have been etc.). Having access to script_root's of all notes is an easy (though probably not ideal) way of addressing this.

@bobbinth
Copy link
Contributor Author

bobbinth commented Mar 1, 2023

It looks like the prologue expect a sequential hash of nullifiers as the consumed notes commitment.

This is actually outdated: I first described it as hash of nullifiers provided as inputs and hash of note scripts provided returned as outputs, but changed this to tuples once we introduced note metadata.

@frisitano
Copy link
Contributor

frisitano commented Mar 1, 2023

On the verifier side, we need to make sure that tx_program was constructed correctly. This means:

  • The prologue was executed first.
  • For each note, we first executed note_setup_script and then executed call.note_script.
  • Then we executed tx_script (if it was present).
  • And finally we executed epilogue.

The challenging part is ensuring that nothing else was executed (otherwise, for example, a malicious prover could execute some code in between note scripts which modifies the state of kernel memory and breaks some invariants).

So, if I as the verifier get a MAST root which I wrap on prologue/epilogue - I can't really tell what happened in between these (i.e., were all note scripts called? were they called with right inputs? was there some other code executed which shouldn't have been etc.). Having access to script_root's of all notes is an easy (though probably not ideal) way of addressing this.

Right I see that makes sense. I'm not sure if this would work but one idea would be that we have a standardised tx_program that follows the execution flow you mention. When it reads in the notes it authenticates the script_root against the hash of the note which is subsequently authenticated against the notes db. We then have a machine instruction adv_call.* which seeds a MAST from the advice provider. e.g. adv_call.note_script_root. This would execute the MAST provided by the advice provider and assert that it hashes to the note_script_root.

I'm not sure if this will work and not suggesting we should implement this now, just trying to explore the possibilities.

@bobbinth
Copy link
Contributor Author

bobbinth commented Mar 1, 2023

I'm not sure this will work because the call targets need to be statically defined - so, adv_call cannot read an arbitrary MAST root from the advice provider and execute it. Or I guess it can, but that MAST root would also need to be known to the verifier.

Basically, the verifier needs to see CALL node in the expected position in the MAST to know that everything was executed correctly (they don't really care about what was passed in to CALL but they need to know that it was a CALL operation). Unfortunately, in the current setup, knowing that it was a CALL operation also implies knowing the root of the called MAST.

One potential solution could be to have another type of a node in the MAST - e.g., CALLR. This would work exactly like CALL but the hash of CALLR node would be computed differently. Specifically:

  • For CALL we compute the hash as $hash_{call}(a,0)$ where $a$ is the root of the callee's MAST.
  • For CALLR we'd compute it as $hash_{callr}(a, r)$ where $r$ is a blinding factor (a random value).

By executing CALLR we'd know that a call was executed but what was called would be blinded by $r$ - so, resulting MAST root would be different every time.

We can also modify the semantics of the CALL operation to always work like this (and when you don't want randomization, you set $r = 0$). So, it may not be that big of a change - probably worth thinking it through a bit more.

For this PR, I'd implement it as is and probably would create an issue (or a discussion) in the Miden VM repo.

@frisitano
Copy link
Contributor

Thank you for the explanation. I think this feature would be quite useful. I've gone ahead and implemented this PR as is. Issue has been created here.

@frisitano
Copy link
Contributor

I've updated the PR it's not entirely clear to me how the consumed notes and created notes will be hashed so I've gone for a naive implementation of hashing sequentially in order.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants