Blinded CALL targets (PCALL instruction proposal) #728

Closed
Tracked by #867 ...
frisitano opened this issue Mar 1, 2023 · 13 comments

frisitano commented Mar 1, 2023

It may be valuable to hide the call target when executing a CALL operation on the VM. A use case for this arises when executing and proving client-side transactions. To maximise privacy and minimise information leakage we would need to hide note script roots and the transaction script root. To read more about this use case and the solutions discussed, see this issue in miden-base here. I have extracted a proposed solution from @bobbinth for your convenience below:

Basically, the verifier needs to see a CALL node in the expected position in the MAST to know that everything was executed correctly (they don't really care about what was passed in to CALL, but they need to know that it was a CALL operation). Unfortunately, in the current setup, knowing that it was a CALL operation also implies knowing the root of the called MAST.

One potential solution could be to have another type of node in the MAST - e.g., CALLR. This would work exactly like CALL, but the hash of a CALLR node would be computed differently. Specifically:

  • For CALL we compute the hash as $hash_{call}(a,0)$ where $a$ is the root of the callee's MAST.
  • For CALLR we'd compute it as $hash_{callr}(a,r)$ where $r$ is a blinding factor (a random value).

By executing CALLR we'd know that a call was executed, but what was called would be blinded by $r$ - so, the resulting MAST root would be different every time.

We can also modify the semantics of the CALL operation to always work like this (and when you don't want randomization, you set $r=0$). So, it may not be that big of a change - probably worth thinking it through a bit more.
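
A minimal sketch of the blinding idea in Rust, using std's `DefaultHasher` as a stand-in for the VM's actual MAST hasher (the real node hashes would use the VM's native hash function; `hash_node`, the `"CALL"`/`"CALLR"` tags, and the `Digest` alias are illustrative only):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the VM's MAST digest type; purely illustrative.
type Digest = u64;

// Computes a node hash from a domain-separation tag, the callee's MAST root `a`,
// and a blinding factor `r` (0 for a plain CALL).
fn hash_node(tag: &str, a: Digest, r: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    tag.hash(&mut h);
    a.hash(&mut h);
    r.hash(&mut h);
    h.finish()
}

fn main() {
    let a: Digest = 0xdead_beef; // callee MAST root
    let call = hash_node("CALL", a, 0);
    let callr_1 = hash_node("CALLR", a, 0x1111);
    let callr_2 = hash_node("CALLR", a, 0x2222);

    // Same callee, different blinding factors -> different node hashes,
    // so the resulting program MAST root differs every time.
    assert_ne!(callr_1, callr_2);
    assert_ne!(call, callr_1);
    println!("CALL: {call:#x}, CALLR: {callr_1:#x} / {callr_2:#x}");
}
```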

@bobbinth bobbinth added enhancement New feature or request processor Related to Miden VM processor labels Mar 1, 2023

bobbinth commented Mar 1, 2023

I think I missed one detail in the above description. To compute the hash of a CALL block, the verifier would need to know both $a$ and $r$. So, with just a single CALL block we don't achieve the desired property.

One way to fix this is to wrap one call inside another. Let's say $a$ is the hash of a CALL block computed as $hash_{call}(b, r)$ where $b$ is the script we want to "hide". Then, we can compute the hash of this outer CALL block as $hash_{call}(a, 0)$. This way, the verifier can be sure that a CALL operation was used to invoke some code, but the contents of $b$ would be hidden from them.
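
In formula form, the construction above amounts to

$$\text{outer} = hash_{call}\big(\underbrace{hash_{call}(b, r)}_{a},\ 0\big),$$

so the verifier sees a CALL node with the expected hash at the expected position, while $b$ stays hidden behind the blinded inner hash $a$.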

It does feel a bit "hacky" to wrap one call inside another for this purpose. So, I wonder if there is a better approach.

@frisitano

> One way to fix this is to wrap one call inside another. Let's say $a$ is the hash of a CALL block computed as $hash_{call}(b,r)$ where $b$ is the script we want to "hide". Then, we can compute the hash of this outer CALL block as $hash_{call}(a,0)$. This way, the verifier can be sure that a CALL operation was used to invoke some code, but the contents of $b$ would be hidden from them.
>
> It does feel a bit "hacky" to wrap one call inside another for this purpose. So, I wonder if there is a better approach.

I don't think this sounds particularly hacky. Adding a layer of indirection to achieve this functionality would be worthwhile in my opinion. The value add of the new capabilities is quite exciting. The way in which we structure transaction programs could change significantly if we had this feature. How do you see this in terms of priority, @bobbinth?


bobbinth commented Mar 3, 2023

I think there is more to think through here and the more I think about it, the more it seems like we might need to either have a new instruction in the VM or define a new block type in the MAST.

For example, we want to both hide the hash of the executed MAST subtree and bind it to something. The changes I described above don't really provide a good way to verify that the target of the call is bound to something we care about.

Specifically, we want the second CALL to be restricted somehow. For example, we want to say something like "for this call, make sure that the value on the top of the stack is the same as the call target". But I'm not sure how to do this yet.


frisitano commented Mar 3, 2023

So the construct I was thinking about would achieve a form of dynamic dispatch. E.g., the function (MAST) that we want to execute is specified at runtime via the top word on the stack. The call target MAST would be provided via non-determinism, and the decoder would assert internal consistency of said MAST. In essence, it would check to ensure that the executed MAST hashes to the value provided via the stack. However, this MAST block value would not be encoded into the input program MAST tree.

I think to achieve this we will need to introduce a new block type in the MAST and a new opcode. This block type should be opaque - in essence, to the caller / input program it should have a constant block hash value. However, as mentioned, internally we will need to verify consistency. To execute this block type we would have a new opcode, let's call it pcall (private call), which pops a word off the stack and creates two rows in the decoder. The first is a row of hiding - e.g., a PCALL operation with no value (or some predetermined value) provided in the decoder hasher. The second row would be the internal CALL operation (the row for binding), with its children's hashes populating the decoder hasher rate. The execution then proceeds in typical CALL fashion. At the end of a PCALL block we would have two END rows in the decoder. The first would be the termination of the CALL block, where we would assert internal consistency of the CALL MAST root and check that it matches the value that we popped off the top of the stack. The second would be the wrapper row to signal termination of the PCALL block.
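
To make the row ordering above easier to follow, here is a purely illustrative sketch of the decoder row sequence a PCALL block would produce under this proposal (hypothetical names, with the callee's execution collapsed into a single placeholder entry; this is not actual processor code):

```rust
/// Hypothetical decoder row kinds for the proposed PCALL block (illustrative only).
#[derive(Debug)]
enum DecoderRow {
    PcallStart, // hiding row: hasher rate holds no value / some predetermined value
    CallStart,  // binding row: hasher rate holds the internal CALL's children hashes
    CalleeBody, // placeholder: the callee's MAST executes as in a normal CALL
    EndCall,    // assert the callee MAST root matches the word popped off the stack
    EndPcall,   // wrapper row signaling termination of the PCALL block
}

fn pcall_row_sequence() -> Vec<DecoderRow> {
    use DecoderRow::*;
    vec![PcallStart, CallStart, CalleeBody, EndCall, EndPcall]
}

fn main() {
    for row in pcall_row_sequence() {
        println!("{row:?}");
    }
}
```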

This does have significant implications regarding the uniqueness of a program based on its MAST root. This property would no longer hold if said opaque block type is included in a MAST. Instead, it would provide uniqueness only for the structure of all branches above opaque blocks.

This could have implications for the virtual machine and the protocol but could also prove valuable. We may consider only making this opcode available in the root context. I think this does add some additional complexity to the VM but seems like a cool feature, even if only to provide some fun discussion.


bobbinth commented Mar 4, 2023

Very interesting! Yes, I think something like this will work - though, we need to think through the exact mechanics a bit more (e.g., I wonder if two rows in the decoder are required - we might be able to get away with relatively few changes). And I do think that PCALL would need to be restricted to the root context only.

However, adding this to the VM would be pretty challenging now as we don't have any "slots" for new control flow operations available (i.e., see here). So, the first thing we need to figure out is how to add more operations to the VM. I have some idea about how to support 8 more operation slots at the expense of one extra trace column - but these thoughts are still in a very early stage.


frisitano commented Mar 5, 2023

> However, adding this to the VM would be pretty challenging now as we don't have any "slots" for new control flow operations available (i.e., see here). So, the first thing we need to figure out is how to add more operations to the VM. I have some idea about how to support 8 more operation slots at the expense of one extra trace column - but these thoughts are still in a very early stage.

Right, I see. I'm keen to hear more about your thoughts on how we could go about increasing instruction slots and how the pcall instruction implementation would look.

Regarding the implications of this construct to the transaction kernel, I think this would remove the need to build a transaction program per transaction. Instead we would have a single standardised transaction program which would execute using dynamic dispatch against the notes and tx script provided via the advice provider. Furthermore, I think this would remove the requirement for the prover to provide commitments to note script roots and tx script root. This would be beneficial for privacy and reducing bandwidth requirements.


bobbinth commented Mar 29, 2023

> I'm keen to hear more about your thoughts on how we could go about increasing instruction slots and how the pcall instruction implementation would look.

First, for context, our current scheme for computing operation flags is described here. This scheme requires 8 columns: 7 for the opcode and one for degree reduction. The layout of these columns is as follows:

| $b_6$ | $b_5$ | $b_4$ | $b_3$ | $b_2$ | $b_1$ | $b_0$ | $e$ | # of ops | flag degree |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :-: | :------: | :---------: |
|   0   |   x   |   x   |   x   |   x   |   x   |   x   |  0  |    64    |      7      |
|   1   |   0   |   x   |   x   |   x   |   x   |   -   |  0  |    16    |      6      |
|   1   |   1   |   x   |   x   |   x   |   -   |   -   |  1  |    8     |      4      |

Where column $e$ is used for degree reduction and is always computed as $e = b_6 \cdot b_5$. As can be seen from above, this scheme gives us 88 operations (64 supporting degree 2 constraints, 16 supporting degree 3 constraints, and 8 supporting degree 5 constraints).

By introducing one more column for degree reduction, we could get to 96 operations. The layout for this could look as follows:

| $b_6$ | $b_5$ | $b_4$ | $b_3$ | $b_2$ | $b_1$ | $b_0$ | $e_0$ | $e_1$ | # of ops | flag degree |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :------: | :---------: |
|   0   |   x   |   x   |   x   |   x   |   x   |   x   |   0   |   0   |    64    |      7      |
|   1   |   0   |   0   |   x   |   x   |   x   |   -   |   0   |   0   |    8     |      6      |
|   1   |   0   |   1   |   x   |   x   |   x   |   x   |   1   |   0   |    16    |      5      |
|   1   |   1   |   x   |   x   |   x   |   -   |   -   |   0   |   1   |    8     |      4      |

Where both columns $e_0$ and $e_1$ are used for degree reduction and are computed as $e_0 = b_6 \cdot (1 - b_5) \cdot b_4$ and $e_1 = b_6 \cdot b_5$. With this scheme, we'd have the following operation groups (a small decoding sketch follows the list):

  • 64 operations supporting max constraint degree of 2 (same as before).
  • 8 operations supporting max constraint degree of 3.
  • 16 operations supporting max constraint degree of 4.
  • 8 operations supporting max constraint degree of 5 (same as before).
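
To make the grouping concrete, here is a small, purely illustrative Rust sketch that decodes the operation group from the three most significant opcode bits, mirroring the rows of the table above (not actual VM code):

```rust
/// Returns (description, number of op slots, flag degree) for the group selected
/// by the three most significant op bits, per the proposed 96-op layout above.
fn op_group(b6: u8, b5: u8, b4: u8) -> (&'static str, u32, u32) {
    match (b6, b5, b4) {
        (0, _, _) => ("64 ops, constraints up to degree 2", 64, 7),
        (1, 0, 0) => ("8 ops, constraints up to degree 3", 8, 6),
        (1, 0, 1) => ("16 ops, constraints up to degree 4", 16, 5), // uses an extra degree-reduction column
        (1, 1, _) => ("8 ops, constraints up to degree 5", 8, 4),   // uses an extra degree-reduction column
        _ => unreachable!("op bits are binary"),
    }
}

fn main() {
    // One representative bit pattern per group; the slot counts sum to 96.
    let groups = [(0u8, 0u8, 0u8), (1, 0, 0), (1, 0, 1), (1, 1, 0)];
    let total: u32 = groups.iter().map(|&(a, b, c)| op_group(a, b, c).1).sum();
    assert_eq!(total, 96); // 64 + 8 + 16 + 8
    for &(a, b, c) in &groups {
        println!("{}", op_group(a, b, c).0);
    }
}
```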

Right now, adding a single column to the trace would result in a disproportionate impact (i.e., it could slow down the VM by 5% or more). The reason for this is that for maximum efficiency we want the number of columns to be divisible by 8, and adding an extra column would push us to 73 columns total. But if we figure out how to remove a column, or if it turns out that we need to go up to the next multiple of 8 anyway, adding this column could be justified.

@grjte grjte changed the title Blinded CALL targerts Blinded CALL targets (PCALL instruction proposal) Apr 18, 2023
@bobbinth bobbinth added the v0.7 Needed for v0.7 release of the VM label Jun 30, 2023
@bobbinth bobbinth removed the v0.7 Needed for v0.7 release of the VM label Jul 20, 2023
@bobbinth bobbinth added this to the v0.7 milestone Jul 20, 2023
@bobbinth

With #1055 we've added dynamic nodes to program MAST. However, this capability is not yet exposed to the user. To expose it, we should add instructions to Miden assembly. I'm thinking these instructions could be as follows:

  • dynexec - this will start executing the block specified via the top of the stack without switching context. In terms of program MAST, this should translate directly into adding a DYN node to the MAST.
  • dyncall - this will start executing the block specified via the top of the stack in a new context. In terms of program MAST, this will add a CALL block with the target set to a DYN block.

@bobbinth

Another thing which may be desirable to add is to limit the ability to make DYN blocks only to the root context. A simple, though undesirable, way to do it is to introduce a new column - e.g., in_root_context. This column would be set to 1 when we are in the root context, and to 0 otherwise.

Another potential approach is to use the info in the ctx column. This column is set to 0 when we are in the root context, and is set to other values when we are not. A challenge with using this column is that we need a helper register somewhere to contain the inverse of ctx whenever we want to check whether ctx is 0 or not. I'm not yet sure which register we could use for this.
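
For illustration, one standard way to express such a zero-check in AIR constraints (a sketch only; the helper column $v$ and the flags $f_{root}$ and $f_{dyn}$ are hypothetical names, not actual VM columns):

$$f_{root} = 1 - ctx \cdot v, \qquad ctx \cdot f_{root} = 0$$

When $ctx \neq 0$, the second constraint forces $f_{root} = 0$, which in turn forces $v = ctx^{-1}$; when $ctx = 0$, $f_{root} = 1$ regardless of $v$. The restriction itself would then be $f_{dyn} \cdot (1 - f_{root}) = 0$, i.e., whenever a DYN block is started, $ctx$ must be 0.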


bitwalker commented Sep 23, 2023

> dynexec - this will start executing the block specified via the top of the stack without switching context. In terms of program MAST, this should translate directly into adding a DYN node to the MAST.
>
> dyncall - this will start executing the block specified via the top of the stack in a new context. In terms of program MAST, this will add a CALL block with the target set to a DYN block.

Something that occurs to me while reading up on DYN is that we don't have a way to know the hash of the block we want to dynamically execute in the compiler - this hash is computed by the assembler, and doesn't happen until runtime. I think we actually need a third meta-instruction, dynref.NAME (or something to that effect), which pushes the hash of a specified function (NAME) onto the stack. The hash is known/computed by the assembler, so this is basically just a convenience for precomputing the hash and pushing it on the stack using push.XXX - in other words, it's not actually an instruction in the VM, but something recognized/expanded by the assembler.

This would be exceedingly useful for the compiler, since it would allow us to defer those low level details to the assembler, and avoid any duplication of effort.
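
For what it's worth, a rough sketch of what such an assembler-side expansion could look like (hypothetical names and types; this is not the actual Miden assembler API):

```rust
use std::collections::HashMap;

/// Hypothetical expansion of a `dynref.NAME`-style meta-instruction into a plain
/// push of the procedure's precomputed MAST root (a word of 4 elements here).
/// The VM itself never sees the meta-instruction.
fn expand_dynref(name: &str, proc_roots: &HashMap<String, [u64; 4]>) -> Result<String, String> {
    let root = proc_roots
        .get(name)
        .ok_or_else(|| format!("unknown procedure: {name}"))?;
    Ok(format!("push.{}.{}.{}.{}", root[0], root[1], root[2], root[3]))
}

fn main() {
    let mut roots = HashMap::new();
    roots.insert("foo".to_string(), [1u64, 2, 3, 4]);
    // Would emit: push.1.2.3.4
    println!("{}", expand_dynref("foo", &roots).unwrap());
}
```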

Definitely looking forward to this though!

> Another thing which may be desirable to add is to limit the ability to make DYN blocks only to the root context.

Is this desirable? That would limit its usefulness greatly IMO. If the only way to execute a DYN block is to do so in the root context, it would be difficult to use it in the compiler in any meaningful way, since we largely can't assume too much about what context a given function will be executed in (with the exception of kernel functions).

As far as I understand it, there isn't a downside to allowing it in all contexts; I would assume that DYN blocks inherit the current context, whatever it is. The dynamic block is required to be available to the VM or execution traps; and if there was a reason to limit what blocks could be dynamically executed, I would think it's more useful to restrict DYN blocks to those which correspond to the entry of a function.

In any case, I'd like to know more about this, since it seems like it would have a significant impact on how it can be used by the compiler.

@bobbinth

> Something that occurs to me while reading up on DYN is that we don't have a way to know the hash of the block we want to dynamically execute in the compiler - this hash is computed by the assembler, and doesn't happen until runtime. I think we actually need a third meta-instruction, dynref.NAME (or something to that effect), which pushes the hash of a specified function (NAME) onto the stack. The hash is known/computed by the assembler, so this is basically just a convenience for precomputing the hash and pushing it on the stack using push.XXX - in other words, it's not actually an instruction in the VM, but something recognized/expanded by the assembler.

Good idea! I think adding this shouldn't be too difficult. I would only suggest that we use the procref.NAME format, as I think it is a bit more clear that we are getting a "reference" to a procedure.

> Is this desirable? That would limit its usefulness greatly IMO. If the only way to execute a DYN block is to do so in the root context, it would be difficult to use it in the compiler in any meaningful way, since we largely can't assume too much about what context a given function will be executed in (with the exception of kernel functions).

My understanding is that dynamic dispatch is generally undesirable in the context of smart contracts as it makes static analysis and formal verification of programs much more difficult. For example, Move does not have dynamic dispatch (but maybe I'm misunderstanding something here).

The ideal approach, in my opinion, would be to have it as a VM parameter. Specifically:

  • When this parameter is set to true, we would allow DYN blocks started in any context.
  • When it is set to false, we'd allow DYN blocks started only in the root context.

But I'm not sure yet how to support such a parameter (as opposed to hard-coding the functionality).

@bitwalker

> My understanding is that dynamic dispatch is generally undesirable in the context of smart contracts as it makes static analysis and formal verification of programs much more difficult.

Not all dynamic dispatch destroys analysis/formal verification; it depends on the language and what is permitted at the language level. One thing Miden provides is the ability to prove the execution trace of the program (i.e., you know exactly what was executed, even with dynamic dispatch).

That said, there's a difference between making dynamic dispatch undesirable vs impossible. I would think this is a choice that can be left up to the language/protocol which is targeting Miden. We could easily have a compiler flag that raises an error if the IR it is given contains dynamic dispatch.

> The ideal approach, in my opinion, would be to have it as a VM parameter.
>
>   • When this parameter is set to true, we would allow DYN blocks started in any context.
>   • When it is set to false, we'd allow DYN blocks started only in the root context.

I would think it would be better to instead have the parameter disable the use of DYN entirely when the parameter is set, i.e., the VM traps if it encounters DYN-related instructions. After all, you either are OK with DYN, or you aren't, and if you aren't, you probably don't want it anywhere. Trying to restrict the context using a config option feels a bit like saying "trust us, we know what we're doing with DYN" in the kernel, while forbidding it for everyone else.

I think DYN is most useful for procedure calls though, rather than smart contract calls, so I'd actually be OK with restricting the use of DYN to the former (or making it configurable via parameter). That avoids one of the worst vulnerabilities that dynamic dispatch introduces in smart contracts (AFAIK, the risk is that someone manipulates things so that a different contract is called with funds it shouldn't receive, things like that). If you disallow DYN contract calls, that can't happen.

Anyway, I'll have to think on this a bit more, but I'm hoping we can find a middle ground here where DYN is still something we can largely assume is available in the compiler. Rust code is particularly likely to have some forms of dynamic dispatch (trait objects are an obvious case), and trying to eliminate all such dispatch will likely make for a frustrating developer experience. Supporting different "modes" in the VM seems acceptable, but it would be better if it's all-or-nothing, rather than mixed.


bobbinth commented Oct 4, 2023

Closed by #1078 and #1080. I will create separate issues for some follow-up work on dynamic calls (i.e., procref instruction and specifying call targets via memory reference).
