Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Powdr ASM: Allow machines to be linked via permutaions #1073

Closed
georgwiese opened this issue Feb 20, 2024 · 8 comments
Closed

Powdr ASM: Allow machines to be linked via permutaions #1073

georgwiese opened this issue Feb 20, 2024 · 8 comments
Labels

Comments

@georgwiese
Copy link
Collaborator

georgwiese commented Feb 20, 2024

We currently link machines via lookups. Using permutations instead can be cheaper and allows us to distinguish blocks that are actually used from dummy blocks (relevant for #1055). This is an example from Polygon's zkEVM.

We don't have a way to generate this from ASM right now.

A few notes:

  • There would be a witness column for each permutation (used as the selector, Polygon calls these result). They should be constrained to be boolean and such that at most one is active.
  • It would be great if we had a way to specify a machine without it needing to know how many permutations will be used to call into it!

Proposed Spec (TLDR of my comments below)

We add a third machine argument to the machine declaration, like so:

machine MyMachine(LATCH, operation_id, selectors) {

    operation foo<2> x -> y;
    // ...

}

Also, machine operations can be called via permutations (instead of lookups) if the instruction is declared like this:

instr foo X -> Y = submachine.foo;

This compiles to:

instr_foo {2, X, Y} is (main_submachine.LATCH * main_submachine.selectors[<global counter for this machine>]) {
    main_submachine.operation_id, main_submachine.x, main_submachine.y
}

Also, in the MyMachine scope, the compiler adds:

col witness selectors[<number of permutations to this machine>];

// Selectors should be boolean
std::array::map(selectors, std::utils::force_bool);
@georgwiese
Copy link
Collaborator Author

georgwiese commented Feb 24, 2024

I just realized this about connecting machines via permutation (perhaps related to #1077):
There are two ways to distinguish different operations of the same machine:

1: Use different selectors
This is what we do for memory right now:

instr_mload { ... } is mem.is_read { ... }
instr_mstore { ... } is mem.is_write { ... }
instr_mstore_bootloader { ... } is mem.is_bootloader_write { ... }

2: Use an extra argument
This is closer to what we do with the operation_id in block machine, and also what Polygon does. E.g. for memory, they do this:

instr_m_op {m_is_write, ...} is mem.op { mem.is_write, ... }

In our case (where we have a third operation, the "bootloader write"), we could do this:

instr_m_op {m_operation, ...} is mem.op { mem.operation_id, ... }

Here, m_operation comes from the ROM. We could use 0 for read, 1 for write, and 2 for bootloader write. The memory machine could then bit-decompose the operation ID: operation_id = 2 * is_bootloader_write + is_write. Overall, this would save one witness column in the main machine (one instruction flag + one operation ID instead of 3 instruction flags) and add one witness column in the memory machine (mem.operation_id).

I think option 2 is usually better, especially if the number of operations is high (because the number of witness columns in the main machine is always 2, regardless of the number of operations). Also, it is more consistent with how we already do things in the block machine. The only difference between a block machine that's connected via permutation vs lookup is that the latch is a witness column (actually, a set of witness columns, one for each permutation argument), otherwise the PIL would be essentially the same (i.e., we'd also have an auto-generated operation_id, constrained to be constant within the block).

But option 2 isn't currently supported in ASM (perhaps it is after #1077, see my comment in the issue). Until that's fixed, we could do this:

instr_mload { 0, ... } is mem.selectors[0] { mem.operation_id, ... }
instr_mstore { 1, ... } is mem.selectors[1] { mem.operation_id, ... }
instr_mstore_bootloader { 2, ... } is mem.selectors[2] { mem.operation_id, ... }

This would work and be a small change to the status quo, but it would create a few redundant columns in the memory machine (e.g. selectors[1] == mem.is_write).

@georgwiese
Copy link
Collaborator Author

With that, let me suggest an ASM-to-PIL conversion.

Status Quo

This is a a sketch of how we currently define block machines:

machine MyBlockMachine(LATCH, operation_id) {

    operation foo<0> x -> y;

    col witness operation_id, x, y;
    col fixed LATCH = [0, 0, 0, 1]*;

    // Some constraints

}

machine Main {

    MyBlockMachine submachine;

    reg pc[@pc];
    reg X[<=];
    reg Y[<=];

    instr foo X -> Y = submachine.foo;

    function main {
        // Assembly program
    }
}

This compiles to the lookup:

instr_foo {0, X, Y} in main_submachine.LATCH {
    main_submachine.operation_id, main_submachine.x, main_submachine.y
}

Also, some constraints are added to the block machine to force the operation ID to be constant within a block:

    pol constant _block_enforcer_last_step = [0]* + [1];
    // BTW, this should be an intermediate polynomial!
    pol commit _operation_id_no_change;
    (_operation_id_no_change = ((1 - _block_enforcer_last_step) * (1 - LATCH)));
    ((_operation_id_no_change * (operation_id' - operation_id)) = 0);

Connecting via Permutations

We add a field to the machine declaration:

machine MyBlockMachine(LATCH, operation_id, used) {
    ...

This used polynomial is not declared in ASM, but can be used in constraints to check whether we're in a row of a block that actually corresponds to a call to this machine, as opposed to dummy blocks to fill up unused rows. If the machine has been connected via a lookup, this value will be meaningless (ideally, using it would fail!).

Then, there is a new syntax to say that a machine should be connected via a permutation, for example:

    instr foo X -> Y ~ poseidon.poseidon_permutation;

This compiles to:

instr_foo {0, X, Y} is (main_submachine.LATCH * main_submachine.selectors[0]) {
    main_submachine.operation_id, main_submachine.x, main_submachine.y
}

Note that:

  • As mentioned in the comment above, using a selector for each instruction is not ideal, but this could be a good first step.
  • Also, we have to support a variable number of selectors anyway if we support multiple machines calling into the same machine (e.g. Machines communicating through shared memory #1055)
  • The actual selectors is not selectors[0] but LATCH * selectors[0] (maybe we can find a better naming), so this does not constraint selectors[0] on other rows.

And to the submachine, we add:

// Note that the length of this array is only known at compile time,
// because it depends on the number of permutations that connect to this machine.
col witness selectors[1];

// Selectors should be boolean
std::array::map(selectors, std::utils::force_bool);

// Define the `used` polynomial
let used: expr = std::array::sum(selectors);

// At most one selector should be active
std::utils::force_bool(used);

// Like the operation ID, the selectors should not change within a block
std::array::map(selectors, |s| ((_operation_id_no_change * (s' - s)) = 0));

@georgwiese georgwiese changed the title Allow machines to be linked via permutaions Powdr ASM: Allow machines to be linked via permutaions Feb 24, 2024
@georgwiese
Copy link
Collaborator Author

georgwiese commented Feb 24, 2024

Putting all of this together, a classic read/write memory (no bootloader write) would could be specified like this:

machine Memory(LATCH, is_write, used) {
    // Not sure if the latch is mandatory?
    col fixed LATCH = [1]*;

    operation mload<0> addr -> val;
    operation mstore<1> addr, val ->;

    // is_write is our operation ID.
    // If we assume that only valid operation IDs are used (which I think we can, because it comes
    // from the ROM!), we don't actually need to constraint is_write and is_read to be boolean and
    // exclusive.
    col witness is_write;
    col is_read = used * (1 - is_write);

    // The rest of the constraints can be copy-pasted from here:
    // https://github.com/powdr-labs/powdr/blob/daffb65c0c6931c855c3dd35ba4fa317032b2287/riscv/src/compiler.rs#L681-L730
}

@georgwiese
Copy link
Collaborator Author

Actually, perhaps instead of the "magic" constraints added by the compiler, we should just pass the selectors array directly to the machine, which would then be responsible to constraint it further. So the compiler would only add:

col witness selectors[<number of permutation>];

// Or, should it also add this? After all, the selectors are to be used as a selector in a permutation and
// I'm not sure if any non-boolean value will be valid.
std::array::map(selectors, std::utils::force_bool);

But if we do that, then we should also remove the auto-generated constraints that force the operation_id to stay constant in a block. Otherwise we have magic in some areas but not in others.

@leonardoalt
Copy link
Member

I haven't read the whole thing yet, but I'd say the least magic the better

@pacheco
Copy link
Collaborator

pacheco commented Mar 8, 2024

From discussing with @georgwiese:

When a block machine (A) calls into another machine (B) via permutation, we need to properly handle A's "dummy" blocks such that they are not included in the permutation argument to B.

When A is only called into via permutations, the following used flag could be used to identify valid blocks:

// `used` is an intermediate polynomial, i.e., not materialized
let A_used = sum(A_selectors);
force_bool(A_used);
map(A_selectors, |s| unchanged_until(s, LATCH));

The used flag would then be used on the LHS selector of permutations from A to B, and being 0 on dummy blocks, would disable those rows for the permutation.

If A is called via lookups we still need to identify dummy blocks.
We could use the same mechanism above by using one of the selector entry for all lookups into A.
Lookups into A would have the form:
instr_foo {1, ...} in A_latch {A_selectors[0], ...}

This way, sum(A_selectors) effectively identifies valid blocks.
Also, if every machine would have a selector array, we don't have to declare them anymore.
Should there be need to reference them in the future, we could add a built-in for that.

@georgwiese
Copy link
Collaborator Author

Lookups into A would have the form:
instr_foo {1, ...} in A_latch {A_selectors[0], ...}

Or instr_foo {...} in (A_latch * A_selectors[0]) {...}. I think that would be better because:

  • It's the same as for permutation which is a bit cleaner.
  • The algorithm for computing the dummy block can also be shared and is simpler: Instead of detecting that the first argument is "special" and should be 0 in the dummy block, we can just solve for A_latch * A_selectors[0] = 0 (as we already do for machines connected via permutations).
  • I think it might be easier to optimize away an otherwise unused selector that just restricts the RHS (if the LHS is a a subset of the restricted RHS, it is also of the unrestricted)

@georgwiese
Copy link
Collaborator Author

Fixed by #1126

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants