Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impl StepCircuit for sha256 gadget in example #26

Merged
merged 8 commits into from
Oct 11, 2023

Conversation

cyphersnake
Copy link
Collaborator

Close #12

@cyphersnake
Copy link
Collaborator Author

The need to require the Circuite characteristic is very much in the way. I have to support both the old and new APIs, which forces me to make input consistency checking optional. This makes the code very overcomplicated.

I'll do formal enforcement of this rule for now, but maybe we don't need it

@cyphersnake cyphersnake marked this pull request as ready for review October 9, 2023 10:15
@cyphersnake
Copy link
Collaborator Author

cyphersnake commented Oct 9, 2023

The biggest problem was that I needed to support two APIs that were very poorly matched. So far I've left the Circuit trait with just todo!() instead of really supporting it. It doesn't seem to be very necessary if the gadget is not developed immediately for our requirements of operating first and end by cells. Please share your thoughts.

Otherwise, I've added a copy check to input and return output as cell. Since the output of the hash function is 8 long and the input must be at least 16, I simply double the input

If clarification is needed, let me know, we can have a meaningful chat.

I haven't finished cutting out the two using scenarios yet (the methods are just copied instead of replaced) so that if we do need the original circuit functionality, we can fix it quickly

Copy link
Collaborator

@chaosma chaosma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be easier to do the code review if we can separate the PR into (1) the code format (2) real change. @cyphersnake

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Since the goals of this example include showing how simple it is to integrate a third-party circuit, we'd like to keep our changes minimal.

Reformatting is probably not necessary, and should definitely be in a separated PR.

An ideal show case is that we only need to touch a few lines of a third-party circuit.

@cyphersnake
Copy link
Collaborator Author

cyphersnake commented Oct 9, 2023

it will be easier to do the code review if we can separate the PR into (1) the code format (2) real change.

There's almost no formatting (except for imports, but those are easy to ignore), I just removed some of the code and added the needed code in the same file.

To minimally modify the gadget, I made a sha256 module in main.rs, but I can't separate it into a file because of the name. I can try to make a cleaner diff, but it will take time and because of the module's structure it is unlikely to give a big increase in understandability. WIP

Since the goals of this example include showing how simple it is to integrate a third-party circuit, we'd like to keep our changes minimal.

This circuit doesn't fit that purpose at all. We discussed with @chaosma that we will modify the simple example to demonstrate simplicity and use this monster only for benchmark.

cc @chaosma @drouyang

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Then it could be a deeper problem.

Does it mean that it's easier to integrate some circuits, eg simpler circuits, compared with more complex ones?

What about a Poseidon hash circuit https://github.com/snarkify/poseidon_circuit

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

I checked how Celer Network benchmarked Nova with sha256

celer-network/Nova@7e16456#diff-6eef5816093a020b24dafa7957c3ce6cd9b301a82758832c6cfb931487499e05R3

Seems the implementation was from bellman or bellperson from Protocol Labs

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

In celer network's implementation, there is a step circuit implementation and the rest is benchmarking code. Can we achieve the same? @chaosma @cyphersnake

@chaosma
Copy link
Collaborator

chaosma commented Oct 9, 2023

In celer network's implementation, there is a step circuit implementation and the rest is benchmarking code. Can we achieve the same? @chaosma @cyphersnake

It worth to take a look.

There's almost no formatting (except for imports, but those are easy to ignore), I just removed some of the code and added the needed code in the same file.

There are several places that I am not sure if it is formatting or your change unless very carefully read the code. The problem now depends on whether you mixed the formatting and actually in one commit or not. It is relatively easy to separate the PR if not mixed.

@cyphersnake
Copy link
Collaborator Author

Does it mean that it's easier to integrate some circuits, eg simpler circuits, compared with more complex ones?

Halo2 doesn't have any good implementation standards at all, everyone does completely different projects on API design. Therefore, this particular circuit is adapted with a creak, because they do not operate with cells, but only with values.

The transition is quite trivial, but if you add an attempt to minimally modify the source code, it turns out to be quite tricky.

In celer network's implementation, there is a step circuit implementation and the rest is benchmarking code. Can we achieve the same?

Specifically with this gadget, it seems not. We can look for a better candidate, with a large codebase, rather than a just simple example.

It's all just about the design of the API. We need to have cells on input and output and keep track of where exactly they do assign_advice relative to input (to add copy_constraint to permutation). However, here it's behind three layers of abstraction.

@cyphersnake
Copy link
Collaborator Author

There are several places that I am not sure if it is formatting or your change unless very carefully read the code. The problem now depends on whether you mixed the formatting and actually in one commit or not. It is relatively easy to separate the PR if not mixed.

I reviewed again, the only formatting here is removing the Sha256Instructions take from main.rs and modifying the imports. I can separate this into a separate commit little later

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Specifically with this gadget, it seems not. We can look for a better candidate, with a large codebase, rather than a just simple example.

It's all just about the design of the API. We need to have cells on input and output and keep track of where exactly they do assign_advice relative to input (to add copy_constraint to permutation). However, here it's behind three layers of abstraction.

API is what our users care most. Could you help me understand why the nova API is easier to integrate sha256? Is it inherently easier to design an API for folding r1cs?

If the information required by our APIs is behind three layers of abstraction, does it means that it's difficult for our library to integrate many existing circuits?

Is it possible to improve our API? Not regarding this particular PR, but how our API is designed is more important than choosing an example to make it work.

@cyphersnake
Copy link
Collaborator Author

@drouyang

The point is that inside halo2 we can operate with table cells (AssignedCell) inside which there are already Value with the field elements themselves. For copy_constraint (which will check that the cells are equal) we need them. When we supply the output of the previous step as input, we will also need to prove the fact that it is these supplied cells that are used as input.

Some projects, make their gadgets not easily reusable for more complex schemes, as this is not a typical scenario. For example halo2_gadget::sha256 takes u32 numbers as input and returns them as output. This is handy if you use the gadget as a simple circuit, but even if you want to use it as a part of several gadgets, you will be faced with difficulties and modifications.

There is no problem to operate at the level of external API - AssignedCell. It's a matter of designing specific solutions and I'll look tomorrow for examples of gadgets that will be easy for us to integrate.

As for Nova - there's just a smaller zoo of solutions and no such layering of abstractions. As far as I understand from studying existing projects, halo2 has evolved rapidly and approaches to writing gadgets have evolved, so there is no single style that we can easily adapt to.

In short, the minimum we require from the schema is meta-information about where the input and output lies in the table, in which cell. If the gadget hides this from us, modification is required. Not all solutions hide this information deeply.

@cyphersnake
Copy link
Collaborator Author

There is a second way - try to modify halo2 so that we can query already assigned cells and take meta-information from Layouter (responsible for assigning information to an existing table). But this option is much more difficult to support and implement and may not be theoretically possible, research needed. But here I'm more indicating a potential opportunity

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Thank you for the information. Sounds like the circuit has to be written in specific ways to make it easy to integrate with Sirius.

Modifying halo2 makes our library more general, but it's also a longer shot. We can explore this option later.

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Is this one easier than the sha256 circuit from halo2? https://github.com/filecoin-project/bellperson/blob/master/src/gadgets/sha256.rs

@drouyang
Copy link
Contributor

drouyang commented Oct 9, 2023

Just checked out example/sha256.

Based on my experience with hash algorithms, we are working on an unnecessarily complicated sha256 circuit codebase. I suggest pausing and revisiting our options.

The sha256 gadget from halo2 looks like lookup-based, where the table16 implementation is complicated.
https://github.com/snarkify/sirius/tree/main/examples/sha256/table16

The bellperson sha256 circuit used to benchmark nova is simple: https://github.com/filecoin-project/bellperson/blob/master/src/gadgets/sha256.rs

If our goal is to compare with Nova, then our current approach will not result in an apple-to-apple comparison.

@drouyang drouyang self-requested a review October 9, 2023 19:27
@drouyang
Copy link
Contributor

Discussed with @chaosma and the Celer Network. Turns out the bellperson circuit is not applicable to halo2 because it was r1cs-based. Celer network also used the halo2 sha256 gadget for (non folding) benchmarking. So we will stick with the halo2 sha256 gadget. Chao will review the necessary changes. cc @cyphersnake

in `StepCircuit::synthesize` method
This commit enhances the `sha256` function in `examples/sha256/main.rs` by implementing
 a feature that converts field values to block words. With this change,
 field values are first transformed into field_value (by `Repr` form), then into
 `u32` type before being converted into a block word. These block words are eventually
 passed as an argument to the `digest`
We didn't need a dependency on the usual `trait Circuit`, this on the contrary complicates the example code
- Incorrect associated type was used
- The original impl on Circuit is now unnecessary
@cyphersnake
Copy link
Collaborator Author

@chaosma Checked manually that everything was calculated correctly & copy-constraint work

self.cur_block.extend_from_slice(rem);
type B = pallas::Base;
// TODO
const ARITY: usize = BLOCK_SIZE / 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) When we choose ARITY to be a different value, e.g. ARITY=BLOCK_SIZE, what do we do for the input/output of StepCircuit? it seems in this case we don't need double the output for the next iteration.

(2) If we choose ARITY=1, then we could split the output of 256 bits into 8 chunks/blocks. This way both z_in, z_out has arity 1. And we are just calculating sha256^{n}(input). But I understand this approach need more modification.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Our output size is always 8 bytes. Because double hash is not allowed, we double the input
  2. Yes, in this case, this function will have to independently prove the conversion of output and input

Table16Chip::load(config.clone(), layouter)?;
let table16_chip = Table16Chip::construct(config);

let input: [AssignedCell<B, B>; BLOCK_SIZE] = iter::repeat(z_in.iter())
Copy link
Collaborator

@chaosma chaosma Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we choose ARITY=BLOCK_SIZE, it seems that we don't need to repeat

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to double the output, which is worse, because the output is a hash

@cyphersnake cyphersnake merged commit df71595 into main Oct 11, 2023
1 check passed
@cyphersnake cyphersnake deleted the 12-stepcircuit-sha256-example branch October 11, 2023 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add StepCircuit support
3 participants