feat: add new parallel implementation for permute_expression_pair #189

Open
wants to merge 2 commits into
base: main

jonathanpwang

A new algorithm to get (A', S') that is fully multi-threaded: this is a different algorithm than the original permute_expression_pair_seq.

I observed that the previous computation was still single-threaded in some places, which becomes a bottleneck for larger circuits. This is because the way A', S' need to be permuted is rather esoteric and not very parallel-friendly.

My implementation isn't optimized by any means: I just aggressively use rayon and fold on BTreeMap (in Axiom's repo I used HashMap so I'm not sure of the performance difference with BTreeMap).

Also I'm not sure why indexing into Range wasn't implemented on Polynomial before, while it was on RangeTo. I could also add it for RangeFrom if there's interest.
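
For illustration, here is a minimal sketch of what range indexing could look like. The `Polynomial` struct below is a simplified stand-in with a single `values: Vec<F>` field; that layout is an assumption made for this example, not the actual halo2_proofs definition. Each impl simply forwards to the underlying slice.

```rust
use std::ops::{Index, IndexMut, Range, RangeFrom};

/// Simplified stand-in for a polynomial type (assumed layout, for illustration only).
pub struct Polynomial<F> {
    pub values: Vec<F>,
}

/// `poly[a..b]` yields the sub-slice of evaluations/coefficients.
impl<F> Index<Range<usize>> for Polynomial<F> {
    type Output = [F];

    fn index(&self, range: Range<usize>) -> &[F] {
        &self.values[range]
    }
}

/// Mutable counterpart, so `poly[a..b]` can be written to in place.
impl<F> IndexMut<Range<usize>> for Polynomial<F> {
    fn index_mut(&mut self, range: Range<usize>) -> &mut [F] {
        &mut self.values[range]
    }
}

/// `poly[a..]` yields everything from index `a` to the end.
impl<F> Index<RangeFrom<usize>> for Polynomial<F> {
    type Output = [F];

    fn index(&self, range: RangeFrom<usize>) -> &[F] {
        &self.values[range]
    }
}
```

Out-of-bounds ranges panic exactly as they would on the underlying `Vec`, so the impls add no new failure modes.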

@CPerezz CPerezz self-requested a review July 7, 2023 09:17
Member

@CPerezz CPerezz left a comment

I still need to analyze the algorithm in depth. So far it looks good. I will give it another look later today.

Also, I left some comments. It would be nice to see at least some numbers on the performance and memory consumption changes so that we know whether this is feasible to merge.

if input_ranges.is_empty() {
input_ranges.push((coeff, 0..count));
} else {
let prev_end = input_ranges.last().unwrap().1.end;
Member

If we're already checking that the vector is non-empty, we can use `unwrap_unchecked` here.

Suggested change
- let prev_end = input_ranges.last().unwrap().1.end;
+ let prev_end = unsafe { input_ranges.last().unwrap_unchecked().1.end };
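
As a possible third option, the emptiness check and the lookup can be folded into a single `Option::map_or` call, which needs neither `unwrap` nor `unsafe`. This is only a sketch: `push_range` is a hypothetical helper name, and it assumes the `else` branch (not shown in the snippet above) pushes `prev_end..prev_end + count`.

```rust
use std::ops::Range;

/// Appends a run of `count` rows for `coeff`, starting where the previous run ended.
/// Hypothetical helper; assumes ranges are laid out back to back.
fn push_range<T>(input_ranges: &mut Vec<(T, Range<usize>)>, coeff: T, count: usize) {
    // `map_or` handles the empty case (start at 0) and the non-empty case
    // (start at the previous end) in one expression.
    let start = input_ranges.last().map_or(0, |(_, r)| r.end);
    input_ranges.push((coeff, start..start + count));
}
```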

},
)
.reduce_with(|r1, mut r2| {
let r1_end = r1.last().unwrap().1.end;
Member

Not sure how we know we will never panic here.

Author

Each r1 is the result of the previous fold step. As long as the fold runs over a nonempty iterator, its output is nonempty. So I believe r1 is nonempty unless input_uniques is empty.
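
The non-emptiness argument can be illustrated with a small std-only sketch. It is not the code in this PR: `fold_chunk`, `merge`, and `runs_of` are hypothetical names, `Iterator::reduce` stands in for rayon's `reduce_with`, and the input is assumed to be a sorted slice of `u32` values split into chunks.

```rust
use std::ops::Range;

type Runs = Vec<(u32, Range<usize>)>;

/// Folds one chunk into runs of equal values; ranges are local to the chunk.
/// A non-empty chunk always produces a non-empty result.
fn fold_chunk(chunk: &[u32]) -> Runs {
    let mut runs: Runs = Vec::new();
    for &v in chunk {
        if let Some((prev, r)) = runs.last_mut() {
            if *prev == v {
                r.end += 1; // extend the current run
                continue;
            }
        }
        let start = runs.last().map_or(0, |(_, r)| r.end);
        runs.push((v, start..start + 1));
    }
    runs
}

/// Merges two partial results, shifting `r2`'s ranges to continue after `r1`.
fn merge(mut r1: Runs, r2: Runs) -> Runs {
    // Safe to expect: every partial result comes from a non-empty chunk.
    let r1_end = r1.last().expect("partial results are non-empty").1.end;
    for (v, r) in r2 {
        if let Some((prev, last)) = r1.last_mut() {
            if *prev == v {
                last.end = r.end + r1_end; // run continues across the chunk boundary
                continue;
            }
        }
        r1.push((v, (r.start + r1_end)..(r.end + r1_end)));
    }
    r1
}

/// Sequential stand-in for a parallel fold followed by `reduce_with`.
/// Returns `None` only when `sorted` is empty. `chunk_size` must be non-zero.
fn runs_of(sorted: &[u32], chunk_size: usize) -> Option<Runs> {
    sorted.chunks(chunk_size).map(fold_chunk).reduce(merge)
}
```

`slice::chunks` never yields an empty chunk, so the `expect` in `merge` can only be reached with non-empty operands, and the empty-input case surfaces as `None` instead of a panic.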

Comment on lines +484 to +488
// didn't want to bother with Sync rng or anything so just do this part sequentially
let blinding: Vec<(C::Scalar, C::Scalar)> = (usable_rows..params.n() as usize)
.into_iter()
.map(|_| (C::Scalar::random(&mut rng), C::Scalar::random(&mut rng)))
.collect();
Member

Hmm, we could file an issue if this turns out to be critical.

@CPerezz
Member

CPerezz commented Jul 7, 2023

> Also I'm not sure why indexing into Range wasn't implemented on Polynomial before, while it was on RangeTo. I could also add it for RangeFrom if there's interest.

I'm planning to write a polynomial lib for ff/group-based libs, something similar to ark-poly in arkworks, covering not only univariate but also multivariate polynomials.
Would that be interesting to you? @jonathanpwang

@jonathanpwang
Author

> I still need to analyze the algorithm in depth. So far it looks good. I will give it another look later today.
>
> Also, I left some comments. It would be nice to see at least some numbers on the performance and memory consumption changes so that we know whether this is feasible to merge.

Sure, what kind of machine do you usually bench on? Not sure my Macbook will be a good standard heh.

> I'm planning to write a polynomial lib for ff/group-based libs, something similar to ark-poly in arkworks, covering not only univariate but also multivariate polynomials.

Sounds nice! Right now I don't use polynomial stuff much outside of halo2_proofs, but having all the traits that let you use polynomials as slices would be very helpful: things like this indexing, `AsRef<[F]>`, `Deref<Target = [F]>`, etc.

@CPerezz
Member

CPerezz commented Jul 10, 2023

> Sure, what kind of machine do you usually bench on? Not sure my Macbook will be a good standard heh.

I bench on my laptop too. It has 16 CPUs, which is enough to see whether performance is indeed better.
I'd say your Mac should be fine for seeing the general trend. Otherwise @ed255 might know how we can bench this on our servers.

@ed255
Member

ed255 commented Jul 10, 2023

> Sure, what kind of machine do you usually bench on? Not sure my Macbook will be a good standard heh.

> I bench on my laptop too. It has 16 CPUs, which is enough to see whether performance is indeed better. I'd say your Mac should be fine for seeing the general trend. Otherwise @ed255 might know how we can bench this on our servers.

We have done some end-to-end benchmarking on AWS servers with between 16 and 128 cores. We haven't really studied the results at high core counts, so I think benchmarks with 8 or 16 cores should be good enough for this PR.

@CPerezz
Member

CPerezz commented Jul 12, 2023

@jonathanpwang Did you collect some numbers on your Mac?

@CPerezz
Member

CPerezz commented Aug 1, 2023

ping @jonathanpwang

For now, the benchmarks on the `bench_lookup` I added are:

Using `permute_expression_pair_par`:
```
Benchmarking bench-lookup/14: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.425916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.426375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.261ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.694833ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.660875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.887375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.239875ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.8s.
Benchmarking bench-lookup/14: Collecting 10 samples in estimated 6.8417 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.548208ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.5575ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.678708ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.956375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 20.183791ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.00175ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.986916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.358875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 21.128708ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.92425ms
bench-lookup/14         time:   [678.46 ms 686.72 ms 694.16 ms]
                        change: [-0.7211% +1.2598% +3.1373%] (p = 0.25 > 0.05)
                        No change in performance detected.
Benchmarking bench-lookup/15: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.454916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.66425ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.290333ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 12.5s.
Benchmarking bench-lookup/15: Collecting 10 samples in estimated 12.503 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.871916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 39.03175ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 44.727416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 42.948333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.489958ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 43.823041ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 39.592ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.593375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.861708ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 45.023333ms
bench-lookup/15         time:   [1.2285 s 1.2341 s 1.2393 s]
                        change: [-6.0038% -4.4123% -2.7223%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high mild
Benchmarking bench-lookup/16: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 92.282041ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 90.784875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 95.368958ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 23.9s.
Benchmarking bench-lookup/16: Collecting 10 samples in estimated 23.937 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 93.599166ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 95.992583ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 91.913625ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.482625ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 90.111875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 86.671916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 96.854666ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 102.468125ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 97.830583ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 87.925708ms
bench-lookup/16         time:   [2.3417 s 2.3644 s 2.3901 s]
                        change: [+2.2283% +4.1485% +6.0108%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking bench-lookup/17: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 214.102916ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 45.0s.
Benchmarking bench-lookup/17: Collecting 10 samples in estimated 45.000 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 199.65025ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 208.088875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 208.299666ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 199.684416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 199.761666ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 193.034458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 202.182375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 200.825375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 226.314541ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 211.914291ms
bench-lookup/17         time:   [4.3987 s 4.4299 s 4.4605 s]
                        change: [+0.7989% +1.9668% +3.0962%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Benchmarking bench-lookup/18: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 423.016291ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 85.7s.
Benchmarking bench-lookup/18: Collecting 10 samples in estimated 85.748 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 451.549291ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 469.336ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 429.5375ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 430.579041ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 435.976541ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 416.241875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 423.361041ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 436.833625ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 456.685458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 454.897541ms
bench-lookup/18         time:   [8.5101 s 8.5631 s 8.6175 s]
                        change: [+0.5067% +1.6515% +2.8407%] (p = 0.02 < 0.05)
                        Change within noise threshold.
```

Using `permute_expression_pair_seq`:
```
Benchmarking bench-lookup/14: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.35325ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.101125ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.721708ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.291333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.860208ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.553916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.965375ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.9s.
Benchmarking bench-lookup/14: Collecting 10 samples in estimated 6.8584 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.117458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.80025ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.169875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.03325ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.636166ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.170166ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.000416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.565958ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 18.970333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 19.007291ms
bench-lookup/14         time:   [679.11 ms 687.62 ms 696.42 ms]
                        change: [-1.5707% +0.1313% +1.8844%] (p = 0.89 > 0.05)
                        No change in performance detected.
Benchmarking bench-lookup/15: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.634625ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.915958ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 40.774625ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 12.6s.
Benchmarking bench-lookup/15: Collecting 10 samples in estimated 12.569 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.548583ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.074333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.807125ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.106458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.222541ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.021458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.411666ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.024416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.636541ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 41.869166ms
bench-lookup/15         time:   [1.2474 s 1.2597 s 1.2711 s]
                        change: [+0.9856% +2.0736% +3.1604%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Benchmarking bench-lookup/16: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 90.202208ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.318083ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.717125ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 23.8s.
Benchmarking bench-lookup/16: Collecting 10 samples in estimated 23.789 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.929041ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.316333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.588083ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.630916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.872ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.961416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 88.796833ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.80725ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.426916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 89.039833ms
bench-lookup/16         time:   [2.3537 s 2.3854 s 2.4165 s]
                        change: [-0.7038% +0.8909% +2.5108%] (p = 0.33 > 0.05)
                        No change in performance detected.
Benchmarking bench-lookup/17: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 187.758583ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 43.2s.
Benchmarking bench-lookup/17: Collecting 10 samples in estimated 43.177 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 187.717833ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 185.975333ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 187.108625ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 187.965416ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 188.279458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 188.287166ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 188.590291ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 186.355708ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 186.600458ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 188.600875ms
bench-lookup/17         time:   [4.3965 s 4.4531 s 4.5112 s]
                        change: [-0.8127% +0.5248% +2.0814%] (p = 0.52 > 0.05)
                        No change in performance detected.
Benchmarking bench-lookup/18: Warming up for 3.0000 s[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 405.103375ms

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 86.2s.
Benchmarking bench-lookup/18: Collecting 10 samples in estimated 86.153 s (10 iterations)[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 406.387125ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 402.833916ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 403.729208ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 404.397583ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 404.277833ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 409.32825ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 404.037583ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 403.349875ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 405.868083ms
[halo2_proofs/src/plonk/lookup/prover.rs:418] start.elapsed() = 403.907458ms
bench-lookup/18         time:   [8.5238 s 8.5804 s 8.6488 s]
                        change: [-0.7212% +0.2021% +1.2400%] (p = 0.70 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
```
@jonathanpwang
Author

Sorry, was busy. It was also a bit hard because there was no existing lookup benchmarking using a real prover. I originally wanted to test just the permute_expression_pair function, but due to crate privacy issues this wasn't possible.

I added a very basic lookup bench, which is not very comprehensive since the lookups are very uniform. It uses IPA, but I don't think that matters for benchmarking lookup permutation.

For now, the benchmarks on the bench_lookup I added are in this commit message: e0b4479

On my laptop (M2 Max), the parallelized version doesn't seem to be any better, but I think this is because the circuit I used is too simple. I'd like to run the benchmarks on the zkevm keccak circuit, but I need to resolve some versioning issues for that first. I will post an update once I have those benchmarks.
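
To put a number on "isn't any better", the sketch below compares the middle value of each Criterion `time:` triple from the two runs quoted in this thread. The helper names are hypothetical, and the comparison ignores the confidence intervals, so treat it as a rough reading rather than a statistical claim.

```rust
/// Middle `time:` values in seconds, copied from the Criterion output quoted in
/// this thread: (benchmark parameter, permute_expression_pair_par, permute_expression_pair_seq).
const ESTIMATES: [(u32, f64, f64); 5] = [
    (14, 0.68672, 0.68762),
    (15, 1.2341, 1.2597),
    (16, 2.3644, 2.3854),
    (17, 4.4299, 4.4531),
    (18, 8.5631, 8.5804),
];

/// Relative change of `par` versus `seq` in percent (negative means `par` is faster).
fn relative_change_percent(par: f64, seq: f64) -> f64 {
    (par - seq) / seq * 100.0
}

/// Returns (benchmark parameter, relative change in percent) for every row above.
fn relative_changes() -> Vec<(u32, f64)> {
    ESTIMATES
        .iter()
        .map(|&(k, par, seq)| (k, relative_change_percent(par, seq)))
        .collect()
}
```

On these numbers the parallel version is ahead by roughly 0.1% to 2% at every size, which is about the same magnitude as the run-to-run changes Criterion itself reports in the logs.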

@alexander-camuto

alexander-camuto commented Aug 24, 2023

@jonathanpwang @CPerezz We have some pretty large circuits that we could benchmark on if that helps. Would need to update this branch with #192 once merged for it to work -- then can send some numbers yonder

4 participants