
Preliminary commit for min const generics #182

Closed
wants to merge 12 commits

Conversation


@jon-chuang (Contributor) commented Jan 18, 2021

Description

To do:

  • match to select a literal loop-unrolled impl for square/mul (= very verbose macro-expanded source)...
  • Get rid of weird artifacts due to `[T; N * 2]` being impossible when `N` is generic

Tests should pass on nightly 1.51.
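The `[T; N * 2]` artifact in the second to-do item comes from min const generics only allowing a bare parameter `N` or a literal as an array length. A minimal sketch of one possible workaround, carrying the doubled width as a second const parameter and checking the relationship at runtime (hypothetical code, not the PR's actual approach):

```rust
// Min const generics (Rust 1.51) reject `[u64; N * 2]` when N is generic.
// One workaround: take the widened width M as its own const parameter and
// assert M == 2 * N dynamically, since the signature cannot express it.
fn widen<const N: usize, const M: usize>(limbs: [u64; N]) -> [u64; M] {
    assert_eq!(M, N * 2, "M must be exactly 2 * N");
    let mut out = [0u64; M]; // a bare parameter is a legal array length
    out[..N].copy_from_slice(&limbs);
    out
}

fn main() {
    // The caller must spell out both widths; the compiler cannot infer M = 2N.
    let wide: [u64; 8] = widen::<4, 8>([1, 2, 3, 4]);
    assert_eq!(wide, [1, 2, 3, 4, 0, 0, 0, 0]);
}
```

This extra parameter is exactly the kind of artifact the to-do item wants to remove; a future extension allowing const expressions in array lengths would make it unnecessary.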

Thoughts:

  • Limitations still exist in min const generics, especially for our literal-dependent loop-unrolling use case and for const evaluation with generic parameters. These are currently worked around.
  • Not sure how much real gain there is: yes, fewer macros, but many breaking changes to imports and a lot of copy-pasta like `BigInt::<4>([0, 0, ..])`

Update:
TODO:

  • Check for performance regressions

Thoughts:

  • Currently, `mul_without_reduce` is instantiated with loops that are not forced to be unrolled, so it is unclear how the compiler will behave. However, it is only used in runtime code when `NO_CARRY` is false, so I don't think it's a big issue. Still, it is worth keeping an eye on.

Currently, this is how mul_assign looks macro expanded:

    impl<'a, P: FpParams<N>, const N: usize> MulAssign<&'a Self> for Fp<P, N> {
        #[inline]
        fn mul_assign(&mut self, other: &Self) {
            if P::NO_CARRY {
                let input = &mut (self.0).0;
                let other_ = (other.0).0;
                match N {
                    1 => mul_assign_id1::<P, N>(input, other_),
                    4 => mul_assign_id4::<P, N>(input, other_),
                    5 => mul_assign_id5::<P, N>(input, other_),
                    6 => mul_assign_id6::<P, N>(input, other_),
                    12 => mul_assign_id12::<P, N>(input, other_),
                    13 => mul_assign_id13::<P, N>(input, other_),
                    _ => ::core::panicking::panic("internal error: entered unreachable code"),
                };
                self.reduce();
            } else {
                *self = self.mul_without_reduce(other, P::MODULUS, P::INV);
                self.reduce();
            }
        }
    }

So, rather neat. But notice that we define a separate function for each number of limbs. Since we can control exactly which limb counts are instantiated via the `limb_instantiation!(1, 4, 5, 6, 12, 13);` macro, I think the unsavoury code bloat is substantially contained.
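The dispatch shape in the expanded code above can be sketched with a small declarative macro that stamps out one match arm per supported limb count; inside each arm the count is a literal, so the callee is monomorphized on a concrete `N` and can be fully unrolled by the optimizer. All identifiers here (`dispatch_on_limbs!`, `double_limbs`) are illustrative stand-ins, not the PR's actual names:

```rust
use std::convert::TryInto;

// A toy per-limb operation, monomorphized on the limb count N.
fn double_limbs<const N: usize>(limbs: &mut [u64; N]) {
    for i in 0..N {
        limbs[i] = limbs[i].wrapping_mul(2);
    }
}

// Stamp out one match arm per supported limb count. In each arm the count is
// a literal, so the slice converts to a fixed-size array reference and the
// callee is instantiated with a concrete N.
macro_rules! dispatch_on_limbs {
    ($n:expr, $input:expr, [$($lit:literal),*]) => {
        match $n {
            $($lit => {
                let arr: &mut [u64; $lit] = $input.try_into().unwrap();
                double_limbs::<$lit>(arr);
            })*
            _ => unreachable!("unsupported limb count"),
        }
    };
}

fn main() {
    let mut limbs = [1u64, 2, 3, 4];
    dispatch_on_limbs!(limbs.len(), &mut limbs[..], [1, 4, 5, 6, 12, 13]);
    assert_eq!(limbs, [2, 4, 6, 8]);
}
```

The design trade-off matches the discussion above: the call site stays compact, but every listed limb count contributes a monomorphic copy of the callee to the binary.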


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (main)
  • Linked to Github issue with discussion and accepted design OR have an explanation in the PR that describes this work.
  • Wrote unit tests
  • Updated relevant documentation in the code
  • Added a relevant changelog entry to the Pending section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

    #[inline]
    #[ark_ff_asm::unroll_for_loops]
    fn into_repr(&self) -> $BigIntegerType {
    // #[ark_ff_asm::unroll_for_loops]
Member

Do you feel like the compiler can still optimize this well? Or should we try to do some manual unrolling inside the macro?

Contributor Author

Well, for now, since we can no longer take advantage of literals in macro instantiations, I've decided we should have match arms and do the loop unrolling for each possible number of limbs.

This is extremely ugly, and I'm not sure there is a workaround; the only plausible route for unrolling loops parametrised by a const is the proposed `#[optimize(..)]` compiler-hint attribute, which is not moving at the moment.

Contributor Author

So, basically, this is not the final form.

Contributor Author

One can make the actual impl very compact, but under the hood, after macro expansion, it will be very ugly, just like our asm code that selects among the num_limbs match arms.

Contributor Author

I think I will simply modify the proc macro, exposing a new one that creates the match arms up to some limit; the limit can be supplied as a meta attribute parsed as a usize.

Contributor Author

I think, for one specific problem, copying the data back over in each of the match arms, there would be a type mismatch, so it might make sense to do an unsafe transmute of a pointer; however, that would violate our desire to keep the code safe.
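The type mismatch described here can be illustrated with two hypothetical helpers: inside a `match n` arm the code "knows" the limb count is, say, 4, but the value still has type `[u64; N]`. The pointer-reinterpretation route is the transmute concern above; a checked slice conversion avoids the unsafety at the cost of a length check the optimizer can usually eliminate. Both function names are illustrative:

```rust
use std::convert::TryInto;

// 1. Unsafe: reinterpret the pointer. Sound only if N == 4 actually holds
// at this point (which the surrounding `match` arm is meant to guarantee).
unsafe fn as_four_unchecked<const N: usize>(limbs: &mut [u64; N]) -> &mut [u64; 4] {
    debug_assert_eq!(N, 4);
    &mut *(limbs.as_mut_ptr() as *mut [u64; 4])
}

// 2. Safe: go through a slice plus a checked conversion; the length check
// is a single branch that is trivially predictable.
fn as_four_checked<const N: usize>(limbs: &mut [u64; N]) -> &mut [u64; 4] {
    (&mut limbs[..]).try_into().expect("limb count must be 4")
}

fn main() {
    let mut limbs = [10u64, 20, 30, 40];
    as_four_checked(&mut limbs)[0] += 1;
    unsafe { as_four_unchecked(&mut limbs)[1] += 1 };
    assert_eq!(limbs, [11, 21, 30, 40]);
}
```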

ff/src/biginteger/mod.rs (outdated review threads, resolved)
@Pratyush (Member)

Amazing, thanks for this!

* match to select literal loop-unrolled impl for square/mul (= very verbose macro-expanded source)...

Maybe we can use techniques from the C++ world for this? Eg: https://www.informit.com/articles/article.aspx?p=30667&seqNum=7

It could get complicated though.

* Get rid of weird artifacts due to `[T; N * 2]` where N is generic being impossible

Hmm that's unfortunate; it's due to this issue: rust-lang/rust#43408

* Not sure how much real gains there are, yes fewer macros, but many breaking changes for imports and for many copy-pasta like `BigInt::<4>([0, 0, ..])`

Can't the breaking changes be worked around by still exporting {BigInteger,Fp}{256/320/384...}? Also, there would be nice gains in compile time, as the compiler is simply doing much less work.
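The suggestion of still exporting the fixed-width names could be sketched as plain type aliases over the generic type, so downstream `use` statements keep compiling while the representation becomes generic. The struct layout and alias names here are assumptions for illustration, not the crate's actual definitions:

```rust
// The generic big integer, parameterised by limb count.
pub struct BigInt<const N: usize>(pub [u64; N]);

// Backwards-compatible fixed-width names as zero-cost aliases.
pub type BigInteger256 = BigInt<4>;
pub type BigInteger320 = BigInt<5>;
pub type BigInteger384 = BigInt<6>;

fn main() {
    // Old-style name, new generic representation underneath.
    let x: BigInteger256 = BigInt([1, 2, 3, 4]);
    assert_eq!(x.0.len(), 4);
}
```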

    };
    use ark_serialize::*;

    pub trait FpParams<const N: usize>: FpParameters<BigInt = BigInt<N>> {}
Member

I think we can get rid of this trait now, no?

Contributor Author

Well, for now, I use it to instantiate NO_CARRY as a const param; otherwise, it was troublesome to ensure that it would be evaluated at compile time.
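A minimal sketch of why hoisting the flag into the params trait helps: as an associated const, `P::NO_CARRY` is a compile-time constant after monomorphization, so `if P::NO_CARRY` selects one branch statically and the other is dead code. The trait shape mirrors the `FpParams` above, but `DemoParams` and its values are placeholders:

```rust
// The flag lives on the parameter trait, so each instantiation fixes it
// at compile time.
trait FpParams<const N: usize> {
    const NO_CARRY: bool;
}

// Illustrative parameter set; name and value are placeholders.
struct DemoParams;

impl FpParams<4> for DemoParams {
    const NO_CARRY: bool = true;
}

fn mul_strategy<P: FpParams<N>, const N: usize>() -> &'static str {
    // After monomorphization this `if` is a constant branch; the untaken
    // arm is eliminated entirely.
    if P::NO_CARRY {
        "no-carry multiplication"
    } else {
        "generic multiplication with carry"
    }
}

fn main() {
    assert_eq!(mul_strategy::<DemoParams, 4>(), "no-carry multiplication");
}
```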

@jon-chuang (Contributor Author)

jon-chuang commented Jan 21, 2021

Maybe we can use techniques from the C++ world for this? Eg: https://www.informit.com/articles/article.aspx?p=30667&seqNum=7

It could get complicated though.

Well, for this I'm not sure it will work; notice the metaprogramming examples use literals as the parameter. I think something a lot of people would like is the ability to do metaprogramming based on a const. Unfortunately, consts are evaluated at the MIR level, where there is no longer any syntax to do metaprogramming with. Still, it would be really cool to have the ability to do metaprogramming at the MIR level.

Can't the breaking changes be worked around by still exporting {BigInteger,Fp}{256/320/384...}? Also, there would be nice gains in compile time, as the compiler is simply doing much less work.

We could. But one wonders if it is better to make the breaking change; personally, I'm torn about this.

@jon-chuang (Contributor Author)

So I would say at this point that I'm quite happy with the state of the PR, apart from checking for performance regressions, a careful review, and the number of unresolved questions about the API and so on. But the core of the code is looking pretty good, if you ask me.

@jon-chuang (Contributor Author)

jon-chuang commented Jan 21, 2021

Here is some preliminary data for the performance regression tests.

There appear to be some problems with the serialization and, more pertinently, with the into_repr implementation. Based on old benchmarks, however (#80 and arkworks-rs/snark#176), it does seem that into_repr is doing just fine and that the old impl is too fast, which implies something nefarious might be going on.

Apart from that, things seem alright.
[Screenshot: benchmark results, 2021-01-22]


I dug into serialize/deserialize and discovered that using vec! instead of an array was causing the slowdown; it is now fixed.

So perf wise, I would say there are no issues.
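The `vec!`-versus-array pitfall mentioned above can be illustrated with a toy serializer; both functions are hypothetical, not the crate's API. The `vec!` version heap-allocates a fresh buffer on every call, while the array version stays on the stack, and in a hot serialization loop that allocation shows up directly in benchmarks:

```rust
// Heap-allocating version: one allocator round trip per call.
fn serialize_limbs_vec(limbs: &[u64; 4]) -> Vec<u8> {
    let mut bytes = vec![0u8; 32]; // heap allocation per call
    for (i, limb) in limbs.iter().enumerate() {
        bytes[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_le_bytes());
    }
    bytes
}

// Stack-only version: no allocator involved.
fn serialize_limbs_array(limbs: &[u64; 4]) -> [u8; 32] {
    let mut bytes = [0u8; 32];
    for (i, limb) in limbs.iter().enumerate() {
        bytes[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_le_bytes());
    }
    bytes
}

fn main() {
    let limbs = [1u64, 2, 3, 4];
    // Identical output; only the allocation behaviour differs.
    assert_eq!(serialize_limbs_vec(&limbs), serialize_limbs_array(&limbs).to_vec());
}
```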

@jon-chuang jon-chuang closed this Jan 22, 2021
@jon-chuang jon-chuang mentioned this pull request Jan 22, 2021