Addition of shortcuts for bases that are powers of two and for base 10 to mp_radix_size #371

czurnieden · 2019-10-13T02:59:06Z

Both shortcuts are implemented as the internal functions s_mp_radix_size_radix_10 and s_mp_log_power_of_two so it would be easy to make a function mp_radix_size restricted to the bases {2,4,8,10,16,32,64} if that is wanted for version 2.0.0.

sjaeckel · 2019-10-14T12:52:24Z

bn_s_mp_radix_size_radix_10.c

+                const uint64_t inv_log_2_10 = {0x4d104d427de7fbccULL};
+             when MP_8BIT got the boot.
+    */
+   const uint16_t inv_log_2_10[4] = {0x4d10u, 0x4d42u, 0x7de7u, 0xfbccu};


how about

    const uint8_t inv_log_2_10[] = {0x4du, 0x10u, 0x4du, 0x42u, 0x7du, 0xe7u, 0xfbu, 0xccu};
    mp_int bi_bit_count, bi_k, t;
    int i, bit_count;
    if ((err = mp_init_multi(&bi_bit_count, &bi_k, &t, NULL)) != MP_OKAY) {
       return err;
    }
    if ((err = mp_from_ubin(&bi_k, inv_log_2_10, sizeof(inv_log_2_10))) != MP_OKAY) {
       return err;
    }
    ...

Yes, that would be simpler (and most likely even faster), but I didn't want to add a dependency on a function that you might not use elsewhere, and once MP_8BIT is gone it is down to just two mp_get_u32, one big-shift and one big-add.

But if you like it more with mp_from_ubin: drop me a note and I'll change it, no problem.

I would prefer from_ubin which is the canonical importer for large integers from binary data. Alternatively you statically initialize the const mp_int, but then you have to recompute the array for each supported digit size.

Added a branch for MP_16BIT, which is the only configuration left that needs that part now that MP_8BIT got the boot. Every other digit size must support 64-bit integers to function, which allows for a simple mp_set_u64 for the rest.

Ok, but please unroll the loop. set, mul, set, add.
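
For readers following the thread, the arithmetic behind the constant can be sketched without any big-integer code. This is only an illustration, not the code from the PR: it assumes that `0x4d104d427de7fbcc` is approximately log10(2) scaled by 2^64 (the name `inv_log_2_10` suggests 1/log2(10)), that the bit count fits in 32 bits, and the function name `decimal_digits_estimate` is made up for this sketch.

```c
#include <stdint.h>

/* Approximately log10(2) * 2^64; the 64-bit constant quoted in the review thread. */
#define INV_LOG2_10_Q64 UINT64_C(0x4d104d427de7fbcc)

/*
 * Hypothetical helper: estimate the number of decimal digits of a number
 * that is bit_count bits long as floor(bit_count * log10(2)) + 1.
 * The 96-bit product is formed from two 32x32 -> 64 bit multiplications,
 * so no 128-bit type is needed.
 */
uint32_t decimal_digits_estimate(uint32_t bit_count)
{
   const uint64_t hi = INV_LOG2_10_Q64 >> 32;                  /* upper 32 bits of the constant */
   const uint64_t lo = INV_LOG2_10_Q64 & UINT64_C(0xffffffff); /* lower 32 bits of the constant */
   uint64_t t;

   t = (uint64_t)bit_count * lo;              /* low partial product            */
   t = (uint64_t)bit_count * hi + (t >> 32);  /* add its carry into the high one */
   return (uint32_t)(t >> 32) + 1u;           /* drop the fraction, add one      */
}
```

A 10-bit number (at most 1023) yields 4 here, and a 64-bit number yields 20. For an input that is just above a power of two the estimate can be one digit too large.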

sjaeckel · 2019-10-14T13:20:29Z

👍

minad · 2019-10-14T16:41:35Z

Can we discuss this later for 2.0 and concentrate on finishing 1.2 now?

sjaeckel · 2019-10-14T17:03:57Z

> Can we discuss this later for 2.0 and concentrate on finishing 1.2 now?

fine by me, can you please put milestones on the open issues that you think should still go into 1.2?

minad · 2019-10-14T18:15:10Z

Ok, I will do. If nothing remains will you create a separate branch and we start on 2.0 in develop?

czurnieden · 2019-10-14T19:12:21Z

@sjaeckel Yes, I must admit, I like this one most, too.

@minad

> Can we discuss this later for 2.0 and concentrate on finishing 1.2 now?

Don't worry, we're still threadsafe.

> Ok, I will do. If nothing remains will you create a separate branch and we start on 2.0 in develop?

I hope you're talking to @sjaeckel here? ;-)

minad · 2019-10-22T19:26:55Z

@czurnieden alternatively to #368 use this, which optimizes the function for base 10 only? And then don't provide mp_radix_overestimate?

minad · 2019-10-22T19:27:13Z

What do you prefer?

czurnieden · 2019-10-22T20:00:34Z

> alternatively to #368 use this, which optimizes the function for base 10 only?

This one has the extra functions for base 10 and powers-of-two and does the rest with mp_log

> And then don't provide mp_radix_overestimate?

The original mp_radix_overestimate (#369) is exact, not an overestimate (was an error in the test-rig) so it might be an alternative. Fast (O(1) with little overhead, especially since MP_8BIT is gone) but nearly twice the size.

The second version, #368 , is a really rough one (but can be tamed down with an additional small table) and is, in my opinion, not really an alternative, even with the extra table.

> What do you prefer?

I prefer this one: O(1) for base 10 and powers-of-two, and the rest with mp_log (O(log n)). There are not many cases where you need anything more than the bases 2, (8), 16, 10, and 64.
It is slightly bigger than #368 (if #368 has the extra table), but not by much.
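
Why the power-of-two case is O(1) in the size of the number can be sketched in a few lines: a digit in base 2^k holds exactly k bits, so the digit count is the bit count divided by k, rounded up. The following is a minimal sketch under that assumption, not the `s_mp_log_power_of_two` from this PR; the names `log2_if_power_of_two` and `digits_power_of_two` are hypothetical.

```c
#include <stddef.h>

/* Return k if radix == 2^k with k >= 1, otherwise 0. */
static unsigned log2_if_power_of_two(unsigned radix)
{
   unsigned k = 0u;
   if ((radix < 2u) || ((radix & (radix - 1u)) != 0u)) {
      return 0u;
   }
   while ((radix >>= 1) != 0u) {
      k++;
   }
   return k;
}

/*
 * Number of digits needed to print a bit_count-bit number in a power-of-two
 * radix. Returns 0 if the radix is not a power of two, so the caller can
 * fall back to a general method.
 */
size_t digits_power_of_two(size_t bit_count, unsigned radix)
{
   const unsigned k = log2_if_power_of_two(radix);
   if (k == 0u) {
      return 0u;
   }
   if (bit_count == 0u) {
      return 1u;                           /* zero still prints one digit */
   }
   return (bit_count + k - 1u) / k;        /* ceil(bit_count / k)         */
}
```

The loop runs over the bits of the radix (at most six iterations for radix 64), not over the number itself, and the result is exact because no rounding of a logarithm is involved.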

And there was the discussion that we restrict LTM to these bases which can be done with this PR quite quickly: just add a test for the bases given and replace the call to mp_log with a call to s_mp_log_power_of_two.

Yes, I really like this one.

And now back to rebasing *sigh* ;-)

minad · 2019-10-22T20:13:31Z

I don't want to restrict ltm to only a few bases. But we could offer optimized versions for base 10 for sure.

Is the following a correct summary?

1. ~~Minimal mp_radix_size_overestimate, very large error #368 - O(1) estimation, small, error < 29?~~
2. mp_radix_size replacement O(1) for all bases, large tables #369 - O(1) exact replacement for mp_radix_size, but larger than current mp_log based version
3. Addition of shortcuts for bases that are powers of two and for base 10 to mp_radix_size #371 - mp_radix_size specialised for base 10 in O(1). Power of two bases already optimized via mp_log.

If we choose 2. or 3. we wouldn't need the overestimate function? This would make the API nicer. And we could still have the slower log fallback available via conditional MP_HAS compilation if that is preferred.

Edit: so it is either 2 or 3.

minad · 2019-10-22T20:22:21Z

I think I agree with you then - we should take this version.

minad · 2019-10-22T20:27:53Z

s_mp_radix_size_radix_10.c

+
+#define LTM_RADIX_SIZE_SCALE 64
+#define LTM_RADIX_SIZE_CONST_SHIFT 32
+int s_mp_radix_size_radix_10(const mp_int *a, size_t *size)


@czurnieden why don't you add a function s_mp_log10 which is called from mp_log and used indirectly by mp_radix_size?

I was, and still am, a bit torn between calling that optimization in mp_log or in mp_radix_size.

Calling it in mp_radix_size allows for easy reduction to the restricted radix-set 2,4,8,10,16,32,64 and would also get rid of the dependency on mp_log, which isn't a very small function.

Calling it in mp_log is cleaner if we want to keep the full radix-set.

Mmh…
*strokes sesquipedalian beard*
I don't know.

Well, but you can strip down mp_log by disabling s_mp_log and only enabling the power of two and base 10 versions. We had that discussion in #389. I think it should go to mp_log since it is more general.

If you strip down 'mp_log' you don't have a general log-function anymore.

If you strip down 'mp_radix_size' you still have a radix-size function, just not for the small range of radices 2-64, only for the smaller range of powers-of-two and base 10.

You expect of a log-function that it works over the whole range, with no holes in it.

You expect a radix-size function to work over a very small range; it might even have holes in it. Restricting the string output/input to only a handful of bases, sometimes down to only two (10 and 16), won't get many complaints (vid. e.g.: printf).

Yes, but we don't get an optimized log function if we add the base10 optimization only to radix_size. We already have other functions with holes in them if configured as such.
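
The argument for putting the shortcut into the log function can be illustrated with a toy model on `uint64_t` instead of `mp_int`. It is a sketch under simplifying assumptions and not code from this PR; `bit_length`, `ilog_generic`, `ilog` and `radix_digits` are hypothetical names, and only the power-of-two shortcut is shown. Because the shortcut lives in `ilog`, both direct callers of the logarithm and `radix_digits` benefit from it.

```c
#include <stdint.h>

/* Number of significant bits in n (0 for n == 0). */
static unsigned bit_length(uint64_t n)
{
   unsigned bits = 0u;
   while (n != 0u) {
      bits++;
      n >>= 1;
   }
   return bits;
}

/* General fallback: floor(log_base(n)) for n >= 1, base >= 2, by repeated division. */
static unsigned ilog_generic(uint64_t n, unsigned base)
{
   unsigned result = 0u;
   while (n >= base) {
      n /= base;
      result++;
   }
   return result;
}

/* floor(log_base(n)) for n >= 1, base >= 2, with a shortcut for bases 2^k. */
unsigned ilog(uint64_t n, unsigned base)
{
   if ((base & (base - 1u)) == 0u) {
      unsigned k = 0u, b = base;
      while ((b >>= 1) != 0u) {
         k++;                               /* base == 2^k */
      }
      return (bit_length(n) - 1u) / k;      /* floor(log2(n) / k) */
   }
   return ilog_generic(n, base);
}

/* Digits needed to print n in the given base; built on top of ilog. */
unsigned radix_digits(uint64_t n, unsigned base)
{
   return (n == 0u) ? 1u : (ilog(n, base) + 1u);
}
```

In this model, stripping the general fallback out of `ilog` would leave a logarithm that only works for some bases, which is the "holes" concern raised above.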

czurnieden · 2019-10-22T20:45:45Z

> If we choose 2. or 3. we wouldn't need the overestimate function?

Yepp, exactly.

minad · 2019-10-22T21:30:17Z

bn_mp_log_u32.c

+/* SPDX-License-Identifier: Unlicense */
+
+/* Compute log_{base}(a) */
+static mp_word s_pow(mp_word base, mp_word exponent)


This file is called mp_log_u32.c in develop

Ah, great, knew that this large renaming would get me at some point! ;-)

czurnieden · 2019-10-25T20:56:41Z

I think I'll put this to rest, too and bury it beside #401

minad · 2019-10-25T23:45:18Z

@czurnieden so shall we consider adding mp_radix_size_overestimate again?

czurnieden · 2019-10-26T01:57:28Z

> so shall we consider adding mp_radix_size_overestimate again?

Back to where we started? Ok.
But which one?

1. The full table for all bases [-0, +1]
2. a. The smallest full table with the brutal error [-0,+28,000]
   b. The small full table with the not-so-brutal error [-0,+200] (upper limit approx.)
3. Powers-of-two (from mp_log?) [±0] and base 10 only [-0,+1]

(You may add one to the upper limit for the sign to skip testing for the sign)

Using the full tables makes only sense when we have a fast number conversion for all bases. I hacked something together to give @MasterDuke17 an example of how he might be able to extend his Barrett_toDecimal. I can clean it up (it's a bit of a mess) but it still will be relatively large.

On the other hand, the Barrett_toDecimal seems to work quite well as it is now and is not very large. The only thing left is to enlarge the leaves (4 decimal digits is way too small; 500-600 bits (tunable) seems to be a better cut-off) and maybe make use of the fact that 2 * 5 = 10.

So: shall all bases belong to us, or shall we restrict versions ≥2.0.0 to the small set {2, 4, 8, 10, 16, 32, 64}?

minad · 2019-10-26T05:27:05Z

We shall not restrict the bases but we can provide faster to_radix/to_radix_overestimate for 10, 2^n. This is what I would like :)

czurnieden requested review from minad and sjaeckel October 13, 2019 03:14

sjaeckel reviewed Oct 14, 2019

minad added this to the v2.0.0 milestone Oct 14, 2019

czurnieden force-pushed the radix_size_exact_table branch from eae202a to 5968fcc Compare October 14, 2019 18:49

czurnieden mentioned this pull request Oct 14, 2019

mp_radix_size replacement O(1) for all bases, large tables #369

Closed

czurnieden force-pushed the radix_size_exact_table branch 2 times, most recently from ae47fef to 393bf3d Compare October 15, 2019 19:13

minad added feedback required work in progress labels Oct 16, 2019

czurnieden force-pushed the radix_size_exact_table branch 2 times, most recently from b7a420e to bbdd25c Compare October 19, 2019 18:34

czurnieden added 4 commits October 22, 2019 22:05

Addition of shortcuts for bases that are powers of two and for base 1…

fd7c847

…0 to mp_radix_size

adaption of mp_radix_size to new mp_log_u32

223e7fc

Addition of BN_MP_LOG_U32 to tommath_superclass.h

ccedd6a

rebase

aab7e6c

further refinement in s_mp_radix_size_radix_10

ae492e0

czurnieden force-pushed the radix_size_exact_table branch from bbdd25c to ae492e0 Compare October 22, 2019 20:25

minad reviewed Oct 22, 2019

minad mentioned this pull request Oct 22, 2019

Minimal mp_radix_size_overestimate, very large error #368

Closed

added branch for MP_16BIT to s_mp_radix_size_radix_10

39cc86e

minad reviewed Oct 22, 2019

czurnieden added 2 commits October 22, 2019 23:35

unrolled loop

9bf64b3

cleanup

04c1332

minad mentioned this pull request Oct 22, 2019

add s_mp_log10 #401

Closed

czurnieden closed this Oct 25, 2019

minad mentioned this pull request Oct 25, 2019

mp_radix_size_overestimate #415

Closed
