@czurnieden
Contributor

Also changed mp_radix_size to use mp_ilogb while deprecating it at the same time and rewrote mp_fwrite to use mp_radix_sizeinbase instead.

See comment in bn_mp_fwrite.c for some of the consequences.

@czurnieden
Contributor Author

Oh, using deprecated functions (I added a test for mp_radix_size in test.c) is a full error, OK.
But only with GCC not clang?

I don't understand the Windows error but I assume it is for the same reason?

@czurnieden
Contributor Author

May I propose (again) to kill MP_8BIT?
'Cause it's killing me! ;-)

@sjaeckel
Member

sjaeckel commented Sep 7, 2019

May I propose (again) to kill MP_8BIT?
'Cause it's killing me! ;-)

let it rot, disable what doesn't work and add it to the deprecation list, we'll remove it after the next release

@czurnieden
Contributor Author

let it rot, disable what doesn't work and add it to the deprecation list, we'll remove it after the next release

Among the handful of 8-bit MCUs that have enough flash and RAM to run a stripped-down LTM, and enough circulation to make it worth the hassle, are the bigger versions of the ATxmega (and, to some extent, the biggest versions of the ATmega). I will see if I can get my hands on one in the next couple of days. No virtual MCUs here; I also need the power consumption with the internal clock (the datasheet seems to give only the values with an external clock?).
Is there some kind of main forum where most of the fanbois sit?

And how much work would it be?

czurnieden ~/GITHUB/libtommath O (radix_sizeinbase)$ find ./ -type f  -name '*c' -exec grep  MP_8BIT {} \; -print
#ifdef MP_8BIT
       * MP_8BIT (It is unknown if the Lucas-Selfridge test works with 16-bit
#if defined (MP_8BIT) || defined (LTM_USE_FROBENIUS_TEST)
#ifdef MP_8BIT
#ifdef MP_8BIT
./bn_mp_prime_is_prime.c
#ifndef MP_8BIT
#ifdef MP_8BIT
#ifdef MP_8BIT
./demo/test.c
#ifdef MP_8BIT
./demo/main.c
#ifndef MP_8BIT
./bn_mp_read_unsigned_bin.c
#ifndef MP_8BIT
./bn_mp_to_unsigned_bin.c
#ifndef MP_8BIT
   /* CZ TODO: Some of them need the full 32 bit, hence the (temporary) exclusion of MP_8BIT */
./bn_mp_prime_strong_lucas_selfridge.c
#ifndef MP_8BIT
./bn_prime_tab.c
      Since mp_radix_sizeinbase() can overshoot by one (two with MP_8BIT)
./bn_mp_fwrite.c
#if !defined(MP_8BIT)
#if defined(MP_64BIT) || !(defined(MP_8BIT) || defined(MP_16BIT))
./bn_mp_montgomery_setup.c
#ifdef MP_8BIT
#if ( (defined MP_8BIT) || (defined MP_16BIT) )
#if ( (defined MP_8BIT) || (defined MP_16BIT) )
#if ( (defined MP_8BIT) || (defined MP_16BIT) )
   /* There is no mp_set_u16 for MP_8BIT */
#if ( (defined MP_8BIT) && (INT_MAX > 0xFFFF))
#if ( (defined MP_8BIT) || (defined MP_16BIT) )
./bn_mp_radix_sizeinbase.c
#ifdef MP_8BIT
./mtest/mtest.c
#ifdef MP_8BIT
      /* (32764^2 - 4) < 2^31, no bigint for >MP_8BIT needed) */
./bn_mp_prime_frobenius_underwood.

Oh, that's not much, thought it to be more!
Plus wiki and "issue"-entry, bn.tex, ChangeLog, and the little hints you mentioned here and there (deprecation).
Nothing I would call exhausting.
OK. let's make it official (in another PR).

Oh, am away tonight, now, actually, (brother-in-law has his 50th and who says no to a free dinner? ;-) ), so till tomorrow.

Member

@sjaeckel sjaeckel left a comment

Looks good!

How about calling this mp_radix_size_approx()? ... I also wondered whether it would make sense to have a library setting to select between the two modes, but that's probably too much and brings more problems than advantages...

@sjaeckel
Member

sjaeckel commented Sep 7, 2019

OK. let's make it official (in another PR).

👍

Oh, am away tonight, now, actually, (brother-in-law has his 50th and who says no to a free dinner? ;-) ), so till tomorrow.

Enjoy & have fun!

@sjaeckel
Member

sjaeckel commented Sep 8, 2019

How about calling this mp_radix_size_approx()?

or mp_radix_size_fast()?

@minad
Member

minad commented Sep 10, 2019

mp_radix_size_overestimate?

@czurnieden
Contributor Author

czurnieden commented Sep 11, 2019

mp_radix_size_overestimate?

*grmbl*

This commit is just the renaming of mp_radix_sizeinbase to mp_radix_size_overestimate, all the rest will be handled in #348 if necessary.

(helper.pl --update-files does not seem to format tommath.def and tommath_class.h correctly. Is that a known problem?)

@minad minad changed the title Addition of mp_radix_sizeinbase (behaviour like the function in GMP) Addition of mp_radix_size_overestimate (behaviour like the function in GMP) Sep 30, 2019
@minad
Member

minad commented Oct 1, 2019

@czurnieden the lookup tables look rather large. Could we use smaller tables and a more crude estimate? I think that might be preferable for embedded systems. We will pay more for the allocations then but the constant costs of the tables would be smaller.

However allocators are usually rounding themselves so a more crude estimate might not be too bad.

@czurnieden
Contributor Author

the lookup tables look rather large

They are 520 bytes(!) large (260 for 8-bit as long as we have MP_8BIT).
Half a kilobyte. We will drop 8-bit support in the near future, so the most memory-restricted MCUs will not get supported anymore at that time.

But I know where you are coming from and as I am there myself, too, from time to time, I understand.
Mmh…
I can offer a stripped-down version with the same accuracy that is much smaller but only supports powers of two (as they are calculated directly) and a hardcoded base 10. This would get rid of the table(s) and all the logic to handle them, and would save about 1k (probably less).

However allocators are usually rounding themselves so a more crude estimate might not be too bad.

You can cut the tables in half (e.g. by using a 16-bit type instead of a 32-bit one), but that would lose way too much accuracy. Yes, I played with it while designing this function but had a hard time guaranteeing the -0 in the -0/+x tolerance with such insufficient accuracy without an "extra" on top. How large does that "extra" need to be? Hard to tell without a proper error analysis.

The alternatives, e.g: a fixed-point log, would add a lot of logic which I doubt needs less memory than the tables.

So the only thing I can come up with is the functionally stripped down version described above.

Which leads me to the question: do we really need all the bases? We have a quick base-10 (I didn't take a closer look, but it looks quite hardcoded; I don't know how much work it would be to extend it) and we can rewrite the old loop to get the power-of-two bases directly.
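
For power-of-two bases no table is needed: each digit of base 2^k carries exactly k bits, so the digit count is the bit count divided by k, rounded up. A minimal sketch of that idea follows; `digits_pow2_base` is a hypothetical helper for illustration, not part of libtommath, and it ignores the sign and the terminating NUL.

```c
#include <assert.h>

/* Hypothetical helper (not libtommath API): number of digits needed to
   print a non-negative value with `bits` significant bits in base 2^k. */
static int digits_pow2_base(int bits, int k)
{
   assert(k >= 1);          /* k = 1..6 covers bases 2, 4, 8, 16, 32, 64 */
   if (bits <= 0) {
      return 1;             /* zero still prints as the single digit "0" */
   }
   /* ceil(bits / k) without floating point and without overflow */
   return (bits / k) + ((bits % k) != 0);
}
```

For example, `digits_pow2_base(10, 3)` gives 4, which matches 1023 being 1777 in octal.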

@minad
Member

minad commented Oct 2, 2019

Ok. 500 bytes are small enough I guess. What's the status?

@minad
Member

minad commented Oct 2, 2019

We could add a configuration option? MP_RADIX_ALL_BASES? Otherwise only enable 10 and power of 2?

@czurnieden
Contributor Author

What's the status?

We are only waiting for your OK, so if you are OK with it, drop me a note and I will squash&finish.

We could add a configuration option? MP_RADIX_ALL_BASES? Otherwise only enable 10 and power of 2?

So no "squash&finish" here?
But seriously: what kind of configuration, just the macro MP_RADIX_ALL_BASES or something a bit more sophisticated (whatever that might be)? I haven't implemented the version with only base 10 and powers of 2 yet; please give me a day (which means "wee hours of the night" in my case ;-) ) to do so.

@czurnieden
Contributor Author

@minad just a quick hack as an example. Do you want something like that?

@minad
Member

minad commented Oct 3, 2019

Hmm I am not sure if I like further complications of this function

@czurnieden
Contributor Author

czurnieden commented Oct 3, 2019

We could add a configuration option? MP_RADIX_ALL_BASES? Otherwise only enable 10 and power of 2?

just a quick hack as an example

Hmm I am not sure if I like further complications of this function

Uhm…ooookaaaay?

So, can I take that as the answer "No." to my question

Do you want something like that?

?

@minad
Member

minad commented Oct 3, 2019

@sjaeckel @czurnieden What do you think? Is this ready, or should we push it to after 1.2 since it is a strict addition? I would love it if we could get it out sooner rather than later since there are already so many changes. After that we can hopefully get rid of the deprecated stuff. 2.0 can then take a while to mature before release, including additions like this one or a faster mp_to_radix. Since 1.2 is more like a backward compatibility release I think that would be fine.

@czurnieden
Contributor Author

Is this ready, or should we push it to after 1.2 since it is a strict addition?

It was ready for 1.2 but isn't anymore with the new branch MP_RADIX_ALL_BASES. We need to either define MP_RADIX_ALL_BASES by default, invert the branch with MP_RADIX_REDUCED_BASES, or adapt mp_to_radix, too (a bit more work but not that much).

But the biggest problem is the reduction of the functionality itself. Not much has changed up to commit 724db0a that couldn't be squeezed into 1.2, but reducing functionality is a bit much. Either invert the logic (MP_RADIX_REDUCED_BASES) or roll back to 724db0a for 1.2.

@minad
Member

minad commented Oct 3, 2019

MP_RADIX_BASES was just me thinking out loud. As I said, I am not sure if we want that due to the complications it brings. But we can still discuss all that.

My point was more to get 1.2 out at some point since I am getting worried with all the changes (API types, deprecations etc etc). 2.0 would give us a bit of a clean slate (think size_t issues, 8bit gone etc).

Member

@sjaeckel sjaeckel left a comment

looks good IMO after those minor nitpicks

if ((e = mp_init(&bi_k_bis)) != MP_OKAY) {
/* No "goto" to avoid cluttering this code with even more preprocessor branches */
mp_clear_multi(&bi_bit_count, &bi_k, NULL);
*size = 0;
Member

I just asked myself if 0 is really a good value to return on error... whatever we choose, we should certainly set it once on function entry and not x times, once in each error case

Contributor Author

I just asked myself if 0 is really a good value to return on error

The returns on error are MP_VALs?
Yes, we could set size to -1 but as we want to change the type of all size-related variables to size_t later we shouldn't introduce things that we know get changed again later.

And also: if the user skips the error check and uses the resulting size for malloc directly, and if that value is -1, and it is a 32-bit machine with sizeof(size_t) == 4, and there is enough RAM and/or space in the swap partition/file…

set it once on function-entry

Yepp, that's right, will do.

Ahm, now that you mentioned it, would it be a good idea to check if we have something to write on in the first place? I mean if (size == NULL) ...?
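
The concern about a caller who skips the error check can be made concrete: converting -1 to size_t yields SIZE_MAX, so passing it straight to malloc asks for the largest possible allocation, whereas a size of 0 is easy to reject. The sketch below is not libtommath code; `checked_alloc_size` is a hypothetical caller-side guard that also shows the two habits discussed here, setting the output once on entry and checking the output pointer for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converting -1 to size_t wraps around to the largest representable value. */
static_assert((size_t)-1 == SIZE_MAX, "-1 converts to SIZE_MAX");

/* Hypothetical guard (an assumption, not libtommath API): turn a reported
   int size into a size_t that is safe to hand to malloc.
   Returns 0 on success and -1 on failure. */
static int checked_alloc_size(int reported, size_t *out)
{
   if (out == NULL) {
      return -1;            /* nothing to write the result to */
   }
   *out = 0;                /* set once on entry, not in every error path */
   if (reported <= 0) {
      return -1;            /* 0 or negative: treat as "no valid size" */
   }
   *out = (size_t)reported;
   return 0;
}
```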

Member

And also: if the user skips the error-check

that was the origin of my thought...

and uses the resulting size for malloc directly, and if that value is -1, and it is a 32-bit machine with sizeof(size_t) == 4, and there is enough RAM and/or space in the swap partition/file…

😆

Ahm, now that you mentioned it, would it be a good idea to check if we have something to write on in the first place? I mean if (size == NULL) ...?

Phew, that's checked almost nowhere... I think yes, we should do it, but maybe we should first discuss it?

Contributor Author

but maybe we should first discuss it?

Hu? Oh, left it in after testing, sorry.


/* And before we forget it: one extra character for the minus sign */
if (a->sign == MP_NEG) {
(*size)++;
Member

Trying to be GMP compatible makes some sense, but I also wouldn't care if we chose to be sometimes off-by-two or even off-by-X if it made the implementation a lot easier.

@minad Doesn't this boil down to only a style question here? Or what do you see as an advantage of writing *size += sign ? 1 : 0; over the way it is now?

x >>= 1;
}
*size = bit_count/y;
rem = bit_count - ((*size) * y);
Member

speaking of "keep in sync" shouldn't base then be called radix or the other way around? :-)

Casting from "long" to "int" can be done because "bi_bit_count" fits into an "int"
by definition.
*/
*size = (int)mp_get_l(&bi_bit_count) + 1 + 1 + (a->sign == MP_NEG);
Member

Please just write + 3 here

@czurnieden
Contributor Author

Tried several variations but decided on the simplest one, the "minimalistic version".
Absolute error is quite high for very large input but OK for normal use.

100% tested.

Excerpt from the logs:

First occurrence of an error equal to 10

base = radix
bits = floor(log_2(a))
largest error at a = 2^(2^31)

base      bits   floor(log_2(bits))      largest error
 2          n/a      n/a                         3
 3     11250067       23                      1339
 4          n/a      n/a                         3
 5      1315969       20                     11427
 6     29921688       24                       505
 7       531174       19                     28327
 8          n/a      n/a                         3
 9     22499649       24                       671
10      2694087       21                      5584
11      1033948       19                     14543
12     57550267       25                       264
13    361691491       28                        44      
14       976518       19                     15402
15      3533931       21                      4258
16          n/a      n/a                         3
17      1933701       20                      7779
18     38933860       25                       389
19     46634755       25                       325
20      4559686       22                      3300
21      1280545       20                     11747
22      1718219       20                      8753
23     60041844       25                       253
24     94134750       26                       162
25      2631938       21                      5715
26    583596936       29                        28
27     33749716       25                       448
28      1556410       20                      9662
29      2332545       21                      6449
30      5573791       22                      2700
31      1813652       20                      8293
32          n/a      n/a                         3
33      2157447       21                      6973
34      2995666       21                      5022
35     17448836       24                       864
36     59843376       25                       254
37      1849059       20                      8136
38     71174130       26                       214
39    106965320       26                       143
40      6914084       22                      2177
41     24933672       24                       605
42      1929630       20                      7795
43      1757578       20                      8557
44      2575132       21                      5841
45      6637893       22                      2267
46     89524677       26                       170
47      9221534       23                      1633
48    139675137       27                       110
49      2572340       21                      5849
50      3887934       21                      3870
51      3630477       21                      4145
52    858309240       29                        20
53     15129752       23                       996
54     49436377       25                       307
55      2132680       21                      7052
56      2271965       21                      6621
57     55751906       25                       272
58      3391812       21                      4436 
59      3244460       21                      4638
60      8076397       22                      1864
61      3360593       21                      4476
62      2619894       21                      5742
63      2336967       21                      6438
64          n/a      n/a                         3          

@minad
Member

minad commented Oct 10, 2019

I will take a look. Just one thing - could you please split your PRs.

  1. mp_to_radix improvements
  2. mp_to_radix_overestimate (old version for reference)
  3. mp_to_radix_overestimate (new version with larger error)

@minad
Member

minad commented Oct 10, 2019

Concerning the error table - the error looks indeed quite large. Do you know the error bounds? It would also be helpful if your table would show base, bits, exact size, overestimated size, relative and absolute errors?

@sjaeckel
Member

sjaeckel commented Oct 10, 2019

Concerning the error table - the error looks indeed quite large

I'd like to see what @czurnieden means by "very large" :-D

I just did a simple test of comparing mp_radix_size_overestimate() to mp_radix_size() with the worst error rate (base 7) and that's the result:

time overestimate: 1741
time regular: 14377
mp_radix_size_overestimate: result for number with 1626 bits (28 digits) in base 7 was 583 instead of 581 (diff 2)
time overestimate: 4103
time regular: 5284636963
mp_radix_size_overestimate: result for number with 10001626 bits (166694 digits) in base 7 was 3562787 instead of 3562652 (diff 135)
time overestimate: 9935
time regular: 16148608687
mp_radix_size_overestimate: result for number with 20001626 bits (333361 digits) in base 7 was 7124991 instead of 7124724 (diff 267)
time overestimate: 20535
time regular: 40617478921
mp_radix_size_overestimate: result for number with 30001626 bits (500028 digits) in base 7 was 10687194 instead of 10686796 (diff 398)
time overestimate: 15133
time regular: 59256129412
mp_radix_size_overestimate: result for number with 40001626 bits (666694 digits) in base 7 was 14249398 instead of 14248868 (diff 530)
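
Dividing each reported difference by the exact size suggests that the relative error in the larger runs is roughly constant, at about 37 parts per million, so the absolute difference grows linearly with the number of digits. A minimal sketch of that arithmetic follows; `rel_error_ppm` is a hypothetical helper and the inputs are the figures printed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: overestimate relative to the exact size,
   in parts per million, rounded down. */
static int64_t rel_error_ppm(int64_t estimate, int64_t exact)
{
   assert(exact > 0);
   return ((estimate - exact) * 1000000) / exact;
}
```

With the numbers above, `rel_error_ppm(3562787, 3562652)` and `rel_error_ppm(14249398, 14248868)` both come out at 37. That is about what truncating log_2(7) * 2^13 ≈ 22997.85 to an integer table entry of 22997 would produce, though nothing in this thread confirms that as the cause.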

@czurnieden
Contributor Author

Just one thing - could you please split your PRs.
mp_to_radix improvements

Now that you mentioned it: I think those improvements (the power-of-two shortcuts) would be better placed into mp_ilogb?

mp_to_radix_overestimate (old version for reference)

But you didn't want it?

mp_to_radix_overestimate (new version with larger error)

Please take a look first, I'm already juggling way too many PRs :-)

I'd like to see what @czurnieden means by "very large" :-D

Size is in the eye of the beholder ;-)

I tarred&feathered the logs and put it into my Google Drive

Aaaand forgot to include the actual testing rig, yeah, typical.
But it's small, so here it is:

#include "tommath_private.h"

/*
   Table of {0, log_2([1..64])} times 2^p where p is the scale
   factor defined in LTM_RADIX_SIZE_SCALE.
 */
/* *INDENT-OFF* */
#define LTM_RADIX_SIZE_SCALE 13
static const uint16_t logbases[65] = {
       0u,     0u,  8192u, 12984u, 16384u,
   19021u, 21176u, 22997u, 24576u, 25968u,
   27213u, 28339u, 29368u, 30314u, 31189u,
   32005u, 32768u, 33484u, 34160u, 34799u,
   35405u, 35981u, 36531u, 37057u, 37560u,
   38042u, 38506u, 38952u, 39381u, 39796u,
   40197u, 40584u, 40960u, 41323u, 41676u,
   42019u, 42352u, 42675u, 42991u, 43298u,
   43597u, 43889u, 44173u, 44451u, 44723u,
   44989u, 45249u, 45503u, 45752u, 45995u,
   46234u, 46468u, 46698u, 46923u, 47144u,
   47360u, 47573u, 47783u, 47988u, 48190u,
   48389u, 48584u, 48776u, 48965u, 49152u
};
/* *INDENT-ON* */

static mp_err mp_radix_sizeinbase2(const mp_int *a, const int bits, const int base, int *size)
{
   mp_int bi_bit_count, bi_k;
   int bit_count;
   mp_err err = MP_OKAY;

   *size = 0;

   if ((base < 2) || (base > 64)) {
      return MP_VAL;
   }
   if (a == NULL) {
      bit_count = bits;
   } else {
      bit_count = mp_count_bits(a) + 1;
   }

   if (bit_count == 0) {
      *size = 2;
      return MP_OKAY;
   }

   if ((err = mp_init_multi(&bi_bit_count, &bi_k, NULL)) != MP_OKAY) {
      return err;
   }

   mp_set_l(&bi_bit_count, bit_count);
   mp_set_u32(&bi_k, logbases[base]);

   if ((err = mp_mul_2d(&bi_bit_count, LTM_RADIX_SIZE_SCALE, &bi_bit_count)) != MP_OKAY) {
      goto LTM_E1;
   }
   if ((err = mp_div(&bi_bit_count, &bi_k, &bi_bit_count, NULL)) != MP_OKAY) {
      goto LTM_E1;
   }

   *size = (int)mp_get_l(&bi_bit_count) + 3;
#if ( (defined MP_8BIT) && (INT_MAX > 0xFFFF))
   *size += 2;
#endif

LTM_E1:
   mp_clear_multi(&bi_bit_count, &bi_k, NULL);
   return err;
}


#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <math.h>

#ifndef M_LN2
#define M_LN2 0.69314718055994530942
#endif
int main(int argc, char **argv)
{

   int integer_size, double_size, base;
   double log2_base;
   int i;
   int is;
   int diff_max = 0;
   int max_int = INT_MAX;

   if (argc < 2) {
      printf("usage: %s base\n",argv[0]);
      exit(EXIT_FAILURE);
   }

   base = atoi(argv[1]);
   log2_base = log(base)/M_LN2;
#ifdef PRINT_DIFFS
   printf("# bits    base  max diff   ds    is\n");
#endif
   for (i = 1; i < max_int; i++) {
      double_size = (int)lrint(ceil((double)i / log2_base));
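      /* Stop on errors (e.g. a base outside 2..64) instead of comparing against a stale size */
      if (mp_radix_sizeinbase2(NULL,i,base,&integer_size) != MP_OKAY) {
         fprintf(stderr,"mp_radix_sizeinbase2 failed at i = %d\n",i);
         exit(EXIT_FAILURE);
      }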
      if (integer_size < double_size) {
         fprintf(stderr,"FAILURE at i = %d, should be at least %d but is %d\n",i,double_size, integer_size);
         exit(EXIT_FAILURE);
      }
      is = integer_size - double_size;
#ifdef PRINT_DIFFS
      if(is > diff_max) {
         diff_max = is;
         printf("%d %d %d %d %d\n",i,base,diff_max,double_size, integer_size);
      }
#endif
#ifdef PRINT_PROGRESS
      if (i % 100000000 == 0) {
         fprintf(stderr,"%d: ds  = %d is =  %d\n",i,double_size, integer_size);
      }
#endif
   }
   exit(EXIT_SUCCESS);
}

As you can see: I used floating point math to get a value to compare against. That additional error does not influence the result for the max. input of 2^31.
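
The integer side of the rig boils down to floor(bits * 2^13 / table[base]) + 3. Assuming a 64-bit integer type is available (the rig itself goes through mp_int), the same arithmetic can be sketched with plain integers. `estimate_digits` is a hypothetical name; 27213 and 22997 are the table entries for base 10 and base 7 from logbases above.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE 13   /* same scale factor as LTM_RADIX_SIZE_SCALE in the rig */

/* Hypothetical helper: overestimated size for a number with `bits` bits,
   given the scaled table entry floor(log2(base) * 2^SCALE). */
static int64_t estimate_digits(int64_t bits, uint16_t scaled_log2_base)
{
   assert(bits >= 0 && scaled_log2_base != 0);
   /* The table entries appear to be rounded down, so the divisor is slightly
      too small and the quotient errs on the large side. The +3 is the same
      fixed addend that the rig adds to the quotient. */
   return ((bits * ((int64_t)1 << SCALE)) / scaled_log2_base) + 3;
}
```

For example, `estimate_digits(64, 27213)` yields 22, while a 64-bit value needs at most 20 decimal digits.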

@czurnieden
Contributor Author

Just one thing - could you please split your PRs.

I put the shortcuts for bases that are powers of two into mp_ilogb; that is in #367, so it might go together with #366 if it (#366) is still wanted.

@minad
Member

minad commented Oct 14, 2019

@czurnieden @sjaeckel I guess it is okay to close this PR in favor of the other ones? From my side I would prefer if we could finish 1.2 now instead of delaying it further. The discussion regarding radix_size_overestimate is not yet concluded, so I would prefer if we move that to 2.0, too. Then we can also decide if we add a specialized to_radix and a specialized radix_size_overestimate only for base 10. Would that be ok?

@sjaeckel
Member

I guess it is okay to close this PR in favor of the other ones? From my side I would prefer if we could finish 1.2 now instead of delaying it further. The discussion regarding radix_size_overestimate is not yet concluded, so I would prefer if we move that to 2.0, too. Then we can also decide if we add a specialized to_radix and a specialized radix_size_overestimate only for base 10. Would that be ok?

OK

@sjaeckel sjaeckel closed this Oct 14, 2019
@sjaeckel sjaeckel modified the milestones: v1.2.0, v2.0.0 Oct 14, 2019