added libtommath-0.34

commit 3d0fcaab0a2d411b541dc9df8abd0a6be2df268f (1 parent: 4b7111d)
Authored by Tom St Denis; committed by sjaeckel
Showing with 5,181 additions and 3,571 deletions.
  1. BIN  bn.pdf
  2. +17 −12 bn.tex
  3. +1 −2  bn_fast_mp_invmod.c
  4. +1 −2  bn_fast_mp_montgomery_reduce.c
  5. +2 −3 bn_fast_s_mp_mul_digs.c
  6. +2 −3 bn_fast_s_mp_mul_high_digs.c
  7. +1 −1  bn_fast_s_mp_sqr.c
  8. +11 −3 bn_mp_exptmod.c
  9. +1 −2  bn_mp_exptmod_fast.c
  10. +2 −1  bn_mp_mul_d.c
  11. +2 −2 bn_mp_prime_random_ex.c
  12. +1 −1  bn_mp_read_radix.c
  13. +1 −2  bn_mp_reduce.c
  14. +1 −2  bn_mp_reduce_2k.c
  15. +58 −0 bn_mp_reduce_2k_l.c
  16. +1 −2  bn_mp_reduce_2k_setup.c
  17. +40 −0 bn_mp_reduce_2k_setup_l.c
  18. +4 −4 bn_mp_reduce_is_2k.c
  19. +40 −0 bn_mp_reduce_is_2k_l.c
  20. +1 −2  bn_mp_to_signed_bin.c
  21. +27 −0 bn_mp_to_signed_bin_n.c
  22. +1 −2  bn_mp_to_unsigned_bin.c
  23. +27 −0 bn_mp_to_unsigned_bin_n.c
  24. +1 −2  bn_mp_unsigned_bin_size.c
  25. +24 −11 bn_s_mp_exptmod.c
  26. +3 −2 bncore.c
  27. +2,712 −1,863 callgraph.txt
  28. +12 −0 changes.txt
  29. +561 −375 demo/demo.c
  30. +213 −185 demo/timing.c
  31. +2 −0  dep.pl
  32. +33 −2 etc/tune.c
  33. +7 −7 logs/add.log
  34. +7 −7 logs/expt.log
  35. +5 −6 logs/expt_2k.log
  36. +4 −0 logs/expt_2kl.log
  37. +7 −7 logs/expt_dr.log
  38. +84 −143 logs/mult.log
  39. +84 −33 logs/mult_kara.log
  40. +84 −143 logs/sqr.log
  41. +84 −33 logs/sqr_kara.log
  42. +15 −15 logs/sub.log
  43. +4 −2 makefile
  44. +3 −1 makefile.bcc
  45. +3 −1 makefile.cygwin_dll
  46. +3 −1 makefile.icc
  47. +3 −1 makefile.msvc
  48. +5 −2 makefile.shared
  49. BIN  poster.pdf
  50. +273 −49 pre_gen/mpi.c
  51. +13 −2 tommath.h
  52. BIN  tommath.pdf
  53. +1 −1  tommath.src
  54. +647 −629 tommath.tex
  55. +42 −2 tommath_class.h
BIN  bn.pdf
Binary file not shown
29 bn.tex
@@ -49,7 +49,7 @@
\begin{document}
\frontmatter
\pagestyle{empty}
-\title{LibTomMath User Manual \\ v0.33}
+\title{LibTomMath User Manual \\ v0.34}
\author{Tom St Denis \\ tomstdenis@iahu.ca}
\maketitle
This text, the library and the accompanying textbook are all hereby placed in the public domain. This book has been
@@ -263,12 +263,12 @@ \section{Purpose of LibTomMath}
\begin{center}
\begin{tabular}{|l|c|c|l|}
\hline \textbf{Criteria} & \textbf{Pro} & \textbf{Con} & \textbf{Notes} \\
-\hline Few lines of code per file & X & & GnuPG $ = 300.9$, LibTomMath $ = 76.04$ \\
+\hline Few lines of code per file & X & & GnuPG $ = 300.9$, LibTomMath $ = 71.97$ \\
\hline Commented function prototypes & X && GnuPG function names are cryptic. \\
\hline Speed && X & LibTomMath is slower. \\
\hline Totally free & X & & GPL has unfavourable restrictions.\\
\hline Large function base & X & & GnuPG is barebones. \\
-\hline Four modular reduction algorithms & X & & Faster modular exponentiation. \\
+\hline Five modular reduction algorithms & X & & Faster modular exponentiation for a variety of moduli. \\
\hline Portable & X & & GnuPG requires configuration to build. \\
\hline
\end{tabular}
@@ -284,9 +284,12 @@ \section{Purpose of LibTomMath}
So it may feel tempting to just rip the math code out of GnuPG (or GnuMP where it was taken from originally) in your
own application but I think there are reasons not to. While LibTomMath is slower than libraries such as GnuMP it is
not normally significantly slower. On x86 machines the difference is normally a factor of two when performing modular
-exponentiations.
+exponentiations. It depends largely on the processor, compiler and the moduli being used.
-Essentially the only time you wouldn't use LibTomMath is when blazing speed is the primary concern.
+Essentially the only time you wouldn't use LibTomMath is when blazing speed is the primary concern. However,
+on the other side of the coin LibTomMath offers you a totally free (public domain) well structured math library
+that is very flexible, complete and performs well in resource contrained environments. Fast RSA for example can
+be performed with as little as 8KB of ram for data (again depending on build options).
\chapter{Getting Started with LibTomMath}
\section{Building Programs}
@@ -809,7 +812,7 @@ \subsection{Unsigned comparison}
\index{mp\_cmp\_mag}
\begin{alltt}
-int mp_cmp(mp_int * a, mp_int * b);
+int mp_cmp_mag(mp_int * a, mp_int * b);
\end{alltt}
This will compare $a$ to $b$ placing $a$ to the left of $b$. This function cannot fail and will return one of the
three compare codes listed in figure \ref{fig:CMP}.
@@ -1220,12 +1223,13 @@ \section{Squaring}
\end{alltt}
Will square $a$ and store it in $b$. Like the case of multiplication there are four different squaring
-algorithms all which can be called from mp\_sqr(). It is ideal to use mp\_sqr over mp\_mul when squaring terms.
+algorithms all which can be called from mp\_sqr(). It is ideal to use mp\_sqr over mp\_mul when squaring terms because
+of the speed difference.
\section{Tuning Polynomial Basis Routines}
Both of the Toom-Cook and Karatsuba multiplication algorithms are faster than the traditional $O(n^2)$ approach that
-the Comba and baseline algorithms use. At $O(n^{1.464973})$ and $O(n^{1.584962})$ running times respectfully they require
+the Comba and baseline algorithms use. At $O(n^{1.464973})$ and $O(n^{1.584962})$ running times respectively they require
considerably less work. For example, a 10000-digit multiplication would take roughly 724,000 single precision
multiplications with Toom-Cook or 100,000,000 single precision multiplications with the standard Comba (a factor
of 138).
@@ -1297,14 +1301,14 @@ \section{Straight Division}
\section{Barrett Reduction}
Barrett reduction is a generic optimized reduction algorithm that requires pre--computation to achieve
-a decent speedup over straight division. First a $mu$ value must be precomputed with the following function.
+a decent speedup over straight division. First a $\mu$ value must be precomputed with the following function.
\index{mp\_reduce\_setup}
\begin{alltt}
int mp_reduce_setup(mp_int *a, mp_int *b);
\end{alltt}
-Given a modulus in $b$ this produces the required $mu$ value in $a$. For any given modulus this only has to
+Given a modulus in $b$ this produces the required $\mu$ value in $a$. For any given modulus this only has to
be computed once. Modular reduction can now be performed with the following.
\index{mp\_reduce}
@@ -1312,7 +1316,7 @@ \section{Barrett Reduction}
int mp_reduce(mp_int *a, mp_int *b, mp_int *c);
\end{alltt}
-This will reduce $a$ in place modulo $b$ with the precomputed $mu$ value in $c$. $a$ must be in the range
+This will reduce $a$ in place modulo $b$ with the precomputed $\mu$ value in $c$. $a$ must be in the range
$0 \le a < b^2$.
\begin{alltt}
@@ -1578,7 +1582,8 @@ \section{Root Finding}
This algorithm uses the ``Newton Approximation'' method and will converge on the correct root fairly quickly. Since
the algorithm requires raising $a$ to the power of $b$ it is not ideal to attempt to find roots for large
values of $b$. If particularly large roots are required then a factor method could be used instead. For example,
-$a^{1/16}$ is equivalent to $\left (a^{1/4} \right)^{1/4}$.
+$a^{1/16}$ is equivalent to $\left (a^{1/4} \right)^{1/4}$ or simply
+$\left ( \left ( \left ( a^{1/2} \right )^{1/2} \right )^{1/2} \right )^{1/2}$
\chapter{Prime Numbers}
\section{Trial Division}
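Note on the bn.tex root-finding hunk above: the nested-root remark can be checked on machine integers. The following is a minimal sketch, assuming 64-bit unsigned values in place of mp_int. `isqrt_u64` and `root16_u64` are hypothetical helpers, not LibTomMath functions, and the iteration is the textbook integer Newton square root rather than the library's mp_n_root code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor(sqrt(a)) by Newton iteration from above. */
static uint64_t isqrt_u64(uint64_t a)
{
    uint64_t x, y;

    if (a == 0) {
        return 0;
    }
    x = a / 2 + 1;              /* starting guess, always >= sqrt(a) */
    y = (x + a / x) / 2;
    while (y < x) {             /* stop once the estimate no longer shrinks */
        x = y;
        assert(x != 0);         /* holds for a >= 1; guards the division */
        y = (x + a / x) / 2;
    }
    return x;
}

/* floor(a^(1/16)) computed as four nested square roots. */
static uint64_t root16_u64(uint64_t a)
{
    return isqrt_u64(isqrt_u64(isqrt_u64(isqrt_u64(a))));
}
```

Nesting floor roots gives the same integer as taking the floor of the full 16th root, which is why the factored form is safe for integer inputs. For example, `root16_u64((uint64_t)1 << 48)` yields 8.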
3  bn_fast_mp_invmod.c
@@ -21,8 +21,7 @@
* Based on slow invmod except this is optimized for the case where b is
* odd as per HAC Note 14.64 on pp. 610
*/
-int
-fast_mp_invmod (mp_int * a, mp_int * b, mp_int * c)
+int fast_mp_invmod (mp_int * a, mp_int * b, mp_int * c)
{
mp_int x, y, u, v, B, D;
int res, neg;
3  bn_fast_mp_montgomery_reduce.c
@@ -23,8 +23,7 @@
*
* Based on Algorithm 14.32 on pp.601 of HAC.
*/
-int
-fast_mp_montgomery_reduce (mp_int * x, mp_int * n, mp_digit rho)
+int fast_mp_montgomery_reduce (mp_int * x, mp_int * n, mp_digit rho)
{
int ix, res, olduse;
mp_word W[MP_WARRAY];
5 bn_fast_s_mp_mul_digs.c
@@ -31,8 +31,7 @@
* Based on Algorithm 14.12 on pp.595 of HAC.
*
*/
-int
-fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
+int fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
{
int olduse, res, pa, ix, iz;
mp_digit W[MP_WARRAY];
@@ -81,7 +80,7 @@ fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
}
/* store final carry */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* setup dest */
olduse = c->used;
5 bn_fast_s_mp_mul_high_digs.c
@@ -24,8 +24,7 @@
*
* Based on Algorithm 14.12 on pp.595 of HAC.
*/
-int
-fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
+int fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
{
int olduse, res, pa, ix, iz;
mp_digit W[MP_WARRAY];
@@ -72,7 +71,7 @@ fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
}
/* store final carry */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* setup dest */
olduse = c->used;
2  bn_fast_s_mp_sqr.c
@@ -101,7 +101,7 @@ int fast_s_mp_sqr (mp_int * a, mp_int * b)
}
/* store it */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* make next carry */
W1 = _W >> ((mp_word)DIGIT_BIT);
14 bn_mp_exptmod.c
@@ -65,21 +65,29 @@ int mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
#endif
}
+/* modified diminished radix reduction */
+#if defined(BN_MP_REDUCE_IS_2K_L_C) && defined(BN_MP_REDUCE_2K_L_C)
+ if (mp_reduce_is_2k_l(P) == MP_YES) {
+ return s_mp_exptmod(G, X, P, Y, 1);
+ }
+#endif
+
#ifdef BN_MP_DR_IS_MODULUS_C
/* is it a DR modulus? */
dr = mp_dr_is_modulus(P);
#else
+ /* default to no */
dr = 0;
#endif
#ifdef BN_MP_REDUCE_IS_2K_C
- /* if not, is it a uDR modulus? */
+ /* if not, is it a unrestricted DR modulus? */
if (dr == 0) {
dr = mp_reduce_is_2k(P) << 1;
}
#endif
- /* if the modulus is odd or dr != 0 use the fast method */
+ /* if the modulus is odd or dr != 0 use the montgomery method */
#ifdef BN_MP_EXPTMOD_FAST_C
if (mp_isodd (P) == 1 || dr != 0) {
return mp_exptmod_fast (G, X, P, Y, dr);
@@ -87,7 +95,7 @@ int mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
#endif
#ifdef BN_S_MP_EXPTMOD_C
/* otherwise use the generic Barrett reduction technique */
- return s_mp_exptmod (G, X, P, Y);
+ return s_mp_exptmod (G, X, P, Y, 0);
#else
/* no exptmod for evens */
return MP_VAL;
3  bn_mp_exptmod_fast.c
@@ -29,8 +29,7 @@
#define TAB_SIZE 256
#endif
-int
-mp_exptmod_fast (mp_int * G, mp_int * X, mp_int * P, mp_int * Y, int redmode)
+int mp_exptmod_fast (mp_int * G, mp_int * X, mp_int * P, mp_int * Y, int redmode)
{
mp_int M[TAB_SIZE], res;
mp_digit buf, mp;
3  bn_mp_mul_d.c
@@ -57,8 +57,9 @@ mp_mul_d (mp_int * a, mp_digit b, mp_int * c)
u = (mp_digit) (r >> ((mp_word) DIGIT_BIT));
}
- /* store final carry [if any] */
+ /* store final carry [if any] and increment ix offset */
*tmpc++ = u;
+ ++ix;
/* now zero digits above the top */
while (ix++ < olduse) {
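Note on the mp_mul_d hunk above: as far as the hunk shows, the output pointer was advanced past the final carry while the index `ix` was not, so the zeroing loop could run one digit too far. The following is a minimal sketch of the corrected bookkeeping, assuming raw arrays of 28-bit digits instead of mp_int. `mul_d_sketch`, `SK_DIGIT_BIT` and `SK_MASK` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SK_DIGIT_BIT 28
#define SK_MASK ((((uint32_t)1) << SK_DIGIT_BIT) - 1)

/* Hypothetical stand-in for mp_mul_d: c = a * b on raw digit arrays.
 * c currently holds olduse live digits out of alloc allocated ones;
 * the return value is the new digit count. */
static int mul_d_sketch(const uint32_t *a, int a_used, uint32_t b,
                        uint32_t *c, int olduse, int alloc)
{
    uint32_t u = 0;
    int ix, used;

    assert(a_used >= 0 && alloc >= a_used + 1 && olduse <= alloc);
    assert(b <= SK_MASK);

    for (ix = 0; ix < a_used; ix++) {
        /* digit product plus the carry from the previous digit */
        uint64_t r = (uint64_t)u + (uint64_t)a[ix] * (uint64_t)b;
        c[ix] = (uint32_t)(r & SK_MASK);
        u = (uint32_t)(r >> SK_DIGIT_BIT);
    }

    /* store final carry [if any] and step the index past it */
    c[ix] = u;
    ++ix;

    /* zero stale digits above the new top; ix already counts the carry */
    while (ix < olduse) {
        c[ix++] = 0;
    }

    /* trim leading zero digits */
    used = a_used + 1;
    while (used > 0 && c[used - 1] == 0) {
        --used;
    }
    return used;
}
```

With `a_used = 3` and `olduse = alloc = 4`, the carry lands in the last allocated digit and the zeroing loop does not execute, so nothing past the end of `c` is written.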
4 bn_mp_prime_random_ex.c
@@ -60,7 +60,7 @@ int mp_prime_random_ex(mp_int *a, int t, int size, int flags, ltm_prime_callback
/* calc the maskOR_msb */
maskOR_msb = 0;
- maskOR_msb_offset = (size - 2) >> 3;
+ maskOR_msb_offset = ((size & 7) == 1) ? 1 : 0;
if (flags & LTM_PRIME_2MSB_ON) {
maskOR_msb |= 1 << ((size - 2) & 7);
} else if (flags & LTM_PRIME_2MSB_OFF) {
@@ -68,7 +68,7 @@ int mp_prime_random_ex(mp_int *a, int t, int size, int flags, ltm_prime_callback
}
/* get the maskOR_lsb */
- maskOR_lsb = 0;
+ maskOR_lsb = 1;
if (flags & LTM_PRIME_BBS) {
maskOR_lsb |= 3;
}
2  bn_mp_read_radix.c
@@ -16,7 +16,7 @@
*/
/* read a string [ASCII] in a given radix */
-int mp_read_radix (mp_int * a, char *str, int radix)
+int mp_read_radix (mp_int * a, const char *str, int radix)
{
int y, res, neg;
char ch;
3  bn_mp_reduce.c
@@ -19,8 +19,7 @@
* precomputed via mp_reduce_setup.
* From HAC pp.604 Algorithm 14.42
*/
-int
-mp_reduce (mp_int * x, mp_int * m, mp_int * mu)
+int mp_reduce (mp_int * x, mp_int * m, mp_int * mu)
{
mp_int q;
int res, um = m->used;
3  bn_mp_reduce_2k.c
@@ -16,8 +16,7 @@
*/
/* reduces a modulo n where n is of the form 2**p - d */
-int
-mp_reduce_2k(mp_int *a, mp_int *n, mp_digit d)
+int mp_reduce_2k(mp_int *a, mp_int *n, mp_digit d)
{
mp_int q;
int p, res;
58 bn_mp_reduce_2k_l.c
@@ -0,0 +1,58 @@
+#include <tommath.h>
+#ifdef BN_MP_REDUCE_2K_L_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* reduces a modulo n where n is of the form 2**p - d
+ This differs from reduce_2k since "d" can be larger
+ than a single digit.
+*/
+int mp_reduce_2k_l(mp_int *a, mp_int *n, mp_int *d)
+{
+ mp_int q;
+ int p, res;
+
+ if ((res = mp_init(&q)) != MP_OKAY) {
+ return res;
+ }
+
+ p = mp_count_bits(n);
+top:
+ /* q = a/2**p, a = a mod 2**p */
+ if ((res = mp_div_2d(a, p, &q, a)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ /* q = q * d */
+ if ((res = mp_mul(&q, d, &q)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ /* a = a + q */
+ if ((res = s_mp_add(a, &q, a)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ if (mp_cmp_mag(a, n) != MP_LT) {
+ s_mp_sub(a, n, a);
+ goto top;
+ }
+
+ERR:
+ mp_clear(&q);
+ return res;
+}
+
+#endif
3  bn_mp_reduce_2k_setup.c
@@ -16,8 +16,7 @@
*/
/* determines the setup value */
-int
-mp_reduce_2k_setup(mp_int *a, mp_digit *d)
+int mp_reduce_2k_setup(mp_int *a, mp_digit *d)
{
int res, p;
mp_int tmp;
40 bn_mp_reduce_2k_setup_l.c
@@ -0,0 +1,40 @@
+#include <tommath.h>
+#ifdef BN_MP_REDUCE_2K_SETUP_L_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* determines the setup value */
+int mp_reduce_2k_setup_l(mp_int *a, mp_int *d)
+{
+ int res;
+ mp_int tmp;
+
+ if ((res = mp_init(&tmp)) != MP_OKAY) {
+ return res;
+ }
+
+ if ((res = mp_2expt(&tmp, mp_count_bits(a))) != MP_OKAY) {
+ goto ERR;
+ }
+
+ if ((res = s_mp_sub(&tmp, a, d)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ERR:
+ mp_clear(&tmp);
+ return res;
+}
+#endif
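Note on the new mp_reduce_2k_l and mp_reduce_2k_setup_l files above: both rely on the congruence 2^p = d (mod n) for n = 2^p - d, so the high part of a can be folded back in by multiplying it by d. The following is a minimal sketch of the same setup and fold-and-subtract loop, assuming 64-bit words and a modulus below 2^31 instead of mp_int. All three function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of significant bits in n (0 for n == 0). */
static unsigned bit_length_u64(uint64_t n)
{
    unsigned p = 0;

    while (n != 0) {
        ++p;
        n >>= 1;
    }
    return p;
}

/* Counterpart of the setup step: d = 2^p - n, with p the bit length of n. */
static uint64_t reduce_2k_setup_u64(uint64_t n)
{
    assert(n > 1 && n < ((uint64_t)1 << 31));
    return ((uint64_t)1 << bit_length_u64(n)) - n;
}

/* Counterpart of the reduction: returns a mod n, given d from the setup. */
static uint64_t reduce_2k_u64(uint64_t a, uint64_t n, uint64_t d)
{
    const unsigned p = bit_length_u64(n);
    const uint64_t mask = ((uint64_t)1 << p) - 1;

    assert(n > 1 && n < ((uint64_t)1 << 31));
    for (;;) {
        uint64_t q = a >> p;        /* q = a / 2^p */
        a = (a & mask) + q * d;     /* a = (a mod 2^p) + q * d */
        if (a < n) {
            return a;
        }
        a -= n;                     /* subtract once, then fold again */
    }
}
```

For n = 65521 = 2^16 - 15 the setup returns d = 15, and reducing 1000000 takes a single fold: q = 15, the low part is 16960, and 16960 + 15 * 15 = 17185. The loop is correct for any modulus in range, but it only converges quickly when d is small relative to n, which matches the changelog's description of moduli of the form 2^k - P.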
8 bn_mp_reduce_is_2k.c
@@ -22,9 +22,9 @@ int mp_reduce_is_2k(mp_int *a)
mp_digit iz;
if (a->used == 0) {
- return 0;
+ return MP_NO;
} else if (a->used == 1) {
- return 1;
+ return MP_YES;
} else if (a->used > 1) {
iy = mp_count_bits(a);
iz = 1;
@@ -33,7 +33,7 @@ int mp_reduce_is_2k(mp_int *a)
/* Test every bit from the second digit up, must be 1 */
for (ix = DIGIT_BIT; ix < iy; ix++) {
if ((a->dp[iw] & iz) == 0) {
- return 0;
+ return MP_NO;
}
iz <<= 1;
if (iz > (mp_digit)MP_MASK) {
@@ -42,7 +42,7 @@ int mp_reduce_is_2k(mp_int *a)
}
}
}
- return 1;
+ return MP_YES;
}
#endif
40 bn_mp_reduce_is_2k_l.c
@@ -0,0 +1,40 @@
+#include <tommath.h>
+#ifdef BN_MP_REDUCE_IS_2K_L_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* determines if reduce_2k_l can be used */
+int mp_reduce_is_2k_l(mp_int *a)
+{
+ int ix, iy;
+
+ if (a->used == 0) {
+ return MP_NO;
+ } else if (a->used == 1) {
+ return MP_YES;
+ } else if (a->used > 1) {
+ /* if more than half of the digits are -1 we're sold */
+ for (iy = ix = 0; ix < a->used; ix++) {
+ if (a->dp[ix] == MP_MASK) {
+ ++iy;
+ }
+ }
+ return (iy >= (a->used/2)) ? MP_YES : MP_NO;
+
+ }
+ return MP_NO;
+}
+
+#endif
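Note on the new mp_reduce_is_2k_l file above: the test is a heuristic that counts digits equal to the all-ones digit value. The following is a minimal sketch of that count, assuming raw arrays of 28-bit digits. `is_2k_l_sketch`, `SK_DIGIT_BIT` and `SK_MASK` are hypothetical names, and the digit width is an assumption made only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_DIGIT_BIT 28
#define SK_MASK ((((uint32_t)1) << SK_DIGIT_BIT) - 1)

/* Hypothetical stand-in for mp_reduce_is_2k_l: returns 1 if at least
 * used/2 (rounded down) of the digits are all ones, else 0. */
static int is_2k_l_sketch(const uint32_t *dp, int used)
{
    int ix, full = 0;

    assert(used >= 0 && (dp != NULL || used == 0));
    if (used == 0) {
        return 0;
    }
    if (used == 1) {
        return 1;               /* single-digit moduli always qualify */
    }
    for (ix = 0; ix < used; ix++) {
        if (dp[ix] == SK_MASK) {
            ++full;
        }
    }
    return (full >= used / 2) ? 1 : 0;
}
```

The comment in the diff says "more than half", while the comparison it uses is `>=` against `used/2`, so a modulus with exactly half of its digits all ones is accepted. The sketch follows the comparison, not the comment.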
3  bn_mp_to_signed_bin.c
@@ -16,8 +16,7 @@
*/
/* store in signed [big endian] format */
-int
-mp_to_signed_bin (mp_int * a, unsigned char *b)
+int mp_to_signed_bin (mp_int * a, unsigned char *b)
{
int res;
27 bn_mp_to_signed_bin_n.c
@@ -0,0 +1,27 @@
+#include <tommath.h>
+#ifdef BN_MP_TO_SIGNED_BIN_N_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* store in signed [big endian] format */
+int mp_to_signed_bin_n (mp_int * a, unsigned char *b, unsigned long *outlen)
+{
+ if (*outlen < (unsigned long)mp_signed_bin_size(a)) {
+ return MP_VAL;
+ }
+ *outlen = mp_signed_bin_size(a);
+ return mp_to_signed_bin(a, b);
+}
+#endif
3  bn_mp_to_unsigned_bin.c
@@ -16,8 +16,7 @@
*/
/* store in unsigned [big endian] format */
-int
-mp_to_unsigned_bin (mp_int * a, unsigned char *b)
+int mp_to_unsigned_bin (mp_int * a, unsigned char *b)
{
int x, res;
mp_int t;
27 bn_mp_to_unsigned_bin_n.c
@@ -0,0 +1,27 @@
+#include <tommath.h>
+#ifdef BN_MP_TO_UNSIGNED_BIN_N_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* store in unsigned [big endian] format */
+int mp_to_unsigned_bin_n (mp_int * a, unsigned char *b, unsigned long *outlen)
+{
+ if (*outlen < (unsigned long)mp_unsigned_bin_size(a)) {
+ return MP_VAL;
+ }
+ *outlen = mp_unsigned_bin_size(a);
+ return mp_to_unsigned_bin(a, b);
+}
+#endif
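Note on the new mp_to_signed_bin_n and mp_to_unsigned_bin_n files above: both use an in/out length parameter, where the caller passes the buffer capacity and gets back the number of bytes written. The following is a minimal sketch of that contract, assuming a 64-bit value instead of mp_int. The function names and the `SK_OKAY` / `SK_VAL` codes are hypothetical stand-ins, not the library's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SK_OKAY = 0, SK_VAL = -1 };  /* hypothetical status codes */

/* Hypothetical helper: bytes needed to store v without leading zeros. */
static unsigned long u64_unsigned_bin_size(uint64_t v)
{
    unsigned long n = 0;

    while (v != 0) {
        ++n;
        v >>= 8;
    }
    return n;
}

/* Store v in unsigned big endian format. On entry *outlen is the capacity
 * of b; on success it is set to the number of bytes written. */
static int u64_to_unsigned_bin_n(uint64_t v, unsigned char *b,
                                 unsigned long *outlen)
{
    unsigned long need = u64_unsigned_bin_size(v);
    unsigned long i;

    assert(b != NULL && outlen != NULL);
    if (*outlen < need) {
        return SK_VAL;          /* buffer too small, nothing is written */
    }
    *outlen = need;
    for (i = 0; i < need; i++) {
        b[need - 1 - i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    return SK_OKAY;
}
```

A caller would set the length to `sizeof buf` before each call and read it back afterwards. As in the diff, the length is left untouched on failure, so a caller that wants the required size has to ask the size function directly.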
3  bn_mp_unsigned_bin_size.c
@@ -16,8 +16,7 @@
*/
/* get the size for an unsigned equivalent */
-int
-mp_unsigned_bin_size (mp_int * a)
+int mp_unsigned_bin_size (mp_int * a)
{
int size = mp_count_bits (a);
return (size / 8 + ((size & 7) != 0 ? 1 : 0));
35 bn_s_mp_exptmod.c
@@ -21,11 +21,12 @@
#define TAB_SIZE 256
#endif
-int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
+int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y, int redmode)
{
mp_int M[TAB_SIZE], res, mu;
mp_digit buf;
int err, bitbuf, bitcpy, bitcnt, mode, digidx, x, y, winsize;
+ int (*redux)(mp_int*,mp_int*,mp_int*);
/* find window size */
x = mp_count_bits (X);
@@ -72,9 +73,18 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_init (&mu)) != MP_OKAY) {
goto LBL_M;
}
- if ((err = mp_reduce_setup (&mu, P)) != MP_OKAY) {
- goto LBL_MU;
- }
+
+ if (redmode == 0) {
+ if ((err = mp_reduce_setup (&mu, P)) != MP_OKAY) {
+ goto LBL_MU;
+ }
+ redux = mp_reduce;
+ } else {
+ if ((err = mp_reduce_2k_setup_l (P, &mu)) != MP_OKAY) {
+ goto LBL_MU;
+ }
+ redux = mp_reduce_2k_l;
+ }
/* create M table
*
@@ -96,11 +106,14 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
}
for (x = 0; x < (winsize - 1); x++) {
+ /* square it */
if ((err = mp_sqr (&M[1 << (winsize - 1)],
&M[1 << (winsize - 1)])) != MP_OKAY) {
goto LBL_MU;
}
- if ((err = mp_reduce (&M[1 << (winsize - 1)], P, &mu)) != MP_OKAY) {
+
+ /* reduce modulo P */
+ if ((err = redux (&M[1 << (winsize - 1)], P, &mu)) != MP_OKAY) {
goto LBL_MU;
}
}
@@ -112,7 +125,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_mul (&M[x - 1], &M[1], &M[x])) != MP_OKAY) {
goto LBL_MU;
}
- if ((err = mp_reduce (&M[x], P, &mu)) != MP_OKAY) {
+ if ((err = redux (&M[x], P, &mu)) != MP_OKAY) {
goto LBL_MU;
}
}
@@ -161,7 +174,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_sqr (&res, &res)) != MP_OKAY) {
goto LBL_RES;
}
- if ((err = mp_reduce (&res, P, &mu)) != MP_OKAY) {
+ if ((err = redux (&res, P, &mu)) != MP_OKAY) {
goto LBL_RES;
}
continue;
@@ -178,7 +191,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_sqr (&res, &res)) != MP_OKAY) {
goto LBL_RES;
}
- if ((err = mp_reduce (&res, P, &mu)) != MP_OKAY) {
+ if ((err = redux (&res, P, &mu)) != MP_OKAY) {
goto LBL_RES;
}
}
@@ -187,7 +200,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_mul (&res, &M[bitbuf], &res)) != MP_OKAY) {
goto LBL_RES;
}
- if ((err = mp_reduce (&res, P, &mu)) != MP_OKAY) {
+ if ((err = redux (&res, P, &mu)) != MP_OKAY) {
goto LBL_RES;
}
@@ -205,7 +218,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_sqr (&res, &res)) != MP_OKAY) {
goto LBL_RES;
}
- if ((err = mp_reduce (&res, P, &mu)) != MP_OKAY) {
+ if ((err = redux (&res, P, &mu)) != MP_OKAY) {
goto LBL_RES;
}
@@ -215,7 +228,7 @@ int s_mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
if ((err = mp_mul (&res, &M[1], &res)) != MP_OKAY) {
goto LBL_RES;
}
- if ((err = mp_reduce (&res, P, &mu)) != MP_OKAY) {
+ if ((err = redux (&res, P, &mu)) != MP_OKAY) {
goto LBL_RES;
}
}
5 bncore.c
@@ -20,11 +20,12 @@
CPU /Compiler /MUL CUTOFF/SQR CUTOFF
-------------------------------------------------------------
Intel P4 Northwood /GCC v3.4.1 / 88/ 128/LTM 0.32 ;-)
+ AMD Athlon64 /GCC v3.4.4 / 74/ 124/LTM 0.34
*/
-int KARATSUBA_MUL_CUTOFF = 88, /* Min. number of digits before Karatsuba multiplication is used. */
- KARATSUBA_SQR_CUTOFF = 128, /* Min. number of digits before Karatsuba squaring is used. */
+int KARATSUBA_MUL_CUTOFF = 74, /* Min. number of digits before Karatsuba multiplication is used. */
+ KARATSUBA_SQR_CUTOFF = 124, /* Min. number of digits before Karatsuba squaring is used. */
TOOM_MUL_CUTOFF = 350, /* no optimal values of these are known yet so set em high */
TOOM_SQR_CUTOFF = 400;
4,575 callgraph.txt
2,712 additions, 1,863 deletions not shown
12 changes.txt
@@ -1,3 +1,15 @@
+February 12th, 2005
+v0.34 -- Fixed two more small errors in mp_prime_random_ex()
+ -- Fixed overflow in mp_mul_d() [Kevin Kenny]
+ -- Added mp_to_(un)signed_bin_n() functions which do bounds checking for ya [and report the size]
+ -- Added "large" diminished radix support. Speeds up things like DSA where the moduli is of the form 2^k - P for some P < 2^(k/2) or so
+ Actually is faster than Montgomery on my AMD64 (and probably much faster on a P4)
+ -- Updated the manual a bit
+ -- Ok so I haven't done the textbook work yet... My current freelance gig has landed me in France till the
+ end of Feb/05. Once I get back I'll have tons of free time and I plan to go to town on the book.
+ As of this release the API will freeze. At least until the book catches up with all the changes. I welcome
+ bug reports but new algorithms will have to wait.
+
December 23rd, 2004
v0.33 -- Fixed "small" variant for mp_div() which would munge with negative dividends...
-- Fixed bug in mp_prime_random_ex() which would set the most significant byte to zero when
936 demo/demo.c
@@ -9,15 +9,16 @@
#include "tommath.h"
-void ndraw(mp_int *a, char *name)
+void ndraw(mp_int * a, char *name)
{
char buf[16000];
+
printf("%s: ", name);
mp_toradix(a, buf, 10);
printf("%s\n", buf);
}
-static void draw(mp_int *a)
+static void draw(mp_int * a)
{
ndraw(a, "");
}
@@ -39,18 +40,20 @@ int lbit(void)
int myrng(unsigned char *dst, int len, void *dat)
{
int x;
- for (x = 0; x < len; x++) dst[x] = rand() & 0xFF;
+
+ for (x = 0; x < len; x++)
+ dst[x] = rand() & 0xFF;
return len;
}
- char cmd[4096], buf[4096];
+char cmd[4096], buf[4096];
int main(void)
{
mp_int a, b, c, d, e, f;
- unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, inv_n,
- div2_n, mul2_n, add_d_n, sub_d_n, t;
+ unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n,
+ gcd_n, lcm_n, inv_n, div2_n, mul2_n, add_d_n, sub_d_n, t;
unsigned rr;
int i, n, err, cnt, ix, old_kara_m, old_kara_s;
@@ -65,108 +68,118 @@ int main(void)
srand(time(NULL));
#if 0
- // test mp_get_int
- printf("Testing: mp_get_int\n");
- for(i=0;i<1000;++i) {
- t = ((unsigned long)rand()*rand()+1)&0xFFFFFFFF;
- mp_set_int(&a,t);
- if (t!=mp_get_int(&a)) {
+ // test mp_get_int
+ printf("Testing: mp_get_int\n");
+ for (i = 0; i < 1000; ++i) {
+ t = ((unsigned long) rand() * rand() + 1) & 0xFFFFFFFF;
+ mp_set_int(&a, t);
+ if (t != mp_get_int(&a)) {
+ printf("mp_get_int() bad result!\n");
+ return 1;
+ }
+ }
+ mp_set_int(&a, 0);
+ if (mp_get_int(&a) != 0) {
printf("mp_get_int() bad result!\n");
return 1;
- }
- }
- mp_set_int(&a,0);
- if (mp_get_int(&a)!=0)
- { printf("mp_get_int() bad result!\n");
- return 1;
- }
- mp_set_int(&a,0xffffffff);
- if (mp_get_int(&a)!=0xffffffff)
- { printf("mp_get_int() bad result!\n");
- return 1;
- }
-
- // test mp_sqrt
- printf("Testing: mp_sqrt\n");
- for (i=0;i<1000;++i) {
- printf("%6d\r", i); fflush(stdout);
- n = (rand()&15)+1;
- mp_rand(&a,n);
- if (mp_sqrt(&a,&b) != MP_OKAY)
- { printf("mp_sqrt() error!\n");
- return 1;
- }
- mp_n_root(&a,2,&a);
- if (mp_cmp_mag(&b,&a) != MP_EQ)
- { printf("mp_sqrt() bad result!\n");
- return 1;
- }
- }
-
- printf("\nTesting: mp_is_square\n");
- for (i=0;i<1000;++i) {
- printf("%6d\r", i); fflush(stdout);
-
- /* test mp_is_square false negatives */
- n = (rand()&7)+1;
- mp_rand(&a,n);
- mp_sqr(&a,&a);
- if (mp_is_square(&a,&n)!=MP_OKAY) {
- printf("fn:mp_is_square() error!\n");
- return 1;
- }
- if (n==0) {
- printf("fn:mp_is_square() bad result!\n");
+ }
+ mp_set_int(&a, 0xffffffff);
+ if (mp_get_int(&a) != 0xffffffff) {
+ printf("mp_get_int() bad result!\n");
return 1;
- }
+ }
+ // test mp_sqrt
+ printf("Testing: mp_sqrt\n");
+ for (i = 0; i < 1000; ++i) {
+ printf("%6d\r", i);
+ fflush(stdout);
+ n = (rand() & 15) + 1;
+ mp_rand(&a, n);
+ if (mp_sqrt(&a, &b) != MP_OKAY) {
+ printf("mp_sqrt() error!\n");
+ return 1;
+ }
+ mp_n_root(&a, 2, &a);
+ if (mp_cmp_mag(&b, &a) != MP_EQ) {
+ printf("mp_sqrt() bad result!\n");
+ return 1;
+ }
+ }
- /* test for false positives */
- mp_add_d(&a, 1, &a);
- if (mp_is_square(&a,&n)!=MP_OKAY) {
- printf("fp:mp_is_square() error!\n");
- return 1;
- }
- if (n==1) {
- printf("fp:mp_is_square() bad result!\n");
- return 1;
- }
+ printf("\nTesting: mp_is_square\n");
+ for (i = 0; i < 1000; ++i) {
+ printf("%6d\r", i);
+ fflush(stdout);
+
+ /* test mp_is_square false negatives */
+ n = (rand() & 7) + 1;
+ mp_rand(&a, n);
+ mp_sqr(&a, &a);
+ if (mp_is_square(&a, &n) != MP_OKAY) {
+ printf("fn:mp_is_square() error!\n");
+ return 1;
+ }
+ if (n == 0) {
+ printf("fn:mp_is_square() bad result!\n");
+ return 1;
+ }
+
+ /* test for false positives */
+ mp_add_d(&a, 1, &a);
+ if (mp_is_square(&a, &n) != MP_OKAY) {
+ printf("fp:mp_is_square() error!\n");
+ return 1;
+ }
+ if (n == 1) {
+ printf("fp:mp_is_square() bad result!\n");
+ return 1;
+ }
- }
- printf("\n\n");
+ }
+ printf("\n\n");
/* test for size */
for (ix = 10; ix < 256; ix++) {
- printf("Testing (not safe-prime): %9d bits \r", ix); fflush(stdout);
- err = mp_prime_random_ex(&a, 8, ix, (rand()&1)?LTM_PRIME_2MSB_OFF:LTM_PRIME_2MSB_ON, myrng, NULL);
- if (err != MP_OKAY) {
- printf("failed with err code %d\n", err);
- return EXIT_FAILURE;
- }
- if (mp_count_bits(&a) != ix) {
- printf("Prime is %d not %d bits!!!\n", mp_count_bits(&a), ix);
- return EXIT_FAILURE;
- }
+ printf("Testing (not safe-prime): %9d bits \r", ix);
+ fflush(stdout);
+ err =
+ mp_prime_random_ex(&a, 8, ix,
+ (rand() & 1) ? LTM_PRIME_2MSB_OFF :
+ LTM_PRIME_2MSB_ON, myrng, NULL);
+ if (err != MP_OKAY) {
+ printf("failed with err code %d\n", err);
+ return EXIT_FAILURE;
+ }
+ if (mp_count_bits(&a) != ix) {
+ printf("Prime is %d not %d bits!!!\n", mp_count_bits(&a), ix);
+ return EXIT_FAILURE;
+ }
}
for (ix = 16; ix < 256; ix++) {
- printf("Testing ( safe-prime): %9d bits \r", ix); fflush(stdout);
- err = mp_prime_random_ex(&a, 8, ix, ((rand()&1)?LTM_PRIME_2MSB_OFF:LTM_PRIME_2MSB_ON)|LTM_PRIME_SAFE, myrng, NULL);
- if (err != MP_OKAY) {
- printf("failed with err code %d\n", err);
- return EXIT_FAILURE;
- }
- if (mp_count_bits(&a) != ix) {
- printf("Prime is %d not %d bits!!!\n", mp_count_bits(&a), ix);
- return EXIT_FAILURE;
- }
- /* let's see if it's really a safe prime */
- mp_sub_d(&a, 1, &a);
- mp_div_2(&a, &a);
- mp_prime_is_prime(&a, 8, &cnt);
- if (cnt != MP_YES) {
- printf("sub is not prime!\n");
- return EXIT_FAILURE;
- }
+ printf("Testing ( safe-prime): %9d bits \r", ix);
+ fflush(stdout);
+ err =
+ mp_prime_random_ex(&a, 8, ix,
+ ((rand() & 1) ? LTM_PRIME_2MSB_OFF :
+ LTM_PRIME_2MSB_ON) | LTM_PRIME_SAFE, myrng,
+ NULL);
+ if (err != MP_OKAY) {
+ printf("failed with err code %d\n", err);
+ return EXIT_FAILURE;
+ }
+ if (mp_count_bits(&a) != ix) {
+ printf("Prime is %d not %d bits!!!\n", mp_count_bits(&a), ix);
+ return EXIT_FAILURE;
+ }
+ /* let's see if it's really a safe prime */
+ mp_sub_d(&a, 1, &a);
+ mp_div_2(&a, &a);
+ mp_prime_is_prime(&a, 8, &cnt);
+ if (cnt != MP_YES) {
+ printf("sub is not prime!\n");
+ return EXIT_FAILURE;
+ }
}
printf("\n\n");
@@ -194,51 +207,56 @@ int main(void)
printf("testing mp_cnt_lsb...\n");
mp_set(&a, 1);
for (ix = 0; ix < 1024; ix++) {
- if (mp_cnt_lsb(&a) != ix) {
- printf("Failed at %d, %d\n", ix, mp_cnt_lsb(&a));
- return 0;
- }
- mp_mul_2(&a, &a);
+ if (mp_cnt_lsb(&a) != ix) {
+ printf("Failed at %d, %d\n", ix, mp_cnt_lsb(&a));
+ return 0;
+ }
+ mp_mul_2(&a, &a);
}
/* test mp_reduce_2k */
printf("Testing mp_reduce_2k...\n");
for (cnt = 3; cnt <= 128; ++cnt) {
- mp_digit tmp;
- mp_2expt(&a, cnt);
- mp_sub_d(&a, 2, &a); /* a = 2**cnt - 2 */
-
-
- printf("\nTesting %4d bits", cnt);
- printf("(%d)", mp_reduce_is_2k(&a));
- mp_reduce_2k_setup(&a, &tmp);
- printf("(%d)", tmp);
- for (ix = 0; ix < 1000; ix++) {
- if (!(ix & 127)) {printf("."); fflush(stdout); }
- mp_rand(&b, (cnt/DIGIT_BIT + 1) * 2);
- mp_copy(&c, &b);
- mp_mod(&c, &a, &c);
- mp_reduce_2k(&b, &a, 1);
- if (mp_cmp(&c, &b)) {
- printf("FAILED\n");
- exit(0);
- }
- }
- }
+ mp_digit tmp;
+
+ mp_2expt(&a, cnt);
+ mp_sub_d(&a, 2, &a); /* a = 2**cnt - 2 */
+
+
+ printf("\nTesting %4d bits", cnt);
+ printf("(%d)", mp_reduce_is_2k(&a));
+ mp_reduce_2k_setup(&a, &tmp);
+ printf("(%d)", tmp);
+ for (ix = 0; ix < 1000; ix++) {
+ if (!(ix & 127)) {
+ printf(".");
+ fflush(stdout);
+ }
+ mp_rand(&b, (cnt / DIGIT_BIT + 1) * 2);
+ mp_copy(&c, &b);
+ mp_mod(&c, &a, &c);
+ mp_reduce_2k(&b, &a, 1);
+ if (mp_cmp(&c, &b)) {
+ printf("FAILED\n");
+ exit(0);
+ }
+ }
+ }
/* test mp_div_3 */
printf("Testing mp_div_3...\n");
mp_set(&d, 3);
- for (cnt = 0; cnt < 10000; ) {
+ for (cnt = 0; cnt < 10000;) {
mp_digit r1, r2;
- if (!(++cnt & 127)) printf("%9d\r", cnt);
+ if (!(++cnt & 127))
+ printf("%9d\r", cnt);
mp_rand(&a, abs(rand()) % 128 + 1);
mp_div(&a, &d, &b, &e);
mp_div_3(&a, &c, &r2);
if (mp_cmp(&b, &c) || mp_cmp_d(&e, r2)) {
- printf("\n\nmp_div_3 => Failure\n");
+ printf("\n\nmp_div_3 => Failure\n");
}
}
printf("\n\nPassed div_3 testing\n");
@@ -246,270 +264,438 @@ int main(void)
/* test the DR reduction */
printf("testing mp_dr_reduce...\n");
for (cnt = 2; cnt < 32; cnt++) {
- printf("%d digit modulus\n", cnt);
- mp_grow(&a, cnt);
- mp_zero(&a);
- for (ix = 1; ix < cnt; ix++) {
- a.dp[ix] = MP_MASK;
- }
- a.used = cnt;
- a.dp[0] = 3;
-
- mp_rand(&b, cnt - 1);
- mp_copy(&b, &c);
+ printf("%d digit modulus\n", cnt);
+ mp_grow(&a, cnt);
+ mp_zero(&a);
+ for (ix = 1; ix < cnt; ix++) {
+ a.dp[ix] = MP_MASK;
+ }
+ a.used = cnt;
+ a.dp[0] = 3;
+
+ mp_rand(&b, cnt - 1);
+ mp_copy(&b, &c);
rr = 0;
do {
- if (!(rr & 127)) { printf("%9lu\r", rr); fflush(stdout); }
- mp_sqr(&b, &b); mp_add_d(&b, 1, &b);
- mp_copy(&b, &c);
-
- mp_mod(&b, &a, &b);
- mp_dr_reduce(&c, &a, (((mp_digit)1)<<DIGIT_BIT)-a.dp[0]);
-
- if (mp_cmp(&b, &c) != MP_EQ) {
- printf("Failed on trial %lu\n", rr); exit(-1);
-
- }
+ if (!(rr & 127)) {
+ printf("%9lu\r", rr);
+ fflush(stdout);
+ }
+ mp_sqr(&b, &b);
+ mp_add_d(&b, 1, &b);
+ mp_copy(&b, &c);
+
+ mp_mod(&b, &a, &b);
+ mp_dr_reduce(&c, &a, (((mp_digit) 1) << DIGIT_BIT) - a.dp[0]);
+
+ if (mp_cmp(&b, &c) != MP_EQ) {
+ printf("Failed on trial %lu\n", rr);
+ exit(-1);
+
+ }
} while (++rr < 500);
printf("Passed DR test for %d digits\n", cnt);
}
#endif
+/* test the mp_reduce_2k_l code */
+#if 0
+#if 0
+/* first load P with 2^1024 - 0x2A434 B9FDEC95 D8F9D550 FFFFFFFF FFFFFFFF */
+ mp_2expt(&a, 1024);
+ mp_read_radix(&b, "2A434B9FDEC95D8F9D550FFFFFFFFFFFFFFFF", 16);
+ mp_sub(&a, &b, &a);
+#elif 1
+/* p = 2^2048 - 0x1 00000000 00000000 00000000 00000000 4945DDBF 8EA2A91D 5776399B B83E188F */
+ mp_2expt(&a, 2048);
+ mp_read_radix(&b,
+ "1000000000000000000000000000000004945DDBF8EA2A91D5776399BB83E188F",
+ 16);
+ mp_sub(&a, &b, &a);
+#endif
+
+ mp_todecimal(&a, buf);
+ printf("p==%s\n", buf);
+/* now mp_reduce_is_2k_l() should return 1 */
+ if (mp_reduce_is_2k_l(&a) != 1) {
+ printf("mp_reduce_is_2k_l() return 0, should be 1\n");
+ return EXIT_FAILURE;
+ }
+ mp_reduce_2k_setup_l(&a, &d);
+ /* now do a million square+1 to see if it varies */
+ mp_rand(&b, 64);
+ mp_mod(&b, &a, &b);
+ mp_copy(&b, &c);
+ printf("testing mp_reduce_2k_l...");
+ fflush(stdout);
+ for (cnt = 0; cnt < (1UL << 20); cnt++) {
+ mp_sqr(&b, &b);
+ mp_add_d(&b, 1, &b);
+ mp_reduce_2k_l(&b, &a, &d);
+ mp_sqr(&c, &c);
+ mp_add_d(&c, 1, &c);
+ mp_mod(&c, &a, &c);
+ if (mp_cmp(&b, &c) != MP_EQ) {
+ printf("mp_reduce_2k_l() failed at step %lu\n", cnt);
+ mp_tohex(&b, buf);
+ printf("b == %s\n", buf);
+ mp_tohex(&c, buf);
+ printf("c == %s\n", buf);
+ return EXIT_FAILURE;
+ }
+ }
+ printf("...Passed\n");
+#endif
+
div2_n = mul2_n = inv_n = expt_n = lcm_n = gcd_n = add_n =
- sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = cnt = add_d_n = sub_d_n= 0;
+ sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = cnt = add_d_n =
+ sub_d_n = 0;
/* force KARA and TOOM to enable despite cutoffs */
KARATSUBA_SQR_CUTOFF = KARATSUBA_MUL_CUTOFF = 110;
- TOOM_SQR_CUTOFF = TOOM_MUL_CUTOFF = 150;
+ TOOM_SQR_CUTOFF = TOOM_MUL_CUTOFF = 150;
for (;;) {
- /* randomly clear and re-init one variable; this has the effect of trimming the alloc space */
- switch (abs(rand()) % 7) {
- case 0: mp_clear(&a); mp_init(&a); break;
- case 1: mp_clear(&b); mp_init(&b); break;
- case 2: mp_clear(&c); mp_init(&c); break;
- case 3: mp_clear(&d); mp_init(&d); break;
- case 4: mp_clear(&e); mp_init(&e); break;
- case 5: mp_clear(&f); mp_init(&f); break;
- case 6: break; /* don't clear any */
- }
-
-
- printf("%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu ", add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, expt_n, inv_n, div2_n, mul2_n, add_d_n, sub_d_n);
- fgets(cmd, 4095, stdin);
- cmd[strlen(cmd)-1] = 0;
- printf("%s ]\r",cmd); fflush(stdout);
- if (!strcmp(cmd, "mul2d")) { ++mul2d_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); sscanf(buf, "%d", &rr);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
-
- mp_mul_2d(&a, rr, &a);
- a.sign = b.sign;
- if (mp_cmp(&a, &b) != MP_EQ) {
- printf("mul2d failed, rr == %d\n",rr);
- draw(&a);
- draw(&b);
- return 0;
- }
- } else if (!strcmp(cmd, "div2d")) { ++div2d_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); sscanf(buf, "%d", &rr);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
-
- mp_div_2d(&a, rr, &a, &e);
- a.sign = b.sign;
- if (a.used == b.used && a.used == 0) { a.sign = b.sign = MP_ZPOS; }
- if (mp_cmp(&a, &b) != MP_EQ) {
- printf("div2d failed, rr == %d\n",rr);
- draw(&a);
- draw(&b);
- return 0;
- }
- } else if (!strcmp(cmd, "add")) { ++add_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_copy(&a, &d);
- mp_add(&d, &b, &d);
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("add %lu failure!\n", add_n);
-draw(&a);draw(&b);draw(&c);draw(&d);
- return 0;
- }
-
- /* test the sign/unsigned storage functions */
-
- rr = mp_signed_bin_size(&c);
- mp_to_signed_bin(&c, (unsigned char *)cmd);
- memset(cmd+rr, rand()&255, sizeof(cmd)-rr);
- mp_read_signed_bin(&d, (unsigned char *)cmd, rr);
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("mp_signed_bin failure!\n");
- draw(&c);
- draw(&d);
- return 0;
- }
-
-
- rr = mp_unsigned_bin_size(&c);
- mp_to_unsigned_bin(&c, (unsigned char *)cmd);
- memset(cmd+rr, rand()&255, sizeof(cmd)-rr);
- mp_read_unsigned_bin(&d, (unsigned char *)cmd, rr);
- if (mp_cmp_mag(&c, &d) != MP_EQ) {
- printf("mp_unsigned_bin failure!\n");
- draw(&c);
- draw(&d);
- return 0;
- }
-
- } else if (!strcmp(cmd, "sub")) { ++sub_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_copy(&a, &d);
- mp_sub(&d, &b, &d);
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("sub %lu failure!\n", sub_n);
-draw(&a);draw(&b);draw(&c);draw(&d);
- return 0;
- }
- } else if (!strcmp(cmd, "mul")) { ++mul_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_copy(&a, &d);
- mp_mul(&d, &b, &d);
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("mul %lu failure!\n", mul_n);
-draw(&a);draw(&b);draw(&c);draw(&d);
- return 0;
- }
- } else if (!strcmp(cmd, "div")) { ++div_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&d, buf, 64);
-
- mp_div(&a, &b, &e, &f);
- if (mp_cmp(&c, &e) != MP_EQ || mp_cmp(&d, &f) != MP_EQ) {
- printf("div %lu %d, %d, failure!\n", div_n, mp_cmp(&c, &e), mp_cmp(&d, &f));
-draw(&a);draw(&b);draw(&c);draw(&d); draw(&e); draw(&f);
- return 0;
- }
-
- } else if (!strcmp(cmd, "sqr")) { ++sqr_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- mp_copy(&a, &c);
- mp_sqr(&c, &c);
- if (mp_cmp(&b, &c) != MP_EQ) {
- printf("sqr %lu failure!\n", sqr_n);
-draw(&a);draw(&b);draw(&c);
- return 0;
- }
- } else if (!strcmp(cmd, "gcd")) { ++gcd_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_copy(&a, &d);
- mp_gcd(&d, &b, &d);
- d.sign = c.sign;
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("gcd %lu failure!\n", gcd_n);
-draw(&a);draw(&b);draw(&c);draw(&d);
- return 0;
- }
- } else if (!strcmp(cmd, "lcm")) { ++lcm_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_copy(&a, &d);
- mp_lcm(&d, &b, &d);
- d.sign = c.sign;
- if (mp_cmp(&c, &d) != MP_EQ) {
- printf("lcm %lu failure!\n", lcm_n);
- draw(&a);draw(&b);draw(&c);draw(&d);
- return 0;
- }
- } else if (!strcmp(cmd, "expt")) { ++expt_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&d, buf, 64);
- mp_copy(&a, &e);
- mp_exptmod(&e, &b, &c, &e);
- if (mp_cmp(&d, &e) != MP_EQ) {
- printf("expt %lu failure!\n", expt_n);
- draw(&a);draw(&b);draw(&c);draw(&d); draw(&e);
- return 0;
- }
- } else if (!strcmp(cmd, "invmod")) { ++inv_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&c, buf, 64);
- mp_invmod(&a, &b, &d);
- mp_mulmod(&d,&a,&b,&e);
- if (mp_cmp_d(&e, 1) != MP_EQ) {
- printf("inv [wrong value from MPI?!] failure\n");
- draw(&a);draw(&b);draw(&c);draw(&d);
- mp_gcd(&a, &b, &e);
- draw(&e);
- return 0;
- }
-
- } else if (!strcmp(cmd, "div2")) { ++div2_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- mp_div_2(&a, &c);
- if (mp_cmp(&c, &b) != MP_EQ) {
- printf("div_2 %lu failure\n", div2_n);
- draw(&a);
- draw(&b);
- draw(&c);
- return 0;
- }
- } else if (!strcmp(cmd, "mul2")) { ++mul2_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- mp_mul_2(&a, &c);
- if (mp_cmp(&c, &b) != MP_EQ) {
- printf("mul_2 %lu failure\n", mul2_n);
- draw(&a);
- draw(&b);
- draw(&c);
- return 0;
- }
- } else if (!strcmp(cmd, "add_d")) { ++add_d_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); sscanf(buf, "%d", &ix);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- mp_add_d(&a, ix, &c);
- if (mp_cmp(&b, &c) != MP_EQ) {
- printf("add_d %lu failure\n", add_d_n);
- draw(&a);
- draw(&b);
- draw(&c);
- printf("d == %d\n", ix);
- return 0;
- }
- } else if (!strcmp(cmd, "sub_d")) { ++sub_d_n;
- fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 64);
- fgets(buf, 4095, stdin); sscanf(buf, "%d", &ix);
- fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 64);
- mp_sub_d(&a, ix, &c);
- if (mp_cmp(&b, &c) != MP_EQ) {
- printf("sub_d %lu failure\n", sub_d_n);
- draw(&a);
- draw(&b);
- draw(&c);
- printf("d == %d\n", ix);
- return 0;
- }
- }
+ /* randomly clear and re-init one variable; this has the effect of trimming the alloc space */
+ switch (abs(rand()) % 7) {
+ case 0:
+ mp_clear(&a);
+ mp_init(&a);
+ break;
+ case 1:
+ mp_clear(&b);
+ mp_init(&b);
+ break;
+ case 2:
+ mp_clear(&c);
+ mp_init(&c);
+ break;
+ case 3:
+ mp_clear(&d);
+ mp_init(&d);
+ break;
+ case 4:
+ mp_clear(&e);
+ mp_init(&e);
+ break;
+ case 5:
+ mp_clear(&f);
+ mp_init(&f);
+ break;
+ case 6:
+ break; /* don't clear any */
+ }
+
+
+ printf
+ ("%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu/%4lu ",
+ add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n,
+ expt_n, inv_n, div2_n, mul2_n, add_d_n, sub_d_n);
+ fgets(cmd, 4095, stdin);
+ cmd[strlen(cmd) - 1] = 0;
+ printf("%s ]\r", cmd);
+ fflush(stdout);
+ if (!strcmp(cmd, "mul2d")) {
+ ++mul2d_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ sscanf(buf, "%d", &rr);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+
+ mp_mul_2d(&a, rr, &a);
+ a.sign = b.sign;
+ if (mp_cmp(&a, &b) != MP_EQ) {
+ printf("mul2d failed, rr == %d\n", rr);
+ draw(&a);
+ draw(&b);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "div2d")) {
+ ++div2d_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ sscanf(buf, "%d", &rr);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+
+ mp_div_2d(&a, rr, &a, &e);
+ a.sign = b.sign;
+ if (a.used == b.used && a.used == 0) {
+ a.sign = b.sign = MP_ZPOS;
+ }
+ if (mp_cmp(&a, &b) != MP_EQ) {
+ printf("div2d failed, rr == %d\n", rr);
+ draw(&a);
+ draw(&b);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "add")) {
+ ++add_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_copy(&a, &d);
+ mp_add(&d, &b, &d);
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("add %lu failure!\n", add_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+
+ /* test the sign/unsigned storage functions */
+
+ rr = mp_signed_bin_size(&c);
+ mp_to_signed_bin(&c, (unsigned char *) cmd);
+ memset(cmd + rr, rand() & 255, sizeof(cmd) - rr);
+ mp_read_signed_bin(&d, (unsigned char *) cmd, rr);
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("mp_signed_bin failure!\n");
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+
+
+ rr = mp_unsigned_bin_size(&c);
+ mp_to_unsigned_bin(&c, (unsigned char *) cmd);
+ memset(cmd + rr, rand() & 255, sizeof(cmd) - rr);
+ mp_read_unsigned_bin(&d, (unsigned char *) cmd, rr);
+ if (mp_cmp_mag(&c, &d) != MP_EQ) {
+ printf("mp_unsigned_bin failure!\n");
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+
+ } else if (!strcmp(cmd, "sub")) {
+ ++sub_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_copy(&a, &d);
+ mp_sub(&d, &b, &d);
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("sub %lu failure!\n", sub_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "mul")) {
+ ++mul_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_copy(&a, &d);
+ mp_mul(&d, &b, &d);
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("mul %lu failure!\n", mul_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "div")) {
+ ++div_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&d, buf, 64);
+
+ mp_div(&a, &b, &e, &f);
+ if (mp_cmp(&c, &e) != MP_EQ || mp_cmp(&d, &f) != MP_EQ) {
+ printf("div %lu %d, %d, failure!\n", div_n, mp_cmp(&c, &e),
+ mp_cmp(&d, &f));
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ draw(&e);
+ draw(&f);
+ return 0;
+ }
+
+ } else if (!strcmp(cmd, "sqr")) {
+ ++sqr_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ mp_copy(&a, &c);
+ mp_sqr(&c, &c);
+ if (mp_cmp(&b, &c) != MP_EQ) {
+ printf("sqr %lu failure!\n", sqr_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "gcd")) {
+ ++gcd_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_copy(&a, &d);
+ mp_gcd(&d, &b, &d);
+ d.sign = c.sign;
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("gcd %lu failure!\n", gcd_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "lcm")) {
+ ++lcm_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_copy(&a, &d);
+ mp_lcm(&d, &b, &d);
+ d.sign = c.sign;
+ if (mp_cmp(&c, &d) != MP_EQ) {
+ printf("lcm %lu failure!\n", lcm_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "expt")) {
+ ++expt_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&d, buf, 64);
+ mp_copy(&a, &e);
+ mp_exptmod(&e, &b, &c, &e);
+ if (mp_cmp(&d, &e) != MP_EQ) {
+ printf("expt %lu failure!\n", expt_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ draw(&e);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "invmod")) {
+ ++inv_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&c, buf, 64);
+ mp_invmod(&a, &b, &d);
+ mp_mulmod(&d, &a, &b, &e);
+ if (mp_cmp_d(&e, 1) != MP_EQ) {
+ printf("inv [wrong value from MPI?!] failure\n");
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ draw(&d);
+ mp_gcd(&a, &b, &e);
+ draw(&e);
+ return 0;
+ }
+
+ } else if (!strcmp(cmd, "div2")) {
+ ++div2_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ mp_div_2(&a, &c);
+ if (mp_cmp(&c, &b) != MP_EQ) {
+ printf("div_2 %lu failure\n", div2_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "mul2")) {
+ ++mul2_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ mp_mul_2(&a, &c);
+ if (mp_cmp(&c, &b) != MP_EQ) {
+ printf("mul_2 %lu failure\n", mul2_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "add_d")) {
+ ++add_d_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ sscanf(buf, "%d", &ix);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ mp_add_d(&a, ix, &c);
+ if (mp_cmp(&b, &c) != MP_EQ) {
+ printf("add_d %lu failure\n", add_d_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ printf("d == %d\n", ix);
+ return 0;
+ }
+ } else if (!strcmp(cmd, "sub_d")) {
+ ++sub_d_n;
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&a, buf, 64);
+ fgets(buf, 4095, stdin);
+ sscanf(buf, "%d", &ix);
+ fgets(buf, 4095, stdin);
+ mp_read_radix(&b, buf, 64);
+ mp_sub_d(&a, ix, &c);
+ if (mp_cmp(&b, &c) != MP_EQ) {
+ printf("sub_d %lu failure\n", sub_d_n);
+ draw(&a);
+ draw(&b);
+ draw(&c);
+ printf("d == %d\n", ix);
+ return 0;
+ }
+ }
}
return 0;
}
-
398 demo/timing.c
@@ -11,15 +11,16 @@ ulong64 _tt;
#endif
-void ndraw(mp_int *a, char *name)
+void ndraw(mp_int * a, char *name)
{
char buf[4096];
+
printf("%s: ", name);
mp_toradix(a, buf, 64);
printf("%s\n", buf);
}
-static void draw(mp_int *a)
+static void draw(mp_int * a)
{
ndraw(a, "");
}
@@ -39,35 +40,38 @@ int lbit(void)
}
/* RDTSC from Scott Duplichan */
-static ulong64 TIMFUNC (void)
- {
- #if defined __GNUC__
- #if defined(__i386__) || defined(__x86_64__)
- unsigned long long a;
- __asm__ __volatile__ ("rdtsc\nmovl %%eax,%0\nmovl %%edx,4+%0\n"::"m"(a):"%eax","%edx");
- return a;
- #else /* gcc-IA64 version */
- unsigned long result;
- __asm__ __volatile__("mov %0=ar.itc" : "=r"(result) :: "memory");
- while (__builtin_expect ((int) result == -1, 0))
- __asm__ __volatile__("mov %0=ar.itc" : "=r"(result) :: "memory");
- return result;
- #endif
+static ulong64 TIMFUNC(void)
+{
+#if defined __GNUC__
+#if defined(__i386__) || defined(__x86_64__)
+ unsigned long long a;
+ __asm__ __volatile__("rdtsc\nmovl %%eax,%0\nmovl %%edx,4+%0\n"::
+ "m"(a):"%eax", "%edx");
+ return a;
+#else /* gcc-IA64 version */
+ unsigned long result;
+ __asm__ __volatile__("mov %0=ar.itc":"=r"(result)::"memory");
+
+ while (__builtin_expect((int) result == -1, 0))
+ __asm__ __volatile__("mov %0=ar.itc":"=r"(result)::"memory");
+
+ return result;
+#endif
// Microsoft and Intel Windows compilers
- #elif defined _M_IX86
- __asm rdtsc
- #elif defined _M_AMD64
- return __rdtsc ();
- #elif defined _M_IA64
- #if defined __INTEL_COMPILER
- #include <ia64intrin.h>
- #endif
- return __getReg (3116);
- #else
- #error need rdtsc function for this build
- #endif
- }
+#elif defined _M_IX86
+ __asm rdtsc
+#elif defined _M_AMD64
+ return __rdtsc();
+#elif defined _M_IA64
+#if defined __INTEL_COMPILER
+#include <ia64intrin.h>
+#endif
+ return __getReg(3116);
+#else
+#error need rdtsc function for this build
+#endif
+}
#define DO(x) x; x;
//#define DO4(x) DO2(x); DO2(x);
@@ -77,7 +81,7 @@ static ulong64 TIMFUNC (void)
int main(void)
{
ulong64 tt, gg, CLK_PER_SEC;
- FILE *log, *logb, *logc;
+ FILE *log, *logb, *logc, *logd;
mp_int a, b, c, d, e, f;
int n, cnt, ix, old_kara_m, old_kara_s;
unsigned rr;
@@ -90,168 +94,191 @@ int main(void)
mp_init(&f);
srand(time(NULL));
-
-
- /* temp. turn off TOOM */
- TOOM_MUL_CUTOFF = TOOM_SQR_CUTOFF = 100000;
-
- CLK_PER_SEC = TIMFUNC();
- sleep(1);
- CLK_PER_SEC = TIMFUNC() - CLK_PER_SEC;
-
- printf("CLK_PER_SEC == %llu\n", CLK_PER_SEC);
-
- log = fopen("logs/add.log", "w");
- for (cnt = 8; cnt <= 128; cnt += 8) {
- SLEEP;
- mp_rand(&a, cnt);
- mp_rand(&b, cnt);
- rr = 0;
- tt = -1;
- do {
- gg = TIMFUNC();
- DO(mp_add(&a,&b,&c));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
- } while (++rr < 100000);
- printf("Adding\t\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf(log, "%d %9llu\n", cnt*DIGIT_BIT, tt); fflush(log);
- }
- fclose(log);
- log = fopen("logs/sub.log", "w");
- for (cnt = 8; cnt <= 128; cnt += 8) {
- SLEEP;
- mp_rand(&a, cnt);
- mp_rand(&b, cnt);
- rr = 0;
- tt = -1;
- do {
- gg = TIMFUNC();
- DO(mp_sub(&a,&b,&c));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
- } while (++rr < 100000);
-
- printf("Subtracting\t\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf(log, "%d %9llu\n", cnt*DIGIT_BIT, tt); fflush(log);
- }
- fclose(log);
+
+ /* temp. turn off TOOM */
+ TOOM_MUL_CUTOFF = TOOM_SQR_CUTOFF = 100000;
+
+ CLK_PER_SEC = TIMFUNC();
+ sleep(1);
+ CLK_PER_SEC = TIMFUNC() - CLK_PER_SEC;
+
+ printf("CLK_PER_SEC == %llu\n", CLK_PER_SEC);
+ goto exptmod;
+ log = fopen("logs/add.log", "w");
+ for (cnt = 8; cnt <= 128; cnt += 8) {
+ SLEEP;
+ mp_rand(&a, cnt);
+ mp_rand(&b, cnt);
+ rr = 0;
+ tt = -1;
+ do {
+ gg = TIMFUNC();
+ DO(mp_add(&a, &b, &c));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
+ } while (++rr < 100000);
+ printf("Adding\t\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(log, "%d %9llu\n", cnt * DIGIT_BIT, tt);
+ fflush(log);
+ }
+ fclose(log);
+
+ log = fopen("logs/sub.log", "w");
+ for (cnt = 8; cnt <= 128; cnt += 8) {
+ SLEEP;
+ mp_rand(&a, cnt);
+ mp_rand(&b, cnt);
+ rr = 0;
+ tt = -1;
+ do {
+ gg = TIMFUNC();
+ DO(mp_sub(&a, &b, &c));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
+ } while (++rr < 100000);
+
+ printf("Subtracting\t\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(log, "%d %9llu\n", cnt * DIGIT_BIT, tt);
+ fflush(log);
+ }
+ fclose(log);
/* do mult/square twice, first without karatsuba and second with */
+ multtest:
old_kara_m = KARATSUBA_MUL_CUTOFF;
old_kara_s = KARATSUBA_SQR_CUTOFF;
- for (ix = 0; ix < 1; ix++) {
- printf("With%s Karatsuba\n", (ix==0)?"out":"");
-
- KARATSUBA_MUL_CUTOFF = (ix==0)?9999:old_kara_m;
- KARATSUBA_SQR_CUTOFF = (ix==0)?9999:old_kara_s;
-
- log = fopen((ix==0)?"logs/mult.log":"logs/mult_kara.log", "w");
- for (cnt = 4; cnt <= 288; cnt += 2) {
- SLEEP;
- mp_rand(&a, cnt);
- mp_rand(&b, cnt);
- rr = 0;
- tt = -1;
- do {
- gg = TIMFUNC();
- DO(mp_mul(&a, &b, &c));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
- } while (++rr < 100);
- printf("Multiplying\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf(log, "%d %9llu\n", mp_count_bits(&a), tt); fflush(log);
+ for (ix = 0; ix < 2; ix++) {
+ printf("With%s Karatsuba\n", (ix == 0) ? "out" : "");
+
+ KARATSUBA_MUL_CUTOFF = (ix == 0) ? 9999 : old_kara_m;
+ KARATSUBA_SQR_CUTOFF = (ix == 0) ? 9999 : old_kara_s;
+
+ log = fopen((ix == 0) ? "logs/mult.log" : "logs/mult_kara.log", "w");
+ for (cnt = 4; cnt <= 10240 / DIGIT_BIT; cnt += 2) {
+ SLEEP;
+ mp_rand(&a, cnt);
+ mp_rand(&b, cnt);
+ rr = 0;
+ tt = -1;
+ do {
+ gg = TIMFUNC();
+ DO(mp_mul(&a, &b, &c));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
+ } while (++rr < 100);
+ printf("Multiplying\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(log, "%d %9llu\n", mp_count_bits(&a), tt);
+ fflush(log);
}
fclose(log);
- log = fopen((ix==0)?"logs/sqr.log":"logs/sqr_kara.log", "w");
- for (cnt = 4; cnt <= 288; cnt += 2) {
- SLEEP;
- mp_rand(&a, cnt);
- rr = 0;
- tt = -1;
- do {
- gg = TIMFUNC();
- DO(mp_sqr(&a, &b));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
- } while (++rr < 100);
- printf("Squaring\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf(log, "%d %9llu\n", mp_count_bits(&a), tt); fflush(log);
+ log = fopen((ix == 0) ? "logs/sqr.log" : "logs/sqr_kara.log", "w");
+ for (cnt = 4; cnt <= 10240 / DIGIT_BIT; cnt += 2) {
+ SLEEP;
+ mp_rand(&a, cnt);
+ rr = 0;
+ tt = -1;
+ do {
+ gg = TIMFUNC();
+ DO(mp_sqr(&a, &b));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
+ } while (++rr < 100);
+ printf("Squaring\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(log, "%d %9llu\n", mp_count_bits(&a), tt);
+ fflush(log);
}
fclose(log);
}
+ exptmod:
- {
+ {
char *primes[] = {
- /* 2K moduli mersenne primes */
- "6864797660130609714981900799081393217269435300143305409394463459185543183397656052122559640661454554977296311391480858037121987999716643812574028291115057151",
- "531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127",
- "10407932194664399081925240327364085538615262247266704805319112350403608059673360298012239441732324184842421613954281007791383566248323464908139906605677320762924129509389220345773183349661583550472959420547689811211693677147548478866962501384438260291732348885311160828538416585028255604666224831890918801847068222203140521026698435488732958028878050869736186900714720710555703168729087",
- "1475979915214180235084898622737381736312066145333169775147771216478570297878078949377407337049389289382748507531496480477281264838760259191814463365330269540496961201113430156902396093989090226259326935025281409614983499388222831448598601834318536230923772641390209490231836446899608210795482963763094236630945410832793769905399982457186322944729636418890623372171723742105636440368218459649632948538696905872650486914434637457507280441823676813517852099348660847172579408422316678097670224011990280170474894487426924742108823536808485072502240519452587542875349976558572670229633962575212637477897785501552646522609988869914013540483809865681250419497686697771007",
- "259117086013202627776246767922441530941818887553125427303974923161874019266586362086201209516800483406550695241733194177441689509238807017410377709597512042313066624082916353517952311186154862265604547691127595848775610568757931191017711408826252153849035830401185072116424747461823031471398340229288074545677907941037288235820705892351068433882986888616658650280927692080339605869308790500409503709875902119018371991620994002568935113136548829739112656797303241986517250116412703509705427773477972349821676443446668383119322540099648994051790241624056519054483690809616061625743042361721863339415852426431208737266591962061753535748892894599629195183082621860853400937932839420261866586142503251450773096274235376822938649407127700846077124211823080804139298087057504713825264571448379371125032081826126566649084251699453951887789613650248405739378594599444335231188280123660406262468609212150349937584782292237144339628858485938215738821232393687046160677362909315071",
- "190797007524439073807468042969529173669356994749940177394741882673528979787005053706368049835514900244303495954950709725762186311224148828811920216904542206960744666169364221195289538436845390250168663932838805192055137154390912666527533007309292687539092257043362517857366624699975402375462954490293259233303137330643531556539739921926201438606439020075174723029056838272505051571967594608350063404495977660656269020823960825567012344189908927956646011998057988548630107637380993519826582389781888135705408653045219655801758081251164080554609057468028203308718724654081055323215860189611391296030471108443146745671967766308925858547271507311563765171008318248647110097614890313562856541784154881743146033909602737947385055355960331855614540900081456378659068370317267696980001187750995491090350108417050917991562167972281070161305972518044872048331306383715094854938415738549894606070722584737978176686422134354526989443028353644037187375385397838259511833166416134323695660367676897722287918773420968982326089026150031515424165462111337527431154890666327374921446276833564519776797633875503548665093914556482031482248883127023777039667707976559857333357013727342079099064400455741830654320379350833236245819348824064783585692924881021978332974949906122664421376034687815350484991",
-
- /* DR moduli */
- "14059105607947488696282932836518693308967803494693489478439861164411992439598399594747002144074658928593502845729752797260025831423419686528151609940203368612079",
- "101745825697019260773923519755878567461315282017759829107608914364075275235254395622580447400994175578963163918967182013639660669771108475957692810857098847138903161308502419410142185759152435680068435915159402496058513611411688900243039",
- "736335108039604595805923406147184530889923370574768772191969612422073040099331944991573923112581267542507986451953227192970402893063850485730703075899286013451337291468249027691733891486704001513279827771740183629161065194874727962517148100775228363421083691764065477590823919364012917984605619526140821797602431",
- "38564998830736521417281865696453025806593491967131023221754800625044118265468851210705360385717536794615180260494208076605798671660719333199513807806252394423283413430106003596332513246682903994829528690198205120921557533726473585751382193953592127439965050261476810842071573684505878854588706623484573925925903505747545471088867712185004135201289273405614415899438276535626346098904241020877974002916168099951885406379295536200413493190419727789712076165162175783",
- "542189391331696172661670440619180536749994166415993334151601745392193484590296600979602378676624808129613777993466242203025054573692562689251250471628358318743978285860720148446448885701001277560572526947619392551574490839286458454994488665744991822837769918095117129546414124448777033941223565831420390846864429504774477949153794689948747680362212954278693335653935890352619041936727463717926744868338358149568368643403037768649616778526013610493696186055899318268339432671541328195724261329606699831016666359440874843103020666106568222401047720269951530296879490444224546654729111504346660859907296364097126834834235287147",
- "1487259134814709264092032648525971038895865645148901180585340454985524155135260217788758027400478312256339496385275012465661575576202252063145698732079880294664220579764848767704076761853197216563262660046602703973050798218246170835962005598561669706844469447435461092542265792444947706769615695252256130901271870341005768912974433684521436211263358097522726462083917939091760026658925757076733484173202927141441492573799914240222628795405623953109131594523623353044898339481494120112723445689647986475279242446083151413667587008191682564376412347964146113898565886683139407005941383669325997475076910488086663256335689181157957571445067490187939553165903773554290260531009121879044170766615232300936675369451260747671432073394867530820527479172464106442450727640226503746586340279816318821395210726268291535648506190714616083163403189943334431056876038286530365757187367147446004855912033137386225053275419626102417236133948503",
- "1095121115716677802856811290392395128588168592409109494900178008967955253005183831872715423151551999734857184538199864469605657805519106717529655044054833197687459782636297255219742994736751541815269727940751860670268774903340296040006114013971309257028332849679096824800250742691718610670812374272414086863715763724622797509437062518082383056050144624962776302147890521249477060215148275163688301275847155316042279405557632639366066847442861422164832655874655824221577849928863023018366835675399949740429332468186340518172487073360822220449055340582568461568645259954873303616953776393853174845132081121976327462740354930744487429617202585015510744298530101547706821590188733515880733527449780963163909830077616357506845523215289297624086914545378511082534229620116563260168494523906566709418166011112754529766183554579321224940951177394088465596712620076240067370589036924024728375076210477267488679008016579588696191194060127319035195370137160936882402244399699172017835144537488486396906144217720028992863941288217185353914991583400421682751000603596655790990815525126154394344641336397793791497068253936771017031980867706707490224041075826337383538651825493679503771934836094655802776331664261631740148281763487765852746577808019633679",
-
- /* generic unrestricted moduli */
- "17933601194860113372237070562165128350027320072176844226673287945873370751245439587792371960615073855669274087805055507977323024886880985062002853331424203",
- "2893527720709661239493896562339544088620375736490408468011883030469939904368086092336458298221245707898933583190713188177399401852627749210994595974791782790253946539043962213027074922559572312141181787434278708783207966459019479487",
- "347743159439876626079252796797422223177535447388206607607181663903045907591201940478223621722118173270898487582987137708656414344685816179420855160986340457973820182883508387588163122354089264395604796675278966117567294812714812796820596564876450716066283126720010859041484786529056457896367683122960411136319",
- "47266428956356393164697365098120418976400602706072312735924071745438532218237979333351774907308168340693326687317443721193266215155735814510792148768576498491199122744351399489453533553203833318691678263241941706256996197460424029012419012634671862283532342656309677173602509498417976091509154360039893165037637034737020327399910409885798185771003505320583967737293415979917317338985837385734747478364242020380416892056650841470869294527543597349250299539682430605173321029026555546832473048600327036845781970289288898317888427517364945316709081173840186150794397479045034008257793436817683392375274635794835245695887",
- "436463808505957768574894870394349739623346440601945961161254440072143298152040105676491048248110146278752857839930515766167441407021501229924721335644557342265864606569000117714935185566842453630868849121480179691838399545644365571106757731317371758557990781880691336695584799313313687287468894148823761785582982549586183756806449017542622267874275103877481475534991201849912222670102069951687572917937634467778042874315463238062009202992087620963771759666448266532858079402669920025224220613419441069718482837399612644978839925207109870840278194042158748845445131729137117098529028886770063736487420613144045836803985635654192482395882603511950547826439092832800532152534003936926017612446606135655146445620623395788978726744728503058670046885876251527122350275750995227",
- "11424167473351836398078306042624362277956429440521137061889702611766348760692206243140413411077394583180726863277012016602279290144126785129569474909173584789822341986742719230331946072730319555984484911716797058875905400999504305877245849119687509023232790273637466821052576859232452982061831009770786031785669030271542286603956118755585683996118896215213488875253101894663403069677745948305893849505434201763745232895780711972432011344857521691017896316861403206449421332243658855453435784006517202894181640562433575390821384210960117518650374602256601091379644034244332285065935413233557998331562749140202965844219336298970011513882564935538704289446968322281451907487362046511461221329799897350993370560697505809686438782036235372137015731304779072430260986460269894522159103008260495503005267165927542949439526272736586626709581721032189532726389643625590680105784844246152702670169304203783072275089194754889511973916207",
- "1214855636816562637502584060163403830270705000634713483015101384881871978446801224798536155406895823305035467591632531067547890948695117172076954220727075688048751022421198712032848890056357845974246560748347918630050853933697792254955890439720297560693579400297062396904306270145886830719309296352765295712183040773146419022875165382778007040109957609739589875590885701126197906063620133954893216612678838507540777138437797705602453719559017633986486649523611975865005712371194067612263330335590526176087004421363598470302731349138773205901447704682181517904064735636518462452242791676541725292378925568296858010151852326316777511935037531017413910506921922450666933202278489024521263798482237150056835746454842662048692127173834433089016107854491097456725016327709663199738238442164843147132789153725513257167915555162094970853584447993125488607696008169807374736711297007473812256272245489405898470297178738029484459690836250560495461579533254473316340608217876781986188705928270735695752830825527963838355419762516246028680280988020401914551825487349990306976304093109384451438813251211051597392127491464898797406789175453067960072008590614886532333015881171367104445044718144312416815712216611576221546455968770801413440778423979",
- NULL
+ /* 2K large moduli */
+ "179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586239334100047359817950870678242457666208137217",
+ "32317006071311007300714876688669951960444102669715484032130345427524655138867890893197201411522913463688717960921898019494119559150490921095088152386448283120630877367300996091750197750389652106796057638384067568276792218642619756161838094338476170470581645852036305042887575891541065808607552399123930385521914333389668342420684974786564569494856176035326322058077805659331026192708460314150258592864177116725943603718461857357598351152301645904403697613233287231227125684710820209725157101726931323469678542580656697935045997268352998638099733077152121140120031150424541696791951097529546801429027668869927491725169",
+ "1044388881413152506691752710716624382579964249047383780384233483283953907971557456848826811934997558340890106714439262837987573438185793607263236087851365277945956976543709998340361590134383718314428070011855946226376318839397712745672334684344586617496807908705803704071284048740118609114467977783598029006686938976881787785946905630190260940599579453432823469303026696443059025015972399867714215541693835559885291486318237914434496734087811872639496475100189041349008417061675093668333850551032972088269550769983616369411933015213796825837188091833656751221318492846368125550225998300412344784862595674492194617023806505913245610825731835380087608622102834270197698202313169017678006675195485079921636419370285375124784014907159135459982790513399611551794271106831134090584272884279791554849782954323534517065223269061394905987693002122963395687782878948440616007412945674919823050571642377154816321380631045902916136926708342856440730447899971901781465763473223850267253059899795996090799469201774624817718449867455659250178329070473119433165550807568221846571746373296884912819520317457002440926616910874148385078411929804522981857338977648103126085902995208257421855249796721729039744118165938433694823325696642096892124547425283",
+ /* 2K moduli mersenne primes */
+ "6864797660130609714981900799081393217269435300143305409394463459185543183397656052122559640661454554977296311391480858037121987999716643812574028291115057151",
+ "531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127",
+ "10407932194664399081925240327364085538615262247266704805319112350403608059673360298012239441732324184842421613954281007791383566248323464908139906605677320762924129509389220345773183349661583550472959420547689811211693677147548478866962501384438260291732348885311160828538416585028255604666224831890918801847068222203140521026698435488732958028878050869736186900714720710555703168729087",
+ "1475979915214180235084898622737381736312066145333169775147771216478570297878078949377407337049389289382748507531496480477281264838760259191814463365330269540496961201113430156902396093989090226259326935025281409614983499388222831448598601834318536230923772641390209490231836446899608210795482963763094236630945410832793769905399982457186322944729636418890623372171723742105636440368218459649632948538696905872650486914434637457507280441823676813517852099348660847172579408422316678097670224011990280170474894487426924742108823536808485072502240519452587542875349976558572670229633962575212637477897785501552646522609988869914013540483809865681250419497686697771007",
+ "259117086013202627776246767922441530941818887553125427303974923161874019266586362086201209516800483406550695241733194177441689509238807017410377709597512042313066624082916353517952311186154862265604547691127595848775610568757931191017711408826252153849035830401185072116424747461823031471398340229288074545677907941037288235820705892351068433882986888616658650280927692080339605869308790500409503709875902119018371991620994002568935113136548829739112656797303241986517250116412703509705427773477972349821676443446668383119322540099648994051790241624056519054483690809616061625743042361721863339415852426431208737266591962061753535748892894599629195183082621860853400937932839420261866586142503251450773096274235376822938649407127700846077124211823080804139298087057504713825264571448379371125032081826126566649084251699453951887789613650248405739378594599444335231188280123660406262468609212150349937584782292237144339628858485938215738821232393687046160677362909315071",
+ "190797007524439073807468042969529173669356994749940177394741882673528979787005053706368049835514900244303495954950709725762186311224148828811920216904542206960744666169364221195289538436845390250168663932838805192055137154390912666527533007309292687539092257043362517857366624699975402375462954490293259233303137330643531556539739921926201438606439020075174723029056838272505051571967594608350063404495977660656269020823960825567012344189908927956646011998057988548630107637380993519826582389781888135705408653045219655801758081251164080554609057468028203308718724654081055323215860189611391296030471108443146745671967766308925858547271507311563765171008318248647110097614890313562856541784154881743146033909602737947385055355960331855614540900081456378659068370317267696980001187750995491090350108417050917991562167972281070161305972518044872048331306383715094854938415738549894606070722584737978176686422134354526989443028353644037187375385397838259511833166416134323695660367676897722287918773420968982326089026150031515424165462111337527431154890666327374921446276833564519776797633875503548665093914556482031482248883127023777039667707976559857333357013727342079099064400455741830654320379350833236245819348824064783585692924881021978332974949906122664421376034687815350484991",
+
+ /* DR moduli */
+ "14059105607947488696282932836518693308967803494693489478439861164411992439598399594747002144074658928593502845729752797260025831423419686528151609940203368612079",
+ "101745825697019260773923519755878567461315282017759829107608914364075275235254395622580447400994175578963163918967182013639660669771108475957692810857098847138903161308502419410142185759152435680068435915159402496058513611411688900243039",
+ "736335108039604595805923406147184530889923370574768772191969612422073040099331944991573923112581267542507986451953227192970402893063850485730703075899286013451337291468249027691733891486704001513279827771740183629161065194874727962517148100775228363421083691764065477590823919364012917984605619526140821797602431",
+ "38564998830736521417281865696453025806593491967131023221754800625044118265468851210705360385717536794615180260494208076605798671660719333199513807806252394423283413430106003596332513246682903994829528690198205120921557533726473585751382193953592127439965050261476810842071573684505878854588706623484573925925903505747545471088867712185004135201289273405614415899438276535626346098904241020877974002916168099951885406379295536200413493190419727789712076165162175783",
+ "542189391331696172661670440619180536749994166415993334151601745392193484590296600979602378676624808129613777993466242203025054573692562689251250471628358318743978285860720148446448885701001277560572526947619392551574490839286458454994488665744991822837769918095117129546414124448777033941223565831420390846864429504774477949153794689948747680362212954278693335653935890352619041936727463717926744868338358149568368643403037768649616778526013610493696186055899318268339432671541328195724261329606699831016666359440874843103020666106568222401047720269951530296879490444224546654729111504346660859907296364097126834834235287147",
+ "1487259134814709264092032648525971038895865645148901180585340454985524155135260217788758027400478312256339496385275012465661575576202252063145698732079880294664220579764848767704076761853197216563262660046602703973050798218246170835962005598561669706844469447435461092542265792444947706769615695252256130901271870341005768912974433684521436211263358097522726462083917939091760026658925757076733484173202927141441492573799914240222628795405623953109131594523623353044898339481494120112723445689647986475279242446083151413667587008191682564376412347964146113898565886683139407005941383669325997475076910488086663256335689181157957571445067490187939553165903773554290260531009121879044170766615232300936675369451260747671432073394867530820527479172464106442450727640226503746586340279816318821395210726268291535648506190714616083163403189943334431056876038286530365757187367147446004855912033137386225053275419626102417236133948503",
+ "1095121115716677802856811290392395128588168592409109494900178008967955253005183831872715423151551999734857184538199864469605657805519106717529655044054833197687459782636297255219742994736751541815269727940751860670268774903340296040006114013971309257028332849679096824800250742691718610670812374272414086863715763724622797509437062518082383056050144624962776302147890521249477060215148275163688301275847155316042279405557632639366066847442861422164832655874655824221577849928863023018366835675399949740429332468186340518172487073360822220449055340582568461568645259954873303616953776393853174845132081121976327462740354930744487429617202585015510744298530101547706821590188733515880733527449780963163909830077616357506845523215289297624086914545378511082534229620116563260168494523906566709418166011112754529766183554579321224940951177394088465596712620076240067370589036924024728375076210477267488679008016579588696191194060127319035195370137160936882402244399699172017835144537488486396906144217720028992863941288217185353914991583400421682751000603596655790990815525126154394344641336397793791497068253936771017031980867706707490224041075826337383538651825493679503771934836094655802776331664261631740148281763487765852746577808019633679",
+
+ /* generic unrestricted moduli */
+ "17933601194860113372237070562165128350027320072176844226673287945873370751245439587792371960615073855669274087805055507977323024886880985062002853331424203",
+ "2893527720709661239493896562339544088620375736490408468011883030469939904368086092336458298221245707898933583190713188177399401852627749210994595974791782790253946539043962213027074922559572312141181787434278708783207966459019479487",
+ "347743159439876626079252796797422223177535447388206607607181663903045907591201940478223621722118173270898487582987137708656414344685816179420855160986340457973820182883508387588163122354089264395604796675278966117567294812714812796820596564876450716066283126720010859041484786529056457896367683122960411136319",
+ "47266428956356393164697365098120418976400602706072312735924071745438532218237979333351774907308168340693326687317443721193266215155735814510792148768576498491199122744351399489453533553203833318691678263241941706256996197460424029012419012634671862283532342656309677173602509498417976091509154360039893165037637034737020327399910409885798185771003505320583967737293415979917317338985837385734747478364242020380416892056650841470869294527543597349250299539682430605173321029026555546832473048600327036845781970289288898317888427517364945316709081173840186150794397479045034008257793436817683392375274635794835245695887",
+ "436463808505957768574894870394349739623346440601945961161254440072143298152040105676491048248110146278752857839930515766167441407021501229924721335644557342265864606569000117714935185566842453630868849121480179691838399545644365571106757731317371758557990781880691336695584799313313687287468894148823761785582982549586183756806449017542622267874275103877481475534991201849912222670102069951687572917937634467778042874315463238062009202992087620963771759666448266532858079402669920025224220613419441069718482837399612644978839925207109870840278194042158748845445131729137117098529028886770063736487420613144045836803985635654192482395882603511950547826439092832800532152534003936926017612446606135655146445620623395788978726744728503058670046885876251527122350275750995227",
+ "11424167473351836398078306042624362277956429440521137061889702611766348760692206243140413411077394583180726863277012016602279290144126785129569474909173584789822341986742719230331946072730319555984484911716797058875905400999504305877245849119687509023232790273637466821052576859232452982061831009770786031785669030271542286603956118755585683996118896215213488875253101894663403069677745948305893849505434201763745232895780711972432011344857521691017896316861403206449421332243658855453435784006517202894181640562433575390821384210960117518650374602256601091379644034244332285065935413233557998331562749140202965844219336298970011513882564935538704289446968322281451907487362046511461221329799897350993370560697505809686438782036235372137015731304779072430260986460269894522159103008260495503005267165927542949439526272736586626709581721032189532726389643625590680105784844246152702670169304203783072275089194754889511973916207",
+ "1214855636816562637502584060163403830270705000634713483015101384881871978446801224798536155406895823305035467591632531067547890948695117172076954220727075688048751022421198712032848890056357845974246560748347918630050853933697792254955890439720297560693579400297062396904306270145886830719309296352765295712183040773146419022875165382778007040109957609739589875590885701126197906063620133954893216612678838507540777138437797705602453719559017633986486649523611975865005712371194067612263330335590526176087004421363598470302731349138773205901447704682181517904064735636518462452242791676541725292378925568296858010151852326316777511935037531017413910506921922450666933202278489024521263798482237150056835746454842662048692127173834433089016107854491097456725016327709663199738238442164843147132789153725513257167915555162094970853584447993125488607696008169807374736711297007473812256272245489405898470297178738029484459690836250560495461579533254473316340608217876781986188705928270735695752830825527963838355419762516246028680280988020401914551825487349990306976304093109384451438813251211051597392127491464898797406789175453067960072008590614886532333015881171367104445044718144312416815712216611576221546455968770801413440778423979",
+ NULL
};
- log = fopen("logs/expt.log", "w");
- logb = fopen("logs/expt_dr.log", "w");
- logc = fopen("logs/expt_2k.log", "w");
- for (n = 0; primes[n]; n++) {
- SLEEP;
- mp_read_radix(&a, primes[n], 10);
- mp_zero(&b);
- for (rr = 0; rr < (unsigned)mp_count_bits(&a); rr++) {
- mp_mul_2(&b, &b);
- b.dp[0] |= lbit();
- b.used += 1;
- }
- mp_sub_d(&a, 1, &c);
- mp_mod(&b, &c, &b);
- mp_set(&c, 3);
- rr = 0;
- tt = -1;
- do {
- gg = TIMFUNC();
- DO(mp_exptmod(&c, &b, &a, &d));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
- } while (++rr < 10);
- mp_sub_d(&a, 1, &e);
- mp_sub(&e, &b, &b);
- mp_exptmod(&c, &b, &a, &e); /* c^(p-1-b) mod a */
- mp_mulmod(&e, &d, &a, &d); /* c^b * c^(p-1-b) == c^p-1 == 1 */
- if (mp_cmp_d(&d, 1)) {
- printf("Different (%d)!!!\n", mp_count_bits(&a));
- draw(&d);
- exit(0);
+ log = fopen("logs/expt.log", "w");
+ logb = fopen("logs/expt_dr.log", "w");
+ logc = fopen("logs/expt_2k.log", "w");
+ logd = fopen("logs/expt_2kl.log", "w");
+ for (n = 0; primes[n]; n++) {
+ SLEEP;
+ mp_read_radix(&a, primes[n], 10);
+ mp_zero(&b);
+ for (rr = 0; rr < (unsigned) mp_count_bits(&a); rr++) {
+ mp_mul_2(&b, &b);
+ b.dp[0] |= lbit();
+ b.used += 1;
+ }
+ mp_sub_d(&a, 1, &c);
+ mp_mod(&b, &c, &b);
+ mp_set(&c, 3);
+ rr = 0;
+ tt = -1;
+ do {
+ gg = TIMFUNC();
+ DO(mp_exptmod(&c, &b, &a, &d));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
+ } while (++rr < 10);
+ mp_sub_d(&a, 1, &e);
+ mp_sub(&e, &b, &b);
+ mp_exptmod(&c, &b, &a, &e); /* c^(p-1-b) mod a */
+ mp_mulmod(&e, &d, &a, &d); /* c^b * c^(p-1-b) == c^p-1 == 1 */
+ if (mp_cmp_d(&d, 1)) {
+ printf("Different (%d)!!!\n", mp_count_bits(&a));
+ draw(&d);
+ exit(0);
+ }
+ printf("Exponentiating\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(n < 4 ? logd : (n < 9) ? logc : (n < 16) ? logb : log,
+ "%d %9llu\n", mp_count_bits(&a), tt);
}
- printf("Exponentiating\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf((n < 6) ? logc : (n < 13) ? logb : log, "%d %9llu\n", mp_count_bits(&a), tt);
- }
}
fclose(log);
fclose(logb);
fclose(logc);
+ fclose(logd);
log = fopen("logs/invmod.log", "w");
for (cnt = 4; cnt <= 128; cnt += 4) {
@@ -260,28 +287,29 @@ int main(void)
mp_rand(&b, cnt);
do {
- mp_add_d(&b, 1, &b);
- mp_gcd(&a, &b, &c);
+ mp_add_d(&b, 1, &b);
+ mp_gcd(&a, &b, &c);
} while (mp_cmp_d(&c, 1) != MP_EQ);
- rr = 0;
- tt = -1;
+ rr = 0;
+ tt = -1;
do {
- gg = TIMFUNC();
- DO(mp_invmod(&b, &a, &c));
- gg = (TIMFUNC() - gg)>>1;
- if (tt > gg) tt = gg;
+ gg = TIMFUNC();
+ DO(mp_invmod(&b, &a, &c));
+ gg = (TIMFUNC() - gg) >> 1;
+ if (tt > gg)
+ tt = gg;
} while (++rr < 1000);
mp_mulmod(&b, &c, &a, &d);
if (mp_cmp_d(&d, 1) != MP_EQ) {
- printf("Failed to invert\n");
- return 0;
+ printf("Failed to invert\n");
+ return 0;
}
- printf("Inverting mod\t%4d-bit => %9llu/sec, %9llu cycles\n", mp_count_bits(&a), CLK_PER_SEC/tt, tt);
- fprintf(log, "%d %9llu\n", cnt*DIGIT_BIT, tt);
+ printf("Inverting mod\t%4d-bit => %9llu/sec, %9llu cycles\n",
+ mp_count_bits(&a), CLK_PER_SEC / tt, tt);
+ fprintf(log, "%d %9llu\n", cnt * DIGIT_BIT, tt);
}
fclose(log);
return 0;
}
-
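Note on the exponentiation loop in demo/timing.c above: each modulus is timed ten times and the smallest cycle count is kept, then the result is checked with the identity c^b * c^(p-1-b) == c^(p-1) == 1 (mod p), which holds for prime p not dividing c. The following is only a minimal sketch of that pattern on 64-bit integers, assuming p < 2^32 so products do not overflow. The names `mulmod_u64`, `powmod_u64`, `fermat_check` and `best_of` are hypothetical and not part of libtommath, and `clock()` stands in for the TIMFUNC cycle counter.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: (a * b) mod m, safe from overflow when m < 2^32. */
static uint64_t mulmod_u64(uint64_t a, uint64_t b, uint64_t m)
{
    return ((a % m) * (b % m)) % m;
}

/* Square-and-multiply modular exponentiation, assuming m < 2^32. */
static uint64_t powmod_u64(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = mulmod_u64(result, base, m);
        base = mulmod_u64(base, base, m);
        exp >>= 1;
    }
    return result;
}

/* Returns 1 when c^b * c^(p-1-b) == 1 (mod p), the same self-check the
 * benchmark applies. Expects 0 <= b <= p-1 and p < 2^32. */
static int fermat_check(uint64_t c, uint64_t b, uint64_t p)
{
    uint64_t d = powmod_u64(c, b, p);          /* c^b mod p       */
    uint64_t e = powmod_u64(c, p - 1 - b, p);  /* c^(p-1-b) mod p */
    return mulmod_u64(e, d, p) == 1;
}

/* Best-of-`runs` timing: run the exponentiation `runs` times and keep the
 * smallest elapsed tick count. The minimum starts at the largest unsigned
 * value, mirroring `tt = -1` on an unsigned counter in the diff. */
static uint64_t best_of(unsigned runs, uint64_t c, uint64_t b, uint64_t p,
                        uint64_t *out)
{
    uint64_t best = UINT64_MAX;
    unsigned i;

    assert(runs > 0);
    for (i = 0; i < runs; i++) {
        clock_t start = clock();
        uint64_t elapsed;

        *out = powmod_u64(c, b, p);
        elapsed = (uint64_t)(clock() - start);
        if (best > elapsed)
            best = elapsed;
    }
    return best;
}
```

A caller would invoke `best_of(10, 3, b, p, &d)` and then `fermat_check(3, b, p)`, reporting a failure when the check returns 0, much as the loop above prints "Different" and exits.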
2  dep.pl
@@ -13,6 +13,8 @@
foreach my $filename (glob "bn*.c") {
my $define = $filename;
+print "Processing $filename\n";
+
# convert filename to upper case so we can use it as a define
$define =~ tr/[a-z]/[A-Z]/;
$define =~ tr/\./_/;
35 etc/tune.c
@@ -10,13 +10,44 @@
*/
#define TIMES (1UL<<14UL)
+/* RDTSC from Scott Duplichan */
+static ulong64 TIMFUNC (void)
+ {
+ #if defined __GNUC__
+ #if defined(__i386__) || defined(__x86_64__)
+ unsigned long long a;
+ __asm__ __volatile__ ("rdtsc\nmovl %%eax,%0\nmovl %%edx,4+%0\n"::"m"(a):"%eax","%edx");
+ return a;
+ #else /* gcc-IA64 version */
+ unsigned long result;
+ __asm__ __volatile__("mov %0=ar.itc" : "=r"(result) :: "memory");
+ while (__builtin_expect ((int) result == -1, 0))
+ __asm__ __volatile__("mov %0=ar.itc" : "=r"(result) :: "memory");
+ return result;
+ #endif
+
+ // Microsoft and Intel Windows compilers
+ #elif defined _M_IX86
+ __asm rdtsc