Permalink
Browse files

added libtommath-0.34

  • Loading branch information...
1 parent 4b7111d commit 3d0fcaab0a2d411b541dc9df8abd0a6be2df268f Tom St Denis committed with sjaeckel Feb 12, 2005
View
BIN bn.pdf
Binary file not shown.
View
29 bn.tex
@@ -49,7 +49,7 @@
\begin{document}
\frontmatter
\pagestyle{empty}
-\title{LibTomMath User Manual \\ v0.33}
+\title{LibTomMath User Manual \\ v0.34}
\author{Tom St Denis \\ tomstdenis@iahu.ca}
\maketitle
This text, the library and the accompanying textbook are all hereby placed in the public domain. This book has been
@@ -263,12 +263,12 @@ \section{Purpose of LibTomMath}
\begin{center}
\begin{tabular}{|l|c|c|l|}
\hline \textbf{Criteria} & \textbf{Pro} & \textbf{Con} & \textbf{Notes} \\
-\hline Few lines of code per file & X & & GnuPG $ = 300.9$, LibTomMath $ = 76.04$ \\
+\hline Few lines of code per file & X & & GnuPG $ = 300.9$, LibTomMath $ = 71.97$ \\
\hline Commented function prototypes & X && GnuPG function names are cryptic. \\
\hline Speed && X & LibTomMath is slower. \\
\hline Totally free & X & & GPL has unfavourable restrictions.\\
\hline Large function base & X & & GnuPG is barebones. \\
-\hline Four modular reduction algorithms & X & & Faster modular exponentiation. \\
+\hline Five modular reduction algorithms & X & & Faster modular exponentiation for a variety of moduli. \\
\hline Portable & X & & GnuPG requires configuration to build. \\
\hline
\end{tabular}
@@ -284,9 +284,12 @@ \section{Purpose of LibTomMath}
So it may feel tempting to just rip the math code out of GnuPG (or GnuMP where it was taken from originally) in your
own application but I think there are reasons not to. While LibTomMath is slower than libraries such as GnuMP it is
not normally significantly slower. On x86 machines the difference is normally a factor of two when performing modular
-exponentiations.
+exponentiations. It depends largely on the processor, compiler and the moduli being used.
-Essentially the only time you wouldn't use LibTomMath is when blazing speed is the primary concern.
+Essentially the only time you wouldn't use LibTomMath is when blazing speed is the primary concern. However,
+on the other side of the coin LibTomMath offers you a totally free (public domain) well structured math library
+that is very flexible, complete and performs well in resource contrained environments. Fast RSA for example can
+be performed with as little as 8KB of ram for data (again depending on build options).
\chapter{Getting Started with LibTomMath}
\section{Building Programs}
@@ -809,7 +812,7 @@ \subsection{Unsigned comparison}
\index{mp\_cmp\_mag}
\begin{alltt}
-int mp_cmp(mp_int * a, mp_int * b);
+int mp_cmp_mag(mp_int * a, mp_int * b);
\end{alltt}
This will compare $a$ to $b$ placing $a$ to the left of $b$. This function cannot fail and will return one of the
three compare codes listed in figure \ref{fig:CMP}.
@@ -1220,12 +1223,13 @@ \section{Squaring}
\end{alltt}
Will square $a$ and store it in $b$. Like the case of multiplication there are four different squaring
-algorithms all which can be called from mp\_sqr(). It is ideal to use mp\_sqr over mp\_mul when squaring terms.
+algorithms all which can be called from mp\_sqr(). It is ideal to use mp\_sqr over mp\_mul when squaring terms because
+of the speed difference.
\section{Tuning Polynomial Basis Routines}
Both of the Toom-Cook and Karatsuba multiplication algorithms are faster than the traditional $O(n^2)$ approach that
-the Comba and baseline algorithms use. At $O(n^{1.464973})$ and $O(n^{1.584962})$ running times respectfully they require
+the Comba and baseline algorithms use. At $O(n^{1.464973})$ and $O(n^{1.584962})$ running times respectively they require
considerably less work. For example, a 10000-digit multiplication would take roughly 724,000 single precision
multiplications with Toom-Cook or 100,000,000 single precision multiplications with the standard Comba (a factor
of 138).
@@ -1297,22 +1301,22 @@ \section{Straight Division}
\section{Barrett Reduction}
Barrett reduction is a generic optimized reduction algorithm that requires pre--computation to achieve
-a decent speedup over straight division. First a $mu$ value must be precomputed with the following function.
+a decent speedup over straight division. First a $\mu$ value must be precomputed with the following function.
\index{mp\_reduce\_setup}
\begin{alltt}
int mp_reduce_setup(mp_int *a, mp_int *b);
\end{alltt}
-Given a modulus in $b$ this produces the required $mu$ value in $a$. For any given modulus this only has to
+Given a modulus in $b$ this produces the required $\mu$ value in $a$. For any given modulus this only has to
be computed once. Modular reduction can now be performed with the following.
\index{mp\_reduce}
\begin{alltt}
int mp_reduce(mp_int *a, mp_int *b, mp_int *c);
\end{alltt}
-This will reduce $a$ in place modulo $b$ with the precomputed $mu$ value in $c$. $a$ must be in the range
+This will reduce $a$ in place modulo $b$ with the precomputed $\mu$ value in $c$. $a$ must be in the range
$0 \le a < b^2$.
\begin{alltt}
@@ -1578,7 +1582,8 @@ \section{Root Finding}
This algorithm uses the ``Newton Approximation'' method and will converge on the correct root fairly quickly. Since
the algorithm requires raising $a$ to the power of $b$ it is not ideal to attempt to find roots for large
values of $b$. If particularly large roots are required then a factor method could be used instead. For example,
-$a^{1/16}$ is equivalent to $\left (a^{1/4} \right)^{1/4}$.
+$a^{1/16}$ is equivalent to $\left (a^{1/4} \right)^{1/4}$ or simply
+$\left ( \left ( \left ( a^{1/2} \right )^{1/2} \right )^{1/2} \right )^{1/2}$
\chapter{Prime Numbers}
\section{Trial Division}
View
@@ -21,8 +21,7 @@
* Based on slow invmod except this is optimized for the case where b is
* odd as per HAC Note 14.64 on pp. 610
*/
-int
-fast_mp_invmod (mp_int * a, mp_int * b, mp_int * c)
+int fast_mp_invmod (mp_int * a, mp_int * b, mp_int * c)
{
mp_int x, y, u, v, B, D;
int res, neg;
@@ -23,8 +23,7 @@
*
* Based on Algorithm 14.32 on pp.601 of HAC.
*/
-int
-fast_mp_montgomery_reduce (mp_int * x, mp_int * n, mp_digit rho)
+int fast_mp_montgomery_reduce (mp_int * x, mp_int * n, mp_digit rho)
{
int ix, res, olduse;
mp_word W[MP_WARRAY];
View
@@ -31,8 +31,7 @@
* Based on Algorithm 14.12 on pp.595 of HAC.
*
*/
-int
-fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
+int fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
{
int olduse, res, pa, ix, iz;
mp_digit W[MP_WARRAY];
@@ -81,7 +80,7 @@ fast_s_mp_mul_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
}
/* store final carry */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* setup dest */
olduse = c->used;
@@ -24,8 +24,7 @@
*
* Based on Algorithm 14.12 on pp.595 of HAC.
*/
-int
-fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
+int fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
{
int olduse, res, pa, ix, iz;
mp_digit W[MP_WARRAY];
@@ -72,7 +71,7 @@ fast_s_mp_mul_high_digs (mp_int * a, mp_int * b, mp_int * c, int digs)
}
/* store final carry */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* setup dest */
olduse = c->used;
View
@@ -101,7 +101,7 @@ int fast_s_mp_sqr (mp_int * a, mp_int * b)
}
/* store it */
- W[ix] = _W;
+ W[ix] = _W & MP_MASK;
/* make next carry */
W1 = _W >> ((mp_word)DIGIT_BIT);
View
@@ -65,29 +65,37 @@ int mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
#endif
}
+/* modified diminished radix reduction */
+#if defined(BN_MP_REDUCE_IS_2K_L_C) && defined(BN_MP_REDUCE_2K_L_C)
+ if (mp_reduce_is_2k_l(P) == MP_YES) {
+ return s_mp_exptmod(G, X, P, Y, 1);
+ }
+#endif
+
#ifdef BN_MP_DR_IS_MODULUS_C
/* is it a DR modulus? */
dr = mp_dr_is_modulus(P);
#else
+ /* default to no */
dr = 0;
#endif
#ifdef BN_MP_REDUCE_IS_2K_C
- /* if not, is it a uDR modulus? */
+ /* if not, is it a unrestricted DR modulus? */
if (dr == 0) {
dr = mp_reduce_is_2k(P) << 1;
}
#endif
- /* if the modulus is odd or dr != 0 use the fast method */
+ /* if the modulus is odd or dr != 0 use the montgomery method */
#ifdef BN_MP_EXPTMOD_FAST_C
if (mp_isodd (P) == 1 || dr != 0) {
return mp_exptmod_fast (G, X, P, Y, dr);
} else {
#endif
#ifdef BN_S_MP_EXPTMOD_C
/* otherwise use the generic Barrett reduction technique */
- return s_mp_exptmod (G, X, P, Y);
+ return s_mp_exptmod (G, X, P, Y, 0);
#else
/* no exptmod for evens */
return MP_VAL;
View
@@ -29,8 +29,7 @@
#define TAB_SIZE 256
#endif
-int
-mp_exptmod_fast (mp_int * G, mp_int * X, mp_int * P, mp_int * Y, int redmode)
+int mp_exptmod_fast (mp_int * G, mp_int * X, mp_int * P, mp_int * Y, int redmode)
{
mp_int M[TAB_SIZE], res;
mp_digit buf, mp;
View
@@ -57,8 +57,9 @@ mp_mul_d (mp_int * a, mp_digit b, mp_int * c)
u = (mp_digit) (r >> ((mp_word) DIGIT_BIT));
}
- /* store final carry [if any] */
+ /* store final carry [if any] and increment ix offset */
*tmpc++ = u;
+ ++ix;
/* now zero digits above the top */
while (ix++ < olduse) {
View
@@ -60,15 +60,15 @@ int mp_prime_random_ex(mp_int *a, int t, int size, int flags, ltm_prime_callback
/* calc the maskOR_msb */
maskOR_msb = 0;
- maskOR_msb_offset = (size - 2) >> 3;
+ maskOR_msb_offset = ((size & 7) == 1) ? 1 : 0;
if (flags & LTM_PRIME_2MSB_ON) {
maskOR_msb |= 1 << ((size - 2) & 7);
} else if (flags & LTM_PRIME_2MSB_OFF) {
maskAND &= ~(1 << ((size - 2) & 7));
}
/* get the maskOR_lsb */
- maskOR_lsb = 0;
+ maskOR_lsb = 1;
if (flags & LTM_PRIME_BBS) {
maskOR_lsb |= 3;
}
View
@@ -16,7 +16,7 @@
*/
/* read a string [ASCII] in a given radix */
-int mp_read_radix (mp_int * a, char *str, int radix)
+int mp_read_radix (mp_int * a, const char *str, int radix)
{
int y, res, neg;
char ch;
View
@@ -19,8 +19,7 @@
* precomputed via mp_reduce_setup.
* From HAC pp.604 Algorithm 14.42
*/
-int
-mp_reduce (mp_int * x, mp_int * m, mp_int * mu)
+int mp_reduce (mp_int * x, mp_int * m, mp_int * mu)
{
mp_int q;
int res, um = m->used;
View
@@ -16,8 +16,7 @@
*/
/* reduces a modulo n where n is of the form 2**p - d */
-int
-mp_reduce_2k(mp_int *a, mp_int *n, mp_digit d)
+int mp_reduce_2k(mp_int *a, mp_int *n, mp_digit d)
{
mp_int q;
int p, res;
View
@@ -0,0 +1,58 @@
+#include <tommath.h>
+#ifdef BN_MP_REDUCE_2K_L_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* reduces a modulo n where n is of the form 2**p - d
+ This differs from reduce_2k since "d" can be larger
+ than a single digit.
+*/
+int mp_reduce_2k_l(mp_int *a, mp_int *n, mp_int *d)
+{
+ mp_int q;
+ int p, res;
+
+ if ((res = mp_init(&q)) != MP_OKAY) {
+ return res;
+ }
+
+ p = mp_count_bits(n);
+top:
+ /* q = a/2**p, a = a mod 2**p */
+ if ((res = mp_div_2d(a, p, &q, a)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ /* q = q * d */
+ if ((res = mp_mul(&q, d, &q)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ /* a = a + q */
+ if ((res = s_mp_add(a, &q, a)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ if (mp_cmp_mag(a, n) != MP_LT) {
+ s_mp_sub(a, n, a);
+ goto top;
+ }
+
+ERR:
+ mp_clear(&q);
+ return res;
+}
+
+#endif
View
@@ -16,8 +16,7 @@
*/
/* determines the setup value */
-int
-mp_reduce_2k_setup(mp_int *a, mp_digit *d)
+int mp_reduce_2k_setup(mp_int *a, mp_digit *d)
{
int res, p;
mp_int tmp;
View
@@ -0,0 +1,40 @@
+#include <tommath.h>
+#ifdef BN_MP_REDUCE_2K_SETUP_L_C
+/* LibTomMath, multiple-precision integer library -- Tom St Denis
+ *
+ * LibTomMath is a library that provides multiple-precision
+ * integer arithmetic as well as number theoretic functionality.
+ *
+ * The library was designed directly after the MPI library by
+ * Michael Fromberger but has been written from scratch with
+ * additional optimizations in place.
+ *
+ * The library is free for all purposes without any express
+ * guarantee it works.
+ *
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
+ */
+
+/* determines the setup value */
+int mp_reduce_2k_setup_l(mp_int *a, mp_int *d)
+{
+ int res;
+ mp_int tmp;
+
+ if ((res = mp_init(&tmp)) != MP_OKAY) {
+ return res;
+ }
+
+ if ((res = mp_2expt(&tmp, mp_count_bits(a))) != MP_OKAY) {
+ goto ERR;
+ }
+
+ if ((res = s_mp_sub(&tmp, a, d)) != MP_OKAY) {
+ goto ERR;
+ }
+
+ERR:
+ mp_clear(&tmp);
+ return res;
+}
+#endif
Oops, something went wrong.

0 comments on commit 3d0fcaa

Please sign in to comment.