Discrete log #130

b-mehta · 2018-08-22T05:13:39Z

Discrete logarithms and Dirichlet characters. Resolves #88. In roughly this order, this is my plan for full efficient implementation of both.

Added MultMod type to represent elements of the multiplicative group mod m, with Monoid instance. Added PrimitiveRoot type to represent primitive roots of the (cyclic) multiplicative group.
Note these two types together make discreteLogarithm a total function. Partial constructors have been added and deconstructors. (I would like feedback on the name MultMod, I feel a better name is possible.)
Amended primitive root code, testing, benching with the new type structures.
Very simple discrete log code, with test and bench.
Baby-step giant-step method for discrete logs.
Additional testing and benching for discrete logs, with prime power cases included also.
Refactor discrete logs to be efficient on prime powers, using Bach's method described here and here. (Also use the canonical isomorphism for double prime powers).
Implement Pollard's rho algorithm for discrete logs with prime moduli (Modular equations #129 would be helpful for this). Notably, Pollard's rho is a Las Vegas algorithm (which is why @Taneb's attempt in (WIP) Discrete logarithm #110 did not always return success) so may need to run multiple times - experiment with starting inputs.

Adapted from Taneb/arithmoi.

Bodigrim · 2018-08-22T07:34:18Z

Cool! I'll take a look by the end of the week.

Bodigrim · 2018-08-22T12:17:12Z

Math/NumberTheory/Moduli/Class.hs

@@ -180,6 +191,39 @@ infixr 8 ^%
 -- of type classes in Core.
 -- {-# RULES "^%Mod" forall (x :: KnownNat m => Mod m) p. x ^ p = x ^% p #-}

+-- | This type represents elements of the multiplicative group mod m, i.e.
+-- those elemets which are coprime to m. Use @toMultElement@ to construct.


Typo, "elements".

Thanks for this catch!

Bodigrim · 2018-08-22T12:20:00Z

Math/NumberTheory/Moduli/Class.hs

+                     else Nothing
+
+-- | Powers can be taken without leaving the multiplicative group.
+powMultMod :: (KnownNat m, Integral a) => MultMod m -> a -> MultMod m


We'd rather implement Data.Semigroup.stimes method.

Sure, sounds reasonable.

Bodigrim · 2018-08-22T12:21:56Z

Math/NumberTheory/Moduli/Class.hs

+-- | For elements of the multiplicative group, we can safely perform the inverse
+-- without needing to worry about failure.
+invertGroup :: KnownNat m => MultMod m -> MultMod m
+invertGroup = MultMod . fromJust . invertMod . multElement


It should be safe, but still it is better to pattern-match by Nothing manually and throw a custom error, which explicitly mentions invertGroup as a source.

Alright, sounds good, fixed.

Bodigrim · 2018-08-22T12:23:19Z

Math/NumberTheory/Moduli/DiscreteLogarithm.hs

+                            table = M.fromList $ zip babies [0..(m-1)]
+                            bigGiant = powMultMod a' (- fromIntegral m)
+                            giants = fromInteger . getVal . multElement <$> iterate (<> bigGiant) b
+                         in head [i*m + j | (v,i) <- zip giants [0..(m-1)], j <- maybeToList $ M.lookup v table]


Let us convert lets into a where-clause.

My hope is to replace this with a more space-efficient implementation with Pollard's rho, so hopefully this function should not be in the merge and so if it's okay I'll leave this as is for now.
In the future I'll make sure to use where clauses in similar scenarios though!

Bodigrim · 2018-08-23T13:16:12Z

Implement Pollard's rho algorithm for discrete logs with prime moduli (#129 would be helpful for this)

I pushed solveLinear into master.

b-mehta · 2018-08-23T18:35:30Z

Great, thanks!

Bodigrim · 2018-08-25T18:51:54Z

Math/NumberTheory/Moduli/Class.hs

@@ -180,6 +187,39 @@ infixr 8 ^%
 -- of type classes in Core.
 -- {-# RULES "^%Mod" forall (x :: KnownNat m => Mod m) p. x ^ p = x ^% p #-}

+-- | This type represents elements of the multiplicative group mod m, i.e.
+-- those elements which are coprime to m. Use @toMultElement@ to construct.
+newtype MultMod m = MultMod { multElement :: Mod m }


I've been never good with names :)
Maybe MultGroupMod?

Bodigrim · 2018-08-25T18:53:16Z

Math/NumberTheory/Moduli/Class.hs

+
+-- | Attempt to construct a multiplicative group element.
+isMultElement :: KnownNat m => Mod m -> Maybe (MultMod m)
+isMultElement a = if getVal a `gcd` getMod a == 1


It is better to use getNatVal and getNatMod respectively.

Sure, but out of interest why are these preferable?

The internal representation of Mod m uses Natural both for value and for modulo.

Bodigrim · 2018-08-25T18:55:01Z

Math/NumberTheory/Moduli/DiscreteLogarithm.hs

+-- discreteLogarithm a b = let n = prefValue . groupSize . getGroup $ a
+--                             a' = unPrimitiveRoot a
+--                             vals = genericTake n $ iterate (<> a') mempty
+--                          in fromIntegral $ fromJust $ elemIndex b vals


Remove commented lines?

Bodigrim · 2018-08-25T18:55:07Z

Math/NumberTheory/Moduli/DiscreteLogarithm.hs

+
+import Data.Semigroup
+import Data.Maybe
+-- import Data.List


Remove commented lines?

Bodigrim · 2018-08-25T18:56:15Z

Math/NumberTheory/Moduli/PrimitiveRoot.hs

+data PrimitiveRoot m =
+  PrimitiveRoot { unPrimitiveRoot :: MultMod m -- ^ Extract primitive root value.
+                , getGroup        :: CyclicGroup Natural -- ^ Get cyclic group structure.
+                }


My personal preference is to put constructor on the same line with data and list fields indented by two spaces.

Bodigrim · 2018-08-25T18:56:50Z

benchmark/Math/NumberTheory/DiscreteLogarithmBench.hs

+import Gauge.Main
+import Data.Maybe
+
+-- import Math.NumberTheory.Moduli (Mod, discreteLogarithm, PrimitiveRoot, isPrimitiveRoot, MultMod, isMultElement)


Remove commented lines?

test-suite/Math/NumberTheory/Moduli/DiscreteLogarithmTests.hs

Bodigrim · 2018-08-25T18:58:31Z

test-suite/Math/NumberTheory/Moduli/PrimitiveRootTests.hs

    InfMod{}   -> False

 isPrimitiveRootProperty2 :: Positive Natural -> Bool
 isPrimitiveRootProperty2 (Positive m)
  = isNothing (cyclicGroupFromModulo m)
  || case someNatVal m of
-    SomeNat (_ :: Proxy t) -> any isPrimitiveRoot [(minBound :: Mod t) .. maxBound]
+    SomeNat (_ :: Proxy t) -> not $ null $ mapMaybe isPrimitiveRoot [(minBound :: Mod t) .. maxBound]


I'd rather write any (isJust . isPrimitiveRoot).

b-mehta · 2018-08-30T17:54:53Z

The Bach reduction gave a large speedup for prime powers: the case 2^n = 7 mod 3^20 went from around 41ms to 3.5µs and 3^n = 17 mod 5^16 went from 549ms to 3.8µs. Range benchmarks did not change significantly, which is expected since prime powers are rare compared to primes in general. I'll modify benchmarks for prime powers now, using larger powers to make up for this speedup.

b-mehta · 2018-09-01T02:05:30Z

I've changed some things, removing some of the rem calls. When I benchmark, using m * 2 instead of shiftL is noticeably faster, so I've kept it. Since Pollard reduces to identical code as the tuples, I've left that as is also.

The calls to mod in line 92 may not be strictly necessary - I originally kept them because solveLinear' was not exported and so I wanted to make sure the input was as expected. Also, it is likely that the mod call will reduce arguments into a smaller range, so the function may run faster, but I'm happy to remove them if necessary.

Bodigrim · 2018-09-01T09:34:12Z

solveLinear' should work fine for unreduced arguments. Even without reduction by modulo bi-b2i and ai-b2i are O(n) with probability close to 1, so there should not be any performance hit. I suggest to remove mod from line 92 for the sake of clarity.

Experimentally, the baby-step giant-step algorithm performs better on moduli below 10^8, and Pollard runs better for large moduli. In addition, this means BSGS can use Ints internally.

Pollard's rho is a Las Vegas algorithm, with a negligible probability of failure for a given input.

Bodigrim · 2018-09-09T18:11:52Z

Awesome. I greatly appreciate your work.

Does it make sense to split this PR into two? The first one about discrete log (which is ready to merge, up to my understanding) and the second (future) one about Dirichlet characters. What is your opinion?

b-mehta · 2018-09-09T18:41:06Z

I think this is a good idea - in fact I'd like to split it into three PRs: The currently implemented work on discrete logs (which is ready to merge), possible speed ups on discrete logs (including Pollig-Hellman, which I'm investigating at the minute) and Dirichlet characters.

Something worth pointing out is that the algorithm is probabilistic, as I mentioned in the most recent commit, and so has a (negligible) probability of failing. Currently three starting inputs are used, which cubes the failure probability making it even more negligible, but still non-zero. Possible fixes are to allow all starting inputs on failure, so that the function is guaranteed not to fail, or just to use a small amount and hope for the best (cf line 87 in the last commit).
I'm testing how likely failure actually is in my branch here - so far for primes less than 7000, all possible instances of pollard's rho succeed by a super naive brute force search
What is your opinion on this?

(Unrelatedly, hope you enjoyed your vacation!)

Bodigrim · 2018-09-09T18:45:04Z

It would be nice to find via brute force at least one example, when a) implementation with only one seed fails to find an answer, b) implementation with two seeds fails, c) current implementation with three seeds fails. And add them to tests.

Can we resort to BSGS, when Pollard rho has failed with three seeds?

b-mehta · 2018-09-09T19:01:14Z

That would be nice, but I fear that the probability of failure is genuinely tiny - testing all primes less than 7000 represents over 5 billion instances of the discrete log problem, and no failure has yet occured.

Possibly, but BSGS uses IntMaps internally and right now BSGS is used anyway when the inputs are less than 10^8 so in that scenario, the values may not fit within an Int. In addition, if Maps are used instead, it is possible that massive maps are made and create memory issues instead. I'd be more comfortable allowing more seeds instead - since the performance hit of this will almost certainly be smaller than the performance hit of BSGS.

Bodigrim · 2018-09-10T18:36:27Z

I may be wrong, but my impression is that your implementation of Pollard is not probabilistic. AFAIU we say that Pollard algorithm failed for a given seed when the algorithm found a collision with bi == b2i, for which solveLinear' returns a full, unrestricted range. But in our case we just resort to brute force check-ing.

b-mehta · 2018-09-10T18:49:44Z

If bi == b2i, then solveLinear' attempts to solve 0x + a = 0, and returns the empty list, so we immediately fail.

Edit: My mistake, we additionally have that ai == a2i, so brute force checking does occur.

Bodigrim · 2018-09-10T19:20:01Z

Yep. For small arguments it happens quite often, which can be traced as follows:

    go tort@(xi,ai,bi) hare@(x2i,a2i,b2i)
      | xi == x2i
      , (bi - b2i) `mod` n == 0
      , traceShow (p, a, b) False
      = undefined
      | xi == x2i = solveLinear' n (bi - b2i) (ai - a2i)
      | otherwise = go (step tort) (step (step hare))

Let us have a long sequence of seeds, like [(a, b) | a <- [0..n], b <- [0..n]].
Let us proceed to solveLinear' only if gcd (bi - b2i) n is small (below sqrt n?). Otherwise take the next seed.

b-mehta · 2018-09-10T19:23:37Z

Right, I was doing exactly this. That approach sounds reasonable. Thanks for this catch!

Bodigrim · 2018-09-10T19:33:31Z

Math/NumberTheory/Moduli/DiscreteLogarithm.hs

+    n                 = p-1 -- order of the cyclic group
+    halfN             = n `quot` 2
+    mul2 m            = if m < halfN then m * 2 else m * 2 - n
+    step (xi,!ai,!bi) = case xi `rem` 3 of


Each step performs two divisions, which I expect to take a lot of time. I wonder whether we can avoid the first one. The algorithm is expected be robust to a somewhat unequal split into three subsets.

We can cast xi to Word and apply rem afterwards. At the very least we will avoid horribly expensive division of long Integer and in the best scenario compiler will apply this trick.

More aggressive approach is to skew distribution even further, looking at only 3 or 4 bits of xi. Something like this:

case fromInteger xi .&. (7 :: Word) of 0 -> (xi*xi `rem` p, mul2 ai, mul2 bi) 1 -> (xi*xi `rem` p, mul2 ai, mul2 bi) 2 -> ( a*xi `rem` p, ai+1, bi) 3 -> ( a*xi `rem` p, ai+1, bi) 4 -> ( a*xi `rem` p, ai+1, bi) _ -> ( b*xi `rem` p, ai, bi+1)

I like this idea! I recall reading that the expected time taken for Pollard can be improved to the expected random bound by using more partitions - I'm sure I read that tighter bounds were achieved with 20 but I can't seem to find that result now. With that in mind, it may be interesting to do something like fromInteger xi .&. (15 :: Word), and have 16 different cases.

In either case, would it be acceptable to take optimisation work like this into a different PR, and merge (something like) this one for now?

Something like f_T and f_C from (Speeding up pollard's rho method for discrete logarithms, Teske 1998), also described in 2.2.2 and 2.2.3 here, in which experimental performance bounds are described.

In either case, would it be acceptable to take optimisation work like this into a different PR, and merge (something like) this one for now?

Sure. Could you please add a -- TODO comment with your ideas and links? It is easy to lose them in the depth of a closed PR.

b-mehta · 2018-09-12T00:25:36Z

This should be good to go. Note that Math.NumberTheory.Moduli.Equations exports solveLinear' - hopefully this is acceptable @Bodigrim.

Bodigrim · 2018-09-16T14:02:43Z

Note that Math.NumberTheory.Moduli.Equations exports solveLinear' - hopefully this is acceptable

I've thought about it and decided to revert this new export in 473a5ef. An additional public function needs documentation, tests and is a commitment to support its interface. Being public, solveLinear' should deal somehow with negative modulo and zero modulo, it may also be reasonable to make it working for any Integral a instead of Integer only. That said, from the maintainer's viewpoint it is preferable to use solveLinear only.

b-mehta · 2018-09-16T16:24:58Z

Fair enough. Is an acceptable compromise to assert that the modulo must be positive, and that the precondition is not checked, similar to integerSquareRoot' from M.NT.Powers.Squares? That way users can still use solveLinear' for Integers without having to wrap values up using someNatVal.

Bodigrim · 2018-09-16T17:07:16Z

Unchecked preconditions are the root of all evil :) They are the worst plague of arithmoi API in its current state.

Currently I am tying up loose ends for an upcoming release; I'll be glad to return to solveLinear' question afterwards. Please raise an issue to facilitate discussion.

b-mehta · 2018-09-16T17:40:11Z

Unchecked preconditions are the root of all evil :) They are the worst plague of arithmoi API in its current state.

I see! Fair enough - I have no real complaints about the state of solveLinear' as it is.

b-mehta added 7 commits August 22, 2018 00:11

Add MultMod type

df50d98

Added PrimitiveRoot type and adjusted signatures

e4f6fe7

Amend import structure

1747210

Adjusted tests to new types

73e79af

Framework, test, bench for discrete logs

b81e174

Adapted from Taneb/arithmoi.

Added baby-step giant-step algorithm

75f4371

Removed redundant pragmas

31abbf7

Bodigrim reviewed Aug 22, 2018

View reviewed changes

b-mehta added 3 commits August 22, 2018 23:27

MultMod stimes and improved error message

de81a94

Import of Data.Semigroup for ghc>8.4 also

d8e45bc

import Semigroup in DiscreteLog too

ac3d84d

Merge branch 'master' into dirichlet-characters

3d35432

Additional tests for discrete logs [ci skip]

56b33ed

Bodigrim reviewed Aug 25, 2018

View reviewed changes

b-mehta added 10 commits August 25, 2018 21:02

Cosmetic changes

6423c6a

More benchmarks for discrete logs

8de36ca

Removed redundant comments and pragmas

e99b29c

Fix from review

ec5fa71

Basic refactoring of discrete log

a6a1c1f

Refactor further, use Natural internally

04a85cc

Individual tests of large cases

04119fe

Handle prime power case better

6dffb74

Cosmetic changes and explicit import lists

90102bb

Merge branch 'master' into dirichlet-characters

938d74f

b-mehta added 2 commits August 30, 2018 19:15

More careful imports to resolve cycle

0d78259

Remove redundant tests

72239de

Changes from review

d9953ec

b-mehta added 3 commits September 1, 2018 18:47

Use BSGS for small moduli and pollard otherwise

3942bcc

Experimentally, the baby-step giant-step algorithm performs better on moduli below 10^8, and Pollard runs better for large moduli. In addition, this means BSGS can use Ints internally.

Share computation of p^k

ed77e91

Probability of pollard failing is now much smaller

2ddd7e7

Pollard's rho is a Las Vegas algorithm, with a negligible probability of failure for a given input.

Merge branch 'master' into dirichlet-characters

f2386ab

Bodigrim reviewed Sep 10, 2018

View reviewed changes

b-mehta added 2 commits September 10, 2018 21:02

Changes from review

ac1e46a

Added todo comments

1393fdf

This was referenced Sep 12, 2018

Optimisation of discrete logarithms #136

Open

Dirichlet characters #119

Closed

Bodigrim changed the title ~~(WIP) Discrete log and Dirichlet characters~~ Discrete log Sep 12, 2018

Bodigrim merged commit dd59557 into Bodigrim:master Sep 12, 2018

Bodigrim mentioned this pull request Sep 12, 2018

(WIP) Discrete logarithm #110

Closed

b-mehta deleted the dirichlet-characters branch September 16, 2018 18:01

Bodigrim mentioned this pull request Dec 28, 2018

Improve return type of isPrimitiveRoot' #154

Closed

Discrete log #130

Discrete log #130

Conversation

b-mehta commented Aug 22, 2018 • edited by Bodigrim Loading

Bodigrim commented Aug 22, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Bodigrim commented Aug 23, 2018

b-mehta commented Aug 23, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

b-mehta commented Aug 30, 2018 • edited Loading

b-mehta commented Sep 1, 2018

Bodigrim commented Sep 1, 2018

Bodigrim commented Sep 9, 2018

b-mehta commented Sep 9, 2018 • edited Loading

Bodigrim commented Sep 9, 2018

b-mehta commented Sep 9, 2018 • edited Loading

Bodigrim commented Sep 10, 2018

b-mehta commented Sep 10, 2018 • edited Loading

Bodigrim commented Sep 10, 2018 • edited Loading

b-mehta commented Sep 10, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

b-mehta Sep 10, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

b-mehta commented Sep 12, 2018

Bodigrim commented Sep 16, 2018

b-mehta commented Sep 16, 2018

Bodigrim commented Sep 16, 2018

b-mehta commented Sep 16, 2018

b-mehta commented Aug 22, 2018 •

edited by Bodigrim

Loading

b-mehta commented Aug 30, 2018 •

edited

Loading

b-mehta commented Sep 9, 2018 •

edited

Loading

b-mehta commented Sep 9, 2018 •

edited

Loading

b-mehta commented Sep 10, 2018 •

edited

Loading

Bodigrim commented Sep 10, 2018 •

edited

Loading

b-mehta commented Sep 10, 2018 •

edited

Loading

b-mehta Sep 10, 2018 •

edited

Loading