Factorisation #4859

Sventimir · 2020-05-14T17:35:19Z

This pull request adds Data.Nat.Factor module to contrib package. It defines notions of a factor (a natural number, which divides another number without a remainder), and common factors of two natural numbers. It proves several basic laws of factors and implmements a decision procedure as to whether a certain number is a factor of another number of not. Finally, the notion of a greatest common divisor is formally defined and Euclid's algorithm for finding GCDs is implemented and proven correct.

This is a major step towards implementing a decent Rational type, which will be my next goal.

nickdrozd · 2020-05-16T16:06:04Z

libs/contrib/Data/Nat/Factor.idr

+||| divisors. It is proven correct and total; returns a proof that computed
+||| number actually IS the GCD. Unfortunately it's very slow, so improvements
+||| in terms of efficiency would be welcome.
+gcd : (a, b : Nat) -> {auto aok : LTE 1 a} -> {auto bok : LTE 1 b} -> (f : Nat ** GCD a b f)


There is a gcd defined in Prelude.Nat

True. But that definition does not provide any proof that the returned number is indeed what it claims to be. That is the whole point of this module: to define what a factor is and start proving things about factors and factorisation.

Yes, my point is just that there is a name collision.

I thought about naming it gcf for greatest common factor. That would be along with the naming convention in this module, but on the other hand it would not be along with what people expect that function to be named. Type of this function is different, so in some cases (when you supply implicit arguments explicitly) calls will be unambiguous. Otherwise you'll have to qualify it. Also returned type is different, so if the compiler can infer that, it also won't require qualified name. I don't think it's too bad…

The other gcd function only stipulates that the inputs can't both be zero. Could that work here? I've looked at a few expositions of the algorithm, and nobody ever spends much time discussing the nature of the inputs. It would certainly be cleaner if there weren't LTE 1 n proofs floating around all over, but I don't know how necessary they are.

Zeros are problematic, because according to this definition at least, everything is a factor of zero. This destroys some nice properties, like that a factor is always less than or equal to the number it divides. Also there is consideration that according to this setup, zero is a factor of zero, which implicitly assumes that you can actually divide by zero. Also when you want to implement rational numbers, that non-zero property comes in handy, because all ratios are either positive or negative and you just add a special constructor for zero. This way you avoid the problem ZZ has, that 0 and -0 are distinct values, which messes up the (=) proofs. Also the requirement that divisor must be non-zero is graciously provided by this arrangement (due to Uninhabited (Factor n Z) implmentation, which is no longer valid when you consider divisors of zero).

On the other hand, as I review the code, I see that the non-zero property is not used that often (mostly is just passed forward as is or with some alteration). So probably we could require non-zero-ness for those proofs only, and we could require it explicitly for rational numbers as well. I'm currently looking into the possibilities of removing that constraint and it seems possible.

nickdrozd · 2020-05-16T16:17:53Z

libs/contrib/Data/Nat.idr

+subtractEqLeft {a = Z} prf = prf
+subtractEqLeft {a = S k} {b} {c} prf = subtractEqLeft $ succInjective (k + b) (k + c) prf
+
+subtractEqRight : {a, b, c : Nat} -> a + c = b + c -> a = b


subtractEqRight can be proved as an instance of subtractEqLeft:

subtractEqRight : {a, b, c : Nat} -> a + c = b + c -> a = b subtractEqRight {a} {b} {c} prf = subtractEqLeft {a = c} {b = a} {c = b} $ trans (plusCommutative c a) $ trans prf $ plusCommutative b c

Good point!

nickdrozd · 2020-05-16T16:22:26Z

libs/contrib/Data/Nat.idr

+        rewrite sym $ plusSuccRightSucc b k in
+        LTESucc $ addLteRight prf
+
+addLteLeft : LTE b c -> LTE (a + b) (a + c)


Same here, addLteRight would be simpler to prove as an instance of addLteLeft. It's probably true for any of these left / right proofs that one side is easier to prove, and the other side can be derived from that one.

That's true. Then I'll try rewriting them.

nickdrozd · 2020-05-16T17:23:04Z

When I compile and load Factor into the repl (:module Data.Nat.Factor), it can't find gcd (or gcdProof, as I've renamed it locally). In the Emacs buffer, all the data types (Factor, NotFactor, etc) get underlined in red. I don't know what it is, but it seems like there is some kind of path / visibility problem.

Sventimir · 2020-05-16T17:37:51Z

When I compile and load Factor into the repl (:module Data.Nat.Factor), it can't find gcd (or gcdProof, as I've renamed it locally)

It's because of the %access private in the GCD namespace. I thought it would be restricted to that block, but apparently it is not. I will add exportqualifier to the gcd function and it should be fine. As to the underlining, I don't know. In my Atom nothing is underlined. Does Emacs suggest anything about what it thinks is wrong? Maybe it's some style issue, indentation size perhaps?

nickdrozd · 2020-05-16T19:31:03Z

libs/contrib/Data/Nat/Factor.idr

+
+||| If p is a factor of both n and m, then it is also a factor of their sum.
+plusFactor : Factor n p -> Factor m p -> Factor (n + m) p
+plusFactor {n} {p} nFactor@(CofactorExists qn positN prfN) (CofactorExists qm positM prfM) =


What does the @ mean?

@ binds the whole pattern on the right to the name on the left. In this case it binds CofactorExists qn positN prfN to the name nFactor. This way you can simultaneously refer to the whole expression and to its subexpressions.

It's roughly equivalent to this:

plusFactor {n} {p} nFactor (CofactorExists qm positM prfM) = let CofactorExists qn positN prfN = nFactor in ...

nickdrozd · 2020-05-16T20:16:40Z

It's because of the %access private in the GCD namespace. I thought it would be restricted to that block, but apparently it is not. I will add exportqualifier to the gcd function and it should be fine. As to the underlining, I don't know. In my Atom nothing is underlined. Does Emacs suggest anything about what it thinks is wrong? Maybe it's some style issue, indentation size perhaps?

That fixed it.

I don't know what the lines are about. It could be a bug in the mode. Actually, it is certainly a bug in the mode. For: the underlining is either accurate or not. If it isn't accurate, then that's obviously a bug. On the other hand, if it is accurate, then the lines are an indication that something is wrong, but that something isn't being reported, which is a bug.

Is that an intuitionistically valid proof?

nickdrozd · 2020-05-16T20:39:29Z

libs/contrib/Data/Nat/Factor.idr

+||| A decision procedure for whether of not p is a factor of n.
+decFactor : (n, d : Nat) -> {auto nok : LTE 1 n} -> {auto dok : LTE 1 d} -> DecFactor n d
+decFactor n (S d) {nok} {dok} with (Data.Fin.Extra.divMod n (S d))
+        | (Fraction n (S d) q r prf) = case r of


This case statement can be pushed back into the arg list:

decFactor n (S d) {nok} {dok} with (Data.Fin.Extra.divMod n (S d)) | (Fraction n (S d) q FZ prf) = let prf = replace {P = \x => x = n} (plusZeroRightNeutral (q + (d * q))) $ replace {P = \x => x + 0 = n} (multCommutative q (S d)) prf in ItIsFactor $ CofactorExists q nok prf | (Fraction n (S d) q (FS pr) prf) = ItIsNotFactor $ ProperRemExists q pr nok $ rewrite multCommutative d q in rewrite sym $ multRightSuccPlus q d in prf

(Indent to taste)

Yes… I usually make a double indentation after the function pattern, because if I'm going to add a where clause later, I like it to be less indented than the function body. I feel it's more apparent where the body ends that way. And if I'm not going to add a where, at least all functions are indented according to the same rule. However, I'm not sure if it's the right approach. I also like 4-space-wide indents, while you apparently prefer 2-spaces only. I should probably have a closer look at older modules and adapt my style to what there already is…

Mostly I stick to the indentation levels provided by the Emacs mode, and I don't think it ever provides 4-spacing. (It doesn't matter, of course.)

Actually I think it does matter in the sense that it should be consistent within a project.

Sventimir · 2020-05-16T20:52:59Z

Is that an intuitionistically valid proof?

Seems valid to me, although rather informal.

nickdrozd · 2020-05-17T15:07:08Z

libs/contrib/Data/Nat/Factor.idr

+        CofactorExists (S k) ok (rewrite plusZeroRightNeutral k in Refl)
+
+||| For all natural numbers n, n is (the greatest) a factor of n.
+selfFactor : (n : Nat) -> {auto ok : LTE 1 n} -> Factor n n


I would call this factorReflexive. The factor relationship is a partial order, right? That might be worth exploring, although the ok thing complicates it a little.

Yeah, these LTEs are annoying. However, considering zeros destroys most of the properties of factors, since anything is a factor of zero. On the other hand, maybe I should exclude zero only where it really hurts and allow considering factors of zero in general case.

Hmm, indeed! The factor relation is a partial order. That's a thing worth proving. I'll add it.

https://en.wikipedia.org/wiki/Order_theory#/media/File:Lattice_of_the_divisibility_of_60.svg

nickdrozd · 2020-05-17T15:13:22Z

libs/contrib/Data/Nat/Factor.idr

+        (CofactorExists b bok (rewrite plusZeroRightNeutral b in Refl))
+
+||| Any natural number is a common factor of itself and itself.
+selfIsCommonFactor : (a : Nat) -> {auto ok : LTE 1 a} -> CommonFactor a a a


Same here, I would call this commonFactorReflexive and move it next to commonFactorSym. Is there a sense in which CommonFactor is transitive? If it is, you could implement it as an equivalence relation.

Well, in order to be transitive, it would have to be a two-argument predicate. This can be done like so:

CommonFactor (a, b) p

However now sides of the relation have different types, which is no good. However, if you stretch the definition a little bit, you can come up with something like this:

cfTransitive : CommonFactor (a, b) p -> CommonFactor (b, c) p -> CommonFactor (a, c) p

So it is transitive when you consider it a binary relation (either there is or there isn't a common factor of two numbers in question), without considering what that factor actually is. When looked upon like this, it is indeed transitive, and this is the sense in which it is symmetric in commonFactorSym proof.

nickdrozd · 2020-05-17T15:28:05Z

libs/contrib/Data/Nat/Factor.idr

+gcd Z _ {aok} = absurd aok
+gcd _ Z {bok} = absurd bok
+gcd (S a) (S b) with (cmp (S a) (S b))
+    gcd (S (b + S d)) (S b) | CmpGT d =


One way to handle Euclid's algorithm is to arrange it so that the first argument is the smaller one. If the first argument is larger, just flip the arguments around and proceed as usual, as in gcd 27 18 = gcd 18 27 = 9. That would simplify the logic (just keep the simpler of the two branches), although I think it would require an assert_total.

This is discussed at the very beginning of Knuth's The Art of Computer Programming.

I believe it's exactly what I have done. If a > b, then the recursion is run with (a, b) pair, otherwise – with (b, a). The GCD.Search type even contains a proof that a >= b. Totality of the function is implied by the use of sizeInd function from WellFounded module, which provides looping capability already proved to be total.

Writing the recursion by hand, I could find no way to convince the compiler about totality. The trick provided by the Sized interface is quite ingenious.

Ah, okay, I'm still working through the details.

nickdrozd · 2020-05-17T16:36:25Z

libs/contrib/Data/Nat/Factor.idr

+            let sbIsFactor = the (plus q (mult b q) = S a) $
+                    rewrite multCommutative b q in
+                    rewrite sym $ multRightSuccPlus q b in
+                    replace {P = \x => x = S a} (plusZeroRightNeutral (q * S b)) prf


replace can be replaced with

rewrite sym prf in sym $ plusZeroRightNeutral (q * S b)

IME explicit use of replace is a "proof smell", and there is almost always a nicer way to do it.

I try to use rewrite rather than replace whenever I can, but it's not always possible. There is the huge difference between them, however. replace makes a substitution in a proof you already have to create another proof, whereas rewrite makes a substitution in the goal (which is a proof you do not have yet) instead. In practice it means that with rewrite you work kind of backwards, rewriting your goal until it matches something you already have, while replace has a more direct approach. Also with rewrite you can only use equality (=) proofs (the same with replace of course), which means that if you you need use a custom proof-altering function like cong or something in the middle of replacements, you actually can no longer use rewrite from that point on. Unless of course I am missing something here.

This problem does not apply here, of course. This probably arose from some code reworking or maybe I was just tired.

nickdrozd · 2020-05-17T16:43:27Z

libs/contrib/Data/Nat/Factor.idr

+||| Factor n p is a witness that p is indeed a factor of n,
+||| i.e. there exists a q such that p * q = n.
+data Factor : Nat -> Nat -> Type where
+    CofactorExists : {n, p : Nat} -> (q : Nat) -> LTE 1 n -> p * q = n -> Factor n p


I notice that there are a lot of places with rewrite sym prf or
something like that. I think the syms are requried because this
proof has the form p * q = n. If it were n = p * q, I think the
syms could be done away with, which would be a little less verbose.

Generally proof equations are written as complexity-reducing devices,
with the more complex part on the left and the less complex on the
right. For example, the monoid neutral law looks like n + 0 = n,
because n is simpler than n + 0. But with factors, the reduction
in complexity is to say that a larger number can be written as the
product of smaller numbers, so I think n = p * q is appropriate.

(It might be too late to implement this minor suggestion.)

That's a fair point. And I don't think it's too late for the change. It's only 300 lines of code. I have done much larger refactorings with much less type-checking to support me, so I believe I will do it. ;)

Sventimir · 2020-05-18T16:08:11Z

I implemented Preorder interface for Nat and Factor. I also tried to implement Poset, but proving the antisymmetric property is messy, because there are 4 variables and only two equations, so squeezing contradictions from them takes a lot of pattern matching on those Nats. I also inverted the equations in Factor and NotFactor proof constructors and eliminated these pesky LTE 1s.

nickdrozd · 2020-05-18T22:22:24Z

libs/contrib/Data/Nat/Factor.idr

+            Refl
+        )
+
+    reflexive Z = CofactorExists 1 Refl


This one doesn't require case splitting: reflexive n = CofactorExists 1 $ sym $ multOneRightNeutral n

nickdrozd · 2020-05-18T22:33:15Z

libs/contrib/Data/Nat/Factor.idr

+
+||| Factor n p is a witness that p is indeed a factor of n,
+||| i.e. there exists a q such that p * q = n.
+data Factor : Nat -> Nat -> Type where


This is maybe getting into bikeshedding territory, but what about
changing this from Factor n p to Factor p n? Factor n p reads as
"n has p as a factor", but I feel like the more natural way to say it
would be "p is a factor of n", or alternatively, "p evenly divides n".
With infix syntax, it could even be p | n or something like that.
Similar with CommonFactor. Instead of saying "a and b both have p as
a factor", the reading would be "p is a factor of both a and b" or "p
divides both a and b".

Besides being more natural (IMO), this would clarify the way in which
CommonFactor is symmetric. Given some "p", the relation is "has
factor p in common with", as in "a has factor p in common with b".
Something like this:

using (p : Nat) Preorder Nat (CommonFactor p) where Equivalence Nat (CommonFactor p) where

Unfortunately, we can't have (|) as a type constructor, since it is already reserved as a part of Idris syntax. Nontheless, this is probably a better choice, so I'll change it.

nickdrozd · 2020-05-18T22:35:00Z

libs/contrib/Data/Nat/Factor.idr

+
+||| A decision procedure for whether of not p is a factor of n.
+decFactor : (n, d : Nat) -> DecFactor n d
+decFactor n Z = ?whaatIfZ


Damnit. sorry.

…eftover hole.

Sventimir · 2020-05-19T07:41:03Z

I have switched places of arguments to Factor and NotFactor.

As to the leftover hole, I'm not really proud of the way I filled it in, but I see no other way. Standard procedure of proving not-a-factor statement requires to specify a non-zero remainder of the division, which is also strictly less than the divisor. Without the last restriction, you could for instance prove that 2 is not a factor of 4, since 2 * 1 + 2 = 4. However, when divisor is 0, there is no such remainder you could possibly construct for any number. Yet, you can't prove that 0 is a divisor of a non-zero number either. So the only way to fill that vacuum I can think of is to provide a special constructor that asserts that 0 is not a factor of any successor. This way decFactor function is total once more and factorNotFactoAbsurd property still holds.

But I don't like it really, so if anyone could come up with a simpler/nicer solution, that would be welcome.

Sventimir added 11 commits May 5, 2020 20:06

Add definition of a Factor.

e264eb1

Rewrite DecFactor in terms of divMod from Data.Fin.Extra.

0f454f9

Refactor Factor type so that it does not mention FactorPair.

f6e7c91

Add some helper functions for juggling factors.

659f7cf

Add another factor theorem.

2c183e8

Add a rough GCD procedure (with part of its properties already proven).

8e765e9

Prove totality of the GCD procedure.

e46b4dc

Prove that the GCD actually iIS the greates common divisor.

d8b1837

Improve type of cofactor accessor.

826faa0

Add a documentation comments.

b73953d

Fix a typing error in documentation comments.

3c44522

nickdrozd reviewed May 16, 2020

View reviewed changes

Simplify proofs about properties of natural numbers (in)equalities.

1878156

Add export directive for the gcd function.

2aa0139

nickdrozd reviewed May 16, 2020

View reviewed changes

nickdrozd reviewed May 17, 2020

View reviewed changes

Sventimir added 3 commits May 18, 2020 11:48

Invert the equation type in Factor and NotFactor constructors.

104152f

Remove LTE 1 requirements from Factor and NotFactor proof constructors.

e713ab7

Implement Preorder interface for Factor.

678f9c5

nickdrozd reviewed May 18, 2020

View reviewed changes

Swap places of arguments in Factor and NotFactor types and remove a l…

f914a48

…eftover hole.

Sventimir closed this Nov 14, 2022

Factorisation #4859

Factorisation #4859

Conversation

Sventimir commented May 14, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 16, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 17, 2020 • edited

Choose a reason for hiding this comment

nickdrozd May 16, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 16, 2020 • edited

Choose a reason for hiding this comment

nickdrozd commented May 16, 2020

Sventimir commented May 16, 2020 • edited

Choose a reason for hiding this comment

Sventimir May 16, 2020 • edited

Choose a reason for hiding this comment

nickdrozd commented May 16, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir commented May 16, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 17, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 17, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 17, 2020 • edited

Choose a reason for hiding this comment

Sventimir commented May 18, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sventimir May 19, 2020 • edited

Choose a reason for hiding this comment

Sventimir commented May 19, 2020 • edited

Sventimir May 16, 2020 •

edited

Sventimir May 17, 2020 •

edited

nickdrozd May 16, 2020 •

edited

Sventimir May 16, 2020 •

edited

Sventimir commented May 16, 2020 •

edited

Sventimir May 16, 2020 •

edited

Sventimir May 17, 2020 •

edited

Sventimir May 17, 2020 •

edited

Sventimir May 17, 2020 •

edited

Sventimir commented May 18, 2020 •

edited

Sventimir May 19, 2020 •

edited

Sventimir commented May 19, 2020 •

edited