New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Factorisation #4859
Factorisation #4859
Conversation
libs/contrib/Data/Nat/Factor.idr
Outdated
||| divisors. It is proven correct and total; returns a proof that computed | ||
||| number actually IS the GCD. Unfortunately it's very slow, so improvements | ||
||| in terms of efficiency would be welcome. | ||
gcd : (a, b : Nat) -> {auto aok : LTE 1 a} -> {auto bok : LTE 1 b} -> (f : Nat ** GCD a b f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a gcd
defined in Prelude.Nat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. But that definition does not provide any proof that the returned number is indeed what it claims to be. That is the whole point of this module: to define what a factor is and start proving things about factors and factorisation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my point is just that there is a name collision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about naming it gcf
for greatest common factor. That would be along with the naming convention in this module, but on the other hand it would not be along with what people expect that function to be named. Type of this function is different, so in some cases (when you supply implicit arguments explicitly) calls will be unambiguous. Otherwise you'll have to qualify it. Also returned type is different, so if the compiler can infer that, it also won't require qualified name. I don't think it's too bad…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other gcd
function only stipulates that the inputs can't both be zero. Could that work here? I've looked at a few expositions of the algorithm, and nobody ever spends much time discussing the nature of the inputs. It would certainly be cleaner if there weren't LTE 1 n
proofs floating around all over, but I don't know how necessary they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zeros are problematic, because according to this definition at least, everything is a factor of zero. This destroys some nice properties, like that a factor is always less than or equal to the number it divides. Also there is consideration that according to this setup, zero is a factor of zero, which implicitly assumes that you can actually divide by zero. Also when you want to implement rational numbers, that non-zero property comes in handy, because all ratios are either positive or negative and you just add a special constructor for zero. This way you avoid the problem ZZ
has, that 0
and -0
are distinct values, which messes up the (=)
proofs. Also the requirement that divisor must be non-zero is graciously provided by this arrangement (due to Uninhabited (Factor n Z)
implmentation, which is no longer valid when you consider divisors of zero).
On the other hand, as I review the code, I see that the non-zero property is not used that often (mostly is just passed forward as is or with some alteration). So probably we could require non-zero-ness for those proofs only, and we could require it explicitly for rational numbers as well. I'm currently looking into the possibilities of removing that constraint and it seems possible.
subtractEqLeft {a = Z} prf = prf | ||
subtractEqLeft {a = S k} {b} {c} prf = subtractEqLeft $ succInjective (k + b) (k + c) prf | ||
|
||
subtractEqRight : {a, b, c : Nat} -> a + c = b + c -> a = b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subtractEqRight
can be proved as an instance of subtractEqLeft
:
subtractEqRight : {a, b, c : Nat} -> a + c = b + c -> a = b
subtractEqRight {a} {b} {c} prf =
subtractEqLeft {a = c} {b = a} {c = b} $
trans (plusCommutative c a) $
trans prf $
plusCommutative b c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
rewrite sym $ plusSuccRightSucc b k in | ||
LTESucc $ addLteRight prf | ||
|
||
addLteLeft : LTE b c -> LTE (a + b) (a + c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, addLteRight
would be simpler to prove as an instance of addLteLeft
. It's probably true for any of these left / right proofs that one side is easier to prove, and the other side can be derived from that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. Then I'll try rewriting them.
When I compile and load |
It's because of the |
libs/contrib/Data/Nat/Factor.idr
Outdated
|
||
||| If p is a factor of both n and m, then it is also a factor of their sum. | ||
plusFactor : Factor n p -> Factor m p -> Factor (n + m) p | ||
plusFactor {n} {p} nFactor@(CofactorExists qn positN prfN) (CofactorExists qm positM prfM) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the @
mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@
binds the whole pattern on the right to the name on the left. In this case it binds CofactorExists qn positN prfN
to the name nFactor
. This way you can simultaneously refer to the whole expression and to its subexpressions.
It's roughly equivalent to this:
plusFactor {n} {p} nFactor (CofactorExists qm positM prfM) =
let CofactorExists qn positN prfN = nFactor in
...
That fixed it. I don't know what the lines are about. It could be a bug in the mode. Actually, it is certainly a bug in the mode. For: the underlining is either accurate or not. If it isn't accurate, then that's obviously a bug. On the other hand, if it is accurate, then the lines are an indication that something is wrong, but that something isn't being reported, which is a bug. Is that an intuitionistically valid proof? |
||| A decision procedure for whether of not p is a factor of n. | ||
decFactor : (n, d : Nat) -> {auto nok : LTE 1 n} -> {auto dok : LTE 1 d} -> DecFactor n d | ||
decFactor n (S d) {nok} {dok} with (Data.Fin.Extra.divMod n (S d)) | ||
| (Fraction n (S d) q r prf) = case r of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case statement can be pushed back into the arg list:
decFactor n (S d) {nok} {dok} with (Data.Fin.Extra.divMod n (S d))
| (Fraction n (S d) q FZ prf) =
let prf =
replace {P = \x => x = n} (plusZeroRightNeutral (q + (d * q))) $
replace {P = \x => x + 0 = n} (multCommutative q (S d)) prf
in
ItIsFactor $ CofactorExists q nok prf
| (Fraction n (S d) q (FS pr) prf) =
ItIsNotFactor $ ProperRemExists q pr nok $
rewrite multCommutative d q in
rewrite sym $ multRightSuccPlus q d in
prf
(Indent to taste)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes… I usually make a double indentation after the function pattern, because if I'm going to add a where
clause later, I like it to be less indented than the function body. I feel it's more apparent where the body ends that way. And if I'm not going to add a where
, at least all functions are indented according to the same rule. However, I'm not sure if it's the right approach. I also like 4-space-wide indents, while you apparently prefer 2-spaces only. I should probably have a closer look at older modules and adapt my style to what there already is…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly I stick to the indentation levels provided by the Emacs mode, and I don't think it ever provides 4-spacing. (It doesn't matter, of course.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think it does matter in the sense that it should be consistent within a project.
Seems valid to me, although rather informal. |
libs/contrib/Data/Nat/Factor.idr
Outdated
CofactorExists (S k) ok (rewrite plusZeroRightNeutral k in Refl) | ||
|
||
||| For all natural numbers n, n is (the greatest) a factor of n. | ||
selfFactor : (n : Nat) -> {auto ok : LTE 1 n} -> Factor n n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this factorReflexive
. The factor relationship is a partial order, right? That might be worth exploring, although the ok
thing complicates it a little.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, these LTE
s are annoying. However, considering zeros destroys most of the properties of factors, since anything is a factor of zero. On the other hand, maybe I should exclude zero only where it really hurts and allow considering factors of zero in general case.
Hmm, indeed! The factor relation is a partial order. That's a thing worth proving. I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(CofactorExists b bok (rewrite plusZeroRightNeutral b in Refl)) | ||
|
||
||| Any natural number is a common factor of itself and itself. | ||
selfIsCommonFactor : (a : Nat) -> {auto ok : LTE 1 a} -> CommonFactor a a a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I would call this commonFactorReflexive
and move it next to commonFactorSym
. Is there a sense in which CommonFactor
is transitive? If it is, you could implement it as an equivalence relation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, in order to be transitive, it would have to be a two-argument predicate. This can be done like so:
CommonFactor (a, b) p
However now sides of the relation have different types, which is no good. However, if you stretch the definition a little bit, you can come up with something like this:
cfTransitive : CommonFactor (a, b) p -> CommonFactor (b, c) p -> CommonFactor (a, c) p
So it is transitive when you consider it a binary relation (either there is or there isn't a common factor of two numbers in question), without considering what that factor actually is. When looked upon like this, it is indeed transitive, and this is the sense in which it is symmetric in commonFactorSym
proof.
gcd Z _ {aok} = absurd aok | ||
gcd _ Z {bok} = absurd bok | ||
gcd (S a) (S b) with (cmp (S a) (S b)) | ||
gcd (S (b + S d)) (S b) | CmpGT d = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to handle Euclid's algorithm is to arrange it so that the first argument is the smaller one. If the first argument is larger, just flip the arguments around and proceed as usual, as in gcd 27 18 = gcd 18 27 = 9
. That would simplify the logic (just keep the simpler of the two branches), although I think it would require an assert_total
.
This is discussed at the very beginning of Knuth's The Art of Computer Programming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's exactly what I have done. If a > b
, then the recursion is run with (a, b)
pair, otherwise – with (b, a)
. The GCD.Search
type even contains a proof that a >= b
. Totality of the function is implied by the use of sizeInd
function from WellFounded
module, which provides looping capability already proved to be total.
Writing the recursion by hand, I could find no way to convince the compiler about totality. The trick provided by the Sized
interface is quite ingenious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, I'm still working through the details.
libs/contrib/Data/Nat/Factor.idr
Outdated
let sbIsFactor = the (plus q (mult b q) = S a) $ | ||
rewrite multCommutative b q in | ||
rewrite sym $ multRightSuccPlus q b in | ||
replace {P = \x => x = S a} (plusZeroRightNeutral (q * S b)) prf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace
can be replaced with
rewrite sym prf in
sym $ plusZeroRightNeutral (q * S b)
IME explicit use of replace
is a "proof smell", and there is almost always a nicer way to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to use rewrite
rather than replace
whenever I can, but it's not always possible. There is the huge difference between them, however. replace
makes a substitution in a proof you already have to create another proof, whereas rewrite
makes a substitution in the goal (which is a proof you do not have yet) instead. In practice it means that with rewrite
you work kind of backwards, rewriting your goal until it matches something you already have, while replace
has a more direct approach. Also with rewrite
you can only use equality (=)
proofs (the same with replace
of course), which means that if you you need use a custom proof-altering function like cong
or something in the middle of replacements, you actually can no longer use rewrite
from that point on. Unless of course I am missing something here.
This problem does not apply here, of course. This probably arose from some code reworking or maybe I was just tired.
libs/contrib/Data/Nat/Factor.idr
Outdated
||| Factor n p is a witness that p is indeed a factor of n, | ||
||| i.e. there exists a q such that p * q = n. | ||
data Factor : Nat -> Nat -> Type where | ||
CofactorExists : {n, p : Nat} -> (q : Nat) -> LTE 1 n -> p * q = n -> Factor n p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that there are a lot of places with rewrite sym prf
or
something like that. I think the sym
s are requried because this
proof has the form p * q = n
. If it were n = p * q
, I think the
sym
s could be done away with, which would be a little less verbose.
Generally proof equations are written as complexity-reducing devices,
with the more complex part on the left and the less complex on the
right. For example, the monoid neutral law looks like n + 0 = n
,
because n
is simpler than n + 0
. But with factors, the reduction
in complexity is to say that a larger number can be written as the
product of smaller numbers, so I think n = p * q
is appropriate.
(It might be too late to implement this minor suggestion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point. And I don't think it's too late for the change. It's only 300 lines of code. I have done much larger refactorings with much less type-checking to support me, so I believe I will do it. ;)
I implemented |
libs/contrib/Data/Nat/Factor.idr
Outdated
Refl | ||
) | ||
|
||
reflexive Z = CofactorExists 1 Refl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one doesn't require case splitting: reflexive n = CofactorExists 1 $ sym $ multOneRightNeutral n
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point
|
||
||| Factor n p is a witness that p is indeed a factor of n, | ||
||| i.e. there exists a q such that p * q = n. | ||
data Factor : Nat -> Nat -> Type where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is maybe getting into bikeshedding territory, but what about
changing this from Factor n p
to Factor p n
? Factor n p
reads as
"n has p as a factor", but I feel like the more natural way to say it
would be "p is a factor of n", or alternatively, "p evenly divides n".
With infix syntax, it could even be p | n
or something like that.
Similar with CommonFactor
. Instead of saying "a and b both have p as
a factor", the reading would be "p is a factor of both a and b" or "p
divides both a and b".
Besides being more natural (IMO), this would clarify the way in which
CommonFactor
is symmetric. Given some "p", the relation is "has
factor p in common with", as in "a has factor p in common with b".
Something like this:
using (p : Nat)
Preorder Nat (CommonFactor p) where
Equivalence Nat (CommonFactor p) where
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we can't have (|)
as a type constructor, since it is already reserved as a part of Idris syntax. Nontheless, this is probably a better choice, so I'll change it.
libs/contrib/Data/Nat/Factor.idr
Outdated
|
||
||| A decision procedure for whether of not p is a factor of n. | ||
decFactor : (n, d : Nat) -> DecFactor n d | ||
decFactor n Z = ?whaatIfZ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hole
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damnit. sorry.
I have switched places of arguments to As to the leftover hole, I'm not really proud of the way I filled it in, but I see no other way. Standard procedure of proving not-a-factor statement requires to specify a non-zero remainder of the division, which is also strictly less than the divisor. Without the last restriction, you could for instance prove that 2 is not a factor of 4, since But I don't like it really, so if anyone could come up with a simpler/nicer solution, that would be welcome. |
This pull request adds
Data.Nat.Factor
module tocontrib
package. It defines notions of a factor (a natural number, which divides another number without a remainder), and common factors of two natural numbers. It proves several basic laws of factors and implmements a decision procedure as to whether a certain number is a factor of another number of not. Finally, the notion of a greatest common divisor is formally defined and Euclid's algorithm for finding GCDs is implemented and proven correct.This is a major step towards implementing a decent
Rational
type, which will be my next goal.