[RFE] Add math.lcm() function: Least Common Multiple #83660
Can we add lcm() and gcd() functions that work like this?
lcm(4, 6)  # returns 12
gcd(4, 6)  # returns 2 |
There is already math.gcd(), and you can use numpy.lcm(). Is it common to need lcm()? Do you have examples of applications which need lcm()? It's also trivial to implement lcm() yourself. I suggest rejecting this feature request, since I have never needed lcm() in 10 years of Python, whereas I use gcd() from time to time. |
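For reference, the "trivial" implementation usually suggested (a generic sketch of the kind of snippet being alluded to, not code from this thread) is something like:

from math import gcd

def lcm(a, b):
    # naive version: raises ZeroDivisionError for lcm(0, 0), since gcd(0, 0) == 0
    return a * b // gcd(a, b)

As the next comment notes, a general-purpose version should instead return 0 when both arguments are zero.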
Uses for lcm are common enough that it is provided by Excel and by C++ Boost. You can use it for working out problems like: …
By the way, the "trivial" implementation given in the Stackoverflow link has a bug: if both arguments are zero, it raises instead of returning zero. I wish that gcd took an arbitrary number of arguments; I often need to find the gcd of three or more numbers, and this is a pain: gcd(a, gcd(b, gcd(c, gcd(d, e)))) when I could just say gcd(a, b, c, d, e) and have it work. Likewise for lcm. (For that matter, the gcd of a single number a is just a.) |
reduce(gcd, [a, b, c, d, e]) |
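Spelled out as a runnable snippet (the concrete values and the helper name lcm2 are just for illustration), the reduce trick handles any number of arguments, and the same approach works for lcm:

from functools import reduce
from math import gcd

def lcm2(a, b):
    # two-argument lcm; returns 0 when either argument is 0
    return abs(a // gcd(a, b) * b) if a and b else 0

print(reduce(gcd, [12, 18, 30]))   # 6
print(reduce(lcm2, [12, 18, 30]))  # 180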
I created this issue as I came across the following question: There are n students in class A and m students in class B. Each class divides into teams for a competition. What is the biggest possible team size, such that each team has the same number of members? We can solve this type of problem easily if we have lcm() in the math library, and there are lots of real-life applications where we have to use lcm. |
Should I proceed with a pull request adding an lcm() function to Python's math module? |
I must say that the problem (with two classes divided into teams) seems to me to be exactly one that can be solved with gcd, and lcm itself is mostly useless for it. |
Some problems that need an lcm function:
1: Find the least number which, when divided by 'a', 'b', 'c' and 'd', leaves remainder 'e' in each case.
2: Person A exercises every 'n' days and person B every 'm' days. A and B both exercised today. How many days will it be until they exercise together again?
3: The LCM is important when adding fractions with different denominators; we have to use the lcm function to find the common denominator.
All of this shows that an lcm function should be included in the math library. So can I proceed with a pull request to add an lcm function to Python's math module? |
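Problem 2 above reduces directly to a single lcm computation; a minimal illustration with arbitrary numbers:

from math import gcd

n, m = 4, 6                  # A exercises every 4 days, B every 6 days
print(n // gcd(n, m) * m)    # 12 days until they exercise together again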
-1 Given that we have gcd(), I don't see any value in adding lcm() as well. Once you have gcd(), getting the least common multiple is trivial. Also, it is rare to ever need an lcm() function. I don't think I've ever seen it in real code. |
I agree with Raymond that it's really seldom needed. However, I'd like to point out that the "trivial" implementation might not be so trivial after all: as Steven said, it mishandles the (0, 0) case. And even Tim fell into that trap, so it can't be said it's easily avoided. I agree that this case doesn't really appear in "real world" tasks, but that's not really an excuse: imagine a factorial that didn't work for 0.

Also, a bit more often used case: seeing the code for lcm of 2 arguments, people might (and do; I've seen it) generalize to 3 or more arguments in a way that seems logical and is often correct, but not always (a*b*c//gcd(a,b,c)).

And one more tidbit: the usual formula for lcm doesn't really work for the case of fraction inputs. I know that currently math.gcd doesn't handle fractions, but it could. It's imaginable that that feature will some day be added (just like pow recently started supporting modular inverses), and then suddenly lcm implementations will silently give the wrong result for fractions.

A smart man ;-) once said that the main criterion for inclusion in the stdlib is that the function is often needed by users, _and_ it's often implemented wrong. I think lcm doesn't really satisfy the first criterion, but it does satisfy the second. |
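A quick counterexample for the a*b*c//gcd(a,b,c) generalization mentioned above (the helper name lcm2 is purely illustrative):

from functools import reduce
from math import gcd

def lcm2(x, y):
    return x // gcd(x, y) * y if x and y else 0

a, b, c = 2, 2, 3
naive = a * b * c // gcd(gcd(a, b), c)   # 12 // 1 == 12
correct = reduce(lcm2, (a, b, c))        # 6
print(naive, correct)                    # 12 6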
I agree with Vedran Čačić. The modules are not made for one person; they are for the ease of coding. There are so many functions which I don't use but which are used by other people. We use functions to make coding easy, and if an lcm function is added many people will find it useful. |
+0 from me. Another use is computing the Carmichael function for composite numbers (like an RSA encryption modulus, in which context the Carmichael function is routinely used). But only +0 instead of +1 because it's so easy to build from gcd(). I don't agree it's tricky at all. While lcm(0, 0) undoubtedly should return 0 in a general-purpose library function, in my own code I've never supplied that. Because in every application I've ever had for it, I would rather get an exception if I ever passed two zeroes - that would always have been a mistake. |
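For context on that use: for an RSA-style modulus n = p*q with distinct primes p and q, the Carmichael function is λ(n) = lcm(p-1, q-1). A toy sketch (the tiny primes are just for illustration):

from math import gcd

def lcm(a, b):
    return a // gcd(a, b) * b if a and b else 0

p, q = 61, 53
print(lcm(p - 1, q - 1))    # 780, versus (p-1)*(q-1) == 3120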
+0 from me as well; agreed with everything that Tim said (except that I've never had a need for the Carmichael function; my RSA implementations do the inefficient thing based on (p-1)(q-1)). This is somewhat reminiscent of comb and perm: lcm is often taught as a natural counterpart to gcd, so despite the fact that it's less fundamental and has less utility, people often expect to see the two together. @ananthakrishnan: do you want to put together a PR? I'll commit to reviewing it if you do. |
Yes, I want to put together a PR. |
Great. For clarity, here's a Python function giving the behaviour I'd expect from a 2-arg lcm:

from math import gcd

def lcm(a, b):
    if a == 0:
        return 0
    return abs(a // gcd(a, b) * b) |
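A few sample values this definition produces (assuming the lcm function above):

print(lcm(4, 6))    # 12
print(lcm(-4, 6))   # 12
print(lcm(0, 5))    # 0
print(lcm(0, 0))    # 0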
If a < b, what is better: … or …? Or is there no difference? |
I'd guess a // gcd(a, b) * b would be faster, on the basis that division is slower than multiplication. But I doubt it's worth worrying about for this implementation, given that the gcd call is likely to be the bottleneck as a and b get large. |
And I in turn agree with everything Mark said ;-) But I'll add that while the analogy to comb/perm is a good one, I think the case for lcm is stronger than for perm. Not only is lcm more common in real life (although, no, not at all common overall!), perm was added at the same time as prod, and perm is essentially a special case of prod. |
Perhaps add is_prime() as well. Conceptually, it is in the same family of functions. Unlike lcm(), it is one that I would use, and it would improve Python's value as a teaching tool. |
Raymond, there was a thread a while back about adding an … In that older thread, I suggested a … I don't think … Anyway ;-), ya, I like the idea, but I'd sure like it to go into a module where it obviously belongs. Also a function, e.g., to generate primes, and ... |
I dislike the idea of adding an is_prime() function to Python, since users will likely stress Python with huge numbers and then complain that Python is too slow. A correct is_prime() is very slow for large numbers, and a statistical approach is not 100% correct... Or we may have to add both approaches. But then you have to pick an algorithm which would fit "most use cases". I concur with Tim: it would be better to put such opinionated code on PyPI ;-)

-- I'm -0 on adding math.lcm(). For example, I failed to find any request to add such a function in the python-ideas archives. This issue seems to be the first user request. I'm not opposed to adding it. Some people seem to like the idea of the completeness of the stdlib (gcd & lcm go together). So if someone wants to add it, go ahead and propose a PR :-) |
An is_prime that's always correct is probably not the right thing to go into math. Besides, now we have isqrt, it's just … -- yes, it's damn slow, but so is everything else you want to be absolutely correct. :-] is_probable_prime is another matter, but there is an enormous amount of bikeshedding about the API of that one. |
And yeah, I managed to leave out 2. Speaking about "often implemented wrong"... :-)) |
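For readers following along, here is roughly what an exact, isqrt-based trial-division test looks like (a generic sketch, not the exact snippet referred to above), including the easy-to-forget handling of 2:

from math import isqrt

def is_prime(n):
    # exact but slow trial division; fine for small n, hopeless for 100+ digit numbers
    if n < 2:
        return False
    if n < 4:
        return True              # 2 and 3 are prime
    if n % 2 == 0:
        return False
    return all(n % d for d in range(3, isqrt(n) + 1, 2))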
I'd have to hear back from Raymond more on what he had in mind - I may well have been reading far too much into the specific name he suggested. Don't much care about API, etc - pick something reasonable and go with it. I'm not overly ;-) concerned with being "newbie friendly". If someone is in a context where they need to use probabilistic solutions, there is no substitute for them learning something non-trivial about them. The usual API for a Miller-Rabin tester supports passing in the number of bases to try, and it's as clear as anything of this kind _can_ be then that the probability of getting back True when the argument is actually composite is no higher than 1 over 4 to the power of the number of bases tried. Which is also the way they'll find it explained in every reference. It's doing nobody a real favor to make up our own explanations for a novel UI ;-)

BTW, purely by coincidence, I faced a small puzzle today, as part of a larger problem: given that 25 is congruent to 55 mod 10, and also mod 15, what's the largest modulus for which we can be certain the congruence still holds? IOW, given x = y (mod A) and x = y (mod B), what's the largest C such that we can be certain x = y (mod C) too? And the answer is C = lcm(A, B) (which is 30 in the example). |
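A tiny numeric check of that claim, using the numbers from the example and the two-argument lcm from earlier:

from math import gcd

x, y, A, B = 55, 25, 10, 15
assert (x - y) % A == 0 and (x - y) % B == 0
C = A // gcd(A, B) * B      # lcm(10, 15) == 30
assert (x - y) % C == 0
print(C)                    # 30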
On Fri, Jan 31, 2020 at 12:15:37PM +0000, Vedran Čačić wrote:
Lots of things are fast and still absolutely correct. And I think that you probably are underestimating just how slow trial division is. …

P = int('29674495668685510550154174642905332730771991'
        '79985304335099507553127683875317177019959423'
        '8596428121188033664754218345562493168782883')
Q = P*(313*(P-1)+1)*(353*(P-1)+1)

Q is a 397 digit Carmichael Number. Its smallest factor is P, which has … In a practical sense, your algorithm is never going to terminate. … Upgrading your CPU is not going to help. My old and slow PC can prove that Q is a composite number, using pure Python, …
What is there to bikeshed about the API? The API should be a function that takes a single integer, and returns … I will argue that any other details -- including whether it is … For the same reason, the name should be just "is_prime" (or "isprime") … Experts will read the documentation and/or the source code, and everyone … We can argue on technical grounds just what the implementation should … etc.

If we choose the fast, simple Miller-Rabin test, with just 30 random … If that isn't good enough, increase the number of iterations to 50, in … In comparison, it is estimated that cosmic rays cause memory bit … https://blogs.oracle.com/linux/attack-of-the-cosmic-rays-v2 https://www.johndcook.com/blog/2019/05/20/cosmic-rays-flipping-bits/ I don't have much time for worries about Miller-Rabin being … But with 30 or 50 rounds, I am confident that nobody will ever … Let's be real: if the banks are willing to trust the transfer of … (By the way: for smallish numbers, under 2**64, no more than twelve …)

[1] For ease of discussion, we'll count zero and one as composite.
[2] More like nanoscopic or picoscopic. |
Tim: Considering that congruence is _defined_ as x = y (mod m) :<=> m | y-x, it's really not so surprising. :-)

Steven: It seems that we completely agree about inclusion of is_probabilistic_prime in the stdlib. And we agree that it should be called isprime (or is_prime if Raymond names it ;). About the bikeshedding, see Tim's comment. :-P About absolutely correct algorithms: first, what exactly is your claim? If it's "I can write an algorithm in pure Python that can, for every number of 397 digits, mathematically exactly determine whether it is prime in under 6 seconds", I'd very much like to see that algorithm. (I guess it must be publicly available, since we're speaking about inclusion in the Python stdlib.) I really don't have enough expertise in number theory to be convinced it doesn't exist, but I very much doubt it. |
[Steven]
The pure-Python Miller-Rabin code I posted in the aforementioned thread is typically at least 100 times faster than that. But it's not deterministic. Because it tries (pseudo)random bases, it all depends on how many it needs to try on each run before finding a witness that Q is composite. It usually (at least 75% of runs) finds one on the first try.

BTW, it doesn't much matter that this is "pure Python" - for ints of this size the bulk of the time regardless is spent in CPython's C-coded bigint arithmetic. I expect that your code must be doing more than _just_ Miller-Rabin, and in the Q case is paying through the nose for "optimizations" that all fail before getting to Miller-Rabin.

About the API, I can't agree to the one you propose. Different applications have different appropriate tradeoffs between degree of certainty and time consumed - "one size fits all" doesn't fly here. def probably_prime(n, maxsteps=20) supplies a _default_ instead. I don't want an API that's useful _only_ to people who don't know what they're doing ;-)
But only if you use a fixed set of 12 specific bases for which the claim has been proved. "Real" Miller-Rabin picks bases at random, relying only on properties that have been proved independent of the argument size. [Vedran]
Didn't say it was ;-) Was just noting the odd coincidence that I just happened to stumble into a real use for lcm(), not previously mentioned in this report, while doing something else. |
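For readers unfamiliar with the algorithm under discussion, here is a minimal random-base Miller-Rabin sketch with a maxsteps-style knob, in the spirit of the API Tim describes above (an illustration only, not the code from the referenced thread):

import random

def probably_prime(n, maxsteps=20):
    # Miller-Rabin with random bases. False means n is definitely composite;
    # a composite n slips through (returns True) with probability at most 4**-maxsteps.
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    for _ in range(maxsteps):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break          # this base is not a witness; try another
        else:
            return False       # a witnesses that n is composite
    return True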
[Tim]
This is exactly the sort of argument about quality of implementation … I'm happy to have a discussion about implementation, here or off-list, …
A fair point, thank you.
It's not just for people who don't know what they're doing. It is for … If someone cares about the small details like how many bases to …
etc. What if the implementation shifts away from Miller-Rabin to (say) … I think that if someone cares sufficiently about the details, then they …
Correct. The first twelve primes make up such a minimal set. |
[Tim]
[Steven]
I wasn't at all talking about API at that point. I was backing the argument _you_ were making, that trial division is a hopelessly inefficient implementation, compared to what's possible with probabilistic methods, regardless of API. You were in fact underselling that argument, because it's straightforward to get an order or two of magnitude faster than you demonstrated.
I know the paper it came from.
Which is why I have no problem picturing how this "should be" approached: the original Miller-Rabin (which uses random bases, not succumbing to the "premature optimization" catastrophe magnet) has no problem with Q (or with anything else!), and hasn't really been improved on for general-purpose use since it was published. It's a darned-near perfect mix of simplicity, brevity, clarity, robustness, generality, and speed. "Good enough" by a long shot on all counts.
Then they can accept the default. In what conceivable sense is that a burden?
But the function author can't possibly know what the _user_ needs this for. In some apps degree of certainty is far more important than speed, while in other apps it's the opposite. Be realistic here? Your argument here makes sense for always-right functions, but you're not willing to accept one of those for this specific purpose (& neither am I).
It's not a "small detail" where it matters: it is THE way to trade off computation time against confidence in the results. It's a _fundamental_ aspect of how Miller-Rabin works.
Then they should go use a special-purpose library ;-) Letting them fiddle the single most important parameter isn't down in the weeds, it's a top-level control knob. My answers to all the above:
It can't ;-) I would absolutely document that Miller-Rabin _is_ the algorithm being used, with the random-bases implementation choice. Then the curious can search the web for a mountain of good information about it.
All moot, given that I have no interest in hiding the specific algorithm in use. YAGNI.
Again, go to a special-purpose library if that's what they want. And, again, I don't agree with your characterization of the Miller-Rabin maxsteps parameter as a "detail". It's a fundamental aspect of what the algorithm does. Which casual users can certainly ignore, but at their own risk.
And if you don't care about picking the fixed bases from a prefix of the primes, you only need 7 bases for a 100% reliable test through 2**64: 2, 325, 9375, 28178, 450775, 9780504, and 1795265022. Or so this table claims: https://miller-rabin.appspot.com/ But I don't care here. Using a fixed set of bases is begging for embarrassment (for every fixed finite set of bases, there are an infinite number of composites they'll _claim_ are prime). There are no systemic failure modes when using random bases. |
If someone wants to continue the discussion on is_prime(), I suggest opening a separate issue.