Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Math.gcd new function #8331

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jbbjarnason
Copy link
Contributor

@jbbjarnason jbbjarnason commented Feb 20, 2022

Implement the greatest common divisor function in the global math module. I found it best placed under the definition of MICROPY_PY_MATH_SPECIAL_FUNCTIONS please suggest if needed to place it elsewhere.

The implementation covers the change in CPython 3.5 https://docs.python.org/3/whatsnew/3.5.html#math and as well as the more recent change in version 3.9 https://docs.python.org/3/whatsnew/3.9.html#math.

As a consequence the github pipelines need to run Python version 3.9, the image "ubuntu-latest (20.04)" is running on 3.8. So I added the necessary changes.
However, those changes do have some side effect including problems with dependencies to elftools in the coverage pipelines. My first guess is that the pip library is installing elftools for Python 3.8 (as it is the default Python version in ubuntu-latest) which is not found while running Python 3.9. How would you like this handled?

Edit: not relevant anymore.
The Github Actions configuration files have been updated to use version 3.9. As well as the Windows port.

This was worked on with the assistance of @wang3450

@jbbjarnason
Copy link
Contributor Author

jbbjarnason commented Feb 21, 2022

In the description I am referring to this error: https://github.com/micropython/micropython/runs/5273854557?check_suite_focus=true#step:7:12

Edit: Not relevant anymore.

@rkompass
Copy link

rkompass commented Feb 22, 2022

Hello jbbjarnason,

out of curiosity I had a look into this commit ( I want to learn how to contribute to MPy and the gcd topic seems simple).
And I noticed you decided for an algorithm which is mathematically simple and correct but may lead to endless recursion if you e.g. compute gcd(2147483643, 42). Even under Linux this leads to Memory Error.

I tested the following algorithms:

#include <stdio.h>

int gcd_func(int x, int y) {
    if (x == 0) {
        return y;
    }
    if (y == 0) {
        return x;
    }
    if (x == y) {
        return x;
    }
    if (x > y) {
        return gcd_func(x - y, y);
    }
    return gcd_func(x, y - x);
}

int gcd_spl(int u, int v)
{
	int t;

	if (u < 0)
		u = -u;
	if (v < 0)
		v = -v;
	if (v > u) {   // swap u and v
		t = u;
		u = v;
		v = t;
	}
	while ((t = u % v) > 0) {
		u = v;
		v = t;
	}

	return v;
}

int gcd_iter(int u, int v)
{
	if (u < 0) u = -u;
	if (v < 0) v = -v;
	while ((u %= v) && (v %= u))
		;
	return (u + v);
}

int main()
{
	int (*gcd)(int, int);
	gcd = gcd_func;
//	gcd = gcd_iter;
//	gcd = gcd_spl;

	printf("gcd(112, 6) = %d\n", gcd(112, 6));
	printf("gcd(112, -6) = %d\n", gcd(112, -6));
	printf("gcd(-112, 6) = %d\n", gcd(-112, 6));
	printf("gcd(-112, -6) = %d\n", gcd(-112, -6));
	printf("gcd(2147483643, 42) = %d\n", gcd(2147483643, 42));
	
	return 0;
}

gcd_iter() is from Rosetta code http://www.rosettacode.org/wiki/Greatest_common_divisor#C.
It has unnecessary many remainder (%) - computations to keep the code short.
I simplified/straightened it to gcd_spl().
gcd_func() is your code in plain c.
If you run this code you notice the problems.
I suggest to adapt the Pull-request to contain something like gcd_spl().
I still could not do that, as I don't know github well enough and don't know how to compute with big ints in MPy internals.
I also suggest to include computation of gcd(2147483643, 42) and even an example with a larger (bigint) number into the tests.

Greetings,
Raul

@rkompass
Copy link

My next attempt: Is this the correct way to deal with potentially big integers? Can we compare them with ==, assign with =, compare against 0 with < 0 or do we need the binary ops in the latter cases?

// gcd(x, y): return the greatest common divisor
STATIC mp_int_t gcd_func(mp_int_t u, mp_int_t v) 
{
	mp_int_t t;

	if (mp_binary_op(MP_BINARY_OP_LESS, u, 0)
		u = mp_unary_op(MP_UNARY_OP_NEGATIVE, u)
	if (mp_binary_op(MP_BINARY_OP_LESS, v, 0)
		v = mp_unary_op(MP_UNARY_OP_NEGATIVE, v)
	if (mp_binary_op(MP_BINARY_OP_MORE, v, u)    // make sure that u >= v
		t = u;
		u = v;
		v = t;
	}
	while (t = mp_binary_op(MP_BINARY_OP_MODULO, u, v); mp_binary_op(MP_BINARY_OP_MORE, t, 0))
		u = v;
		v = t;
	}

	return v;
}

@jbbjarnason
Copy link
Contributor Author

jbbjarnason commented Feb 22, 2022

Hello jbbjarnason,

out of curiosity I had a look into this commit ( I want to learn how to contribute to MPy and the gcd topic seems simple). And I noticed you decided for an algorithm which is mathematically simple and correct but may lead to endless recursion if you e.g. compute gcd(2147483643, 42). Even under Linux this leads to Memory Error.

I tested the following algorithms:

#include <stdio.h>

int gcd_func(int x, int y) {
    if (x == 0) {
        return y;
    }
    if (y == 0) {
        return x;
    }
    if (x == y) {
        return x;
    }
    if (x > y) {
        return gcd_func(x - y, y);
    }
    return gcd_func(x, y - x);
}

int gcd_spl(int u, int v)
{
	int t;

	if (u < 0)
		u = -u;
	if (v < 0)
		v = -v;
	if (v > u) {   // swap u and v
		t = u;
		u = v;
		v = t;
	}
	while ((t = u % v) > 0) {
		u = v;
		v = t;
	}

	return v;
}

int gcd_iter(int u, int v)
{
	if (u < 0) u = -u;
	if (v < 0) v = -v;
	while ((u %= v) && (v %= u))
		;
	return (u + v);
}

int main()
{
	int (*gcd)(int, int);
	gcd = gcd_func;
//	gcd = gcd_iter;
//	gcd = gcd_spl;

	printf("gcd(112, 6) = %d\n", gcd(112, 6));
	printf("gcd(112, -6) = %d\n", gcd(112, -6));
	printf("gcd(-112, 6) = %d\n", gcd(-112, 6));
	printf("gcd(-112, -6) = %d\n", gcd(-112, -6));
	printf("gcd(2147483643, 42) = %d\n", gcd(2147483643, 42));
	
	return 0;
}

gcd_iter() is from Rosetta code http://www.rosettacode.org/wiki/Greatest_common_divisor#C. It has unnecessary many remainder (%) - computations to keep the code short. I simplified/straightened it to gcd_spl(). gcd_func() is your code in plain c. If you run this code you notice the problems. I suggest to adapt the Pull-request to contain something like gcd_spl(). I still could not do that, as I don't know github well enough and don't know how to compute with big ints in MPy internals. I also suggest to include computation of gcd(2147483643, 42) and even an example with a larger (bigint) number into the tests.

Greetings, Raul

@rkompass
Thank you Raul for letting me know I updated the code according to the recursive implementation from rosettacode.org, the one-liner works with your suggested input. I added more cases in the tests file.
Was there a reason you did not suggest using that algorithm?

@rkompass
Copy link

@jbbjarnason,

hello. In ((u %= v) && (v %= u)) in every iteration there are always two modulos, i.e. remainders computed. In the example gcd(112, 6) in the first step both 112%6 and 6%112 are computed, but only 112 % 6 makes sense. This is to avoid comparison of the two numbers and reordering, if necessary. I don't like that. I don't know how fast the remainder is in ARM but I suppose it takes longer then the comparison and reordering, but I'm not sure.
Did you test with bignums too?

@rkompass
Copy link

Hello

you could test with gcd(1279980503569793028, 1234567844460). (=12). The computation with recursion seems o.k. in the one-liner case with %.

@rkompass
Copy link

Was there a reason you did not suggest using that algorithm?

I misunderstood the question, thought you referred to the gcd_iter(). The one-liner from rosetta-code works by recursion. Recursion may consume lots of memory (stack??) if one does not know it's limited.

@jbbjarnason jbbjarnason force-pushed the math/gcd-new-function branch 2 times, most recently from aeacf3b to fe9dcc2 Compare February 25, 2022 20:40
@jbbjarnason jbbjarnason marked this pull request as draft February 25, 2022 22:07
@jbbjarnason
Copy link
Contributor Author

After the update to Python 3.9 on Windows the error which is mentioned here: #8301 is happening.

3 tests failed: uasyncio_basic uasyncio_heaplock uasyncio_wait_task
--- "C:/projects/micropython/tests/results\\extmod_uasyncio_basic.py.exp"	2022-02-26 18:34:08.483553500 +0000
+++ "C:/projects/micropython/tests/results\\extmod_uasyncio_basic.py.out"	2022-02-26 18:34:08.483553500 +0000
@@ -3,4 +3,4 @@
 short
 long
 negative
-took 20 40 0
+took 30 50 0
--- "C:/projects/micropython/tests/results\\extmod_uasyncio_heaplock.py.exp"	2022-02-26 18:34:07.087346800 +0000
+++ "C:/projects/micropython/tests/results\\extmod_uasyncio_heaplock.py.out"	2022-02-26 18:34:07.087346800 +0000
@@ -3,7 +3,7 @@
 2 0
 sleep
 1 1
-1 2
 2 1
+1 2
 1 3
 finish
--- "C:/projects/micropython/tests/results\\extmod_uasyncio_wait_task.py.exp"	2022-02-26 18:34:08.814773600 +0000
+++ "C:/projects/micropython/tests/results\\extmod_uasyncio_wait_task.py.out"	2022-02-26 18:34:08.814773600 +0000
@@ -5,6 +5,6 @@
 start
 hello
 world
-took 40 40
+took 50 50
 task_raise
 ValueError

@stinos
Copy link
Contributor

stinos commented Feb 26, 2022

It's unlikely that is related to Python 3.9: those tests are exectued using MicroPython, and also happen with other Python versions.

@jbbjarnason jbbjarnason force-pushed the math/gcd-new-function branch 2 times, most recently from 86495b7 to ce7a5fb Compare March 6, 2022 10:39
@jbbjarnason
Copy link
Contributor Author

@rkompass Me and my partner, @wang3450, have updated the math gcd algorithm. Can you review its functionality?
Note: We have added a test for numbers larger than 64 bit integers.

@rkompass
Copy link

I'll try to do it next week.

jbbjarnason added 3 commits April 3, 2022 13:27
Addition to Math Special.
Greatest common divisor.
Inputs are constrained to only integers.
Implemented with variadic amount of arguments.
Tests from CPython.
Link: https://github.com/python/cpython/blob/main/Lib/test/test_math.py.
ports/windows/.appveyor: Update Windows virtual env. to 3.9.
New Python features depend on more recent Python.
@jbbjarnason jbbjarnason marked this pull request as ready for review April 3, 2022 13:44
@jbbjarnason
Copy link
Contributor Author

I have added tests from CPython covering the gcd function, which should bring confidence to these changes.
@dpgeorge I re-published the pull request as it can be reviewed now.

@jbbjarnason
Copy link
Contributor Author

@dpgeorge should I split this PR into two PR, one for the CPython update and another one for the math.gcd function?

@dpgeorge
Copy link
Member

I'm just looking at this now, and have three comments/questions:

  1. What was the trigger to implement this function, why did you need it? MicroPython needs to keep its size down so new things are added only after careful consideration.
  2. Did you see the existing mpz_gcd() function which could be used to implement this?
  3. Generally if the feature requires a "modern" version of CPython to test against we include a .py.exp file so that the tests can run even with older CPython. Doing that would be a much simpler way to get this new gcd test working, rather than update CPython to 3.9 in CI. On the other hand, one can argue that we should increase the minimum CPython version required for testing, eg to 3.9, so that we can more easily test the modern Python features.

@dpgeorge
Copy link
Member

should I split this PR into two PR, one for the CPython update and another one for the math.gcd function?

Yes, I think that's a good idea, see point (3) above. That will separate the discussion about CPython test version requirements.

@jbbjarnason
Copy link
Contributor Author

  1. What was the trigger to implement this function, why did you need it? MicroPython needs to keep its size down so new things are added only after careful consideration.

We don't have an explicit example of usage, but gcd is often used in network security and other math applications. It was not exposed to the user in MP so we decided to add it in, as it is referenced here #1329 as being not yet implemented.

2. Did you see the existing mpz_gcd() function which could be used to implement this?

We were not aware of it.

3. Generally if the feature requires a "modern" version of CPython to test against we include a .py.exp file so that the tests can run even with older CPython. Doing that would be a much simpler way to get this new gcd test working, rather than update CPython to 3.9 in CI. On the other hand, one can argue that we should increase the minimum CPython version required for testing, eg to 3.9, so that we can more easily test the modern Python features.

At that point in time we were not aware of .py.exp files so we updated the environment instead.

Conclusion:

Should we abandon this PR since we don't have an explicit use case?

@stinos
Copy link
Contributor

stinos commented Apr 13, 2022

FWIW I've needed gcd more than once in numerical stuff, and then implemented it in Python with the typcial recursive method.

@massimosala
Copy link

Hi!
I sugget to see this algo:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm

@massimosala
Copy link

Or also this one:

def gcd(a, b) :
	while b:
		a, b = b, a % b
	return a

It is not recursive and is very efficient: it converges to the result in fewer cycles than using subtraction.

@massimosala
Copy link

Or also this one:

def gcd(a, b) :
	while b:
		a, b = b, a % b
	return a

It is not recursive and is very efficient: it converges to the result in fewer cycles than using subtraction.

I made a few benchmarks on C Python 3.8.
This gcd is faster than gcd_spl and I think it is also elegant, in its simplicity.

About recursive functions on microcontrollers... IMHO they are a big disgrace.

@jbbjarnason
Copy link
Contributor Author

@massimosala if I remember correctly the current implementation of this PR is doing really similar to what you mention in C, https://github.com/micropython/micropython/pull/8331/files#diff-4eb012569ff9502b6617d7166eee7dc32340c2d1c737690a8226039b12c2efc5R222-R226

tannewt added a commit to tannewt/circuitpython that referenced this pull request Aug 29, 2023
utilize CIRCUITPY_CONSOLE_UART_BAUDRATE parameter
@projectgus
Copy link
Contributor

This is an automated heads-up that we've just merged a Pull Request
that removes the STATIC macro from MicroPython's C API.

See #13763

A search suggests this PR might apply the STATIC macro to some C code. If it
does, then next time you rebase the PR (or merge from master) then you should
please replace all the STATIC keywords with static.

Although this is an automated message, feel free to @-reply to me directly if
you have any questions about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants