Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: make arange start argument positional-only #25336

Merged
merged 4 commits into from Jan 18, 2024

Conversation

ev-br
Copy link
Contributor

@ev-br ev-br commented Dec 7, 2023

Make np.arange first argument positional-only. This does break some backwards compat. This PR removes a test, which also shows typical usages this patch disallows. The weirdest one is probably np.arange(stop=3), where the start is inferred to have a zero default. FWIW, the current arange sinature, arange([start,] stop[, step,], where the first argument has a default value and the second does not, is not very easy to reconstruct with just pure python rules.

cross-ref #23999 (comment) and #23999 (comment)

@@ -3062,7 +3062,7 @@ array_arange(PyObject *NPY_UNUSED(ignored),
NPY_PREPARE_ARGPARSER;

if (npy_parse_arguments("arange", args, len_args, kwnames,
"|start", NULL, &o_start,
"", NULL, &o_start,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make this "|" you would still allow stop= on its own, I think? Or is there some other reason to do this, besides arange(start=3) being silly and it doesn't seem worthwhile to explicitly error out there to allow kwarg use for start?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, it's just consistency and cognitive load.
arange(3) means arange(stop=3), which is arange(start=0, stop=3, step=1). What is stop in arange(start=3) (not what numpy allows currently, what it should be? I've no idea and no intuition).

Having the first argument "|" feels strange. If the first argument is optional, it should mean that arange() is legal. But it is not and should not be IMO.

Overall, I think what's here is the best approximation for arange(start=0, stop, step=1) which is impossible without a crystal ball.

@ngoldbaum
Copy link
Member

ngoldbaum commented Dec 7, 2023

FWIW, a github search finds many real-world usages of at least the np.arange(stop= pattern. This is enough usage IMO that if we do want to change this behavior it needs to be a deprecation first, if we want to deprecate it at all.

@ev-br
Copy link
Contributor Author

ev-br commented Dec 9, 2023

I wonder how much of the code from that github search is going to be broken by other NumPy 2.0. changes and cleanups.
On a high level, yes, this is a breaking change. Were it not for the breaking change window of 2.0, this would have been a no-go of course. If we are breaking BC, this is IMO worth it.
If the numpy folks believe it's too much, ¯\_(ツ)_/¯.

@ngoldbaum
Copy link
Member

The goal with all the python migrations was either to expire existing deprecations, in which case breaking changes are fine, and otherwise look for existing usages and gauge whether to outright remove something or merely deprecate it in 2.0 based on the level of existing usage inferred by public code searches.

There are definitely lots of new deprecations in numpy 2.0 - accessing or importing almost anything from np.core for example - that I think it should also be viewed as a release that also has some new deprecations along with outright backward-incompatible changes.

@charris
Copy link
Member

charris commented Dec 9, 2023

This seems dangerous to me. Some downstream projects will need to make fixes when 2.0 comes out, but changing a widely used function may cause more churn than needed. It could maybe be justified if it uncovered unsuspected bugs, but it isn't clear to me that that is the case here.

@seberg
Copy link
Member

seberg commented Dec 19, 2023

Let's just change this to keep allowing arange(stop=3) and get it over with? I don't care if the documentation looks like it might not be allowed.

@ev-br
Copy link
Contributor Author

ev-br commented Dec 22, 2023

If you make this "|" you would still allow stop= on its own

and

Let's just change this to keep allowing arange(stop=3)

Not sure how to make it work, actually. Using

diff --git a/numpy/_core/src/multiarray/multiarraymodule.c b/numpy/_core/src/multiarray/multiarraymodule.c
index d673d95c02..653895c554 100644
--- a/numpy/_core/src/multiarray/multiarraymodule.c
+++ b/numpy/_core/src/multiarray/multiarraymodule.c
@@ -3062,7 +3062,7 @@ array_arange(PyObject *NPY_UNUSED(ignored),
     NPY_PREPARE_ARGPARSER;
 
     if (npy_parse_arguments("arange", args, len_args, kwnames,
-            "", NULL, &o_start,
+            "|", NULL, &o_start,
             "|stop", NULL, &o_stop,
             "|step", NULL, &o_step,
             "|dtype", &PyArray_DescrConverter2, &typecode,

results in

In [2]: np.arange(stop=4)
---------------------------------------------------------------------------
SystemError                               Traceback (most recent call last)
Cell In[2], line 1
----> 1 np.arange(stop=4)

SystemError: NumPy internal error: non-kwarg marked with | or $ to arange() at argument 1.

@seberg
Copy link
Member

seberg commented Dec 22, 2023

🤷, I suspect you can just change the parsing code to allow | being used without a kwarg. The thing that would break things is probably kwarg after positional only arg. But I think that check is already missing anyway.
If not, I guess parsing start manually will work (maybe slightly odd errors, but probably not bad).

@seberg
Copy link
Member

seberg commented Dec 22, 2023

OK, I pushed the needed changes. You could even add $start now additionally, although that would require explicitly checking that not both of them were passed.

@ngoldbaum
Copy link
Member

This seems fine to me. It needs a release note though. If it turns out this breaks code and we get complaints we could consider only deprecating using start as a keyword, but also people can just delete the start= and their code will continue to work with older numpy versions...

@ev-br
Copy link
Contributor Author

ev-br commented Dec 25, 2023

Thank you Sebastian!
Release notes snippet added.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comments, so will just apply thos e suggestions and merge, thanks.

doc/release/upcoming_changes/25336.compatibility.rst Outdated Show resolved Hide resolved
doc/release/upcoming_changes/25336.compatibility.rst Outdated Show resolved Hide resolved
@seberg seberg merged commit a7c6be5 into numpy:main Jan 18, 2024
59 of 63 checks passed
@fancidev
Copy link

I read through the NumPy 2.0 release notes and all makes perfect sense except this change. Leaving backward compatibility aside, the construct arange(start=1, stop=11) makes more sense than arange(1,stop=11) which is what I have to write in the new API.

The Python built-in range() function disallows keyword arguments altogether, which is less ideal than supporting all keywords arguments but better than a mixture of keyword and positional arguments.

If the purpose is to disallow arange(start=1), how about making the signature arange(*args, **kwargs) and then parse the arguments? This also maintains backward compatibility.

@seberg
Copy link
Member

seberg commented Jan 25, 2024

No big opinion from me. I think you can't translate to a pretty signature, but the change itself would be pretty simply since this is in C (which actually makes it more convenient here).

@mattip
Copy link
Member

mattip commented Mar 7, 2024

I am confused as to why we did this change that breaks a lot of existing code, especially since we disallowed arange(start=, stop=).

[T]he current arange sinature, arange([start,] stop[, step,], where the first argument has a default value and the second does not, is not very easy to reconstruct with just pure python rules.

Is this the entire motivation for this change, or is there more? Does "rules" refer to some typing problems? It seems a pure-python version could easily mimic whatever we do in C.
I must be missing something.

@mattip
Copy link
Member

mattip commented Mar 7, 2024

Ahh, this fixes the problem mentioned elsewhere that

The worst of what we had was that arange(start=3) was actually arange(stop=3)

I think we should make that change, but allow arange(start=3, stop=5) as well as arange(3, stop=5)

Edit: that comment seems mistaken. I will continue the discussion on #25743.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants