-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discussion: introducing some keyword-only args for NX 3.0 #5063
Conversation
An important question to answer is: is this worth it? Even the mildest changes to the API are likely to break at least some existing code somewhere. There may be ways to make this less noisy (e.g. a decorator to convert kwarg-only exceptions to warnings), but no matter what it's likely to be a noisy change for existing code. In addition to the benefits mentioned above, having a slightly more strict interface can help prevent issues like e.g. #4845 . However, similar issues are mitigated by careful use of the existing API, and introducing more strict rules about how functions can be called will likely break users who are not experiencing such errors. However, for new additions to the API, there are no such concerns about breaking existing usage, so it would definitely make sense to consider adding kwarg-only specifiers to the signatures of new functions where appropriate. Definitely something to keep in mind when reviewing contributions! |
As I originally read this, I was a bit confused as to why this would be needed, then I realized that it was probably because all of my recent work with NetworkX has been from a developer prospective were I can read all of the docstrings and implementations without needing to refer to the website or GitHub repo rather than a user prospective. That being said, I agree that there is a possibility to increase the overall readability of NetworkX code across the board. This is still a new area of discussion for me, I'm not used to having to think about the overall API in programs that I am normally writing, but I think that we should be careful about your comment about new additions to the API. I don't think that we would want inconsistent keyword requirements based on something as seemingly arbitrary to an end-user as when a function was added to the library. At least to me, this is something that we would have to commit to entirely or not enforce at all and maybe try to encourage 'proper' usage by using keyword arguments if that is at all possible. (If I'm totally off base here, just let me know) |
Definitely not - you've made a lot of nice observations that get at the core of the problem!
This is an insightful comment 👍 it's very easy to lose sight of other perspectives when you're used to reading the source code any time you don't understand something!
Agreed, I think deciding which (if any) new kwargs should be keyword-only is the tricky part. You're right that we definitely don't want it to seem arbitrary and there's a lot of subjectivity built-in here. For example, a usage-pattern that may feel natural to some might feel overly verbose to others, and unfortunately there is no "right" answer. IMO flexibility is one of the strengths of Python, but it's a double-edged sword and can catch unwary users out when not paying close attention to function signatures: see e.g. #4845 .
I have a slightly different opinion here - IMO the penalty for breaking backwards compatibility is much higher than that of making new function signatures overly-restrictive. If a new function is introduced with keyword-only arguments and users raise issues about it, it's very easy to back the restriction out. OTOH, if we modify existing signatures we risk breaking user code that already runs. Thus I think if we're going to consider adopting keyword-only arguments at all, that doing so in new functions (that have no backwards compatibility concerns) is the place to start |
I'm going to go ahead and close this as the purpose was to have collection point for discussion and it has served that purpose! IMO the next step here is probably to formalize a plan via NXEP. |
Related to #4100
As an experiment, I converted the
convert_matrix
module to use keyword-only args to see what it would take and how disruptive the changes would be within NX (as a proxy for how disruptive the change would be for general users).I naively chose to be very aggressive in specifying what must be kwarg-only: basically, I made it so that every keyword argument (i.e. that has a default value) must be named when calling the function. Some observations from this experiment:
Switching to keyword-only arguments is likely to break people
As expected, making this switch did break some tests and some internal usages of e.g.
nx.to_numpy_array
in other NetworkX functions. For example, about half of the tests in thetest_pandas.py
suite fail due to the keyword-only requirement on things like:Of course, failing on things like this is kind of the point, because the above code is not easy to understand without looking up the
from_pandas_edgelist
signature, whereas the function call that obeys the kwarg-only requirements is much more readable:Nevertheless, I think this indicates that switching to keyword-only arguments is likely to catch people out and raise a lot of new
TypeErrors
on previously working code.Deciding which arguments exactly should be keyword only is going to be tricky
In this experiment, I made all of the arguments after the first kwarg-only. This is because all of the conversion functions follow the same general pattern where the first argument is the main data structure to be acted upon (a graph, a numpy array, scipy sparse matrix, etc.) and all of the other arguments are optional. For cases like the pandas conversion functions above I think this made sense, as it's not at all clear what the 2nd and 3rd positional arguments represent without additional context.
However, for e.g. the
to_numpy_array
case, making thenodelist
argument keyword only caused quite a few failures on internal uses ofto_numpy_array
in other nx functions. There, the common pattern seemed to be:Forcing
nodelist
to be kwarg only causes the above to fail, requiringnx.to_numpy_array(G, nodelist=nodelist)
instead. Based on the number of places this usage pattern shows up, it seems like there is a pretty strong convention that the 2nd positional argument of theto_*
conversion functions isnodelist
. This highlights one of the challenges in deciding where the kwarg-only cutoff should be. My general sense is that we should not be overly aggressive in making things kwarg-only if it will be disruptive without any obvious benefit.Takeaways
My thoughts after conducting this experiment:
to_numpy_array(G, nodelist=None, *, ...)
instead ofto_numpy_array(G, *, nodelist=None, ...)
.