Enhancements to autodoc/embedsignature #5415
Note: The changes from this PR are opt-in: code that sets |
Seems cool overall. It's definitely an opt-in change, given that a lot of type information is lost here.
We probably need a test for passing PyLong values into a fused function in Py2, seems to be missing.
My main motivation is to emit something that would be valid Python syntax and compatible with type hints, I'm using these pure signatures to generate *.pyi stubs, see here . That means we cannot emit C type names as is. And I think you agree that honoring explicit type annotations is OK. Then what remains is how to map a C type to some syntactically valid Python identifier/expression. I decided to map C integral/real-floating/complex-floating to Python int/float/complex type names. But we could do things differently. For example, map |
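The mapping described in this comment (C integral, real-floating and complex-floating types to `int`, `float` and `complex`, with explicit annotations taking priority) could be sketched roughly as follows. This is only an illustration: the table `C_TO_PYTHON`, the helper `python_annotation` and the particular C type names listed are assumptions made for the example, not code from this PR.

```python
# Hypothetical lookup table: C type name -> Python annotation name.
# The three categories follow the comment above; the concrete C type
# names chosen here are illustrative, not Cython's internal tables.
C_TO_PYTHON = {
    "int": "int",
    "long": "int",
    "unsigned int": "int",
    "size_t": "int",
    "float": "float",
    "double": "float",
    "float complex": "complex",
    "double complex": "complex",
}


def python_annotation(c_type, explicit=None):
    """Return an annotation string for an argument declared with a C type.

    An explicit Python annotation always wins. Unknown C types yield
    None, meaning "emit no annotation" in this sketch.
    """
    if explicit is not None:
        return explicit
    return C_TO_PYTHON.get(c_type.strip())


print(python_annotation("double"))
print(python_annotation("long", explicit="MyInt"))
print(python_annotation("struct foo"))
```

Returning `None` for unmapped types is a design choice of the sketch; it keeps the output valid Python by omitting the annotation rather than emitting a C type name.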
Cool. It's usually a good idea to begin generated files with a comment like "# This file was generated by …", to avoid accidental manual changes after the fact, and to leave a note how to regenerate the file on changes. (There's an additional debate whether generated files should be under version control, but I doubt that there's a general answer to that question.)
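As a small illustration of that suggestion, a stub generator could prepend a header like this. The tool name `make_stubs.py`, the regeneration command and the helper `write_stub` are placeholders invented for the example.

```python
import io

# Header template following the "# This file was generated by ..." advice.
HEADER = (
    "# This file was generated by {tool}. Do not edit by hand.\n"
    "# To regenerate, run: {command}\n"
    "\n"
)


def write_stub(stream, body, tool, command):
    """Write the generated-file header followed by the stub body."""
    stream.write(HEADER.format(tool=tool, command=command))
    stream.write(body)


buf = io.StringIO()
write_stub(
    buf,
    "def foo(arg: int) -> float: ...\n",
    tool="make_stubs.py",               # hypothetical generator script
    command="python make_stubs.py",     # hypothetical regeneration command
)
print(buf.getvalue())
```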
Not just syntactically valid, also semantically meaningful. Mapping C integer types to Python EDIT: Actually, I think we even accept Python |
Also note that return types are very different from argument types. If the C return type is known, then we can map it to an exact Python type and write that as return type annotation. If a C argument type is given, then the C-API conversion rules apply and the range of Python types that we accept for it may be broader. It would be great to express that in the generated type annotations. There's also a possible mapping from |
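The asymmetry between argument and return annotations can be illustrated in plain Python. In this sketch, `operator.index` merely stands in for a conversion step; it is not a statement of the conversion rules Cython actually applies, and `Celsius` and `double_it` are made-up names.

```python
import operator
from typing import SupportsIndex


class Celsius:
    """A non-int object that can still be converted via __index__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __index__(self):
        return self.degrees


def double_it(value: SupportsIndex) -> int:
    # The argument annotation is broad: anything convertible is accepted.
    # The return annotation is exact: the result is always a Python int.
    return 2 * operator.index(value)


print(double_it(21))
print(double_it(Celsius(4)))
```

The argument side is annotated with the broadest type the conversion accepts, while the return side names the one exact type that comes back.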
You're essentially working on #3150 |
I almost never get contributions for low-level stuff, so this has not been an issue, but you are definitely right.
The generation step involved compiled code (importing the ext module), and that complicates matters with cross-compilation. I have CI tests that make sure that the checked-in code is always in sync with the generated output.
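A CI check of that kind might look something like the sketch below. It is an assumption about how such a test could be written, not the author's actual setup; `is_in_sync` and the file name `module.pyi` are hypothetical.

```python
import tempfile
from pathlib import Path


def is_in_sync(checked_in: Path, generated: str) -> bool:
    """Return True if the checked-in file matches freshly generated text."""
    return (
        checked_in.exists()
        and checked_in.read_text(encoding="utf-8") == generated
    )


# Demonstration with a temporary directory standing in for the repository.
with tempfile.TemporaryDirectory() as tmp:
    stub = Path(tmp) / "module.pyi"
    stub.write_text("def foo(arg: int) -> float: ...\n", encoding="utf-8")
    fresh = is_in_sync(stub, "def foo(arg: int) -> float: ...\n")
    stale = is_in_sync(stub, "def foo(arg: float) -> float: ...\n")

print(fresh, stale)
```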
Of course, but the "semantically meaningful" part is quite hard in the duck-typing scenario.
You are right, of course, but maybe practicality beats purity? I don't think the stdlib stubs from typeshed are 100% accurate either.
Yes, we accept float for C integer types, because at some point we fall back to
I didn't want to go there just yet, we also have the new generic alias syntax
Well, my changes here are much more humble. This PR does not pretend to solve all the issues with typing stub generation, but it is nonetheless a small step forward while keeping the focus on this being about docstrings and signatures. This was a rather long reply. At this point I'm a bit confused whether you are OK with this PR as it is now, or whether there are strictly necessary changes that you want me to incorporate. Your concerns about specializing |
Absolutely. And that's the only real issue that I see here. If we settle on a specific kind of output, then users will start relying on it, and it will be difficult to change in the future. The main problem here is the different audiences. Type annotations are for humans, whereas static types are for compilers. Mostly. Just because Cython can generate code for a type conversion does not mean that it's intended to be used that way. For correctness, the type annotations that we generate would have to be as broad as possible, meaning: any index instead of |
My gut feeling is that users who write C The fact that we allow two kinds of declarations and prefer the Python one is pragmatic, but also bears a risk of confusion and mix of syntaxes. I'm really not sure that I want to encourage that. But it would be good to read a few more opinions. |
… unless you use them as the way you declare types for Cython. In that case, they are type declarations, not just annotations. The fact that you can mix both in
I didn't suggest to make them invalid syntax, but I also wouldn't want to present mixing both forms of type declarations as the best of both worlds. I don't want Python type annotations to be considered a second class citizen that is only looked at if there is nothing more convincing and can otherwise be happily ignored by the compiler.
Why is this better than the following? def foo(arg: Annotated[cython.int, MaxValueAllowed(42)]) -> Any:
... |
I actually think this discussion belongs into #3150. Let's continue it there. |
Indeed
Which maybe is a good thing, totally in line with Python's "explicit is better than implicit"?
As Cython accepts Py
Python
Consenting adults? Again, remember that my proposal is opt-in. If you do not opt in, you will get whatever Cython is doing right now.
So, we are stuck on how to map a C type to an annotation, and only when an explicit annotation is not provided (I assume I convinced you we cannot disable C types with annotations in this #5415 (comment)). I have the feeling that sometimes you forget this PR is about docstrings and not about generating accurate typing stubs. To summarize alternatives, if we have:
then the generated embedded signature could be
I would not object strongly to the following relatively trivial mapping:
but I still think that |
There's also |
That would require implementing the text_signature format. You already made me remove from this PR the bits that would simplify that work. Nonetheless, PS: As I said before, I think |
I think that `int`, `float`, `complex` are easier for humans to consume.
I agree. Let's do this: we keep it as simple as you wrote it for now, remove the documentation part that talks about mixing syntax, and add a note that the specific output and type mapping is experimental and may change over time.
Basically, we acknowledge that the feature is helpful and guarantee that users will get something reasonable out of it, just not exactly what.
|
Your reluctance to accept this PR made me think harder. We can do slightly better: have a compiler directive |
That sounds very reasonable. I'd call the "pure" version "python", though, and the "full" one "c" (and require lower case and reject everything else). I think that's clearer. |
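The "require lower case and reject everything else" rule could be expressed as a small validator along these lines. The function `parse_format` is hypothetical and not taken from the PR; the three accepted values are those listed for `embedsignature.format` in the documentation hunk of this PR.

```python
# Accepted values, as listed in the documentation hunk of this PR.
ALLOWED_FORMATS = ("c", "python", "clinic")


def parse_format(value):
    """Accept only the exact lower-case names; reject everything else."""
    if value not in ALLOWED_FORMATS:
        raise ValueError(
            "embedsignature.format must be one of %s, got %r"
            % (", ".join(ALLOWED_FORMATS), value)
        )
    return value


print(parse_format("python"))
try:
    parse_format("Python")  # wrong case is rejected, not normalised
except ValueError as exc:
    print(exc)
```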
@scoder I took the liberty of using Please double check the updated documentation entry for the new compiler directive. Regarding implementation, I used the |
I had also first used |
OK, fair enough. I'll make the change. |
Done. I'll wait for CI and your approval to merge. Otherwise, click the merge button yourself.
Looks good overall, just a few comments.
change over time. If set to ``clinic`` and the ``binding`` | ||
directive is set to False, Cython will generate signatures | ||
compatible with CPython's Argument Clinic. Default is | ||
``c``. |
So, what happens if `embedsignature.format=clinic` and `binding=True`?
No signature is generated, because CythonFunction does not support the `__text_signature__` descriptor like CPython's `builtin_function_or_method`.
How would you like to rephrase the documentation?
As I told you before, this `clinic` format is of little use in Cython 3.
> As I told you before, this `clinic` format is of little use in Cython 3.

Perhaps a stupid question, but why include it at all then?
Personally I'd say "The clinic format generates signatures that can be understood by CPython's argument clinic tool and used to generate the `__text_signature__` attribute. This is mainly useful when `binding=False`, since the functions generated with `binding=True` do not have a `__text_signature__` attribute."
I'm not sure I'd disable it when `binding=True` (even if it's a bit pointless there).
"little use" does not mean "no use". Those that for whatever reason want to set binding=False
may find the new option useful to get `__text_signature__` and then the `inspect` module to work closer to CPython builtin functions and methods.
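For readers unfamiliar with the attribute being discussed, the following plain-Python sketch shows the CPython behaviour the comment refers to: builtins such as `len` carry a `__text_signature__` string that `inspect.signature` can parse, whereas ordinary Python functions have no such attribute. It says nothing about Cython's own function type; `python_level` is just a throwaway example function.

```python
import inspect


def python_level(obj):
    """An ordinary Python function, used only for comparison."""
    return len(obj)


# CPython builtins expose their signature as a string attribute...
builtin_text = len.__text_signature__
# ...which inspect.signature() parses into a Signature object.
builtin_params = list(inspect.signature(len).parameters)

# Plain Python functions do not have the attribute; inspect reads
# their signature from the code object instead.
has_text_signature = hasattr(python_level, "__text_signature__")

print(builtin_text)
print(builtin_params)
print(has_text_signature)
```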
> I'm not sure I'd disable it when binding=True (even if it's a bit pointless there).

It's not only that it is pointless; it looks awful as well, e.g.
($self, *args, **kwargs)
--
<BLANKLINE>
<BLANKLINE>
Function docstring goes here.
@@ -854,6 +854,25 @@ Cython code. Here is the list of currently supported directives: | |||
signature, which cannot otherwise be retrieved after | |||
compilation. Default is False. | |||
|
|||
``embedsignature.format`` (``c`` / ``python`` / ``clinic``) |
@da-woods I reworded the documentation as per your comments. Note, however, that the argument clinic tool does not generate signatures; instead, you have to write them by hand following the proper format.
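For reference, opting in to the new directive might look roughly like this. The directive names and values come from the documentation hunk in this PR; the file name `mymodule.pyx` is a placeholder, and the `cythonize` call is left commented out so the snippet runs without Cython installed.

```python
# Directive names and values as listed in the documentation hunk of this PR.
compiler_directives = {
    "embedsignature": True,
    "embedsignature.format": "python",
}

# Typical use from a setup.py ("mymodule.pyx" is a placeholder):
# from Cython.Build import cythonize
# ext_modules = cythonize("mymodule.pyx", compiler_directives=compiler_directives)

# Equivalent per-file form: a special comment at the top of the .pyx file.
header_comment = "# cython: " + ", ".join(
    f"{name}={value}" for name, value in compiler_directives.items()
)
print(header_comment)
```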
Can we get this PR merged before the next beta release? |
Thanks. Very nice improvement. |
Add `embedsignature.pure` compiler directive to generate pure-Python type annotations compatible with type hinting syntax. This mostly amounts to:
See #3150