Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-40334: Support type comments #19780

Merged
merged 12 commits into from
Apr 30, 2020
Merged

bpo-40334: Support type comments #19780

merged 12 commits into from
Apr 30, 2020

Conversation

gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented Apr 29, 2020

This implements full support for # type: <type> comments, # type: ignore <stuff> comments, and the func_type parsing mode for ast.parse() and compile().

Closes we-like-parsers#95

https://bugs.python.org/issue40334

This does not yet support the "long form" for arguments with type
comments.

We also totally skip `# type: ignore` comments, and there is no
support for func_type_input yet.
This involved a complete refactor of the part of the grammar used to
recognise formal parameters in function definitions.  I renamed many
of the rules involved.

The rewrite always includes the comma following a parameter in that
argument, so that we can also include the type comment, if any (which
follows the comma).

The final parameter may not be followed by a comma, so we allow the
comma to be omitted, but only if a close parenthesis follows instead
(preceded by the type comment, if any).

Some additional complications are needed because the code generator
doesn't actually support alternatives that may be empty.

PS. I didn't refactor the corresponding rules for lambdas, since they
don't have type comments.
@gvanrossum gvanrossum marked this pull request as draft April 29, 2020 00:18
@gvanrossum gvanrossum changed the title bpo-40334: Support type comments bpo-40334: Support type comments [WIP] Apr 29, 2020
@gvanrossum
Copy link
Member Author

Wait, I forgot to un-skip the test for long-form arguments and it segfaults. On it.

@gvanrossum gvanrossum changed the title bpo-40334: Support type comments [WIP] bpo-40334: Support type comments Apr 29, 2020
@gvanrossum
Copy link
Member Author

Hm... when parsing something as simple as

def f(a): pass

it seems the end column number for a is two characters too far. I'll investigate tomorrow.

@gvanrossum gvanrossum marked this pull request as ready for review April 29, 2020 16:00
@gvanrossum
Copy link
Member Author

Dang I already have conflicts. Looking...

Copy link
Contributor

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! Apart from all the TYPE_COMMENTS work, I strongly feel that the arguments improvements made a significant difference in readability.

Grammar/python.gram Outdated Show resolved Hide resolved
Grammar/python.gram Outdated Show resolved Hide resolved
Parser/pegen/pegen.c Outdated Show resolved Hide resolved
Parser/pegen/pegen.c Show resolved Hide resolved
Parser/pegen/pegen.c Outdated Show resolved Hide resolved
Parser/pegen/pegen.h Outdated Show resolved Hide resolved
Parser/pegen/pegen.h Outdated Show resolved Hide resolved
@lysnikolaou
Copy link
Contributor

It seems there's a failure in test_type_comments. Was it there before as well?

@gvanrossum
Copy link
Member Author

The test_type_comments failure is new, looking now.

Copy link
Member Author

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All your other suggestions will appear in the next commit.

Parser/pegen/pegen.c Show resolved Hide resolved
Parser/pegen/pegen.h Outdated Show resolved Hide resolved
@@ -25,7 +25,7 @@ _PyPegen_add_type_comment(Parser *p, arg_ty a, char *tc)
if (tc == NULL) {
return a;
}
PyObject *tco = _PyPegen_new_type_comment(p, tc);
PyObject *tco = NEW_TYPE_COMMENT(tc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't NEW_TYPE_COMMENT expect tc to be a Token *, i.e. that tc->bytes is a bytes object? I feel that NEW_TYPE_COMMENT would better be off as an inline function.

@lysnikolaou
Copy link
Contributor

Also, all the calls to NEW_TYPE_COMMENT should be wrapped with CHECK_CALL_NULL_ALLOWED.

@@ -1483,13 +1597,13 @@ _PyPegen_get_values(Parser *p, asdl_seq *seq)

/* Constructs a NameDefaultPair */
NameDefaultPair *
_PyPegen_name_default_pair(Parser *p, arg_ty arg, expr_ty value)
_PyPegen_name_default_pair(Parser *p, arg_ty arg, expr_ty value, char *tc)
Copy link
Contributor

@lysnikolaou lysnikolaou Apr 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tc should be a Token * here.

}

arg_ty
_PyPegen_add_type_comment_to_arg(Parser *p, arg_ty a, char *tc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tc should be a Token * here.

@lysnikolaou
Copy link
Contributor

lysnikolaou commented Apr 30, 2020

To recap, because it might not be clear what my latest reviews are suggesting. I think that the following are needed:

  1. Change the type of the tc argument in _PyPegen_add_type_comment_to_arg and _PyPegen_name_default_pair to Token *.
  2. Wrap calls to NEW_TYPE_COMMENT with CHECK_CALL_NULL_ALLOWED.

I'd also make NEW_TYPE_COMMENT an inline function, but this is certainly not needed.

@gvanrossum
Copy link
Member Author

Got it. I'm putting the CHECK_CALL_NULL_ALLOWED call inside NEW_TYPE_COMMENT -- does that make sense? I don't see that pattern used elsewhere.

if (tc == NULL) {
return NULL;
}
return CHECK_CALL_NULL_ALLOWED(p, _PyPegen_new_type_comment(p, PyBytes_AsString(tc->bytes)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will segfault if PyBytes_AsString fails. I think the best way to do it is assign it to a variable beforehand and check for NULL, but that means the the CHECK_NULL_ALLOWED call cannot be inside NEW_TYPE_COMMENT or it has to be called multiple times, once for each time a function is called that could potentially return NULL, which isn't much more beautiful than having it in the grammar IMHO. What do you think?

Suggested change
return CHECK_CALL_NULL_ALLOWED(p, _PyPegen_new_type_comment(p, PyBytes_AsString(tc->bytes)));
char *bytes = PyBytes_AsString(tc->bytes);
if (!bytes) {
return NULL;
}
return _PyPegen_new_type_comment(p, bytes);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will try again. I really don't want to litter the grammar with CHECK_CALL_NULL_ALLOWED(NEW_TYPE_COMMENT(p, tc)) so I am changing NEW_TYPE_COMMENT to set p->error_indicator on every error exit.

Copy link
Contributor

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing and it's good to go.

# type_expressions allow */** but ignore them
type_expressions[asdl_seq*]:
| a=','.expression+ ',' '*' b=expression ',' '**' c=expression {
_PyPegen_seq_append_to_end(p, _PyPegen_seq_append_to_end(p, a, b), c) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_PyPegen_seq_append_to_end(p, _PyPegen_seq_append_to_end(p, a, b), c) }
_PyPegen_seq_append_to_end(p, CHECK(_PyPegen_seq_append_to_end(p, a, b)), c) }

@gvanrossum
Copy link
Member Author

@lysnikolaou Thank you so much for all your help on this one! It's been invaluable.

Copy link
Contributor

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My pleasure!

@gvanrossum gvanrossum merged commit c001c09 into master Apr 30, 2020
@gvanrossum gvanrossum deleted the type-comments branch April 30, 2020 19:12
@lysnikolaou
Copy link
Contributor

lysnikolaou commented May 1, 2020

Currently working on support for feature_version and I just realised that there is no assignment to tok->type_comments in _PyPegen_run_parser_from_file_pointer like the one in _PyPegen_run_parser_from_file_pointer. Is this okay or should it be fixed?

@gvanrossum
Copy link
Member Author

Oops. Fixing in #19828

@hauntsaninja
Copy link
Contributor

This doesn't seem to be able to handle function type comments where there aren't "normal" argument types preceding the *args and **kwargs types. I've opened #19894 with a fix, but it's my first time working with CPython's grammar, so apologies in advance for any crimes against parsers committed.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting type_comments=True
5 participants