Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to parse command-style method calls more correctly #1949

Merged
merged 5 commits into from
Dec 5, 2023

Conversation

makenowjust
Copy link
Contributor

@makenowjust makenowjust commented Nov 30, 2023

Fix #1468
Fix #1575

To decide whether command-style method calls are allowed, this introduces a new parameter accepts_command_call to parse_expression and some functions.

Although one thinks this can be solved by operator precedence, it is hard or impossible, because the precedence of command-style calls is skewed (e.g. ! bar 1 is accepted, but foo = ! bar 1 and !! bar 1 are rejected.)

One of the most complex examples is that:

  1. Even though foo = bar = baz 1 and foo, bar = baz 1 are accepted,
  2. foo, bar = baz = fuzz 1 is rejected.

To implement this behavior, this introduces a new binding power PM_BINDING_POWER_MULTI_ASSIGNMENT and uses it to distinguish which single assignments or multi assignments at their RHS.

@makenowjust
Copy link
Contributor Author

I found another corner case foo 1 in x. In CRuby, it is rejected, but Prism accepts it. I think this PR's change cannot catch this kind of error, so this bug will be left in place for now. I will create a new issue for this.

Copy link
Collaborator

@kddnewton kddnewton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work @makenowjust! I have a couple of small-ish requests, but otherwise this looks really, really good.

src/prism.c Outdated Show resolved Hide resolved
src/prism.c Outdated Show resolved Hide resolved
@@ -16518,14 +16519,14 @@ parse_expression_infix(pm_parser_t *parser, pm_node_t *node, pm_binding_power_t
case PM_TOKEN_KEYWORD_AND: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split these into two different cases?

Suggested change
case PM_TOKEN_KEYWORD_AND: {
case PM_TOKEN_AMPERSAND_AMPERSAND: {
parser_lex(parser);
pm_node_t *right = parse_expression(parser, binding_power, PM_ERR_EXPECT_EXPRESSION_AFTER_OPERATOR, false);
return (pm_node_t *) pm_and_node_create(parser, node, &token, right);
}
case PM_TOKEN_KEYWORD_AND: {
parser_lex(parser);
pm_node_t *right = parse_expression(parser, binding_power, PM_ERR_EXPECT_EXPRESSION_AFTER_OPERATOR, true);
return (pm_node_t *) pm_and_node_create(parser, node, &token, right);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like my version because your suggestion is not clear about what is different between these at first glance.

@@ -16518,14 +16519,14 @@ parse_expression_infix(pm_parser_t *parser, pm_node_t *node, pm_binding_power_t
case PM_TOKEN_KEYWORD_AND: {
parser_lex(parser);

pm_node_t *right = parse_expression(parser, binding_power, PM_ERR_EXPECT_EXPRESSION_AFTER_OPERATOR);
pm_node_t *right = parse_expression(parser, binding_power, PM_ERR_EXPECT_EXPRESSION_AFTER_OPERATOR, parser->previous.type == PM_TOKEN_KEYWORD_AND);
return (pm_node_t *) pm_and_node_create(parser, node, &token, right);
}
case PM_TOKEN_KEYWORD_OR:
case PM_TOKEN_PIPE_PIPE: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about splitting.

if (
current_binding_powers.nonassoc &&
current_binding_powers.right <= pm_binding_powers[parser->current.type].left
) {
break;
}
if (accepts_command_call) {
// A command-style method call is only accepted on method chains.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method calls are very common, so this is going to be checked very frequently. I'm worried about the speed of parsing getting slowed down by checking this on every method call.

Call nodes already have a special set of flags on them. Could we add one for METHOD_CHAIN or something similar? That way we would set the flag when the node is created and here we would only have to check if the flag is set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to implement it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kddnewton

It has not yet been implemented for two reasons.

  1. I am not sure that this optimization is effective. In the first place, it is also unclear whether the current implementation has any impact on performance.
  2. The METHOD_CHAIN flag seems abusing of flags. This flag is used for parsing purposes and has no benefit to the user. I have concerns about such a flag to be exposed to users.

At least the current implementation works correctly. Optimization could be done after the merge.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makenowjust I agree. Let's do optimizations later. If you don't mind rebasing, I will merge this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, a method chain flag would be useful to e.g. rubyfmt so it wouldn't have to recompute that information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@froydnj Perhaps this flag is not what you expected. This flag should be named POSSIBLY_A_PART_OF_METHOD_CHAIN_FOR_OPTIMIZING_TO_PARSE_COMMAND_STYLE_METHOD_CORRECTLY, and it is used only for the parsing purpose (not for users).

makenowjust added a commit to makenowjust/prism-1 that referenced this pull request Nov 30, 2023
ruby#1949 (comment)

Co-Authored-By: Kevin Newton <kddnewton@gmail.com>
src/prism.c Outdated Show resolved Hide resolved
makenowjust added a commit to makenowjust/prism-1 that referenced this pull request Dec 2, 2023
makenowjust and others added 5 commits December 5, 2023 13:34
Fix ruby#1468
Fix ruby#1575

To decide command-style method calls are allowed, this introduce a new
parameter `accepts_command_call` to `parse_expression` and some
functions.

Although one think this can be solved by operator precedence, it is
hard or impossible, because the precedence of command-style calls is skewed
(e.g. `! bar 1 ` is accepted, but `foo = ! bar 1` is rejected.)

One of the most complex examples is that:
(1) even though `foo = bar = baz 1` and `foo, bar = baz 1` is accepted,
(2) `foo, bar = baz = fuzz 1` is rejected.
To implement this behavior, this introduces a new binding power
`PM_BINDING_POWER_MULTI_ASSIGNMENT` and uses it for distinguish which single
assignments or multi assignments at their RHS.
ruby#1949 (comment)

Co-Authored-By: Kevin Newton <kddnewton@gmail.com>
@makenowjust
Copy link
Contributor Author

Rebasing is done.

@kddnewton kddnewton merged commit d131265 into ruby:main Dec 5, 2023
46 checks passed
matzbot pushed a commit to ruby/ruby that referenced this pull request Dec 5, 2023
matzbot pushed a commit to ruby/ruby that referenced this pull request Dec 5, 2023
@makenowjust makenowjust deleted the fix-1468-1575 branch December 5, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants