Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug with *rest parameter followed by required parameters #1436

Closed
seven1m opened this issue Sep 10, 2023 · 12 comments
Closed

Possible bug with *rest parameter followed by required parameters #1436

seven1m opened this issue Sep 10, 2023 · 12 comments

Comments

@seven1m
Copy link
Contributor

seven1m commented Sep 10, 2023

Is this a bug?

irb(main):001:0> node = YARP.parse('def foo(x, *splat, y); end').value
=> 
ProgramNode(0...26)(                                                                     
...                                                                       
irb(main):002:0> node.child_nodes.first.child_nodes.first.parameters.child_nodes
=> [RequiredParameterNode(8...9)(:x), RequiredParameterNode(19...20)(:y), RestParameterNode(11...17)(:splat, (12...17), (11...12)), nil, nil]

I would expect to see:

RequiredParameterNode, RestParameterNode, RequiredParameterNode

...but instead I see

RequiredParameterNode, RequiredParameterNode, RestParameterNode
seven1m added a commit to natalie-lang/natalie that referenced this issue Sep 10, 2023
I was trying to get test/natalie/method_test.rb parsing (and passing),
but I think I found a few unexpected results from YARP. I think these
are bugs, which I filed here:

ruby/prism#1435
ruby/prism#1436
seven1m added a commit to natalie-lang/natalie that referenced this issue Sep 10, 2023
I was trying to get test/natalie/method_test.rb parsing (and passing),
but I found a few unexpected results from YARP. I think these are bugs,
which I filed here:

ruby/prism#1435
ruby/prism#1436
@seven1m
Copy link
Contributor Author

seven1m commented Sep 10, 2023

If I change the code at https://github.com/ruby/yarp/blob/main/src/yarp.c#L9115-L9120 from:

                if (params->rest == NULL) {
                    yp_parameters_node_rest_set(params, param);
                } else {
                    yp_diagnostic_list_append(&parser->error_list, param->base.location.start, param->base.location.end, YP_ERR_PARAMETER_SPLAT_MULTI);
                    yp_parameters_node_posts_append(params, (yp_node_t *) param);
                }

...to just:

                yp_diagnostic_list_append(&parser->error_list, param->base.location.start, param->base.location.end, YP_ERR_PARAMETER_SPLAT_MULTI);
                yp_parameters_node_posts_append(params, (yp_node_t *) param);

...it fixes the issue. But I don't think that is the proper solution, since then we aren't setting the ParametersNode#rest attr.

[edit] And now I understand that posts are "post optional arguments", which seem to be for something else. I think I'm out of my depth here. 😅

@kddnewton
Copy link
Collaborator

So I think this behavior is actually as expected, though very confusing. It's easier if you look at the inspect output:

$ bin/parse -e 'def foo(x, *splat, y); end'
@ ProgramNode (location: (0...26))
├── locals: []
└── statements:
    @ StatementsNode (location: (0...26))
    └── body: (length: 1)
        └── @ DefNode (location: (0...26))
            ├── name: :foo
            ├── name_loc: (4...7) = "foo"
            ├── receiver: ∅
            ├── parameters:
            │   @ ParametersNode (location: (8...20))
            │   ├── requireds: (length: 1)
            │   │   └── @ RequiredParameterNode (location: (8...9))
            │   │       └── name: :x
            │   ├── optionals: (length: 0)
            │   ├── posts: (length: 1)
            │   │   └── @ RequiredParameterNode (location: (19...20))
            │   │       └── name: :y
            │   ├── rest:
            │   │   @ RestParameterNode (location: (11...17))
            │   │   ├── name: :splat
            │   │   ├── name_loc: (12...17) = "splat"
            │   │   └── operator_loc: (11...12) = "*"
            │   ├── keywords: (length: 0)
            │   ├── keyword_rest: ∅
            │   └── block: ∅
            ├── body: ∅
            ├── locals: [:x, :splat, :y]
            ├── def_keyword_loc: (0...3) = "def"
            ├── operator_loc: ∅
            ├── lparen_loc: (7...8) = "("
            ├── rparen_loc: (20...21) = ")"
            ├── equal_loc: ∅
            └── end_keyword_loc: (23...26) = "end"

#child_nodes in this case is confusing, because it doesn't come back sorted by location. I think this is definitely something that can be improved (either by sorting when it's called or making it more explicit in the docs).

#posts are meant to be any required positional arguments after optionals or rest positionals. Basically they're the things that get taken from the end of the list as opposed to the start.

@enebo
Copy link
Collaborator

enebo commented Sep 10, 2023

@kddnewton won't this get fixed if we just switch posts and rest in config.yml and regenerate the templates?

@seven1m
Copy link
Contributor Author

seven1m commented Sep 10, 2023

#posts are meant to be any required positional arguments after optionals or rest positionals. Basically they're the things that get taken from the end of the list as opposed to the start.

Very good, that makes sense. This is the first case in YARP where I noticed the child_nodes "out of order," but now I understand there is no guarantee there.

My solution for getting the parameters in order for Natalie is something like:

    in_order = node.requireds +
               [node.rest].compact +
               node.optionals +
               node.posts +
               [node.block].compact +
               node.keywords +
               [node.keyword_rest].compact

I wonder if there is an easier way, or if this is the proper way to get the parameters back out in the right order.

@kddnewton
Copy link
Collaborator

@seven1m child_nodes.sort_by { |node| node.location.start_offset } should do it

@seven1m
Copy link
Contributor Author

seven1m commented Sep 10, 2023

OK this works. Thanks for explaining it to me. I think we can close this issue!

@seven1m seven1m closed this as completed Sep 10, 2023
seven1m added a commit to natalie-lang/natalie that referenced this issue Sep 11, 2023
I was trying to get test/natalie/method_test.rb parsing (and passing),
but I found a few unexpected results from YARP. I think these are bugs,
which I filed here:

ruby/prism#1435
ruby/prism#1436
eregon added a commit to eregon/yarp that referenced this issue Sep 12, 2023
@eregon
Copy link
Member

eregon commented Sep 12, 2023

@kddnewton won't this get fixed if we just switch posts and rest in config.yml and regenerate the templates?

Agreed, that seems a lot more intuitive.
AFAIK post-required paramaters are always in the source after the rest parameter.
I made a PR for it: #1448

@kddnewton
Copy link
Collaborator

I'm fine with merging that for this particular issue, but note that there are some nodes where reordering will not fix the issue. IfNode/UnlessNode/WhileNode/UnlessNode all have modifier forms where the child nodes will be out of order. Also, parameters child nodes will still be out of order in the event of syntax errors. So, I think we should either document that the order is not necessarily correct or template out a sort for these nodes.

@eregon
Copy link
Member

eregon commented Sep 12, 2023

So, I think we should either document that the order is not necessarily correct or template out a sort for these nodes.

Yes, I think that makes sense to document.
I think it's pretty much expected that order is not correct on SyntaxError (it probably can't be).

I think when there is a list of nodes then it's rather confusing if the order is not as close as possible to order in the source, like parameters here.
i.e. it's best-effort and cannot always be source order

@heyvito
Copy link

heyvito commented Oct 8, 2023

@seven1m child_nodes.sort_by { |node| node.location.start_offset } should do it

Just a small comment for anyone else being confused by this behaviour and leveraging @seven1m's excellent suggestion: Prism will include nil items representing rest, keyword_rest, and block attributes on the value returned by child_nodes. The updated snippet below takes that into account:

compact_child_nodes.sort_by { |node| node.location.start_offset }

@kddnewton
Copy link
Collaborator

@heyvito there is also compact_child_nodes that will do that, but more efficiently. That's baked in.

@heyvito
Copy link

heyvito commented Oct 8, 2023

@kddnewton oooh! Didn't notice that method! Thank you so much for pointing that out! 🙌🏻

Just updated my last comment! 💕

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants