Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index{Operator,And,Or}WriteNode #1698

Merged
merged 1 commit into from
Oct 18, 2023
Merged

Index{Operator,And,Or}WriteNode #1698

merged 1 commit into from
Oct 18, 2023

Conversation

kddnewton
Copy link
Collaborator

@kddnewton kddnewton commented Oct 17, 2023

Fixes #1636

Right now, our Call{Operator,And,Or}WriteNode nodes represent two different concepts:

foo.bar += 1
foo[bar] += 1

These two statements are different in what they can support. The former can never have arguments (or an opening_loc or closing_loc). The former can also never have a block. Also, the former is a variable method name.

The latter is always going to be []/[]=, it can have any number of arguments including blocks (foo[&bar] ||= 1), and will always have an opening_loc and closing_loc.

Furthermore, these statements end up having to take different paths through the various compilers because with the latter you have to consider the arguments and the block, whereas the former can perform some additional peephole optimizations since there are fewer values on the stack.

For these reasons, I'm introducing Index{Operator,And,Or}WriteNode. These nodes never have a read_name or write_name on them because they are always []/[]=. They also support blocks, which the previous write nodes didn't. As a benefit of introducing these nodes, I've removed the opening_loc, closing_loc, and arguments from the older write nodes because they will always be null.

For the serialized format, both of these nodes end up being smaller, and for in-memory we're storing fewer things in general, so we have savings all around.

I don't love that we are introducing another node that is a call node since we generally want consumers to only have to handle a single call, but these nodes are so specific that they would have to be handled separately anyway since in fact call 2 methods.

Right now, our Call{Operator,And,Or}WriteNode nodes represent two
different concepts:

```ruby
foo.bar += 1
foo[bar] += 1
```

These two statements are different in what they can support. The
former can never have arguments (or an opening_loc or closing_loc).
The former can also never have a block. Also, the former is a
variable method name.

The latter is always going to be []/[]=, it can have any number of
arguments including blocks (`foo[&bar] ||= 1`), and will always
have an opening_loc and closing_loc.

Furthermore, these statements end of having to take different paths
through the various compilers because with the latter you have to
consider the arguments and the block, whereas the former can
perform some additional peephole optimizations since there are
fewer values on the stack.

For these reasons, I'm introducing Index{Operator,And,Or}WriteNode.
These nodes never have a read_name or write_name on them because
they are always []/[]=. They also support blocks, which the previous
write nodes didn't. As a benefit of introducing these nodes, I've
removed the opening_loc, closing_loc, and arguments from the older
write nodes because they will always be null.

For the serialized format, both of these nodes end up being
smaller, and for in-memory we're storing fewer things in general,
so we have savings all around.

I don't love that we are introducing another node that is a call
node since we generally want consumers to only have to handle a
single call, but these nodes are so specific that they would have
to be handled separately anyway since in fact call 2 methods.
@eregon
Copy link
Member

eregon commented Oct 18, 2023

Furthermore, these statements end up having to take different paths through the various compilers because with the latter you have to consider the arguments and the block, whereas the former can perform some additional peephole optimizations since there are fewer values on the stack.

I'm not sure they do actually, because the only difference is only the value as argument or the ArgumentsNode + the value.
Except that part I think the rest is the same regarding compilation of that.
Of course details differ per implementation like those peephole optimizations you mention.

To be fair we haven't implemented CallOperatorWriteNode in TruffleRuby yet, so the above is based on reasoning rather than code.

I think either way is fine, the logic might be a bit simpler with the new node at a cost of a bit of duplication, which can probably be removed with helper methods.

The gains in footprint and serialization speed are probably worth it.

I quickly checked serialized size, before and after
The difference seems smaller than 0.1% for total serialized, so no big change there but saves a few bytes.

@kddnewton kddnewton merged commit f0aaec6 into main Oct 18, 2023
46 checks passed
@kddnewton kddnewton deleted the index-block branch October 18, 2023 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ruby accepts a[&b]=c but Prism rejects it
3 participants