Named captures syntax in NodePattern #6724
Comments
Hi, @baweaver! Firstly, thank you for taking the time to think this through. The suggestion is clear and well worded. 🙇
The benefit would be limited to the expression itself (since the result will either commonly be destructured or yielded to named variables), but there it could add a great deal of clarity.
Yes. I agree. Either all unnamed, or all named. 🙂
These are good considerations to make. Another one is that node matchers can yield to a block, e.g.:
my_matcher(node) do |capture|
  ...
end
but I think in the case of the extended syntax, the matcher could yield keyword arguments. WDYT?
I think it is a great suggestion! Would you be interested in working on this? 🙂 |
I've done similar things when destructuring hashes, and it makes for some nice syntax, especially since you know exactly which keys are going to be present, which prevents the usual key-presence mismatches you get with hashes:
node = parse '1 + 2'
NodePattern
  .new('(send $<a>(...) :+ $<b>(...))')
  .match(node) do |a:, b:|
    # a is 1, b is 2
  end
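To make the keyword-argument idea concrete, here is a minimal sketch of how a matcher might hand its captures to a block. `yield_captures` is a hypothetical stand-in written for this thread, not RuboCop's implementation; it only assumes that named captures arrive as a Hash with Symbol keys and unnamed captures as an Array.

```ruby
# Hypothetical stand-in for the yielding half of a matcher (not RuboCop code).
# Named captures (Hash) are yielded as keyword arguments,
# unnamed captures (Array) are yielded positionally, as they are today.
def yield_captures(captures)
  return captures unless block_given?

  case captures
  when Hash  then yield(**captures)
  when Array then yield(*captures)
  end
end

# Named: the block declares exactly the keys it expects.
yield_captures({ a: 1, b: 2 }) { |a:, b:| a + b } # => 3

# Unnamed: positional block parameters, unchanged from the current style.
yield_captures([1, 2]) { |left, right| left + right } # => 3
```

A block that asks for a keyword the pattern never captured raises an ArgumentError, which is the key-presence guarantee mentioned above.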
One thing that I'd noticed for autocorrect is that if you want to extract the values you have to run the matcher twice. I'm not sure whether that's correct, but if it can be circumvented this may be really useful. I'd been using autocorrect with extracted s-expression trees to change code around, and named captures would make it clearer in those specific contexts:
module RuboCop
  module Cop
    module Tests
      class RailsActionCableWebsocket < RuboCop::Cop::Cop
        MSG = 'Deprecated!'
        # Original
        def_node_search :websocket_set, <<~AST
          (send
            (const nil? :ActionCable) :WebSocket=
            $(...))
        AST
        # Proposed
        def_node_search :websocket_set_alt, <<~AST
          (send
            (const nil? :ActionCable) :WebSocket=
            $<websocket_handler>(...))
        AST
        # Unchanged
        def on_send(node)
          matching_nodes = websocket_set(node)
          add_offense(node, location: :expression) if matching_nodes.any?
        end
        def autocorrect(node)
          lambda do |corrector|
            # Original
            matching_nodes = websocket_set(node)
            content = matching_nodes.first.source
            corrector.replace(
              node.loc.expression,
              "ActionCable.adapters.WebSocket = #{content}"
            )
            # Proposed
            matching_nodes = websocket_set_alt(node)
            content = matching_nodes[:websocket_handler].source
            corrector.replace(
              node.loc.expression,
              "ActionCable.adapters.WebSocket = #{content}"
            )
          end
        end
      end
    end
  end
end
For this specific code it's not a massive change, but it does make it clear exactly what content you're substituting in, instead of relying on the index and giving it positional [connascence](http://connascence.io/pages/about.html) (I still love Jim's talk on that concept). I'd just be very careful not to let it go into method-based captures (e.g.
I'd certainly consider it, though I think my first priority in contribution may be more towards documenting the existing content and exposing how it works in more of a guide-based format. It took me a fair amount of time to grok what was what and how exactly to do some of this, so I think there'd be substantial gain in focusing on that first. Noted that's a separate issue though, and I still have quite a bit to learn about how this all works. :) Definitely up for chatting more, are you all active on the Gitter channel? |
Somewhat related to this. I have had some issues with the current implementation of named captures when using the same name multiple times across optional sections. For example (done from memory, so this may not show off the exact issue):
(send
  {
    (send _name :== _)
    (send _name :!= _)
  }
  :== _name)
may not wind up matching. My theory is that during a first pass, I bring this up because I assume the same area of the code will be touched when looking into this feature. |
This is intentional behavior. It is mentioned in the code file as “unification”. 🙂 |
I don't think I did the best job explaining the issue that I have run into. The functionality of matching named captures via Given the code However, given the code
(and {
    (send $_name :!= _)
    $_name
  }
  (send $_name :== _)
)
This will fail to match the code even though we have a direct match in the second option of the left hand side of the For clarity, the following pattern will match the example code. This forces you to check that the variables match within the code rather than being able to handle it directly in the pattern.
(and {
    (send $_ :!= _)
    $_
  }
  (send $_ :== _)
)
It seems like when there are multiple parts to a pattern, the named captures are unable to hang onto multiple potential matches. My assumption is that when the first part of the group I hope this clarifies the issue that I was trying to convey. I realize that this is tangentially related to the issue being reported, and I will gladly open a new issue to move this conversation to. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding! |
I'll be looking more into this and a few ideas, though I'd noted the any order groupings taking |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding! |
This issue has been automatically closed due to lack of activity. Feel free to re-open it if you ever come back to it. |
+1 for this feature |
Re-opening this, as I think it's still worth tracking. |
Describe the solution you'd like
Currently captures are returned in an Array in the order that they're processed walking the tree.
As this syntax feels loosely based on Regex, a named capture would potentially be within the realm of features such a language could express:
I believe this would add to the expressiveness of the language, and make code-rewriting based on the results of a captured match easier to work with.
Pseudo-Implementation
I'd made a quick pass at implementation of this, not working with the actual source:
I'd looked at the source defining the capture, but I'd have to read over it a few more times to really get a grip on how it'd be implemented.
Additional context
This idea is heavily inspired by named captures in Regex (/(?<name>.+)/), and that idea has helped make more complicated Regex queries more expressive by giving names to various sections of captured content.
Potential Issues
Granted changing the syntax comes with a few potential issues.
Forked Return Types
By introducing this it would effectively fork the return type from being solely {nil, Array[Node]} to {nil, Array[Node], Hash[Symbol, Node]}, depending on the pattern string given.
As it would be an additive type, it should not affect current matches, mitigating some of the risk.
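As a rough illustration of that fork, the sketch below shapes a list of captured values into the three possible returns. `shape_captures` and its `names` argument are hypothetical and exist only for this example. It assumes the pattern is either entirely unnamed (`names` empty) or entirely named (one name per captured value).

```ruby
# Hypothetical helper (not part of NodePattern) showing the forked return type.
#   values - Array of captured nodes, or nil when the pattern did not match
#   names  - Array of Symbols, one per capture; empty for unnamed patterns
def shape_captures(values, names = [])
  return nil if values.nil?         # no match:        nil
  return values if names.empty?     # unnamed pattern: Array, as today

  names.zip(values).to_h            # named pattern:   Hash[Symbol, Node]
end

shape_captures(nil)                    # no match
shape_captures(%w[lhs rhs])            # existing, positional behaviour
shape_captures(%w[lhs rhs], %i[a b])   # proposed, keyed by capture name
```

Because unnamed patterns take the `names.empty?` branch, existing callers keep receiving an Array, which is the additive property described above.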
Mixed Captures
The first potential issue is dealing with a mix of named and unnamed captures:
In this case I'd consider raising an exception, but I don't have a good idea of how best to deal with it at the moment.
Other Issues
There are some minor other issues, but those are mostly due to the nature of hashes involving duplicate keys and validating the syntax, which is done for other constructs as is.
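The mixed-capture and duplicate-key concerns could both be rejected when the pattern is compiled. The sketch below is a deliberately naive, string-level check, not how NodePattern's compiler works. `capture_names` is a hypothetical name, the `$<name>` syntax is the one proposed in this issue, and the check assumes every `$` in the pattern string starts a capture.

```ruby
# Hypothetical up-front validation of the proposed `$<name>` syntax.
# Returns the capture names as Symbols ([] for a fully unnamed pattern)
# and raises ArgumentError for mixed or duplicated names.
NAMED_CAPTURE = /\$<(\w+)>/.freeze

def capture_names(pattern)
  named = pattern.scan(NAMED_CAPTURE).flatten.map(&:to_sym)
  return [] if named.empty?

  # Assumption: every `$` begins a capture, so any surplus `$` is unnamed.
  if named.size != pattern.count('$')
    raise ArgumentError, "cannot mix named and unnamed captures: #{pattern.inspect}"
  end

  duplicate = named.find { |name| named.count(name) > 1 }
  raise ArgumentError, "duplicate capture name: #{duplicate}" if duplicate

  named
end

capture_names('(send $<a>(...) :+ $<b>(...))') # => [:a, :b]
capture_names('(send $(...) :+ $(...))')       # => []
```

Raising at definition time would surface the mistake when the cop is loaded rather than when a node happens to match.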
Thoughts?
I'd be curious to get people's thoughts on this. I believe that named captures would make the NodePattern language more expressive, and be a substantial win when dealing with cop clarity, especially around autocorrect code.
Thanks for reading!