-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regular expression replacement with a function #6197
Conversation
CT Test Results 2 files 86 suites 33m 53s ⏱️ Results for commit 94191a8. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I think that we should handle optional capturing patterns as you suggested, namely documenting the behavior and leave it up to the user's fun to handle it. In practice, I wouldn't expect such patterns to be very common, so most replacement funs wouldn't have to handle it.
Please add more details to the commit message, so that if anyone wonders what happened 10 years from now don't have to locate the pull request, but can find the details in the commit message.
Ok, great :) I applied your suggestions and will finish up the docs, squash the commits and write a more elaborate commit message tomorrow. |
@bjorng I documented the short subexpression list behavior as best as I could, what do you think? I also augmented the commit message as requested. |
With this change, `re:replace/3,4` also accepts a function for the Replacement argument, for cases when more complex processing is required to generate a replacement. The given function will be called with the complete match and a list of subexpression matches as arguments, and the returned value will be inserted into the result. Co-authored-by: Maria Scott <maria-12648430@hnc-agency.org> Co-authored-by: Björn Gustavsson <bgustavsson@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Added to our daily builds.
There is some strange failure in the doc build, that seems to be unrelated to my changes 🤷♂️ |
Yes, it is unrelated to your changes. |
Thanks for your pull request. |
I rarely use regular expressions with Erlang. But when I do, I often would need a function to process matches and create replacements.
This PR extends
re:replace/3,4
to accept a 2-ary function for the replacement. This function will be called with the entire match as the first and a list of subexpression matches as second argument. The value returned from the function will be inserted into the result.A problem exists in case of regular expressions like this:
When applied to a string like
"axy"
, the function will be called with arguments<<"axy">>
and[<<"x">>, <<"y">>]
.When applied to
"ax"
, the arguments will be<<"ax">>
and[<<"x">>]
, ie the subexpression match list does not contain the optional-ish match related to(y)
.When applied to
"ay"
, the arguments will be<<"ay">>
and[<<>>, <<"y">>]
, ie the subexpression match list does contain the optional-ish match related to(x)
as an empty binary.I'm not sure how to handle this (and that is why it is not documented yet). The implementation for replacements using strings with back references inserts empty binaries if the match index does not exist in the actual match. For creating the subexpression match list for replacement via a function however, something like this is not possible. The easiest way out may be to recommend a catch-all clause that will return an empty binary, in the documentation.