Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regular expression replacement with a function #6197

Merged
merged 1 commit into from
Aug 26, 2022

Conversation

juhlig
Copy link
Contributor

@juhlig juhlig commented Aug 4, 2022

I rarely use regular expressions with Erlang. But when I do, I often would need a function to process matches and create replacements.

This PR extends re:replace/3,4 to accept a 2-ary function for the replacement. This function will be called with the entire match as the first and a list of subexpression matches as second argument. The value returned from the function will be inserted into the result.

A problem exists in case of regular expressions like this:

".(x)?(y)?"

When applied to a string like "axy", the function will be called with arguments <<"axy">> and [<<"x">>, <<"y">>].
When applied to "ax", the arguments will be <<"ax">> and [<<"x">>], ie the subexpression match list does not contain the optional-ish match related to (y).
When applied to "ay", the arguments will be <<"ay">> and [<<>>, <<"y">>], ie the subexpression match list does contain the optional-ish match related to (x) as an empty binary.

I'm not sure how to handle this (and that is why it is not documented yet). The implementation for replacements using strings with back references inserts empty binaries if the match index does not exist in the actual match. For creating the subexpression match list for replacement via a function however, something like this is not possible. The easiest way out may be to recommend a catch-all clause that will return an empty binary, in the documentation.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 4, 2022

CT Test Results

       2 files       86 suites   33m 53s ⏱️
1 784 tests 1 736 ✔️ 48 💤 0
2 066 runs  2 016 ✔️ 50 💤 0

Results for commit 94191a8.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Aug 8, 2022
Copy link
Contributor

@bjorng bjorng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I think that we should handle optional capturing patterns as you suggested, namely documenting the behavior and leave it up to the user's fun to handle it. In practice, I wouldn't expect such patterns to be very common, so most replacement funs wouldn't have to handle it.

Please add more details to the commit message, so that if anyone wonders what happened 10 years from now don't have to locate the pull request, but can find the details in the commit message.

lib/stdlib/doc/src/re.xml Outdated Show resolved Hide resolved
lib/stdlib/src/re.erl Outdated Show resolved Hide resolved
@juhlig
Copy link
Contributor Author

juhlig commented Aug 23, 2022

Ok, great :) I applied your suggestions and will finish up the docs, squash the commits and write a more elaborate commit message tomorrow.

@juhlig
Copy link
Contributor Author

juhlig commented Aug 24, 2022

@bjorng I documented the short subexpression list behavior as best as I could, what do you think? I also augmented the commit message as requested.

With this change, `re:replace/3,4` also accepts a function for the
Replacement argument, for cases when more complex processing is
required to generate a replacement. The given function will be
called with the complete match and a list of subexpression matches
as arguments, and the returned value will be inserted into the result.

Co-authored-by: Maria Scott <maria-12648430@hnc-agency.org>
Co-authored-by: Björn Gustavsson <bgustavsson@gmail.com>
Copy link
Contributor

@bjorng bjorng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Added to our daily builds.

@bjorng bjorng added the testing currently being tested, tag is used by OTP internal CI label Aug 24, 2022
@juhlig
Copy link
Contributor Author

juhlig commented Aug 24, 2022

There is some strange failure in the doc build, that seems to be unrelated to my changes 🤷‍♂️

@bjorng
Copy link
Contributor

bjorng commented Aug 24, 2022

There is some strange failure in the doc build, that seems to be unrelated to my changes.

Yes, it is unrelated to your changes.

@bjorng bjorng merged commit fc09037 into erlang:master Aug 26, 2022
@bjorng
Copy link
Contributor

bjorng commented Aug 26, 2022

Thanks for your pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants