Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Buffer.add_substitute #12423

Merged
merged 1 commit into from
Nov 29, 2023
Merged

Conversation

damiendoligez
Copy link
Member

@damiendoligez damiendoligez commented Jul 24, 2023

  • clarify documentation
  • remove Not_found exception
  • fix implementation

Fixes #6732

Copy link
Contributor

@MisterDA MisterDA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do we expect breakage with this change?
The commit message should be amended to keep a blank line between the first line and the body.

testsuite/tests/lib-buffer/test.ml Outdated Show resolved Hide resolved
@damiendoligez damiendoligez force-pushed the fix-add-substitute branch 2 times, most recently from fa17021 to 94bb6ec Compare October 12, 2023 13:31
- clarify documentation
- remove Not_found exception
- fix implementation
string pattern, a variable name immediately follows a non-escaped
[$] character and is one of the following:
string pattern, a variable reference is a non-escaped [$] immediately
followed by a variable name, which is one of the following:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the $ character is escaped is inconsistent with escaping in other places (eg. OCaml strings, format strings) as it's impossible to escape the \ character.
It's impossible to have this function write \$ without using a trick but it's possible to write "\\\"" and printf "%%s".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the $ character is escaped is inconsistent with escaping in other places (eg. OCaml strings, format strings) as it's impossible to escape the \ character.

That's true, it's this way because I wanted to make minimal changes to the behavior of the function.

It's impossible to have this function write \$ without using a trick

That's in fact fixed by this PR. Look at the test file, the function maps:
\$ to $
\\$ to \$
\\\$ to \\$
(and so on). It's a different quoting convention, but it is surjective, unlike the current version of add_substitute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, what is impossible without tricks is a backslash followed by a variable reference. We could fix this by switching to the usual quoting convention, but I'm afraid this might entail a lot of breakage.

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This is a mostly-backwards compatible that makes two changes to the function:

  • a backslash now always stands for itself, except when it precedes a $ when the pair stands for a lone $. Previously, \\$foo would be mapped to \\<value of foo>, now it maps to \$foo.

  • no longer raises Not_found when an unterminated variable reference is found (eg ${foo), instead it simply echoes the characters back. A quick OPAM grep (https://sherlocode.com/?q=Buffer.add_substitute) shows that most (all?) users of this function do not handle Not_found anyway. One could argue that, if anything, Buffer.add_substitute should have raised invalid_arg instead of Not_found in this case, but simply returning the untouched characters seems even better.

A second approval by a core dev is needed, as this touches the standard library (note that @MisterDA has approved the PR as well, so a "on-behalf-of" approval could be enough).

@MisterDA
Copy link
Contributor

I'm satisfied about the new behavior of the function, and the test cases cover all the patterns that were important to me when we filed the original bug.

@nojb
Copy link
Contributor

nojb commented Nov 28, 2023

Friendly ping; we still need a second review of this PR... volunteers appreciated! Thanks.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression:

  • the spec of this function is somewhat broken, it should not be in Buffer in the first place ( I think everyone agrees, starting with Alain 9 years ago: Buffer.add_substitute is not surjective and its documentation is incomplete. #6732 (comment) )
  • the fix makes some cases that looked very broken behave nicer -- but the result remains unsatisfying in some respects
  • in jest I would say, semi-seriously, that the main reason to merge this is the sunk cost fallacy: we have put enough effort in the change to want to see it merged, despite the fact that in practice no one cares and the cost of changing the behavior of the function under the user's feet is probably going to outweigh the benefits of the improvements

I deduce from this glowing review that we should go ahead and merge it, sooner rather than later.

@nojb
Copy link
Contributor

nojb commented Nov 29, 2023

  • the main reason to merge this is the sunk cost fallacy:

Thanks for the review @gasche!

Just for future reference, my thought process regarding this PR is as follows: I tend to agree that the API is not ideal, but as pointed out in #6732 (comment), we are stuck with it for the time being, so in general we should try to fix its shortcomings if it can be done without too much trouble. Whether to go ahead with with backwards-incompatible fixes as in this PR is a more delicate question, but my (necessarily subjective) feeling is that it is worth it.

@nojb nojb merged commit e0fdea0 into ocaml:trunk Nov 29, 2023
9 checks passed
@damiendoligez damiendoligez deleted the fix-add-substitute branch November 29, 2023 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Buffer.add_substitute is not surjective and its documentation is incomplete.
5 participants