-
-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Text/replace builtin #1065
Add Text/replace builtin #1065
Conversation
50e1ea6
to
e083884
Compare
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great so far!
I think one thing we need to decide is what to do if the substring to replace is ""
, in order to avoid an infinite loop. Some possibilities I can think of are:
- Silently perform no replacement if the substring to replace is empty
- Return an
Optional
with aNone
result - Require the substring to replace to be statically known at type-checking time and return a type error if empty
As regards "", I'd vote for the first choice of do nothing. The second choice of returning an optional seems unergonomic. The third choice seems like an odd type error, given that "non-empty string" (or non-empty list for that matter) isn't a type that exists anywhere else in the language. Just tried ruby to see what it does in this case and it was not what I was expecting: "foo".gsub "" "bar" == "barfbarobarobar" ... not sure we want to follow that example 😅 Though I can see a case to be made for I don't feel strongly about this though, whatever ye all prefer is cool. |
Co-authored-by: Gabriel Gonzalez <Gabriel439@gmail.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
Co-authored-by: Gabriel Gonzalez <Gabriel439@gmail.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
7ac0709
to
9182692
Compare
@Gabriel439 I've committed your suggested change defining normalisation in terms of recursion. I've also added a note to do nothing in the case of |
@alexhumphreys: I'm taking vacation this week, but I'll review again afterwards |
Ah nice, enjoy the vacation! 🌴 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add a test for replacing an empty string? That way, implementations won't forget to handle that case
Other than that, this looks to me like this proposal is basically ready to move out of Draft status
standard/beta-normalization.md
Outdated
If the substring to replace is empty (`""`), then no replacement is performed: | ||
|
||
|
||
f ⇥ Text/replace "" replacement a ⇥ "foo" | ||
──────────────────────────────────────────── | ||
f a ⇥ "foo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move this judgement to be the first or second one? The standard has an implicit "first judgement wins" convention that we've been using when more than one judgement applies, and we want this judgment to take precedence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Co-authored-by: Gabriel Gonzalez <Gabriel439@gmail.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
@Gabriel439 I've added that empty string test. I still don't have the |
Isn't the normalization still way under-specified? How are composite characters supposed to be handled, for example? (I believe @Profpatsch had some opinions about that in some discussion about |
@sjakobi: I'd propose doing replacement at the grapheme level, as suggested by this test: dhall-lang/Prelude/Text/replace.dhall Line 12 in 595c8ca
… and for correctly handling composite characters I'd suggest performing the replace if the "needle" and the prefix are both canonically equivalent, meaning that they have the same normalization form, specifically Normalization Form C. |
@Gabriel439 Sounds good to me! |
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
@Gabriel439 @sjakobi I updated this with a note on using NFC. Well, I pretty much just took your comment from above. Does this need more detail or would that be clear enough? |
|
||
let example1 = assert : replace "💣" "💥" "💣💣💣" ≡ "💥💥💥" | ||
|
||
let example2 = assert : replace "👨" "👩" "👨👩👧👦" ≡ "👩👩👧👦" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add this as one of the test cases in the standard test suite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, updated. Should I leave this in the prelude? Just wondering how easy this will be for all implementations, since most users importing the prelude won't need this particular replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is also worth keeping in the Prelude since this is useful documentation for the end user. These examples will also show up in the generated docs
@alexhumphreys: I have a PR open with all of my suggested changes in: alexhumphreys#1 I decided to skip handling normalization for now (and adding a test making that explicit). I can standardize support for Unicode normalization in a separate proposal, to keep this pull request simpler. |
… as standardized in dhall-lang/dhall-lang#1065
I also have a matching change to the Haskell implementation up here: dhall-lang/dhall-haskell#2063 |
Suggested changes for dhall-lang#1065
@Gabriel439 thanks for the PR, those beta normalisation rules are a lot clearer. It also thought me a bit about string interpolation: didn't know it would get normalised under a lambda binding before the lambda is applied. Splitting the Unicode handling to a different PR is cool too. Would |
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
Signed-off-by: Alex Humphreys <alex.humphreys@here.com>
@alexhumphreys: I think that example would be the same regardless of how we handle normalization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM apart from one more wibble. Thanks! :)
Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
@alexhumphreys: You should be clear to merge this |
@Gabriel439 sweet! Any 3 day waiting period or can I just hit the button? |
@alexhumphreys: Since you don't have a majority of approvals but you also don't have a majority of disapprovals then the policy is 7 days since the PR was ready (i.e. not a draft PR) and 1 day since last code change. For more details, see: https://github.com/dhall-lang/dhall-lang/blob/master/.github/CONTRIBUTING.md#how-do-changes-get-approved |
Ah cool, so it should be good to be merged. Looks like I don't have write access so maybe one of ye wants to hit the button? Also there's 25 commits in this PR now, do ye have a policy of squashing commits or should I go in and tidy up the history a bit? |
This repo enforces squash-on-merge. We should get you the commit bit so you can merge yourself 👍🏻 |
@alexhumphreys I’ve granted you write access now, check your email |
Sweet! Thanks @philandstuff, and everyone else too for helping on this 🙂 |
@alexhumphreys: Thank you, too! Also, now that this is merged you can submit an invoice here for $200 and one of us will approve it: https://opencollective.com/dhall/expenses/new |
Thanks, but it's ok! I've been meaning to donate to that Dhall fund so consider this a contribution 🙂 |
… as standardized in dhall-lang/dhall-lang#1065
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm a bit late for a review. I just have a question as I am attempting to implement this in dhall-rust
. 😅
{- This test verifies that an implementation correctly permits both the | ||
"replacement" and the "haystack" to be abstract. | ||
-} | ||
λ(x : Text) → λ(y : Text) → Text/replace "a" "-${x}-" "_a_${y}_a_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand how the haystack can be abstract. Don't we want to replace occurrences of the needle in the abstract section too? i.e. What happens if y
is the string "a"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@basile-henry: Oh, I see what you mean. Allowing the haystack to be abstract means that β-reduction is no longer confluent. For right now, just ignore that test and I can put up a change to require the haystack to be non-abstract before further reduction can occur.
This attempts to standardise the
Text/replace
builtin.There's a few parts I wasn't sure on (eg. the
B
case for the testtests/parser/success/builtinsA.dhall
), and there's probably plenty other things I missed so best to look over this carefully.Also, wasn't sure how best to describe this feature in the prose sections (doc strings in prelude, Language-tour.md, etc.). I feel the need to use the phrase "substring" but Dhall has
Text
notString
so maybe that's misleading. So if anyone has suggestions to make those descriptions more clear I'd appreciate it.Fixes #1051.