Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: 32-bit Unicode name mangling #539

Merged
merged 2 commits into from
Jun 23, 2021
Merged

Conversation

Vtec234
Copy link
Member

@Vtec234 Vtec234 commented Jun 21, 2021

Reported on Zulip.

| 0, it, r => r
| i+1, it, r =>
let c := it.curr
if c.isAlpha || c.isDigit then
mangleAux i it.next (r.push c)
else if c = '_' then
mangleAux i it.next (r ++ "__")
else if c.toNat < 255 then
else if c.toNat < 0x100 then
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB the bound increase, now _xff can be encoded as well.

else
let n := c.toNat
let r := r ++ "_u32"
let r := Nat.toDigits 16 n |>.foldl (fun r c => r.push c) r
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it important in the 16-bit case that the encoding be constant-width? If yes, I can add justification to 8 characters here as well. Or alternatively, this last case could cover all Unicode with _u used always instead of _u/_u32.

Copy link
Collaborator

@digama0 digama0 Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has to be fixed width because otherwise _uabcd is confusable with _uabc + d. You can also solve this by using different prefixes for different lengths, though, as with _x vs _u.

Copy link
Member Author

@Vtec234 Vtec234 Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'm not sure if we ever unmangle atm, but still it's better to keep the mangling injective.
(We could also use UTF-8 instead of a UTF-16/UTF-32 switch. Not sure which is preferrable.)

let r := r ++ "_U"
let ds := Nat.toDigits 16 n
let r := Nat.repeat (·.push '0') (8 - ds.length) r
let r := ds.foldl (fun r c => r.push c) r
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not actually clear to me why Nat.toDigits builds a char list instead of a string immediately, but that's tangential to this PR

@Kha Kha requested a review from leodemoura June 23, 2021 07:02
@leodemoura leodemoura merged commit 40f07ef into leanprover:master Jun 23, 2021
ChrisHughes24 pushed a commit to ChrisHughes24/lean4 that referenced this pull request Dec 2, 2022
The command `#conv $tac => $e` will run a conv tactic `tac` on `e` and display the resulting simplified version of `e'`. This is used to implement several other user commands:

* `#whnf e` displays the weak head normal form of `e`
* `#simp e` displays the simp normal form of `e`
* `#norm_num e` evaluates `e` using `norm_num`
* `#push_neg e` shows the result of `push_neg` on `e`

The `conv` versions of `norm_num` and `push_neg` are also added here, so that the commands can be implemented as macros.
ChrisHughes24 pushed a commit to ChrisHughes24/lean4 that referenced this pull request Dec 2, 2022
* [x] depends on leanprover#539
* [x] depends on leanprover#552

This implements the `ring` conv tactic, which rewrites a target that is an equality in rings to `True`. I'm not sure if anyone uses this, but it is what it is and it brings down our metrics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants