Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dafny does not support unicode characters (outside of escape sequences) #818

Closed
davidcok opened this issue Aug 22, 2020 · 3 comments · Fixed by #3016
Closed

Dafny does not support unicode characters (outside of escape sequences) #818

davidcok opened this issue Aug 22, 2020 · 3 comments · Fixed by #3016
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: parser First phase of Dafny's pipeline

Comments

@davidcok
Copy link
Collaborator

Although Coco claims to support UTF-8 encoded input, the scanner it generates for Dafny does not currently do so. Even the most recent executable from the CoCo/R website does not. Perhaps some setting needs to be set or the executable is not consistent with the source code.

The effect is that in characters and in strings, one may only use ASCII characters, and so unicode content must be constructed as escape sequences. Dafny's verbatim strings can contain unicode characters currently, but they will be misunderstood.

@davidcok
Copy link
Collaborator Author

I was able to obtain and compile current Coco source, and the source code does mention UTF-8, so presumably does support UTF-8. However the generated Dafny parser when compiled with Dafny source produces a ream of errors. (branch cok-coco)

When this issue is fixed, the RM will need updating.

@robin-aws
Copy link
Member

In case it's relevant, I strongly believe the last comment here describes what should be the correct semantics for Dafny strings: #413

@davidcok davidcok added this to the Post 3.0 milestone Sep 10, 2020
@keyboardDrummer keyboardDrummer added kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: parser First phase of Dafny's pipeline labels Jul 19, 2021
@robin-aws robin-aws removed this from the Post 3.1 milestone Sep 22, 2021
@robin-aws
Copy link
Member

#2717 partially fixed this, but there is still Coco-generated code that casts from int to a char, and hence truncates any code point outside of the BMP.

robin-aws added a commit that referenced this issue Nov 24, 2022
Implementation of the design from
dafny-lang/rfcs#13.

Resolves #413. Fixes #2928. Fixes #818. Fixes #3058. Fixes #1293. Fixes
#3001.

Depends on #2976 to be fixed for the tests to pass consistently across
platforms.

I've documented some of the less obvious compilation strategy decisions
in `docs/Compilation/StringsAndChars.md`.

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: parser First phase of Dafny's pipeline
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants