Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding mismatches result in runtime crashes in customer code compiled to Go #2934

Closed
cpitclaudel opened this issue Oct 27, 2022 · 3 comments
Assignees
Labels
kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label lang: golang Dafny's transpiler to Go and its runtime

Comments

@cpitclaudel
Copy link
Member

Input:

method Main() {
  var escaped := "\u00E9"; // UTF-16 version of é
  var unescaped := "é";

  print |escaped|, "\n";
  print |unescaped|, "\n";
  print "'", escaped, "'", "\n";
  print "'", unescaped, "'", "\n";

  assert |escaped| == 1;
  assert |unescaped| == 1;

  if |escaped| == 2 {
    print 1 / 0;
  }

  if |unescaped| == 2 {
    print 1 / 0;
  }
}

Output, with go:

2
2
'%!v(PANIC=String method: interface conversion: interface {} is nil, not dafny.Char)'
'%!v(PANIC=String method: interface conversion: interface {} is nil, not dafny.Char)'
[Program halted] division by zero

What leads to the issue:

  1. The Dafny file is read as UTF-8. In memory, escaped is represented as 6 C# "chars", and unescaped as one C# char (UCS2)
  2. At verification time, both of these are treated as one-character strings
  3. When compiled to go, both strings are encoded as UTF-8 and printed to a go file, so the go file contains the same string literals as the Dafny file. This is doubly wrong: the escaped string has a different meaning in Go, and the unescaped string ends up written as two bytes in the target file (utf-8)
  4. The StringToSeq function in the Go runtime does not encode to UTF-16 before converting, so for unescaped the final byte sequence is the UTF-8 one (length = 2)

#2926 currently fixes the issue with unescaped, but not with escaped.

@cpitclaudel cpitclaudel added kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label lang: golang Dafny's transpiler to Go and its runtime severity: soundness (crash) labels Oct 27, 2022
@robin-aws
Copy link
Member

Thanks for writing this up. As discussed I've updated #2926 to account for this case too - have a look and make sure my test cases are thorough enough!

@robin-aws robin-aws self-assigned this Nov 1, 2022
@robin-aws
Copy link
Member

#2926 currently fixes the issue with unescaped, but not with escaped.

Other way around as it turns out: it should fix both, but I'm not claiming to fix unescaped characters because I had to remove the corresponding flaky tests until we address #2976

robin-aws added a commit that referenced this issue Nov 3, 2022
…and Go (#2926)

Fixes #1980. Fixes #2925.

Should fix the root cause of #2934, but I'm not claiming the complete
fix, as attempting to include unescaped, non-ASCII characters in a Dafny
source file is revealing platform-specific headaches in the integration
test runner I'd like to address in a separate PR.

Printing non-ASCII characters, especially invalid UTF-16 sequences, is
still inconsistent across backends, but at least should not crash after
this change. Again this is difficult to test effectively across
platforms in our testing architecture.

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>

Co-authored-by: Clément Pit-Claudel <cpitclaudel@users.noreply.github.com>
@robin-aws
Copy link
Member

#3016 finished addressing this by adding the missing tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: bug Crashes, unsoundness, incorrect output, etc. If possible, add a `part:` label lang: golang Dafny's transpiler to Go and its runtime
Projects
None yet
Development

No branches or pull requests

2 participants