Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala 2.10 String Interpolation inside a multiline String with backslash #6631

Closed
scabug opened this issue Nov 8, 2012 · 14 comments
Closed

Scala 2.10 String Interpolation inside a multiline String with backslash #6631

scabug opened this issue Nov 8, 2012 · 14 comments
Assignees
Milestone

Comments

@scabug
Copy link

@scabug scabug commented Nov 8, 2012

picked up from http://stackoverflow.com/questions/13296411/is-this-a-bug-in-scala-2-10-string-interpolation-inside-a-multiline-string-with.

Here is the output from scala repl.
Welcome to Scala version 2.10.0-RC1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6
.0_33).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val fileName = "test"
fileName: String = test

scala> val path = s"""c:\foo\bar$fileName.csv"""
java.lang.StringIndexOutOfBoundsException: String index out of range: 11
at java.lang.String.charAt(String.java:686)
at scala.collection.immutable.StringOps$.apply$extension(StringOps.scala
:39)
at scala.StringContext$.treatEscapes(StringContext.scala:202)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext.standardInterpolator(StringContext.scala:120)
at scala.StringContext.s(StringContext.scala:90)
at .(:8)
at .()
at .(:7)
at .()
...tuncated

scala> val path = s"""c:\foo\bar$fileName.csv"""
path: String = c:oo?artest.csv

scala>

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

Imported From: https://issues.scala-lang.org/browse/SI-6631?orig=1
Reporter: Santosh Gokak (santoshgokak)
Affected Versions: 2.10.0-RC1

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

@paulp said:
Much reduced:

scala> s"""\"""
java.lang.StringIndexOutOfBoundsException: String index out of range: 1
	at java.lang.String.charAt(String.java:658)
	at scala.collection.immutable.StringOps$.apply$extension(StringOps.scala:39)
	at scala.StringContext$.treatEscapes(StringContext.scala:202)
	at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

Santosh Gokak (santoshgokak) said:
What about the second case? That just returns complete gibberish.

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

@paulp said:
It's not gibberish; it's what you get when you interpret \f as a form feed, like it is in a single quoted string.

Try "c:\foo\bar\fileName.csv" in the repl.

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

@Ichoran said:
Even more reduced:

s"\"

String interpolation treats everything as a raw string. (But the escape parser is buggy.)

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

@paulp said:
Whoa. You're telling me it's intentional that

"""\t"""

and

s"""\t"""

Get me one string with a tab and one with a \t ? I can't believe anyone thought that was a good idea.

@scabug
Copy link
Author

@scabug scabug commented Nov 8, 2012

@Ichoran said:
I'm not sure what s"""\t""" ought to do. The format specifier says it's simply formatted as opposed to raw, i.e. raw"""\t""" or raw"\t". But the triple quotes say it's raw.

In case of ambiguity, you will find a life jacket under the seat in front of you. The person to your right, however, should use their seat cushion as a floatation device.

It's simple enough to fix the bug (note: octal parsing is buggy also), but the correct behavior is nonobvious without a nontrivial change to something.

@scabug
Copy link
Author

@scabug scabug commented Nov 10, 2012

@dcsobral said:
It is pretty much intentional indeed. Strings in Scala offer either multiline capability or escape characters, but not both at the same time, nor neither at the same time.

String interpolators make it possible to have any particular combination, by letting multiline be decided by the number of quotes, and escaping be decided by the interpolator.

Though I do see the potential for confusion, and I'm certain plenty will arise from it, I think it was the right choice. I wouldn't be bold enough to make it, but I find it easy to support Odersky on this. :-)

This particular ticket is not a bug, because there's nothing for the \ before $ to escape. If you use the raw interpolator, it will work. If you double the backslashes, it will work. Everything is according to spec, and, at most, a better error message is required. I'd call it enhancement.

@scabug
Copy link
Author

@scabug scabug commented Nov 10, 2012

@Ichoran said:
It is a bug because it throws an index out of bounds error instead of something related to escape processing. That " and """ are indistinguishable is highly questionable, but that's an improvement request, not a bug. Simply falling off the end of the string with a generic exception is bad manners. (Thank goodness nobody uses octal escapes in strings.)

@scabug
Copy link
Author

@scabug scabug commented Nov 11, 2012

@Ichoran said:
I know I should really have a pull request or at least a diff. But anyway, here's fixed code that

(1) Throws the correct exception on an error

(2) Does not fail on a short octal at the end of a string s"\4"

(3) Uses the append(str, a, b) string builder method, which should be faster than substring (no object creation required)

  def treatEscapes(str: String): String = {
    lazy val bldr = new java.lang.StringBuilder
    val len = str.length
    var start = 0
    var cur = 0
    var idx = 0
    def output(ch: Char) = {
      bldr.append(str, start, cur)
      bldr append ch
      start = idx
    }
    while (idx < len) {
      cur = idx
      if (str(idx) == '\\') {
        idx += 1
        if (idx >= len) throw new InvalidEscapeException(str, cur)
        if ('0' <= str(idx) && str(idx) <= '7') {
          val leadch = str(idx)
          var oct = leadch - '0'
          idx += 1
          if (idx < len && '0' <= str(idx) && str(idx) <= '7') {
            oct = oct * 8 + str(idx) - '0'
            idx += 1
            if (idx < len && leadch <= '3' && '0' <= str(idx) && str(idx) <= '7') {
              oct = oct * 8 + str(idx) - '0'
              idx += 1
            }
          }
          output(oct.toChar)
        } else {
          val ch = str(idx)
          idx += 1
          output {
            ch match {
              case 'b' => '\b'
              case 't' => '\t'
              case 'n' => '\n'
              case 'f' => '\f'
              case 'r' => '\r'
              case '\"' => '\"'
              case '\'' => '\''
              case '\\' => '\\'
              case _ => throw new InvalidEscapeException(str, cur)
            }
          }
        }
      } else {
        idx += 1
      }
    }
    if (start == 0) str
    else bldr.append(str, start, idx).toString
  }

It does not address the issue of it being weird that s"""\n""" gives you a carriage return. It also does not address the issue that s"\"" doesn't do what you think it will.

@scabug
Copy link
Author

@scabug scabug commented Nov 14, 2012

Eran Medan (eranation) said (edited on Nov 26, 2012 4:12:23 AM UTC):
Confirmed seeing this also in RC2

@scabug
Copy link
Author

@scabug scabug commented Dec 2, 2012

@retronym said:
Rex's suggestions:

scala/scala#1690

To the original reporter: you should use raw"c:\foo\bar" or raw"""c:\foo\bar""" to avoid string escape processing. But beware, unicode escapes, e.g \u1234 are still processed as they happen much much earlier. So raw"c:\users" doesn't work. You can actually turn off unicode escape processing with the compiler option -Xno-uescape. Or make your life easier, and use '/' in strings as path separators.

@scabug
Copy link
Author

@scabug scabug commented Dec 2, 2012

@retronym said:
I've submitted Rex's suggestions as scala/scala#1690.

Santosh: you can use raw"c:\foo\$filename" to avoid interpreting '' as an escape character.

But beware, \uNNNN is always treated as a unicode escape, unless you use the non-standard compiler option -Xno-uescape. I always use '/' as a path separator in literal strings to avoid these issues; even on Windows this is allowed.

@scabug
Copy link
Author

@scabug scabug commented Dec 6, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants