Announcing Scala.js 1.7.0.

sjrd · sjrd · commit bebc77087941 · 2021-08-03T15:44:23.000+02:00
And update the documentation for regular expressions.
diff --git a/_config.yml b/_config.yml
@@ -64,7 +64,7 @@ colors:  #in hex code if not noted else
 
 ### VERSIONS ###
 versions:
-  scalaJS: 1.6.0
+  scalaJS: 1.7.0
   scalaJSBinary: 1
   scalaJS06x: 0.6.33
   scalaJS06xBinary: 0.6
diff --git a/_data/doc.yml b/_data/doc.yml
@@ -55,6 +55,9 @@
       url: /doc/all-api.html
 - text: Semantics of Scala.js
   url: /doc/semantics.html
+  subitems:
+    - text: Regular expressions
+      url: /doc/regular-expressions.html
 - text: Internals
   url: /doc/internals/
   subitems:
diff --git a/_data/library/versions.yml b/_data/library/versions.yml
@@ -27,3 +27,4 @@
 - 1.4.0
 - 1.5.0
 - 1.6.0
+- 1.7.0
diff --git a/_posts/news/2021-08-04-announcing-scalajs-1.7.0.md b/_posts/news/2021-08-04-announcing-scalajs-1.7.0.md
@@ -0,0 +1,135 @@
+---
+layout: post
+title: Announcing Scala.js 1.7.0
+category: news
+tags: [releases]
+permalink: /news/2021/08/04/announcing-scalajs-1.7.0/
+---
+
+
+We are excited to announce the release of Scala.js 1.7.0!
+
+This release fixes a number of bugs.
+In particular, regular expressions, available through `java.util.regex.Pattern` or Scala's `Regex` and `.r` method, now behave in the same way as on the JVM.
+This change has compatibility implications, which we discuss below.
+
+Moreover, this release fixes *all* the known bugs that were left.
+As of this writing, Scala.js 1.7.0 has zero known bugs!
+
+The Scala standard library was upgraded to versions 2.12.14 and 2.13.6.
+
+Read on for more details.
+
+<!--more-->
+
+## Getting started
+
+If you are new to Scala.js, head over to [the tutorial]({{ BASE_PATH }}/tutorial/).
+
+If you need help with anything related to Scala.js, you may find our community [on Gitter](https://gitter.im/scala-js/scala-js) and [on Stack Overflow](https://stackoverflow.com/questions/tagged/scala.js).
+
+Bug reports can be filed [on GitHub](https://github.com/scala-js/scala-js/issues).
+
+## Release notes
+
+If upgrading from Scala.js 0.6.x, make sure to read [the release notes of Scala.js 1.0.0]({{ BASE_PATH }}/news/2020/02/25/announcing-scalajs-1.0.0/) first, as they contain a host of important information, including breaking changes.
+
+This is a **minor** release:
+
+* It is backward binary compatible with all earlier versions in the 1.x series: libraries compiled with 1.0.x through 1.6.x can be used with 1.7.0 without change.
+* It is *not* forward binary compatible with 1.6.x: libraries compiled with 1.7.0 cannot be used with 1.6.x or earlier.
+* It is *not* entirely backward source compatible: it is not guaranteed that a codebase will compile *as is* when upgrading from 1.6.x (in particular in the presence of `-Xfatal-warnings`).
+
+As a reminder, libraries compiled with 0.6.x cannot be used with Scala.js 1.x; they must be republished with 1.x first.
+
+## Fixes with compatibility concerns
+
+### Regular expressions have been fixed to match the JVM behavior
+
+Until Scala.js 1.6.x, the regular expressions provided by `java.util.regex.Pattern`, and used by `scala.util.matching.Regex` and the `.r` method, were implemented directly in terms of JavaScript's `RegExp`.
+That meant that they used the feature set and the semantics of [JavaScript regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions), which are different from [Java regular expressions](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html).
+Scala.js 1.7.0 finally fixes this issue, and now correctly implements the semantics of Java regular expressions, although some features are not supported.
+
+Since the old implementation has been there for 7 years, and documented as such, it is possible that this fix will actually break some code in the wild.
+During the implementation of this feature, we have analyzed the corpus of all Scala.js-only libraries (without JVM support) and extracted all the regexes that they use.
+We have verified that none of those use cases are impacted by this change.
+It is still possible that applications are impacted.
+
+It is also possible for some cross-platform libraries to face issues, as we have not covered those.
+Unlike Scala.js-only libraries, we consider it unlikely that they will have issues with the *change of semantics* per se, as they already worked on the JVM, with the new semantics.
+
+The biggest danger would be a cross-platform library that uses the `MULTILINE` flag (aka `(?m)`).
+Indeed, that feature kind of worked before, but is now rejected at `Pattern.compile()`-time with a `PatternSyntaxException` by default.
+The reason is that to correctly implement the semantics of that flag, we need support for look-behind assertions (`(?<=𝑋)`) in JavaScript's `RegExp`.
+That support was only added in ECMAScript 2018, whereas Scala.js targets ES 2015 by default.
+
+It is possible to change that target with the following setting:
+
+{% highlight scala %}
+scalaJSLinkerConfig ~= (_.withESFeatures(_.withESVersion(ESVersion.ES2018)))
+{% endhighlight %}
+
+**Attention!** While this enables support for the `MULTILINE` flag (among others), it restricts your application to environments that support recent JavaScript features.
+If you maintain a library, this restriction applies to all downstream libraries and applications.
+
+We therefore recommend trying to *avoid* the need for that flag instead.
+We give several strategies on how to do so [on the Regular expressions documentation page]({{ BASE_PATH }}/doc/regular-expressions.html).
+That page also contains many more details on the new support of regular expressions.
+
+## New features
+
+### Add a configurable header comment in generated .js files
+
+Sometimes, it is desirable to add a header comment in the generated .js files.
+This is typically used for license information or any other metadata.
+While it has always been possible to post-process the generated .js files in the build, doing so came at the cost of destroying the source maps.
+
+Scala.js 1.7.0 introduces a new linker configuration, `jsHeader`, to specify a comment to insert at the top of .js files:
+
+{% highlight scala %}
+scalaJSLinkerConfig ~= {
+  _.withJSHeader(
+    """
+      |/* This is the header, which source maps
+      | * take into account.
+      | */
+    """.stripMargin.trim() + "\n"
+  )
+}
+{% endhighlight %}
+
+The `jsHeader` must be a combination of valid JavaScript whitespace and/or comments, and must not contain any newline character other than `\n` (the UNIX newline).
+If non-empty, it must end with a newline.
+These restrictions ensure that this feature is not abused to inject arbitrary JavaScript code in the .js file generated by the compiler, potentially compromising the compiler abstractions.
+
+## Miscellaneous
+
+### New JDK APIs
+
+The following JDK class has been added:
+
+* `java.util.concurrent.atomic.LongAdder`
+
+### Set up the `versionScheme` of library artifacts
+
+This release configures the sbt `versionScheme` setting for the library artifacts of Scala.js (with `"semver-spec"` for the public ones).
+This will reduce spurious eviction warnings in downstream projects.
+
+### Upgrade to GCC v20210601
+
+We upgraded to the Google Closure Compiler v20210601.
+
+## Bug fixes
+
+Among others, the following bugs have been fixed in 1.7.0:
+
+* [#4507](https://github.com/scala-js/scala-js/issues/4507) 1.6.0 regression: `new mutable.WrappedArrayBuilder(classTag[Unit]).result()` throws a CCE
+* [#3953](https://github.com/scala-js/scala-js/issues/3953) fastOptJS error in scalaz 7.3 with Scala.js 1.0.0
+* [#3918](https://github.com/scala-js/scala-js/issues/3918) Mixed-in field in class inside lazy val rhs is erroneously immutable in the IR -> IR checking error
+* [#4511](https://github.com/scala-js/scala-js/issues/4511) Nested JS Class in JS Class with Scala companion crashes compiler
+* [#4465](https://github.com/scala-js/scala-js/issues/4465) Default parameters in constructors of nested JS classes cause invalid IR
+* [#4526](https://github.com/scala-js/scala-js/issues/4526) Compiler crashes on nested JS class with default constructor params with private companion
+* [#4336](https://github.com/scala-js/scala-js/issues/4336) Failing `<project>/run` can `close()` subsequent runs too early
+* [#105](https://github.com/scala-js/scala-js/issues/105) `String.split(x: Array[Char])` produces bad regexes
+
+You can find the full list [on GitHub](https://github.com/scala-js/scala-js/issues?q=is%3Aissue+milestone%3Av1.7.0+is%3Aclosed).
diff --git a/assets/badges/scalajs-1.7.0.svg b/assets/badges/scalajs-1.7.0.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="98" height="20" role="img" aria-label="scala.js: 1.7.0+"><title>scala.js: 1.7.0+</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="98" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="51" height="20" fill="#555"/><rect x="51" width="47" height="20" fill="#007ec6"/><rect width="98" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="265" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="410">scala.js</text><text x="265" y="140" transform="scale(.1)" fill="#fff" textLength="410">scala.js</text><text aria-hidden="true" x="735" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="370">1.7.0+</text><text x="735" y="140" transform="scale(.1)" fill="#fff" textLength="370">1.7.0+</text></g></svg>
diff --git a/doc/all-api.md b/doc/all-api.md
@@ -5,6 +5,15 @@ title: All previous versions of the Scala.js API
 
 ## All previous versions of the API
 
+### Scala.js 1.7.0
+* [1.7.0 scalajs-library]({{ site.production_url }}/api/scalajs-library/1.7.0/scala/scalajs/js/index.html)
+* [1.7.0 scalajs-test-interface]({{ site.production_url }}/api/scalajs-test-interface/1.7.0/)
+* [1.7.0 scalajs-ir]({{ site.production_url }}/api/scalajs-ir/1.7.0/org/scalajs/ir/index.html)
+* [1.7.0 scalajs-linker-interface]({{ site.production_url }}/api/scalajs-linker-interface/1.7.0/org/scalajs/linker/interface/index.html) ([Scala.js version]({{ site.production_url }}/api/scalajs-linker-interface-js/1.7.0/org/scalajs/linker/interface/index.html))
+* [1.7.0 scalajs-linker]({{ site.production_url }}/api/scalajs-linker/1.7.0/org/scalajs/linker/index.html) ([Scala.js version]({{ site.production_url }}/api/scalajs-linker-js/1.7.0/org/scalajs/linker/index.html))
+* [1.7.0 scalajs-test-adapter]({{ site.production_url }}/api/scalajs-sbt-test-adapter/1.7.0/org/scalajs/testing/adapter/index.html)
+* [1.7.0 sbt-scalajs]({{ site.production_url }}/api/sbt-scalajs/1.7.0/#org.scalajs.sbtplugin.package)
+
 ### Scala.js 1.6.0
 * [1.6.0 scalajs-library]({{ site.production_url }}/api/scalajs-library/1.6.0/scala/scalajs/js/index.html)
 * [1.6.0 scalajs-test-interface]({{ site.production_url }}/api/scalajs-test-interface/1.6.0/)
diff --git a/doc/internals/version-history.md b/doc/internals/version-history.md
@@ -5,6 +5,7 @@ title: Version history
 
 ## Version history of Scala.js
 
+- [1.7.0](/news/2021/08/04/announcing-scalajs-1.7.0/)
 - [1.6.0](/news/2021/06/09/announcing-scalajs-1.6.0/)
 - [1.5.1](/news/2021/04/01/announcing-scalajs-1.5.1/)
 - [1.5.0](/news/2021/02/12/announcing-scalajs-1.5.0/)
diff --git a/doc/regular-expressions.md b/doc/regular-expressions.md
@@ -0,0 +1,156 @@
+---
+layout: doc
+title: Regular expressions
+---
+
+[JavaScript regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions) are different from [Java regular expressions](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html).
+For `java.util.regex.Pattern` (and its derivatives like `scala.util.matching.Regex` and the `.r` method), Scala.js implements the semantics of Java regular expressions, although with some limitations.
+The semantics and feature set of JavaScript regular expressions are available through `js.RegExp`, like any other JavaScript API.
+
+## Support
+
+The set of supported features for `Pattern` depends on the target ECMAScript version, specified in `ESFeatures.esVersion`.
+By default, Scala.js targets ECMAScript 2015.
+It is possible to change that target with the following setting:
+
+{% highlight scala %}
+scalaJSLinkerConfig ~= (_.withESFeatures(_.withESVersion(ESVersion.ES2018)))
+{% endhighlight %}
+
+**Attention!** While this enables more features of regular expressions, it restricts your application to environments that support recent JavaScript features.
+If you maintain a library, this restriction applies to all downstream libraries and applications.
+We therefore recommend avoiding the additional features where possible, and preferring additional logic in code instead.
+
+In particular, we recommend avoiding the `MULTILINE` flag, aka `(?m)`, which requires ES2018.
+We give some hints on how to avoid it below.
+
+### Not supported
+
+The following features are never supported:
+
+* the `CANON_EQ` flag,
+* the `\X`, `\b{g}` and `\N{...}` expressions,
+* `\p{In𝘯𝘢𝘮𝘦}` character classes representing Unicode *blocks*,
+* the `\G` boundary matcher, *except* if it appears at the very beginning of the regex (e.g., `\Gfoo`),
+* embedded flag expressions with inner groups, i.e., constructs of the form `(?idmsuxU-idmsuxU:𝑋)`,
+* embedded flag expressions without inner groups, i.e., constructs of the form `(?idmsuxU-idmsuxU)`, *except* if they appear at the very beginning of the regex (e.g., `(?i)abc` is accepted, but `ab(?i)c` is not), and
+* numeric "back" references to groups that are defined later in the pattern (note that even Java does not support *named* back references like that).
+
+### Conditionally supported
+
+The following features require `esVersion >= ESVersion.ES2015` (which is true by default):
+
+* the `UNICODE_CASE` flag.
+
+The following features require `esVersion >= ESVersion.ES2018` (which is false by default):
+
+* the `MULTILINE` and `UNICODE_CHARACTER_CLASS` flags,
+* look-behind assertions `(?<=𝑋)` and `(?<!𝑋)`,
+* the `\b` and `\B` expressions used together with the `UNICODE_CASE` flag,
+* `\p{𝘯𝘢𝘮𝘦}` expressions where `𝘯𝘢𝘮𝘦` is not one of the [POSIX character classes](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html#posix).
+
+### Always supported
+
+It is worth noting that, among others, the following features *are* supported in all cases, even when no equivalent feature exists in ECMAScript at all, or in the target version of ECMAScript:
+
+* correct handling of surrogate pairs (natively supported in ES 2015+),
+* the `\G` boundary matcher when it is at the beginning of the pattern (corresponding to the 'y' flag, natively supported in ES 2015+),
+* named groups and named back references (natively supported in ES 2018+),
+* the `DOTALL` flag (natively supported in ES 2018+),
+* ASCII case-insensitive matching (`CASE_INSENSITIVE` on but `UNICODE_CASE` off),
+* comments with the `COMMENTS` flag,
+* POSIX character classes in ASCII mode, or their Unicode variant with `UNICODE_CHARACTER_CLASS` (if the latter is itself supported, see above),
+* complex character classes with unions and intersections (e.g., `[a-z&&[^g-p]]`),
+* atomic groups `(?>𝑋)`,
+* possessive quantifiers `𝑋*+`, `𝑋++` and `𝑋?+`,
+* the `\A`, `\Z` and `\z` boundary matchers,
+* the `\R` expression,
+* embedded quotations with `\Q` and `\E`, both outside and inside character classes.
+
+All the supported features have the correct semantics from Java.
+This is even true for features that exist in JavaScript but with different semantics, among which:
+
+* the `^` and `$` boundary matchers with the `MULTILINE` flag (when the latter is supported),
+* the predefined character classes `\h`, `\s`, `\v`, `\w` and their negated variants, respecting the `UNICODE_CHARACTER_CLASS` flag,
+* the `\b` and `\B` boundary matchers, respecting the `UNICODE_CHARACTER_CLASS` flag,
+* the internal format of `\p{𝘯𝘢𝘮𝘦}` character classes, including the `\p{java𝘔𝘦𝘵𝘩𝘰𝘥𝘕𝘢𝘮𝘦}` classes,
+* octal escapes and control escapes.
+
+## Guarantees
+
+If a feature is not supported, a `PatternSyntaxException` is thrown at the time of `Pattern.compile()`.
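+
+As a minimal sketch, code that receives patterns it does not control can turn that exception into an `Option`. The helper name `compileOrNone` is hypothetical and not part of any API:
+
+{% highlight scala %}
+import java.util.regex.{Pattern, PatternSyntaxException}
+
+object SafeCompile {
+  // Hypothetical helper: returns None when the pattern is rejected,
+  // either because it is malformed or because it uses a feature that
+  // is not supported for the current target.
+  def compileOrNone(regex: String): Option[Pattern] =
+    try Some(Pattern.compile(regex))
+    catch { case _: PatternSyntaxException => None }
+}
+{% endhighlight %}
+
+Because rejection happens at `Pattern.compile()`, such a check only needs to run once per pattern, not once per match.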
+
+If `Pattern.compile()` succeeds, the regex is guaranteed to behave exactly like on the JVM, *except* for capturing groups within repeated segments (both for their back references and subsequent calls to `group`, `start` and `end`):
+
+* on the JVM, a capturing group always captures whatever substring was successfully matched last by that group during the processing of the regex:
+  - even if it was in a previous iteration of a repeated segment and the last iteration did not have a match for that group, or
+  - if it was during a later iteration of a repeated segment that was subsequently backtracked;
+* in JS and hence in Scala.js, capturing groups within repeated segments always capture what was matched (or not) during the last iteration that was eventually kept.
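+
+The difference described above can be observed with a small pattern. This is an illustrative sketch; the object and value names are made up for the example:
+
+{% highlight scala %}
+import java.util.regex.Pattern
+
+object RepeatedGroupExample {
+  // The first iteration of the repeated segment matches `a` through
+  // group 1; the second (and last) iteration matches `b` without it.
+  val matcher = Pattern.compile("(?:(a)|b)*").matcher("ab")
+
+  val found: Boolean = matcher.matches() // the whole input matches
+  val whole: String = matcher.group()    // "ab"
+
+  // On the JVM, group 1 still holds "a" from the first iteration.
+  // Following the JavaScript behavior, group 1 is unset (null) because
+  // the last iteration that was kept did not match it.
+  val inner: Option[String] = Option(matcher.group(1))
+}
+{% endhighlight %}
+
+Code that must behave identically on both platforms should avoid reading a group that sits inside a repeated segment when a later iteration may not match it.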
+
+The behavior of JavaScript is more "functional", whereas that of the JVM is more "imperative".
+This imperative nature is also reflected in the `hitEnd()` and `requireEnd()` methods of `Matcher`, which are not supported (they do not link).
+
+The behavior of the JVM does not appear to be specified, and is questionable.
+There are several open issues that argue it is buggy:
+
+* [JDK-8027747](https://bugs.openjdk.java.net/browse/JDK-8027747)
+* [JDK-8187083](https://bugs.openjdk.java.net/browse/JDK-8187083)
+* [JDK-8187080](https://bugs.openjdk.java.net/browse/JDK-8187080)
+* [JDK-8187082](https://bugs.openjdk.java.net/browse/JDK-8187082)
+
+Scala.js keeps the JavaScript behavior, and does not try to replicate the JVM behavior, which could come at great cost.
+
+## Avoiding the `MULTILINE` flag, aka `(?m)`
+
+The 'm' flag of JavaScript's `RegExp` is subtly different from that of Java's `Pattern`.
+It considers that the position in the middle of a `\r\n` sequence is both the beginning and end of a line, whereas `Pattern` considers that neither is true.
+The semantics of `Pattern` correspond to Unicode recommendations.
+
+In general, we cannot implement the `Pattern` behavior without look-behind assertions (`(?<=𝑋)`), which are only available in ECMAScript 2018+.
+However, in most concrete cases, it is possible to replace the usage of the 'm' flag with a combination of a) more complicated patterns and b) some ad hoc logic in the code using the regex.
+
+Consider the following simple example, which matches every `foo` or `bar` or empty string on a line and prints them:
+
+{% highlight scala %}
+val regex = """(?m)^(foo|bar|)$""".r
+for (m <- regex.findAllMatchIn(input))
+  println(m.matched)
+{% endhighlight %}
+
+Assuming that, in the particular use case we are facing, only UNIX newlines can appear in the `input` string, we can rewrite the regex without the `(?m)` flag:
+
+{% highlight scala %}
+val regex2 = """(?:^|\n)(foo|bar|)(?=\n|$)""".r
+{% endhighlight %}
+
+`regex2` has exactly one match for each match of `regex`, and can therefore be used instead.
+However, the specific string being matched changes, since the newline characters are included in the matched substrings.
+The surrounding code can compensate for that discrepancy, using the capturing group in the middle:
+
+{% highlight scala %}
+for (m <- regex2.findAllMatchIn(input))
+  println(m.group(1)) // `group(1)` instead of `matched`
+{% endhighlight %}
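+
+Putting the pieces together, here is a self-contained sketch of the rewritten regex applied to a sample input. The input string is an assumption made for illustration: it contains only UNIX newlines and no trailing newline.
+
+{% highlight scala %}
+object MultilineFreeExample {
+  // Assumed sample input: UNIX newlines only, no trailing newline.
+  val input = "foo\nbar\n\nbaz\nfoo"
+
+  // Same pattern as `regex2` above: no `(?m)` flag required.
+  val regex2 = """(?:^|\n)(foo|bar|)(?=\n|$)""".r
+
+  // Read the capturing group, since the overall match may include
+  // the leading newline.
+  val lines: List[String] =
+    regex2.findAllMatchIn(input).map(_.group(1)).toList
+  // lines == List("foo", "bar", "", "foo")
+}
+{% endhighlight %}
+
+The `baz` line produces no match, and the empty line between `bar` and `baz` produces the empty string.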
+
+If other newline characters must be recognized, a more complicated pattern needs to be used.
+If it is acceptable to consider the position in the middle of `\r\n` as the start and end of a line (like JavaScript's `RegExp` does), the following regex works:
+
+{% highlight scala %}
+val regex3 = """(?:^|[\n\r\u0085\u2028\u2029])(foo|bar|)(?=[\n\r\u0085\u2028\u2029]|$)""".r
+for (m <- regex3.findAllMatchIn(input))
+  println(m.group(1))
+{% endhighlight %}
+
+If not, invalid matches must be rejected a posteriori using ad hoc logic:
+
+{% highlight scala %}
+def isBetweenCRAndNL(i: Int): Boolean =
+  i > 0 && i < input.length() && input.charAt(i - 1) == '\r' && input.charAt(i) == '\n'
+
+for {
+  m <- regex3.findAllMatchIn(input)
+  if !isBetweenCRAndNL(m.start(1)) && !isBetweenCRAndNL(m.end(1))
+} {
+  println(m.group(1))
+}
+{% endhighlight %}
diff --git a/doc/semantics.md b/doc/semantics.md
