Commit bebc770

Announcing Scala.js 1.7.0.
And update the documentation for regular expressions.
1 parent ebd097b commit bebc770

9 files changed

+309
-12
lines changed

_config.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,7 @@ colors: #in hex code if not noted else
6464

6565
### VERSIONS ###
6666
versions:
67-
scalaJS: 1.6.0
67+
scalaJS: 1.7.0
6868
scalaJSBinary: 1
6969
scalaJS06x: 0.6.33
7070
scalaJS06xBinary: 0.6

_data/doc.yml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,9 @@
5555
url: /doc/all-api.html
5656
- text: Semantics of Scala.js
5757
url: /doc/semantics.html
58+
subitems:
59+
- text: Regular expressions
60+
url: /doc/regular-expressions.html
5861
- text: Internals
5962
url: /doc/internals/
6063
subitems:

_data/library/versions.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -27,3 +27,4 @@
2727
- 1.4.0
2828
- 1.5.0
2929
- 1.6.0
30+
- 1.7.0
Lines changed: 135 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,135 @@
1+
---
2+
layout: post
3+
title: Announcing Scala.js 1.7.0
4+
category: news
5+
tags: [releases]
6+
permalink: /news/2021/08/04/announcing-scalajs-1.7.0/
7+
---
8+
9+
10+
We are excited to announce the release of Scala.js 1.7.0!
11+
12+
This release fixes a number of bugs.
13+
In particular, regular expressions, available through `java.util.regex.Pattern` or Scala's `Regex` and `.r` method, now behave in the same way as on the JVM.
14+
This change has compatibility implications, which we discuss below.
15+
16+
Moreover, this release fixes *all* the remaining known bugs.
17+
As of this writing, Scala.js 1.7.0 has zero known bugs!
18+
19+
The Scala standard library was upgraded to versions 2.12.14 and 2.13.6.
20+
21+
Read on for more details.
22+
23+
<!--more-->
24+
25+
## Getting started
26+
27+
If you are new to Scala.js, head over to [the tutorial]({{ BASE_PATH }}/tutorial/).
28+
29+
If you need help with anything related to Scala.js, you may find our community [on Gitter](https://gitter.im/scala-js/scala-js) and [on Stack Overflow](https://stackoverflow.com/questions/tagged/scala.js).
30+
31+
Bug reports can be filed [on GitHub](https://github.com/scala-js/scala-js/issues).
32+
33+
## Release notes
34+
35+
If upgrading from Scala.js 0.6.x, make sure to read [the release notes of Scala.js 1.0.0]({{ BASE_PATH }}/news/2020/02/25/announcing-scalajs-1.0.0/) first, as they contain a host of important information, including breaking changes.
36+
37+
This is a **minor** release:
38+
39+
* It is backward binary compatible with all earlier versions in the 1.x series: libraries compiled with 1.0.x through 1.6.x can be used with 1.7.0 without change.
40+
* It is *not* forward binary compatible with 1.6.x: libraries compiled with 1.7.0 cannot be used with 1.6.x or earlier.
41+
* It is *not* entirely backward source compatible: it is not guaranteed that a codebase will compile *as is* when upgrading from 1.6.x (in particular in the presence of `-Xfatal-warnings`).
42+
43+
As a reminder, libraries compiled with 0.6.x cannot be used with Scala.js 1.x; they must be republished with 1.x first.
44+
45+
## Fixes with compatibility concerns
46+
47+
### Regular expressions have been fixed to match the JVM behavior
48+
49+
Until Scala.js 1.6.x, the regular expressions provided by `java.util.regex.Pattern`, and used by `scala.util.matching.Regex` and the `.r` method, were implemented directly in terms of JavaScript's `RegExp`.
50+
That meant that they used the feature set and the semantics of [JavaScript regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions), which are different from [Java regular expressions](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html).
51+
Scala.js 1.7.0 finally fixes this issue, and now correctly implements the semantics of Java regular expressions, although some features are not supported.
52+
53+
Since the old implementation has been there for 7 years, and documented as such, it is possible that this fix will actually break some code in the wild.
54+
During the implementation of this feature, we have analyzed the corpus of all Scala.js-only libraries (without JVM support) and extracted all the regexes that they use.
55+
We have verified that none of those use cases are impacted by this change.
56+
It is still possible that applications are impacted.
57+
58+
It is also possible for some cross-platform libraries to face issues, as we have not covered those.
59+
Unlike with Scala.js-only libraries, we consider it unlikely that they will have issues with the *change of semantics* per se, as they already worked on the JVM with the new semantics.
60+
61+
The biggest danger would be a cross-platform library that uses the `MULTILINE` flag (aka `(?m)`).
62+
Indeed, that feature worked approximately before, but by default it is now rejected at `Pattern.compile()` time with a `PatternSyntaxException`.
63+
The reason is that to correctly implement the semantics of that flag, we need support for look-behind assertions (`(?<=𝑋)`) in JavaScript's `RegExp`.
64+
That support was only added in ECMAScript 2018, whereas Scala.js targets ES 2015 by default.
65+
66+
It is possible to change that target with the following setting:
67+
68+
{% highlight scala %}
69+
scalaJSLinkerConfig ~= (_.withESFeatures(_.withESVersion(ESVersion.ES2018)))
70+
{% endhighlight %}
71+
72+
**Attention!** While this enables support for the `MULTILINE` flag (among others), it restricts your application to environments that support recent JavaScript features.
73+
If you maintain a library, this restriction applies to all downstream libraries and applications.
74+
75+
We therefore recommend trying to *avoid* the need for that flag instead.
76+
We give several strategies on how to do so [on the Regular expressions documentation page]({{ BASE_PATH }}/doc/regular-expressions.html).
77+
That page also contains many more details on the new support of regular expressions.
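
As a rough illustration of the `PatternSyntaxException` mentioned above, the following sketch compiles a pattern defensively, so that an unsupported feature can be reported rather than failing at first use. The `RegexSupport` object and its `compileChecked` helper are hypothetical names introduced for this example, not part of Scala.js. On the JVM the `(?m)` pattern compiles fine, so a `Left` for it would only be expected under Scala.js with the default ES 2015 target.

```scala
import java.util.regex.{Pattern, PatternSyntaxException}

// Hypothetical helper object, not part of Scala.js or the JDK.
object RegexSupport {
  /** Returns the compiled pattern, or the error message if compilation is rejected. */
  def compileChecked(regex: String): Either[String, Pattern] =
    try Right(Pattern.compile(regex))
    catch { case e: PatternSyntaxException => Left(e.getMessage) }

  // Right(...) on the JVM; under Scala.js the outcome depends on the configured esVersion.
  val multiline: Either[String, Pattern] = compileChecked("(?m)^foo$")
}
```

Running such a check in the Scala.js test suite of a cross-platform library is one possible way to notice affected patterns early.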
78+
79+
## New features
80+
81+
### Add a configurable header comment in generated .js files
82+
83+
Sometimes, it is desirable to add a header comment in the generated .js files.
84+
This is typically used for license information or any other metadata.
85+
While it has always been possible to post-process the generated .js files in the build, doing so came at the cost of destroying the source maps.
86+
87+
Scala.js 1.7.0 introduces a new linker configuration, `jsHeader`, to specify a comment to insert at the top of .js files:
88+
89+
{% highlight scala %}
90+
scalaJSLinkerConfig ~= {
91+
_.withJSHeader(
92+
"""
93+
|/* This is the header, which source maps
94+
| * take into account.
95+
| */
96+
""".stripMargin.trim() + "\n"
97+
)
98+
},
99+
{% endhighlight %}
100+
101+
The `jsHeader` must be a combination of valid JavaScript whitespace and/or comments, and must not contain any newline character other than `\n` (the UNIX newline).
102+
If non-empty, it must end with a new line.
103+
These restrictions ensure that this feature is not abused to inject arbitrary JavaScript code in the .js file generated by the compiler, potentially compromising the compiler abstractions.
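
The newline-related restrictions can be checked before handing a string to `withJSHeader`. The sketch below is only a partial, illustrative check: `JSHeaderCheck` and `hasValidNewlines` are hypothetical names, the check does not verify that the header consists solely of whitespace and comments, and treating U+2028 and U+2029 as forbidden newline characters is our assumption based on JavaScript's line terminators.

```scala
// Hypothetical helper, not part of the Scala.js linker API.
object JSHeaderCheck {
  // Newline characters other than '\n' (assumed set: CR, LS, PS).
  private val forbiddenNewlines: Set[Char] = Set('\r', '\u2028', '\u2029')

  /** True if the header uses only '\n' as newline and, when non-empty, ends with one. */
  def hasValidNewlines(header: String): Boolean =
    !header.exists(forbiddenNewlines) && (header.isEmpty || header.endsWith("\n"))
}
```

The linker performs its own validation; a check like this would merely surface a malformed header earlier, for instance in a build-level test.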
104+
105+
## Miscellaneous
106+
107+
### New JDK APIs
108+
109+
The following JDK classes have been added:
110+
111+
* `java.util.concurrent.atomic.LongAdder`
112+
113+
### Set up the `versionScheme` of library artifacts
114+
115+
This release configures the sbt `versionScheme` setting for the library artifacts of Scala.js (with `"semver-spec"` for the public ones).
116+
This will reduce spurious eviction warnings in downstream projects.
117+
118+
### Upgrade to GCC v20210601
119+
120+
We upgraded to the Google Closure Compiler v20210601.
121+
122+
## Bug fixes
123+
124+
Among others, the following bugs have been fixed in 1.7.0:
125+
126+
* [#4507](https://github.com/scala-js/scala-js/issues/4507) 1.6.0 regression: `new mutable.WrappedArrayBuilder(classTag[Unit]).result()` throws a CCE
127+
* [#3953](https://github.com/scala-js/scala-js/issues/3953) fastOptJS error in scalaz 7.3 with Scala.js 1.0.0
128+
* [#3918](https://github.com/scala-js/scala-js/issues/3918) Mixed-in field in class inside lazy val rhs is erroneously immutable in the IR -> IR checking error
129+
* [#4511](https://github.com/scala-js/scala-js/issues/4511) Nested JS Class in JS Class with Scala companion crashes compiler
130+
* [#4465](https://github.com/scala-js/scala-js/issues/4465) Default parameters in constructors of nested JS classes cause invalid IR
131+
* [#4526](https://github.com/scala-js/scala-js/issues/4526) Compiler crashes on nested JS class with default constructor params with private companion
132+
* [#4336](https://github.com/scala-js/scala-js/issues/4336) Failing `<project>/run` can `close()` subsequent runs too early
133+
* [#105](https://github.com/scala-js/scala-js/issues/105) `String.split(x: Array[Char])` produces bad regexes
134+
135+
You can find the full list [on GitHub](https://github.com/scala-js/scala-js/issues?q=is%3Aissue+milestone%3Av1.7.0+is%3Aclosed).

assets/badges/scalajs-1.7.0.svg

Lines changed: 1 addition & 0 deletions

doc/all-api.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,15 @@ title: All previous versions of the Scala.js API
55

66
## All previous versions of the API
77

8+
### Scala.js 1.7.0
9+
* [1.7.0 scalajs-library]({{ site.production_url }}/api/scalajs-library/1.7.0/scala/scalajs/js/index.html)
10+
* [1.7.0 scalajs-test-interface]({{ site.production_url }}/api/scalajs-test-interface/1.7.0/)
11+
* [1.7.0 scalajs-ir]({{ site.production_url }}/api/scalajs-ir/1.7.0/org/scalajs/ir/index.html)
12+
* [1.7.0 scalajs-linker-interface]({{ site.production_url }}/api/scalajs-linker-interface/1.7.0/org/scalajs/linker/interface/index.html) ([Scala.js version]({{ site.production_url }}/api/scalajs-linker-interface-js/1.7.0/org/scalajs/linker/interface/index.html))
13+
* [1.7.0 scalajs-linker]({{ site.production_url }}/api/scalajs-linker/1.7.0/org/scalajs/linker/index.html) ([Scala.js version]({{ site.production_url }}/api/scalajs-linker-js/1.7.0/org/scalajs/linker/index.html))
14+
* [1.7.0 scalajs-test-adapter]({{ site.production_url }}/api/scalajs-sbt-test-adapter/1.7.0/org/scalajs/testing/adapter/index.html)
15+
* [1.7.0 sbt-scalajs]({{ site.production_url }}/api/sbt-scalajs/1.7.0/#org.scalajs.sbtplugin.package)
16+
817
### Scala.js 1.6.0
918
* [1.6.0 scalajs-library]({{ site.production_url }}/api/scalajs-library/1.6.0/scala/scalajs/js/index.html)
1019
* [1.6.0 scalajs-test-interface]({{ site.production_url }}/api/scalajs-test-interface/1.6.0/)

doc/internals/version-history.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@ title: Version history
55

66
## Version history of Scala.js
77

8+
- [1.7.0](/news/2021/08/04/announcing-scalajs-1.7.0/)
89
- [1.6.0](/news/2021/06/09/announcing-scalajs-1.6.0/)
910
- [1.5.1](/news/2021/04/01/announcing-scalajs-1.5.1/)
1011
- [1.5.0](/news/2021/02/12/announcing-scalajs-1.5.0/)

doc/regular-expressions.md

Lines changed: 156 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,156 @@
1+
---
2+
layout: doc
3+
title: Regular expressions
4+
---
5+
6+
[JavaScript regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions) are different from [Java regular expressions](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html).
7+
For `java.util.regex.Pattern` (and its derivatives like `scala.util.matching.Regex` and the `.r` method), Scala.js implements the semantics of Java regular expressions, although with some limitations.
8+
The semantics and feature set of JavaScript regular expressions are available through `js.RegExp`, like any other JavaScript API.
9+
10+
## Support
11+
12+
The set of supported features for `Pattern` depends on the target ECMAScript version, specified in `ESFeatures.esVersion`.
13+
By default, Scala.js targets ECMAScript 2015.
14+
It is possible to change that target with the following setting:
15+
16+
{% highlight scala %}
17+
scalaJSLinkerConfig ~= (_.withESFeatures(_.withESVersion(ESVersion.ES2018)))
18+
{% endhighlight %}
19+
20+
**Attention!** While this enables more features of regular expressions, it restricts your application to environments that support recent JavaScript features.
21+
If you maintain a library, this restriction applies to all downstream libraries and applications.
22+
We therefore recommend avoiding the additional features where possible, and using additional logic in code instead.
23+
24+
In particular, we recommend avoiding the `MULTILINE` flag, aka `(?m)`, which requires ES2018.
25+
We give some hints on how to avoid it below.
26+
27+
### Not supported
28+
29+
The following features are never supported:
30+
31+
* the `CANON_EQ` flag,
32+
* the `\X`, `\b{g}` and `\N{...}` expressions,
33+
* `\p{In𝘯𝘢𝘮𝘦}` character classes representing Unicode *blocks*,
34+
* the `\G` boundary matcher, *except* if it appears at the very beginning of the regex (e.g., `\Gfoo`),
35+
* embedded flag expressions with inner groups, i.e., constructs of the form `(?idmsuxU-idmsuxU:𝑋)`,
36+
* embedded flag expressions without inner groups, i.e., constructs of the form `(?idmsuxU-idmsuxU)`, *except* if they appear at the very beginning of the regex (e.g., `(?i)abc` is accepted, but `ab(?i)c` is not), and
37+
* numeric "back" references to groups that are defined later in the pattern (note that even Java does not support *named* back references like that).
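
For an embedded flag in the middle of a pattern, one possible workaround is to expand the affected part by hand. The sketch below assumes the ASCII-only meaning of `(?i)` (that is, `UNICODE_CASE` is off), and rewrites `ab(?i)c` as `ab[cC]`. `FlagRewrite` and its members are hypothetical names used only for illustration.

```scala
// Hypothetical example object, not part of any library.
object FlagRewrite {
  // Not accepted by Scala.js according to the list above; kept as a plain string only.
  val unsupportedSource: String = "ab(?i)c"

  // Hand-expanded equivalent: only the last character is case-insensitive.
  val portable = "ab[cC]".r

  def matchesPortable(s: String): Boolean =
    portable.pattern.matcher(s).matches()
}
```

This kind of expansion stays readable only for short case-insensitive fragments; for longer ones, splitting the match into two regexes may be preferable.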
38+
39+
### Conditionally supported
40+
41+
The following features require `esVersion >= ESVersion.ES2015` (which is true by default):
42+
43+
* the `UNICODE_CASE` flag.
44+
45+
The following features require `esVersion >= ESVersion.ES2018` (which is false by default):
46+
47+
* the `MULTILINE` and `UNICODE_CHARACTER_CLASS` flags,
48+
* look-behind assertions `(?<=𝑋)` and `(?<!𝑋)`,
49+
* the `\b` and `\B` expressions used together with the `UNICODE_CASE` flag,
50+
* `\p{𝘯𝘢𝘮𝘦}` expressions where `𝘯𝘢𝘮𝘦` is not one of the [POSIX character classes](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html#posix).
51+
52+
### Always supported
53+
54+
It is worth noting that, among others, the following features *are* supported in all cases, even when no equivalent feature exists in ECMAScript at all, or in the target version of ECMAScript:
55+
56+
* correct handling of surrogate pairs (natively supported in ES 2015+),
57+
* the `\G` boundary matcher when it is at the beginning of the pattern (corresponding to the 'y' flag, natively supported in ES 2015+),
58+
* named groups and named back references (natively supported in ES 2018+),
59+
* the `DOTALL` flag (natively supported in ES 2018+),
60+
* ASCII case-insensitive matching (`CASE_INSENSITIVE` on but `UNICODE_CASE` off),
61+
* comments with the `COMMENTS` flag,
62+
* POSIX character classes in ASCII mode, or their Unicode variant with `UNICODE_CHARACTER_CLASS` (if the latter is itself supported, see above),
63+
* complex character classes with unions and intersections (e.g., `[a-z&&[^g-p]]`),
64+
* atomic groups `(?>𝑋)`,
65+
* possessive quantifiers `𝑋*+`, `𝑋++` and `𝑋?+`,
66+
* the `\A`, `\Z` and `\z` boundary matchers,
67+
* the `\R` expression,
68+
* embedded quotations with `\Q` and `\E`, both outside and inside character classes.
69+
70+
All the supported features have the correct semantics from Java.
71+
This is even true for features that exist in JavaScript but with different semantics, among which:
72+
73+
* the `^` and `$` boundary matchers with the `MULTILINE` flag (when the latter is supported),
74+
* the predefined character classes `\h`, `\s`, `\v`, `\w` and their negated variants, respecting the `UNICODE_CHARACTER_CLASS` flag,
75+
* the `\b` and `\B` boundary matchers, respecting the `UNICODE_CHARACTER_CLASS` flag,
76+
* the internal format of `\p{𝘯𝘢𝘮𝘦}` character classes, including the `\p{java𝘔𝘦𝘵𝘩𝘰𝘥𝘕𝘢𝘮𝘦}` classes,
77+
* octal escapes and control escapes.
78+
79+
## Guarantees
80+
81+
If a feature is not supported, a `PatternSyntaxException` is thrown at the time of `Pattern.compile()`.
82+
83+
If `Pattern.compile()` succeeds, the regex is guaranteed to behave exactly like on the JVM, *except* for capturing groups within repeated segments (both for their back references and subsequent calls to `group`, `start` and `end`):
84+
85+
* on the JVM, a capturing group always captures whatever substring was successfully matched last by that group during the processing of the regex:
86+
- even if it was in a previous iteration of a repeated segment and the last iteration did not have a match for that group, or
87+
- if it was during a later iteration of a repeated segment that was subsequently backtracked;
88+
* in JS and hence in Scala.js, capturing groups within repeated segments always capture what was matched (or not) during the last iteration that was eventually kept.
89+
90+
The behavior of JavaScript is more "functional", whereas that of the JVM is more "imperative".
91+
This imperative nature is also reflected in the `hitEnd()` and `requireEnd()` methods of `Matcher`, which are not supported (they do not link).
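
A small pattern can make the difference concrete. In the sketch below (`RepeatedGroupDemo` is a hypothetical name), group 1 matches during the first iteration of the repeated segment but not during the last one. On the JVM, `group(1)` still returns `"a"`; under the JavaScript semantics described above, we would expect the group to be unset, so that the helper returns `None` in Scala.js.

```scala
import java.util.regex.Pattern

// Hypothetical example object, for illustration only.
object RepeatedGroupDemo {
  /** The capture of group 1 after matching "ab", wrapped so that null becomes None. */
  def lastCapture(): Option[String] = {
    val m = Pattern.compile("(?:(a)|b)*").matcher("ab")
    // First iteration: group 1 captures "a". Second iteration: "b" matches without group 1.
    if (m.matches()) Option(m.group(1)) else None
  }
}
```

Code that must behave identically on both platforms can avoid reading capturing groups that sit inside a repeated segment.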
92+
93+
The behavior of the JVM does not appear to be specified, and is questionable.
94+
There are several open issues that argue it is buggy:
95+
96+
* [JDK-8027747](https://bugs.openjdk.java.net/browse/JDK-8027747)
97+
* [JDK-8187083](https://bugs.openjdk.java.net/browse/JDK-8187083)
98+
* [JDK-8187080](https://bugs.openjdk.java.net/browse/JDK-8187080)
99+
* [JDK-8187082](https://bugs.openjdk.java.net/browse/JDK-8187082)
100+
101+
Scala.js keeps the JavaScript behavior, and does not try to replicate the JVM behavior (potentially at great cost).
102+
103+
## Avoiding the `MULTILINE` flag, aka `(?m)`
104+
105+
The 'm' flag of JavaScript's `RegExp` is subtly different from that of Java's `Pattern`.
106+
It considers that the position in the middle of a `\r\n` sequence is both the beginning and end of a line, whereas `Pattern` considers that neither is true.
107+
The semantics of `Pattern` correspond to Unicode recommendations.
108+
109+
In general, we cannot implement the `Pattern` behavior without look-behind assertions (`(?<=𝑋)`), which are only available in ECMAScript 2018+.
110+
However, in most concrete cases, it is possible to replace the usage of the 'm' flag with a combination of a) more complicated patterns and b) some ad hoc logic in the code using the regex.
111+
112+
Consider the following simple example, which matches every `foo` or `bar` or empty string on a line and prints them:
113+
114+
{% highlight scala %}
115+
val regex = """(?m)^(foo|bar|)$""".r
116+
for (m <- regex.findAllMatchIn(input))
117+
println(m.matched)
118+
{% endhighlight %}
119+
120+
Assuming that, in the particular use case we are facing, only UNIX newlines can appear in the `input` string, we can rewrite the regex without the `(?m)` flag:
121+
122+
{% highlight scala %}
123+
val regex2 = """(?:^|\n)(foo|bar|)(?=\n|$)""".r
124+
{% endhighlight %}
125+
126+
`regex2` has exactly one match for each match of `regex`, and can therefore be used instead.
127+
However, the specific string being matched changes, since the newline characters are included in the matched substrings.
128+
The surrounding code can compensate for that discrepancy, using the capturing group in the middle:
129+
130+
{% highlight scala %}
131+
for (m <- regex2.findAllMatchIn(input))
132+
println(m.group(1)) // `group(1)` instead of `matched`
133+
{% endhighlight %}
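
To check the rewrite, the two variants can be compared on a sample input. The following sketch is self-contained; the object name and the sample `input` are made up for the example. It is meant to run on the JVM (for instance in a shared test), because the `(?m)` variant requires ES 2018 under Scala.js.

```scala
// Hypothetical example object; the input string is made up and uses UNIX newlines only.
object MultilineRewriteDemo {
  val input = "foo\nbar\n\nbaz\nfoo"

  val regex = """(?m)^(foo|bar|)$""".r
  val regex2 = """(?:^|\n)(foo|bar|)(?=\n|$)""".r

  // With the flag: the whole match is the line content.
  val viaFlag: List[String] =
    regex.findAllMatchIn(input).map(_.matched).toList

  // Without the flag: the line content is in group 1.
  val viaRewrite: List[String] =
    regex2.findAllMatchIn(input).map(_.group(1)).toList
}
```

Both lists contain `foo`, `bar`, the empty string and `foo`, in that order; the `baz` line is skipped by both variants.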
134+
135+
If other newline characters must be recognized, a more complicated pattern needs to be used.
136+
If it is acceptable to consider the position in the middle of `\r\n` as the start and end of a line (like JavaScript's `RegExp` does), the following regex works:
137+
138+
{% highlight scala %}
139+
val regex3 = """(?:^|[\n\r\u0085\u2028\u2029])(foo|bar|)(?=[\n\r\u0085\u2028\u2029]|$)""".r
140+
for (m <- regex3.findAllMatchIn(input))
141+
println(m.group(1))
142+
{% endhighlight %}
143+
144+
If not, invalid matches must be rejected a posteriori using ad hoc logic:
145+
146+
{% highlight scala %}
147+
def isBetweenCRAndNL(i: Int): Boolean =
148+
i > 0 && i < input.length() && input.charAt(i - 1) == '\r' && input.charAt(i) == '\n'
149+
150+
for {
151+
m <- regex3.findAllMatchIn(input)
152+
if !isBetweenCRAndNL(m.start(1)) && !isBetweenCRAndNL(m.end(1))
153+
} {
154+
println(m.group(1))
155+
}
156+
{% endhighlight %}
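
The a posteriori filtering above can also be packaged as a function that takes the input as a parameter. This is a minimal sketch under the same assumptions as the snippets above; `CrLfAwareLines` and `matchLines` are hypothetical names.

```scala
// Hypothetical example object wrapping regex3 and the filtering logic shown above.
object CrLfAwareLines {
  val regex3 =
    """(?:^|[\n\r\u0085\u2028\u2029])(foo|bar|)(?=[\n\r\u0085\u2028\u2029]|$)""".r

  /** All lines of `input` that are `foo`, `bar` or empty, treating `\r\n` as one newline. */
  def matchLines(input: String): List[String] = {
    // True if position i lies between the '\r' and the '\n' of a "\r\n" sequence.
    def isBetweenCRAndNL(i: Int): Boolean =
      i > 0 && i < input.length && input.charAt(i - 1) == '\r' && input.charAt(i) == '\n'

    regex3.findAllMatchIn(input)
      .filter(m => !isBetweenCRAndNL(m.start(1)) && !isBetweenCRAndNL(m.end(1)))
      .map(_.group(1))
      .toList
  }
}
```

For example, `CrLfAwareLines.matchLines("foo\r\nbar")` yields `List("foo", "bar")`: the empty match between `\r` and `\n` is filtered out.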
