Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression on regex when upgrading from 0.6.28 to 0.6.31 #3901

Closed
dportabella opened this issue Dec 14, 2019 · 5 comments
Closed

Regression on regex when upgrading from 0.6.28 to 0.6.31 #3901

dportabella opened this issue Dec 14, 2019 · 5 comments
Assignees
Labels
bug
Milestone

Comments

@dportabella
Copy link

@dportabella dportabella commented Dec 14, 2019

// project/plugins.sbt: addSbtPlugin("org.scala-js" % "sbt-scalajs" % "0.6.31")
import scala.util.matching.Regex
val text = ???
val regex: String = ???
val it: Iterator[Regex.Match] = regex.r.findAllMatchIn(text)
println(it.hasNext)  // returns true
it.next // it fails with:

TypeError: Cannot read property 'length' of null
    at $c_ju_regex_Matcher.end__I__I (RuntimeString.scala:252)
    at $c_s_util_matching_Regex$Match.ends$lzycompute__p1__AI (Regex.scala:732)
    at $c_s_util_matching_Regex$Match.ends__p1__AI (Regex.scala:731)
    at $c_s_util_matching_Regex$Match.force__s_util_matching_Regex$Match (Regex.scala:743)
    at $c_s_util_matching_Regex$$anon$1.next__s_util_matching_Regex$Match (Regex.scala:410)
    at $c_s_util_matching_Regex$$anon$1.next__O (Regex.scala:406)

it works by changing sbt-scalajs version from 0.6.31 to 0.6.28.

Unfortunately I cannot give the text and regex as it is proprietary and very long.
I'll try to minimize the text and regex and share it here when I have some time.

@sjrd sjrd added the bug label Dec 14, 2019
@sjrd sjrd added this to the v0.6.32 milestone Dec 14, 2019
@sjrd sjrd changed the title regression from sbt-scalajs 0.6.28 to 0.6.31 on regex Regression on regex when upgrading from 0.6.28 to 0.6.31 Dec 16, 2019
@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 16, 2019

Thank you for the report.

What version of Scala are you using?

@dportabella

This comment has been minimized.

Copy link
Author

@dportabella dportabella commented Dec 16, 2019

scala 2.12.10

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 17, 2019

This was most certainly caused by 56dd7ab.

The error is basically a NullPointerException that is a consequence of start(group) not returning -1 where before it correctly returned -1 at


Computing start(group) is the whole point of GroupStartMapper, so something went wrong there.

@dportabella A reproduction would be really welcome. One hint I can give is that, if my above diagnosis is correct, this has to do with a match that does match something as a whole, but one group inside does not match anything (it is in a non-chosen alternative with |, or in a non-present optional thing with ?).

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 18, 2019

Here is a minimized reproduction:

    val text = "b"
    val regex = "(?!(a))b"
    val pattern = java.util.regex.Pattern.compile(regex)
    val matcher = pattern.matcher(text)
    println(matcher.find())
    println(matcher.start(1)) // returns 0 instead of -1
    println(matcher.end(1)) // throws
    val it = regex.r.findAllMatchIn(text)
    while (it.hasNext)
      println(it.next()) // throws

It happens with a capturing group within a negative look-ahead.

Fix incoming.

sjrd added a commit to sjrd/scala-js that referenced this issue Dec 18, 2019
`groupStartMap` is initially filled with `-1` everywhere, which is
the correct value for non-matching groups. When propagating the
`start` and `end` of a `GroupNode`, we previously always overrode
that value with `start`, which can `>= 0` even for non-matching
groups if they are within a negative look-ahead. This is incorrect,
as non-matching groups must always report `-1` as their `start`.

We now explicitly avoid overriding the initial `-1` for
non-matching groups. This more closely corresponds to what was done
in `GroupStartMap.setEndReturnStart` and `setStartReturnEnd` before
the big rewrite of 56dd7ab.
@sjrd sjrd self-assigned this Dec 18, 2019
sjrd added a commit to sjrd/scala-js that referenced this issue Dec 19, 2019
`groupStartMap` is initially filled with `-1` everywhere, which is
the correct value for non-matching groups. When propagating the
`start` and `end` of a `GroupNode`, we previously always overrode
that value with `start`, which can `>= 0` even for non-matching
groups if they are within a negative look-ahead. This is incorrect,
as non-matching groups must always report `-1` as their `start`.

We now explicitly avoid overriding the initial `-1` for
non-matching groups. This more closely corresponds to what was done
in `GroupStartMap.setEndReturnStart` and `setStartReturnEnd` before
the big rewrite of 56dd7ab.
sjrd added a commit that referenced this issue Dec 20, 2019
Fix #3901: Do not fill `groupStartMap` for non-matching groups.
@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 20, 2019

Fixed in 602005d.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.