regexp hangs forever #1189

Open
dvega opened this issue Feb 23, 2022 · 3 comments

Comments

@dvega

dvega commented Feb 23, 2022

This code hangs forever (or is incredibly slow). Tested in Rhino 1.7.14:

var aregex = /^[A-Z]+\.([A-Z]+)*$/;
aregex.test('ABCDEFV.HSAKSHKASHSKAHSKAHSKJAHSQWWQIUEIUWEYWIEI@ABCDEFV.COM')

Also, the regexp code is non-interruptible. Please add if (Thread.interrupted()) ... to the regexp evaluation loop.

@p-bakker
Collaborator

p-bakker commented Mar 1, 2022

Hi,

It's unfortunate that you ran into this. I tested this in Chrome and it is also quite slow there. I think this is just a limitation of many regex implementations out there; see https://v8.dev/blog/non-backtracking-regexp for an explanation of the problem.

As for your request to add the if (Thread.interrupted()) call: if you can provide a PR for it, it can be looked at.
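
On the slowness itself: in the pattern from the report, the nested quantifier `([A-Z]+)*` is what lets a backtracking engine try a huge number of ways to split the trailing letters once the match fails at `@`. If the capture group is not needed, the flat form `^[A-Z]+\.[A-Z]*$` accepts the same strings. A minimal sketch, assuming `java.util.regex` rather than Rhino's own engine; the class and method names are made up for illustration:

```java
import java.util.regex.Pattern;

public class FlatPattern {
    // Accepts the same strings as ^[A-Z]+\.([A-Z]+)*$ but has no nested quantifier,
    // so a failed match cannot explore exponentially many splits of the letters.
    // Note: the capture group is gone, so this only suits test()-style checks.
    static final Pattern FLAT = Pattern.compile("^[A-Z]+\\.[A-Z]*$");

    static boolean matches(String input) {
        return FLAT.matcher(input).find();
    }

    public static void main(String[] args) {
        // The input from the report: rejected quickly because of the '@'.
        System.out.println(matches("ABCDEFV.HSAKSHKASHSKAHSKAHSKJAHSQWWQIUEIUWEYWIEI@ABCDEFV.COM"));
        // A string the original pattern would accept.
        System.out.println(matches("ABCDEFV.COM"));
    }
}
```

This only helps when you control the pattern; it does nothing for arbitrary user-supplied regexps, which is where an interruptible evaluation loop matters.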

@tuchida
Contributor

tuchida commented Mar 6, 2022

https://github.com/makenowjust-labs/recheck
I have not tried it, but a library like this seems able to detect time-consuming RegExps.

@blutorange
Contributor

In general, JavaScript regexps are powerful enough that there will always be some that are very slow, even in browsers. For example:

regexp: (.*){1,32000}[bc]
input: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

(If you try this in Chrome dev tools, be prepared to kill the browser tab.)

We are currently still using Rhino to run JavaScript regexps, in order to perform server-side validation of some client-side browser code.

Until now we started a separate thread and used the deprecated Thread#stop method to kill the thread if it did not complete within a certain timeout (as that is the only way to kill unresponsive threads).

Somewhere between Java 17 and 21, Thread#stop was changed so that it no longer works and just throws an UnsupportedOperationException.

As such, a way to interrupt long-running regexps would definitely be great. I'll try to clone this repo locally to see whether the check for Thread.interrupted would work.
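
The timeout pattern described above can be sketched without Thread#stop, provided the long-running loop polls the interrupt flag. In the sketch below, `pollingLoop` is a hypothetical stand-in for the regexp evaluation (it is not Rhino code), and the class and method names are assumptions:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RegexpTimeout {
    // Hypothetical stand-in for a regexp loop that never finishes on its own
    // but checks the interrupt flag on every iteration.
    static long pollingLoop() {
        long steps = 0;
        while (!Thread.currentThread().isInterrupted()) {
            steps++;
        }
        return steps;
    }

    // Runs the loop on a worker thread and cancels it after timeoutMillis.
    // Returns true if the timeout fired and the worker actually exited.
    static boolean cancelAfter(long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<Long> task = RegexpTimeout::pollingLoop;
        Future<Long> result = executor.submit(task);
        try {
            try {
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false; // finished on its own, no cancellation needed
            } catch (TimeoutException e) {
                result.cancel(true); // interrupts the worker thread
                executor.shutdown();
                // True only if the worker noticed the interrupt and stopped.
                return executor.awaitTermination(1, TimeUnit.SECONDS);
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Unlike Thread#stop, this is cooperative: if the loop never checks the flag, `cancel(true)` has no effect and the worker keeps running.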

blutorange pushed a commit to blutorange/rhino that referenced this issue Jan 17, 2024
Uses Thread.currentThread().isInterrupted() so that the interruption flag remains set to true.
We only terminate the RegExp evaluation loop; other (potentially third-party calling)
code may still have to check the interrupted flag to stop its own execution as well.

I also added a test with a long-running regexp that fails without the interrupt check.
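
The difference the commit message relies on: the static `Thread.interrupted()` reads and clears the flag, while `Thread.currentThread().isInterrupted()` only reads it. A small sketch of that behaviour, with made-up class and method names:

```java
public class InterruptFlagDemo {
    // Returns {flag seen by the check, flag still set after the check}.
    static boolean[] checkWithIsInterrupted() {
        Thread.currentThread().interrupt();
        boolean seen = Thread.currentThread().isInterrupted(); // read only
        boolean stillSet = Thread.currentThread().isInterrupted();
        Thread.interrupted(); // clear the flag so the calling thread is left clean
        return new boolean[] {seen, stillSet};
    }

    // Same shape, but using the static method that also clears the flag.
    static boolean[] checkWithInterrupted() {
        Thread.currentThread().interrupt();
        boolean seen = Thread.interrupted(); // read and clear
        boolean stillSet = Thread.currentThread().isInterrupted();
        return new boolean[] {seen, stillSet};
    }
}
```

With the read-only check, code further up the call stack can still observe the interrupt after the regexp loop has bailed out.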
gbrail pushed a commit that referenced this issue Apr 27, 2024
* Make regexp execution loop interruptible #1189

Co-authored-by: Andre Wachsmuth <awa@xima.de>