Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make regexp execution loop interruptible #1189 #1440

Merged
merged 3 commits into from
Apr 27, 2024

Conversation

blutorange
Copy link
Contributor

@blutorange blutorange commented Jan 17, 2024

#1189

Uses Thread.currentThread.isInterrupted() so that the interruption flag remains set to true, we only terminate the RegExp evaluation loop, but other (potentially third-party calling) code may still have to check for the interrupted flag to stop its execution as well.

I also added a test with a long-running regexp that fails without the interrupt check.

Perhaps it would have been better to throw an InterruptedException, but that would require changing the signature of many methods.

Uses Thread.currentThread.isInterrupted() so that the interruption flag remains set to true,
we only terminate the RegExp evaluation loop, but other (potentially third-party calling)
code may still have to check for the interrupted flag to stop its execution as well.

I also added a test with a long-running regexp that fails without the interrupt check.
@gbrail
Copy link
Collaborator

gbrail commented Jan 25, 2024

Can I suggest another way to implement this that could help solve your problem, but do it without adding code for people who don't need that extra check?

The Context class has a methods called "setInstructionObserverThreshold." When set, the interpreter or the generated bytecode will call a method on the ContextFactory (which you implement using inheritance) whenever N instructions are executed. You can have this method be implemented to check for thread interruption if that's what works best for you.

The thing you'd do in this case, then, would be to add a call to ScriptRuntime.addInstructionCount on every iteration of that top-level loop in the regex. (I'd suggest just calling each trip through that loop as, I don't know, let's say 10 "instructions," but it could be a different value.)

I think this would be more general than just checking for interruption in that loop, and means that you can use the same mechanism to check for long-running regexps as what we already have for checking for long-running code in general.

@blutorange
Copy link
Contributor Author

Thanks for the feedback. We could work with that solution as well, if you think that helps performance. Previously believe I had already tried setting an observer threshold (which did not work, as it wasn't yet implemented) I'll take a look if I can adjust it accordingly.

@blutorange
Copy link
Contributor Author

ScriptRuntime.addInstructionCount works just as well, and yeah, that seems like a solution that integrates better into the existing mechanism. I used 5 for the instruction count. A regexp like /(.*){1,32000[bc]/.test("aaaaaaaaaaaaaaaaaaaaaaa"); takes a split second in browsers to execute (and a few seconds in rhino) and is enough to reach a quota of Integer.MAX_VALUE

@gbrail
Copy link
Collaborator

gbrail commented Apr 27, 2024

Thanks for that -- it seems to be fine!

@gbrail gbrail merged commit dcd8e05 into mozilla:master Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants