-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable regular expressions in Painless by default #20397
Comments
I don't know how complex it would be but maybe we can enable lucene regular expressions by default and only if the user enables it allow complex ones? |
A node level setting or something sent with the request like we're talking about for assert? If a node level setting then a dynamic one or a static one?
Lucene's syntax is not a drop in replacement for java. It'd be nice if we only had a single regex syntax but I can see the allure as I know they are safe. I'm also happy to investigate any alternative that lets us ship a safe regex engine out of the box. All of this stuff is experimental so it is allowed to change during the 5.x line and |
Whatever we do, I'm happy to take the issue. |
I think it should be a static node level setting, like the settings which enable/disable scripting.
Yeah, the bugger is that the regex syntax is already used by Java rx. I don't know how feasible it would be to have |
I think because we've marked it experimental we should allow ourselves to do this if we think it is right. I don't think it'd be impossible to do something like that. |
Ok - I've self assigned this. I'll make the setting soon-ish. |
Just to note, here's the regex that @clintongormley mentioned that was causing the slowness |
@rjernst I know you had some ideas about the way options should be implemented for languages... Curious to hear your thoughts since whatever option is added here would likely be how other options are done in the future as well (asserts, loop counter, etc.) |
I think this one has to be a node level non-dynamic setting because it is part of the sandbox protections. I'd love asserts to be a request level setting because it is part of the debugging process. I guess I just think those are different things. |
^^ Agreed these are different. I realize that I have done all of zero settings work so I will defer on this to others that know that code better than I do. I guess there's always the concern that the setting will end up being different on different nodes and crazy things happen? |
I think this is a more general problem. It'd be nice if |
I don't view loop counter as something we need to "protect" with security. DOS isn't something we can really protect against (and there are already numerous other ways to do this on an open cluster, for example with nasty prefix queries). As far as asserts or other debugging features, I don't think these should be flags at all. Instead, they should not be available in scripts for real apis (eg in score scripts etc), but we should have an api similar to _analyze where you can run a script once with some given input (and that input would depend on which context it is expected to be run in, for example, with a score script, you would have to pass in the doc variable essentially). In this api, we can always have debugging features available (eg asserts, logging, whatever). |
I think the goal of issues like this one are to do more to protect against it. |
I disagree. I think the goal of issues like this are to not have trappy features that perform poorly. |
If that were the goal then we'd expose this setting on the request level or at least make it dynamic. Or maybe remove regex support entirely. Instead @clintongormley wants this static so it can be tightly controlled by an administrator. |
Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes elastic#20397
I've pushed #20427 to master which makes the non-dynamic setting we talk about in the top of the issue. I've reopened this because we're still hashing out if that is exactly what we want. If it turns out to be what we want we can close then. If not then we can implement it on top of the start I've made in #20427. |
Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes #20397
Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes #20397
@nik9000 can we remove the blocker label from this since we've pushed something addressing this? |
I suspect we should, yeah. |
Given that the title of this issue has been dealt with, I'm going to close it. Further discussions can happen on their own tickets. |
Painless is designed to be fast and safe. The one chink in its armour is Java regular expressions. A pathological regular expression could run in quadratic time or even throw a StackOverflowError.
I think we should disable Java regular expressions by default and add a setting which allows them to be enabled.
Nice to have: support Lucene regular expressions for matching without capturing, as they have a safety mechanism built in.
The text was updated successfully, but these errors were encountered: