Disable regular expressions in Painless by default #20397

clintongormley · 2016-09-09T08:17:42Z

Painless is designed to be fast and safe. The one chink in its armour is Java regular expressions. A pathological regular expression could run in quadratic time or even throw a StackOverflowError.

I think we should disable Java regular expressions by default and add a setting which allows them to be enabled.

Nice to have: support Lucene regular expressions for matching without capturing, as they have a safety mechanism built in.

s1monw · 2016-09-09T09:04:38Z

I don't know how complex it would be but maybe we can enable lucene regular expressions by default and only if the user enables it allow complex ones?

nik9000 · 2016-09-09T12:26:56Z

I think we should disable Java regular expressions by default and add a setting which allows them to be enabled.

A node level setting or something sent with the request like we're talking about for assert? If a node level setting then a dynamic one or a static one?

I don't know how complex it would be but maybe we can enable lucene regular expressions by default and only if the user enables it allow complex ones?

Lucene's syntax is not a drop in replacement for java. It'd be nice if we only had a single regex syntax but I can see the allure as I know they are safe.

I'm also happy to investigate any alternative that lets us ship a safe regex engine out of the box. All of this stuff is experimental so it is allowed to change during the 5.x line and

nik9000 · 2016-09-09T14:29:12Z

Whatever we do, I'm happy to take the issue.

clintongormley · 2016-09-09T18:28:06Z

A node level setting or something sent with the request like we're talking about for assert? If a node level setting then a dynamic one or a static one?

I think it should be a static node level setting, like the settings which enable/disable scripting.

Lucene's syntax is not a drop in replacement for java. It'd be nice if we only had a single regex syntax but I can see the allure as I know they are safe.

Yeah, the bugger is that the regex syntax is already used by Java rx. I don't know how feasible it would be to have /.../ to mean Lucene rx, but (eg) have java/.../ mean Java rx.

nik9000 · 2016-09-09T18:31:39Z

Yeah, the bugger is that the regex syntax is already used by Java rx. I don't know how feasible it would be to have /.../ to mean Lucene rx, but (eg) have java/.../ mean Java rx.

I think because we've marked it experimental we should allow ourselves to do this if we think it is right. I don't think it'd be impossible to do something like that.

nik9000 · 2016-09-09T18:54:07Z

Ok - I've self assigned this. I'll make the setting soon-ish.

jdconrad · 2016-09-10T00:03:29Z

Just to note, here's the regex that @clintongormley mentioned that was causing the slowness
'xxxxxxxxxxxxxxxx'=~/(x*x*)+y/

jdconrad · 2016-09-12T17:23:29Z

@rjernst I know you had some ideas about the way options should be implemented for languages... Curious to hear your thoughts since whatever option is added here would likely be how other options are done in the future as well (asserts, loop counter, etc.)

nik9000 · 2016-09-12T17:34:34Z

I think this one has to be a node level non-dynamic setting because it is part of the sandbox protections. I'd love asserts to be a request level setting because it is part of the debugging process. I guess I just think those are different things.

jdconrad · 2016-09-12T17:43:08Z

^^ Agreed these are different. I realize that I have done all of zero settings work so I will defer on this to others that know that code better than I do. I guess there's always the concern that the setting will end up being different on different nodes and crazy things happen?

nik9000 · 2016-09-12T18:00:59Z

I guess there's always the concern that the setting will end up being different on different nodes and crazy things happen?

I think this is a more general problem. It'd be nice if elasticsearch.yaml could contain just the connection information and you could use the API to configure all the settings but this doesn't play nice with settings that we don't want to be changed easily like this one. So far for those settings we've just made them non-dynamic so they must be read from elasticsearch.yaml and lived with the possibility that they differ on different nodes.

rjernst · 2016-09-12T18:01:04Z

I don't view loop counter as something we need to "protect" with security. DOS isn't something we can really protect against (and there are already numerous other ways to do this on an open cluster, for example with nasty prefix queries).

As far as asserts or other debugging features, I don't think these should be flags at all. Instead, they should not be available in scripts for real apis (eg in score scripts etc), but we should have an api similar to _analyze where you can run a script once with some given input (and that input would depend on which context it is expected to be run in, for example, with a score script, you would have to pass in the doc variable essentially). In this api, we can always have debugging features available (eg asserts, logging, whatever).

nik9000 · 2016-09-12T18:03:45Z

DOS isn't something we can really protect against (and there are already numerous other ways to do this on an open cluster, for example with nasty prefix queries).

I think the goal of issues like this one are to do more to protect against it.

rjernst · 2016-09-12T18:19:04Z

I think the goal of issues like this one are to do more to protect against it.

I disagree. I think the goal of issues like this are to not have trappy features that perform poorly.

nik9000 · 2016-09-12T18:23:11Z

I think the goal of issues like this are to not have trappy features that perform poorly.

If that were the goal then we'd expose this setting on the request level or at least make it dynamic. Or maybe remove regex support entirely. Instead @clintongormley wants this static so it can be tightly controlled by an administrator.

Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes elastic#20397

nik9000 · 2016-09-12T18:31:03Z

I've pushed #20427 to master which makes the non-dynamic setting we talk about in the top of the issue. I've reopened this because we're still hashing out if that is exactly what we want. If it turns out to be what we want we can close then. If not then we can implement it on top of the start I've made in #20427.

Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes #20397

dakrone · 2016-09-14T21:14:09Z

@nik9000 can we remove the blocker label from this since we've pushed something addressing this?

nik9000 · 2016-09-14T21:24:38Z

I suspect we should, yeah.

clintongormley · 2016-09-15T12:48:58Z

Given that the title of this issue has been dealt with, I'm going to close it. Further discussions can happen on their own tickets.

clintongormley added >enhancement blocker :Plugin Lang Painless v5.0.0-beta1 labels Sep 9, 2016

nik9000 self-assigned this Sep 9, 2016

nik9000 mentioned this issue Sep 12, 2016

Disable regexes by default in painless #20427

Merged

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Sep 12, 2016

Disable regexes by default in painless

69bf08f

Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes elastic#20397

nik9000 closed this as completed in #20427 Sep 12, 2016

nik9000 reopened this Sep 12, 2016

nik9000 added a commit that referenced this issue Sep 12, 2016

Disable regexes by default in painless

1bd9905

Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes #20397

nik9000 added a commit that referenced this issue Sep 12, 2016

Disable regexes by default in painless

23664c1

Adds a new node level, non-dynamic setting, `script.painless.regex.enabled` can be used to enable regexes. Closes #20397

clintongormley added the v5.0.0 label Sep 14, 2016

clintongormley removed the v5.0.0-beta1 label Sep 14, 2016

nik9000 removed the blocker label Sep 14, 2016

clintongormley closed this as completed Sep 15, 2016

clintongormley added v5.0.0-beta1 and removed v5.0.0 labels Sep 15, 2016

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disable regular expressions in Painless by default #20397

Disable regular expressions in Painless by default #20397

clintongormley commented Sep 9, 2016

s1monw commented Sep 9, 2016

nik9000 commented Sep 9, 2016

nik9000 commented Sep 9, 2016

clintongormley commented Sep 9, 2016

nik9000 commented Sep 9, 2016

nik9000 commented Sep 9, 2016

jdconrad commented Sep 10, 2016 •

edited

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016 •

edited

nik9000 commented Sep 12, 2016

rjernst commented Sep 12, 2016 •

edited

nik9000 commented Sep 12, 2016

rjernst commented Sep 12, 2016

nik9000 commented Sep 12, 2016

nik9000 commented Sep 12, 2016

dakrone commented Sep 14, 2016

nik9000 commented Sep 14, 2016

clintongormley commented Sep 15, 2016

Disable regular expressions in Painless by default #20397

Disable regular expressions in Painless by default #20397

Comments

clintongormley commented Sep 9, 2016

s1monw commented Sep 9, 2016

nik9000 commented Sep 9, 2016

nik9000 commented Sep 9, 2016

clintongormley commented Sep 9, 2016

nik9000 commented Sep 9, 2016

nik9000 commented Sep 9, 2016

jdconrad commented Sep 10, 2016 • edited

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016 • edited

nik9000 commented Sep 12, 2016

rjernst commented Sep 12, 2016 • edited

nik9000 commented Sep 12, 2016

rjernst commented Sep 12, 2016

nik9000 commented Sep 12, 2016

nik9000 commented Sep 12, 2016

dakrone commented Sep 14, 2016

nik9000 commented Sep 14, 2016

clintongormley commented Sep 15, 2016

jdconrad commented Sep 10, 2016 •

edited

jdconrad commented Sep 12, 2016 •

edited

rjernst commented Sep 12, 2016 •

edited