Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

require "lang" when using inline scripting #20122

Closed
djschny opened this issue Aug 23, 2016 · 21 comments

Comments

Projects
None yet
10 participants
@djschny
Copy link
Contributor

commented Aug 23, 2016

Currently the lang option when using an inline script is optional and depending on the version of Elasticsearch, it defaults to a particular language. The issue is to represent the merit of instead requiring the lang option always. I will offer up some initial benefits that I can see with this:

  • consistency - for Stored Scripts (previously Indexed Scripts) and file based scripts the language is required in some form or another (URL path, filename, etc.). But it is not for inline scripts. I'm not sure if there is a historical reason for this or not.
  • expectation - if it is required then a user will know what to expect when their script is run; for simple scripts it is possible the syntax may work in multiple languages but is that desirable? When an engineer is writing a script they probably know what language they are writing for so why not have this be explicit? Especially as a script may evolve in the future as application requirements change
  • language changes - thinking forward as languages change if there are certain items that are not backwards compatible then it would give the option for a user to control this as they convert/migrate their scripts. For example painless and the painless-v2

Just some initial options/rationale in regards to why I had mentioned it on the original PR of where the topic came up.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

I like that it would make the change of default language less trappy and help debug slow queries since we would figure out the script language being used by just looking at the query instead of having to ask what the script.default_lang is.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

I also think that being more verbose here and explicit is actually not just desirable i think it's also required. Think of a percolator query that has a script in it (I am not saying you should do that) - with a default language that is mutable we might fail the execution after upgrade. I also think it's very very trappy to have it as a node level setting that can be different per node.

@clintongormley

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

I'm +1 on doing this. Should we perhaps then leave groovy as the default value for existing scripts in percolator and watcher, for bwc?

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

If we make it required, why would there be a default?

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

Please don't make groovy a default to anything.

Remember how it resulted in remote execution simply by being on the classpath at all? We should push it out to plugins, then out of here completely, as soon as possible.

its large, its slow, its trappy, its insecure, its good-for-nothing.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

I am -1 to this issue myself, because it only tends to lead us back towards groovy. Lets just have painless as the default.

@clintongormley

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

If we make it required, why would there be a default?

I mean that we should remove the node-level default scripting language setting, but then only for existing scripts in watcher and the percolator that don't specify a language (in other words are already using groovy), we allow those to continue using groovy.

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

we allow those to continue using groovy.

That defeats all the work that was done to add groovy features to painless to make migration seamless. If the user must manually do some work to change to painless, they won't, and they will still be hit by trappy performance issues and insecurity.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

Since we agreed to clarify, my -1 is a veto. My technical justifications are already public as CVEs.

@jdconrad

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

We have worked very hard to make a embedded language that makes sense for a search engine that can be used remotely without getting root-kitted, Groovy is the exact opposite of those things. Why would we switch it back to be the default again? Anyone upgrading to 5.0 is going to have some work to do if they have any complexities in their settings unrelated to scripting, so scripting would be another thing they have to upgrade appropriately.

@clintongormley

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

Why would we switch it back to be the default again?

I am talking only about scripts which already exist in the index, and which don't specify a language. They are already in groovy, so the user either has to reindex them to add the lang or rewrite them after upgrading. ONLY for these existing scripts would I consider the default to be groovy. All other scripts would require a lang parameter.

@jdconrad

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

But for those users when upgrading they can just switch the default lang parameter to Groovy, I thought that was sort of the compromised work around here as we specified in the migration docs.

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

All the work to make painless more inline with groovy was done explicitly to allow easier upgrading. If a script only uses features of groovy which are also in painless, then there is no work necessary if we continue having a default (but there is a lot of benefit, both in performance and security, which was the point of developing painless in the first place). If we remove having a default, then they must make a choice, but that choice will tend to be "just put in groovy" because that is what the default was before (and hence Robert's disinclination I believe). I'm inclined to think we should only remove the default once groovy is gone, to further dissuade users from falling into the trap-in-a-box called groovy.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

several aspects:

  • this is orthogonal to the default changing from groovy to painless
  • don't take things personal as if anybody questioned the move to painless, nobody did
  • we have a real problem that is with scripts that have no explicit lang set in the percolator / watcher and we need to fix it.
  • the default settings is trappy as hell as it needs to be per node and consistent. WE NEED TO FIX THIS.
  • if we do anything here, what we do is up in the air but it's unrelated to the default change.

we all calm down here we are discussing options, we have found some real issues here that are entirely unrelated to groovy, painless, javascript, haskell whatever. We can name options for BWC even if they are named groovy. By no means its anything we decided or even consider to put in a PR at this point but lets brainstorm what we do.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

Its not orthogonal at all.

its both linked from and created by the issue to change the default. Also it continues the trend of making it impossible to fix anything here, even if that thing causes remote execution vulnerabilities, has slow performance traps, doesnt matter.

As I said, I'm -1 to the issue. That means I will work very very very hard to prevent it. At least as hard as you guys are working against us to fix these issues. Two can play at this game.

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 23, 2016

the default settings is trappy as hell as it needs to be per node and consistent. WE NEED TO FIX THIS.

I agree with this, the default lang setting needs to be a cluster setting, not a node setting. I can work on a PR for that.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

As I said, I'm -1 to the issue. That means I will work very very very hard to prevent it. At least as hard as you guys are working against us to fix these issues. Two can play at this game.

this is not a game and I am not playing with you. Find somebody else.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2016

It sure seems like one, that's why elasticsearch was repeatedly remote-CVE'd.

What should have happened is, nearly immediately this crazy broken feature should have been dropped, falling back to only what was safe (expressions).

Instead, various attempts were made to preserve it at all costs: a sandbox was added but that was easily foiled (CVE-2015-1427), then it was "disabled" but that also did not prevent the problems (CVE-2015-5377), users could be hacked simply by it being on the classpath.

Besides those issues (which should have been enough), there are constantly performance issues around the thing among other problems.

This is the penultimate broken feature. We should have simply replaced it with expressions ASAP (#10491), but instead it seemed we had to move mountains before we could remove the broken feature.

And move mountains @jdconrad did! Yet still here we are, despite all that work, unable to do the right thing.

@eskibars

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2016

Maybe I'm missing something, but I don't understand how only switching the default language does anything but make things more trappy. You had a script that used to work in groovy and now it suddenly doesn't. Or maybe it does... Or worst possible, maybe it works 99% of the time because we didn't make 100% feature and syntax parity a design specification of Painless (not that we should have!) and some operational divergence is only hit in the script e.g. 1 out of a million docs. For those reasons, even if Painless looks and feels like Groovy the vast majority of the time, switching existing Groovy scripts behind the scenes to execute in Painless seems super trappy to me unless we could somehow essentially guarantee exactly the same results.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2016

So we are stuck with groovy forever now because bad decisions were made in the past and despite our software still being insanely immature, we'd like to pretend our company is so big that we cant fix such things? Screw that. I wont let it happen. Follow the cve links i provided and start to educate yourself on the technicals. Unauthenticated remote code execution is the worst of the worst and we let it happen time and time again, because of folks that are clueless o the security issues at hand.

Stop fighting me here, the technicals are cleqr and that is what we must do. You cant not have authentication and support features like this too without serious work. Browser companies and shockwave and java applets and all of it tell how that story goes. It never works out.

Im not concerned about the end user experience whatsoever here. The situation is dire and the only alternative i see to painless is removing this scripting stuff completely.

@martijnvg

This comment has been minimized.

Copy link
Member

commented Sep 2, 2016

This was discussed during fixitfriday and came up with the following:

  • Upon storing / serialising the script the language is always going to be persisted. (even if the default language is used)
  • The lang parameter is not going to be required upon creation of the script, and will always default to painless (hardcoded). The script.default_lang setting will disappear.
  • The legacy default lang for already stored scripts (for example in percolator query) is going to be groovy. This would actually help upgrading to ES 5 and this default for already stored scripts will be removed later. For these scripts the default can be configured via script.legacy.default_lang setting and defaults groovy.

@martijnvg martijnvg closed this in 245882c Sep 6, 2016

MaineC pushed a commit to MaineC/elasticsearch that referenced this issue Sep 7, 2016

* Removed `script.default_lang` setting and made `painless` the hardc…
…oded default script language.

** The default script language is now maintained in `Script` class.
* Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries).  This defaults to groovy.
** Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting.
** In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations.
* The `lang` script field is always serialized (toXContent).

Closes elastic#20122
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.