Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

5.0: pipeline painless script error doesn't fail fast #21842

Closed
niemyjski opened this issue Nov 29, 2016 · 6 comments
Closed

5.0: pipeline painless script error doesn't fail fast #21842

niemyjski opened this issue Nov 29, 2016 · 6 comments
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement

Comments

@niemyjski
Copy link
Contributor

I'm in the process of converting a groovy script to painless (previously run as a transform). One thing I noticed was that even through the script was incorrect it was accepted. Part of the mantra for 5.x is fail fast so as to give you a good dev experience.

PUT _ingest/pipeline/events-pipeline
{
  "processors": [
    {
      "script": {
        "inline": "\r\nif (!ctx._source.containsKey('data') || !(ctx._source.data.containsKey('@error') || ctx._source.data.containsKey('@simple_error')))\r\n    return\r\n\r\ndef types = []\r\ndef messages = []\r\ndef codes = []\r\ndef err = ctx._source.data.containsKey('@error') ? ctx._source.data['@error'] : ctx._source.data['@simple_error']\r\ndef curr = err\r\nwhile (curr != null) {\r\n    if (curr.containsKey('type'))\r\n        types.add(curr.type)\r\n    if (curr.containsKey('message'))\r\n        messages.add(curr.message)\r\n    if (curr.containsKey('code'))\r\n        codes.add(curr.code)\r\n    curr = curr.inner\r\n}\r\n\r\nerr['all_types'] = types.join(' ')\r\nerr['all_messages'] = messages.join(' ')\r\nerr['all_codes'] = codes.join(' ')"
      }
    }
  ]
}

I don't see a failure until I try to ingest data.

PUT /test-events-201502/events/54dbc16ca0f5c61398427b00?op_type=create&pipeline=events-pipeline
{
  "id": "54dbc16ca0f5c61398427b00",
  "organization_id": "1ecd0826e447ad1e78877555",
  "project_id": "1ecd0826e447ad1e78877ab2",
  "stack_id": "1ecd0826e447a44e78877ab1",
  "is_first_occurrence": false,
  "is_fixed": true,
  "is_hidden": false,
  "created_utc": "2015-02-11T20:54:04.4082284",
  "type": "log",
  "source": "GET /Print",
  "date": "2015-02-11T20:54:04.3457274+00:00",
  "data": {
    "@version": "1.2.3.0",
    "@level": "Error"
  }
}

With the error

{
  "error": {
    "root_cause": [
      {
        "type": "exception",
        "reason": "java.lang.IllegalArgumentException: ScriptException[compile error]; nested: IllegalArgumentException[invalid sequence of tokens near ['types'].]; nested: NoViableAltException;",
        "header": {
          "processor_type": "script"
        }
      }
    ],
    "type": "exception",
    "reason": "java.lang.IllegalArgumentException: ScriptException[compile error]; nested: IllegalArgumentException[invalid sequence of tokens near ['types'].]; nested: NoViableAltException;",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "ScriptException[compile error]; nested: IllegalArgumentException[invalid sequence of tokens near ['types'].]; nested: NoViableAltException;",
      "caused_by": {
        "type": "script_exception",
        "reason": "compile error",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "invalid sequence of tokens near ['types'].",
          "caused_by": {
            "type": "no_viable_alt_exception",
            "reason": null
          }
        },
        "script_stack": [
          "... r')))\r\n    return\r\n\r\ndef types = []\r\ndef messages  ...",
          "                             ^---- HERE"
        ],
        "script": "\r\nif (!ctx._source.containsKey('data') || !(ctx._source.data.containsKey('@error') || ctx._source.data.containsKey('@simple_error')))\r\n    return\r\n\r\ndef types = []\r\ndef messages = []\r\ndef codes = []\r\ndef err = ctx._source.data.containsKey('@error') ? ctx._source.data['@error'] : ctx._source.data['@simple_error']\r\ndef curr = err\r\nwhile (curr != null) {\r\n    if (curr.containsKey('type'))\r\n        types.add(curr.type)\r\n    if (curr.containsKey('message'))\r\n        messages.add(curr.message)\r\n    if (curr.containsKey('code'))\r\n        codes.add(curr.code)\r\n    curr = curr.inner\r\n}\r\n\r\nerr['all_types'] = types.join(' ')\r\nerr['all_messages'] = messages.join(' ')\r\nerr['all_codes'] = codes.join(' ')",
        "lang": "painless"
      }
    },
    "header": {
      "processor_type": "script"
    }
  },
  "status": 500
}
@niemyjski
Copy link
Contributor Author

Is there any good way to evaluate a script for syntax errors when adding a pipeline?

@rjernst
Copy link
Member

rjernst commented Nov 29, 2016

You are using an inline script. Instead, store the script, then reference it by id. See https://www.elastic.co/guide/en/elasticsearch/reference/5.0/modules-scripting-using.html#modules-scripting-stored-scripts

@niemyjski
Copy link
Contributor Author

I really don't think that should matter, when a pipeline is created any script should at least be checked to ensure it can be compiled.

On a side note any reason for not using an inline script (besides it could be evicted from memory)

@clintongormley
Copy link

I really don't think that should matter, when a pipeline is created any script should at least be checked to ensure it can be compiled.

I agree.

@clintongormley clintongormley added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement labels Nov 29, 2016
@clintongormley
Copy link

@talevy could you take a look at this when you have time please?

@talevy
Copy link
Contributor

talevy commented Nov 29, 2016

sure. This does seem to be an oversight on our part for sure. The ScriptProcessor leverages the ScriptService's caching strategies to keep around a compiled version of each script passed in. The reason we chose not to compile the script up-front is due to any changes that may occur with respect to stored scripts, or file-based scripts. Didn't want the pipeline to have to listen to such changes.

Since inline scripts are not subject to such management, should be straightforward to treat inline scripts separately and compile it ourselves.

talevy added a commit to talevy/elasticsearch that referenced this issue Dec 14, 2016
talevy added a commit that referenced this issue Dec 15, 2016
…#21858)

Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.

Fixes #21842.
talevy added a commit that referenced this issue Dec 15, 2016
…#21858)

Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.

Fixes #21842.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement
Projects
None yet
Development

No branches or pull requests

4 participants