Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

6.7: Remove IP fields from default_field in Elasticsearch template #11399

Merged
merged 2 commits into from
Mar 25, 2019

Conversation

cwurm
Copy link
Contributor

@cwurm cwurm commented Mar 22, 2019

#11035 added IP fields to the default_field array in Elasticsearch templates. Unfortunately, this has the side effect of breaking simple queries directly against Elasticsearch:

GET packetbeat-*/_search
{ 
  "query": {
    "query_string": {
      "query": "foo"
    }  
  }
}

Leads to:

{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n  \"query_string\" : {\n    \"query\" : \"foo\",\n    \"fields\" : [ ],\n    \"type\" : \"best_fields\",\n    \"default_operator\" : \"or\",\n    \"max_determinized_states\" : 10000,\n    \"enable_position_increments\" : true,\n    \"fuzziness\" : \"AUTO\",\n    \"fuzzy_prefix_length\" : 0,\n    \"fuzzy_max_expansions\" : 50,\n    \"phrase_slop\" : 0,\n    \"escape\" : false,\n    \"auto_generate_synonyms_phrase_query\" : true,\n    \"fuzzy_transpositions\" : true,\n    \"boost\" : 1.0\n  }\n}",
        "index_uuid": "50TUmx-qT4et8BprFf8n4w",
        "index": "packetbeat-6.7.0-2019.03.22"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "packetbeat-6.7.0-2019.03.22",
        "node": "P4VMxQ5zTqaTp1KYV4pFOg",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n  \"query_string\" : {\n    \"query\" : \"foo\",\n    \"fields\" : [ ],\n    \"type\" : \"best_fields\",\n    \"default_operator\" : \"or\",\n    \"max_determinized_states\" : 10000,\n    \"enable_position_increments\" : true,\n    \"fuzziness\" : \"AUTO\",\n    \"fuzzy_prefix_length\" : 0,\n    \"fuzzy_max_expansions\" : 50,\n    \"phrase_slop\" : 0,\n    \"escape\" : false,\n    \"auto_generate_synonyms_phrase_query\" : true,\n    \"fuzzy_transpositions\" : true,\n    \"boost\" : 1.0\n  }\n}",
          "index_uuid": "50TUmx-qT4et8BprFf8n4w",
          "index": "packetbeat-6.7.0-2019.03.22",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "'foo' is not an IP string literal."
          }
        }
      }
    ]
  },
  "status": 400
}

Kibana avoids this by adding lenient: true:

The lenient parameter can be set to true to ignore exceptions caused by data-type mismatches, such as trying to query a numeric field with a text query string. Defaults to false.
https://www.elastic.co/guide/en/elasticsearch/reference/6.7/query-dsl-match-query.html:

So it's never a problem for Kibana, and the workaround for the above query is to just add it:

GET _search
{ 
  "query": {
    "query_string": {
      "query": "foo",
      "lenient": true
    }  
  }
}

This PR simply removes the IP fields again to restore previous functionality and avoid breaking any existing queries and dashboards.

But we're stuck between a rock and a hard place here: Adding IP fields is clearly valuable to allow users to just paste an IP into the KQL bar (and into a query_string query for that matter), but we cannot do it without breaking simple queries because lenient defaults to false. :-(

I'll open the same PR for 7.0.

@cwurm
Copy link
Contributor Author

cwurm commented Mar 22, 2019

We might also want to keep this as is if nothing that's shipped is actually broken.

The templates are specifically generated to work within Kibana, and Kibana does not seem to have a problem. That when directly querying Elasticsearch indexes created from Beats templates it's now necessary to add lenient: true is not ideal, but it might be ok - they were not quite designed for that, and it's easy to fix any queries. The tradeoff of making it possible to search for IPs in the KQL bar seems worthy. It might be a breaking change though, so we might only want to do it in 7.0.

What do people think?

@webmat
Copy link
Contributor

webmat commented Mar 22, 2019

Note that a properly populated related.ip may help address this issue, without any changes to query execution options.

@cwurm
Copy link
Contributor Author

cwurm commented Mar 22, 2019

Note that a properly populated related.ip may help address this issue, without any changes to query execution options.

@webmat Isn't related.ip an IP field as well, so it would not be searched by default without the change?

@ruflin
Copy link
Member

ruflin commented Mar 25, 2019

jenkins, test this

@ruflin
Copy link
Member

ruflin commented Mar 25, 2019

I think we should definitively revert this change for 6.7. I would consider it a bug here as it's an unintended breaking change which could cause issues when users updated from 6.6 to 6.7. We could introduce this in 7.0 potentially so I suggest we move this discussion there.

@ruflin
Copy link
Member

ruflin commented Mar 25, 2019

jenkins, test this

ruflin pushed a commit that referenced this pull request Mar 25, 2019
…11400)

Removes IP fields from the `default_field` array in the generated Elasticsearch template.

Further details in the 6.7 PR: #11399
@ruflin
Copy link
Member

ruflin commented Mar 25, 2019

Failing test in winlogbeat is not related. @adriansr will open a follow up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants