Reimplement field aliases as runtime fields #87969

Open
javanna opened this issue Jun 23, 2022 · 8 comments
Labels
>enhancement :Search/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team >tech debt

Comments

@javanna
Member

javanna commented Jun 23, 2022

Support for field aliases was added before runtime fields existed. Field aliases are mapped under properties, can only point to a field mapped under properties, and cannot be modified/removed. Also, field aliases don't play well with multi_fields.
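
For reference, a minimal sketch of what the existing alias mapping described above looks like, built as a Python dict so it can be inspected or sent with any HTTP client. The field names `user_id` and `user_id_alias` are hypothetical and only serve as illustration; the `"type": "alias"` / `"path"` shape is the documented mapping syntax.

```python
import json


def alias_mapping(alias_name, target_field, target_type="keyword"):
    """Build an index mapping body that declares a field alias under properties."""
    return {
        "mappings": {
            "properties": {
                # The concrete field that is actually indexed (hypothetical name).
                target_field: {"type": target_type},
                # The alias sits next to regular fields and refers to its target by path.
                alias_name: {"type": "alias", "path": target_field},
            }
        }
    }


body = alias_mapping("user_id_alias", "user_id")
print(json.dumps(body, indent=2))
```

Because the alias is just another entry under `properties`, it is subject to the same restrictions as the rest of that section, which is where the limitations listed above come from.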

With runtime fields, it is already possible to create a field alias manually: define a runtime field with a small script that returns the value of the field to be aliased. This is much more flexible than declaring a field alias, as it solves the two issues above: the alias can be removed (see #36418), and it can point to any field (runtime as well as indexed). Additionally, it can be declared in the search request. While this is already possible, it is too manual, and we would like to streamline the notion of an alias as part of the runtime section, without requiring a script to be specified.
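
A minimal sketch of the manual approach, assuming a single-valued keyword field. The names `user_id_alias` and `user_id` are hypothetical; the Painless snippet follows the common guard-then-emit pattern and is an illustration, not a proposed implementation.

```python
import json


def runtime_alias(alias_name, target_field, field_type="keyword"):
    """Build a runtime_mappings entry whose script re-emits the target field's value."""
    # Guard against documents that have no value, then emit the first value.
    source = (
        f"if (doc['{target_field}'].size() != 0) "
        f"{{ emit(doc['{target_field}'].value); }}"
    )
    return {alias_name: {"type": field_type, "script": {"source": source}}}


# The same entry could go under "runtime" in the index mappings
# or under "runtime_mappings" in a search request, as done here.
body = {"runtime_mappings": runtime_alias("user_id_alias", "user_id")}
print(json.dumps(body, indent=2))
```

Every alias declared this way carries its own script, which is the manual step the streamlined version would remove.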

There are some backwards compatibility concerns if we want to go ahead and remove field aliases defined under properties, but those could probably be dealt with separately. Ideally, we add support for aliases to the runtime section, and the existing field aliases can shortcut to the new implementation?

@javanna javanna added >enhancement :Search/Mapping Index mappings, including merging and defining field types labels Jun 23, 2022
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jun 23, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@inqueue
Member

inqueue commented Sep 15, 2022

I have a compound query from a Kibana dashboard that shows a fairly slow took time when a runtime field is involved. The query responds quickly when either the multi_match or the runtime_mappings is removed, and slowly when both are used. The query has been simplified for testing:

Request: Runtime Field with Multi Match

POST /my_index*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "multi_match": {
            "query": "d4e75afa-fc01-42c1-a63f-3c92a1289ab1",
            "fields": [],
            "type": "best_fields",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "lenient": true,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        },
        {
          "range": {
            "@timestamp": {
              "from": "2021-09-14T00:00:00.000Z",
              "to": "2022-09-14T13:57:43.153Z",
              "include_lower": true,
              "include_upper": true,
              "format": "strict_date_optional_time",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "stored_fields": "*",
  "runtime_mappings": {
    "fields.env_id": {
      "type": "keyword",
      "script": {
        "source": "if (doc['meta.tag_env-id'].size() != 0){\n    emit(doc['meta.tag_env-id'].value);\n}"
      }
    }
  }
}
Response
{
  "took": 75818,
  "timed_out": false,
  "_shards": {
    "total": 73,
    "successful": 73,
    "skipped": 12,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

Request: Multi Match, No Runtime Field

POST /my_index*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "multi_match": {
            "query": "d4e75afa-fc01-42c1-a63f-3c92a1289ab1",
            "fields": [],
            "type": "best_fields",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "lenient": true,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        },
        {
          "range": {
            "@timestamp": {
              "from": "2021-09-14T00:00:00.000Z",
              "to": "2022-09-14T13:57:43.153Z",
              "include_lower": true,
              "include_upper": true,
              "format": "strict_date_optional_time",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "stored_fields": "*"
}
Response
{
  "took": 440,
  "timed_out": false,
  "_shards": {
    "total": 73,
    "successful": 73,
    "skipped": 12,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

Request: Runtime Field, No Multi-Match

POST /my_index*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "from": "2021-09-14T00:00:00.000Z",
              "to": "2022-09-14T13:57:43.153Z",
              "include_lower": true,
              "include_upper": true,
              "format": "strict_date_optional_time",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "stored_fields": "*",
  "runtime_mappings": {
    "fields.env_id": {
      "type": "keyword",
      "script": {
        "source": "if (doc['meta.tag_env-id'].size() != 0){\n    emit(doc['meta.tag_env-id'].value);\n}"
      }
    }
  }
}
Response
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 73,
    "successful": 73,
    "skipped": 12,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  }
}
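
To put the three `took` values side by side, here is a small sketch that computes how much slower the combined request is than each of the other two. The numbers are copied from the responses above; note that the runs are not strictly comparable, since the runtime-field-only request matched at least 10000 hits while the other two matched none.

```python
# "took" values in milliseconds, copied from the three responses above.
combined_ms = 75818  # runtime field + multi_match
baselines_ms = {
    "multi_match only": 440,
    "runtime field only": 19,
}

# Slowdown factor of the combined request relative to each baseline.
slowdown = {label: round(combined_ms / ms, 1) for label, ms in baselines_ms.items()}

for label, factor in slowdown.items():
    print(f"combined request is {factor}x slower than '{label}'")
```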

@javanna
Member Author

javanna commented Sep 19, 2022

@inqueue I have some ideas, but why have you commented on this existing issue? I don't see the link between what you bring up and what the issue proposes to do.

@inqueue
Member

inqueue commented Sep 19, 2022

> @inqueue I have some ideas, but why have you commented on this existing issue? I don't see the link between what you bring up and what the issue proposes to do.

Hi @javanna, my comment is an expressed concern about the proposal outlined in this issue and I thought it best to bring it up here. Perhaps it would be better to address the concern separately.

@javanna
Member Author

javanna commented Sep 19, 2022

@inqueue can you help me understand your concern with the proposal of re-implementing field aliases as runtime fields? What does that have to do with the recreation you posted above? Do you need help understanding what is causing the slowness there?

@inqueue
Member

inqueue commented Sep 19, 2022

@javanna the concern I have is there seems to be a scenario, like in my example, where using runtime fields in a way that mimics field aliases introduces a large query performance hit, one that is painfully felt while clicking through Kibana dashboards. That is, a multi-match query with a runtime field performs much slower than the same query without the runtime field. I do need help getting to the cause of the slowness which I think is something we should do outside of this issue.

@javanna
Member Author

javanna commented Sep 21, 2022

@inqueue I see, thanks for clarifying. The proposal of this issue is to move field aliases to the runtime section, allowing for more flexibility (another runtime field can be aliased, and field aliases can then be removed), but without the use of scripting, meaning the actual implementation wouldn't be much different from the current one.

@felixbarny
Member

In addition to

Maybe this could also make it possible to define an alias that makes two fields equivalent to one another. This would make it a lot easier to handle cases where a field was renamed and we have both existing and new incoming data with a mix of the old and the new name. Existing queries targeting the old name would then just work on both new and old data. New queries, using the new field names, could then also query the old data.

However, I'm not sure if the runtime section of the mappings is the best place to define such bi-directional aliases. That's because the runtime section is basically a map where the key is a field name. But for this, we'd want to specify multiple field names that should be treated equivalently to one another. The field names may also exist in the mappings section, as indexed fields.


4 participants