ESQL: Allow limiting search to certain data tiers #108264

Open
nik9000 opened this issue May 3, 2024 · 2 comments
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@nik9000
Member

nik9000 commented May 3, 2024

Description

Kibana's Discover has a setting that lets you limit searching to certain tiers (hot, warm, cold, frozen, whatever). That's something ESQL should at least be able to work with. And we should be able to provide the same speed savings for it - namely, we should make sure that such limits keep us from hitting shards on the excluded tiers at all.

We should probably also be able to reference the tier as a bit of index metadata so you can view it in the language. And, likely, filter on it there as well.

@nik9000 nik9000 added >enhancement needs:triage Requires assignment of a team area label :Analytics/ES|QL AKA ESQL labels May 3, 2024
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels May 3, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@bpintea
Contributor

bpintea commented May 13, 2024

I've opened #108558; exposing the `_tier` metadata field might then make it easy to apply the restriction.
Currently this should already be doable by using this field in the `filter` param of the request.
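
As a rough illustration of the workaround mentioned above, here is a minimal Python sketch that builds an ES|QL request body whose Query DSL `filter` excludes some tiers via `_tier`. The helper name `build_esql_request`, the sample query, and the use of `must_not` are assumptions made for this example, not something taken from the issue; the body would be sent to `POST /_query`.

```python
import json


def build_esql_request(query, excluded_tiers):
    """Build an ES|QL request body (hypothetical helper).

    The Query DSL `filter` excludes documents whose `_tier` matches any of
    the given tiers, so shards on those tiers can be skipped.
    """
    body = {"query": query}
    if excluded_tiers:
        body["filter"] = {
            "bool": {
                "must_not": [{"terms": {"_tier": list(excluded_tiers)}}]
            }
        }
    return body


# Example: skip cold and frozen data (index name and query are illustrative).
request_body = build_esql_request(
    "FROM test-frozen | LIMIT 10",
    ["data_cold", "data_frozen"],
)
print(json.dumps(request_body, indent=2))
```

With an empty list of excluded tiers the helper omits `filter` entirely, so the request stays identical to an unrestricted ES|QL query.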

jasonroy7dct added a commit to jasonroy7dct/elasticsearch that referenced this issue May 31, 2024
jasonroy7dct added a commit to jasonroy7dct/elasticsearch that referenced this issue May 31, 2024
vitaliidm added a commit to elastic/kibana that referenced this issue Jul 24, 2024
…a advanced settings (#186908)

## Summary

- addresses elastic/security-team#9228
- introduces a new Kibana advanced setting,
`securitySolution:excludedDataTiersForRuleExecution`, that allows
excluding cold and frozen data tiers from search during rule execution
  - users can add the `data_cold` and/or `data_frozen` tiers
- **ES|QL** rule does not support this feature:
elastic/elasticsearch#108264
- **Machine learning** rule does not support this feature
- Advanced setting available only for ESS

### UI

<img width="2300" alt="Screenshot 2024-07-04 at 17 31 34"
src="https://github.com/elastic/kibana/assets/92328789/39beeda3-8030-4943-959c-53eb064fe5ae">


### Demo

1. Check that there are 3M+ documents in the cold data tier of the
`test-frozen` index
2. When the rule executes, it generates alerts.
3. Check the Kibana ancestor index of a generated alert: it's
`restored-test-frozen-000001`, which confirms the alert was created from a
document in the cold tier
4. In advanced settings, exclude the `data_cold` tier
5. Execute the rule again and observe that no alerts are created


https://github.com/elastic/kibana/assets/92328789/c8b2f612-628a-452d-98e5-555c2e89d957

### How to test

Create a deployment with cold and frozen data tiers and use the following
commands to create the index and ILM policy:

<details>
<summary>Data tiers commands</summary>

```JSON

PUT /_cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "1m"
  }
}


PUT /_ilm/policy/filtering_data_tiers
{
  "policy": {
    "phases": {
      "frozen": {
        "min_age": "10m",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots",
            "force_merge_index": true
          }
        }
      },
      "cold": {
        "min_age": "1m",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots",
            "force_merge_index": true
          },
          "set_priority": {
            "priority": 0
          }
        }
      },
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          }
        }
      }
    }
  }
}


PUT /_index_template/filtering_data_tiers_template
{
  "index_patterns": [
    "filtering_data_tiers*"
  ],
  "template": {
    "settings": {
      "index.lifecycle.name": "filtering_data_tiers",
      "index.lifecycle.rollover_alias": "test-filtering_data_tiers"
    },
    "mappings": {
      "_meta": {
        "version": "1.6.0"
      },
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "host": {
          "properties": {
            "name": {
              "type": "keyword",
              "ignore_above": 1024
            }
          }
        }
      }
    }
  }
}

PUT /filtering_data_tiers-000001
{
  "aliases": {
    "filtering_data_tiers": {
      "is_write_index": true
    }
  }
}


POST filtering_data_tiers/_doc
{
  "@timestamp": "2024-07-08T17:00:01.000Z",
  "host.name": "test-0"
}


```

</details>

**OR**
reach out to @vitaliidm to get access to an already existing deployment/PR
deployment, where the `test-frozen` index has cold and frozen nodes and an ILM
policy that moves data to a tier according to the config.

Check the number of documents in a tier with:

```JSON
GET test-frozen/_count
{
  "query": {
    "bool": {
      "must": {
        "terms": {
          "_tier": ["data_cold"]
        }
      }
    }
  }
}
```
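
To compare counts across tiers when testing, the same `_count` body can be generated once per tier. The following is a small Python sketch under assumptions: the helper name `build_tier_count_body` and the chosen tier list are illustrative, and the printed lines are meant to be pasted into Dev Tools rather than sent by the script.

```python
import json


def build_tier_count_body(tiers):
    """Return a _count request body matching documents on the given tiers."""
    return {"query": {"bool": {"must": {"terms": {"_tier": list(tiers)}}}}}


# Tier names follow the data_* convention used above; adjust as needed.
TIERS = ["data_hot", "data_cold", "data_frozen"]
count_bodies = {tier: build_tier_count_body([tier]) for tier in TIERS}

for tier, body in count_bodies.items():
    print(f"# {tier}")
    print("GET test-frozen/_count")
    print(json.dumps(body, indent=2))
```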

Create a rule of a supported type and query that index.


### Checklist

- [x] Functional changes are covered with a test plan and automated
tests.
  - elastic/security-team#9896

- [x] Comprehensive manual testing is done by two engineers: the PR
author and one of the PR reviewers. Changes are tested in both ESS and
Serverless.

- [x] Functional changes are communicated to the Docs team. A ticket or
PR is opened in https://github.com/elastic/security-docs. The following
information is included: any feature flags used, affected environments
(Serverless, ESS, or both).
  - elastic/security-docs#5483