[Transform] Support top_metrics size param #74420

Open
sophiec20 opened this issue Jun 22, 2021 · 1 comment
Labels
>enhancement, :ml/Transform (Transform), Team:ML (Meta label for the ML team)

Comments

@sophiec20
Contributor

Request

Add support for the top_metrics size parameter in transforms.

Note: This is more complicated than it seems, because it means a single input doc will produce multiple output docs in the transform destination index.

Background

Support for top_metrics in transforms was added in 7.14.

To recap, adding top_metrics allows for:

  • Performance optimisations - it is more efficient to include entity attributes as top_metrics than as group_by fields. For example, if a transform is grouped by customer_id, then the customer's first name, last name and email address can be included more efficiently as top_metrics.
  • Multi-value aggregation support - the code was refactored to support aggs that return multiple values, which makes future extension easier.
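
As a rough illustration of the first point, the sketch below builds a transform request body that groups by customer_id and carries the customer attributes through a top_metrics aggregation instead of extra group_by fields. The index names (`orders`, `customer_summary`), the field names, the aggregation name `customer_details` and the `order_date` sort field are all assumptions made for the example and do not come from this issue.

```python
import json

# Hypothetical index and field names, for illustration only.
transform_body = {
    "source": {"index": "orders"},
    "dest": {"index": "customer_summary"},
    "pivot": {
        # Only the entity key is a group_by field.
        "group_by": {
            "customer_id": {"terms": {"field": "customer_id"}}
        },
        "aggregations": {
            # Entity attributes are carried by top_metrics rather than
            # by additional group_by fields.
            "customer_details": {
                "top_metrics": {
                    "metrics": [
                        {"field": "first_name"},
                        {"field": "last_name"},
                        {"field": "email"},
                    ],
                    # Take the attributes from the most recent document.
                    "sort": {"order_date": "desc"},
                }
            }
        },
    },
}

print(json.dumps(transform_body, indent=2))
```

No size is set here, so the aggregation uses its default of a single top document per customer.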

The top_metrics aggregation includes a size parameter. However, the transform implementation of top_metrics only returns the first element, as described in the limitations (#71850).
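
To make the limitation concrete, here is a minimal sketch of what "only the first element" means for a top_metrics result that holds several entries. The response fragment is hand-written in the shape the aggregation returns (a `top` array of `sort` and `metrics` entries), and the helper `first_top_metrics` is a hypothetical name; it mirrors the behaviour described above and is not the transform's actual code.

```python
# Hand-written fragment in the shape of a top_metrics result for size: 3.
# The values are made up for illustration.
agg_result = {
    "top": [
        {"sort": [30], "metrics": {"email": "newest@example.com"}},
        {"sort": [20], "metrics": {"email": "older@example.com"}},
        {"sort": [10], "metrics": {"email": "oldest@example.com"}},
    ]
}


def first_top_metrics(result):
    """Return the metrics of the first 'top' entry, ignoring the rest."""
    top = result.get("top", [])
    return dict(top[0]["metrics"]) if top else {}


written = first_top_metrics(agg_result)
print(written)
```

The second and third entries never reach the destination index, which is why supporting size would mean emitting several output documents per group.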

Currently, the size param can be specified in the transform configuration. It is passed to the Elasticsearch aggregation framework, but only the first element is ever written to the transform destination index. From an end-user perspective it would have been better for values other than size:1 to fail fast. However, because validation is performed by the aggs framework, this would have added disproportionate code complexity.
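
Until something like this is supported, one possible workaround is a client-side check run before a transform is created. The sketch below walks the aggregations of a transform's pivot and reports any top_metrics whose size is not 1. The function name `find_oversized_top_metrics` and the sample config are hypothetical; this is not part of Elasticsearch or of the transform validation discussed above.

```python
def find_oversized_top_metrics(aggs, path=""):
    """Return (aggregation path, size) pairs for top_metrics with size != 1."""
    problems = []
    for name, body in aggs.items():
        here = f"{path}.{name}" if path else name
        top_metrics = body.get("top_metrics")
        # A missing size is treated as 1, the aggregation's default.
        if top_metrics is not None and top_metrics.get("size", 1) != 1:
            problems.append((here, top_metrics["size"]))
        # Recurse into sub-aggregations under either accepted key.
        for key in ("aggs", "aggregations"):
            if key in body:
                problems.extend(find_oversized_top_metrics(body[key], here))
    return problems


# Hypothetical pivot aggregations, for illustration only.
sample_aggs = {
    "customer_details": {
        "top_metrics": {
            "metrics": [{"field": "email"}],
            "sort": {"order_date": "desc"},
            "size": 3,
        }
    },
    "latest_total": {
        "top_metrics": {
            "metrics": [{"field": "total"}],
            "sort": {"order_date": "desc"},
        }
    },
}

problems = find_oversized_top_metrics(sample_aggs)
print(problems)
```

A caller could refuse to submit the transform when the returned list is non-empty, giving the fail-fast behaviour on the client side.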

Status: Not scheduled, requires use case.

elasticmachine added the Team:ML (Meta label for the ML team) label on Jun 22, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)
