
Add a "start_after" parameter or something to Transforms #88646

Closed
benwtrent opened this issue Jul 20, 2022 · 3 comments
Labels
>enhancement :ml/Transform Transform Team:ML Meta label for the ML team

Comments

@benwtrent
Member

Description

A very common use case for transforms is pivoting historical data. But, when there is a large amount of past data, that data may not be very useful in the resulting index.

Users typically achieve this by adding a query filter to the transform:

{"range":{"timestamp": {"gte": "now-1d"}}}

But this may introduce additional issues around:

  • search request caching
  • strange interactions between frequency, change detection, and date histogram
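
For reference, the filter-based approach described above can be sketched as a full transform body. This is only an illustration: the helper name, the index names, and the `user_id` field are hypothetical, and the pivot is a placeholder rather than anything taken from this issue.

```python
import json


def build_filtered_transform_body(source_index, dest_index, lookback="now-1d"):
    """Build a continuous transform body whose source query is range-filtered.

    All names here are illustrative assumptions, not part of the issue.
    """
    return {
        "source": {
            "index": source_index,
            # The filter from the issue: it applies to every checkpoint,
            # not just to the initial batch.
            "query": {"range": {"timestamp": {"gte": lookback}}},
        },
        "pivot": {
            # Hypothetical pivot: latest timestamp per user.
            "group_by": {"user": {"terms": {"field": "user_id"}}},
            "aggregations": {"last_seen": {"max": {"field": "timestamp"}}},
        },
        "dest": {"index": dest_index},
        # Continuous mode, driven by the same timestamp field.
        "sync": {"time": {"field": "timestamp", "delay": "60s"}},
        "frequency": "1m",
    }


if __name__ == "__main__":
    print(json.dumps(build_filtered_transform_body("logs", "logs-by-user"), indent=2))
```

Because the relative `now-1d` bound stays in the source query for the lifetime of the transform, it is re-evaluated on every search, which is where the concerns listed above come from.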

What the user usually wants is:

  • When handling the initial batch of data (all historical), only worry about the last day
  • Then, when running continuously, pick up new data as it changes

It would be very useful to have a new parameter on _start (or something similar) indicating that the initial batch of the transform should be restricted to after some page of results, and that the filter no longer applies once the transform is running continuously.

@elasticsearchmachine added the needs:triage label on Jul 20, 2022
@mark-vieira added the Team:ML label on Jul 20, 2022
@elasticsearchmachine removed the Team:ML label on Jul 20, 2022
@mark-vieira added the Team:ML label on Jul 20, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine removed the needs:triage label on Jul 20, 2022
@mikeh-elastic

We are simulating this feature by starting with a query for recent data and then updating the transform to relax the time range to a much longer interval, or to no time query at all. It would be nice to have this as a core feature of transforms.
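
A rough sketch of the second step of that workaround, assuming the transform was created with a range filter and is later relaxed through the update transform endpoint. The helper name, the `timestamp` field, and the index name are hypothetical; the function only builds the request and does not send it.

```python
def build_relax_filter_request(transform_id, source_index, relaxed_lookback=None):
    """Return (method, path, body) for relaxing the source time filter.

    With relaxed_lookback=None the time filter is dropped entirely;
    otherwise it is widened to the given date-math expression.
    """
    if relaxed_lookback is None:
        query = {"match_all": {}}
    else:
        query = {"range": {"timestamp": {"gte": relaxed_lookback}}}

    path = f"/_transform/{transform_id}/_update"
    # The source index is repeated alongside the new query.
    body = {"source": {"index": source_index, "query": query}}
    return ("POST", path, body)


if __name__ == "__main__":
    print(build_relax_filter_request("recent-only", "logs", "now-365d"))
    print(build_relax_filter_request("recent-only", "logs"))
```

The drawback is the manual second step: someone has to decide when the initial batch is done and issue the update.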

@droberts195
Contributor

We added a from parameter in #91116.
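
As a hedged illustration, assuming `from` is passed as a query-string parameter on the start transform endpoint (the comment above does not show the exact syntax, so check the documentation for your version), a start request could be built like this. The helper name and transform ID are hypothetical.

```python
from urllib.parse import urlencode


def build_start_request(transform_id, start_from=None):
    """Return (method, path) for starting a transform.

    start_from is an optional time expression such as "now-1d"; passing it
    as a `from` query parameter is an assumption of this sketch.
    """
    path = f"/_transform/{transform_id}/_start"
    if start_from is not None:
        path += "?" + urlencode({"from": start_from})
    return ("POST", path)


if __name__ == "__main__":
    print(build_start_request("my-transform", "now-1d"))
```

Unlike the query-filter approach, nothing time-relative is stored in the transform's source query here; the restriction is supplied only when the transform is started.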


6 participants