[Transform] Specify use_point_in_time in settings #5322
Changes from all commits
f8f5745
de6d072
f03d532
7765806
e6f662c
22bdb38
@@ -130,6 +130,16 @@ export class Settings {
* @server_default 500
*/
max_page_search_size?: integer
/**
* Specifies whether the transform checkpoint will use the Point In Time API while searching over the source index.
* In general, Point In Time is an optimization that will reduce pressure on the source index by reducing the amount
* of refreshes and merges, but it can be expensive if a large number of Point In Times are opened and closed for a
* given index. The benefits and impact depend on the data being searched, the ingest rate into the source index, and
* the amount of other consumers searching the same source index.
* @ext_doc_id point-in-time-api
@elastic/developer-docs Is this a correct usage of This Transform API can be configured to use the PIT API, and here
Hey @pquentin, the
Seems to have worked as intended:
FYI we could add a description field to the link definition in
and this would render nicer link text than just
Do you have an example of that?
this could be as simple as
I agree that it would be nice to name this "Open a point in time API" instead of "External documentation."
* @server_default true
*/
use_point_in_time?: boolean

/**
* If `true`, the transform runs in unattended mode. In unattended mode, the transform retries indefinitely in case
What happens if you set it to false? It uses the scroll API?
Nah it just uses a search request without PIT: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/transform/src/main/java/org/elasticsearch/xpack/transform/transforms/TransformIndexer.java#L1154
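To make the difference concrete, here is a minimal sketch of how a search request changes shape with and without a PIT. It is not the `TransformIndexer` code linked above; `buildSearchRequest`, the interfaces, and the `1m` keep-alive are hypothetical names and values chosen for illustration. The one grounded detail is that a search carrying a `pit` object does not name an index, because the PIT id already identifies it.

```typescript
// Hypothetical sketch, not the actual transform implementation.
interface PitReference {
  id: string;
  keep_alive: string;
}

interface SearchRequest {
  // Present only when searching without a PIT.
  index?: string;
  body: {
    query: { match_all: Record<string, never> };
    pit?: PitReference;
  };
}

function buildSearchRequest(sourceIndex: string, pitId?: string): SearchRequest {
  if (pitId !== undefined) {
    // With a PIT, the target index is implied by the PIT id, so it is omitted.
    // The keep_alive value here is an assumption for illustration.
    return {
      body: { query: { match_all: {} }, pit: { id: pitId, keep_alive: "1m" } },
    };
  }
  // Without a PIT, this is a plain search against the source index.
  return { index: sourceIndex, body: { query: { match_all: {} } } };
}

const withoutPit = buildSearchRequest("my-source");
const withPit = buildSearchRequest("my-source", "example-pit-id");
console.log(JSON.stringify(withoutPit), JSON.stringify(withPit));
```

In this sketch, setting `use_point_in_time` to `false` would correspond to always taking the second branch.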
Ideally we'd explain the pros and cons of each approach, or at least outline why you'd want to use the PIT, rather than just the how :)
Should that go on the PIT page? In either case, we'd need someone familiar with PIT to go into that detail. I'm just trying to show what users can configure rather than keeping the setting hidden.
I think what I missed is that transforms use composite aggregations, which offer pagination without using PIT. However, transforms use PIT by default, as it returns more correct results. That said, as it can be expensive if many PITs are open at any given time, this can be disabled. Is my understanding correct?
This is not about explaining what PIT does (especially given that we have a link), but making it clear what the setting does. For example, max_page_search_size is clear here!
right Quentin said it much clearer than me 😄
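As a rough illustration of what the two settings in this diff mean for a user, the sketch below resolves a partial settings object against the `@server_default` values shown in the spec (500 for `max_page_search_size`, `true` for `use_point_in_time`). The `TransformSettings` interface and the `resolveSettings` helper are hypothetical and cover only these two fields; the real `Settings` class has more.

```typescript
// Hypothetical subset of the transform Settings class from the diff.
interface TransformSettings {
  max_page_search_size?: number;
  use_point_in_time?: boolean;
}

// Defaults taken from the @server_default annotations in the diff.
const SERVER_DEFAULTS: Required<TransformSettings> = {
  max_page_search_size: 500,
  use_point_in_time: true,
};

// Fill in any setting the user left unset with the server default.
function resolveSettings(settings: TransformSettings = {}): Required<TransformSettings> {
  return {
    max_page_search_size:
      settings.max_page_search_size ?? SERVER_DEFAULTS.max_page_search_size,
    use_point_in_time:
      settings.use_point_in_time ?? SERVER_DEFAULTS.use_point_in_time,
  };
}

// A user who finds PITs too expensive on a busy source index opts out explicitly.
const resolved = resolveSettings({ use_point_in_time: false });
console.log(resolved);
```

Using `??` rather than object spread keeps an explicit `false` while still replacing `undefined` with the default.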