Elasticsearch enable Point In Time based searches #30824

Merged
16 commits merged into apache:master on Apr 30, 2024

@prodriguezdefino prodriguezdefino commented Apr 2, 2024

This change adds a BoundedReader implementation based on the newer Point In Time (PIT) search API (see doc). PIT is the recommended mode for reading from large indexes where retrieving the documents requires deep iteration (more than 10000 hits per slice).

This mode should be available for Elasticsearch clusters running version 8 or later.

The default read implementation is based on Scrolls. When the index is large enough that a slice holds more than 10000 documents to iterate, the read operations fail. With the PIT implementation this is not a problem, since ES does not have to keep track of the whole state of the index while iterating, only the current window of the PIT iterator.
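
For readers unfamiliar with PIT paging, here is a minimal sketch of the request flow as I understand it from the Elasticsearch docs. It is not the reader code added in this PR. A PIT is opened with `POST /<index>/_pit?keep_alive=1m`, each page is fetched with `POST /_search` carrying the PIT id plus the sort values of the last hit in `search_after`, and the PIT is released with `DELETE /_pit`. The class and method names (`PitPagingSketch`, `searchBody`, `fetchPage`, `countAll`), the `1m` keep-alive, and the in-memory stand-in for the cluster are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and the in-memory "cluster" are hypothetical.
class PitPagingSketch {

  // Builds the JSON body for one page of a PIT search.
  // searchAfter is null for the first page; afterwards it holds the sort
  // values of the last hit returned by the previous page.
  static String searchBody(String pitId, int pageSize, long[] searchAfter) {
    StringBuilder body = new StringBuilder();
    body.append("{\"size\":").append(pageSize);
    body.append(",\"pit\":{\"id\":\"").append(pitId).append("\",\"keep_alive\":\"1m\"}");
    // _shard_doc is the tiebreaker sort Elasticsearch offers for PIT searches.
    body.append(",\"sort\":[{\"_shard_doc\":\"asc\"}]");
    if (searchAfter != null) {
      body.append(",\"search_after\":[");
      for (int i = 0; i < searchAfter.length; i++) {
        if (i > 0) {
          body.append(',');
        }
        body.append(searchAfter[i]);
      }
      body.append(']');
    }
    return body.append('}').toString();
  }

  // Stand-in for the cluster: returns up to pageSize sort values that are
  // strictly greater than the cursor (or the first ones when cursor is null).
  static List<Long> fetchPage(List<Long> sortedSortValues, Long cursor, int pageSize) {
    List<Long> page = new ArrayList<>();
    for (Long value : sortedSortValues) {
      if (page.size() == pageSize) {
        break;
      }
      if (cursor == null || value > cursor) {
        page.add(value);
      }
    }
    return page;
  }

  // Iterates over every hit while carrying only the last sort value
  // between requests, which is the "current window" idea described above.
  static int countAll(List<Long> sortedSortValues, int pageSize) {
    int total = 0;
    Long cursor = null;
    while (true) {
      List<Long> page = fetchPage(sortedSortValues, cursor, pageSize);
      if (page.isEmpty()) {
        return total;
      }
      total += page.size();
      cursor = page.get(page.size() - 1);
    }
  }

  public static void main(String[] args) {
    System.out.println(searchBody("example-pit-id", 1000, null));
    System.out.println(searchBody("example-pit-id", 1000, new long[] {999L}));
  }
}
```

The only state the client carries from one request to the next is the last sort value, so the page count is not capped the way a deep `from`/`size` window is.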

Adding @lord-skinner from Elastic.co for visibility.

@prodriguezdefino
Contributor Author

Run Java_ElasticSearch_IO_Direct PreCommit

@prodriguezdefino prodriguezdefino marked this pull request as ready for review April 10, 2024 06:21

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @Abacn for label java.
R: @johnjcasey for label io.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Reminder, please take a look at this pr: @Abacn @johnjcasey

@Abacn
Contributor

Abacn commented Apr 17, 2024

Sorry for the delay, taking a look.

@Abacn Abacn merged commit 9612fe1 into apache:master Apr 30, 2024
19 checks passed