-The Scan/Scroll functionality of Elasticsearch is similar to search, but different in many ways. This initiates a "scan window" which will remain open for the duration of the scan. This allows proper, consistent pagination.
+The Scrolling functionality of Elasticsearch is used to paginate over many documents in a bulk manner, such as exporting
+all the documents belonging to a single user. It is more efficient than regular search because it doesn't need to maintain
+an expensive priority queue ordering the documents.
 
-Once a scan window is open, you may start `_scrolling` over that window. This returns results matching your query... but returns them in random order. This random ordering is important to performance. Deep pagination is expensive when you need to maintain a sorted, consistent order across shards. By removing this obligation, Scan/Scroll can efficiently export all the data from your index.
+Scrolling works by maintaining a "point in time" snapshot of the index which is then used to page over.
+This window allows consistent paging even if there is background indexing/updating/deleting. First, you execute a search
+request with `scroll` enabled. This returns a "page" of documents, and a scroll_id which is used to continue
+paginating through the hits.
+
+More details about scrolling can be found in the https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html[Link: reference documentation].
 
 This is an example which can be used as a template for more advanced operations:
 
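The paging flow described in the added paragraphs (open a scroll, receive a first page plus a scroll_id, keep requesting pages until one comes back empty) can be tried without a cluster. The sketch below is only an illustration: `FakeScrollClient` and its `index()` helper are hypothetical in-memory stand-ins, not part of the client library. They assume nothing beyond the response shape used in this diff (`_scroll_id` and `hits.hits`), and they copy the document array to imitate the "point in time" snapshot.

```php
<?php
// Hypothetical in-memory stand-in for the real client (illustration only).
// It mimics the response shape used in the docs: `_scroll_id` plus `hits.hits`.
final class FakeScrollClient
{
    /** @var array Documents currently in the "index". */
    private $docs;
    /** @var array Open scroll contexts, keyed by scroll id. */
    private $contexts = [];
    /** @var int Counter used to mint scroll ids. */
    private $nextId = 0;

    public function __construct(array $docs)
    {
        $this->docs = $docs;
    }

    // Opens a scroll context and returns the first page of hits.
    public function search(array $params): array
    {
        $id = 'ctx-' . $this->nextId++;
        // PHP arrays are copied by value, so this acts as the snapshot.
        $this->contexts[$id] = [
            'docs'   => $this->docs,
            'offset' => 0,
            'size'   => $params['size'] ?? 10,
        ];
        return $this->nextPage($id);
    }

    // Returns the next page of an already open scroll context.
    public function scroll(array $params): array
    {
        return $this->nextPage($params['scroll_id']);
    }

    // Simulates background indexing that happens while a scroll is open.
    public function index(array $doc): void
    {
        $this->docs[] = $doc;
    }

    private function nextPage(string $id): array
    {
        if (!isset($this->contexts[$id])) {
            throw new InvalidArgumentException("Unknown scroll id: $id");
        }
        $ctx = &$this->contexts[$id];
        $page = array_slice($ctx['docs'], $ctx['offset'], $ctx['size']);
        $ctx['offset'] += count($page);
        return ['_scroll_id' => $id, 'hits' => ['hits' => $page]];
    }
}

$client = new FakeScrollClient([
    ['_id' => 1], ['_id' => 2], ['_id' => 3], ['_id' => 4], ['_id' => 5],
]);

// The first request already returns a page of documents and a scroll id.
$response = $client->search(['scroll' => '30s', 'size' => 2]);
$seen = [];

while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
    foreach ($response['hits']['hits'] as $hit) {
        $seen[] = $hit['_id'];
    }
    // A write arriving mid-scroll is not visible to the open snapshot.
    $client->index(['_id' => 99]);
    // Always pass along the most recent scroll id from the last response.
    $response = $client->scroll([
        'scroll_id' => $response['_scroll_id'],
        'scroll'    => '30s',
    ]);
}
```

After the loop, `$seen` holds the ids of the five documents that existed when the scroll was opened; the documents indexed during the loop are not returned, which is the consistency property the new text describes.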
@@ -241,29 +248,27 @@ $params = [
     ]
 ];
 
-$docs = $client->search($params);   // Execute the search
-$scroll_id = $docs['_scroll_id'];   // The response will contain no results, just a _scroll_id
+// Execute the search
+// The response will contain the first batch of documents
+// and a scroll_id
+$response = $client->search($params);
 
 // Now we loop until the scroll "cursors" are exhausted
-while (\true) {
+while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
+
+    // **
+    // Do your work here, on the $response['hits']['hits'] array
+    // **
+
+    // When done, get the new scroll_id
+    // You must always refresh your _scroll_id!  It can change sometimes