Commit 006b3c2

[DOCS] Fix scrolling example
Closes #553
1 parent 01f9a06 commit 006b3c2

1 file changed: +24 -19 lines changed

docs/search-operations.asciidoc

Lines changed: 24 additions & 19 deletions
@@ -219,11 +219,18 @@ $results = $client->search($params);
 {zwsp} +
 
 
-=== Scan/Scroll
+=== Scrolling
 
-The Scan/Scroll functionality of Elasticsearch is similar to search, but different in many ways. This initiates a "scan window" which will remain open for the duration of the scan. This allows proper, consistent pagination.
+The Scrolling functionality of Elasticsearch is used to paginate over many documents in a bulk manner, such as exporting
+all the documents belonging to a single user. It is more efficient than regular search because it doesn't need to maintain
+an expensive priority queue ordering the documents.
 
-Once a scan window is open, you may start `_scrolling` over that window. This returns results matching your query... but returns them in random order. This random ordering is important to performance. Deep pagination is expensive when you need to maintain a sorted, consistent order across shards. By removing this obligation, Scan/Scroll can efficiently export all the data from your index.
+Scrolling works by maintaining a "point in time" snapshot of the index which is then used to page over.
+This window allows consistent paging even if there is background indexing/updating/deleting. First, you execute a search
+request with `scroll` enabled. This returns a "page" of documents, and a scroll_id which is used to continue
+paginating through the hits.
+
+More details about scrolling can be found in the https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html[Link: reference documentation].
 
 This is an example which can be used as a template for more advanced operations:
 
@@ -241,29 +248,27 @@ $params = [
     ]
 ];
 
-$docs = $client->search($params); // Execute the search
-$scroll_id = $docs['_scroll_id']; // The response will contain no results, just a _scroll_id
+// Execute the search
+// The response will contain the first batch of documents
+// and a scroll_id
+$response = $client->search($params);
 
 // Now we loop until the scroll "cursors" are exhausted
-while (\true) {
+while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
+
+    // **
+    // Do your work here, on the $response['hits']['hits'] array
+    // **
+
+    // When done, get the new scroll_id
+    // You must always refresh your _scroll_id! It can change sometimes
+    $scroll_id = $response['_scroll_id'];
 
-    // Execute a Scroll request
+    // Execute a Scroll request and repeat
     $response = $client->scroll([
             "scroll_id" => $scroll_id, //...using our previously obtained _scroll_id
             "scroll" => "30s" // and the same timeout window
         ]
     );
-
-    // Check to see if we got any search hits from the scroll
-    if (count($response['hits']['hits']) > 0) {
-        // If yes, Do Work Here
-
-        // Get new scroll_id
-        // Must always refresh your _scroll_id! It can change sometimes
-        $scroll_id = $response['_scroll_id'];
-    } else {
-        // No results, scroll cursor is empty. You've exported all the data
-        break;
-    }
 }
 ----
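
The loop introduced by this commit (process the current page, then request the next one with the latest `_scroll_id`) can also be packaged as a generator so callers iterate over hits directly. The following is only a minimal sketch under stated assumptions: `scrollHits()` and `FakeScrollClient` are hypothetical names that do not exist in elasticsearch-php, and the fake client returns canned pages so the snippet runs without a cluster. With a real client, only the `search()` and `scroll()` calls already shown in the diff are relied upon.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): wraps the scroll loop
// from the example above in a generator that yields one hit at a time.
function scrollHits($client, array $params, string $keepAlive = '30s'): Generator
{
    // The initial search returns the first page of hits plus a _scroll_id.
    // The array union keeps a 'scroll' value the caller already supplied.
    $response = $client->search($params + ['scroll' => $keepAlive]);

    while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
        foreach ($response['hits']['hits'] as $hit) {
            yield $hit;
        }

        // Always take the _scroll_id from the latest response; it can change.
        $response = $client->scroll([
            'scroll_id' => $response['_scroll_id'],
            'scroll'    => $keepAlive,
        ]);
    }
}

// Stand-in client for demonstration only: it serves canned pages instead of
// talking to Elasticsearch. A real client object would be passed in its place.
class FakeScrollClient
{
    private $pages;

    public function __construct(array $pages)
    {
        $this->pages = $pages;
    }

    public function search(array $params): array
    {
        return $this->nextPage();
    }

    public function scroll(array $params): array
    {
        return $this->nextPage();
    }

    private function nextPage(): array
    {
        // Once the canned pages run out, return an empty page to end the loop.
        $hits = array_shift($this->pages) ?? [];
        return ['_scroll_id' => 'fake-id', 'hits' => ['hits' => $hits]];
    }
}

$client = new FakeScrollClient([
    [['_id' => '1'], ['_id' => '2']],
    [['_id' => '3']],
]);

$ids = [];
foreach (scrollHits($client, ['index' => 'my_index']) as $hit) {
    $ids[] = $hit['_id'];
}

echo implode(',', $ids), PHP_EOL; // prints "1,2,3"
```

Because the generator checks the first response before issuing any scroll request, the first batch of documents is processed rather than skipped, which is the behavior this commit corrects in the documentation.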