Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.0] Use chunkById instead of chunk #360

Merged
merged 1 commit into from Mar 5, 2019

Conversation

@matt-allan
Copy link
Contributor

@matt-allan matt-allan commented Mar 5, 2019

Using chunkById is dramatically faster when importing millions of rows. This issue was originally reported in #182.

On a table with 1,632,576 rows, using the null scout driver so we don't have any noise from HTTP requests, an import goes from 29 minutes and 57 seconds to 28 seconds.

Before:

$ time php artisan scout:import "App\User"
All [App\User] records have been imported.
php artisan scout:import "App\User"  14.80s user 3.69s system 1% cpu 29:57.14 total

After:

$ time php artisan scout:import "App\User"
php artisan scout:import "App\User"  17.18s user 4.06s system 73% cpu 28.809 total

Concerns:

chunkById currently removes all ORDER BY clauses. If someone had called searchable like this it wouldn't work as expected anymore:

App\Order::where('price', '>', 100)->orderBy('price')->searchable();

I doubt that anyone actually needs that behavior? If they do there are a few options:

  • Make this opt in with a config setting, i.e. config('scout.chunk.by_id')
  • Make chunkById preserve order by clauses. This is explained here but it's a little confusing. I've implemented this before for a proprietary ElasticSearch indexer. It shouldn't be too difficult to add to Eloquent.
Using chunkById is dramatically faster when importing millions of rows.
@driesvints driesvints changed the title Use chunkById instead of chunk [8.0] Use chunkById instead of chunk Mar 5, 2019
@driesvints driesvints requested a review from nunomaduro Mar 5, 2019
@driesvints
Copy link
Member

@driesvints driesvints commented Mar 5, 2019

Hmm, I agree that the trade-off is definitely worth it. Wondering what @nunomaduro thinks.

@taylorotwell taylorotwell merged commit 7ec8e12 into laravel:master Mar 5, 2019
1 check passed
@matt-allan matt-allan deleted the chunk-by-id branch Mar 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants