Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.0] Use chunkById instead of chunk #360

Merged
merged 1 commit into from Mar 5, 2019

Conversation

Projects
None yet
3 participants
@matt-allan
Copy link

matt-allan commented Mar 5, 2019

Using chunkById is dramatically faster when importing millions of rows. This issue was originally reported in #182.

On a table with 1,632,576 rows, using the null scout driver so we don't have any noise from HTTP requests, an import goes from 29 minutes and 57 seconds to 28 seconds.

Before:

$ time php artisan scout:import "App\User"
All [App\User] records have been imported.
php artisan scout:import "App\User"  14.80s user 3.69s system 1% cpu 29:57.14 total

After:

$ time php artisan scout:import "App\User"
php artisan scout:import "App\User"  17.18s user 4.06s system 73% cpu 28.809 total

Concerns:

chunkById currently removes all ORDER BY clauses. If someone had called searchable like this it wouldn't work as expected anymore:

App\Order::where('price', '>', 100)->orderBy('price')->searchable();

I doubt that anyone actually needs that behavior? If they do there are a few options:

  • Make this opt in with a config setting, i.e. config('scout.chunk.by_id')
  • Make chunkById preserve order by clauses. This is explained here but it's a little confusing. I've implemented this before for a proprietary ElasticSearch indexer. It shouldn't be too difficult to add to Eloquent.
Use chunkById instead of chunk in searchable & unsearchable scopes
Using chunkById is dramatically faster when importing millions of rows.

@driesvints driesvints changed the title Use chunkById instead of chunk [8.0] Use chunkById instead of chunk Mar 5, 2019

@driesvints driesvints requested a review from nunomaduro Mar 5, 2019

@driesvints

This comment has been minimized.

Copy link
Member

driesvints commented Mar 5, 2019

Hmm, I agree that the trade-off is definitely worth it. Wondering what @nunomaduro thinks.

@taylorotwell taylorotwell merged commit 7ec8e12 into laravel:master Mar 5, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@matt-allan matt-allan deleted the matt-allan:chunk-by-id branch Mar 5, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.