Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queue performance question/documentation update? #27148

Closed
owenconti opened this issue Jan 12, 2019 · 8 comments · Fixed by #30802
Closed

Queue performance question/documentation update? #27148

owenconti opened this issue Jan 12, 2019 · 8 comments · Fixed by #30802

Comments

@owenconti
Copy link
Contributor

  • Laravel Version: 5.7.16
  • PHP Version: 7.1
  • Database Driver & Version: MySQL 8

Description:

I noticed my queue was taking a long time to process a specific type of job. While debugging the issue, I eventually got to the point where the handle() method of my job would return right away, without doing anything. Even when returning right away, the job was taking ~10 seconds to run.

Upon digging into the internals, it seems like the entire model may be stored in Redis, instead of just the identifier of the model, as documented here:

If your queued job accepts an Eloquent model in its constructor, only the identifier for the model will be serialized onto the queue. When the job is actually handled, the queue system will automatically re-retrieve the full model instance from the database. It's all totally transparent to your application and prevents issues that can arise from serializing full Eloquent model instances. (https://laravel.com/docs/5.7/queues#creating-jobs)

I have some screenshots to help demonstrate the issue.

First, the Redis value stored for a job. Note you can see data is being stored about the relations on the model (this may be normal/intended, I'm not sure):

screenshot from 2019-01-12 11-17-09

Second, a screenshot showing the processing time when the job is passed an Eloquent model. The first and fourth lines are for the same job. Note the 7 second time to run:

screenshot from 2019-01-12 13-31-18

Third, a screenshot showing the processing time when the job is passed a primitive:

screenshot from 2019-01-12 13-32-04

Steps To Reproduce:

Dispatch a job with a lot of nested relations. Compare the processing time against the same job, but passed the ID of the model instead.

Note

This does not cause a performance issue with all models. In our case, it tends to only come up with models that have a lot of nested relations.

If everything seems normal or is working as expected, can someone recommended what we can do to improve performance of the queue, other than passing primitives instead of models?

@mfn
Copy link
Contributor

mfn commented Jan 12, 2019

Sorry if this might be a slightly off-topic comment.

I never directly passed models, always only primary keys and used <Model>::findOrFail in the job.

First time I saw the serialization result of this for a job, I immediately had a bad feeling about this. It simply didn't feel right.

This however requires that your models state is persisted already in the DB.

@sisve
Copy link
Contributor

sisve commented Jan 14, 2019

Are you using the SerializesModels trait?

@driesvints
Copy link
Member

Can you please post some code which recreates the problem?

@owenconti
Copy link
Contributor Author

owenconti commented Jan 14, 2019

@driesvints I have an example repo here: https://github.com/owenconti/laravel-queues-performance-repo

After more research, I think this functionality is intended, it just isn't documented as such. Here's what I think is happening:

  • When Laravel is passed a Model to dispatch, it determines which of the Model's relationships are already loaded, and serializes that into the queue, along with the ID of the Model
  • Then, when processing the Job, it pulls out the ID of the Model and which relationships were loaded when the Model was serialized
  • It then loads the Model from the database, and also eagerly loads whatever relationships were already loaded

Thus, in my original example, I had a ton of relationships already loaded, so when processing the job, it took a few seconds to load all the data out of the database, even though I wasn't accessing any of it.

I'm not sure whether that is intended functionality (I could argue on either side for/against it), but if it is intended, I think this section of the documentation should mention it:

If your queued job accepts an Eloquent model in its constructor, only the identifier for the model will be serialized onto the queue. When the job is actually handled, the queue system will automatically re-retrieve the full model instance from the database. It's all totally transparent to your application and prevents issues that can arise from serializing full Eloquent model instances.

@owenconti owenconti changed the title Redis Queue appears to be serializing entire Eloquent model instead of just the ID Queue performance question/documentation update? Jan 15, 2019
@derekmd
Copy link
Contributor

derekmd commented Feb 3, 2019

The "If your queued job..." docs mentioned in the root post would have been written before this change was made in Laravel 5.6: #21229

@mfn I never directly passed models, always only primary keys and used ::findOrFail in the job

You can still use the SerializesModels trait and avoid excessive nested relationship reloads by disabling the queued relationships feature as described here (to fix a polymorphic reload bug): #26126 (comment)

The framework change was intended to allow broadcast event payloads (to Pusher or the Laravel WebSockets package) to contain nested relationship data since beforehand SerializesModels would only fetch the base model's attributes.

However it can turn queued jobs, listeners, and mailables into greedy database query resource hogs. The queue worker can run a dozen queries that reload hundreds (or thousands) of models unrelated to the job being run, all because pre-dispatch those relations were hydrated in the serialized model.

Unless you're using event broadcasting, I suggest overriding Model@getQueueableRelations() to return nothing and instead call $model->load() in YourJob@handle() to avoid N+1 queries.

@mfn
Copy link
Contributor

mfn commented Feb 3, 2019

You can still use the SerializesModels trait and avoid excessive nested relationship reloads by disabling the queued relationships feature as described here (to fix a polymorphic reload bug): #26126 (comment)

Thanks, that's quite interesting and I didn't know about that!

But thinking again, this is a potential pitfall: if you don't foresee this and change the model in advance, you can be exposed to this problem.

I guess I'm erring on the safe side, keeping Job payloads small and just resolve data within the jobs. I prefer it being more explicit.

Oh, and feels wrong to me that the model would have to have knowledge about this…

@driesvints
Copy link
Member

I sent in a PR to easily unset relations for a queue job here: #30802

This way they won't be serialized. I'll also make sure to document this in the queue docs once it's merged.

@driesvints
Copy link
Member

Documented here: laravel/docs#5655

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants