Warn or sensible error when job size is too large #649

Open
pivotal-rebecca opened this Issue Apr 17, 2014 · 7 comments

Related: #595

Jobs fail only when a worker attempts to invoke them, after the job handler turned out to be too large to serialize and insert into the queue intact.

Please warn intelligently, or produce more sensible output, when the likely cause is that the job handler is too large.

Aside from logging this error, I believe the handler column's length limit should be increased:

1. Increase the size of the handler column via `limit`:

```ruby
# Bump the handler column to LONGTEXT so larger serialized jobs fit.
LONGTEXT_LENGTH = (2**32) - 1

def up
  change_column :delayed_jobs, :handler, :text, limit: LONGTEXT_LENGTH
end
```
Owner

albus522 commented Oct 22, 2014

Unfortunately, just increasing the column size will not work. MySQL also has a maximum query packet length (max_allowed_packet) that by default is much smaller than what a LONGTEXT field technically allows. See http://dev.mysql.com/doc/refman/5.0/en/blob.html for more details.

The real problem is that MySQL quietly truncates the oversized handler instead of rejecting the insert. The real solution is to not serialize large amounts of data, since the database does not handle it efficiently, but if you absolutely need larger fields you can tweak the appropriate database settings.
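For reference, you can check the effective packet limit from a Rails console with something like the following (a rough sketch assuming the mysql2 adapter; the setting itself is changed in my.cnf or with SET GLOBAL, not through ActiveRecord):

```ruby
# Ask the MySQL server for its current max_allowed_packet, in bytes.
packet_limit = ActiveRecord::Base.connection.select_value("SELECT @@max_allowed_packet").to_i

# `job` here stands for any Delayed::Job instance you suspect is too large.
handler_size = job.handler.bytesize

if handler_size >= packet_limit
  Rails.logger.warn("Handler is #{handler_size} bytes; max_allowed_packet is #{packet_limit}")
end
```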

We've been talking about adding a length validation to fail early. At first glance, the most appropriate place seems to be delayed_job_active_record's [lib/delayed/backend/active_record.rb](https://github.com/collectiveidea/delayed_job_active_record/blob/master/lib/delayed/backend/active_record.rb). Thoughts?
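To make that concrete, here is a rough sketch of what such a validation might look like (not the gem's actual code; `MAX_HANDLER_BYTES` is a made-up constant that would have to match your column and packet limits, and this assumes the gem's Job class is already loaded):

```ruby
# Hypothetical early-failure validation for Delayed::Backend::ActiveRecord::Job.
module Delayed
  module Backend
    module ActiveRecord
      class Job
        MAX_HANDLER_BYTES = 65_535 # assumed limit for a MySQL TEXT column

        validate :handler_size_within_limit

        private

        # Reject the job at enqueue time instead of letting it silently truncate.
        def handler_size_within_limit
          return if handler.nil? || handler.bytesize <= MAX_HANDLER_BYTES

          errors.add(:handler, "is #{handler.bytesize} bytes, which exceeds the #{MAX_HANDLER_BYTES} byte limit")
        end
      end
    end
  end
end
```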

Owner

albus522 commented Oct 22, 2014

I thought about the validation as well. The problem is that if we add it, people would no longer be able to expand the field size and have things just work, and there was no reliable way to detect the field's current limit through ActiveRecord.
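For what it's worth, the closest ActiveRecord gets is the column metadata, and that is exactly the adapter-dependent part (a sketch; MySQL adapters report a byte limit for text columns, while other adapters may return nil):

```ruby
# May be 65535 (TEXT), 16777215 (MEDIUMTEXT), or 4294967295 (LONGTEXT) on MySQL,
# or nil on adapters that do not report a limit for text columns.
handler_limit = Delayed::Job.columns_hash["handler"].limit
```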

We just got bit by this last week. Does anyone know how big "too big" is, or does it vary based on some other factor?

LONGTEXT in MySQL supports up to 4 GB (2**32 - 1 bytes). As @albus522 mentions, max_allowed_packet becomes the limitation. In most cases that will limit you to either 1 MB or 16 MB, which is REALLY big for job data.

We're actively trying to get job data like this (binary image data, in our case) out of our Rails mailers, which are the only place we don't pass versioned keys as job data.
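The direction we're heading looks roughly like this (a sketch; `ImageMailer` and `Attachment` are made-up names): pass a small identifier into the job and load the binary data inside the worker.

```ruby
class ImageMailer < ActionMailer::Base
  # Only the id travels through the delayed_jobs handler; the blob is loaded
  # inside the worker when the mail is actually rendered.
  def image_email(attachment_id)
    attachment = Attachment.find(attachment_id)
    attachments[attachment.filename] = attachment.data
    mail(to: attachment.recipient_email, subject: "Your image")
  end
end

# Enqueue via delayed_job's mailer integration; the handler stays tiny.
ImageMailer.delay.image_email(attachment.id)
```

Only the id ends up in the handler column, so the serialized job stays small regardless of the image size.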

Interesting, as we've seen our workers stop cold in the face of 26 KB and 296 KB jobs, both of which fit into the MEDIUMTEXT column.
