PostgreSQL notification payload limit should not restrict job payload size #42

Closed
mrputty opened this issue Sep 18, 2020 · 3 comments · Fixed by #43
mrputty commented Sep 18, 2020

I have a use case for which Odd Jobs is very appealing, but it requires the ability to handle somewhat larger job payloads. Right now, creating a job with a large payload fails in the database with this error:

ERROR:  payload string too long
CONTEXT:  SQL statement "SELECT pg_notify('job_created_jobs', row_to_json(new)::text)"
PL/pgSQL function notify_job_monitor_for_jobs() line 2 at PERFORM

Out of the box, PostgreSQL limits the size of the notification payload to 8000 bytes (see https://www.postgresql.org/docs/12/sql-notify.html). Because the trigger created by createNotificationTrigger passes the full job table row (serialized to JSON) to pg_notify, the maximum size of the job payload is significantly less than 8000 bytes.

Fortunately, the de-serialization of the notification payload in jobEventListener only cares about the id, run_at, and locked_at columns. It seems that it should be easy to accommodate larger job payloads by changing the trigger to only pass these values, suitably JSON-serialized, to pg_notify, instead of the entire row.

Where the trigger function currently has row_to_json(new)::text, simply substituting json_build_object('id', new.id, 'run_at', new.run_at, 'locked_at', new.locked_at)::text seems to do the trick.
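To make this concrete, here is a rough sketch of what such a trigger function could look like. The function and channel names are taken from the error message above; the table and trigger names are placeholders, and the SQL that createNotificationTrigger actually generates may differ:

-- Sketch only; names other than the function and channel are illustrative.
CREATE OR REPLACE FUNCTION notify_job_monitor_for_jobs() RETURNS trigger AS $$
BEGIN
  -- Notify with only the columns that jobEventListener reads, so the
  -- notification stays well under the 8000-byte limit regardless of the
  -- size of the job payload itself.
  PERFORM pg_notify(
    'job_created_jobs',
    json_build_object('id', new.id, 'run_at', new.run_at, 'locked_at', new.locked_at)::text
  );
  RETURN new;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER notify_job_monitor_for_jobs_trigger
  AFTER INSERT ON jobs
  FOR EACH ROW EXECUTE PROCEDURE notify_job_monitor_for_jobs();

(EXECUTE PROCEDURE is used here for compatibility with older PostgreSQL versions; on 11 and later, EXECUTE FUNCTION is equivalent.)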

Is there a problem that I'm missing with this approach?

@saurabhnanda
Owner

Thank you for the PR, @mrputty. It seems absolutely fine. In fact, we have run into this very issue ourselves in another place (unrelated to odd-jobs) where we are using LISTEN/NOTIFY with larger payloads.

Just curious, what kind of large payloads are you putting in the job queue? Usually, we end up putting the PK of another row in the DB as the job payload (where the target row may be very large, not the job payload itself).

Nevertheless, this optimization is good to have. I will merge it after compiling it locally once.

mrputty commented Sep 18, 2020

Great, glad to hear this will be merged.

My use case involves using odd-jobs to orchestrate the retrieval of individual items from a large data set. Within each item there are different subsets of data that may or may not need to be retrieved. The job payload consists primarily of a list of item ids, tagged with the subsets that need to be retrieved. Some of these requests may include thousands of items, so we would quickly run into the 8000-byte limit, but we don't expect the payloads to ever be terribly large.

Because the set of items to be retrieved does not have any meaning outside of the request, it seems to me that maintaining them in a secondary table adds unnecessary overhead and is potentially difficult. For example, managing the lifetime of rows in the secondary table seems complicated. If one of these jobs fails, we would have to remember to delete the corresponding row (whose id would be embedded in the JSON job payload) when cleaning up the failed job records.

As I thought through managing this, I realized that there's also no way for me to enforce guarantees about record lifetimes even during normal processing. For example, if I were to delete the secondary record at the end of job execution, I have no way to guarantee that some other exception doesn't occur after the delete but before odd-jobs actually finalizes the job. Since neither the job runner nor any onJobStart callback has access to the underlying database connection, this cannot be done in a transaction.

Thinking about that was enough to encourage me to look for a simpler solution, which led me to how pg_notify was being used.

@saurabhnanda
Owner

@mrputty would you like to close this and open a fresh MR, if you care about Hacktoberfest? Otherwise I'll merge this one.
