Skip to content
This repository has been archived by the owner on Nov 4, 2021. It is now read-only.

Clean up Jobqueues, minor fixes for S3 Queue #451

Merged
merged 5 commits into from
Jun 4, 2021
Merged

Conversation

neilkakkar
Copy link
Contributor

@neilkakkar neilkakkar commented Jun 1, 2021

Changes

Ensures S3 queue consumer runs.

Makes GraphileQueue adhere to JobQueueBase.

Checklist

  • Updated Settings section in README.md, if settings are affected
  • Jest tests

@@ -62,8 +64,10 @@ export class JobQueueBase implements JobQueue {
resumeConsumer(): void
// eslint-disable-next-line @typescript-eslint/require-await
async resumeConsumer(): Promise<void> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resume is called on drain events emitted by Piscina, which results in multiple calls to sync state. This is okay, as long as dequeue code is idempotent (which it isn't, thus leading to race conditions).

We didn't face this with graphile because the runner object is unchanged over multiple sync state calls.

@@ -77,8 +81,8 @@ export class JobQueueBase implements JobQueue {
clearTimeout(this.timeout)
}
// eslint-disable-next-line @typescript-eslint/await-thenable
const hadSomething = await this.readState()
this.timeout = setTimeout(() => this.syncState(), hadSomething ? 0 : this.intervalSeconds * 1000)
await this.readState()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given existing semantics, readState is supposed to finish everything required with the jobs seen so far. So, it doesn't make sense to poll again right after, if it found some jobs to process.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is that if you have 10k jobs to run, you shouldn't wait 5 sec between them (or between every 100 of them) :). So if we got a job with the last run, immediately see if there's another one. Otherwise wait a few seconds and then try again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense... assuming there's batching involved. I haven't seen it so far (am I missing something?) - so didn't assume such a state was possible.

Will revert.

@@ -284,5 +308,44 @@ describe('job queues', () => {
Key: `prefix/2020-01-01/20200101-123456.123Z-deadbeef.json.gz`,
})
})

test('polls for new jobs', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests would fail after 60s (jest timeout) if there's no polling.

Copy link
Collaborator

@mariusandra mariusandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some thoughts below, still haven't ran it locally

@@ -77,8 +81,8 @@ export class JobQueueBase implements JobQueue {
clearTimeout(this.timeout)
}
// eslint-disable-next-line @typescript-eslint/await-thenable
const hadSomething = await this.readState()
this.timeout = setTimeout(() => this.syncState(), hadSomething ? 0 : this.intervalSeconds * 1000)
await this.readState()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is that if you have 10k jobs to run, you shouldn't wait 5 sec between them (or between every 100 of them) :). So if we got a job with the last run, immediately see if there's another one. Otherwise wait a few seconds and then try again.

tests/jobs.test.ts Outdated Show resolved Hide resolved
tests/jobs.test.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@mariusandra mariusandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make sense

@neilkakkar neilkakkar merged commit b499825 into master Jun 4, 2021
@neilkakkar neilkakkar deleted the jobqueues branch June 4, 2021 09:49
fuziontech pushed a commit to PostHog/posthog that referenced this pull request Oct 12, 2021
…/plugin-server#451)

* fix s3 consumer, cleanup graphile queue

* tests

* address comments
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants