Move to a single worker model #42

Merged
joshk merged 26 commits into master from jk_single_worker_model
Mar 23, 2015

@joshk joshk commented Mar 1, 2015

This change replaces the Dispatcher -> Worker model with Workers that each hold multiple subscriptions to the builds queue, relying on SERIALIZABLE transactions and retries to keep updates to Job and Build information safe.

This works similarly to what we do with travis-logs: each process can run one or more subscribers, with the current default being 2.

This allows us to scale up hub processing on the fly, either by increasing the number of subscribers per process or by running more processes.
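
To make the subscriber model concrete, here is a minimal sketch of one process running several subscribers against a shared queue. It is an illustration only: the in-process `Queue` stands in for the real builds queue, and `run_subscribers`, the `:stop` sentinel, and the message shape are hypothetical names, not hub's actual API.

```ruby
require "thread"

# Hypothetical helper: starts `count` subscriber threads that all consume
# from the same queue. The in-process Queue is a stand-in for the real
# builds queue.
def run_subscribers(queue, count: 2, &handler)
  Array.new(count) do
    Thread.new do
      # Each subscriber handles messages until it receives the :stop sentinel.
      while (message = queue.pop) != :stop
        handler.call(message)
      end
    end
  end
end

queue   = Queue.new
handled = Queue.new # thread-safe record of processed messages

subscribers = run_subscribers(queue, count: 2) { |message| handled << message }

10.times { |i| queue << { job_id: i } }
2.times { queue << :stop } # one sentinel per subscriber
subscribers.each(&:join)

puts "handled #{handled.size} messages"
```

Raising `count` corresponds to adding subscribers within a process; starting the same script several times corresponds to adding processes.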

I have tested this on staging and things look good, but we might want to run some load through it to make sure the transactions are doing their job.

wrap the job update service call with a SERIALIZABLE transaction
this allows each process to run multiple processors
the transaction makes sure that job event propagation isn’t a fight to the finish

the retry in the event handling code makes sure that if two jobs are processed at the same time, the one that fails is tried again
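
The retry behaviour described in this commit can be sketched roughly as follows. This is not the code from the PR: `SerializationFailure` and `with_retries` are hypothetical names, and the error class stands in for whatever exception the database adapter raises when a SERIALIZABLE transaction loses a conflict (Postgres reports this as SQLSTATE 40001).

```ruby
# Hypothetical error class standing in for the adapter's
# serialization-failure exception.
class SerializationFailure < StandardError; end

# Hypothetical helper: runs the block and re-runs it when it fails with a
# serialization failure, up to max_attempts times in total.
def with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue SerializationFailure
    retry if attempts < max_attempts
    raise # give up and surface the error to the caller
  end
end

calls = 0
result = with_retries(max_attempts: 3) do |attempt|
  calls += 1
  # Simulate losing the race to a concurrent update on the first two attempts.
  raise SerializationFailure, "could not serialize access" if attempt < 3
  :updated
end

puts "result=#{result} after #{calls} attempts"
```

In real use the block would open the SERIALIZABLE transaction and call the job update service, so that a failed attempt starts again from a fresh transaction.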

joshk commented Mar 1, 2015

I have already noticed a few jobs whose state and information are not saved and propagated correctly.

joshk commented Mar 1, 2015

After removing the query cache, things seem to be behaving correctly. I ran a bit of load through this and no longer see any jobs or builds whose state info isn't updated correctly.

joshk commented Mar 2, 2015

The transaction code could probably be moved to the service. I'll be happy to do this once this PR is given the thumbs up.

joshk commented Mar 2, 2015

And still not behaving correctly 😕

use a branch for testing a change in how event propagation works
@joshk joshk changed the title Move to a single worker model Move to a single worker model - NOT READY Mar 2, 2015

joshk commented Mar 3, 2015

I'm pretty sure I have this working correctly!

I updated the UpdateJob service in core to use a SERIALIZABLE transaction, and a Postgres Advisory Lock around the build id.

The Advisory Lock works like an application mutex based on the lock id.
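
As a rough illustration of the mutex idea, the sketch below wraps a block in Postgres' session-level `pg_advisory_lock` / `pg_advisory_unlock` functions, keyed on a build id. It is not the code from core: `with_advisory_lock` and `RecordingConnection` are hypothetical, the build id is an example value, and whether core uses session-level or transaction-level advisory locks is not stated here.

```ruby
# Hypothetical helper, not the code from core. `conn` is any object that
# responds to exec_params(sql, params), such as a PG::Connection.
def with_advisory_lock(conn, lock_id)
  # Blocks until no other session holds the advisory lock for this id.
  conn.exec_params("SELECT pg_advisory_lock($1::bigint)", [lock_id])
  begin
    yield
  ensure
    # Session-level advisory locks must be released explicitly.
    conn.exec_params("SELECT pg_advisory_unlock($1::bigint)", [lock_id])
  end
end

# Recording stand-in so the sketch runs without a database.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def exec_params(sql, params)
    @statements << [sql, params]
  end
end

conn = RecordingConnection.new
build_id = 42 # assumed example id
with_advisory_lock(conn, build_id) do
  # update the job and its build here
end
conn.statements.each { |sql, params| puts "#{sql} #{params.inspect}" }
```

Because the lock is keyed on the build id, updates to jobs of the same build are serialized, while jobs belonging to different builds do not wait on each other.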

The Advisory Lock code is in core, but it might be worth moving to support.

Before we can use this (if we decide to use this) I need to merge my core changes into master.

@joshk joshk changed the title Move to a single worker model - NOT READY Move to a single worker model Mar 3, 2015

joshk commented Mar 3, 2015

This is now ready for review.

Conflicts:
	Gemfile.lock
	bin/hub
	lib/travis/hub/dispatcher.rb
	lib/travis/hub/solo.rb
	lib/travis/hub/worker.rb
joshk added a commit that referenced this pull request Mar 23, 2015
@joshk joshk merged commit 4ebefa8 into master Mar 23, 2015
@joshk joshk deleted the jk_single_worker_model branch November 27, 2015 15:20