[14.0] Support multi-nodes with lock on jobrunner #15

Merged
merged 4 commits into from
May 17, 2021

PieterPaulussen

Forward port of OCA#256
Copy of OCA#346

guewen and others added 4 commits May 14, 2021 15:15
Starting several odoo (main) processes with "--load=web,queue_job"
was unsupported, as it would start several job runners, which would all
listen to PostgreSQL notifications and try to enqueue jobs in concurrent
workers.

This is an issue in several cases:

* it causes issues on odoo.sh, which uses a hybrid model for workers
and starts several job runners [0]
* it defeats any setup that would use several nodes to keep the service
available in case of failure of a node/host

The solution implemented here uses a PostgreSQL advisory lock, held
at session level in a connection on the "postgres" database, which
ensures that two job runners are not working on the same set of databases.

At loading, the job runner tries to acquire the lock. If it can, it
initializes the connection and listens for jobs. If the lock is taken
by another job runner, it waits and retries to acquire it every 30
seconds.
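
The acquire-and-retry behaviour described above could be sketched as
follows. This is only an illustration, not the code from this PR:
``wait_for_lock`` and its parameters are hypothetical names, and the
lock attempt is passed in as a callable so the loop can be run without
a database. Against PostgreSQL, that callable might execute
``SELECT pg_try_advisory_lock(%s)`` on a long-lived connection and
return the boolean result.

```python
import time
from typing import Callable

# Retry interval taken from the description above (30 seconds).
RETRY_INTERVAL = 30


def wait_for_lock(
    try_lock: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = RETRY_INTERVAL,
    log: Callable[[str], None] = print,
) -> int:
    """Block until try_lock() succeeds; return the number of attempts.

    try_lock is a hypothetical callable that returns True once the
    session-level advisory lock has been obtained.
    """
    attempts = 1
    while not try_lock():
        # Another runner holds the lock: report it and wait before retrying.
        log("already started on another node")
        sleep(interval)
        attempts += 1
    return attempts
```

Because a session-level advisory lock is released when its connection
closes, a waiting runner would get the lock on one of its next attempts
after the holding node goes away.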

Example when a job runner is started and another one starts:

    INFO ? odoo.addons.queue_job.jobrunner.runner: starting
    INFO ? odoo.addons.queue_job.jobrunner.runner: already started on another node

The shared lock identifier is computed based on the set of databases
the job runner has to listen to: if a job runner is started with
``--database=queue1`` and another with ``--database=queue2``, they will
have different locks and thus will be able to work in parallel.
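
One way to derive such an identifier, shown here as a sketch under
assumptions rather than as the code of this PR: hash the sorted database
names and keep 8 bytes, since PostgreSQL advisory locks take a signed
64-bit key. The function name ``lock_key``, the choice of SHA-256 and
the comma separator are all assumptions made for illustration.

```python
import hashlib
import struct
from typing import Iterable


def lock_key(db_names: Iterable[str]) -> int:
    """Map a set of database names to a signed 64-bit lock key."""
    # Sort and de-duplicate so the key does not depend on argument order.
    # Note: a comma inside a database name would make this join ambiguous.
    canonical = ",".join(sorted(set(db_names)))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    # Keep the first 8 bytes and read them as a big-endian signed integer.
    (key,) = struct.unpack(">q", digest[:8])
    return key
```

With this scheme, two runners started on the same databases compute the
same key and compete for one lock, while runners on different sets of
databases get different keys, barring a hash collision.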

Important: new databases need a restart of the job runner. This was
already the case; lifting that limitation would be a great improvement,
but it is out of scope for this change.

[0] OCA#169 (comment)
@StefanRijnhart StefanRijnhart self-requested a review May 17, 2021 11:43

@StefanRijnhart StefanRijnhart left a comment

Reapproving because this PR does not suffer from the missing PR as the PR onto OCA/queue does.

@StefanRijnhart StefanRijnhart merged commit a47553f into 14.0-nova May 17, 2021