[8.0] job runner #52

Merged

merged 48 commits into OCA:8.0 from acsone:8.0-jobrunner-sbi on May 15, 2015

5 participants
@sbidoul
Member

sbidoul commented Mar 31, 2015

This is an alternative to connector workers that

  • avoids polling (it relies on PostgreSQL NOTIFY/LISTEN; a minimal sketch follows the TODO list below),
  • is easier to deploy,
  • implements job channels (a variation on #43).

For more information on how to use it, see the docstring in runner.py.
For more information on job channels, see the docstring of the Channel class in channels.py.

It is safe to try in real deployments because it is fully compatible and it is easy to switch between the new runner and workers.

TODO:

  • can we run the doctests from travis/runbot?
  • there are a couple of TODOs left in the code, but in my opinion nothing prevents real-life usage
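For readers who have not opened runner.py yet, here is a minimal, self-contained sketch of the PostgreSQL LISTEN/NOTIFY pattern the runner builds on (see the NOTIFY/LISTEN commit message further down this page). The database name, channel name and timeout are placeholders; this is not the runner's actual code.

    # Minimal LISTEN/NOTIFY sketch (illustration only, not the runner's code).
    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect(dbname='mydb')  # placeholder database name
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cr = conn.cursor()
    cr.execute("LISTEN demo_channel")

    while True:
        # Sleep until PostgreSQL notifies us (or the timeout expires): no busy polling.
        if select.select([conn], [], [], 60.0) == ([], [], []):
            continue  # timeout, loop again
        conn.poll()
        while conn.notifies:
            notification = conn.notifies.pop()
            print(notification.channel, notification.payload)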
connector/jobrunner/runner.py
+
+    def set_job_enqueued(self, uuid):
+        with closing(self.conn.cursor()) as cr:
+            cr.execute("UPDATE queue_job SET state=%s, date_enqueued=NOW() "


@lmignon

lmignon Apr 3, 2015

Contributor

the date_enqueued must be truncated to 'seconds'

date_enqueued=date_trunc('seconds', now()::timestamp)

otherwise Odoo will not be able to read it and will raise ValueError

  File "/home/lmi/projects/openerp/openerp-deldrive-buildout/addons-connector/connector/queue/model.py", line 113, in _change_job_state
    job = storage.load(job.uuid)
  File "/home/lmi/projects/openerp/openerp-deldrive-buildout/addons-connector/connector/queue/job.py", line 265, in load
    stored.date_enqueued, DEFAULT_SERVER_DATETIME_FORMAT)
  File "/usr/lib/python2.7/_strptime.py", line 328, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains: .024856
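For reference, a sketch of the corrected statement (the 'enqueued' state value is an assumption for illustration; a later comment in this thread further refines the expression to "now() at time zone 'utc'"):

    # Sketch of the corrected UPDATE; 'enqueued' is the assumed target state and
    # uuid is the method argument from the quoted diff above.
    cr.execute("UPDATE queue_job "
               "SET state=%s, "
               "date_enqueued=date_trunc('seconds', now()::timestamp) "
               "WHERE uuid=%s",
               ('enqueued', uuid))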

@lmignon lmignon referenced this pull request Apr 3, 2015

Merged

7.0 jobrunner lmi #53

@coveralls

coveralls Apr 3, 2015

Coverage decreased (-14.76%) to 65.31% when pulling 4f493b6 on acsone:8.0-jobrunner-sbi into 6686fd3 on OCA:8.0.

connector/jobrunner/channels.py
+_logger = logging.getLogger(__name__)
+
+
+class PriorityQueue:


@guewen

guewen Apr 7, 2015

Member

Can you change the classes to new-style classes?


+        self.db_name = db_name
+        self.channel = channel
+        self.uuid = uuid
+        self.seq = seq


@guewen

guewen Apr 7, 2015

Member

What is seq?


@sbidoul

sbidoul Apr 12, 2015

Member

It's my idea to implement guaranteed sequence on job execution. date_created is not sufficient, and priority is another concept.

It is not fully implemented yet, because jobs that are retried get an eta and this does not work well with sequentiality... needs further thinking.


+ o2 = heappop(self._heap)
+ assert o2 == o
+ self._removed.remove(o)
+ self._known.remove(o)


@guewen

guewen Apr 7, 2015

Member

In that branch is it expected to return None?


@sbidoul

sbidoul Apr 12, 2015

Member

In that branch we have a removed Item on top of the heap, so we pop it and try again until a non-removed item is found or the heap is empty (in which case we raise IndexError).


@guewen

guewen Apr 20, 2015

Member

Yes of course, it is enclosed in a while True... Don't know how I could not notice that ;-)

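For readers following the exchange above, here is a toy illustration of the lazy-removal technique being discussed: removed items stay in the heap and are skipped when they reach the top. It is not the actual PriorityQueue from channels.py, just a self-contained sketch of the idea.

    # Toy lazy-removal priority queue (illustration, not the channels.py class).
    from heapq import heappush, heappop

    class LazyRemovalQueue(object):
        def __init__(self):
            self._heap = []
            self._known = set()    # items currently stored in the heap
            self._removed = set()  # items marked removed, still physically in the heap

        def add(self, item):
            if item in self._removed:
                self._removed.remove(item)  # re-adding a removed item just unmarks it
            elif item not in self._known:
                self._known.add(item)
                heappush(self._heap, item)

        def remove(self, item):
            if item in self._known:
                self._removed.add(item)

        def pop(self):
            while True:
                item = heappop(self._heap)   # raises IndexError when the heap is empty
                self._known.discard(item)
                if item in self._removed:
                    self._removed.discard(item)
                    continue                 # skip lazily removed items and try again
                return item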

connector/jobrunner/channels.py
+ """A channel for jobs, with a maximum number of workers.
+
+ Job channels are joined in a hierarchy down to the root channel.
+ When a job channel has free workers, jobs are dequeued, marked


@guewen

guewen Apr 7, 2015

Member

I find the term workers confusing here, as this word is used with a different meaning in Odoo (we could argue that the workers are how many Odoo workers we can speak to at the same time, but that would be a bit far-fetched imo).
Would capacity be better? Other ideas? If there's no better idea I can live with workers ;-)


@sbidoul

sbidoul Apr 12, 2015

Member

I was thinking the same. Capacity is fine. Adapted.

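To make the capacity idea concrete, the dispatch rule boils down to something like the sketch below. This is a simplified paraphrase, not the actual Channel code; pop_ready_job is a hypothetical helper name.

    # Rough paraphrase of the capacity rule: a channel hands out jobs only while
    # it has spare capacity; a capacity of None (or 0) means "unlimited".
    def get_jobs_to_run(channel):
        while not channel.capacity or len(channel.running) < channel.capacity:
            job = channel.queue.pop_ready_job()   # hypothetical helper
            if job is None:
                break                             # nothing ready right now
            channel.running.add(job)
            yield job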

connector/jobrunner/channels.py
+        if self.parent:
+            self.parent.children.append(self)
+        self.children = []
+        self.workers = None


@guewen

guewen Apr 7, 2015

Member

self.workers = workers I guess


@sbidoul

sbidoul Apr 12, 2015

Member

done

connector/jobrunner/channels.py
+            self.parent.children.append(self)
+        self.children = []
+        self.workers = None
+        self.sequential = False


@guewen

guewen Apr 7, 2015

Member

self.sequential = sequential

Or maybe remove the arguments since they are set afterwards in configure().


@sbidoul

sbidoul Apr 12, 2015

Member

done

connector/jobrunner/channels.py
+ Using this technique, it is possible to enforce sequence in a channel
+ with 1 worker. It is also possible to dedicate a channel with a
+ limited number of workers for application-autocreated subchannels
+ without risking to overflow the system.


@guewen

guewen Apr 7, 2015

Member

👍 great doc


@sbidoul

sbidoul Apr 12, 2015

Member

Thanks! I updated to talk about capacity instead of workers.


connector/jobrunner/channels.py
+    def get_subchannel_by_name(self, subchannel_name):
+        for child in self.children:
+            if child.name == subchannel_name:
+                return child


@guewen

guewen Apr 7, 2015

Member

Perhaps self.children could be a dict to avoid this loop.


@sbidoul

sbidoul Apr 12, 2015

Member

okay, done.

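For illustration, the suggested change amounts to something like this heavily simplified sketch (not the actual Channel class):

    # Sketch: store children in a dict keyed by name so lookups avoid the loop.
    class Channel(object):
        def __init__(self, name, parent=None):
            self.name = name
            self.parent = parent
            self.children = {}          # subchannel name -> Channel
            if parent is not None:
                parent.children[name] = self

        def get_subchannel_by_name(self, subchannel_name):
            # dict lookup instead of iterating over a list
            return self.children.get(subchannel_name)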

+        for db_name in self.get_db_names():
+            db = Database(db_name)
+            if not db.has_connector:
+                _logger.debug('connector is not installed for db %s', db_name)


@guewen

guewen Apr 7, 2015

Member

When we install the connector addon on a new db, do we need to restart Odoo to have it considered by the runner? (and the reverse when we uninstall it)


@sbidoul

sbidoul Apr 12, 2015

Member

Currently the server needs to be restarted indeed. I added a TODO about this.

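For context, the has_connector check presumably amounts to something like the query below (an assumption based on Odoo's standard ir_module_module table, not code quoted from this PR); re-running it periodically instead of only at startup is what the added TODO is about.

    # Assumed shape of the has_connector check (not quoted from the PR).
    def has_connector(cr):
        cr.execute("SELECT 1 FROM ir_module_module "
                   "WHERE name=%s AND state=%s",
                   ('connector', 'installed'))
        return cr.fetchone() is not None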

+ job.set_failed(exc_info=buff.getvalue())
+ with session_hdl.session() as session:
+ self.job_storage_class(session).store(job)
+ raise


@guewen

guewen Apr 7, 2015

Member

I didn't compare, but from memory the code seems quite similar to the one in https://github.com/OCA/connector/blob/8.0/connector/queue/worker.py#L71. Is it something we can factorize? Maybe it should be moved into the Job class?


@sbidoul

sbidoul Apr 12, 2015

Member

This is a shameless copy/paste indeed. Except I removed some worker-specific logic (related to worker_uuid and eta). This could be factorized and moved to runjob indeed.


@guewen

Member

guewen commented Apr 7, 2015

As said before, this is brilliant work. I added some suggestions and questions.
I want to propose a new branch with the channel models / jobs on top of your branch, but I fear I won't have the time before the 20th of April (holidays from the 10th to the 19th).
We are testing the 7.0 branch since our customer with the largest number of jobs is on 7.0.

@guewen

Member

guewen commented Apr 7, 2015

can we run the doctests from travis/runbot?

Here is a PR: acsone#1

Tests will be executed when --test-enable is used.

@guewen

Member

guewen commented Apr 7, 2015

BTW could you rebase your branch?

@guewen

Member

guewen commented Apr 7, 2015

Started the implementation of the channels configuration: acsone#2

guewen added a commit to guewen/odoo-connector-magento-buildout that referenced this pull request Apr 9, 2015

@coveralls

coveralls Apr 12, 2015

Coverage decreased (-8.35%) to 71.72% when pulling a900aec on acsone:8.0-jobrunner-sbi into 6686fd3 on OCA:8.0.

@sbidoul

Member

sbidoul commented Apr 12, 2015

I rebased. Thanks for the review @guewen.

@coveralls

coveralls Apr 12, 2015

Coverage decreased (-8.46%) to 71.61% when pulling 8f150ff on acsone:8.0-jobrunner-sbi into 6686fd3 on OCA:8.0.

@sbidoul sbidoul referenced this pull request in acsone/connector Apr 12, 2015

Closed

Jobrunner configurable channels #2

connector/jobrunner/channels.py
+ channel_name, uuid)
+ channel = self._root_channel
+ job = ChannelJob(db_name, channel, uuid,
+ seq, date_created, priority, eta)


@lmignon

lmignon Apr 21, 2015

Contributor

Be careful. In Odoo, python date, datetime and time values are returned as strings by psycopg2 (see https://github.com/odoo/odoo/blob/8.0/openerp/sql_db.py#L61 and http://initd.org/psycopg/docs/extensions.html#psycopg2.extensions.new_type). If you assign a string as eta to your ChannelJob, you'll get a comparison error between str and datetime when popping your job at line #L418

    while not self.capacity or len(self._running) < self.capacity:
            job = self._queue.pop(now=datetime.now())

due to #L275

    def pop(self, now):
        if len(self._eta_queue) and self._eta_queue[0].eta <= now:

@sbidoul

sbidoul Apr 21, 2015

Member

@lmignon thanks. I fixed that.

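The fix that eventually landed (see the "[FIX] connector-runner: use string type for datetime" and "refactor handling of 'now'" commits near the end of this page) keeps dates as strings rather than parsing them: since Odoo's '%Y-%m-%d %H:%M:%S' strings sort lexicographically in chronological order, comparing eta against a string "now" keeps working. A rough sketch of the idea, not the actual runner code:

    # Rough sketch: compare eta/now as Odoo-formatted strings.
    from datetime import datetime

    DEFAULT_SERVER_DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'  # Odoo's server format

    def now_str():
        return datetime.utcnow().strftime(DEFAULT_SERVER_DATETIME_FORMAT)

    def is_ready(job):
        # job.eta comes straight from PostgreSQL as a string in the same format,
        # so plain string comparison is chronologically correct.
        return not job.eta or job.eta <= now_str()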

+ continue
+ func_name = '%s.%s' % (func.__module__, func.__name__)
+ if not self.search_count([('name', '=', func_name)]):
+ self.create({'name': func_name})


@guewen

guewen Apr 21, 2015

Member

I'm thinking of adding a new default_channel keyword argument to the @job decorator that would automatically create a channel for the job if it does not exist yet and link it with the job. Thus, connector implementations would be able to propose default channels (for example root.magento.import and root.magento.export for Magento).


@sbidoul

sbidoul Apr 21, 2015

Member

👍


@guewen

guewen Apr 24, 2015

Member

@sbidoul here it is acsone#3
I also added tests on the job function and channel models.

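For readers following along, the proposed usage would presumably look something like the snippet below. This is a hypothetical example of the idea in acsone#3, not necessarily its final API; the import path and the Magento-style function are just the connector's usual @job pattern used for illustration.

    # Hypothetical usage of the proposed default_channel argument.
    from openerp.addons.connector.queue.job import job

    @job(default_channel='root.magento.import')
    def import_record(session, model_name, backend_id, magento_id):
        """Import a record from Magento (illustrative job, body omitted)."""
        pass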

@lmignon

Contributor

lmignon commented Apr 21, 2015

👍 it's really a great improvement!

@coveralls

coveralls Apr 26, 2015

Coverage decreased (-7.38%) to 72.69% when pulling 458a409 on acsone:8.0-jobrunner-sbi into 6686fd3 on OCA:8.0.

@guewen

Member

guewen commented Apr 27, 2015

@sbidoul @lmignon We tested the runner (the 7.0 backport) on a production server with ~15000 jobs per day. We encountered "too many connections" issues with PostgreSQL. I think that's only a matter of misconfiguration on our side. Did you run into similar issues? Do you know how many connections the runner may use at most?

@lmignon

Contributor

lmignon commented Apr 27, 2015

@guewen We also encountered the same issue (with +100k jobs/day). If I remember correctly, you have to change the db_maxconn parameter in your configuration (64 by default).

@guewen

Member

guewen commented Apr 27, 2015

We also encountered the same issue (with +100k jobs/day). If I remember correctly, you have to change the db_maxconn parameter in your configuration (64 by default).

We tend to keep this option as low as possible in order to avoid exhausting the max connections of Postgres itself.
For example, we have max_connections=200 in postgresql.conf and 16 workers; we put 12 in db_maxconn, which is sufficient today (so we use at most 192 connections). If we had to put something > 64, multiplied by the number of workers, we would need a huge number of Postgres connections.

I want to find out how we could quantify the connections necessary for the runner, but I don't know exactly how to proceed. Any idea?

@sbidoul

Member

sbidoul commented Apr 27, 2015

Off the top of my head, the runner needs exactly one Postgres connection per database it is listening to. This comes on top of the connections used by the normal Odoo workers, of course.
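Putting the two answers together, here is a back-of-the-envelope check using the numbers from this thread (an illustration only, not a sizing recommendation):

    # Rough connection budget with the figures discussed above.
    max_connections = 200    # postgresql.conf
    odoo_workers = 16        # Odoo worker processes
    db_maxconn = 12          # per-worker connection pool limit
    listened_databases = 1   # the runner needs ~1 connection per database it listens to

    worker_connections = odoo_workers * db_maxconn    # 192
    runner_connections = listened_databases           # 1
    total = worker_connections + runner_connections   # 193, still under 200

    assert total <= max_connections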

@sbidoul

Member

sbidoul commented May 4, 2015

rebased

@guewen

Member

guewen commented May 7, 2015

We're missing one thing: documentation on odoo-connector.com.
Perhaps we should add information on http://odoo-connector.com/guides/multiprocessing.html or a dedicated page. And we can add the API docstrings to the doc, like this: https://raw.githubusercontent.com/OCA/connector/8.0/connector/doc/api/api_queue.rst

I do not consider that as blocking though, as we can do it in a second PR.

@sbidoul Whenever you confirm it is ready to merge, I will do it with a great smile on my face :-)

@sbidoul

Member

sbidoul commented May 7, 2015

@guewen I need to write a documentation page indeed. I suggest you wait for it before merging, so I have a higher incentive to squeeze that into my agenda 😄

@guewen

Member

guewen commented May 8, 2015

I need to write a documentation page indeed. I suggest you wait for it before merging, so I have a higher incentive to squeeze that into my agenda 😄

Okay 😆

connector/jobrunner/runner.py
+    def set_job_enqueued(self, uuid):
+        with closing(self.conn.cursor()) as cr:
+            cr.execute("UPDATE queue_job SET state=%s, "
+                       "date_enqueued=date_trunc('seconds', now()::timestamp) "


@lmignon

lmignon May 8, 2015

Contributor

@sbidoul I just noticed that the now() should be at UTC.

date_enqueued = date_trunc('seconds', now()::timestamp at time zone 'utc')

The code is therefore sensitive to the timezone defined on the database. To work as expected, the timezone of the database should be set to 'UTC':

#show timezone;
 TimeZone 
----------
 UTC
(1 row)

@sbidoul

sbidoul May 8, 2015

Member

Good catch. It's fixed now.


connector/jobrunner/__init__.py
+
+
+# TODO: this is a temporary flag to enable the connector runner
+enable = os.environ.get('ODOO_CONNECTOR_RUNNER_ENABLE')


@sbidoul

sbidoul May 8, 2015

Member

@guewen @lmignon perhaps we can save one environment variable by using ODOO_CONNECTOR_CHANNELS here instead. What do you think?


@guewen

guewen May 13, 2015

Member

Yes, sounds good.


@sbidoul

sbidoul May 13, 2015

Member

I removed the need for ODOO_CONNECTOR_RUNNER_ENABLE, setting ODOO_CONNECTOR_CHANNELS is now sufficient to enable the runner.

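For completeness, enabling the runner after this change presumably comes down to starting the server with the channels defined in the environment; the root:4 value below is only an example capacity, and the exact syntax is the one documented in runner.py.

    # example (shell): enable the runner with a root channel capacity of 4
    ODOO_CONNECTOR_CHANNELS=root:4 ./openerp-server --config=odoo.cfg --workers=4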

@sbidoul

Member

sbidoul commented May 13, 2015

@guewen I just pushed a documentation commit.

Let me know whether it's readable, or if you have any other thoughts about it.

I took the liberty of adding Acsone to the list of maintainers next to Akretion, as I thought this contribution together with Laurent's previous work justifies it, but I don't know the volume of Akretion's contributions. If you feel this is not appropriate, let me know.

@coveralls

coveralls May 13, 2015

Coverage decreased (-8.08%) to 71.99% when pulling 2e41523 on acsone:8.0-jobrunner-sbi into cf15369 on OCA:8.0.

sbidoul added some commits Mar 14, 2015

[WIP] POC to replace workers by a job runner using pgsql NOTIFY/LISTEN
Because:
* preassigning jobs to a connector worker can starve other connector workers
* connector workers poll with a long sleep delay, so jobs do not start immediately

If this works, we could even in the future:
* remove the concept of "connector worker" completely and let Odoo run jobs in its own worker process
* remove the concept of connector session: jobs could be any Odoo method

sbidoul added some commits Apr 12, 2015

[IMP] connector-runner: rename worker to capacity, plus a few more tests
* auto-created intermediate channels have an infinite capacity
* ChannelNotFound exception instead of RuntimeError
[FIX] connector-runner: do not attempt to auto-create/auto-configure channels

What we receive from the queue are channel names and that's all.
My original idea was to let applications auto-create temporary channels
which could be useful for sequential channels.
But this needs more thinking.
[IMP] connector-runner: do not NOTIFY when deleting done jobs
This NOTIFY is not necessary and makes the purge of the job queue very slow.
Thanks to @acsonelmi for the suggestion.
[FIX] connector-runner: use string type for datetime
Odoo indeed adapts the psycopg2 type mappings to use strings for datetime.
[IMP] connector-runner: remove ODOO_CONNECTOR_ENABLE environment variable

The presence of ODOO_CONNECTOR_CHANNELS is enough to make the runner
start in place of the workers.
[IMP] connector-runner: remove obsolete TODO
Since the runner is now a thread in the Odoo main process,
such a situation should occur only with a pathologically wrong
configuration, so no need to worry about that... unless proven
otherwise ;)
[IMP] connector-runner: refactor handling of 'now'
This way the channel module does not make assumptions about the type of the
eta field: we use the Odoo-ish way to get 'now' in a database-compatible
format, which also makes it more easily testable.
@sbidoul

Member

sbidoul commented May 15, 2015

I rebased to get the latest Travis config.

There are 3 TODOs left, which IMO can be left for future work:

  • a better mechanism to do the HTTP GET asynchronously: normally not a real performance issue, but if it ever becomes one, using a thread pool could be a solution (see the sketch below)
  • detect databases on which connector is added/removed: currently the server needs to be restarted in such a case, which should not be an issue except for multi-tenant hosters
  • finalize sequential channels (needs more thinking around ETA), so I did not document that feature

So from my point of view it's ready to merge if and when Travis manages to give back a green status.
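On the first TODO above, the thread-pool idea could look roughly like the sketch below. This is only an illustration of the idea, not code from this PR; the pool size and the use of urllib2 with the stdlib ThreadPool are assumptions.

    # Illustration of the "thread pool for async HTTP GET" idea (not PR code).
    import urllib2
    from multiprocessing.pool import ThreadPool  # stdlib thread pool (Python 2.7)

    pool = ThreadPool(processes=4)  # pool size is arbitrary here

    def async_http_get(url):
        # Hand the blocking GET off to the pool so the runner loop never waits
        # on it; failures are left to the job's own error handling.
        pool.apply_async(urllib2.urlopen, (url,))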

@guewen

Member

guewen commented May 15, 2015

I took the liberty of adding Acsone to the list of maintainers next to Akretion, as I thought this contribution together with Laurent's previous work justifies it, but I don't know the volume of Akretion's contributions. If you feel this is not appropriate, let me know.

Absolutely deserved.

connector/doc/guides/multiprocessing.rst
+ supports job channels. You should try the job runner first
+ and fall back to using workers in case the runner does not
+ work (sic) for you, in which case we will very much appreciate
+ a github issue describing the problems you encoutered.


@guewen

guewen May 15, 2015

Member

s/encoutered/encountered/


connector/jobrunner/channels.py
+ ---------------------+
+ |
+ |
+ Ch. A W:4,Q:12,R:4 +-----------------------


@guewen

guewen May 15, 2015

Member

Could you replace W by C?


@guewen

Member

guewen commented May 15, 2015

I spotted 2 very little things in the doc (inline comments); apart from that, all good!

@lmignon

Contributor

lmignon commented May 15, 2015

I'm really impressed by this PR and the educational aspects of the doctests associated with each new concept. The connector module is one of my favorite OCA addons and a killer feature for Odoo/OCA. IMO it's ready to be merged. 👍

@guewen

Member

guewen commented May 15, 2015

I'm really impressed by this PR and the educational aspects of the doctests associated with each new concept. The connector module is one of my favorite OCA addons and a killer feature for Odoo/OCA. IMO it's ready to be merged. 👍

I can't express it better. Many thanks for this contribution.

guewen added a commit that referenced this pull request May 15, 2015

@guewen guewen merged commit 499be9c into OCA:8.0 May 15, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@sbidoul

Member

sbidoul commented May 15, 2015

Cool. Thanks guys!

@sbidoul sbidoul deleted the acsone:8.0-jobrunner-sbi branch May 15, 2015

@agb80

Member

agb80 commented Jun 19, 2015

@sbidoul is this runner available on the 7.0 branch?

@sbidoul

Member

sbidoul commented Jun 20, 2015

@agb80 yes, #53 is merged in 7.0
