external tool status for celery unstable even though celery is running fine #377

Closed
ngehlenborg opened this issue Mar 3, 2014 · 7 comments

Comments

@ngehlenborg
Contributor

The external tool status for celery switches back and forth between SUCCESS and FAILURE every few seconds (3 to 5 seconds of wall-clock time), even though the application is running fine.

The task statuses all look OK, too:

[Screenshot: task statuses, 2014-03-03 at 7:30:02 am]

How can this happen?

This was observed on the VM after restarting all tools and running the node_relationship_ui_feature branch (though I doubt it differs from the develop branch with regard to external tool status monitoring).

@ngehlenborg ngehlenborg added this to the Release 0.0.1 milestone Mar 3, 2014
@ngehlenborg ngehlenborg added the bug label Mar 3, 2014
@ngehlenborg
Contributor Author

Below is my celeryd log from that session (and the one after I restarted the VM).

There seem to have been some issues connecting to the broker and a lot of "hard time limit exceeded" errors (probably connected to the broker issue).

My computer went to sleep a few times during that session; maybe this affected the broker running in the VM. But I don't understand why I would sometimes see a SUCCESS on one attempt and a FAILURE on the next.
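
To see which checks hit the hard time limit most often, the celeryd log could be tallied by task name. This is only a rough sketch: the regular expression is written against the "Hard time limit (...) exceeded for ..." lines in the log in this comment, and `count_hard_timeouts` is a hypothetical helper, not something that exists in Refinery.

```python
import re
from collections import Counter

# Matches celeryd lines of the form:
#   [...: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[<uuid>]
HARD_LIMIT = re.compile(
    r"Hard time limit \((?P<limit>[\d.]+)s\) exceeded for (?P<task>[\w.]+)\["
)


def count_hard_timeouts(lines):
    """Return a Counter mapping task name to its number of hard time limit errors."""
    counts = Counter()
    for line in lines:
        match = HARD_LIMIT.search(line)
        if match:
            counts[match.group("task")] += 1
    return counts


# Three lines copied from the log; in practice, pass an open log file instead.
sample = [
    "[2014-02-28 18:19:34,136: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[4a5f3d0c-5b63-42a4-aabd-27291acbe563]",
    "[2014-03-01 07:17:30,893: ERROR/MainProcess] Hard time limit (2.5s) exceeded for core.tasks.check_for_solr[83a760b8-f682-47d7-9d86-1d4abff1c947]",
    "[2014-03-01 07:17:33,792: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[e4841542-09b7-4c12-bfc3-ed443e855d7e]",
]

for task, count in count_hard_timeouts(sample).most_common():
    print(task, count)
```

The "raised exception: TimeLimitExceeded" lines are deliberately not matched, so each timeout is counted once.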

A highlight:

[2014-03-03 07:20:44,850: WARNING/MainProcess] discard: Erased 44341 messages from the queue.
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      amqp://guest@localhost:5672//
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@WARNING
- ** ----------   . concurrency: 4
- ** ----------   . events:      ON
- *** --- * ---   . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery
[2014-02-28 15:39:49,251: WARNING/MainProcess] celery@refinery has started.
2014-02-28 15:39:49 INFO     tasks check_for_solr: core.tasks.check_for_solr: Could not connect to Solr
[2014-02-28 18:19:34,135: ERROR/MainProcess] Task core.tasks.check_for_celery[4a5f3d0c-5b63-42a4-aabd-27291acbe563] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-02-28 18:19:34,136: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[4a5f3d0c-5b63-42a4-aabd-27291acbe563]
[2014-03-01 07:17:28,324: ERROR/MainProcess] Task core.tasks.dispatch_galaxy_checks[67bfdf3a-9879-4859-8c2c-e736cc9e61ad] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:28,328: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.dispatch_galaxy_checks[67bfdf3a-9879-4859-8c2c-e736cc9e61ad]
[2014-03-01 07:17:30,891: ERROR/MainProcess] Task core.tasks.check_for_solr[83a760b8-f682-47d7-9d86-1d4abff1c947] raised exception: TimeLimitExceeded(2.5,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.5
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.5
[2014-03-01 07:17:30,893: ERROR/MainProcess] Hard time limit (2.5s) exceeded for core.tasks.check_for_solr[83a760b8-f682-47d7-9d86-1d4abff1c947]
[2014-03-01 07:17:31,418: ERROR/MainProcess] Task core.tasks.check_for_galaxy[93f70d81-8ab9-4e23-bd6d-4c3853996cec] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:31,419: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[93f70d81-8ab9-4e23-bd6d-4c3853996cec]
[2014-03-01 07:17:33,762: ERROR/MainProcess] Task core.tasks.check_for_celery[e4841542-09b7-4c12-bfc3-ed443e855d7e] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:33,792: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[e4841542-09b7-4c12-bfc3-ed443e855d7e]
[2014-03-01 07:17:33,796: ERROR/MainProcess] Task core.tasks.check_for_galaxy[99a1d05a-0dac-4492-bae3-a3cdcfe2692a] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:33,796: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[99a1d05a-0dac-4492-bae3-a3cdcfe2692a]
[2014-03-01 07:17:43,752: ERROR/MainProcess] Task core.tasks.dispatch_galaxy_checks[3f3439bb-813d-4c6a-b080-5cf74236b615] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:43,755: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.dispatch_galaxy_checks[3f3439bb-813d-4c6a-b080-5cf74236b615]
[2014-03-01 07:17:45,440: ERROR/MainProcess] Task core.tasks.check_for_celery[2e09c423-d720-4f64-bb18-a56715733bae] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:45,443: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[2e09c423-d720-4f64-bb18-a56715733bae]
[2014-03-01 07:17:49,409: ERROR/MainProcess] Task core.tasks.check_for_celery[d1137809-ceb4-4599-a01c-b45e1f9a0947] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:49,409: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[d1137809-ceb4-4599-a01c-b45e1f9a0947]
[2014-03-01 07:17:54,649: ERROR/MainProcess] Task core.tasks.check_for_celery[8d4232ea-bffe-4c66-93b8-4539146498de] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:54,650: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[8d4232ea-bffe-4c66-93b8-4539146498de]
[2014-03-01 18:02:12,086: ERROR/MainProcess] Task core.tasks.check_for_celery[62b418f5-c73a-4120-9518-7a0c47400e8d] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 18:02:12,095: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[62b418f5-c73a-4120-9518-7a0c47400e8d]
[2014-03-01 18:02:14,107: ERROR/MainProcess] Task core.tasks.check_for_galaxy[87cc81c2-c515-402f-8a2a-9f8b0cc7e2a3] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 18:02:14,108: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[87cc81c2-c515-402f-8a2a-9f8b0cc7e2a3]
[2014-03-01 20:26:34,258: ERROR/MainProcess] Task core.tasks.check_for_celery[f5fda794-c57a-4da6-8484-f3ad8fff69f6] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 20:26:34,273: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[f5fda794-c57a-4da6-8484-f3ad8fff69f6]
[2014-03-01 21:33:38,969: ERROR/MainProcess] Task core.tasks.check_for_celery[8c210343-b846-4dcb-8b44-613cb7b23ed9] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 21:33:38,969: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[8c210343-b846-4dcb-8b44-613cb7b23ed9]
[2014-03-02 04:08:51,922: ERROR/MainProcess] Task core.tasks.check_for_celery[89c5dd4a-da9b-40ea-9ee0-e802b7c71866] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:08:51,970: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[89c5dd4a-da9b-40ea-9ee0-e802b7c71866]
[2014-03-02 04:19:24,515: ERROR/MainProcess] Task core.tasks.check_for_celery[a1be83d9-7c65-413d-9f18-a2e31a4cc74e] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:19:24,516: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[a1be83d9-7c65-413d-9f18-a2e31a4cc74e]
[2014-03-02 04:24:41,198: ERROR/MainProcess] Task core.tasks.check_for_celery[f0cae5db-6eaa-4bea-abb3-f4c0308eee20] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:24:41,199: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[f0cae5db-6eaa-4bea-abb3-f4c0308eee20]
[2014-03-02 05:05:51,877: ERROR/MainProcess] Task core.tasks.check_for_celery[5926f3b9-a9b4-49d4-a35a-1171e50b26a9] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
    raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 05:05:51,886: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[5926f3b9-a9b4-49d4-a35a-1171e50b26a9]
[2014-03-03 07:20:24,260: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:20:44,850: WARNING/MainProcess] discard: Erased 44341 messages from the queue.
[2014-03-03 07:20:44,850: WARNING/MainProcess] -------------- celery@refinery v2.4.6
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      amqp://guest@localhost:5672//
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@WARNING
- ** ----------   . concurrency: 4
- ** ----------   . events:      ON
- *** --- * ---   . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery
[2014-03-03 07:20:44,900: WARNING/MainProcess] celery@refinery has started.
2014-03-03 07:20:45 INFO     tasks check_for_solr: core.tasks.check_for_solr: Could not connect to Solr
2014-03-03 07:41:02 DEBUG    tasks run_analysis: analysis_manager.tasks run_analysis called
2014-03-03 07:41:02 WARNING  models get_absolute_path: Datafile doesn't exist in FileStoreItem '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 DEBUG    tasks import_file: Importing FileStoreItem with UUID '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 WARNING  models get_absolute_path: Datafile doesn't exist in FileStoreItem '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 DEBUG    tasks run_analysis_preprocessing: analysis_manager.run_analysis_preprocessing called
2014-03-03 07:41:03 DEBUG    connection create_library: library name: 'Refinery Analysis - 14426220-a2d1-11e3-8022-080027129698 (2014-03-03 07:41:03.038619)'
2014-03-03 07:41:03 DEBUG    galaxy_workflow createStepsAnnot: galaxy_workflow.createStepsAnnot called
2014-03-03 07:41:03 DEBUG    galaxy_workflow countWorkflowSteps: galaxy_connector.galaxy_workflow countWorkflowSteps called
2014-03-03 07:41:04 DEBUG    tasks import_file: Starting download from 'https://main.g2.bx.psu.edu/datasets/d78ba454458040fd/display?to_ext=fastqsanger'
2014-03-03 07:41:43 DEBUG    tasks import_file: Finished downloading from 'https://main.g2.bx.psu.edu/datasets/d78ba454458040fd/display?to_ext=fastqsanger'
2014-03-03 07:41:44 DEBUG    tasks run_analysis_execution: analysis_manager.run_analysis_execution called
2014-03-03 07:41:44 DEBUG    tasks import_analysis_in_galaxy: analysis_manager.tasks import_analysis_in_galaxy called
2014-03-03 07:41:44 ERROR    connection post: 400 Client Error: Bad Request - http://192.168.50.1:8080/api/libraries/40876639881ca029/contents?key=dbeaf10787cc00d9f64e733b0802265d
2014-03-03 07:41:44 ERROR    tasks run_analysis_execution: Analysis execution failed: error importing analysis 'Test workflow: 5 steps without branching 2014-03-03 @ 07:41:02' into Galaxy: Galaxy request error
2014-03-03 07:41:51 DEBUG    tasks chord_postprocessing: analysis_manager.chord_postprocessing called
[2014-03-03 07:44:31,848: ERROR/MainProcess] Consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 310, in start
    self.consume_messages()
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 326, in consume_messages
    self.connection.drain_events(timeout=1)
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/connection.py", line 194, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 235, in drain_events
    return connection.drain_events(**kwargs)
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 67, in drain_events
    return self.wait_multi(self.channels.values(), timeout=timeout)
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 92, in wait_multi
    return amqp_method(channel, args)
  File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/amqplib/client_0_8/connection.py", line 380, in _close
    raise AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
[2014-03-03 07:44:31,881: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 2 seconds...
[2014-03-03 07:44:33,882: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 4 seconds...
[2014-03-03 07:44:37,924: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:44:38,820: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:46:19,922: WARNING/MainProcess] discard: Erased 16 messages from the queue.
[2014-03-03 07:46:19,923: WARNING/MainProcess] -------------- celery@refinery v2.4.6

@ngehlenborg
Contributor Author

If you have any theories as to why this is happening, please leave a comment. I will take this out of the current milestone.

@ngehlenborg ngehlenborg removed this from the Release 0.0.1 milestone Mar 6, 2014
@ngehlenborg
Contributor Author

I am still observing this problem occasionally. We should look into using the supervisord API (http://supervisord.org/api.html) to see if we can replace the current backend with calls to supervisord. It seems pretty straightforward.
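
A rough sketch of what such a call might look like, using supervisord's XML-RPC interface from Python. The URL, the port, and the program name "celeryd" are assumptions about the deployment, and `process_status` is a hypothetical helper; only `supervisor.getProcessInfo` and its `statename` field come from the supervisord API.

```python
from xmlrpc.client import ServerProxy


def connect(url="http://localhost:9001/RPC2"):
    """Create an XML-RPC proxy; the URL assumes supervisord's inet HTTP server is enabled."""
    return ServerProxy(url)


def process_status(server, name):
    """Map supervisord's process state onto the SUCCESS/FAILURE values shown in the UI."""
    info = server.supervisor.getProcessInfo(name)
    return "SUCCESS" if info["statename"] == "RUNNING" else "FAILURE"


# Example (needs a running supervisord with a program named "celeryd"):
#   print(process_status(connect(), "celeryd"))
```

Because supervisord already tracks process state, a lookup like this would not need to go through the broker or a worker with a time limit.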

@hackdna: Do you think that supervisord is an appropriate approach for running services in production? And would we still be using supervisord when running on AWS?

@hackdna
Member

hackdna commented May 21, 2015

Supervisord can be used in production. However, the application architecture will change on AWS (different services running on different EC2 instances), so we'll probably be using something like http://aws.amazon.com/cloudwatch/.

@ngehlenborg
Contributor Author

Thanks @hackdna. We could implement a backend for CloudWatch once we move. I suggest we try to connect directly to supervisord for now and pass the information on to the client through our own API. If someone decides to deploy without supervisord, they need to handle monitoring themselves or use the existing backend. supervisord should be a lot more reliable than our current solution and will also be less of a strain on resources.
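
To make the idea of swappable backends concrete, here is a minimal sketch. All class and function names are hypothetical and nothing here exists in Refinery; `FixedBackend` only stands in for a real supervisord or CloudWatch backend so that the example runs on its own.

```python
from abc import ABC, abstractmethod


class StatusBackend(ABC):
    """Interface that every monitoring backend would implement."""

    @abstractmethod
    def status(self, tool):
        """Return "SUCCESS" or "FAILURE" for the named tool."""


class FixedBackend(StatusBackend):
    """Stand-in backend that reports states from a dict (for illustration only)."""

    def __init__(self, states):
        self.states = states

    def status(self, tool):
        # Tools the backend does not know about are reported as failing.
        return self.states.get(tool, "FAILURE")


def tool_statuses(backend, tools):
    """What our own API would hand to the client, whichever backend is configured."""
    return {tool: backend.status(tool) for tool in tools}


backend = FixedBackend({"celery": "SUCCESS", "solr": "FAILURE"})
print(tool_statuses(backend, ["celery", "solr", "galaxy"]))
```

The client only ever sees the result of `tool_statuses`, so a deployment without supervidord-style process control would only have to supply a different `StatusBackend` subclass.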

@hackdna
Member

hackdna commented May 21, 2015

Yes, it'd be nice to have a monitoring API in Refinery that could work with different backends, but I think there are more important tasks to tackle in the short to medium term.

@ngehlenborg
Contributor Author

Closed by removal of ExternalToolStatus backend in e3d4871.
