external tool status for celery unstable even though celery is running fine #377
Below is my celeryd log from that session (and from the one after I restarted the VM). There seemed to be some issues connecting to the broker and a lot of hard time limit exceeded errors (probably related to the broker issue). My computer went to sleep a few times during that session, which may have affected the broker running in the VM. But I don't understand why I would sometimes see a SUCCESS on one attempt and a FAILURE on the next. A highlight:

[2014-03-03 07:20:44,850: WARNING/MainProcess] discard: Erased 44341 messages from the queue.

---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: amqp://guest@localhost:5672//
- ** ---------- . loader: djcelery.loaders.DjangoLoader
- ** ---------- . logfile: [stderr]@WARNING
- ** ---------- . concurrency: 4
- ** ---------- . events: ON
- *** --- * --- . beat: OFF
-- ******* ----
--- ***** ----- [Queues]
-------------- . celery: exchange:celery (direct) binding:celery
[2014-02-28 15:39:49,251: WARNING/MainProcess] celery@refinery has started.
2014-02-28 15:39:49 INFO tasks check_for_solr: core.tasks.check_for_solr: Could not connect to Solr
[2014-02-28 18:19:34,135: ERROR/MainProcess] Task core.tasks.check_for_celery[4a5f3d0c-5b63-42a4-aabd-27291acbe563] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-02-28 18:19:34,136: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[4a5f3d0c-5b63-42a4-aabd-27291acbe563]
[2014-03-01 07:17:28,324: ERROR/MainProcess] Task core.tasks.dispatch_galaxy_checks[67bfdf3a-9879-4859-8c2c-e736cc9e61ad] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:28,328: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.dispatch_galaxy_checks[67bfdf3a-9879-4859-8c2c-e736cc9e61ad]
[2014-03-01 07:17:30,891: ERROR/MainProcess] Task core.tasks.check_for_solr[83a760b8-f682-47d7-9d86-1d4abff1c947] raised exception: TimeLimitExceeded(2.5,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.5
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.5
[2014-03-01 07:17:30,893: ERROR/MainProcess] Hard time limit (2.5s) exceeded for core.tasks.check_for_solr[83a760b8-f682-47d7-9d86-1d4abff1c947]
[2014-03-01 07:17:31,418: ERROR/MainProcess] Task core.tasks.check_for_galaxy[93f70d81-8ab9-4e23-bd6d-4c3853996cec] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:31,419: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[93f70d81-8ab9-4e23-bd6d-4c3853996cec]
[2014-03-01 07:17:33,762: ERROR/MainProcess] Task core.tasks.check_for_celery[e4841542-09b7-4c12-bfc3-ed443e855d7e] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:33,792: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[e4841542-09b7-4c12-bfc3-ed443e855d7e]
[2014-03-01 07:17:33,796: ERROR/MainProcess] Task core.tasks.check_for_galaxy[99a1d05a-0dac-4492-bae3-a3cdcfe2692a] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:33,796: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[99a1d05a-0dac-4492-bae3-a3cdcfe2692a]
[2014-03-01 07:17:43,752: ERROR/MainProcess] Task core.tasks.dispatch_galaxy_checks[3f3439bb-813d-4c6a-b080-5cf74236b615] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:43,755: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.dispatch_galaxy_checks[3f3439bb-813d-4c6a-b080-5cf74236b615]
[2014-03-01 07:17:45,440: ERROR/MainProcess] Task core.tasks.check_for_celery[2e09c423-d720-4f64-bb18-a56715733bae] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:45,443: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[2e09c423-d720-4f64-bb18-a56715733bae]
[2014-03-01 07:17:49,409: ERROR/MainProcess] Task core.tasks.check_for_celery[d1137809-ceb4-4599-a01c-b45e1f9a0947] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:49,409: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[d1137809-ceb4-4599-a01c-b45e1f9a0947]
[2014-03-01 07:17:54,649: ERROR/MainProcess] Task core.tasks.check_for_celery[8d4232ea-bffe-4c66-93b8-4539146498de] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 07:17:54,650: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[8d4232ea-bffe-4c66-93b8-4539146498de]
[2014-03-01 18:02:12,086: ERROR/MainProcess] Task core.tasks.check_for_celery[62b418f5-c73a-4120-9518-7a0c47400e8d] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 18:02:12,095: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[62b418f5-c73a-4120-9518-7a0c47400e8d]
[2014-03-01 18:02:14,107: ERROR/MainProcess] Task core.tasks.check_for_galaxy[87cc81c2-c515-402f-8a2a-9f8b0cc7e2a3] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 18:02:14,108: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_galaxy[87cc81c2-c515-402f-8a2a-9f8b0cc7e2a3]
[2014-03-01 20:26:34,258: ERROR/MainProcess] Task core.tasks.check_for_celery[f5fda794-c57a-4da6-8484-f3ad8fff69f6] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 20:26:34,273: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[f5fda794-c57a-4da6-8484-f3ad8fff69f6]
[2014-03-01 21:33:38,969: ERROR/MainProcess] Task core.tasks.check_for_celery[8c210343-b846-4dcb-8b44-613cb7b23ed9] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-01 21:33:38,969: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[8c210343-b846-4dcb-8b44-613cb7b23ed9]
[2014-03-02 04:08:51,922: ERROR/MainProcess] Task core.tasks.check_for_celery[89c5dd4a-da9b-40ea-9ee0-e802b7c71866] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:08:51,970: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[89c5dd4a-da9b-40ea-9ee0-e802b7c71866]
[2014-03-02 04:19:24,515: ERROR/MainProcess] Task core.tasks.check_for_celery[a1be83d9-7c65-413d-9f18-a2e31a4cc74e] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:19:24,516: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[a1be83d9-7c65-413d-9f18-a2e31a4cc74e]
[2014-03-02 04:24:41,198: ERROR/MainProcess] Task core.tasks.check_for_celery[f0cae5db-6eaa-4bea-abb3-f4c0308eee20] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 04:24:41,199: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[f0cae5db-6eaa-4bea-abb3-f4c0308eee20]
[2014-03-02 05:05:51,877: ERROR/MainProcess] Task core.tasks.check_for_celery[5926f3b9-a9b4-49d4-a35a-1171e50b26a9] raised exception: TimeLimitExceeded(2.0,)
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 354, in _on_hard_timeout
raise TimeLimitExceeded(hard_timeout)
TimeLimitExceeded: 2.0
[2014-03-02 05:05:51,886: ERROR/MainProcess] Hard time limit (2.0s) exceeded for core.tasks.check_for_celery[5926f3b9-a9b4-49d4-a35a-1171e50b26a9]
[2014-03-03 07:20:24,260: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:20:44,850: WARNING/MainProcess] discard: Erased 44341 messages from the queue.
[2014-03-03 07:20:44,850: WARNING/MainProcess] -------------- celery@refinery v2.4.6
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: amqp://guest@localhost:5672//
- ** ---------- . loader: djcelery.loaders.DjangoLoader
- ** ---------- . logfile: [stderr]@WARNING
- ** ---------- . concurrency: 4
- ** ---------- . events: ON
- *** --- * --- . beat: OFF
-- ******* ----
--- ***** ----- [Queues]
-------------- . celery: exchange:celery (direct) binding:celery
[2014-03-03 07:20:44,900: WARNING/MainProcess] celery@refinery has started.
2014-03-03 07:20:45 INFO tasks check_for_solr: core.tasks.check_for_solr: Could not connect to Solr
2014-03-03 07:41:02 DEBUG tasks run_analysis: analysis_manager.tasks run_analysis called
2014-03-03 07:41:02 WARNING models get_absolute_path: Datafile doesn't exist in FileStoreItem '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 DEBUG tasks import_file: Importing FileStoreItem with UUID '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 WARNING models get_absolute_path: Datafile doesn't exist in FileStoreItem '8325e0d4-9dd8-11e3-a08c-080027129698'
2014-03-03 07:41:02 DEBUG tasks run_analysis_preprocessing: analysis_manager.run_analysis_preprocessing called
2014-03-03 07:41:03 DEBUG connection create_library: library name: 'Refinery Analysis - 14426220-a2d1-11e3-8022-080027129698 (2014-03-03 07:41:03.038619)'
2014-03-03 07:41:03 DEBUG galaxy_workflow createStepsAnnot: galaxy_workflow.createStepsAnnot called
2014-03-03 07:41:03 DEBUG galaxy_workflow countWorkflowSteps: galaxy_connector.galaxy_workflow countWorkflowSteps called
2014-03-03 07:41:04 DEBUG tasks import_file: Starting download from 'https://main.g2.bx.psu.edu/datasets/d78ba454458040fd/display?to_ext=fastqsanger'
2014-03-03 07:41:43 DEBUG tasks import_file: Finished downloading from 'https://main.g2.bx.psu.edu/datasets/d78ba454458040fd/display?to_ext=fastqsanger'
2014-03-03 07:41:44 DEBUG tasks run_analysis_execution: analysis_manager.run_analysis_execution called
2014-03-03 07:41:44 DEBUG tasks import_analysis_in_galaxy: analysis_manager.tasks import_analysis_in_galaxy called
2014-03-03 07:41:44 ERROR connection post: 400 Client Error: Bad Request - http://192.168.50.1:8080/api/libraries/40876639881ca029/contents?key=dbeaf10787cc00d9f64e733b0802265d
2014-03-03 07:41:44 ERROR tasks run_analysis_execution: Analysis execution failed: error importing analysis 'Test workflow: 5 steps without branching 2014-03-03 @ 07:41:02' into Galaxy: Galaxy request error
2014-03-03 07:41:51 DEBUG tasks chord_postprocessing: analysis_manager.chord_postprocessing called
[2014-03-03 07:44:31,848: ERROR/MainProcess] Consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 310, in start
self.consume_messages()
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 326, in consume_messages
self.connection.drain_events(timeout=1)
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/connection.py", line 194, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 235, in drain_events
return connection.drain_events(**kwargs)
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 67, in drain_events
return self.wait_multi(self.channels.values(), timeout=timeout)
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 92, in wait_multi
return amqp_method(channel, args)
File "/home/vagrant/.virtualenvs/refinery-platform/local/lib/python2.7/site-packages/amqplib/client_0_8/connection.py", line 380, in _close
raise AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
[2014-03-03 07:44:31,881: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 2 seconds...
[2014-03-03 07:44:33,882: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 4 seconds...
[2014-03-03 07:44:37,924: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:44:38,820: WARNING/MainProcess] celeryd: Warm shutdown (MainProcess)
[2014-03-03 07:46:19,922: WARNING/MainProcess] discard: Erased 16 messages from the queue.
[2014-03-03 07:46:19,923: WARNING/MainProcess] -------------- celery@refinery v2.4.6
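For context on the repeated TimeLimitExceeded errors above: when a Celery hard time limit fires, the worker process running the task is killed and replaced, and the task is marked failed. The snippet below is a simplified single-process analogue of a hard time limit using SIGALRM, a sketch for illustration only, not Celery's actual pool implementation (and Unix-only, since it relies on signals):

```python
import signal
import time

class TimeLimitExceeded(Exception):
    """Raised when a callable exceeds its hard time limit."""

def run_with_hard_limit(func, seconds):
    """Run func(), raising TimeLimitExceeded if it runs longer than `seconds`."""
    def _on_timeout(signum, frame):
        raise TimeLimitExceeded(seconds)
    previous = signal.signal(signal.SIGALRM, _on_timeout)
    signal.alarm(seconds)  # SIGALRM fires after `seconds` (whole seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)

# A fast task completes; a slow one is cut off:
print(run_with_hard_limit(lambda: "done", 2))  # done
try:
    run_with_hard_limit(lambda: time.sleep(3), 1)
except TimeLimitExceeded as exc:
    print("hard time limit exceeded after %ss" % exc)
```

With a 2-second hard limit, any health-check task that blocks on a dead broker or an unreachable service for longer than that will fail this way, which matches the pattern in the log.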
If you have any theories as to why this is happening, please write a comment. I will take this out of the current milestone.
I am still observing this problem occasionally. We should look into using the supervisord API (http://supervisord.org/api.html) to see if we can replace the current backend with calls to supervisord. It seems pretty straightforward. @hackdna: Do you think that supervisord is an appropriate approach for running services in production? And would we still be using supervisord when running on AWS?
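To make the supervisord idea concrete, here is a minimal sketch against supervisord's XML-RPC interface. The URL/port and the managed program name are assumptions about the deployment (supervisord would need an [inet_http_server] section configured); the state-name-to-label mapping follows the process states documented by supervisord:

```python
from xmlrpc.client import ServerProxy

# supervisord process state names, per the supervisord documentation
RUNNING_STATES = {"RUNNING"}
TRANSIENT_STATES = {"STARTING", "STOPPING", "BACKOFF"}

def tool_status(info):
    """Map a supervisor.getProcessInfo() result dict to SUCCESS/FAILURE/UNKNOWN."""
    state = info["statename"]
    if state in RUNNING_STATES:
        return "SUCCESS"
    if state in TRANSIENT_STATES:
        return "UNKNOWN"
    return "FAILURE"  # STOPPED, EXITED, FATAL, UNKNOWN

def check_process(name, url="http://localhost:9001/RPC2"):
    """Ask supervisord for the state of a managed program (name/port assumed)."""
    server = ServerProxy(url)
    return tool_status(server.supervisor.getProcessInfo(name))

# Offline example of the state mapping alone:
print(tool_status({"statename": "RUNNING"}))  # SUCCESS
```

Unlike the current approach, this asks the process manager directly instead of dispatching a Celery task per check, so a congested or disconnected broker cannot make a healthy service report FAILURE.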
Supervisord can be used in production. However, the application architecture will change on AWS (different services running on different EC2 instances), so we'll probably be using something like http://aws.amazon.com/cloudwatch/.
Thanks @hackdna. We could implement a backend for CloudWatch once we move. I suggest we try to connect directly to supervisord for now and pass the information on to the client through our own API. If someone decides to deploy without supervisord, they need to handle monitoring themselves or use the existing backend. supervisord should be a lot more reliable than our current solution and will also be less of a strain on resources.
Yes, it'd be nice to have a monitoring API in Refinery that could work with different backends but I think there are more important tasks to tackle in the short to medium term.
Closed by removal of ExternalToolStatus backend in e3d4871.
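The pluggable design discussed above could look roughly like this. All names are hypothetical and shown only to illustrate the backend-swapping idea (a supervisord-backed or CloudWatch-backed class would implement the same interface):

```python
from abc import ABC, abstractmethod

class StatusBackend(ABC):
    """Interface for anything that can report an external tool's status."""

    @abstractmethod
    def status(self, tool_name):
        """Return 'SUCCESS', 'FAILURE', or 'UNKNOWN' for the named tool."""

class StaticBackend(StatusBackend):
    """Trivial backend for tests or deployments without monitoring."""

    def __init__(self, statuses):
        self._statuses = statuses

    def status(self, tool_name):
        return self._statuses.get(tool_name, "UNKNOWN")

def tool_status_payload(backend, tool_name):
    """What Refinery's own API endpoint might hand to the client."""
    return {"tool": tool_name, "status": backend.status(tool_name)}

backend = StaticBackend({"celery": "SUCCESS"})
print(tool_status_payload(backend, "celery"))  # {'tool': 'celery', 'status': 'SUCCESS'}
```

Swapping monitoring systems then means providing a new StatusBackend subclass, without touching the API layer or the client.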
I observed that the external tool status for celery switches back and forth between SUCCESS and FAILURE every few seconds (3-5 seconds of wall clock time), even though the application is running fine.
The task statuses are all looking ok, too:
How can this happen?
This was observed on the VM after restarting all tools, while running the node_relationship_ui_feature branch (although I doubt there is any difference from the develop branch with regard to external tool status monitoring).