Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kc data downloads are not working #1347

Closed
tinok opened this issue Jun 19, 2017 · 7 comments
Closed

kc data downloads are not working #1347

tinok opened this issue Jun 19, 2017 · 7 comments
Assignees
Labels
bug Things broken and not working as expected CRITICAL Red alert! Fix me ASAP!

Comments

@tinok
Copy link
Member

tinok commented Jun 19, 2017

Export remains as 'pending' forever.

Seems like celery is down?

@tinok tinok added bug Things broken and not working as expected CRITICAL Red alert! Fix me ASAP! labels Jun 19, 2017
@esmail esmail assigned esmail and jnm Jun 19, 2017
@jnm
Copy link
Member

jnm commented Jun 19, 2017

OCHA-specific issue; resolved. Celery has processed all backlogged jobs.

@jnm jnm closed this as completed Jun 19, 2017
@jnm jnm removed the ready label Jun 19, 2017
@jnm
Copy link
Member

jnm commented Jun 19, 2017

Just documenting some strange behavior I saw while Celery was working to empty the queue. As the backlog was getting down to around 20 tasks, the workers seemed like they were getting lethargic. I restarted them and watched the output; right away a bunch of delete_export tasks came in:

kobocat_1  | [2017-06-19 13:43:53,181: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[295fd9c1-c2dd-40c2-aa44-6bf9d46c5599] eta:[2017-06-19 13:44:00.688790-04:00]
kobocat_1  | [2017-06-19 13:43:53,183: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[77994cae-c8a9-4b6f-acb1-eb832494faa0] eta:[2017-06-19 13:44:01.351635-04:00]
kobocat_1  | [2017-06-19 13:43:53,183: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[001cc6d6-51c3-4100-a123-297461e4d3b2] eta:[2017-06-19 13:44:17.640801-04:00]
kobocat_1  | [2017-06-19 13:43:53,184: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[f034d8d8-88e0-4833-bd93-38d71fbc5d4f] eta:[2017-06-19 13:44:21.204096-04:00]
kobocat_1  | [2017-06-19 13:43:53,185: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[9c51184a-413e-4234-8c96-4dfd654fc464] eta:[2017-06-19 13:45:08.698221-04:00]
kobocat_1  | [2017-06-19 13:43:53,185: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[8a0cccc7-427e-462a-9675-fe03f20ff1db] eta:[2017-06-19 13:45:13.646240-04:00]
kobocat_1  | [2017-06-19 13:43:53,186: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[26a955d2-34f2-46be-94fd-8047b2d32e61] eta:[2017-06-19 13:45:15.798513-04:00]
kobocat_1  | [2017-06-19 13:43:53,187: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[679f79ac-9126-4c92-bee2-5db857740b6c] eta:[2017-06-19 13:45:16.621567-04:00]
kobocat_1  | [2017-06-19 13:43:53,187: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[684f03b5-c62a-4677-86ee-69a98b930e97] eta:[2017-06-19 13:45:58.004685-04:00]
kobocat_1  | [2017-06-19 13:43:53,188: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[a014034f-afb2-44e2-aff7-2efd1db4411f] eta:[2017-06-19 13:46:20.277894-04:00]
kobocat_1  | [2017-06-19 13:43:53,189: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[16b11899-dcad-41f4-9ff3-6e42a74dba94] eta:[2017-06-19 13:46:59.236211-04:00]
kobocat_1  | [2017-06-19 13:43:53,189: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[f809ac31-5342-4e6a-af72-7d36f1285cf2] eta:[2017-06-19 13:47:55.171423-04:00]
kobocat_1  | [2017-06-19 13:43:53,190: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[756dc216-2d99-41b1-a2b0-ccd9225b5392] eta:[2017-06-19 13:47:55.185656-04:00]
kobocat_1  | [2017-06-19 13:43:53,190: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[29469403-6c49-460a-b7aa-c04d1fc9b396] eta:[2017-06-19 13:48:11.817490-04:00]
kobocat_1  | [2017-06-19 13:43:53,190: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[eddb64f8-44db-4f8d-a11d-70249c11a183] eta:[2017-06-19 13:48:15.886463-04:00]
kobocat_1  | [2017-06-19 13:43:53,191: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[cb095515-0d81-4c37-92d2-48aa3ce7a40c] eta:[2017-06-19 13:53:27.213334-04:00]
kobocat_1  | [2017-06-19 13:43:53,191: INFO/MainProcess] Received task: onadata.apps.viewer.tasks.delete_export[03da7c3c-971f-46c5-9a2f-b10adc55dfda] eta:[2017-06-19 13:57:23.721203-04:00]

Each of these then went through a sequence of TaskPool: Apply, Task accepted, and Task...succeeded in.... There were surprising delays between subsequent TaskPool: Applys, during which workers were idle:

Timestamp Task UID Seconds until success Seconds since previous TaskPool: Apply
06/19/17 01:44 PM 295fd9c1-c2dd-40c2-aa44-6bf9d46c5599 0.99
06/19/17 01:44 PM 77994cae-c8a9-4b6f-acb1-eb832494faa0 0.87 0.66
06/19/17 01:44 PM 001cc6d6-51c3-4100-a123-297461e4d3b2 2.1 16.29
06/19/17 01:44 PM f034d8d8-88e0-4833-bd93-38d71fbc5d4f 2.54 3.56
06/19/17 01:45 PM 9c51184a-413e-4234-8c96-4dfd654fc464 1.18 47.49
06/19/17 01:45 PM 8a0cccc7-427e-462a-9675-fe03f20ff1db 2.27 4.95
06/19/17 01:45 PM 26a955d2-34f2-46be-94fd-8047b2d32e61 0.7 2.15
06/19/17 01:45 PM 679f79ac-9126-4c92-bee2-5db857740b6c 1.18 0.82
06/19/17 01:45 PM 684f03b5-c62a-4677-86ee-69a98b930e97 1.84 41.38
06/19/17 01:46 PM a014034f-afb2-44e2-aff7-2efd1db4411f 1.47 22.27
06/19/17 01:46 PM 16b11899-dcad-41f4-9ff3-6e42a74dba94 0.73 38.96
06/19/17 01:47 PM f809ac31-5342-4e6a-af72-7d36f1285cf2 0.59 55.93
06/19/17 01:47 PM 756dc216-2d99-41b1-a2b0-ccd9225b5392 0.68 0.01
06/19/17 01:48 PM 29469403-6c49-460a-b7aa-c04d1fc9b396 2.06 16.63
06/19/17 01:48 PM eddb64f8-44db-4f8d-a11d-70249c11a183 0.8 4.07
06/19/17 01:53 PM cb095515-0d81-4c37-92d2-48aa3ce7a40c 0.93 311.33
06/19/17 01:57 PM 03da7c3c-971f-46c5-9a2f-b10adc55dfda 0.62 236.51

Overall, it took 803.65 seconds to get through these deletion jobs, but only 21.55 seconds were spent doing any work.

@tinok
Copy link
Member Author

tinok commented Jul 5, 2017

@jnm Reopening this bug because the same behavior is present again on KC-PROD as of 7/5.

@jnm
Copy link
Member

jnm commented Jul 5, 2017

Resolved. Systems administration issue. See https://www.flowdock.com/app/kobotoolbox/kobo/threads/OLq-mzfkYsa365DL_szuaa-IIeY

@jnm jnm closed this as completed Jul 5, 2017
@jnm jnm removed the ready label Jul 5, 2017
@chlarsen
Copy link

chlarsen commented Nov 3, 2018

The above link demands privileged access. Could you please elaborate, which sysadmin issues where at play? Thank you!

@tinok
Copy link
Member Author

tinok commented Nov 6, 2018

@chlarsen the issue at the time was that disk space was exhausted on the server, which caused RabbitMQ to stop working. Expanding disk space and restarting RabbitMQ solved this problem.

@chlarsen
Copy link

chlarsen commented Nov 7, 2018

Thank you, Tino, much appreciated. I should look, what is wrong along that line!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Things broken and not working as expected CRITICAL Red alert! Fix me ASAP!
Projects
None yet
Development

No branches or pull requests

5 participants