Skip to content
This repository has been archived by the owner on Jan 29, 2022. It is now read-only.

Processors in Airflow are stuck in "Running" state #812

Open
vitorbaptista opened this issue Apr 27, 2017 · 2 comments
Open

Processors in Airflow are stuck in "Running" state #812

vitorbaptista opened this issue Apr 27, 2017 · 2 comments
Assignees

Comments

@vitorbaptista
Copy link
Contributor

This is an important issue, that is currently breaking our entire data pipeline.

For example, https://airflow.opentrials.net/admin/airflow/log?task_id=processor_nct&dag_id=run_all_processors&execution_date=2017-04-12T10:51:51.861748.

*** Log file isn't local.
*** Fetching here: http://worker-1:8793/log/run_all_processors/processor_nct/2017-04-12T10:51:51.861748

[2017-04-19 19:52:41,451] {models.py:172} INFO - Filling up the DagBag from /usr/local/airflow/dags/run_all_processors.py
[2017-04-19 19:52:49,110] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:49,277] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:49,662] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:50,468] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:50,671] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:50,918] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:51,742] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:51,942] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:52,183] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:54,532] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:54,879] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:55,023] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:56,414] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:56,490] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:56,757] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:52:58,711] {base_task_runner.py:112} INFO - Running: ['bash', '-c', u'airflow run run_all_processors processor_nct 2017-04-12T10:51:51.861748 --job_id 567 --raw -sd DAGS_FOLDER/run_all_processors.py']
[2017-04-19 19:53:08,256] {base_task_runner.py:95} INFO - Subtask: /usr/local/airflow/src/airflow/airflow/configuration.py:128: DeprecationWarning: This method will be removed in future versions.  Use 'parser.read_file()' instead.
[2017-04-19 19:53:08,257] {base_task_runner.py:95} INFO - Subtask:   self.readfp(StringIO.StringIO(string))
[2017-04-19 19:53:19,965] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:19,965] {__init__.py:56} INFO - Using executor CeleryExecutor
[2017-04-19 19:53:21,098] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:21,098] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-04-19 19:53:21,864] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:21,864] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
[2017-04-19 19:53:26,296] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:26,222] {models.py:172} INFO - Filling up the DagBag from /usr/local/airflow/dags/run_all_processors.py
[2017-04-19 19:53:31,567] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:31,553] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:31,859] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:31,859] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:32,076] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:32,063] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:34,393] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:34,392] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:34,833] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:34,833] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:35,210] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:35,208] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:36,258] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:36,258] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:36,481] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:36,480] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:36,737] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:36,737] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:38,290] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:38,261] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:38,668] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:38,668] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:38,871] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:38,871] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:40,266] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:40,266] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:40,813] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:40,811] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:41,437] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:41,437] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 19:53:42,565] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:42,551] {models.py:1146} INFO - Dependencies all met for <TaskInstance: run_all_processors.processor_nct 2017-04-12 10:51:51.861748 [queued]>
[2017-04-19 19:53:42,787] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:42,787] {models.py:1146} INFO - Dependencies all met for <TaskInstance: run_all_processors.processor_nct 2017-04-12 10:51:51.861748 [queued]>
[2017-04-19 19:53:42,787] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:42,787] {models.py:1338} INFO - 
[2017-04-19 19:53:42,787] {base_task_runner.py:95} INFO - Subtask: --------------------------------------------------------------------------------
[2017-04-19 19:53:42,787] {base_task_runner.py:95} INFO - Subtask: Starting attempt 2 of 2
[2017-04-19 19:53:42,820] {base_task_runner.py:95} INFO - Subtask: --------------------------------------------------------------------------------
[2017-04-19 19:53:42,820] {base_task_runner.py:95} INFO - Subtask: 
[2017-04-19 19:53:44,155] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:44,154] {models.py:1362} INFO - Executing <Task(DockerOperator): processor_nct> on 2017-04-12 10:51:51.861748
[2017-04-19 19:53:45,535] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:45,391] {docker_operator.py:132} INFO - Starting docker container from image opentrials/processors:latest
[2017-04-19 19:53:45,535] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:45,409] {docker_operator.py:153} INFO - Pulling docker image opentrials/processors:latest
[2017-04-19 19:53:47,942] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:47,942] {docker_operator.py:156} INFO - Pulling from opentrials/processors
[2017-04-19 19:53:47,943] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:47,943] {docker_operator.py:156} INFO - Digest: sha256:bba09355b8e76bb7086a261efd0da380687e98f8b71ebb9301000a3db454aae5
[2017-04-19 19:53:47,943] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:47,943] {docker_operator.py:156} INFO - Status: Image is up to date for opentrials/processors:latest
[2017-04-19 19:53:53,365] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:53,364] {docker_operator.py:181} INFO - fatal: Not a git repository (or any of the parent directories): .git
[2017-04-19 19:53:53,366] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:53,366] {docker_operator.py:181} INFO - fatal: Not a git repository (or any of the parent directories): .git
[2017-04-19 19:53:53,637] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:53:53,637] {docker_operator.py:181} INFO - python -m processors.base.cli nct
[2017-04-19 19:54:01,570] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:01,569] {docker_operator.py:181} INFO - /usr/local/lib/python2.7/site-packages/dotenv/main.py:24: UserWarning: Not loading .env - it doesn't exist.
[2017-04-19 19:54:01,583] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:01,570] {docker_operator.py:181} INFO - warnings.warn("Not loading %s - it doesn't exist." % dotenv_path)
[2017-04-19 19:54:13,595] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,594] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: cochrane_reviews on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,595] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,595] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: interventions on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,596] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,595] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: organisations on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,596] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,596] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: locations on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,596] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,596] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: knex_migrations on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,597] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,596] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: sources on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,597] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,597] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: risk_of_biases_risk_of_bias_criterias on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,631] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,631] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_publications on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,631] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,631] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_documents on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,632] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,632] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: fda_approvals on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,632] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,632] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: documents on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,632] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,632] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_persons on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,633] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,633] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,668] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,668] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_locations on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,672] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,672] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trial_deduplication_logs on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,673] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,672] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: knex_migrations_lock on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,673] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,673] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: conditions on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,674] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,673] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: files on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,732] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,674] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_conditions on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,734] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,697] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_interventions on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,734] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: risk_of_biases on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,735] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: records on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,748] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: fda_applications on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,748] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: risk_of_bias_criterias on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,748] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: persons on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,748] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,698] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: publications on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,749] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,731] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: trials_organisations on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:13,749] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:13,731] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: document_categories on <Database(postgres://database:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/database)>
[2017-04-19 19:54:19,313] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,313] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: isrctn on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,314] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,314] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: takeda on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,315] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,314] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: cochrane_reviews on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,315] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,315] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: ictrp on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,315] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,315] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: icdpcs on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,332] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,315] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: actrn on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,333] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,315] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: nct on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,333] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,333] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: jprn on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,334] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,333] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: fdadl on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,335] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,333] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: alembic_version on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,335] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,333] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: pfizer on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,422] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,335] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: pubmed on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,422] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,335] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: fda_dap on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,422] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,335] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: euctr on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,423] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,422] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: gsk on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,441] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,423] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: hra on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:19,442] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:19,442] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: icdcm on <Database(postgres://warehouse:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/warehouse)>
[2017-04-19 19:54:20,518] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,517] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: knex_migrations_lock on <Database(postgres://explorer:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/explorer)>
[2017-04-19 19:54:20,518] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,517] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: oauth_credentials on <Database(postgres://explorer:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/explorer)>
[2017-04-19 19:54:20,518] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,518] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: users on <Database(postgres://explorer:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/explorer)>
[2017-04-19 19:54:20,519] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,519] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: data_contributions on <Database(postgres://explorer:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/explorer)>
[2017-04-19 19:54:20,519] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,519] {docker_operator.py:181} INFO - DEBUG dataset.persistence.database: Loading table: knex_migrations on <Database(postgres://explorer:*****@opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com:5432/explorer)>
[2017-04-19 19:54:21,046] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:20,974] {docker_operator.py:181} INFO - DEBUG processors.base.writers.source: Source - updated: ClinicalTrials.gov
[2017-04-19 19:54:54,443] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:54,442] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT00407069'}
[2017-04-19 19:54:54,634] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:54,633] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "e106c50e-b7b5-489f-8a70-fa8c59d2fe01" was matched with record "00001009-e01a-47d0-a1a8-33239685c863" (method: identifiers)
[2017-04-19 19:54:54,634] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:54,633] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT00407069'}
[2017-04-19 19:54:54,790] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:54,790] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "united states" normalized as "United States"
[2017-04-19 19:54:54,981] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:54,979] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: United States
[2017-04-19 19:54:55,160] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:55,156] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Donald Leung, MD, PhD
[2017-04-19 19:54:55,634] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:55,633] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT00167765'}
[2017-04-19 19:54:55,708] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:55,708] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "2d5769d5-13b5-4ed2-b878-6dcc4ddbf711" was matched with record "0000795f-8473-46a5-aaab-f807ebb70ce8" (method: identifiers)
[2017-04-19 19:54:55,709] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:55,709] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT00167765'}
[2017-04-19 19:54:56,174] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,173] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "germany" normalized as "Germany"
[2017-04-19 19:54:56,244] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,243] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: Germany
[2017-04-19 19:54:56,410] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,410] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "switzerland" normalized as "Switzerland"
[2017-04-19 19:54:56,610] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,610] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: Switzerland
[2017-04-19 19:54:56,685] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,684] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Ralf W. Baumgartner, MD
[2017-04-19 19:54:56,957] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:56,939] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Mario Siebler, MD
[2017-04-19 19:54:57,538] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:57,537] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT01776866'}
[2017-04-19 19:54:57,672] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:57,671] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "4089387a-b8b3-4ee1-9f2c-16dc06b7f306" was matched with record "00008ada-0e8b-4cc0-ab4e-f3defc83b61e" (method: identifiers)
[2017-04-19 19:54:57,672] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:57,671] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT01776866'}
[2017-04-19 19:54:57,939] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:57,938] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "china" normalized as "China"
[2017-04-19 19:54:58,094] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,094] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: China
[2017-04-19 19:54:58,344] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,344] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Yang Shengli, M.D
[2017-04-19 19:54:58,518] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,501] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Luo Jianping, M.M
[2017-04-19 19:54:58,651] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,651] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Ma Dongxing, M.D
[2017-04-19 19:54:58,787] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,787] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Liu Ying, M.M
[2017-04-19 19:54:58,923] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:58,913] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Han Wei, M.D
[2017-04-19 19:54:59,147] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:54:59,146] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Jing Limin, B.S.M
[2017-04-19 19:55:00,500] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:00,500] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT02796144'}
[2017-04-19 19:55:00,590] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:00,590] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "bc102f9d-de84-4e81-847b-5980c937d9f9" was matched with record "000095dd-4722-42e5-8521-a31a9ed94b46" (method: identifiers)
[2017-04-19 19:55:00,591] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:00,591] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT02796144'}
[2017-04-19 19:55:02,200] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:02,199] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "united states" normalized as "United States"
[2017-04-19 19:55:02,329] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:02,329] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: United States
[2017-04-19 19:55:02,665] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:02,628] {docker_operator.py:181} INFO - DEBUG processors.base.writers.person: Person - updated: Lars F. Jarskog, MD
[2017-04-19 19:55:03,232] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:03,226] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT01947491'}
[2017-04-19 19:55:03,356] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:03,355] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "95d50f48-a450-4cef-a7f2-36260d83260f" was matched with record "0001a839-3d88-4792-a4b7-d385a22fedbf" (method: identifiers)
[2017-04-19 19:55:03,357] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:03,356] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT01947491'}
[2017-04-19 19:55:03,501] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:03,501] {docker_operator.py:181} INFO - DEBUG processors.base.writers.document: Document - updated: Results
[2017-04-19 19:55:04,605] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:04,605] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "united states" normalized as "United States"
[2017-04-19 19:55:04,703] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:04,702] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: United States
[2017-04-19 19:55:05,350] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:05,349] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Record - updated: {u'nct': u'NCT02712307'}
[2017-04-19 19:55:05,459] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:05,459] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial "9704eec7-dd1b-4af4-b930-fe20cea03a28" was matched with record "00028728-1f23-4c78-bef2-984c3b0da0e6" (method: identifiers)
[2017-04-19 19:55:05,460] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:05,460] {docker_operator.py:181} INFO - DEBUG processors.base.writers.trial_and_record: Trial - updated: {u'nct': u'NCT02712307'}
[2017-04-19 19:55:05,823] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:05,821] {docker_operator.py:181} INFO - DEBUG processors.base.helpers: Location "sweden" normalized as "Sweden"
[2017-04-19 19:55:05,907] {base_task_runner.py:95} INFO - Subtask: [2017-04-19 19:55:05,907] {docker_operator.py:181} INFO - DEBUG processors.base.writers.location: Location - updated: Sweden
[2017-04-19 20:54:13,916] {models.py:172} INFO - Filling up the DagBag from /usr/local/airflow/dags/run_all_processors.py
[2017-04-19 20:54:24,065] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:24,611] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:25,081] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:26,355] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:26,674] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:28,921] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:31,521] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:31,739] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:32,383] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:33,013] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:33,259] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:33,611] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:35,536] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:35,612] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:36,047] {base_hook.py:79} INFO - Using connection to: opentrials.cxglaymtrf0h.eu-west-1.rds.amazonaws.com
[2017-04-19 20:54:40,548] {base_task_runner.py:112} INFO - Running: ['bash', '-c', u'airflow run run_all_processors processor_nct 2017-04-12T10:51:51.861748 --job_id 574 --raw -sd DAGS_FOLDER/run_all_processors.py']
[2017-04-19 20:54:45,688] {jobs.py:2163} WARNING - Recorded pid 163 is not a descendant of the current pid 252

Notice that Airflow runs bash -c "airflow run ..." twice, with the latest one raising an error Recorded pid 163 is not a descendant of the current pid 252. I opened an issue about this on https://issues.apache.org/jira/browse/AIRFLOW-1131, including a small reproducible case.

I have tested locally using the SequentialExecutor and I couldn't reproduce the issue, but I'm not sure if this was because the issue doesn't happen with SequentialExecutor or some problem with how I ran the test.

@vitorbaptista
Copy link
Contributor Author

After more investigation, I confirmed that this issue doesn't happen with the SequentialExecutor, so it can be a workaround while the bug on CeleryExecutor isn't found and fixed. Using the SequentialExecutor means that we won't be able to scale the workers to multiple machines, but that's not a problem for us, as we're using a single machine anyway. The other caveat is that the logs don't seem to be available, but that can be partially solved by #763. We won't be able to follow the log as the task is running, but at least we'll have it after it's finished.

vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Apr 28, 2017
There's a bug with the CeleryExecutor and DockerOperator where long-running
tasks are stuck in a "running" state, even after the Docker container finished.
I reported the bug on https://issues.apache.org/jira/browse/AIRFLOW-1131.

While that's not finished, we use the LocalExecutor as a workaround.

Fixes opentrials/opentrials#812
@vitorbaptista vitorbaptista self-assigned this Apr 28, 2017
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Apr 28, 2017
There's a bug with the CeleryExecutor and DockerOperator where long-running
tasks are stuck in a "running" state, even after the Docker container finished.
I reported the bug on https://issues.apache.org/jira/browse/AIRFLOW-1131.

While that's not finished, we use the LocalExecutor as a workaround.

Fixes opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Apr 28, 2017
It's needed for the Scheduler to be able to run Docker containers (with the
`LocalExecutor`, the scheduler runs the tasks itself).

opentrials/opentrials#812
@vitorbaptista
Copy link
Contributor Author

Unfortunately, this is still happening in production.

@vitorbaptista vitorbaptista reopened this May 17, 2017
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue May 17, 2017
Airflow's DockerOperator has an issue with long-running tasks, where it doesn't
detect their exit status, keeping the tasks stuck in a "running" state even
after they're finished (see
https://issues.apache.org/jira/browse/AIRFLOW-1131). This commit implements the
DockerCLIOperator that instead of using the Docker API, as the DockerOperator
does, uses the Docker CLI executable.

I have tested locally and it seems to work around the issues we've been having.
However, we'll only be sure after testing in production. This commit also
changes the EUCTR processor task to use the new DockerCLIOperator, so we can
try it. If it does work, we'll change the helpers to use DockerCLIOperator
instead of Airflow's DockerOperator.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue May 17, 2017
Airflow's DockerOperator has an issue with long-running tasks, where it doesn't
detect their exit status, keeping the tasks stuck in a "running" state even
after they're finished (see
https://issues.apache.org/jira/browse/AIRFLOW-1131). This commit implements the
DockerCLIOperator that instead of using the Docker API, as the DockerOperator
does, uses the Docker CLI executable.

I have tested locally and it seems to work around the issues we've been having.
However, we'll only be sure after testing in production. This commit also
changes the EUCTR processor task to use the new DockerCLIOperator, so we can
try it. If it does work, we'll change the helpers to use DockerCLIOperator
instead of Airflow's DockerOperator.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue May 17, 2017
Airflow's DockerOperator has an issue with long-running tasks, where it doesn't
detect their exit status, keeping the tasks stuck in a "running" state even
after they're finished (see
https://issues.apache.org/jira/browse/AIRFLOW-1131). This commit implements the
DockerCLIOperator that instead of using the Docker API, as the DockerOperator
does, uses the Docker CLI executable.

I have tested locally and it seems to work around the issues we've been having.
However, we'll only be sure after testing in production. This commit also
changes the EUCTR processor task to use the new DockerCLIOperator, so we can
try it. If it does work, we'll change the helpers to use DockerCLIOperator
instead of Airflow's DockerOperator.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue May 19, 2017
We need to set the server's Docker API version when using the client, otherwise
we'll get an error like:

> Error response from daemon: client is newer than server (client API version:
>    1.24, server API version: 1.23)

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue May 19, 2017
The environment passed to `subprocess.Popen()` needs to:

* Have all values as strings;
* All strings must be byte strings.

We solved this by removing `None` values, converting everything else to string,
and encoding as UTF-8.

The other issue was that we were running Docker passing the env variables like
`--env "FOO=$FOO"`, expecting that the `$FOO` variable would take the contents
of the `FOO` env variable. It wasn't the case. To do so, we had to instead of
running the command directly, running it like `/bin/bash -c "command"`, this
way bash would parse and change the env variables.

The last issue was that we weren't logging the STDERR output, which was fixed
by simply piping it to STDOUT.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Jun 6, 2017
We solved the bug https://issues.apache.org/jira/browse/AIRFLOW-1131 by
creating a new DockerCLIOperator, so we can return using the CeleryExecutor.

Reverts 99106c3.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Jun 6, 2017
The DockerOperator has a bug where it fails to notice when a long-running
container is stopped, keeping the Airflow tasks in "running" state (see
https://issues.apache.org/jira/browse/AIRFLOW-1131). To workaround this, I
wrote the DockerCLIOperator which, instead of using the `docker` Python
bindings, runs Docker directly from a `subprocess`. After a couple weeks
testing with EUCTR, this has been working fine, so now we're using this
operator everywhere.

opentrials/opentrials#812
vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Jul 21, 2017
We also limit the memory usage to 350 MB. We were facing a few containers
starving, so I'm hoping these changes will fix that. I haven't added resource
limits to the "main" containers as well (worker, webserver, etc.) because I
couldn't find out how to do that in the Docker Cloud stack file.

opentrials/opentrials#812
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant