Bug 1401655 - Run clients_daily and experiments_daily #176

Merged
merged 3 commits on Sep 29, 2017

Conversation

mreid-moz
Contributor

This also addresses #155, at least for the main_summary DAG.

This needs to land after the associated PR over in python_mozetl, and I need to test this first, so please hold off on reviewing.

Change the Makefile to pass through environment variables for
AWS credentials, and update the docs to mention running the db
migration target.
@mreid-moz
Contributor Author

I was able to run both tasks locally using the updated instructions / Makefile; however, both tasks exhibited the same strange behaviour:

$ make run COMMAND="test main_summary experiments_daily 20170828"
docker-compose run -e AWS_SECRET_ACCESS_KEY -e AWS_ACCESS_KEY_ID web airflow test main_summary experiments_daily 20170828
Starting telemetryairflow_app_1 ...
Starting telemetryairflow_app_1 ... done
Starting telemetryairflow_redis_1 ... done
Waiting for db to listen on 5432...
Waiting for redis to listen on 6379...
[2017-09-28 23:17:46,044] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-09-28 23:17:46,427] {models.py:168} INFO - Filling up the DagBag from /app/dags
/usr/local/lib/python2.7/site-packages/airflow/utils/helpers.py:406: DeprecationWarning: Importing BashOperator directly from <module 'airflow.operators' from '/usr/local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'> has been deprecated. Please import from '<module 'airflow.operators' from '/usr/local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
  DeprecationWarning)
/usr/local/lib/python2.7/site-packages/airflow/utils/helpers.py:406: DeprecationWarning: Importing ExternalTaskSensor directly from <module 'airflow.operators' from '/usr/local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'> has been deprecated. Please import from '<module 'airflow.operators' from '/usr/local/lib/python2.7/site-packages/airflow/operators/__init__.pyc'>.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
  DeprecationWarning)
[2017-09-28 23:17:47,689] {models.py:1128} INFO - Dependencies all met for <TaskInstance: main_summary.experiments_daily 2017-08-28 00:00:00 [None]>
[2017-09-28 23:17:47,695] {models.py:1128} INFO - Dependencies all met for <TaskInstance: main_summary.experiments_daily 2017-08-28 00:00:00 [None]>
[2017-09-28 23:17:47,696] {models.py:1328} INFO -
--------------------------------------------------------------------------------
Starting attempt 1 of 3
--------------------------------------------------------------------------------

[2017-09-28 23:17:47,698] {models.py:1352} INFO - Executing <Task(EMRSparkOperator): experiments_daily> on 2017-08-28 00:00:00
[2017-09-28 23:17:47,727] {credentials.py:556} INFO - Found credentials in environment variables.
[2017-09-28 23:17:48,337] {connectionpool.py:735} INFO - Starting new HTTPS connection (1): us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:17:49,452] {emr_spark_operator.py:164} INFO - Running Spark Job Experiments Daily with JobFlow ID j-SIO18ZNYPX42
[2017-09-28 23:17:49,452] {emr_spark_operator.py:175} INFO - Logs will be available at: https://console.aws.amazon.com/s3/home?region=us-west-2#&bucket=telemetry-test-bucket&prefix=logs/mreid@mozilla.com/Experiments Daily/j-SIO18ZNYPX42
[2017-09-28 23:17:49,664] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is STARTING
[2017-09-28 23:22:49,770] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:22:50,678] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is BOOTSTRAPPING
[2017-09-28 23:27:50,774] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:27:51,605] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:32:51,781] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:32:52,725] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:37:52,754] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:37:53,665] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:42:53,727] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:42:54,642] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:47:54,665] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:47:55,532] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:52:55,545] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:52:56,427] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-28 23:57:56,530] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-28 23:57:57,593] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-29 00:02:57,661] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-29 00:02:58,555] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-29 00:07:58,621] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-29 00:07:59,401] {emr_spark_operator.py:203} INFO - Spark Job 'Experiments Daily' status' is RUNNING
[2017-09-29 00:12:59,491] {connectionpool.py:238} INFO - Resetting dropped connection: us-west-2.elasticmapreduce.amazonaws.com
[2017-09-29 00:13:00,380] {connectionpool.py:735} INFO - Starting new HTTPS connection (1): us-west-2.elasticmapreduce.amazonaws.com
[2017-09-29 00:13:00,954] {connectionpool.py:735} INFO - Starting new HTTPS connection (1): us-west-2.elasticmapreduce.amazonaws.com
^Cmake: *** [run] Error 1

After a while, both encountered "Resetting dropped connection", then didn't update any further. Meanwhile, the task itself appeared to run to successful completion on EMR (with the clients_daily task getting retried and running successfully a second time).

Also note that I gave all the existing tasks meaningful names in a separate commit before adding the two new tasks.

@sunahsuh
Contributor
Okay, just the one comment from me.

This isn't really about Airflow, but it seems odd to me that experiments_daily takes longer to run than clients_daily; if the logic is similar, the data volume should be much higher for clients_daily.

@@ -30,7 +30,7 @@ redis-cli:
 	docker-compose run redis redis-cli -h redis
 
 run:
-	docker-compose run web airflow $(COMMAND)
+	docker-compose run -e AWS_SECRET_ACCESS_KEY -e AWS_ACCESS_KEY_ID web airflow $(COMMAND)
Contributor
It might be more generally applicable to add the env variables in https://github.com/mozilla/telemetry-airflow/blob/master/docker-compose.yml -- that way the keys will be passed in to all the compose-based commands
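For reference, a minimal sketch of what that could look like (not copied from the actual docker-compose.yml; the service layout here is assumed): listing a variable name under environment with no value tells Compose to take its value from the host shell, the same passthrough behaviour as run -e.

  web:
    environment:
      # No value given: inherit the value from the host environment, if set
      - AWS_ACCESS_KEY_ID
      - AWS_SECRET_ACCESS_KEY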

Contributor Author
Hm, I'm going to punt on this for now, since I'm not sure if we want to automatically pass these through more generally when spinning up the stack. I'll file an issue to follow up.

Contributor Author
Filed #178

@mreid-moz
Contributor Author

Experiments daily currently processes all the experiments data each time (not incrementally by day). Changing it to process incrementally is on my todo list.
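For context, the incremental version would look roughly like the sketch below (untested, and not the actual python_mozetl code; the dataset paths, column names, and function name are placeholders): filter the input down to the single day being processed before aggregating, rather than scanning the whole experiments dataset on every run.

from pyspark.sql import SparkSession

def run_experiments_daily(submission_date):
    # Placeholder sketch: process only one day's partition per run
    spark = SparkSession.builder.appName("experiments_daily").getOrCreate()
    # Hypothetical input path and partition column
    experiments = spark.read.parquet("s3://example-bucket/experiments/v1/")
    one_day = experiments.where(experiments.submission_date_s3 == submission_date)
    daily_counts = one_day.groupBy("experiment_id", "experiment_branch").count()
    # Write the day's output under its own partition so a rerun overwrites only that day
    daily_counts.write.mode("overwrite").parquet(
        "s3://example-bucket/experiments_daily/v1/submission_date_s3=%s" % submission_date
    )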

@sunahsuh
Contributor

Adding a comment for posterity -- the errors are probably fine: we see these a lot in the normal course of a job, and I think that because of the way jobs are normally run (via celery workers) they're resilient to these dropped connections, whereas the test run is not.
