[AIRFLOW-910] Use parallel task execution in backfills #2107

Closed
wants to merge 1 commit into from

Contributor

@bolkedebruin bolkedebruin commented Feb 25, 2017

Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:

The refactor to use dag runs in backfills caused a regression
in task execution performance because dag runs were executed
sequentially. In addition, backfills were non-deterministic
due to the random execution order of tasks, which caused root
tasks to be added to the not-ready list too soon.

This updates the backfill logic as follows:

  • Parallelize execution of tasks
  • Use a leaf-first execution model; breadth-first algorithm by Jeremiah
  • Replace executor-based state updates with task-based updates only
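
To illustrate the first two bullets, here is a rough sketch of the scheduling idea. It is not the code in this PR: `ready_tasks`, `upstream`, and the `(run, task)` tuples are hypothetical names. Task instances from all dag runs are considered together, and on every pass each task whose upstream tasks have finished becomes eligible, so independent runs advance in parallel instead of one run at a time.

```python
def ready_tasks(upstream, runs, done):
    """Return (run, task) pairs whose upstream tasks have all finished.

    upstream: task id -> set of upstream task ids
    runs:     iterable of dag run identifiers (e.g. execution dates)
    done:     set of (run, task) pairs that already succeeded
    """
    ready = []
    for run in runs:
        for task, deps in sorted(upstream.items()):
            if (run, task) in done:
                continue
            if all((run, dep) in done for dep in deps):
                ready.append((run, task))
    return ready


upstream = {"extract": set(), "load": {"extract"}}
runs = ["2017-01-01", "2017-01-02"]

done = set()
waves = []
while True:
    wave = ready_tasks(upstream, runs, done)
    if not wave:
        break
    waves.append(wave)  # everything in one wave could go to the executor at once
    done.update(wave)   # for the sketch, pretend the whole wave succeeded
```

Each wave here contains one task per dag run, which is the point: both runs progress together rather than sequentially.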

Will add tests once the approach is deemed acceptable.

@aoen @saguziel @mistercrunch @jlowin

@mention-bot

@bolkedebruin, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mistercrunch, @plypaul and @aoen to be potential reviewers.

@bolkedebruin bolkedebruin force-pushed the AIRFLOW-910 branch 4 times, most recently from d6a3968 to e299807 Compare February 25, 2017 20:29

codecov-io commented Feb 26, 2017

Codecov Report

Merging #2107 into master will increase coverage by 0.06%.
The diff coverage is 83.59%.

@@            Coverage Diff             @@
##           master    #2107      +/-   ##
==========================================
+ Coverage   67.12%   67.19%   +0.06%     
==========================================
  Files         142      142              
  Lines       10734    10769      +35     
==========================================
+ Hits         7205     7236      +31     
- Misses       3529     3533       +4
Impacted Files                             Coverage Δ
airflow/jobs.py                            73.36% <81.98%> (-0.13%)
airflow/models.py                          86.86% <94.11%> (+0.16%)
airflow/ti_deps/deps/not_running_dep.py    88.88% <0%> (-11.12%)
airflow/executors/dask_executor.py         81.39% <0%> (+2.32%)
airflow/utils/state.py                     100% <0%> (+3.33%)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 201bd92...5e262b7. Read the comment docs.

@@ -3032,6 +3032,25 @@ def get_task_instances(
    def roots(self):
        return [t for t in self.tasks if not t.downstream_list]

    def topographical_sort(self):
Contributor

Should be named topological sort (https://en.wikipedia.org/wiki/Topological_sorting) for consistency

Member

@zodiac that's my fault :)

@@ -3032,6 +3032,25 @@ def get_task_instances(
    def roots(self):
        return [t for t in self.tasks if not t.downstream_list]

    def topology_sort(self):
Contributor

s/topology/topological

@@ -3032,6 +3032,25 @@ def get_task_instances(
    def roots(self):
        return [t for t in self.tasks if not t.downstream_list]

    def topology_sort(self):
        """
        Sorts tasks in topographical order, such that a task comes after any of its
Contributor

topological

        Sorts tasks in topographical order, such that a task comes after any of its
        upstream dependencies.
        """
        stack, visited, sort = [], set(), []
Contributor

@saguziel saguziel Feb 27, 2017

nit: s/sort/sorted_items or something similar

        stack, visited, sort = [], set(), []

        # begin with any tasks that have
        stack.extend(t for t in self.tasks if not t.upstream_list)
Contributor

prefer explicitness with len(t.upstream_list) == 0

Contributor

aoen commented Mar 1, 2017

Would be good to add a test that ensures multiple dagruns are kicked off as appropriate for backfilling.

@bolkedebruin
Contributor Author

@aoen working on that as we speak. Will take me a little while though, due to the requirement of mocking several things. Probably a day or so.

I think I need to add the following to this PR:

  1. Handle the QUEUED state, also after a restart of the backfill, i.e. as we do with the orphaned tasks of a scheduled dag run
  2. Reintroduce some of the earlier executor logic (i.e. verify queued/running tasks in the executor)
  3. Split up some of the logic to make it easier to test separate parts
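
Item 1 could look roughly like the following sketch. It is an assumption about the approach, not code from this PR: after a restart, task instances still marked QUEUED that the executor has no record of are reset so they can be scheduled again. `TaskInstance`, `reset_orphaned`, and the string states are hypothetical stand-ins for the real models.

```python
from dataclasses import dataclass
from typing import Optional

QUEUED = "queued"


@dataclass
class TaskInstance:
    task_id: str
    state: Optional[str] = None


def reset_orphaned(task_instances, known_to_executor):
    """Reset QUEUED task instances the executor has no record of.

    Returns the task ids that were reset so the caller can reschedule them.
    """
    reset = []
    for ti in task_instances:
        if ti.state == QUEUED and ti.task_id not in known_to_executor:
            ti.state = None  # back to "no state" so it is picked up again
            reset.append(ti.task_id)
    return reset


tis = [
    TaskInstance("a", QUEUED),
    TaskInstance("b", QUEUED),
    TaskInstance("c", "success"),
]
reset = reset_orphaned(tis, known_to_executor={"b"})
```

Only task `a` is reset here: `b` is still tracked by the executor and `c` is not queued.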

@bolkedebruin
Contributor Author

@aoen Tests added for concurrent execution. Still WIP.

@bolkedebruin
Contributor Author

@aoen @jlowin @saguziel ready for review. Will squash/rebase after LGTM.

@@ -118,6 +118,33 @@ def test_dag_as_context_manager(self):
        self.assertEqual(dag.dag_id, 'creating_dag_in_cm')
        self.assertEqual(dag.tasks[0].task_id, 'op6')

    def test_dag_topological_sort(self):
Contributor

Seems like tests could be a bit more extensive here (e.g. test empty DAG).

Contributor Author

Will do

@@ -3032,6 +3032,25 @@ def get_task_instances(
    def roots(self):
        return [t for t in self.tasks if not t.downstream_list]

    def topological_sort(self):
        """
Contributor

Nit: Add return docstring to method signature

Contributor Author

Will do

        Sorts tasks in topographical order, such that a task comes after any of its
        upstream dependencies.
        """
        stack, visited, sorted_tasks = [], set(), []
Contributor

Nit: don't think there is an advantage to combining these on the same line here

        # begin with any tasks that have no upstream dependency
        stack.extend(t for t in self.tasks if len(t.upstream_list) == 0)

        while stack:
Contributor

@aoen aoen Mar 6, 2017

There are cases that this wouldn't work on (and we should add it to tests as well).

If a DAG looks like:
A->B
A->C->D
(where A -> B means task A depends on task B)
then your algorithm will have the stack looking like this over time:
B D
D A (INCORRECT! A should not be processed until C is processed).

Also visited is not being used here.

Contributor Author

I assume you mean that A depends on both C and D? @jlowin, as the "owner" of the algorithm, would you mind taking a look?

Contributor

A depends on task C, which depends on task D (A depends on D, but only transitively).

Contributor Author

Gotcha. Good point.

Contributor

Here is an implementation I found online in case you want to use it:
http://blog.jupo.org/2012/04/06/topological-sorting-acyclic-directed-graphs/

Contributor Author

Oh nice! Will update tomorrow. Btw, it seems we also have one in our JavaScript somewhere.
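
For reference, a minimal sketch of a topological sort that handles the A->B, A->C->D case raised above. This is not the code merged in this PR and not the linked blog implementation. It is a Kahn-style sort over a plain dict, and `topological_sort` and the `upstream` mapping are hypothetical names. It assumes every dependency also appears as a key in the mapping.

```python
from collections import deque


def topological_sort(upstream):
    """Return task ids so that every task comes after all of its upstream tasks.

    `upstream` maps each task id to the set of task ids it depends on.
    """
    # Count unmet dependencies per task and build the reverse (downstream) map.
    remaining = {task: len(deps) for task, deps in upstream.items()}
    downstream = {task: [] for task in upstream}
    for task, deps in upstream.items():
        for dep in deps:
            downstream[dep].append(task)

    # Start with tasks that have no upstream dependency.
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    ordered = []
    while ready:
        task = ready.popleft()
        ordered.append(task)
        # A child only becomes ready once *all* of its dependencies are emitted.
        for child in sorted(downstream[task]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(ordered) != len(upstream):
        raise ValueError("graph has a cycle")
    return ordered


# The example from the review: A depends on B and C; C depends on D.
graph = {"A": {"B", "C"}, "B": set(), "C": {"D"}, "D": set()}
order = topological_sort(graph)
```

Because a task is appended only when its dependency counter reaches zero, A cannot be emitted before C, which is the failure described in the review. An empty mapping yields an empty list.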

@bolkedebruin
Contributor Author

@aoen @saguziel @zodiac ready for review: topological sort has been updated, tests improved. Will squash/rebase after LGTM.

@bolkedebruin bolkedebruin force-pushed the AIRFLOW-910 branch 5 times, most recently from 50e98b3 to 3811bfa Compare March 11, 2017 16:40
bolkedebruin added a commit to bolkedebruin/airflow that referenced this pull request Mar 12, 2017
The refactor to use dag runs in backfills caused a regression
in task execution performance because dag runs were executed
sequentially. In addition, backfills were non-deterministic
due to the random execution order of tasks, which caused root
tasks to be added to the not-ready list too soon.

This updates the backfill logic as follows:
* Parallelize execution of tasks
* Use a leaf-first execution model
* Replace executor-based state updates with task-based updates only

Closes apache#2107 from bolkedebruin/AIRFLOW-910
alekstorm pushed a commit to alekstorm/incubator-airflow that referenced this pull request Jun 1, 2017
saguziel pushed a commit to saguziel/incubator-airflow that referenced this pull request Oct 31, 2017
tedmiston added a commit to astronomer/airflow that referenced this pull request Mar 13, 2018
commit e42cec6b50bc2930db165eabe2913f05cba6418a
Author: Courtney Wurtz <courtney.wurtz@gmail.com>
Date:   Wed Jan 3 11:52:20 2018 -0500

    Pulled flask_wtf.csrf import into separate file

    Plugins using blueprints that need access to the csrf object can hit a problem with circular imports.  When the scheduler loads the configured executor, it will import all the plugins.  If the plugin imports airflow.www.app, it will import airflow.jobs.  Airflow jobs can import the executor.

    By pulling the csrf to its own file that a plugin and www.app can import, it will prevent any circular reference.

    #astro

commit cd66333b8fb02dabb8dffbd03437372d0436b57f
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Wed Dec 20 08:25:35 2017 -0500

    hotfix for pandas package bad version

commit 0eb7862730c68d25ebbabf1988d66d50dd988bb0
Merge: 12e9dcfb 90d5aaa7
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Mon Oct 9 07:45:41 2017 -0400

    Merge pull request #16 from astronomerio/Refactor-S3-Hook-To-Use-Login-and-Password-For-Access-Keys

    Add S3Hook login and password support for keys

commit 12e9dcfb9688588d391ed6530b81a3274fe56220
Author: Greg Brunk <greg.brunk@gmail.com>
Date:   Wed Oct 4 17:48:23 2017 -0600

    Astronomerize the UI (#18)

    * Astronomer-izes the airflow theme

    * Removes `+ Astronomer`

    * Revert "Removes `+ Astronomer`"

    This reverts commit e097b3e7c4441bff014446c37b0dd0e29c7d4dda.

    * Revert "Astronomer-izes the airflow theme"

    This reverts commit 4e0b1072db2bfc2ee2af2d2e8adf7ba6a21de23d.

    * Revert "Merge branch 'astronomer-fixes-182' into astronomerize-ui-gbrunk"

    This reverts commit 60e2a95e16e7faead923913805ee9d7e09d31576, reversing
    changes made to e097b3e7c4441bff014446c37b0dd0e29c7d4dda.

    * Revert trailing whitespace stripper changing license

commit 90d5aaa72b686aaeca395675e5552c440b42ae32
Author: Taylor Edmiston <tedmiston@gmail.com>
Date:   Wed Oct 4 18:45:39 2017 -0400

    Simplify creds in conn

commit 0039bf4990aa6d67f33cb89875759601954f3f4e
Author: Taylor Edmiston <tedmiston@gmail.com>
Date:   Wed Oct 4 18:40:31 2017 -0400

    Refactor S3Hook connection support for AWS keys

    Adds support for providing access + secret via login + password as a fallback to the default approach in extra_params.

commit 9454f40d4ab0cdf5934aed36796960f71a8131a3
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Wed Sep 27 11:07:47 2017 -0400

    Add S3Hook login and password support for keys

commit 55eb493861a4ee8441a87f3ab23ffc5d8efb9621
Merge: 50cd0a79 8eedb9d2
Author: Joshua Thompson <Jethom18@gmail.com>
Date:   Mon Sep 25 13:42:33 2017 -0400

    Merge pull request #15 from astronomerio/clamp-snowflake-version

    Clamp snowflake version to 1.4.1

commit 8eedb9d2136607b72c8ee375d539a783e23c3e7c
Author: Joshua Thompson <Jethom18@gmail.com>
Date:   Fri Sep 22 12:24:35 2017 -0400

    Clamp snowflake version to 1.4.1

commit 50cd0a79ea02a86fa08fb3c126b710cb924377fb
Merge: bb183852 419313da
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Thu Sep 21 13:11:54 2017 -0400

    Merge pull request #14 from astronomerio/add-snowflake-dependency

    Add snowflake dependency

commit 419313da9e090429bb994393b972d444e7650987
Author: Joshua Thompson <Jethom18@gmail.com>
Date:   Thu Sep 21 12:42:46 2017 -0400

    Add snowflake dependency

commit bb1838529cd8c317aadc023f5eb18eded5c8da6c
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Wed Sep 20 16:26:37 2017 -0400

    Add to extra_requires

commit 231bb971990979c8b1103309d045b2eb7c2f3f7c
Merge: f8b1353e db958b3f
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Wed Sep 20 14:53:38 2017 -0400

    Merge pull request #13 from astronomerio/adding-pymongo-dependency

    Add pymongo dependency to setup.py

commit db958b3fbd78eafa52ae739f361477d321cda21a
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Wed Sep 20 14:50:38 2017 -0400

    Add pymongo dependency to setup.py

commit f8b1353ece51c22d3d0049d47ce4adddbc75d08f
Merge: 453ac1bb e9fde86c
Author: Mike <perryao@users.noreply.github.com>
Date:   Wed Sep 20 12:56:53 2017 -0400

    Merge pull request #12 from astronomerio/306-fixing-templating-to-support-mixed-attr-dicts

    Adding support for dictionary int attr in templating

commit e9fde86c17089f78da58aa7f284f21cabc0fa622
Author: Andy Cooper <andycooper.s@gmail.com>
Date:   Mon Sep 18 16:59:15 2017 -0400

    Adding support for dictionary int attr in templating

commit 453ac1bb34937ddd00c3bf0c1db73ac1ccc6a5d3
Author: Courtney Wurtz <courtney.wurtz@gmail.com>
Date:   Fri Aug 25 12:08:37 2017 -0400

    Fixed first session login persistence issue

    The first time a user logs into Airflow, it fails to persist the session. This causes the login form to silently reload, leading a user to question wtf just happened.

    The cause is the user_id is not populated from the auto-increment id after it is inserted into the database. I don’t have the time right now to investigate what is happening inside flask/alchemy to cause this, so I just did a quick fix of querying the database after committing the changes. Since this only happens the first time an email is used to log in, the extra load is basically irrelevant.

    On a side note, I tried doing a “session.refresh(user)” after the “session.commit”, however this gives an error of “Instance '<User at 0x110f872b0>' is not persistent within this Session”, so it just redoes the query that would happen if a user logs in with a user that already exists in the airflow database.

    Resolves https://github.com/astronomerio/engineering/issues/163

commit bdae8970dbc6f17f6f80fae52123a82455f7d227
Author: Courtney Wurtz <courtney.wurtz@gmail.com>
Date:   Fri Aug 18 17:56:57 2017 -0400

    Added organization id to auth API call

    Now adds the organization id and verifies the user has access to that organization, otherwise auth fails.

commit c7bad247c086ea1d1f4f3446ffcbcbcfd3bcebbf
Author: Courtney Wurtz <courtney.wurtz@gmail.com>
Date:   Tue Aug 15 09:40:16 2017 -0400

    Updated how endpoint string is generated

    Also removed unneeded todo comments

commit 17f72ce648eddcd93fee9c5e80863bf8aa439bc2
Author: Courtney Wurtz <courtney.wurtz@gmail.com>
Date:   Mon Aug 14 14:22:31 2017 -0400

    Added auth backend for Astronomer, which allows users to login to Airflow webserver with their Astronomer login

commit fbcfce967018ef350eafa35f67404ad3e28c34a6
Author: Mike Perry <mike@astronomer.io>
Date:   Wed Jul 5 14:00:41 2017 -0400

    DagFileProcessor terminates its child process when done

commit df75ea78a9581007b44eef6303fe01aa0e1d9575
Author: Mike Perry <mike@astronomer.io>
Date:   Mon Jun 26 15:41:56 2017 -0400

    ignore catchup for dags with @once schedules

commit ca0c4ce7f50328a869d5ec4f7b13ea76eac4cbeb
Author: Mike Perry <mike@astronomer.io>
Date:   Mon May 15 14:44:56 2017 -0400

    fix for utf8 logging in dockeroperator

commit f5b8673417e1fd15385eab5f5d7afc6522a7cf5c
Author: Mike Perry <mike@astronomer.io>
Date:   Thu Apr 13 11:39:40 2017 -0400

    implemented lazy import of the default executor. temporary commit. see: https://github.com/apache/incubator-airflow/pull/2120

commit 6636e7bc77f8283f40a828524ff4d7fe727d61cb
Author: Mike Perry <mike@astronomer.io>
Date:   Fri Dec 9 16:55:46 2016 -0500

    DockerOperator can optionally remove a container on exit

commit 9de3ffc56496aa812d9477c25afbad3b348759a5
Author: Mike Perry <mike@astronomer.io>
Date:   Mon Nov 21 17:45:48 2016 -0500

    updated DockerOperator initializer to accept a privileged param for governing whether the container runs in privileged mode

commit 32a26d84b679a54add43092d0bdb77350dcbaeaf
Author: Maxime Beauchemin <maxime.beauchemin@apache.org>
Date:   Mon Aug 7 21:57:29 2017 -0700

    Set version 1.8.2rc2 -> 1.8.2

commit 0be35d6280cd3e2a64b02989de4a3d99e24f0989
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Mon Jul 31 15:48:41 2017 -0700

    [AIRFLOW-1476] add INSTALL instruction for source releases

commit 302520828cdce65a4efa65d495bb0b5d05b35069
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Thu Jun 22 15:26:14 2017 -0700

    Updating CHANGELOG for 1.8.2rc2

commit 9a53e66390670c7a9e4206f0a3ef4a19c1baae72
Author: Chad Henderson <chenderson@gritlogic.com>
Date:   Sun Feb 19 10:03:18 2017 +0100

    [AIRFLOW-809][AIRFLOW-1] Use __eq__ ColumnOperator When Testing Booleans

    The .is_ ColumnOperator causes SqlAlchemy's MSSQL dialect to produce
    IS 0 when given a value of False rather than a value of None. The
    __eq__ ColumnOperator does this same test with the added benefit that
    it will modify the resulting expression from an == to an IS NULL when
    the target is None.

    This change replaces all is_ ColumnOperators that are doing boolean
    comparisons and leaves all is_ ColumnOperators that are checking for
    None values.

    Closes #2022 from gritlogic/AIRFLOW-809

commit 3975d3dadc35a7f1606976edb21c145354905993
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Thu Jun 22 14:52:52 2017 -0700

    Bump to 1.8.2rc2

commit f58cfa3e1a98b613625b3f9c3e91cb75ba4e85ca
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Wed Jun 21 15:14:12 2017 -0700

    Update CHANGELOG for 1.8.2rc

commit 333e0b3ebf79388f534ca52a89c5ae04e804f804
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Jun 21 10:12:09 2017 +0200

    [AIRFLOW-1296] Propagate SKIPPED to all downstream tasks

    The ShortCircuitOperator and LatestOnlyOperator did not mark
    all downstream tasks as skipped, but only direct downstream
    tasks.

    Closes #2365 from bolkedebruin/AIRFLOW-719-3

    (cherry picked from commit a45e2d1888ffb19dab8401e07b10724090bf20f0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
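
A minimal sketch of the idea behind AIRFLOW-1296, assuming a plain dict of direct downstream task ids. The names `all_downstream` and `downstream` are hypothetical and not Airflow APIs. The point is that skipping has to walk the whole downstream subgraph rather than stop at the direct children.

```python
def all_downstream(downstream, start):
    """Return every task reachable from `start` via downstream edges."""
    seen = set()
    stack = list(downstream.get(start, ()))
    while stack:
        task = stack.pop()
        if task in seen:
            continue
        seen.add(task)
        stack.extend(downstream.get(task, ()))
    return seen


downstream = {
    "short_circuit": ["b"],
    "b": ["c", "d"],
    "c": ["e"],
    "d": ["e"],
}
# Mark the full downstream subgraph as skipped, not just task "b".
states = {task: "skipped" for task in all_downstream(downstream, "short_circuit")}
```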

commit 93825d50797fb3f04ba730c4d0868132d0bec8df
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Jun 16 08:41:54 2017 -0400

    Re-enable caching for hadoop components

    (cherry picked from commit fb21bcbcc1ffaaf78fde2e0d9a9b1414c346ec51)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 33a9dcbb673713aecdca6febf04ab422c447ed95
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Jun 14 18:19:10 2017 -0400

    Pin Hive and Hadoop to a specific version and create writable warehouse dir

    (cherry picked from commit 38b2747c5b50afc5f21af5b44e8a0ccf9a440559)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 7cff6cde4b50eb96595bb28e05b2fce99752abbf
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Jun 15 09:44:16 2017 -0400

    [AIRFLOW-1308] Disable nanny usage for Dask

    Nanny is deprecated and results in build errors.

    Closes #2366 from bolkedebruin/fix_dask

    (cherry picked from commit 10826711846f06476d343b150412105489096179)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit c6a09c47e6402a05d2477e6431c6d717e2b5a3ba
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Tue Jun 13 08:25:28 2017 -0700

    Updating CHANGELOG for 1.8.2rc1

commit 570b2ed3ef01123dace11b620b4fcafde3bcd8b8
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Tue Jun 13 08:21:26 2017 -0700

    [AIRFLOW-1294] Backfills can loose tasks to execute

    In backfills we can lose tasks to execute because a task sets
    its own state to NONE if concurrency limits are reached. This
    makes them fall outside the scope the backfill is managing,
    hence they will not be executed.

    Dear Airflow maintainers,

    Please accept this PR. I understand that it will
    not be reviewed until I have checked off all the
    steps below!

    ### JIRA
    - [X] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
    issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
        - https://issues.apache.org/jira/browse/AIRFLOW-1294

    ### Description
    - [X] Here are some details about my PR, including
    screenshots of any UI changes:

    In backfills we can lose tasks to execute because a task sets
    its own state to NONE if concurrency limits are reached. This
    makes them fall outside the scope the backfill is managing,
    hence they will not be executed.

    ### Tests
    - [X] My PR adds the following unit tests __OR__
    does not need testing for this extremely good
    reason:
    Should be covered by current tests, will adjust if
    required.

    ### Commits
    - [X] My commits all reference JIRA issues in their subject lines,
    and I have squashed multiple commits if they address the same issue.
    In addition, my commits follow the guidelines from
    "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
        1. Subject is separated from body by a blank line
        2. Subject is limited to 50 characters
        3. Subject does not end with a period
        4. Subject uses the imperative mood ("add", not "adding")
        5. Body wraps at 72 characters
        6. Body explains "what" and "why", not "how"

    mistercrunch aoen saguziel This is a simplified fix that should be
    easier to digest in 1.8.2. It does not address all underlying issues
    as in https://github.com/apache/incubator-airflow/pull/2356, but
    those can be addressed separately and in smaller bits.

    Closes #2360 from bolkedebruin/fix_race_backfill_2

commit 3f48d48e1b76f7db9db0db11b4f247d10f7e4006
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Mon Jun 12 10:47:56 2017 -0700

    [AIRFLOW-1291] Update NOTICE and LICENSE files to match ASF requirements

    JIRA:
    https://issues.apache.org/jira/browse/AIRFLOW-1291

    * Update NOTICE with proper year range for ASF
    copyright
    * Break down LICENSE into
    licenses/LICENSE-[project].txt
    * add license header to jqClock.min.js

    fix license check

    Closes #2354 from mistercrunch/copyright_license_touchups

commit a2dd2465ae37cd6bba20d5f127a29be0c5080fc4
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Wed Jun 7 21:37:28 2017 -0700

    Update CHANGELOG for 1.8.2rc1

commit 7f7c52fe4f4ab93f9b073c69e83836caee4fb023
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Wed Jun 7 21:35:51 2017 -0700

    [AIRFLOW-XXX] Set version to 1.8.2rc1

commit b0245e4a3dacf665e26446acafb3e80f558a83e0
Author: Connor Ameres <connorameres@gmail.com>
Date:   Mon May 1 23:22:04 2017 +0200

    [AIRFLOW-1160] Update Spark parameters for Mesos

    Closes #2265 from cameres/master

commit 9744b1a8d8223fb713156eacbcaf07456d6d43c3
Author: Niels Zeilemaker <nielszeilemaker@godatadriven.com>
Date:   Sat Apr 29 17:14:40 2017 +0200

    [AIRFLOW 1149][AIRFLOW-1149] Allow for custom filters in Jinja2 templates

    Closes #2258 from NielsZeilemaker/jinja_custom_filters

commit 07ebc46ea7237f554477f83575ec65df95b321f0
Author: Thomas Hofer <thofer@zendesk.com>
Date:   Tue Apr 25 11:31:31 2017 +0200

    [AIRFLOW-1119] Fix unload query so headers are on first row[]

    Closes #2245 from th11/airflow-1119-fix

commit 34f072a1716350ec39464d73282127d08b83582c
Author: Stephan Werges <swerges@accertify.com>
Date:   Tue Apr 25 11:28:31 2017 +0200

    [AIRFLOW-1089] Add Spark application arguments

    Allows arguments to be passed to the Spark application being
    submitted. For example:

    - spark-submit --class foo.Bar foobar.jar arg1 arg2
    - spark-submit app.py arg1 arg2

    Closes #2229 from camshrun/sparkSubmitAppArgs

commit 156e90b5c176633588fa7e5021de0185b583b8fc
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Fri Apr 7 19:26:35 2017 +0200

    [AIRFLOW-1078] Fix latest_runs endpoint for old flask versions

    Old versions of flask (<0.11) don't support jsonify on arrays due to
    an ECMAScript 4 vulnerability in older browsers. This should work on
    old flask versions as well.

    Closes #2224 from saguziel/aguziel-fix-homepage

commit 70024935f24e0ff3d2861c0ccfa69cdd38084b9d
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Wed Apr 12 11:56:03 2017 -0700

    [AIRFLOW-1074] Don't count queued tasks for concurrency limits

    There may be orphaned tasks queued but not in a running dag run that
    will not be cleared. We should not count these as they will interfere.

    I hate to do this, but I changed my mind on
    counting queued tasks.

    1. Queued tasks that are actually queued generally
    get set to running pretty quickly.
    2. Because of the worker-side check, we won't
    actually pass concurrency.

    I don't think the queued thing is a big deal because of this. I'm
    more worried about orphaned tasks that are in QUEUED state but not in
    a running dag_run (so they won't get reset) interfering with
    concurrency.

    Closes #2221 from saguziel/aguziel-concurrency-2

commit 708e8ad7bbaf865ca850bbcdfcfd89b914a09dea
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Tue Apr 4 17:19:43 2017 +0200

    [AIRFLOW-1064] Change default sort to job_id for TaskInstanceModelView

    The TaskInstanceModelView default sort column is on an unindexed
    column. We shouldn't need an index on start_date, and job_id is just
    as logical a default sort.

    Closes #2215 from saguziel/aguziel-fix-ti-page

commit c39d91537bb8c28cc81aad3e26eff72158e3b1f4
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Fri Mar 24 11:51:39 2017 -0700

    [AIRFLOW-1038] Specify celery serialization options explicitly

    Specify the CELERY_TASK_SERIALIZER and CELERY_RESULT_SERIALIZER as
    pickle explicitly, and CELERY_EVENT_SERIALIZER as json.

commit 258baf0ff0af9529388a464174dc703d0ad48f5b
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Sat Apr 29 17:11:58 2017 +0200

    [AIRFLOW-1036] Randomize exponential backoff

    This prevents the thundering herd problem. Using a combination of
    dag_run, task_id, and execution_date makes this random with respect
    to task instances, while still being deterministic across machines.
    The retry delay is within a range that doubles in size.

    Closes #2262 from saguziel/aguziel-random-exponential-backoff
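
The idea in AIRFLOW-1036 can be sketched as follows. This is an illustration under assumptions, not the Airflow implementation: the hash function, the exact range, and the name `retry_delay` are hypothetical. The delay is derived from a hash of the task instance identity, so it differs between task instances but is identical on every machine, and the range it falls in doubles with each try.

```python
import hashlib


def retry_delay(dag_id, task_id, execution_date, try_number, base_seconds=60):
    """Deterministic, jittered exponential backoff delay in seconds.

    Assumes try_number starts at 1.
    """
    key = "{}#{}#{}#{}".format(dag_id, task_id, execution_date, try_number)
    digest = int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)
    # The window doubles with every try: [base, 2*base), [2*base, 4*base), ...
    low = base_seconds * 2 ** (try_number - 1)
    return low + digest % low


d1 = retry_delay("my_dag", "my_task", "2017-04-29", 1)
d2 = retry_delay("my_dag", "my_task", "2017-04-29", 2)
```

Because the jitter comes from a hash rather than a random number generator, a scheduler and a worker computing the delay for the same task instance agree on the value.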

commit 650399590b06b11b67bbda699947574e50faed38
Author: Jeremiah Lowin <jlowin@apache.org>
Date:   Sat May 13 14:53:08 2017 +0200

    [AIRFLOW-993] Update date inference logic

    DAGs should set task start_date and end_date when
    possible, making sure
    they agree with the DAG’s own dates.

    Closes #2157 from jlowin/run-bug

commit c5d6c3a3cbd5ad28470e2d7e8434f439aae14a75
Author: Niels Zeilemaker <nielszeilemaker@godatadriven.com>
Date:   Fri May 5 09:17:40 2017 +0200

    [AIRFLOW-1167] Support microseconds in FTPHook modification time

    Closes #2268 from NielsZeilemaker/fix-ftp-hook

commit 1b2b34e6a0b44b5a686cae6fa6e4014705769d67
Author: Niels Zeilemaker <nielszeilemaker@godatadriven.com>
Date:   Tue May 9 09:42:32 2017 -0700

    [AIRFLOW-1179] Fix Pandas 0.2x breaking Google BigQuery change

    Closes #2279 from NielsZeilemaker/AIRFLOW-1179

commit fc5fe5cc4e3b27058c777ff6036f5ea86f30adbd
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Mon Jun 5 16:32:42 2017 -0700

    [AIRFLOW-1263] Dynamic height for charts

    Dynamic heights for webserver charts so that longer task names fit

    Closes #2344 from aoen/ddavydov--dynamic_chart_heights

commit 6600faf03a99eeb68138c88539bf847bdea2f32c
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Mon Jun 5 16:31:42 2017 -0700

    [AIRFLOW-1266] Increase width of gantt y axis

    Increase the width of the gantt view y axis to accommodate larger
    task names.

    Closes #2345 from aoen/ddavydov--increase_width_of_gantt_y_axis

commit f8f4e605c9adc1edac414178d0cd5b3fc8f49adc
Author: Maxime Beauchemin <maximebeauchemin@gmail.com>
Date:   Wed Jun 7 13:54:00 2017 -0700

    [AIRFLOW-1290] set docs author to 'Apache Airflow'

commit 9627969e68cba74c1622139fa08e6b905fbe64b8
Author: Chris Riccomini <criccomini@apache.org>
Date:   Tue May 9 13:14:50 2017 -0700

    [AIRFLOW-XXX] Updating CHANGELOG, README, and UPDATING after 1.8.1 release

    # Conflicts:
    #	UPDATING.md

commit 829a18a4b5d0e767b7253e73c8b821e339db010f
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Wed Jun 7 11:43:13 2017 +0200

    [AIRFLOW-1282] Fix known event column sorting

    Closes #2350 from skudriashev/airflow-1282

    (cherry picked from commit a52123dcaaf8e833c695e7fca12cf2a5dd31f291)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit cc2b4cb22820da922acf6d26650a20655c27adc2
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Jun 7 09:16:51 2017 +0200

    [AIRFLOW-1166] Speed up _change_state_for_tis_without_dagrun

    _change_state_for_tis_without_dagrun was locking a
    significant
    number of tasks unnecessarily. This could end up in
    a deadlock
    in the database due to the time the lock was held.

    Closes #2267 from bolkedebruin/fix_deadlock

    (cherry picked from commit 4764646b18f56c34a35c19bd20a1931eb3a844fe)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 6f9e535bb0dd4401217e96dcf3fa4982ea48dc48
Author: Sumit Maheshwari <sumeet.manit@gmail.com>
Date:   Wed Jun 7 09:09:50 2017 +0200

    [AIRFLOW-1192] Some enhancements to qubole_operator

    1. Upgrade qds_sdk version to latest
    2. Add support to run Zeppelin Notebooks
    3. Move out initialization of QuboleHook from
    init()

    Closes #2322 from msumit/AIRFLOW-1192

    (cherry picked from commit 6be02475f80c2f493e640272ab5344ca686204a0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit f65ae9a70d990fd74d9252ab27cbed9954a36363
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Wed Jun 7 09:04:21 2017 +0200

    [AIRFLOW-1281] Sort variables by key field by default

    Closes #2347 from skudriashev/airflow-1281

    (cherry picked from commit 6b890d157c402bbbeddc69ac4ad86b22889f813b)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit ec1657556abf72ca92203bc4269632060f5a4242
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Mon Jun 5 21:48:32 2017 +0200

    [AIRFLOW-1244] Forbid creation of a pool with empty name

    Closes #2324 from skudriashev/airflow-1244

    (cherry picked from commit df9a10b26fda546d0e8124f3d5cd9aefa6c0a81f)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit fd6d57e986b02c9ca0e979322bcae4d36f81c0a4
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Mon May 29 17:18:36 2017 +0200

    [AIRFLOW-1243] DAGs table has no default entries to show

    Closes #2323 from skudriashev/airflow-1243

    (cherry picked from commit 7e6e84385401f8c001b9408571afce9467568ca2)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 7b861ac42c0d5b3dedd25770057d3512b2cb1db9
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Thu May 18 20:06:03 2017 +0200

    [AIRFLOW-1227] Remove empty column on the Logs view

    Closes #2310 from skudriashev/airflow-1227

    (cherry picked from commit 917adbdbadaa6bd30bc3ca5cd3bbc137db305f99)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 37cd7ce1d5754f2379b12565ee9435d872658007
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Thu May 18 20:04:47 2017 +0200

    [AIRFLOW-1226] Remove empty column on the Jobs view

    Closes #2309 from skudriashev/airflow-1226

    (cherry picked from commit 37d2f7dd2a57c1a183a741f567ecfaaf4e0527c0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 2e3fc7ecf0311912e9b92487410ceccbd772c5a2
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Wed May 17 20:55:19 2017 +0200

    [AIRFLOW-1199] Fix create modal

    Closes #2293 from skudriashev/airflow-1199

    (cherry picked from commit f9ffbbd918d1e1093f0d2823edea48126bf35c65)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit e2456f07c8da00c3daef5e6a2a56611ccfc48cf9
Author: Stanislav Kudriashev <stas.kudriashev@gmail.com>
Date:   Wed May 17 20:53:50 2017 +0200

    [AIRFLOW-1200] Forbid creation of a variable with an empty key

    Closes #2299 from skudriashev/airflow-1200

    (cherry picked from commit 3acfa048a88342ca059afd9329e7ba4cf1af0929)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 17233ea77998ddf587bd2269487b73ad80d75b84
Author: Ignasi Peiró <ignasi.peiro@gmail.com>
Date:   Tue May 16 10:30:21 2017 +0200

    [AIRFLOW-1186] Sort dag.get_task_instances by execution_date

    task.get_task_instances is sorted by
    execution_date, so we sort
    dag.get_task_instances by execution_date so it
    doesn't break
    duration chart

    Closes #2284 from OpringaoDoTurno/fix-duration

    (cherry picked from commit b87d3a49054f3890cf6714e7c4920973754b810e)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 9e6dfcbdb4a4bbd63a8683778e3b3dafe149150c
Author: julien-gm <julien-gm@users.noreply.github.com>
Date:   Sat May 13 21:10:13 2017 +0200

    [AIRFLOW-1145] Fix closest_date_partition function with before set to True

    If we're looking for the closest date before, we should take the latest date in the list of dates before.

    Closes #2257 from julien-gm/fix_closest-date-
    partition

    (cherry picked from commit 0da540bf7840ae3cac866e352fba2b8b5cd9a625)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
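
A minimal sketch of the rule described above, assuming a hypothetical helper `closest_date` (the real function in Airflow may have a different name and signature): when `before` is true, only dates on or before the target are considered, and the latest of those is returned.

```python
from datetime import date


def closest_date(target, candidates, before=None):
    """Return the candidate closest to `target`, or None if none qualifies.

    before=True  -> only dates on or before the target (so the latest wins)
    before=False -> only dates on or after the target (so the earliest wins)
    before=None  -> any date
    """
    if before is None:
        pool = list(candidates)
    elif before:
        pool = [d for d in candidates if d <= target]
    else:
        pool = [d for d in candidates if d >= target]
    if not pool:
        return None
    return min(pool, key=lambda d: abs(target - d))
```

With candidates 1, 5 and 10 January and a target of 7 January, `before=True` selects 5 January and `before=False` selects 10 January.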

commit 0db3249df516d4fbc60d47a25000fcbae45185af
Author: Kengo Seki <sekikn@apache.org>
Date:   Sat May 13 21:03:51 2017 +0200

    [AIRFLOW-1180] Fix flask-wtf version for test_csrf_rejection

    For now, SecurityTests.test_csrf_rejection fails
    because flask-wtf version specified in setup.py is
    too old.
    This PR fixes it.

    Closes #2280 from sekikn/AIRFLOW-1180

    (cherry picked from commit ae61987945774ea2f9df53eaedac87fb378234a4)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 803e3494d0c4eaf7e090648228170b7f867df1f7
Author: Niels Zeilemaker <nielszeilemaker@godatadriven.com>
Date:   Sat May 13 14:49:05 2017 +0200

    [AIRFLOW-1170] DbApiHook insert_rows inserts parameters separately

    Instead of creating a sql statement with all
    values, we send the values
    separately to prevent sql injection

    Closes #2270 from NielsZeilemaker/AIRFLOW-1170

    (cherry picked from commit 3b589a9f73bed018bf7e2c7b7265bfce5da91ca0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
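
To illustrate what sending the values separately means, here is a sketch using the standard-library `sqlite3` driver. The function below is a stand-in, not the DbApiHook code; the `?` placeholder is sqlite3's parameter style and other drivers use a different marker.

```python
import sqlite3
from contextlib import closing


def insert_rows(conn, table, rows):
    """Insert rows, passing the values as bound parameters.

    The table name is still interpolated into the statement, so it must come
    from trusted code, never from user input.
    """
    with closing(conn.cursor()) as cur:
        for row in rows:
            placeholders = ", ".join("?" for _ in row)
            # The driver receives the SQL text and the values separately,
            # so a value can never be parsed as part of the statement.
            cur.execute(f"INSERT INTO {table} VALUES ({placeholders})", tuple(row))
    conn.commit()
```

A value such as `a'); DROP TABLE t; --` is then stored as an ordinary string instead of being executed.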

commit d5b0f2a340d455ebbed1aad1d4dd29286c26a83a
Author: Giovanni Lanzani <gglanzani@users.noreply.github.com>
Date:   Fri May 12 11:39:24 2017 +0200

    [AIRFLOW-1150] Fix scripts execution in sparksql hook

    When using the SparkSqlOperator and submitting
    a file (ending with
    `.sql` or `.hql`), a whitespace needs to be
    appended, otherwise a Jinja
    error will be raised.

    However the trailing whitespace confused the hook
    as those files will
    not end with `.sql` and `.hql`, but with `.sql `
    and `.hql `. This PR
    fixes this.

    In the test, I've added the `get_after` function
    to easily check if the
    path is really stripped or not by the `-f` option.

    Closes #2259 from gglanzani/master

    (cherry picked from commit f97bc5ed3dc42f975776175f7a269c0604f49123)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
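
The check described above might look roughly like this. The helper name `build_sql_args` is hypothetical, and the use of `-e` for inline queries is an assumption; only the `-f` option is mentioned in the commit message.

```python
def build_sql_args(sql):
    """Choose between a script file and an inline query.

    Surrounding whitespace is stripped first, so "query.sql " is still
    recognised as a script file.
    """
    stripped = sql.strip()
    if stripped.endswith((".sql", ".hql")):
        return ["-f", stripped]
    return ["-e", stripped]
```

Stripping before the suffix test is the whole fix: `"query.sql "` and `"query.sql"` now take the same branch.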

commit 314b2deb6aacf7e1891afd79b51a94a2941bc1ec
Author: Niels Zeilemaker <nielszeilemaker@godatadriven.com>
Date:   Fri May 12 11:26:29 2017 +0200

    [AIRFLOW-1168] Add closing() to all connections and cursors

    This will prevent any left-open connections
    whenever an exception occurs

    Closes #2269 from NielsZeilemaker/AIRFLOW-1168

    (cherry picked from commit 8aeebd488416bd7618d36c64c49eca58f3f45e0d)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
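
As an illustration of the pattern, `contextlib.closing` calls `close()` when the block exits, whether it exits normally or through an exception. The sketch below uses `sqlite3` and a hypothetical helper `fetch_all`; it is not taken from the hooks themselves.

```python
import sqlite3
from contextlib import closing


def fetch_all(db_path, sql):
    """Run a query and return all rows, always closing cursor and connection."""
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql)
            return cur.fetchall()
```

If `cur.execute` raises, both `close()` calls still run before the exception propagates to the caller.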

commit 608a810a2a30c96c4c04540e4ac26d5e2503d0ee
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri May 12 11:22:37 2017 +0200

    Continue development on 1.8.2

commit af2d0b4b5cb1ef30a065b1af66f90a01a953e2be
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Wed Apr 5 10:02:42 2017 +0200

    [AIRFLOW-970] Load latest_runs on homepage async

    The latest_runs column on the homepage loads
    synchronously with an n+1
    query. Homepage loads will be significantly faster
    if this happens
    asynchronously and as a batch.

    Closes #2144 from saguziel/aguziel-latest-run-
    async

    (cherry picked from commit 0f7ddbbedb05f2f11500250db4989edcb27bc164)

commit d61af623178253eb39a1fabd6116a94dca3f33a6
Author: Chris Riccomini <criccomini@apache.org>
Date:   Thu Apr 27 13:15:37 2017 -0700

    [AIRFLOW-XXX] Fix merge issue with test/models.py by adding execution_date

commit 0a105eed4c14c1f1595c10a6529e3bdb51187a14
Author: Chris Riccomini <criccomini@apache.org>
Date:   Thu Apr 27 12:37:14 2017 -0700

    [AIRFLOW-XXX] Set version to 1.8.1

commit e342d0d223e47ea25f73baaa00a16df414a6e0df
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Apr 26 20:39:48 2017 +0200

    [AIRFLOW-492] Make sure stat updates cannot fail a task

    Previously a failed commit into the db for the statistics
    could also fail a task. Secondly, the ui could display
    out of date statistics.

    This patch reworks DagStat so that failure to update the
    statistics does not propagate. Next to that, it makes sure
    the ui always displays the latest statistics.

    Closes #2254 from bolkedebruin/AIRFLOW-492

    (cherry picked from commit c2472ffa124ffc65b8762ea583554494624dbb6a)

commit 5800f565628d11d8ea504468bcc14c4d1c0da10c
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Apr 27 21:17:25 2017 +0200

    [AIRFLOW-1142] Do not reset orphaned state for backfills

    The scheduler could interfere with backfills when
    it resets the state
    of tasks that were considered orphaned. This patch
    prevents the scheduler
    from doing so and adds a guard in the backfill.

    Closes #2260 from bolkedebruin/AIRFLOW-1142

    (cherry picked from commit 4e79b830e3261b9d54fdbc7c9dcb510d36565986)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 3c8939838700fadc24f80f93a0e5012de736fea3
Author: Chris Riccomini <criccomini@apache.org>
Date:   Mon Apr 24 09:25:07 2017 -0700

    [AIRFLOW-XXX] Bump version to 1.8.1rc1+incubating

commit 4b5c6efd4a450b4a202f87cb12ea1f9eb4daf8fc
Author: Chris Riccomini <criccomini@apache.org>
Date:   Fri Apr 21 13:16:54 2017 -0700

    [AIRFLOW-1138] Add missing licenses to files in scripts directory

    Closes #2253 from criccomini/AIRFLOW-1138

    (cherry picked from commit 94f9822ffd867e559fd71046124626fee6acedf7)

commit dc6ebaab94bcc69b36bb97eefba3a01ee149b746
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Apr 20 15:24:15 2017 -0700

    [AIRFLOW-1127] Move license notices to LICENSE

    Closes #2250 from bolkedebruin/AIRFLOW-1127

    (cherry picked from commit 659827639e256a668d669d0d229abf49d6010bb8)

commit aef7dd0a53411f3edb2333cb36a457056e5ab652
Author: Kengo Seki <sekikn@apache.org>
Date:   Wed Apr 19 12:31:10 2017 -0700

    [AIRFLOW-1121][AIRFLOW-1004] Fix `airflow webserver --pid` to write out pid file

    After AIRFLOW-1004, --pid option is no longer
    honored and
    the pid file is not being written out. This PR
    fixes it.

    Closes #2249 from sekikn/AIRFLOW-1121

    (cherry picked from commit 8d643897cf6171d110e7139fb31c3d4d47c3acca)

commit f0d072cfb3b023dd4c80fd4e30e42fef595793c7
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Apr 19 17:15:46 2017 +0200

    [AIRFLOW-1124] Do not set all tasks to scheduled in backfill

    Backfill is supposed to fill in the blanks and not
    to reschedule
    all tasks. This fixes a regression from 1.8.0.

    Closes #2247 from bolkedebruin/AIRFLOW-1124

    (cherry picked from commit 0406462dc91427793ba40d0f05f321e85dbc6f19)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 1725c95163cf3a3d3b4c073922e39851e00942bf
Author: Chris Riccomini <criccomini@apache.org>
Date:   Tue Apr 18 13:53:03 2017 -0700

    [AIRFLOW-1120] Update version view to include Apache prefix

    Closes #2244 from criccomini/AIRFLOW-1120

    (cherry picked from commit 6684597d951cb9f2fea24576a3d19534d67c89ea)

commit 58a0ee787ed372034b417e6743175bdfe7f14808
Author: Chris Riccomini <criccomini@apache.org>
Date:   Mon Apr 17 11:18:12 2017 -0700

    [AIRFLOW-XXX] Set 1.8.1 version

commit bc52d092b5194c3a389c19ce45c2c2bdda3bf265
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Mon Apr 17 10:09:47 2017 +0200

    [AIRFLOW-1000] Rebrand distribution to Apache Airflow

    Per Apache requirements Airflow should be branded
    Apache Airflow.
    It is impossible to provide a forward compatible
    automatic update
    path and users will be required to manually
    upgrade.

    Closes #2172 from bolkedebruin/AIRFLOW-1000

    (cherry picked from commit 4fb05d8cc7a69255c6bff33c7f856eb4a341d5f2)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit a9e0894ba0113cf62c7e9006fb0b42085bc5e9f9
Author: pdambrauskas <p.d.ambrauskas@gmail.com>
Date:   Tue Apr 4 08:39:54 2017 +0200

    [AIRFLOW-1030][AIRFLOW-1] Fix hook import for HttpSensor

    Closes #2180 from
    pdambrauskas/fix/http_hook_import

    (cherry picked from commit f2dae7d15623e2534e7c0dab3b5a7e02d4cff81d)

commit 0a5fb7856b545073516210fcfc369d2072823ae9
Author: Kengo Seki <sekikn@apache.org>
Date:   Tue Apr 4 08:32:44 2017 +0200

    [AIRFLOW-1004][AIRFLOW-276] Fix `airflow webserver -D` to run in background

    AIRFLOW-276 introduced a monitor process for gunicorn
    to find new files in the dag folder, but it also changed
    `airflow webserver -D`'s behavior to run in foreground.
    This PR fixes that by running the monitor as a daemon
    process.

    Closes #2208 from sekikn/AIRFLOW-1004

    (cherry picked from commit a9b20a04b052e9479dbb79fd46124293085610e9)

commit c94b3a02f430f1a5a86c83d5f7286dcdac31492b
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Apr 5 09:57:55 2017 +0200

    [AIRFLOW-1001] Fix landing times if there is no following schedule

    @once does not have a following schedule. This was
    not checked for
    and therefore the landing times page could bork.

    Closes #2213 from bolkedebruin/AIRFLOW-1001

    (cherry picked from commit 0371df4f1bd78e220e591d5cb23630d6a062f109)

commit aec9770831d69ca04569c9af8da51942261be4ca
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Mon Apr 3 14:10:21 2017 -0700

    [AIRFLOW-111] Include queued tasks in scheduler concurrency check

    The concurrency argument in dags appears to not be obeyed because the
    scheduler does not check the concurrency properly when checking tasks.
    The tasks do not run, but this leads to a lot of scheduler churn.

    (cherry picked from commit 31fce01251957812b6aa392dcd70bb4519305e2a)

commit 4199cd3d23d35183253c5d078e0f9937e87df232
Author: Ivan Vergiliev <ivan.vergiliev@gmail.com>
Date:   Fri Apr 7 19:35:03 2017 +0200

    [AIRFLOW-1035] Use binary exponential backoff

    Closes #2196 from IvanVergiliev/exponential-
    backoff

    (cherry picked from commit 4ec932b551774bb394c5770c4d2660f565a4c592)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit ceb2ac366fce4eac7ca007e6ec15194e71e66409
Author: Stephan Werges <swerges@accertify.com>
Date:   Fri Apr 7 19:20:46 2017 +0200

    [AIRFLOW-1085] Enhance the SparkSubmitOperator

    - Allow the Spark home to be set on a per-connection
    basis to obviate
      the need for spark-submit to be on the PATH,
    and to allow different
      versions of Spark to be easily used.
    - Enable the use of the --driver-memory parameter
    on the spark-submit
      by making it a parameter on the operator
    - Enable the use of the --class parameter on the
    spark-submit by making
      it a parameter on the operator

    Closes #2211 from camshrun/sparkSubmitImprovements

    (cherry picked from commit 0ade066f44257c5e119b292f4cc2ba105774f4e7)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 0fa593e38c7ea88765408af10abad3c3780ba27d
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Apr 7 08:00:10 2017 +0200

    [AIRFLOW-1050] Do not count up_for_retry as not ready

    up_for_retry tasks were incorrectly counted
    towards not_ready
    therefore marking a dag run deadlocked instead of
    retrying.

    Closes #2225 from bolkedebruin/AIRFLOW-1050

    (cherry picked from commit 35e43f5067f4741640278b765c0e54e4fd45ffa3)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit ebfc3ea73ae1ffe273e4ff532f1ad47441bef518
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Apr 6 14:03:11 2017 +0200

    [AIRFLOW-1033] Fix ti_deps for no schedule dags

    DAGs that did not have a schedule (None or @once)
    make the dependency
    checker raise an exception as the previous
    schedule will not exist.

    Also activates all ti_deps tests.

    Closes #2220 from bolkedebruin/AIRFLOW-1033

    (cherry picked from commit fbcbd053a2c5fffb0c95eb55a91cb92fa860e1ab)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 916741171cc0c6426dbcbe8a2b5ce2468fce870d
Author: abloomston <adam.bloomston@cloverhealth.com>
Date:   Thu Mar 16 19:36:00 2017 -0400

    [AIRFLOW-969] Catch bad python_callable argument

    Checks for callable when Operator is
    created, not when it is run.

    * added initial PythonOperator unit test, testing
    run
    * python_callable must be callable; added unit
    test

    Closes #2142 from abloomston/python-callable

    (cherry picked from commit 12901ddfa9961a11feaa3f17696d19102ff8ecd0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
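
Checking at construction time rather than at run time can be sketched like this. `CallableOperator` is a simplified stand-in for illustration, and raising `TypeError` is an assumption; the real operator may raise a different exception type.

```python
class CallableOperator:
    """Simplified stand-in for an operator that wraps a Python callable."""

    def __init__(self, python_callable):
        # Fail while the DAG is being defined, not hours later at run time.
        if not callable(python_callable):
            raise TypeError("`python_callable` must be callable")
        self.python_callable = python_callable

    def execute(self):
        return self.python_callable()
```

Passing a string or any other non-callable now fails immediately in the constructor.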

commit dff6d21bfd9a2585ca484fc8fd56aa100f640908
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Tue Apr 4 17:04:12 2017 +0200

    Merge pull request #2195 from bolkedebruin/AIRFLOW-719

    (cherry picked from commit 4a6bef69d1817a5fc3ddd6ffe14c2578eaa49cf0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 9070a82775691e08fb1b95c28fbc2cc5ee7b843d
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Wed Apr 5 09:59:53 2017 +0200

    [AIRFLOW-111] Include queued tasks in scheduler concurrency check

    The concurrency argument in dags appears to not be
    obeyed because the
    scheduler does not check the concurrency properly
    when checking tasks.
    The tasks do not run, but this leads to a lot of
    scheduler churn.

    Closes #2214 from saguziel/aguziel-fix-concurrency

    (cherry picked from commit 3ff5abee3f9d29e545e021c2c060e9c9f3045236)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 4db53f39a972cae691dc49687a407dda0ff49aaf
Author: pdambrauskas <p.d.ambrauskas@gmail.com>
Date:   Tue Apr 4 08:39:54 2017 +0200

    [AIRFLOW-1030][AIRFLOW-1] Fix hook import for HttpSensor

    Closes #2180 from
    pdambrauskas/fix/http_hook_import

    (cherry picked from commit f2dae7d15623e2534e7c0dab3b5a7e02d4cff81d)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 010b80aa8b417091705556a07d5970fe0cc4efb2
Author: Kengo Seki <sekikn@apache.org>
Date:   Tue Apr 4 08:30:40 2017 +0200

    [AIRFLOW-1062] Fix DagRun#find to return correct result

    DagRun#find returns wrong result if
    external_trigger=False is specified,
    because adding filter is skipped on that
    condition. This PR fixes it.

    Closes #2210 from sekikn/AIRFLOW-1062

    (cherry picked from commit e4494f85ed5593c99949b52e1e0044c2a35f097f)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
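
A common way for a filter to be skipped when `False` is passed is a truthiness test on an optional argument. The sketch below assumes that shape of bug; it uses plain dictionaries and a hypothetical `find_runs` helper rather than the actual query code.

```python
def find_runs(runs, external_trigger=None):
    """Filter runs by their external_trigger flag; None means no filtering."""
    # Writing `if external_trigger:` here would skip the filter for False,
    # because False is falsy. Comparing against None keeps both cases.
    if external_trigger is not None:
        runs = [r for r in runs if r["external_trigger"] == external_trigger]
    return runs
```

With this check, `external_trigger=False` returns only scheduled runs instead of every run.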

commit 2bebeaf9554d35710de6eb1b4006157e105ac79b
Author: Joe Schmid <jschmid@symphonyrm.com>
Date:   Tue Apr 4 08:27:45 2017 +0200

    [AIRFLOW-1011] Fix bug in BackfillJob._execute() for SubDAGs

    BackfillJob._execute() checks that the next run
    date is less than
    or equal to the end date before creating a DAG run
    and task
    instances. For SubDAGs, the next run date is not
    relevant,
    i.e. schedule_interval can be anything other than
    None
    or '@once' and should be ignored. However, current
    code calculates
    the next run date for a SubDAG and the condition
    check mentioned
    above always fails for SubDAGs triggered manually.

    This change adds a simple check to determine if
    this is a SubDAG
    and, if so, sets next run date to DAG run's start
    date.

    Closes #2179 from joeschmid/AIRFLOW-1011-fix-bug-
    backfill-execute-for-subdags

    (cherry picked from commit 56501e6062df9456f7ac4efe94e21940734dd5bc)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 68b1c982e048878ec9dd658072c147e4341bf2c2
Author: Siddharth Anand <siddharthanand@yahoo.com>
Date:   Mon Apr 3 13:10:51 2017 -0700

    [AIRFLOW-1054] Fix broken import in test_dag

    Closes #2201 from
    r39132/fix_broken_import_on_test_dag

    (cherry picked from commit c64e876bd50eeb6c9e2600ac9d832c05eb5e9640)

commit 5eb33358f62a13192e537296becc315476112afb
Author: Ján Koščo <3k.stanley@gmail.com>
Date:   Sun Feb 12 15:43:41 2017 -0500

    [AIRFLOW-858] Configurable database name for DB operators

    Closes #2063 from s7anley/configurable-schema

    (cherry picked from commit 94dc7fb0a6bb3c563d9df6566cd52a59bd0c4629)

commit eb12f0164fbeedbe2701744c213cc90e6fc805f5
Author: George Sakkis <george.sakkis@gmail.com>
Date:   Sun Feb 12 16:09:26 2017 -0500

    [AIRFLOW-832] Let debug server run without SSL

    Closes #2051 from gsakkis/fix-debug-server

    (cherry picked from commit b0ae70d3a8e935dc9266b6853683ae5375a7390b)

commit 46ca569a37513f3d13c529786f65c7e443c9837e
Author: Dan Jarratt <djarratt@users.noreply.github.com>
Date:   Fri Feb 24 15:00:51 2017 -0800

    [AIRFLOW-906] Update Code icon from lightning bolt to file

    Lightning bolts are not a visual metaphor for code
    or files. Since Glyphicon doesn't have a code icon
    (<>, for instance), we should use its file icon.

    Dear Airflow Maintainers,

    Please accept this PR that addresses the following
    issues:
    AIRFLOW-906

    Testing Done:
    None.

    Before/After screenshots in AIRFLOW-906
    (https://issues.apache.org/jira/browse/AIRFLOW-906)

    Closes #2104 from djarratt/master

    (cherry picked from commit bc47200711be4d2c0b36b772651dae4f5e01a204)

commit 2106ff57056d08436d8aab87ac5601d9f554935a
Author: Chris Riccomini <criccomini@apache.org>
Date:   Wed Mar 29 14:09:56 2017 -0700

    [AIRFLOW-1017] get_task_instance shouldn't throw exception when no TI

    get_task_instance should return None instead of
    throwing exception in the case where dagrun does not have the task
    instance.

    Closes #2178 from aoen/ddavydov--
    one_instead_of_first_for_dagrun

    (cherry picked from commit b2b9587cca9195229ab107394ad94b7702c70e37)

commit 15600e42c805b222d6147b60376b56c8e708dcde
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Mar 15 16:39:12 2017 -0700

    [AIRFLOW-989] Do not mark dag run successful if unfinished tasks

    Dag runs could be marked successful if all root
    tasks were successful,
    even if some tasks did not run yet, i.e. in case of
    clearing. Now
    we consider unfinished_tasks, before marking
    successful.

    Closes #2154 from bolkedebruin/AIRFLOW-989

    (cherry picked from commit 3d6095ff5cf6eff0444d7e47a2360765f2953daf)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
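
The rule can be sketched with plain dictionaries. The state names and the helper `run_is_successful` are hypothetical and only meant to show the extra condition on unfinished tasks.

```python
# Hypothetical state names; None stands for a task that has not run yet,
# for example after it was cleared.
UNFINISHED_STATES = {None, "scheduled", "queued", "running", "up_for_retry"}


def run_is_successful(task_states, root_task_ids):
    """A run succeeds only if all roots succeeded and nothing is unfinished."""
    roots_ok = all(
        task_states[t] in ("success", "skipped") for t in root_task_ids
    )
    unfinished = [t for t, s in task_states.items() if s in UNFINISHED_STATES]
    return roots_ok and not unfinished
```

A run whose root task is `success` but which still has a cleared upstream task is therefore not marked successful.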

commit 3b37cfa1f2642ff90908a3af0a5674637c9518ee
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Mon Mar 13 20:14:07 2017 -0700

    [AIRFLOW-974] Fix mkdirs race condition

    mkdirs contained a race condition: if the
    directory is
    created between the os.path.exists and the
    os.makedirs calls,
    the os.makedirs will fail with an OSError.

    This reworks the function to be non-recursive as
    well, as
    permission errors were due to umasks being
    applied.

    Closes #2147 from bolkedebruin/AIRFLOW-974

    (cherry picked from commit c5cc298cf16c9777c90aec1fc8cc24bde62f7b2f)
    Signed-off-by: Bolke de Bruin <bolke@Bolkes-MacBook-Pro.local>
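
One way to avoid the check-then-create race is to attempt the creation and tolerate an "already exists" error. The sketch below shows only that part; the helper name `ensure_dir` is an assumption, and it does not reproduce the non-recursive rework or the umask handling mentioned in the commit.

```python
import errno
import os


def ensure_dir(path, mode=0o755):
    """Create `path` if needed, without a separate existence check."""
    try:
        os.makedirs(path, mode)
    except OSError as exc:
        # Another process may have created the directory in the meantime.
        # Re-raise anything else, including "exists but is not a directory".
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

Calling the function twice, or from two processes at once, leaves the directory in place without an error.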

commit 2a6089728841e1f4bb060345b5c251b3ff73d13d
Author: Bolke de Bruin <bolke@Bolkes-MacBook-Pro.local>
Date:   Sun Mar 12 19:48:04 2017 -0700

    Update changelog for 1.8.0

commit f171d17e8b5ef698f487bed8a40c6dd21ed81b51
Author: Bolke de Bruin <bolke@Bolkes-MacBook-Pro.local>
Date:   Sun Mar 12 10:34:19 2017 -0700

    Fix postgres hook

commit 3927e00dc72f6f2d14e463ff8daba3e3bcb11b73
Author: Bolke de Bruin <bolke@Bolkes-MacBook-Pro.local>
Date:   Sun Mar 12 10:33:49 2017 -0700

    Remove remnants

commit 8df046bfbec670a253139c83c6174bb88f25ee7f
Author: Bolke de Bruin <bolke@Bolkes-MacBook-Pro.local>
Date:   Sun Mar 12 10:11:15 2017 -0700

    Make compatible with 1.8

commit 2b26a5d95ce230b66255c8e7e7388c8013dc6ba6
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Sat Mar 11 13:42:58 2017 -0800

    [AIRFLOW-900] Double trigger should not kill original task instance

    This updates the tests of an earlier AIRFLOW-900.

    Closes #2146 from bolkedebruin/AIRFLOW-900

commit 57faa530f7e9580cda9bb0200d40af15d323df24
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Sat Mar 11 13:26:39 2017 -0800

    Fix tests for topological sort

commit 1243ab16849ab9716b26aeba6a11ea3e9e9a81ca
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Sat Mar 11 10:54:39 2017 -0800

    [AIRFLOW-900] Fixes bugs in LocalTaskJob for double run protection

    Right now, a second task instance being triggered
    will cause
    both itself and the original task to run because
    the hostname
    and pid fields are updated regardless of whether the task
    is already running.
    Also, pid field is not refreshed from db properly.
    Also, we should
    check against parent's pid.

    Will be followed up by working tests.

    Closes #2102 from saguziel/aguziel-fix-trigger-2

commit a8f2c27ed44449e6611c7c4a9ec8cf2371cf0987
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Sat Mar 11 10:52:07 2017 -0800

    [AIRFLOW-932][AIRFLOW-921][AIRFLOW-910] Do not mark tasks removed when backfilling

    In a backfill one can specify a specific task to
    execute. We
    create a subset of the original tasks in a subdag
    from the original dag.
    The subdag has the same name as the original dag.
    This breaks
    the integrity check of a dag_run as tasks are
    suddenly not in
    scope any more.

    Closes #2122 from bolkedebruin/AIRFLOW-921

commit dacc69a504cbfcdba5e2b24220fa1982637b17d3
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Sat Mar 11 10:43:49 2017 -0800

    [AIRFLOW-961] run onkill when SIGTERMed

    Closes #2138 from saguziel/aguziel-sigterm

commit dcc8ede5c1a2f6819b151dd5ce839f0a0917313a
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Sat Mar 11 09:40:38 2017 -0800

    [AIRFLOW-910] Use parallel task execution for backfills

    The refactor to use dag runs in backfills caused a
    regression
    in task execution performance as dag runs were
    executed
    sequentially. Next to that, the backfills were
    non-deterministic
    due to the random execution of tasks, causing root
    tasks
    to be added to the non ready list too soon.

    This updates the backfill logic as follows:
    * Parallelize execution of tasks
    * Use a leaf-first execution model
    * Replace state updates from the executor by
    task-based updates only

    Closes #2107 from bolkedebruin/AIRFLOW-910
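
A deterministic, dependency-respecting order of tasks can be illustrated with a small topological sort. This is a generic sketch, not the backfill implementation: the function name and the `upstream` mapping are assumptions, and sorting the ready set is just one way to make the order reproducible.

```python
def topological_order(upstream):
    """Order tasks so that each one appears after everything it depends on.

    `upstream` maps a task id to the set of task ids it depends on. Tasks
    that become ready in the same round are emitted in sorted order, which
    makes the result independent of dictionary or set iteration order.
    """
    pending = {task: set(deps) for task, deps in upstream.items()}
    order = []
    while pending:
        ready = sorted(t for t, deps in pending.items() if not deps)
        if not ready:
            raise ValueError("cycle or unknown dependency detected")
        for task in ready:
            order.append(task)
            del pending[task]
        for deps in pending.values():
            deps.difference_update(ready)
    return order
```

All tasks in one `ready` batch have their dependencies satisfied, so they are the ones that could be handed to an executor in parallel.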

commit 8ffaadf173e1cd46661a592ad55b0d41e460c05a
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Mar 10 12:00:16 2017 -0800

    [AIRFLOW-967] Wrap strings in native for py2 ldap compatibility

    ldap3 has issues with newstr being passed. This
    wraps any call
    that goes over the wire to the ldap server in
    native() to ensure
    the native string type is used.

    Closes #2141 from bolkedebruin/AIRFLOW-967

commit 1f3aead5c486c3576a5df3b6904aa449b8a1d90a
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Mon Mar 6 21:03:14 2017 +0100

    [AIRFLOW-941] Use defined parameters for psycopg2

    This works around
    https://github.com/psycopg/psycopg2/issues/517 .

    Closes #2126 from bolkedebruin/AIRFLOW-941

commit 4077c6de297566a4c598065867a9a27324ae6eb1
Author: Daniel Huang <dxhuang@gmail.com>
Date:   Sat Mar 4 17:33:23 2017 +0100

    [AIRFLOW-719] Prevent DAGs from ending prematurely

    DAGs using ALL_SUCCESS and ONE_SUCCESS trigger
    rules were ending
    prematurely when upstream tasks were skipped.
    Changes mean that the
    ALL_SUCCESS and ONE_SUCCESS triggers rule
    encompasses both SUCCESS and
    SKIPPED tasks.

    Closes #2125 from dhuang/AIRFLOW-719

commit 157054e2c9967e48fb3f3157081baf686dcee5e8
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Fri Mar 3 13:52:03 2017 -0800

    [AIRFLOW-938] Use test for True in task_stats queries

    Fix a bug with the task_stats query on postgres which doesn't support
    == 1.

    https://issues.apache.org/jira/browse/AIRFLOW-938

    I've seen the other PR but I'll try to see if this
    method works because I believe `__eq__(True)` is
    just `== True`, and it is how it is done here:
    http://docs.sqlalchemy.org/en/latest/core/sqlelement.html#sqlalchemy.sql.expression.and_
    (underscore is part of link)

    Closes #2123 from saguziel/aguziel-fix-task-
    stats-2

commit 66f39ca0c3511da2ff86858ce7ea569d11adbd44
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Thu Mar 2 14:04:49 2017 -0800

    [AIRFLOW-937] Improve performance of task_stats

    Please accept this PR that addresses the following
    issues:
    -
    https://issues.apache.org/jira/browse/AIRFLOW-937

    Testing Done:
    - Shouldn't change functionality significantly,
    should pass existing tests (if they exist)

    This leads to slightly different results, but it
    reduced the time of this endpoint from 90s to 9s
    on our data, and the existing logic for task_ids
    was already incorrect (task_ids may not be
    distinct across dags)

    Closes #2121 from saguziel/task-stats-fix

commit 0964f189f2cd2ac10150040670a542910370e456
Author: Rui Wang <rui.wang@airbnb.com>
Date:   Wed Mar 1 14:03:34 2017 -0800

    [AIRFLOW-933] Use ast.literal_eval rather than eval because ast.literal_eval does not execute input

    This PR addresses the following issues:
    - *(https://issues.apache.org/jira/browse/AIRFLOW-933)*

    This PR is trying to solve a security issue. The
    test was done by setting up a local web server and
    reproducing the issue described in the JIRA link above.

    Closes #2117 from amaliujia/master
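
The difference between the two functions can be shown in a few lines. `parse_literal` is a hypothetical wrapper used only for illustration: `ast.literal_eval` accepts Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, `None`) and rejects anything that would need to be executed.

```python
import ast


def parse_literal(text):
    """Parse a Python literal from untrusted text without executing it."""
    return ast.literal_eval(text)
```

A string like `"{'a': [1, 2]}"` is parsed into a dictionary, while `"__import__('os').getcwd()"` raises `ValueError` instead of running the call as `eval` would.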

commit f04ea97d066093abf898fec81f96eeb4b82eaf13
Author: Li Xuanji <xuanji.li@airbnb.com>
Date:   Tue Feb 28 12:17:33 2017 -0800

    [AIRFLOW-925] Revert airflow.hooks change that cherry-pick picked

    Please accept this PR that addresses the following
    issues:
    -
    https://issues.apache.org/jira/browse/AIRFLOW-925

    Testing Done:
    - Fixes bug in prod

    Closes #2112 from saguziel/aguziel-
    hivemetastorehook-import-apache

commit ab37f8d32ef9dcf3163a037b53ca749f2f99f22e
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Mon Feb 27 13:43:25 2017 -0800

    [AIRFLOW-919] Running tasks with no start date shouldn't break a DAGs UI

    Please accept this PR that addresses the following
    issues:
    -
    https://issues.apache.org/jira/browse/AIRFLOW-919

    I also made the airflow PR template a little bit
    less verbose (requires fewer edits when creating a
    PR).

    Testing Done:
    - Ran a webserver with this case and made sure
    that the DAG page loaded

    Closes #2110 from
    aoen/ddavydov/fix_running_task_with_no_start_date

commit 01494fd4c0633dbb57f231ee17e015f42a5ecf24
Author: Fokko Driesprong <fokkodriesprong@godatadriven.com>
Date:   Mon Feb 27 13:45:24 2017 +0100

    [AIRFLOW-802][AIRFLOW-1] Add spark-submit operator/hook

    Add an operator for spark-submit to kick off Apache Spark jobs by
    using Airflow. This allows the user to maintain the configuration
    of the master and yarn queue within Airflow by using connections.
    Add a default connection_id to the initdb routine to set spark
    to yarn by default. Add unit tests to verify the behaviour of
    the spark-submit operator and hook.

    Closes #2042 from Fokko/airflow-802

commit c29af4668a67b5d7f969140549558714fb7b32c9
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Fri Feb 24 14:29:11 2017 -0800

    [AIRFLOW-897] Prevent dagruns from failing with unfinished tasks

    Closes #2099 from
    aoen/ddavydov/fix_premature_dagrun_failures

commit ff0fa00d82bfebbe9b2b9ff957e4d77db0891e7f
Author: Alex Guziel <alex.guziel@airbnb.com>
Date:   Fri Feb 17 11:45:45 2017 -0800

    [AIRFLOW-861] make pickle_info endpoint be login_required

    Testing Done:
    - Unittests pass

    Closes #2077 from saguziel/aguziel-fix-login-required

commit 101700853896fdb90cda4267b5310e6c8811f4f0
Author: Ming Wu <ming.wu@ubisoft.com>
Date:   Fri Feb 10 19:47:47 2017 -0500

    [AIRFLOW-853] use utf8 encoding for stdout line decode

    Closes #2060 from ming-wu/master

commit 3918e5e1c489bf01a6a836d1d76e2251137af5de
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Feb 10 14:17:26 2017 +0100

    [AIRFLOW-856] Make sure execution date is set for local client

    In the local API client the execution date was hard coded to None.
    Secondly, when no execution date was specified, the execution date
    was set to datetime.now(). datetime.now() includes fractional
    seconds, which the database supports but, among other things, the
    current logging setup does not. Fractional seconds are now cut off
    from the execution date.

    Closes #2064 from bolkedebruin/AIRFLOW-856
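
Cutting off fractional seconds can be sketched as follows. The function name `default_execution_date` is hypothetical, and this is an illustration of the idea rather than the code in the commit.

```python
from datetime import datetime


def default_execution_date(now=None):
    """Return a timestamp with fractional seconds removed.

    Hypothetical helper: `now` is injectable so the behaviour can be
    checked deterministically; by default the current time is used.
    """
    if now is None:
        now = datetime.now()
    # replace() returns a new datetime; microsecond=0 drops the fraction.
    return now.replace(microsecond=0)


stamp = default_execution_date(datetime(2017, 2, 10, 14, 17, 26, 123456))
```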

commit 3b1e81ac9e8e97b6d2a4c3217df81db9ddbd0900
Author: Jeremiah Lowin <jlowin@apache.org>
Date:   Wed Feb 8 08:32:25 2017 -0500

    [AIRFLOW-830][AIRFLOW-829][AIRFLOW-88] Reduce Travis log verbosity

    [AIRFLOW-829][AIRFLOW-88] Reduce verbosity of
    Travis tests

    Remove the -s flag for Travis unit tests to
    suppress output
    from successful tests.

    [AIRFLOW-830] Reduce plugins manager verbosity

    The plugin manager prints all status to INFO,
    which is unnecessary and
    overly verbose.

    Closes #2049 from jlowin/reduce-logs

commit e1d0adb61d6475154ada7347ea30404f0680e779
Author: Jeremiah Lowin <jlowin@apache.org>
Date:   Thu Feb 2 11:56:22 2017 -0500

    [AIRFLOW-831] Restore import to fix broken tests

    The global `models` object is used in the code and
    was inadvertently
    removed. This PR restores it.

    Closes #2050 from jlowin/fix-broken-tests

commit 2592024230a25820d368ecc3bd43fbf7b52e46d9
Author: George Sakkis <george.sakkis@gmail.com>
Date:   Thu Feb 2 14:45:48 2017 +0100

    [AIRFLOW-794] Access DAGS_FOLDER and SQL_ALCHEMY_CONN exclusively from settings

    Closes #2013 from gsakkis/settings

commit 5405f5f83c6e20fff2dc209cd4be3d1d5ea85140
Author: Kengo Seki <sekikn@nttdata.co.jp>
Date:   Thu Feb 2 14:38:29 2017 +0100

    [AIRFLOW-694] Fix config behaviour for empty envvar

    Currently, an environment variable with an empty value
    does not override the
    configuration value corresponding to it.

    Closes #2044 from sekikn/AIRFLOW-694
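
One common cause of this kind of behaviour is a truthiness check on the looked-up value. The sketch below is an assumption about the general pattern, not the actual patch; `resolve_option`, its parameters, and the variable name used in the example are hypothetical.

```python
import os


def resolve_option(env_key, file_value, environ=None):
    """Prefer an environment variable over a config-file value.

    Hypothetical helper: the check is `is not None` rather than plain
    truthiness, so a variable that is set to the empty string still
    overrides the value from the file.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(env_key)
    return value if value is not None else file_value


# Set but empty: the empty string wins over the file value.
empty = resolve_option("AIRFLOW__CORE__EXAMPLE", "from-file",
                       {"AIRFLOW__CORE__EXAMPLE": ""})

# Not set at all: fall back to the file value.
unset = resolve_option("AIRFLOW__CORE__EXAMPLE", "from-file", {})
```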

commit a7abcf35b0e228034f746b3d50abd0ca9bd8bede
Author: Daniel Huang <dxhuang@gmail.com>
Date:   Thu Feb 2 13:57:20 2017 +0100

    [AIRFLOW-365] Set dag.fileloc explicitly and use for Code view

    Code view for subdags has not been working. I do not think we are
    able to cleanly figure out where the code for the factory method
    lives when we process the dags, so we need to save the location
    when the subdag is created.

    Previously, a subdag's `fileloc` attribute would be set to the
    location of the parent dag. I think it is appropriate to set it to
    the actual child dag location instead. We do not lose any
    information this way (we still have the link to the parent dag,
    which has its location), and now we can always read this attribute
    for the code view. This should not affect the use of this field for
    refreshing dags, because we always refresh the parent for a subdag.

    Closes #2043 from dhuang/AIRFLOW-365

commit 4db8f0796642691255b0632d599f33cb9d0ce423
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Mar 9 08:32:46 2017 -0800

    [AIRFLOW-931] Do not set QUEUED in TaskInstances

    The contract of TaskInstances stipulates that end states for Tasks
    can only be UP_FOR_RETRY, SUCCESS, FAILED, UPSTREAM_FAILED or
    SKIPPED. If concurrency was reached, task instances were set to
    QUEUED by the task instances themselves. This would prevent the
    scheduler from picking them up again.

    We set the state to NONE now, to ensure integrity.

    Closes #2127 from bolkedebruin/AIRFLOW-931

    (cherry picked from commit e42398100a3248eddb6b511ade73f6a239e58090)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 3a5a3235d5ad77a116ea1ac2a3216af31900d703
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Thu Feb 23 23:50:19 2017 +0100

    [AIRFLOW-899] Tasks in SCHEDULED state should be white in the UI instead of black

    Closes #2100 from
    aoen/ddavydov/fix_black_squares_in_ui

    (cherry picked from commit daa405e2bd2e4d3538eea0ed951fdcdf6d8bc127)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 8ad9ab673350207479e9597a36aadb1ec9987640
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Feb 23 23:48:03 2017 +0100

    [AIRFLOW-895] Address Apache release incompliancies

    * Fixes missing licenses in NOTICE
    * Corrects license header
    * Removes HighCharts leftovers.

    Closes #2098 from bolkedebruin/AIRFLOW-895

    (cherry picked from commit 784b3638c5633a9a94e020c47a3b95b942e6fb87)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit b38df6b8c6fc5eefe14b9594827d6f28092f77f8
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Thu Feb 23 22:51:02 2017 +0100

    [AIRFLOW-893][AIRFLOW-510] Fix crashing webservers when a dagrun has no start date

    Closes #2094 from aoen/ddavydov/fix_webservers_when_bad_startdate_dag

    (cherry picked from commit 1c4508d84806debbedac9c4e12f14031c8a1effd)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 1c2313338a586aae4a7752c3fb3b9de4e3564415
Author: Krishna Bhupatiraju <krishna.bhupatiraju@airbnb.com>
Date:   Mon Feb 6 16:52:11 2017 -0800

    [AIRFLOW-793] Enable compressed loading in S3ToHiveTransfer

    Testing Done:
    - Added new unit tests for the S3ToHiveTransfer
    module

    Closes #2012 from krishnabhupatiraju/S3ToHiveTransfer_compress_loading

    (cherry picked from commit ad15f5efd6c663bd5f0c8cd3f556d08182cc778c)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 3658bf310811cd22651b6c20c5d50bfbd3153025
Author: Jeremiah Lowin <jlowin@apache.org>
Date:   Sun Feb 12 15:37:56 2017 -0500

    [AIRFLOW-863] Example DAGs should have recent start dates

    Avoid unnecessary backfills by having start dates of
    just a few days ago. Adds a utility function
    airflow.utils.dates.days_ago().

    Closes #2068 from jlowin/example-start-date

    (cherry picked from commit bbfd43df4663547abda4ac6fdc3a6ed730a75b57)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
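
A relative start date of this kind can be sketched as below. The function `start_date_days_ago` and its `today` parameter are hypothetical stand-ins; the real `airflow.utils.dates.days_ago()` may differ in signature and behaviour.

```python
from datetime import datetime, timedelta


def start_date_days_ago(n, today=None):
    """Return midnight `n` days before `today`.

    Hypothetical helper: `today` is injectable for testing and defaults
    to the current local date and time.
    """
    if today is None:
        today = datetime.today()
    # Normalise to midnight so the start date falls on a day boundary.
    midnight = today.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


example = start_date_days_ago(2, today=datetime(2017, 2, 12, 15, 37, 56))
```

Because the start date moves with the current day, an example DAG only has a few days of history to catch up on when it is first enabled.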

commit 563cc9a3c8414725a615a93d3910e7a2dbb94999
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Feb 17 09:05:41 2017 +0100

    [AIRFLOW-869] Refactor mark success functionality

    This refactors the mark success functionality into a
    more generic function that can set multiple states
    and properly drills down on SubDags.

    Closes #2085 from bolkedebruin/AIRFLOW-869

    (cherry picked from commit 28cfd2c541c12468b3e4f634545dfa31a77b0091)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit eddecd59d73191904f2f156e53a138e532dc560a
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Sun Feb 12 13:10:33 2017 +0100

    Revert "Revert "[AIRFLOW-782] Add support for DataFlowPythonOperator.""

    This reverts commit 7e65998a1bedd00e74fa333cfee78ad574aaa849.

commit 8aacc283a6b3a605648bf4bd1361225a2a3678d9
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Feb 10 14:53:02 2017 +0100

    Add known issue of 'num_runs'

commit 7925bed63991da78cc63909a005d3dd9abd813ac
Merge: b3d4e711 fb88c2d8
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Feb 10 14:54:03 2017 +0100

    Merge branch 'v1-8-test' of https://git-wip-us.apache.org/repos/asf/incubator-airflow into v1-8-test

commit fb88c2d8362d751f902252c51c8bce4301ac8c40
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Fri Feb 10 14:17:26 2017 +0100

    [AIRFLOW-856] Make sure execution date is set for local client

    In the local API client the execution date was hard coded to None.
    Secondly, when no execution date was specified, the execution date
    was set to datetime.now(). datetime.now() includes fractional
    seconds, which the database supports but, among other things, the
    current logging setup does not. Fractional seconds are now cut off
    from the execution date.

    Closes #2064 from bolkedebruin/AIRFLOW-856

    (cherry picked from commit b7c828bf094d3aa1eae310979a82addf7e423bb0)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit b3d4e7114fd7f1943aee2e5f865cf27cffedd0ee
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Feb 9 16:10:17 2017 +0100

    Add pool upgrade issue description

    (cherry picked from commit e63cb1fced9517397b7db9e2849bf01fcca63902)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit adaebc2d7afea4b996a0f49ee850bdb6dd6a0cfc
Author: Patrick McKenna <patrickmckenna@github.com>
Date:   Tue Feb 7 21:54:13 2017 +0100

    [AIRFLOW-814] Fix Presto*CheckOperator.__init__

    Use keyword args when initializing a
    Presto*CheckOperator.

    Closes #2029 from patrickmckenna/fix-presto-check-operators

    (cherry picked from commit d428a90286a8d34db65bb8f4d8252fbbe9665e55)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 0b477900021e69f6a0ae8b5dd42b1465e9f836c5
Author: Dan Davydov <dan.davydov@airbnb.com>
Date:   Mon Feb 6 16:21:05 2017 -0800

    [AIRFLOW-844] Fix cgroups directory creation

    Testing Done:
    - Tested locally, we should add cgroup tests at
    some point though

    Closes #2057 from aoen/ddavydov/fix_cgroups

commit ce3f88b68b926b2bdd2c8d1d0b21113c1d7f246e
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Thu Feb 2 20:24:42 2017 +0100

    Bump version to 1.8.1alpha0

commit 8dc27c675a2651e8d4e20f40d9b0a50c7ba5a832
Author: Alex Van Boxel <alex@vanboxel.be>
Date:   Thu Feb 2 19:40:04 2017 +0100

    CHANGELOG for 1.8

    Closes #2000 from alexvanboxel/pr/changelog

commit 7e65998a1bedd00e74fa333cfee78ad574aaa849
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Feb 1 15:56:14 2017 +0000

    Revert "[AIRFLOW-782] Add support for DataFlowPythonOperator."

    This reverts commit dc97bcd3b7e0a7eebd838f0fb0452a0b47ba417b.

commit 77715b9e705f0915fa5c4368eda9d6d4323e2d2d
Merge: 44a980df c6483271
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Feb 1 15:54:39 2017 +0000

    Merge branch 'master' into v1-8-test

commit 44a980df556a94c52ab5540fdd6fa9f29c51fac7
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Wed Feb 1 15:32:01 2017 +0000

    [AIRFLOW-816] Use static nvd3 and d3

    Closes #2035 from bolkedebruin/AIRFLOW-816

    (cherry picked from commit 1accb54ff561b8d745277308447dd6f9d3e9f8d5)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>

commit 98e3d6ffcf78c2503cec32c1403d3e1c6280e9d8
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Tue Jan 31 18:56:18 2017 +0000

    [AIRFLOW-821] Fix py3 compatibility

    iteritems() does not exist in py3.

    Closes #2039 from bolkedebruin/AIRFLOW-821

    (cherry picked from commit fbb59b94467d7e684620570407698f509b073e47)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
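
The compatibility point can be shown with a small sketch. `format_pairs` is a hypothetical function, not code from the commit; `dict.items()` exists in both Python 2 and Python 3, whereas `dict.iteritems()` exists only in Python 2.

```python
def format_pairs(mapping):
    """Render a mapping as sorted "key=value" strings.

    Uses dict.items(), which works on both Python 2 and Python 3.
    Calling mapping.iteritems() here would raise AttributeError on py3.
    """
    return ["%s=%s" % (key, value) for key, value in sorted(mapping.items())]


pairs = format_pairs({"b": 2, "a": 1})
```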

commit c01b6c9c636b4706b6b3bf9452404efc8938e6de
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Tue Jan 31 18:54:05 2017 +0000

    [AIRFLOW-817] Check for None value of execution_date in endpoint

    execution_date can be present in json while
    resolving to None.

    Closes #2034 from bolkedebruin/AIRFLOW-817

    (cherry picked from commit 2b13109ff01ee1534d611665d974667a06787cb2)
    Signed-off-by: Bolke de Bruin <bolke@xs4all.nl>
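
The case described above, a key that is present but null, can be sketched as follows. `parse_execution_date`, the payload shape, and the timestamp format are assumptions for illustration and do not reproduce the endpoint code.

```python
import json
from datetime import datetime


def parse_execution_date(payload):
    """Extract an optional execution date from a JSON request body.

    Hypothetical helper: checking `"execution_date" in data` is not
    enough, because the key can be present with a JSON null value.
    """
    data = json.loads(payload)
    raw = data.get("execution_date")
    if raw is None:
        # Covers both a missing key and an explicit null.
        return None
    # Assumed timestamp format for this sketch.
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")


missing = parse_execution_date('{}')
null = parse_execution_date('{"execution_date": null}')
given = parse_execution_date('{"execution_date": "2017-01-31T18:54:05"}')
```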

commit 2b7d40bf83c3b7b7ef35cce379d384019f89f57f
Author: Bolke de Bruin <bolke@xs4all.nl>
Date:   Tue Jan 31 14:58:32 2017 +0100

    Bump version to 1.8.0rc1

commit 5212665fff535ebebb1ce75b2904a7ec1ca42797
Author: Fokko Driesprong <fokkodriesprong@godatadriven.com>
Date:   Tue Jan 31 14:34:01 2017 +0100

    [AIRFLOW-822] Close db before exception

    The basehook contains functionality to retrieve
    connections from the
    database. If a connection does not exist …