This commit fixes [AIRFLOW-6033] UI crashes on "Landing Times" #6634

Closed

drexpp wants to merge 1452 commits into apache:master from drexpp:v1-10-stable

Conversation

@drexpp (Contributor) commented Nov 22, 2019

Adding our company to "Who uses Apache Airflow?"

Hello everyone, is it possible for our company to appear in the "Who uses Apache Airflow?" section? We are a small team working at Endesa, a large Spanish electricity distributor and part of Enel. I wrote some pipelines to automate the ETL processes we use with Hadoop / Spark, so I believe it would be great for our team.

Endesa [@drexpp]


Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira Issue AIRFLOW-6033 and references them in the PR title.

Description

I targeted v1-10-stable since I think that is what @ashb recommended to me in my last PR.

  • Here are some details about my PR:

The Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the error.

This is caused by changing the capitalization of a task_id. Here are some examples of renames that will cause Airflow to crash:

task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task",...
task_id = "Dummy_task" to task_id = "Dummy_tASk"
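The renames above can be reproduced outside Airflow with a minimal sketch (plain Python with hypothetical names, not Airflow code): the view keys its dictionaries by the task_ids currently on the DAG, while the database still holds instances under the old spelling.

```python
# Hypothetical illustration of the mismatch behind the crash: the DAG
# object now only defines "Run", but historical task instances in the
# database still carry task_id "run".
current_dag_task_ids = ["Run"]               # task_ids after the rename
stored_instance_task_ids = ["run", "run"]    # old rows still in the DB

# The view keys its dictionaries by the *current* task_ids...
x = {task_id: [] for task_id in current_dag_task_ids}

crashed_on = None
try:
    # ...then indexes them with the *stored* task_ids.
    for ti_task_id in stored_instance_task_ids:
        x[ti_task_id].append(1.0)
except KeyError as exc:
    # This KeyError is the crash behind the "Oops" page.
    crashed_on = str(exc)

print(crashed_on)  # -> 'run'
```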


File causing the problem: https://github.com/apache/airflow/blob/master/airflow/www/views.py (lines 1643 - 1654)

for task in dag.tasks:
    y[task.task_id] = []
    x[task.task_id] = []

    for ti in task.get_task_instances(start_date=min_date, end_date=base_date):

        ts = ti.execution_date
        if dag.schedule_interval and dag.following_schedule(ts):
            ts = dag.following_schedule(ts)
        if ti.end_date:
            dttm = wwwutils.epoch(ti.execution_date)
            secs = (ti.end_date - ts).total_seconds()
            x[ti.task_id].append(dttm)  # KeyError if ti.task_id was renamed
            y[ti.task_id].append(secs)  # after the instance ran

In the first two lines inside the outer for loop, we can see how the dictionaries x and y are keyed by the task_id attributes of the tasks on the current DAG object.

The problem surfaces in the inner for loop, when the task instances are fetched for the DAG. I am not sure about this next part and would appreciate it if someone could clarify it for me.

I think the task instances (ti) returned by get_task_instances() come from the information stored in the database. That would explain the crash when you open the "Landing Times" page: x and y were keyed by the task_ids currently on the DAG, while the stored task instances carry a different task_id, so indexing the dictionaries fails.

One of my main questions is: after renaming a task (for example from "run" to "Run"), why does get_task_instances() keep returning past task instances under the old name? That is, when asking for instances of "Run", it returns task instances (ti) with task_id "run".
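One defensive fix, sketched here with a simplified stand-in function (hypothetical, not the patch actually merged for this issue), is to skip stored task instances whose task_id no longer exists on the DAG before indexing into x and y:

```python
# Sketch of a guard for the loop in views.py (simplified stand-in):
# only append data for task instances whose task_id still exists on the DAG.
def collect_landing_times(dag_task_ids, task_instances):
    """dag_task_ids: task_ids currently defined on the DAG.
    task_instances: (task_id, seconds) pairs loaded from the database."""
    y = {task_id: [] for task_id in dag_task_ids}
    for ti_task_id, secs in task_instances:
        if ti_task_id not in y:     # stale row from before a rename
            continue                # skip instead of raising KeyError
        y[ti_task_id].append(secs)
    return y

# A renamed task leaves old rows behind; they are now ignored.
print(collect_landing_times("Run".split(), [("run", 3.0), ("Run", 2.5)]))
# -> {'Run': [2.5]}
```

Whether stale instances should be skipped or shown under their old name is a design choice; skipping keeps the chart consistent with the tasks currently on the DAG.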

Error screenshot

How to replicate:

  • Launch airflow webserver -p 8080
  • Go to the Airflow-UI
  • Create an example DAG with a task_id of your choice in lowercase letters (e.g. "run")
  • Launch the DAG and wait for its execution to finish
  • Change the first letter of the task_id inside the DAG to a capital letter (e.g. "Run")
  • Refresh the DAG
  • Go to "Landing Times" inside the DAG menu in the UI
  • You will get an "Oops" message with the traceback.



Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

    • I didn't know exactly how to unit test this; if you have any advice, I will write a test for it. Other than that, I tested manually that the behaviour was as expected:

      • Create a DAG and access Landing Times
      • Modify a task from the created DAG to a completely new name and access Landing Times
      • Modify a task with capital/lowercase letters and access Landing Times
      • Switch back to the original name and access Landing Times
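For reference, the lookup logic could be unit-tested in isolation along these lines (a hypothetical sketch with made-up helper names, not a test from this PR):

```python
# Hypothetical unit-test sketch: verify the landing-times series tolerate
# task instances whose task_id was renamed after they ran.
def build_series(dag_task_ids, stored_task_ids):
    """Build the per-task series, ignoring stale task_ids from the DB."""
    x = {task_id: [] for task_id in dag_task_ids}
    for ti_task_id in stored_task_ids:
        if ti_task_id in x:          # guard against renamed tasks
            x[ti_task_id].append(0)
    return x

def test_renamed_task_does_not_crash():
    # "run" was renamed to "Run"; old instances must not raise KeyError.
    series = build_series(["Run"], ["run", "Run", "run"])
    assert series == {"Run": [0]}

test_renamed_task_does_not_crash()
print("ok")
```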

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in Updating.md so we can assign it to an appropriate release

Code Quality

  • Passes flake8

kaxil and others added 30 commits August 12, 2019 15:02
…roblems (#5835)

* [AIRFLOW-5233] Fixed consistency in whitespace (tabs/eols) + common problems

(cherry picked from commit 5cfe9c2)
List the two separate pylint scripts for use inside the Docker
containers in CONTRIBUTING.md.

(cherry picked from commit a47292d)
… shell files (#5807)

* [AIRFLOW-5204] Shellcheck + common licence in shell files

(cherry picked from commit 6420712)
This was done in the top level README.md in abbb1ea in 2016.

(cherry picked from commit 9d10ac7)
…%I) in _clean_execution_date instead of %H and %M (#5864)

(cherry picked from commit 4777c8a)
…AG files (#5757)

The scheduler calls `list_py_file_paths` to find DAGs to schedule. It does so
without passing any parameters other than the directory. This means that
it *won't* discover DAGs that are missing the words "airflow" and "DAG" even
if DAG_DISCOVERY_SAFE_MODE is disabled.

Since `list_py_file_paths` will refer to the configuration if
`include_examples` is not provided, it makes sense to have the same behaviour
for `safe_mode`.

(cherry picked from commit c4a9d8b)
…Operators (#5567)

Add KMS to BigQuery

(cherry picked from commit 682aea2)
…e in GoogleCloudStorageToBigQueryOperator. (#5771)

Set autodetect default value from false to be true to avoid breaking downstream
services using GoogleCloudStorageToBigQueryOperator but not aware of the newly
added autodetect field.

This is to fix the current regression introduced by #3880

(cherry picked from commit 462ab88)
* [AIRFLOW-4856] change hard coded run_as_user

try to use worker_run_as_user

* [AIRFLOW-4856] change hard coded run_as_user

add unit test

* [AIRFLOW-4856] change hard coded run_as_user

create new param git_sync_run_as_user

* [AIRFLOW-4856] change hard coded run_as_user

add back remove option

* [AIRFLOW-4856] change hard coded run_as_user

fix Flake8

* [AIRFLOW-4856] change hard coded run_as_user

fix Flake8

* [AIRFLOW-4856] change hard coded run_as_user

fix unit test

* [AIRFLOW-4856] change hard coded run_as_user

change the default value to it's old 65533

(cherry picked from commit b0bb65d)
…(RBAC only) (#5866)

Add execution_date_arg
Use execution_date_arg in graph, gantt, and Back To {parent.dag} links.

Add check of execution date

(cherry picked from commit 835eadf)
Francisco Chiang and others added 27 commits October 16, 2019 14:42
* Update databricks operator

* Updated token auth to get from extra_dejson

* Update test DatabricksHookTokenTest to use get host from 'extra'

(cherry picked from commit db770cf)
… patch_dataset and get_dataset (#5546)

Implement BigQuery Hooks/Operators for update_dataset, patch_dataset and get_dataset

(cherry picked from commit 09b9610)
)

* discussion on original PR suggested removing private_key option as init param
* with this PR, can still provide through extras, but not as init param
* also add support for private_key in tunnel -- missing in original PR for this issue
* remove test related to private_key init param
* use context manager to auto-close socket listener so tests can be re-run

(cherry picked from commit 0790ede)
Co-authored-by: Jarek Potiuk <jarek.potiuk@polidea.com>
(cherry picked from commit adfcf67)
The detection of python version is complex because we need to handle
several cases - including determining the version from image name
on DockerHub, detecting python version from python in the environment,
finally forced from python version. This caused multiple problems
with Travis where we run tests with a different version (auto-detected
from the current python, especially when python3 became present in
Travis' python 2.7 images). Now all the jobs in Travis have
PYTHON_VERSION forced and the code responsible for detecting current
python version has been removed as it is not needed in this case.

(cherry picked from commit 351ae4e)
All files are mounted in CI now and checked using the RAT tool.
As opposed to only the runtime-needed files. This is enabled for CI
build only as mounting all local files to Docker (especially on Mac)
has big performance penalty when running the checks (slow osxfs
volume and thousands of small node_modules files generated make the
check runs for a number of minutes). The RAT checks will by default
use the selective volumes but on CI they will mount the whole
source directory.

Also latest version of RAT tool is used now and the output - list
of checked files - is additionally printed as output of the RAT
check so that we are sure the files we expect to be there, are
actually verified.

(cherry picked from commit 7e440da)
rebased on v1-10-stable due to complete k8s refactor on master
@drexpp drexpp changed the base branch from v1-10-stable to master November 22, 2019 11:54
@drexpp (Contributor, Author) commented Nov 22, 2019

I will open my PR from my fork's master branch against this repo's master branch rather than v1-10-stable.

@drexpp drexpp closed this Nov 22, 2019
