This commit fixes [AIRFLOW-6033] UI crashes on "Landing Times"#6634
Closed
drexpp wants to merge 1452 commits into apache:master from drexpp:v1-10-stable
Conversation
(cherry picked from commit e550afc)
(cherry picked from commit 2ea2c53)
(cherry picked from commit 3fac1bd)
(cherry picked from commit 4d491f3)
[AIRFLOW-5233] Fixed consistency in whitespace (tabs/eols) + common problems (#5835) (cherry picked from commit 5cfe9c2)
List the two separate pylint scripts for use inside the Docker containers in CONTRIBUTING.md. (cherry picked from commit a47292d)
[AIRFLOW-5204] Shellcheck + common licence in shell files (#5807) (cherry picked from commit 6420712)
(cherry picked from commit 5e36f42)
(cherry picked from commit 698c38b)
(cherry picked from commit 80b413b)
(cherry picked from commit 46e5fb1)
(cherry picked from commit a317cd2)
(cherry picked from commit 7fb729d)
(cherry picked from commit 5f100db)
(cherry picked from commit d8c9bdc)
(cherry picked from commit e405be0)
(cherry picked from commit 2b94600)
…AG files (#5757) The scheduler calls `list_py_file_paths` to find DAGs to schedule. It does so without passing any parameters other than the directory. This means that it *won't* discover DAGs that are missing the words "airflow" and "DAG" even if DAG_DISCOVERY_SAFE_MODE is disabled. Since `list_py_file_paths` will refer to the configuration if `include_examples` is not provided, it makes sense to have the same behaviour for `safe_mode`. (cherry picked from commit c4a9d8b)
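The safe-mode heuristic that this commit message refers to can be sketched as follows. This is a simplified illustration, not Airflow's actual implementation: with safe mode on, a file is only treated as a DAG candidate if its contents mention both "dag" and "airflow"; with safe mode off, every file is parsed.

```python
def might_contain_dag(file_path: str, safe_mode: bool) -> bool:
    """Simplified sketch of DAG-discovery safe mode (hypothetical helper,
    not Airflow's real code). With safe_mode enabled, only files whose
    contents mention both "dag" and "airflow" are considered DAG candidates;
    with safe_mode disabled, every .py file is parsed."""
    if not safe_mode:
        return True
    with open(file_path, "rb") as f:
        content = f.read().lower()
    return b"dag" in content and b"airflow" in content
```

The bug described above is that the caller never passed `safe_mode` at all, so the default was used instead of the configured `DAG_DISCOVERY_SAFE_MODE` value, and files without those keywords were skipped even when safe mode was disabled in the configuration.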
…e in GoogleCloudStorageToBigQueryOperator. (#5771) Changed the autodetect default value from false to true to avoid breaking downstream services that use GoogleCloudStorageToBigQueryOperator but are not aware of the newly added autodetect field. This fixes the regression introduced by #3880 (cherry picked from commit 462ab88)
* [AIRFLOW-4856] change hard coded run_as_user: try to use worker_run_as_user
* [AIRFLOW-4856] change hard coded run_as_user: add unit test
* [AIRFLOW-4856] change hard coded run_as_user: create new param git_sync_run_as_user
* [AIRFLOW-4856] change hard coded run_as_user: add back remove option
* [AIRFLOW-4856] change hard coded run_as_user: fix Flake8
* [AIRFLOW-4856] change hard coded run_as_user: fix Flake8
* [AIRFLOW-4856] change hard coded run_as_user: fix unit test
* [AIRFLOW-4856] change hard coded run_as_user: change the default value to its old 65533

(cherry picked from commit b0bb65d)
(cherry picked from commit 76fe5e2)
* Update databricks operator
* Updated token auth to get from extra_dejson
* Update test DatabricksHookTokenTest to use get host from 'extra'

(cherry picked from commit db770cf)
(cherry picked from commit ae9608d)
(cherry picked from commit f2b7f5a)
(cherry picked from commit 489e7fe)
(cherry picked from commit 844bbad)
* discussion on original PR suggested removing private_key option as init param
* with this PR, can still provide through extras, but not as init param
* also add support for private_key in tunnel -- missing in original PR for this issue
* remove test related to private_key init param
* use context manager to auto-close socket listener so tests can be re-run

(cherry picked from commit 0790ede)
Co-authored-by: Jarek Potiuk <jarek.potiuk@polidea.com> (cherry picked from commit adfcf67)
(cherry picked from commit c7ed169)
The detection of the python version is complex because we need to handle several cases: determining the version from the image name on DockerHub, detecting the python version from the python in the environment, and finally having the python version forced. This caused multiple problems with Travis, where we ran tests with a different version (auto-detected from the current python), especially when python3 became present in Travis' python 2.7 images. Now all the jobs in Travis have PYTHON_VERSION forced, and the code responsible for detecting the current python version has been removed as it is not needed in this case. (cherry picked from commit 351ae4e)
All files are mounted in CI now and checked using the RAT tool, as opposed to only the runtime-needed files. This is enabled for the CI build only, as mounting all local files into Docker (especially on Mac) carries a big performance penalty when running the checks (the slow osxfs volume and the thousands of small node_modules files generated make the check run for a number of minutes). The RAT checks will by default use the selective volumes, but on CI they will mount the whole source directory. The latest version of the RAT tool is also used now, and the list of checked files is additionally printed as output of the RAT check, so that we are sure the files we expect to be there are actually verified. (cherry picked from commit 7e440da)
rebased on v1-10-stable due to complete k8s refactor on master
I will write my PR from master's branch in my fork to master's branch in this repo rather than v1-10-stable
Adding our company to "Who uses Apache Airflow?"
Hello everyone, could our company appear in the "Who uses Apache Airflow?" section? We are a small team working at Endesa, a large Spanish electricity distributor and part of Enel. I wrote some pipelines to automate the ETL processes we use with Hadoop / Spark, so I believe it would be great for our team.
Endesa [@drexpp]
Make sure you have checked all steps below.
Jira
Description
I targeted v1-10-stable since I think that is what @ashb recommended to me in my last PR.
The Airflow UI crashes in the browser, returning an "Oops" message and the traceback of the error.
This is caused by changing the capitalization of a task_id. Here are some examples that will cause Airflow to crash:
task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task",...
task_id = "Dummy_task" to task_id = "Dummy_tASk"
File causing the problem: https://github.com/apache/airflow/blob/master/airflow/www/views.py (lines 1643 - 1654)
In the first two lines inside the first for loop, we can see how the x and y dictionaries are filled with task_id attributes that come from the current DAG.
The problem arises in the second for loop, when the task instances are fetched for the DAG. I am not sure about this next part and would like someone to clarify it.
I think the task instances (ti) returned by get_task_instances() come from the information stored in the database. That is why accessing the "Landing Times" page crashes: x and y were filled with the current task_id names from the DAG, while the stored task instances carry the old task_id, so the dictionary lookup fails.
One of my main questions is how, after renaming a task (for example from "run" to "Run"), get_task_instances() keeps returning past task instances under the old name: asking for instances of "Run" still returns task instances (ti) with task_id "run".
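The mismatch described above can be sketched in a few lines. This is a hypothetical reproduction, not the actual views.py code: the dictionary keys come from the current DAG, while the lookups use task_ids stored in the database before the rename, and dict keys are case-sensitive.

```python
# Hypothetical sketch of the "Landing Times" crash (not the real views.py
# code): keys come from the current DAG, lookups from stored task instances.
current_task_ids = ["dUMMY_TASK"]      # task_id after the rename in the DAG
y = {task_id: [] for task_id in current_task_ids}

stored_task_ids = ["DUMMY_TASK"]       # old task_id still in the database
for ti_task_id in stored_task_ids:
    try:
        y[ti_task_id].append(1.23)     # KeyError: 'DUMMY_TASK'
    except KeyError:
        print(f"missing key: {ti_task_id!r}")

# One defensive option is to create missing keys on the fly instead of
# assuming every stored task_id still exists in the DAG:
for ti_task_id in stored_task_ids:
    y.setdefault(ti_task_id, []).append(1.23)
```

With the `setdefault` variant the page logic would simply collect the stale task_id under its own key rather than crashing; whether to show or skip such orphaned task instances is a separate design question.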
Error screenshot
How to replicate:
Tests
My PR adds the following unit tests OR does not need testing for this extremely good reason:
I didn't know exactly how to unit-test this; if you have any advice I will write a test for it. Other than that, I did test manually and checked that the behaviour was as expected:
Commits
Documentation
Code Quality
flake8