Dp 432 airflow deep dive notes #19666
Closed: parejadan wants to merge 10,000 commits into apache:constraints-2.1.0-setuptools from parejadan:DP-432_airflow_deep_dive_notes
Conversation
From 1.10.x -> 2.0, the required permissions to trigger a DAG have changed from DAG.can_edit to DAG.can_read + DagRun.can_create. Since the Viewer role has DAG.can_read by default, it isn't possible to give a Viewer trigger access to a single DAG without giving access to all DAGs. This fixes that discrepancy by making the trigger requirement DAG.can_edit + DagRun.can_create. Now, to trigger a DAG, a viewer will need to be given both permissions, as neither is included in the Viewer role by default. This PR also hides the Trigger/Refresh buttons on the home page if the user doesn't have permission to perform those actions. closes: apache#13891 related: apache#13891 (cherry picked from commit 629abfd)
(cherry picked from commit 570d322)
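As a hedged sketch (not code from this PR) of the trigger-permission change described above, a DAG-level `access_control` grant is one way the per-DAG half of the new requirement could be given to the Viewer role; the DAG id and task below are hypothetical, and the role-level `DagRun.can_create` permission would still have to be granted separately (for example through the security views or CLI).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(
    dag_id="viewer_triggerable_dag",          # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    # Per-DAG grant: lets the Viewer role edit (and therefore trigger) only this DAG.
    access_control={"Viewer": {"can_read", "can_edit"}},
) as dag:
    DummyOperator(task_id="noop")
```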
Fixes the issue wherein, regardless of what role anonymous users are assigned (via the `AUTH_ROLE_PUBLIC` env var), they can't see any DAGs. Current behavior: Anonymous users are handled as a special case by Airflow's DAG-related security methods (`.has_access()` and `.get_accessible_dags()`). Rather than checking the `AUTH_ROLE_PUBLIC` value for role permissions, the methods reject access to view or edit any DAGs. Changes in this PR: Rather than hardcoding permission rules inside the security methods, this change checks the `AUTH_ROLE_PUBLIC` value and gives anonymous users all permissions linked to the designated role. **This places security in the hands of the Airflow users. If the value is set to `Admin`, anonymous users will have full admin functionality.** This also changes how the `Public` role is created. Currently, the `Public` role is created automatically by Flask App Builder. This PR explicitly declares `Public` as a default role with no permissions in `security.py`. This change makes it easier to test. closes: apache#13340 (cherry picked from commit 78aa921)
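For context, a minimal sketch of the setting this change reads, assuming it is configured in `webserver_config.py` (the Flask App Builder configuration file that Airflow loads):

```python
# webserver_config.py (sketch): anonymous users inherit every permission of the
# named role; setting this to "Admin" would give unauthenticated visitors full
# admin functionality, as the warning above notes.
AUTH_ROLE_PUBLIC = "Viewer"
```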
…pache#14032) In the case of an OperationalError (caused by deadlocks, network blips), the scheduler will now retry those methods 3 times. closes apache#11899 closes apache#13668 (cherry picked from commit 914e9ce)
closes apache#14050 We were not de-serializing `BaseOperator.sla` properly, hence we were returning a float instead of a `timedelta` object. Example: 100.0 instead of timedelta(seconds=100). And because we had a check in _manage_sla in `SchedulerJob` and `DagFileProcessor`, we were skipping SLAs. SchedulerJob: https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L1766-L1768 DagFileProcessor: https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L395-L397 (cherry picked from commit 604a37e)
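As an illustration of the shape of the bug (not the serializer's actual code), the deserialized value needs to come back as a `timedelta` rather than a raw float of seconds:

```python
from datetime import timedelta

serialized_sla = 100.0                            # what deserialization used to hand back
restored_sla = timedelta(seconds=serialized_sla)  # what the SLA handling code expects

assert restored_sla == timedelta(seconds=100)
```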
This fixes the test test_should_load_plugins_from_property, which is currently quarantined as a "Heisentest". Current behavior: The test currently fails because the records that it expects to find in the logger are not present. Cause: While the test sets the log level to "DEBUG", it doesn't specify which logger to update. Python loggers are namespaced (typically based on the current file's module path), but the namespace has to be given explicitly; in its absence, a lookup returns the root logger instance. The test is therefore updating the log level of the root logger, but when the test runs, the plugins_manager.py file defines a namespaced logger, log = logging.getLogger(__name__), used throughout the file. Since a different logger is used, the original log level, in this case INFO, applies. INFO is a higher level than DEBUG, so the calls to log.debug() get filtered out, and when the test looks for log records it finds an empty list. Fix: Specify which logger to update when modifying the log level in the test. (cherry picked from commit e80ad5a)
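A standalone sketch of the namespacing behaviour described above (the logger name is illustrative): changing the level of the root logger does not lower the effective level of a namespaced logger that already has its own level set, so the fix is to target the namespaced logger directly.

```python
import logging

# The module under test uses a namespaced logger: log = logging.getLogger(__name__)
module_log = logging.getLogger("airflow.plugins_manager")
module_log.setLevel(logging.INFO)              # level applied by Airflow's logging setup

logging.getLogger().setLevel(logging.DEBUG)    # what the test effectively changed (root logger)
print(module_log.isEnabledFor(logging.DEBUG))  # False: DEBUG calls are still filtered out

module_log.setLevel(logging.DEBUG)             # the fix: update the namespaced logger itself
print(module_log.isEnabledFor(logging.DEBUG))  # True
```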
closes apache#13464 (cherry picked from commit eb78a8b)
…pache#14067) Only `Admin` or `Op` roles should have permissions to view Configurations. Previously, Users with `User` or `Viewer` role were able to get/view configurations using the REST API or in the Webserver. From Airflow 2.0.1, only users with `Admin` or `Op` role would be able to get/view Configurations. (cherry picked from commit 3909232)
…s. (apache#14107) Revert "Fix Commands to install Airflow in docker/install_airflow.sh (apache#14099)". This reverts commit 68758b8. Also fixes the docker build script that was the reason for the original attempt to fix it. (cherry picked from commit 212d5cd)
(cherry picked from commit 5090fb0)
(cherry picked from commit 69801f5)
(cherry picked from commit 6fd93ba)
`attachable` is only a property of compose version 3.1 files, but we are
on 2.2 still.
This was failing on self-hosted runners with an error
`networks.example.com value Additional properties are not allowed
('attachable' was unexpected)`
(cherry picked from commit 5dbabdd)
(cherry picked from commit 0c1ad94)
…13730) * Run CI (Build Image and committer PRs) on self-hosted runner

The queue for GitHub Actions is starting to get painful. This PR changes it to run any build of main (push or scheduled), or a PR from a committer, on the self-hosted runner. However, since this setting is configured _in_ the repo, it could be changed by a PR, so we can't rely on that; we also use a custom build of the self-hosted runner binary that enforces PRs are from contributors, otherwise it will fail the build.

* Don't free space on self-hosted runners

It deletes the images we have pre-cached (defeating the point of pre-loading them) and we manage freeing space via another process. The exception to this is the K8s tests, where we don't need the images we pre-cache, but we do want the space in our tmpfs.

* Some jobs take longer on self-hosted runners

* When running KinD tests on CI, pass through the host docker creds to KinD

This enables self-hosted runners (which are logged in to a Docker Hub user) to pull images without running into pull limits.

* Work around spurious blocking IO failures

This seems to be an occasional problem on the build provider package job, which produces a lot of output all at once. The error appears as the job just failing with an exit code (sometimes 1, other times 120) but no error message.

(cherry picked from commit 4816707)
The change to this file was untestable until it was merged to master :( (cherry picked from commit 9005f28)
When the image is prepared, PIP installation produces progress bars, which are annoying - especially in the CI environment. This PR adds an argument to control the progress bar and sets it to "off" for CI builds. (cherry picked from commit 9b7852e)
…4221) Fixes apache#14104 (cherry picked from commit af5eb55)
…4239) When the user wants to test pull requests or - especially - master builds in their own fork, they will likely not have self-hosted runners configured and the builds will fail. This change uses self-hosted runners only if the build is run in the context of the `apache/airflow` repository. (cherry picked from commit 14890f0)
…pache#14239) (apache#14242) (cherry picked from commit 302be50)
This change attempts to stabilize pylint checks. Since we have recently added self-hosted runners with multiple CPUs, re-enabling parallel mode for pylint makes perfect sense, as we will finally be able to use the parallelism and speed up static checks significantly. Previously the tests were run in single-processor mode in an attempt to avoid random mistakes where different files were processed in different processes. This led to random pylint or mypy problems and either false positives or false negatives, especially when it came to circular dependencies, but since we are now past heavy refactoring, this should be no problem for future changes, and an occasional false positive/negative is less disruptive than long checks. An attempt is made to apply a sorting order to the files passed to pylint. This should provide more stability in the results of running the tests in PRs and in master. We had some custom pylint plugins that prevented the use of pylint parallelism. For now we are giving up on one of the plugins (no use of asserts) and we rely on committer review for that (we have a rule in place to only use asserts in tests). The other plugin was replaced by coming back to a separation of "main code" and "test code" and applying different rules to each - we now have two different configuration files, pylintrc and pylintrc-tests, to control settings for those two cases. Mypy and flake8 have been parallelized at the level of pre-commits. By implementing those changes we are likely to speed up the tests on self-hosted runners 6x-8x. Also, caching of pre-commit environments (including the pre-commit installation) is improved. Previously we only cached the environment created by pre-commit, but with this change we also cache the pre-commit installation in the .local directory - this should save 1-2 minutes per job. (cherry picked from commit 82cb041)
…pache#14227) There are two types of constraints now:

* default constraints - containing all dependencies of airflow, all the provider packages released at the time of the release of that version, as well as all transitive dependencies. Following those constraints, you can be sure Airflow's installation is repeatable.

* no-providers constraints - containing only the dependencies needed for the core airflow installation. This allows installing/upgrading airflow without also forcing the providers to be installed at a specific version of Airflow. This allows for flexible management of Airflow and provider packages separately.

Documentation about it has been added. Also, the provider 'extras' for apache airflow no longer keep direct dependencies on the packages needed by the provider. Those dependencies are now transitive only - so a 'provider' extra only depends on the 'apache-airflow-provider-EXTRA' package and all the dependencies are transitive. This will help in the future to avoid conflicts when installing newer providers using extras. (cherry picked from commit d524cec)
(cherry picked from commit cc9827d)
* Fix caching of python images during builds

After the GHCR change, GitHub image caching for python images was broken. GHCR requires each image that we push to it to be tagged with the "org.opencontainers.image.source" label to point to the right project repository. Otherwise it will not appear in the repository images and we will not be able to manage permissions of the image. So we've been extending the python base image with the appropriate label, but it was not done in all places necessary. As a result, the builds in CI were using different images than the cache image in DockerHub, which in turn caused full image rebuilds ALWAYS. By implementing this change we make sure that the "airflow-tainted" python images - including the label - are used. All images (when using breeze) are now built using such an airflow-tainted image, which in turn should give up to 5-8 minutes of savings on the image-building step per job (CI and production).

The change implements the following behaviours:

1) When the image is built on CI or locally without the UPGRADE_TO_NEWER_DEPENDENCIES flag and without the FORCE_PULL_IMAGES flag - the latest image from the GitHub Registry or DockerHub registry is used.

2) When both FORCE_PULL_IMAGES and UPGRADE_TO_NEWER_DEPENDENCIES are set to something different than "false" - the latest Python images are used: they are pulled first and tagged appropriately. This way, whenever UPGRADE_TO_NEWER_DEPENDENCIES is set (also when master pushes are done), such builds will use the latest version of the python base images published on DockerHub.

* Update _push_pull_remove_images.sh

(cherry picked from commit b7d9c34)
…che#15197) This feature was added in apache#11784 but it was broken as it got `pod_template_override` from `executor_config` instead of `pod_template_file`. closes apache#14199 (cherry picked from commit 5606137)
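For illustration, a hedged sketch of the per-task override this fixes; the DAG id, task, and template path are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(dag_id="pod_template_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    DummyOperator(
        task_id="uses_custom_pod_template",
        # The Kubernetes executor reads the "pod_template_file" key from
        # executor_config; the bug had it looking up "pod_template_override".
        executor_config={"pod_template_file": "/opt/airflow/pod_templates/custom.yaml"},
    )
```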
1. fix typo parametrized -> parameterized 2. update `from airflow.operators.bash_operator import BashOperator` -> `from airflow.operators.bash import BashOperator` (cherry picked from commit 4099108)
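For reference, the updated import path in a minimal (hypothetical) DAG:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator  # Airflow 2 path; airflow.operators.bash_operator is deprecated

with DAG(dag_id="bash_import_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    BashOperator(task_id="say_hello", bash_command="echo hello")
```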
- Fix an apparent mistake in doc relating to catchup - Fix typo pickable (should be picklable) (cherry picked from commit 53dafa5)
…che#15191) closes apache#15178 closes apache#13761 This feature was added in 2015 in apache#74 and it was expected to set `doc_md` (or `doc_rst` and other `doc_*`) via `task.doc_md` instead of passing via arg. However, this did not work with DAG Serialization as we only allowed a selected args to be stored in Serialized version of DAG. (cherry picked from commit e86f5ca)
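A small sketch (hypothetical DAG) of the attribute-assignment style that this change makes survive DAG serialization:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(dag_id="doc_md_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    task = DummyOperator(task_id="documented_task")
    # Assigned after construction - the pattern that previously got lost during serialization.
    task.doc_md = "### Documented task\nThis note should now survive DAG serialization."
```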
…pache#14160) closes: apache#13805 (cherry picked from commit 2b5d4e3)
This adds back the base lineage backend which can be extended to send lineage metadata to any custom backend. closes: apache#14106 Co-authored-by: Joao Ponte <jpe@plista.com> Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com> (cherry picked from commit af2d11e)
(cherry picked from commit 932f8c2)
Fixes: apache#14675 Instead of building the relative url manually, we can leverage flask's url generation to account for differing airflow base URL and HTML base URL. (cherry picked from commit aaa3bf6)
Currently, as long as the argument '-p' is present, the code tries to mask it. However, '-p' may mean something else (not a password), such as a boolean flag. Such cases may result in an exception. (cherry picked from commit 486b764)
…eated Pods (apache#15204) closes: apache#15193 Currently, the condition of whether the pod was created by Airflow is not considered. This commit fixes that. We decide whether a Pod was created by Airflow by checking whether it has all the labels added in PodGenerator.construct_pod() or KubernetesPodOperator.create_labels_for_pod(). (cherry picked from commit c594d9c)
(cherry picked from commit 97b7780)
The current implementation imports Connection at import time, which causes a circular import when a model class needs to reference a hook class. By applying this fix, the airflow.hooks package is completely decoupled from airflow.models at import time, so model code can reference hooks. Hooks, on the other hand, generally don't reference model classes. Fix apache#15325. (cherry picked from commit 7560316)
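An illustrative pattern (hypothetical hook, not the PR's exact diff) of how deferring the model import into the method body breaks the import-time cycle:

```python
class ExampleHook:
    """Importing this module no longer pulls in airflow.models at import time."""

    def get_connection(self, conn_id: str):
        # Deferred import: resolved only when the method runs, so the hooks
        # package stays decoupled from airflow.models during module import.
        from airflow.models.connection import Connection

        return Connection.get_connection_from_secrets(conn_id)
```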
This module was removed/renamed in mysqlclient 2.0 -- this new name works for both (cherry picked from commit a162f2b)
(cherry picked from commit f69bb82)
There is some problem where the stackdriver logging "leaks" out of this test and remains configured (whereas the S3 one doesn't), but given this test is not actually testing logs, a quick fix is just to change which remote logging handler we use.
After stabilizing the builds on CI, the master builds started to finally get green - except pushing to the prod image cache, which continued to fail as it was missing the python image to push. This PR fixes it. (cherry picked from commit 1dfbb8d)
) This bug got introduced in apache#14909. Removed sorting from lists and tuples, as lists & tuples preserve order, unlike sets. The following DAG errors with `TypeError: '<' not supported between instances of 'dict' and 'dict'`:

```python
from airflow import models
from airflow.operators.dummy import DummyOperator
from datetime import datetime, timedelta

params = {
    "staging_schema": [{"key:": "foo", "value": "bar"}, {"key:": "this", "value": "that"}]
}

with models.DAG(dag_id='test-dag',
                start_date=datetime(2019, 2, 14),
                schedule_interval='30 13 * * *',
                catchup=False,
                max_active_runs=1,
                params=params) as dag:
    my_task = DummyOperator(task_id='task1')
```

Full Error:

```
File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 210, in <dictcomp>
    return cls._encode({str(k): cls._serialize(v) for k, v in var.items()}, type_=DAT.DICT)
File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 212, in _serialize
    return sorted(cls._serialize(v) for v in var)
TypeError: '<' not supported between instances of 'dict' and 'dict'

During handling of the above exception, another exception occurred:
...
```

This is because `sorted()` does not work with dicts, as it can't compare them. Removing sorting from lists & tuples fixes it. Sorting also fails when we have a set with multiple types. (cherry picked from commit d115040)
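The underlying comparison failure can be reproduced in isolation:

```python
# Dicts are not orderable, so sorting a list of them raises the same error the
# serializer hit; after the fix, list/tuple order is simply preserved.
sorted([{"key": "foo"}, {"key": "this"}])
# TypeError: '<' not supported between instances of 'dict' and 'dict'
```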
…nches (apache#15394) They use the same python image as master (as already mentioned in the comments in ci_prepare_prod_image_on_ci.sh) so we don't want to try and push the python image when we aren't building the main branch. (cherry picked from commit f94effe)
(cherry picked from commit a0b217a)
After merging the constraints, the 'recursive' mode was not added to checkout, resulting in a non-checked-out github push action. This commit fixes it and adds color to the diff output in the commit to better show differences when pushing. (cherry picked from commit 6b78394)
(cherry picked from commit adbab36)
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.