
Conversation

@parejadan


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

jhtimmins and others added 30 commits February 4, 2021 14:34
From 1.10.x -> 2.0, the required permissions to trigger a DAG changed from DAG.can_edit to DAG.can_read + DagRun.can_create. Since the Viewer role has DAG.can_read by default, it isn't possible to give a Viewer trigger access to a single DAG without giving access to all DAGs.

This fixes that discrepancy by making the trigger requirement DAG.can_edit + DagRun.can_create. Now, to trigger a DAG, a viewer will need to be given both permissions, as neither is included with the Viewer role by default.

This PR also hides the Trigger/Refresh buttons on the home page if the user doesn't have permission to perform those actions.
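
As an illustration, a minimal sketch of the new requirement, assuming Airflow's `permissions` constants and a FAB-style `security_manager.has_access()` (not the PR's exact code):

```python
# Illustrative sketch only - not the exact code from this PR.
from airflow.security import permissions

def can_trigger_dag(security_manager, dag_id: str) -> bool:
    # Triggering now requires can_edit on the specific DAG *and*
    # can_create on DagRun - neither ships with the Viewer role.
    return security_manager.has_access(
        permissions.ACTION_CAN_EDIT, permissions.resource_name_for_dag(dag_id)
    ) and security_manager.has_access(
        permissions.ACTION_CAN_CREATE, permissions.RESOURCE_DAG_RUN
    )
```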

closes: apache#13891
related: apache#13891
(cherry picked from commit 629abfd)
Fixes the issue wherein regardless of what role anonymous users are assigned (via the `AUTH_ROLE_PUBLIC` env var), they can't see any DAGs.

Current behavior:
Anonymous users are handled as a special case by Airflow's DAG-related security methods (`.has_access()` and `.get_accessible_dags()`). Rather than consulting the `AUTH_ROLE_PUBLIC` value for role permissions, the methods reject access to view or edit any DAGs.

Changes in this PR:
Rather than hardcoding permission rules inside the security methods, this change checks the `AUTH_ROLE_PUBLIC` value and gives anonymous users all permissions linked to the designated role.

**This places security in the hands of the Airflow users. If the value is set to `Admin`, anonymous users will have full admin functionality.**

This also changes how the `Public` role is created. Currently, the `Public` role is created automatically by Flask App Builder. This PR explicitly declares `Public` as a default role with no permissions in `security.py`. This change makes it easier to test.
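
A minimal sketch of the idea (assumed shape, not the PR's exact code): look up the configured public role and hand its permissions to the anonymous user.

```python
# Assumed shape of the lookup - illustrative, not the PR's exact code.
from flask import current_app

def anonymous_user_permissions(security_manager) -> set:
    public_role_name = current_app.config.get("AUTH_ROLE_PUBLIC")
    role = security_manager.find_role(public_role_name) if public_role_name else None
    # No configured role (or an unknown one) means no permissions.
    return set(role.permissions) if role else set()
```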

closes: apache#13340
(cherry picked from commit 78aa921)
…pache#14032)

In the case of OperationalError (caused by deadlocks, network blips), the scheduler will now retry those methods 3 times.
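
A sketch of the retry shape using tenacity (an assumption - the actual change may use its own retry helper):

```python
# Illustrative only: retry DB-touching scheduler methods up to 3 times
# on transient OperationalError (deadlocks, network blips).
import tenacity
from sqlalchemy.exc import OperationalError

retry_on_operational_error = tenacity.retry(
    retry=tenacity.retry_if_exception_type(OperationalError),
    stop=tenacity.stop_after_attempt(3),
    wait=tenacity.wait_exponential(max=10),
    reraise=True,
)

@retry_on_operational_error
def _do_scheduling(session):
    ...  # scheduler work that may hit a transient database error
```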

closes apache#11899
closes apache#13668

(cherry picked from commit 914e9ce)
closes apache#14050

We were not de-serializing `BaseOperator.sla` properly, hence
we were returning a float instead of a `timedelta` object.

Example: 100.0 instead of timedelta(seconds=100)

And because we had a check in _manage_sla in `SchedulerJob` and `DagFileProcessor`,
we were skipping SLAs.

SchedulerJob:
https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L1766-L1768

DagFileProcessor:
https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L395-L397
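
A minimal sketch of the fix's effect (the helper name is hypothetical):

```python
# Convert the serialized float back into a timedelta so downstream
# isinstance checks on `sla` pass again.
from datetime import timedelta

def deserialize_sla(value: float) -> timedelta:  # hypothetical helper
    return timedelta(seconds=value)

assert deserialize_sla(100.0) == timedelta(seconds=100)
```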
(cherry picked from commit 604a37e)
This fixes the test test_should_load_plugins_from_property, which is currently quarantined as a "Heisentest".

Current behavior:
The test currently fails because the records that it expects to find in the logger are not present.

Cause:
While the test sets the logger as "DEBUG", it doesn't specify which logger to update. Python loggers are namespaced (typically based on the current file's path), but the namespace has to be given explicitly. In the absence of a specified logger name, any lookup returns the root logger instance.

The test is therefore updating the log level of the root logger, but when the test runs, the plugins_manager.py file defines a namespaced logger (`log = logging.getLogger(__name__)`) used throughout the file. Since a different logger is used, the original log level, in this case INFO, applies. INFO is a higher level than DEBUG, so the calls to `log.debug()` get filtered out, and when the test looks for log records it finds an empty list.

Fix:
Just specify which logger to update when modifying the log level in the test.
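
Assumed shape of the fix (the module's `__name__` logger resolves to "airflow.plugins_manager"):

```python
# Raise the level on the namespaced logger the module actually uses,
# not on the root logger.
import logging

logging.getLogger("airflow.plugins_manager").setLevel(logging.DEBUG)
```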

(cherry picked from commit e80ad5a)
…pache#14067)

Only `Admin` or `Op` roles should have permissions to view Configurations.

Previously, Users with `User` or `Viewer` role were able to get/view configurations using
the REST API or in the Webserver. From Airflow 2.0.1, only users with `Admin` or `Op` role would be able
to get/view Configurations.

(cherry picked from commit 3909232)
…s. (apache#14107)

Revert "Fix Commands to install Airflow in docker/install_airflow.sh (apache#14099)"

This reverts commit 68758b8.

Also fixes the docker build script that was the reason for the
original fix attempt.

(cherry picked from commit 212d5cd)
`attachable` is only a property of compose version 3.1 files, but we are
still on 2.2.

This was failing on self-hosted runners with an error
`networks.example.com value Additional properties are not allowed
('attachable' was unexpected)`

(cherry picked from commit 5dbabdd)
…13730)

* Run CI (Build Image and committer PRs) on self-hosted runner

The queue for GitHub actions is starting to get painful.

This PR changes it to run any build of main (push or scheduled), or a PR
from a committer on the self-hosted runner.

However, since this setting is configured _in_ the repo, it could be
changed by a PR, so we can't rely on it alone; we also use a custom build of
the self-hosted runner binary that enforces that PRs are from contributors,
otherwise the build will fail.

* Don't free space on self-hosted runners

It deletes the images we have pre-cached (defeating the point of
pre-loading them) and we manage freeing space via another process.

The exception to this is the K8s tests, where we don't need the images
we pre-cache, but we do want the space in our tmpfs.

* Some jobs take longer on self-hosted runners

* When running KinD tests on CI, pass through the host docker creds to KinD

This enables self-hosted runners (which are logged in to a Docker Hub
user) to pull images without running into pull limits.

* Work around spurious blocking IO failures

This seems to be an occasional problem on the build provider package job
which produces a lot of output all at once.

The error appears as the job simply failing with an exit code (sometimes 1, other times 120) but no
error message.

(cherry picked from commit 4816707)
The change to this file was untestable until it was merged to master :(

(cherry picked from commit 9005f28)
When the image is prepared, PIP installation produces progress
bars, which are annoying - especially in the CI environment.

This PR adds an argument to control the progress bar and sets it to "off"
for CI builds.

(cherry picked from commit 9b7852e)
…4239)

When the user wants to test pull requests or - especially - master
builds in their own fork, they will likely not have self-hosted
runners configured and the builds will fail.

This change uses self-hosted runners only if the build is run in
the context of the `apache/airflow` repository.

(cherry picked from commit 14890f0)
This change attempts to stabilize pylint checks. Since we have
recently added self-hosted runners with multiple CPUs, it seems that
re-enabling parallel mode for pylint makes perfect sense, as we will
finally be able to use the parallelism and speed up static checks
significantly.

Previously the tests were run in single-processor mode in an attempt
to avoid random mistakes where different files were processed in
different processes. This led to random pylint or mypy problems -
either false positives or false negatives - especially when it came
to circular dependencies. But since we are now past the heavy
refactoring, this should be no problem for future changes,
and an occasional false positive/negative is less disruptive than
long checks.

An attempt is made to apply a sorted order to the files passed
to pylint. This should provide more stability in the results
of running the tests in PRs and in master.

We had some custom pylint plugins that prevented the use of pylint
parallelism. For now we are giving up on one of the plugins
(no use of asserts) and we rely on committer review for that (we
have a rule in place to only use asserts in tests). The other
plugin was replaced by going back to the separation of "main code"
and "test code" and applying different rules to each - we now
have two different configuration files: pylintrc and
pylintrc-tests to control the settings for those two cases.

Mypy and flake8 have been parallelized at the level of pre-commits.

By implementing those changes we are likely to speed up the
tests on self-hosted runners 6x-8x.

Also, caching of pre-commit environments (including the
pre-commit installation) is improved. Previously we only cached the
environments created by pre-commit, but with this change we
also cache the pre-commit installation in the .local directory - this
should save 1-2 minutes per job.

(cherry picked from commit 82cb041)
…pache#14227)

There are two types of constraints now:

* default constraints that contain all dependencies of airflow,
  all the provider packages released at the time of the release
  of that version, as well as all transitive dependencies. Following
  those constraints, you can be sure Airflow's installation is
  repeatable

* no-providers constraints - containing only the dependencies needed
  for the core airflow installation. This allows installing/upgrading
  airflow without also forcing the providers to be installed at the
  versions pinned for that version of Airflow.

This allows for flexible management of Airflow and Provider
packages separately. Documentation about it has been added.

Also, the provider 'extras' for apache-airflow no longer keep direct
dependencies on the packages needed by the provider. Those
dependencies are now transitive only - a 'provider' extra only
depends on the 'apache-airflow-providers-EXTRA' package and all
the dependencies are transitive. This will help in the future
to avoid conflicts when installing newer providers using extras.

(cherry picked from commit d524cec)
* Fix caching of python images during builds

After the GHCR change, github image caching for python images was
broken. GHCR requires each image that we push to it to be
tagged with the "org.opencontainers.image.source" label pointing
to the right project repository. Otherwise it will not appear
among the repository images and we will not be able to manage
the image's permissions.

So we've been extending the python base image with the appropriate
label, but this was not done in all the places necessary. As a result,
the builds in CI were using different images than the cache
image in DockerHub, which in turn caused full image rebuilds
ALWAYS. This change makes sure that we always use the
"airflow-tainted" python images - including the label.

All images (when using breeze) are now built using such an
airflow-tainted image, which in turn should save up to 5-8
minutes on the image-building step per job (CI
and production).

The change implements the following behaviours:

1) When the image is built on CI or locally without the
   UPGRADE_TO_NEWER_DEPENDENCIES flag and without the
   FORCE_PULL_IMAGES flag - the latest image from the
   GitHub Registry or DockerHub registry is used.

2) When both FORCE_PULL_IMAGES and UPGRADE_TO_NEWER_DEPENDENCIES
   are set to something different than "false" - the latest Python
   images are used - they are pulled first and tagged appropriately.

This way - whenever UPGRADE_TO_NEWER_DEPENDENCIES is set (also
when master pushes are done) - such builds will use the latest version
of the python base images published on DockerHub.

* Update _push_pull_remove_images.sh

(cherry picked from commit b7d9c34)
kaxil and others added 25 commits April 15, 2021 14:00
…che#15197)

This feature was added in apache#11784 but
it was broken as it got `pod_template_override` from `executor_config`
instead of `pod_template_file`.

closes apache#14199

(cherry picked from commit 5606137)
1. fix typo: parametrized -> parameterized
2. update `from airflow.operators.bash_operator import BashOperator` -> `from airflow.operators.bash import BashOperator`

(cherry picked from commit 4099108)
- Fix an apparent mistake in doc relating to catchup
- Fix typo pickable (should be picklable)

(cherry picked from commit 53dafa5)
…che#15191)

closes apache#15178
closes apache#13761

This feature was added in 2015 in apache#74 and it was expected that `doc_md` (or `doc_rst` and other `doc_*`) would be set via `task.doc_md` instead of being passed via an arg. However, this did not work with DAG Serialization, as we only allowed a selected set of args to be stored in the Serialized version of a DAG.
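
Usage sketch based on the description above (illustrative, not from the PR):

```python
# With the fix, docs set as attributes survive DAG serialization.
from airflow.operators.bash import BashOperator

task = BashOperator(task_id="example", bash_command="echo hi")
task.doc_md = "### This markdown now round-trips through DAG serialization"
```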

(cherry picked from commit e86f5ca)
This adds back the base lineage backend which can be extended to send lineage metadata to any custom backend.
closes: apache#14106

Co-authored-by: Joao Ponte <jpe@plista.com>
Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
(cherry picked from commit af2d11e)
Fixes: apache#14675

Instead of building the relative url manually, we can leverage flask's url generation to account for differing airflow base URL and HTML base URL.
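
Sketch of the idea (the endpoint name is an assumption for illustration):

```python
# Let Flask build the URL so the configured base URL is respected,
# instead of concatenating the path by hand. Must run inside a
# request/application context.
from flask import url_for

link = url_for("Airflow.graph", dag_id="example_dag")
```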

(cherry picked from commit aaa3bf6)
Currently, as long as the argument '-p' is present, the code tries to mask it.

However, '-p' may mean something else (not a password), like a boolean flag. Such cases may result in an exception.
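
A sketch of a safer masking approach (illustrative only, not the PR's exact code):

```python
# Mask the value *after* '-p' and tolerate a bare boolean '-p' flag
# instead of raising.
def mask_password_args(args):
    masked = list(args)
    for i, arg in enumerate(masked):
        if arg == "-p" and i + 1 < len(masked):
            masked[i + 1] = "******"
    return masked

assert mask_password_args(["-p", "secret"]) == ["-p", "******"]
assert mask_password_args(["--verbose", "-p"]) == ["--verbose", "-p"]  # no crash
```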

(cherry picked from commit 486b764)
…eated Pods (apache#15204)

closes: apache#15193

Currently, whether the pod was created by Airflow is not considered. This commit fixes that.

We decide whether a Pod was created by Airflow by checking if it has all the labels added in PodGenerator.construct_pod() or KubernetesPodOperator.create_labels_for_pod().
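
A sketch of the check with assumed label names (the real set comes from the methods above):

```python
# Only pods carrying all the labels Airflow attaches at creation time
# are treated as Airflow-managed. Label names here are assumptions.
AIRFLOW_POD_LABELS = {"dag_id", "task_id", "run_id", "kubernetes_executor"}

def is_airflow_created_pod(pod) -> bool:
    labels = (pod.metadata.labels or {}) if pod.metadata else {}
    return AIRFLOW_POD_LABELS.issubset(labels)
```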

(cherry picked from commit c594d9c)
…he#15338)

We've seen instances of connection resets happening, particularly in
Azure, that are remedied by enabling tcp_keepalive. Enabling it by
default should be safe and sane regardless of where we are running.
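
For reference, a hedged sketch of what enabling TCP keepalive typically looks like at the socket level (the PR flips an equivalent option on by default; the exact knobs below are assumptions):

```python
import socket

# SO_KEEPALIVE is portable; the idle/interval knobs are Linux-specific.
KEEPALIVE_SOCKET_OPTIONS = [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
    (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 120),
    (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30),
]
```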

(cherry picked from commit 6e31465)
The current implementation imports Connection at import time, which
causes a circular import when a model class needs to reference a hook
class.

By applying this fix, the airflow.hooks package is completely decoupled
from airflow.models at import time, so model code can reference hooks.
Hooks, on the other hand, generally don't reference model classes.

Fix apache#15325.
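
Minimal sketch of the decoupling pattern (assumed to match the spirit of the fix):

```python
# Defer the Connection import to call time so airflow.hooks no longer
# pulls in airflow.models at module import.
def get_connection(conn_id: str):
    from airflow.models.connection import Connection  # deferred import breaks the cycle

    return Connection.get_connection_from_secrets(conn_id)
```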

(cherry picked from commit 7560316)
This module was removed/renamed in mysqlclient 2.0 -- this new name
works for both

(cherry picked from commit a162f2b)
There is some problem where the stackdriver logging "leaks" out of this
test and remains configured (whereas the S3 one doesn't), but given that
this test is not actually testing logs, a quick fix is just to change
which remote logging handler we use.
After stabilizing the builds on CI, the master builds started
to finally get green - except pushing to the prod image cache, which
continued to fail as it was missing the python image to push.

This PR fixes it.

(cherry picked from commit 1dfbb8d)
)

This bug was introduced in apache#14909. Removed sorting from lists and tuples, as lists & tuples preserve order, unlike sets.

The following DAG errors with: `TypeError: '<' not supported between instances of 'dict' and 'dict'`

```python
from airflow import models
from airflow.operators.dummy import DummyOperator
from datetime import datetime, timedelta
params = {
    "staging_schema": [{"key:":"foo","value":"bar"},
                       {"key:":"this","value":"that"}]
}

with models.DAG(dag_id='test-dag',
                start_date=datetime(2019, 2, 14),
                schedule_interval='30 13 * * *',
                catchup=False,
                max_active_runs=1,
                params=params
                ) as dag:
    my_task = DummyOperator(
        task_id='task1'
    )
```

Full Error:

```
  File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 210, in <dictcomp>
    return cls._encode({str(k): cls._serialize(v) for k, v in var.items()}, type_=DAT.DICT)
  File "/usr/local/lib/python3.7/site-packages/airflow/serialization/serialized_objects.py", line 212, in _serialize
    return sorted(cls._serialize(v) for v in var)
TypeError: '<' not supported between instances of 'dict' and 'dict'
During handling of the above exception, another exception occurred:
...
```

This is because `sorted()` does not work with dicts, as they can't be compared. Removing the sorting from lists & tuples fixes it.
Sorting also fails when we have a set containing multiple types.
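
A sketch of the fix's shape (not the exact serializer code):

```python
# Keep list/tuple order as-is; only sets, which are unordered, get a
# deterministic order imposed - here via str() so mixed-type sets
# don't crash either.
def _serialize(v):
    return v  # stand-in for the real recursive serializer

def serialize_collection(var):
    if isinstance(var, (list, tuple)):
        return [_serialize(v) for v in var]  # preserve order; no sorting
    if isinstance(var, set):
        return sorted((_serialize(v) for v in var), key=str)
    return _serialize(var)
```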

(cherry picked from commit d115040)
…nches (apache#15394)

They use the same python image as master (as already mentioned in the
comments in ci_prepare_prod_image_on_ci.sh) so we don't want to try
and push the python image when we aren't building the main branch.

(cherry picked from commit f94effe)
After merging the constraints, the 'recursive' mode was not added
to checkout, resulting in the github push action not being checked out.

This commit fixes that and adds color to the diff output in the commit
to better show differences when pushing.

(cherry picked from commit 6b78394)
@boring-cyborg boring-cyborg bot added area:dev-tools area:production-image Production image improvements and fixes labels Nov 18, 2021
@boring-cyborg

boring-cyborg bot commented Nov 18, 2021

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using the Breeze environment for testing locally; it's a heavy docker setup, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@parejadan parejadan changed the base branch from main to constraints-2.1.0-setuptools November 18, 2021 00:52
@parejadan parejadan closed this Nov 18, 2021