merge invalid_kubernetes_config rule and pod_template_file rule#12510
Closed
dimberman wants to merge 211 commits intov1-10-stablefrom
Closed
merge invalid_kubernetes_config rule and pod_template_file rule#12510dimberman wants to merge 211 commits intov1-10-stablefrom
dimberman wants to merge 211 commits intov1-10-stablefrom
Conversation
yamllint: v1.24.2 -> v1.25.0 isort: 5.5.3 -> 5.5.4 (cherry picked from commit b7183de)
(cherry picked from commit 3db2e7c)
(cherry picked from commit a575c79)
The script was previously placed in scripts/ci which caused a bit of a problem in 1-10-test branch where PRs were using scripts/ci from the v1-10-test HEAD but they were missing the ci script from the PR. (cherry picked from commit 2a1be3c8408df708b675adbbc5edc734aa1dfc1c)
Adds new rule that will be run during airflow upgrade-check and will look for old clasess and imports incompatible with Airflow 2.0.
Github just released Github Code Scanning: https://github.blog/2020-09-30-code-scanning-is-now-available/ (cherry picked from commit 7c66936)
It has been raised quite a few times that workflow added in forked repositories might be pretty invasive for the forks - especially when it comes to scheduled workflows as they might eat quota or at least jobs for those organisations/people who fork repositories. This is not strictly necessary because Recently GitHub recognized this as being a problem and introduced new rules for scheduled workflows. But for people who are already forked, it would be nice to not run those actions. It is enough that the CodeQL check is done when PR is opened to the "apache/airflow" repository. Quote from the emails received by Github (no public URL explaining it yet): > Scheduled workflows will be disabled by default in forks of public repos and in public repos with no activity for 60 consecutive days. We’re making two changes to the usage policy for GitHub Actions. These changes will enable GitHub Actions to scale with the incredible adoption we’ve seen from the GitHub community. Here’s a quick overview: > * Starting today, scheduled workflows will be disabled by default in new forks of public repositories. > * Scheduled workflows will be disabled in public repos with no activity for 60 consecutive days. (cherry picked from commit 1b9e59c)
The PR builds are now better handled with regards to both running (using merge-request) and canceling (with cancel notifications). First of all we are using merged commit from the PR, not the original commit from the PR. Secondly - the workflow run notifies the original PR with comment stating that the image is being built in a separate workflow - including the link to that workflow. Thirdly - when canceling duplicate PRs or PRs with failed jobs, the workflow will add a comment to the PR stating the reason why the PR is being cancelled. Last but not least, we also add cancel job for the CodeQL duplicate messages. They run for ~ 12 miinutes so it makes perfect sense to also cancel those CodeQL jobs for which someone pushed fixups in a quick succession. Fixes: #10471 (cherry picked from commit a4478f5)
Replaces the annoying comments with "workflow_run" links with Run Checks. Now we will be able to see the "Build Image" checks in the "Checks" section including their status and direct link to the steps running the image builds as "Details" link. Unfortunately Github Actions do not handle well the links to details - even if you provide details_url to link to the other run, the "Build Image" checks appear in the original workflow, that's why we had to introduce another link in the summary of the Build Image check that links to the actual workflow. (cherry picked from commit a33a919)
Github Actions deprecated the set-env action due to moderate security vulnerability they found. https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/ This commit replaces set-env with env file as explained in https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#environment-files (cherry picked from commit 975558b)
(cherry picked from commit 45176c8)
When installing airflow 1.10 via breeze we now enable rbac by default, but we can disable it with --no-rbac-ui flag. This is useful to test different variants of 1.10 when testing release candidataes in connection with the 'start-airflow' command. (cherry picked from commit 22c6a84)
(cherry picked from commit 9dc32a3)
Seems that the k8s cache for virtualenv got broken during the recent problems. This commits bumps the cache version to make it afresh (cherry picked from commit 666e81a)
(cherry picked from commit f5b7bbc)
) * Constraints and PIP packages can be installed from local sources This is the final part of implementing #11171 based on feedback from enterprise customers we worked with. They want to have a capability of building the image using binary wheel packages that are locally available and the official Dockerfile. This means that besides the official APT sources the Dockerfile build should not needd GitHub, nor any other external files pulled from outside including PIP repository. This change also includes documentation on how to prepare set of such binaries ready for inspection and review by security teams in Enterprise environment. Such sets of "known-working-binary-whl" files can then be separately committed, tracked and scrutinized in an artifact repository of such an Enterprise. Fixes: #11171 * Update docs/production-deployment.rst (cherry picked from commit 0497390)
The push and schedule builds should not be cancelled even if they are duplicates. By seing which of the master merges failed, we have better visibility on which merge caused a problem and we can trace it's origin faster even if the builds will take longer overall. Scheduled builds also serve it's purpose and they should be always run to completion. (cherry picked from commit 401a579)
Wrong if query in the GitHub action caused upgrade to latest constraints did not work for a while. (cherry picked from commit a34f5ee)
We have started to experience "unknown_blob" errors intermittently recently with GitHub Docker registry. We might eventually need to migrate to GCR (which eventually is going to replace the Docker Registry for GitHub: The ticket is opened to the Apache Infrastructure to enable access to the GCR and to make some statements about Access Rights management for GCR https://issues.apache.org/jira/projects/INFRA/issues/INFRA-20959 Also a ticket to GitHub Support has been raised about it https://support.github.com/ticket/personal/0/861667 as we cannot delete our public images in Docker registry. But until this happens, the workaround might help us to handle the situations where we got intermittent errors while pushing to the registry. This seems to be a common error, when NGINX proxy is used to proxy Github Registry so it is likely that retrying will workaround the issue. (cherry picked from commit f9dddd5)
* Add capability of customising PyPI sources This change adds capability of customising installation of PyPI modules via custom .pypirc file. This might allow to install dependencies from in-house, vetted registry of PyPI (cherry picked from commit 45d33db)
* Modify helm chart to use pod_template_file Since we are deprecating most k8sexecutor arguments we should use the pod_template_file when launching airflow using the KubernetesExecutor * fix tests * one more nit * fix dag command * fix pylint (cherry picked from commit 56bd9b7)
This decreases scheduler delay between tasks by about 20% for larger DAGs, sometimes more for larger or more complex DAGs. The delay between tasks can be a major issue, especially when we have dags with many subdags, figures out that the scheduling process spends plenty of time in dependency checking, we took the trigger rule dependency which calls the db for each task instance, we made it call the db just once for each dag_run (cherry picked from commit 50efda5)
If you used context from git repo, the .piprc file was missing and COPY in Dockerfile is not conditional. This change copies the .pypirc conditionally from the docker-context-files folder instead. Also it was needlessly copied in the main image where it is not needed and it was even dangerous to do so. (cherry picked from commit 53e5d8f)
In secured cluster there is a need to run this job with specific user id (cherry picked from commit 1266b29)
So that it can pull the specified image from a private registry. (cherry picked from commit 473f506)
Co-authored-by: Flavien Dereume-Hancart <flavien@LL-PC0BE1K9-1.goiba.net> (cherry picked from commit 5f403a8)
(cherry picked from commit e495be9)
Co-authored-by: Kamil Olszewski <kamil.olszewski@polidea.com> (cherry picked from commit 48ce4bd)
Co-authored-by: Daniel Imberman <daniel@astronomer.io> (cherry picked from commit 0faa06e)
For Kubernetes tests all tests can be executed in the same python version - default one - no matter which PYTHON_MAJOR_MINOR is used. This is because we are testing Airflow which is deployed via production image. Thanks to that we can fix the python version to be default and avoid any python version problems (this is especially important for cherry-picking to 1.10 where we have python 2.7 and 3.5. (cherry-picked from commit cbd6daf)
(cherry picked from commit bfbbb24)
We've observed the tests for last couple of weeks and it seems most of the tests marked with "quarantine" marker are succeeding in a stable way (#10118) The removed tests have success ratio of > 95% (20 runs without problems) and this has been verified a week ago as well, so it seems they are rather stable. There are literally few that are either failing or causing the Quarantined builds to hang. I manually reviewed the master tests that failed for last few weeks and added the tests that are causing the build to hang. Seems that stability has improved - which might be casued by some temporary problems when we marked the quarantined builds or too "generous" way of marking test as quarantined, or maybe improvement comes from the #10368 as the docker engine and machines used to run the builds in GitHub experience far less load (image builds are executed in separate builds) so it might be that resource usage is decreased. Another reason might be Github Actions stability improvements. Or simply those tests are more stable when run isolation. We might still add failing tests back as soon we see them behave in a flaky way. The remaining quarantined tests that need to be fixed: * test_local_run (often hangs the build) * test_retry_handling_job * test_clear_multiple_external_task_marker * test_should_force_kill_process * test_change_state_for_tis_without_dagrun * test_cli_webserver_background We also move some of those tests to "heisentests" category Those testst run fine in isolation but fail the builds when run with all other tests: * TestImpersonation tests We might find that those heisentest can be fixed but for now we are going to run them in isolation. Also - since those quarantined tests are failing more often the "num runs" to track for those has been decreased to 10 to keep track of 10 last runs only. (cherry picked from commit b746f33)
We've been mocking UUID badly and this cause tests with new version of sentry to fail. This change fixes both - fixes the mocking as well as removes the upper constraint on sentry.
By doing it this way we can add new features to upgrade_check, such as the adding a flag to make changes automatically, without having to release a new version of Airflow.
When testing out the ugprade check command on 1.10 with the default
SQLite backend I got the following error:
```
[2020-11-20 12:10:28,248] {base.py:1372} ERROR - Error closing cursor
Traceback (most recent call last):
File "/home/ash/.virtualenvs/airflow-1-10/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1284, in fetchall
l = self.process_rows(self._fetchall_impl())
File "/home/ash/.virtualenvs/airflow-1-10/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1230, in _fetchall_impl
return self.cursor.fetchall()
sqlite3.ProgrammingError: Cannot operate on a closed database.
```
This was caused because the `@provide_session` decorator closed the
connection when the function returned, and since we were using a
generator expression, we hadn't yet fetched all the rows.
This changes it so the rows are fetched before returning.
The intent here is this content will be visible on the pypi page for apache-airflow-upgrade-check (once it is published.) This is far from perfect, but the _first_ step is the hardest :)
By default, the logs would appear in the middle of the status stream,
which makes it slightly harder to parse the output.
Before:
```
============================================= STATUS =============================================
Legacy UI is deprecated by default......................................................SUCCESS
Users must set a kubernetes.pod_template_file value.....................................FAIL
Changes in import paths of hooks, operators, sensors and others.........................FAIL
Remove airflow.AirflowMacroPlugin class.................................................SUCCESS
[2020-11-20 14:26:04,083] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-20 14:26:04,083] {dagbag.py:417} INFO - Filling up the DagBag from /home/ash/airflow/dags
Jinja Template Variables cannot be undefined............................................SUCCESS
```
After:
```
============================================= STATUS =============================================
Legacy UI is deprecated by default......................................................SUCCESS
Users must set a kubernetes.pod_template_file value.....................................FAIL
Changes in import paths of hooks, operators, sensors and others.........................FAIL
Remove airflow.AirflowMacroPlugin class.................................................SUCCESS
Jinja Template Variables cannot be undefined............................................SUCCESS
```
…11528) * add upgrade rule to check for mesos config and flag to remove it. * change from checking the mesos config section to core/executor config * remove leading new line and indent in desc
(cherry picked from commit 5fac1e15c702faa9606c22390bf9b143193dfc79)
turbaszek
approved these changes
Nov 20, 2020
|
The PR needs to run all tests because it modifies core of Airflow! Please rebase it to latest master or ask committer to re-run it! |
|
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Since these two rules do very similar things, it would make sense to merge them to create less confusion for users.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.