Tell users what to do if their scanners find issues in the image (#37652)

We often get image-scan results sent to the security team.
However, for public 3rd-party CVEs this is the wrong way of
reporting them. Our users have other options: they can handle the
issue themselves, research it, or contribute their findings back.
The problems are that a) it is not clear to them that they have
those options, b) they expect the Airflow security team to tell
them how to clear their security scan reports, and c) they do not
know they should (and can) contribute back.

This change restructures and clarifies the chapter that described
this in a pretty vague way, turning it into a "How to" guide for
users. It explains all the options they have and the ways they can
contribute back. It also makes crystal clear what the security team
is responsible for, and that the community expects contributions in
such cases from commercial users who want their security reports
cleared, not the other way round.

(cherry picked from commit 6a707e3)
potiuk authored and ephraimbuddy committed Mar 6, 2024
1 parent 00922c0 commit d5f9877
Showing 2 changed files with 110 additions and 25 deletions.
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -347,7 +347,7 @@ def _get_rst_filepath_from_path(filepath: pathlib.Path):
manual_substitutions_in_generated_html = ["example-dags.html", "operators.html", "index.html"]
if PACKAGE_NAME == "docker-stack":
    # Substitute in links
-    manual_substitutions_in_generated_html = ["build.html"]
+    manual_substitutions_in_generated_html = ["build.html", "index.html"]

html_css_files = ["custom.css"]

133 changes: 109 additions & 24 deletions docs/docker-stack/index.rst
@@ -71,7 +71,7 @@ The Apache Airflow image provided as convenience package is optimized for size,
it provides just a bare minimal set of the extras and dependencies installed and in most cases
you want to either extend or customize the image. You can see all possible extras in :doc:`apache-airflow:extra-packages-ref`.
The set of extras used in the Airflow production image is available in the
-`Dockerfile <https://github.com/apache/airflow/blob/main/Dockerfile>`_.
+`Dockerfile <https://github.com/apache/airflow/blob/|airflow-version|/Dockerfile>`__.

However, Airflow has more than 60 community-managed providers (installable via extras) and some of the
default extras/providers installed are not used by everyone, sometimes other extras/providers
@@ -94,29 +94,114 @@ not even when those versions contain critical security fixes. The process of Air
around upgrading dependencies automatically where applicable but only when we release a new version of Airflow,
not for already released versions.

If you want to make sure that the Airflow dependencies in the image you use are upgraded to the latest
released versions containing the latest security fixes, you should implement your own process to upgrade
them when you build a custom image based on the Airflow reference one. Airflow usually does not
upper-bound versions of its dependencies via requirements, so you should be able to upgrade them to the
latest versions, usually without any problems. You can follow the process described in
:ref:`Building the image <build:build_image>` to do it (even in an automated way).

Since we have no control over what gets released in new versions of the dependencies, we
cannot guarantee that those dependencies will remain compatible with Airflow after you upgrade them.
Testing whether Airflow still works with them is in your hands,
and in case of any problems you should raise an issue with the authors of the problematic dependencies.
In such cases you can also search the `Airflow issues <https://github.com/apache/airflow/issues>`_,
`Airflow Pull Requests <https://github.com/apache/airflow/pulls>`_ and
`Airflow Discussions <https://github.com/apache/airflow/discussions>`_ for similar
problems, to see whether any fixes or workarounds found in the ``main`` version of Airflow can be applied
to your custom image.

The easiest way to keep up with the latest released dependencies is, however, to upgrade to the latest
released Airflow version by switching to the newly released image as the base for your images whenever a
new version of Airflow is released. Whenever we release a new version of Airflow, we upgrade all
dependencies to the latest applicable versions and test them together, so staying up-to-date
with the latest version of Airflow is the easiest way to get dependencies that are both updated and tested together.

What should I do if my security scan shows critical and high vulnerabilities in the image?
==========================================================================================

We often hear from users who run various security scanners on the image and find
critical and high vulnerabilities in it that come not from Airflow but from other
components. In general, it is normal and expected that such vulnerabilities are found in the image after
it has been released, even when fixes for them have since been published, precisely because we do NOT
update the images after they are released, as explained above. Sometimes even the latest release contains
vulnerabilities that are not yet fixed in the base image we use, or in dependencies that we cannot upgrade
because some of our providers have limits and have not managed to upgrade yet, and we have no control over that.
So it is possible that even the most recent release of our image contains some High and Critical
vulnerabilities that are not yet fixed.

**What can you do in such a case?**

First of all - you should know what you should NOT do.

Do NOT send a private email to the Airflow Security Team with scan results, asking what to do.
The Airflow Security Team deals exclusively with undisclosed vulnerabilities in Airflow itself, not
in the dependencies or in the base image. The security email should only be used to privately report
security issues that can be exploited via Airflow. This is explained in our
`Security Policy <https://github.com/apache/airflow/security/policy>`__, where you can find more details,
including the need to provide reproducible scenarios and to submit ONE issue per email. NEVER submit multiple
vulnerabilities in one email. Such emails are rejected immediately, as they make handling the issue
much harder for everyone, including the reporters.

Also, do NOT open a GitHub Issue with the scan results, asking what to do. GitHub Issues are for
reporting bugs and feature requests for Airflow itself, not for asking for help with security scans of
3rd-party components.

So what are your options?

You have four options:

1. Build your own custom image following the examples we share, using the latest base image and
   possibly manually bumping the dependencies you want to bump. There are quite a few examples
   in :ref:`Building the image <build:build_image>` which you can follow. You can use the "slim" image as
   the base for your images: rather than basing your image on the "reference" image, which has a number of
   extras and providers installed, you install only what you actually need. This lets you upgrade some
   dependencies that otherwise could not be upgraded, because some of the provider libraries have limits
   and have not managed to upgrade yet, and we have no control over that. This is the most flexible way to
   build your image, and you can combine it with a process for quickly upgrading to the latest Airflow
   versions (see point 2 below).

2. Wait for a new version of Airflow and upgrade to it. Airflow images are updated to the latest
   "non-conflicting" dependencies and use the latest "base" image at release time, so what you have in the
   reference images at the moment we publish the image / release the version is the "latest and greatest"
   available at that moment for the base platform we use (Debian Bookworm is the reference image we use).
   This is one of the good strategies you can take: build a process to upgrade your Airflow version
   regularly, soon after each release by the community. This will help you keep up with the latest
   security fixes in the dependencies.

3. If the base platform we use (currently Debian Bookworm) does not contain the latest versions you want
   and you want to use another base image, you can look at the system dependencies and scripts
   installed in the latest ``Dockerfile`` of Airflow, take inspiration from it and build your own image,
   or copy it and modify it to your needs. See the
   `Dockerfile <https://github.com/apache/airflow/blob/|airflow-version|/Dockerfile>`__ for the latest version.

4. Research whether the vulnerability affects you. Even if a dependency has a high or critical
   vulnerability, it does not mean that it can be exploited in Airflow (or specifically in the way you are
   using Airflow). If you do have a reproducible scenario showing how a vulnerability can be exploited in
   Airflow, you should, of course, report it privately to the security team. If you do not have a
   reproducible scenario, please do some research and try to understand the impact of the vulnerability on
   Airflow. That research might result in a public GitHub Discussion, where you can discuss the impact of
   the vulnerability if your research indicates that Airflow might not be impacted, or in a private security
   email if you find a reproducible scenario showing how to exploit it.
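
As a rough illustration of how the four options above might map onto individual scanner findings, the
following Python sketch triages findings by whether a fixed version exists and where the affected package
comes from. The ``Finding`` fields, the ``source`` values, the placeholder CVE IDs and the mapping itself
are assumptions made for this example. It is not an Airflow tool or policy, and real scanner reports use
different field names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """One scanner finding (hypothetical, simplified shape)."""

    cve_id: str
    package: str
    source: str  # assumed values: "python" (pip-installed) or "os" (from the base image)
    fixed_version: Optional[str]  # None when no fixed version has been released yet


def suggest_option(finding: Finding) -> str:
    """Map a finding to one of the options listed above (illustrative only)."""
    if finding.fixed_version is None:
        # Nothing to upgrade to yet: research whether it affects you (option 4).
        return "4: research the impact"
    if finding.source == "python":
        # A fixed release exists: try bumping it in a custom image (option 1).
        return "1: bump the dependency in a custom image"
    # OS-level fix: rebuild on the latest base image (option 1)
    # or pick it up with the next Airflow release (option 2).
    return "1 or 2: rebuild on the latest base image or upgrade Airflow"


# Placeholder data, not real vulnerabilities.
example_findings = [
    Finding("CVE-0000-0001", "examplelib", "python", "2.0.1"),
    Finding("CVE-0000-0002", "libexample", "os", None),
]
for finding in example_findings:
    print(f"{finding.cve_id} ({finding.package}): {suggest_option(finding)}")
```

Whatever the triage suggests, the result of option 4 still decides whether the finding belongs in a
public discussion or in a private report.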


**How do I publicly discuss public CVEs in the image?**

Security scans report public vulnerabilities in 3rd-party components of Airflow. Since those
vulnerabilities are already public, you are free to talk about them, and others are talking about them too.
So you can do research on your own first. Try to find discussions about the issues and how others have
handled them, and possibly even try to explore whether the vulnerability can be exploited in Airflow.
This is a very valuable contribution you can make to help others
understand the impact of the vulnerability on Airflow. We highly appreciate it when our commercial users
do this, because Airflow is maintained by volunteers. If you or your company can spend some of your security
researchers' time and skills to help the community understand the impact of the vulnerability on Airflow, it
could be a fantastic contribution to the community and a way to give back to the project that your company
uses for free.

You are free to discuss it publicly: open a `GitHub Discussion <https://github.com/apache/airflow/discussions>`_
mentioning your findings and the research you have done so far. Ideally (as a way to contribute to Airflow) you
should explain the findings of your company's own security team, to help research and understand
the impact of the vulnerability on Airflow (and on your way of using it).
Again, we strongly suggest opening ONE discussion per vulnerability. You should NOT post scan results in
bulk. This is not helpful for a discussion, and you will not get meaningful answers if you attempt to
discuss all the issues in one discussion thread.

Yes, we know that copying & pasting your results and asking others what to do is the easy way, but doing so
is likely to result in silence, because in the community such actions are seen as a pretty selfish way of
getting your problems solved by tapping into the time of other volunteers without spending your own time on
making it easier for them to help. If you really want to get help from the community, focus your discussion
on a particular CVE and provide your findings, including a detailed analysis of your report that identifies
exactly which binaries and base images cause the scanner to report the vulnerability. Remember that only
you have access to your scanner, so you should bring as much helpful information as you can so that others can
comment on it. Show that you have done your homework and that you bring valuable information to the community.
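
One way to prepare such a focused discussion is to split the bulk report by CVE before posting anything.
The Python sketch below assumes findings have already been extracted from the scanner report as simple
tuples. The tuple layout, the placeholder CVE IDs, the package names and the draft wording are all
hypothetical and should be adapted to what your scanner actually produces.

```python
from collections import defaultdict

# Hypothetical findings as (cve_id, package, installed_version, location) tuples.
# The IDs and package names are placeholders, not real vulnerabilities.
FINDINGS = [
    ("CVE-0000-0001", "libexample1", "1.2.3", "base image (apt package)"),
    ("CVE-0000-0001", "libexample1-dev", "1.2.3", "base image (apt package)"),
    ("CVE-0000-0002", "examplelib", "4.5.6", "pip package pulled in by a provider"),
]


def group_by_cve(findings):
    """Return {cve_id: [(package, version, location), ...]}, keeping input order."""
    grouped = defaultdict(list)
    for cve_id, package, version, location in findings:
        grouped[cve_id].append((package, version, location))
    return dict(grouped)


def draft_discussion(cve_id, entries):
    """Build a plain-text starting point for ONE discussion about ONE CVE."""
    lines = [
        f"Title: Impact of {cve_id} on the Airflow image",
        "",
        "Where the scanner reports it:",
    ]
    for package, version, location in entries:
        lines.append(f"- {package} {version} ({location})")
    lines += ["", "Our analysis so far:", "- (describe your own research here)"]
    return "\n".join(lines)


for cve_id, entries in group_by_cve(FINDINGS).items():
    print(draft_discussion(cve_id, entries))
    print("-" * 40)
```

Each printed draft covers a single CVE and lists exactly where the scanner found it. The analysis
section is left empty on purpose, because that part has to come from your own research.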

Opening a GitHub Discussion for this kind of issue is also a great way to communicate with the
maintainers and security team in an open and transparent way, without resorting to the private security
mailing list (which serves a different purpose, as explained above). If such a discussion reveals
a way to remove the vulnerability from the scanned image, great: you can even contribute a PR to the
Dockerfile to remove it from the image. Maybe such a discussion will lead to a PR that allows
Airflow to upgrade to a newer dependency that fixes the vulnerability or remove it altogether, or maybe
there is already a way to mitigate it, or maybe someone is already working on a PR to fix it.
All this can (and should) be discussed publicly and transparently in a GitHub Discussion, not via private
security email or GitHub Issues, which are exclusively about issues in Airflow itself, not about public
security issues in 3rd-party components.

Support
=======