Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix leak sensitive field via V1EnvVar on exception #29016

Merged
merged 9 commits into from Feb 23, 2023

Conversation

ecerulm
Copy link
Contributor

@ecerulm ecerulm commented Jan 18, 2023

Currently the KubernetesPodOperator env_vars will be printed on the task logs if there is any templating error (like an UndefinedError, TemplateSyntaxError or KeyError)

[2023-01-16, 23:03:17 UTC] {abstractoperator.py:592} ERROR - Exception rendering Jinja template for task 'dry_run_demo', field 'env_vars'. Template: [{'name': 'password', 'value': 'secretpassword', 'value_from': None}, {'name': 'VAR2', 'value': '{{ var.value.nonexisting}}', 'value_from': None}]
Traceback (most recent call last):
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 585, in _do_render_template_fields
    rendered_content = self.render_template(
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 657, in render_template
    return [self.render_template(element, context, jinja_env, oids) for element in value]
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 657, in <listcomp>
    return [self.render_template(element, context, jinja_env, oids) for element in value]
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 664, in render_template
    self._render_nested_template_fields(value, context, jinja_env, oids)
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 321, in _render_nested_template_fields
    self._do_render_template_fields(content, ("value", "name"), context, jinja_env, seen_oids)
  ...
  ...
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/variable.py", line 141, in get
    raise KeyError(f"Variable {key} does not exist")
KeyError: 'Variable nonexisting does not exist'

this happens when there is any error on the templates. For example a KeyError raised when using var.value.somemistypedvalue:

        env_vars={
            "password": "{{ conn.test_connection.password }}",
            "VAR2": "{{ var.value.nonexisting}}",
        },

This PR uses the airflow.utils.log.secrets_maker.redact to remove any field contained in DEFAULT_SENSITIVE_FIELDS or
sensitive_var_conn_names.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@ecerulm
Copy link
Contributor Author

ecerulm commented Jan 18, 2023

I had to introduce a ConvertableToDict type , in order to keep the mypy happy since k8s.V1EnvVar is not a dict. Initially I used hasattr(item, "to_dict") but mypy will complain about the typing since Redactable is not guaranteed to have to_dict, and it seems that the mypy does not understand the hasattr() the same way it recognizes the isinstance() .

dict_key: self._redact(subval, name=dict_key, depth=(depth + 1))
for dict_key, subval in item.items()
}
return to_return
elif isinstance(item, ConvertableToDict): # things like V1EnvVar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are other instances this would be desired? Personally I’d prefer to keep this more conservative and only check for V1EnvVar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I check explicitly for V1EnvVar then I guess we introduce a direct dependency to the kubernetes provider (which is the one that provides the kubernetes client, if I'm not mistaken ). That was my only motivation to use the to_dict/ConvertableToDict , I didn't want to introduce that dependency because I thought that would be not allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other alternative to avoid introducing a dependency to V1EnvVar would be to mask everything that is not explicitly Redactable. Which one shall I go for? the ConvertableToDict, introducing a dependency to V1EnvVar or mask everything not redactable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uranusjr , what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could try-import kubernetes (if we can’t import it the object would obviously not be a V1EnvVar), but this is OK I think. We can always make the check stricter later if something breaks.

Maybe add a note on ConvertableToDict describing the rationale? (e.g. we don’t want to depend directly on kubernetes so this checks for the “shape” of a V1EnvVar instead)

@ecerulm ecerulm requested review from uranusjr and removed request for jedcunningham January 23, 2023 11:59
@ecerulm
Copy link
Contributor Author

ecerulm commented Feb 2, 2023

Nevermind, I realized now that the ConvertableToDict approach does not work on Python 3.7 (as typing.Protocol was introduced in 3.8). So I guess I'll try your try-import approach

@uranusjr
Copy link
Member

uranusjr commented Feb 2, 2023

We actually have Protocol available on 3.7 via airflow.typing_compat, but either way is fine as I mentioned above, so since you already made the change let’s just keep this.

@ecerulm
Copy link
Contributor Author

ecerulm commented Feb 3, 2023

I had to fix some formatting issues and some stuff from the rebase from main but now "all checks have passed".

@ecerulm
Copy link
Contributor Author

ecerulm commented Feb 3, 2023

Is there anything else that I can do? I just wonder if I need to do anything else before you merge it?

@Taragolis Taragolis added the full tests needed We need to run full set of tests for this PR to merge label Feb 18, 2023
@Taragolis
Copy link
Contributor

Just in case run full test

Currently the KubernetesPodOperator `env_vars` will be printed
on the task logs if there is any templating error (like an
`UndefinedError`, `TemplateSyntaxError` or `KeyError`)

```
[2023-01-16, 23:03:17 UTC] {abstractoperator.py:592} ERROR - Exception rendering Jinja template for task 'dry_run_demo', field 'env_vars'. Template: [{'name': 'password', 'value': 'secretpassword', 'value_from': None}, {'name': 'VAR2', 'value': '{{ var.value.nonexisting}}', 'value_from': None}]
Traceback (most recent call last):
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 585, in _do_render_template_fields
    rendered_content = self.render_template(
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 657, in render_template
    return [self.render_template(element, context, jinja_env, oids) for element in value]
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 657, in <listcomp>
    return [self.render_template(element, context, jinja_env, oids) for element in value]
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/abstractoperator.py", line 664, in render_template
    self._render_nested_template_fields(value, context, jinja_env, oids)
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 321, in _render_nested_template_fields
    self._do_render_template_fields(content, ("value", "name"), context, jinja_env, seen_oids)
  ...
  ...
  File "/Users/rubelagu/.pyenv/versions/3.10.7/envs/venv-airflow-250/lib/python3.10/site-packages/airflow/models/variable.py", line 141, in get
    raise KeyError(f"Variable {key} does not exist")
KeyError: 'Variable nonexisting does not exist'
```

this happens when there is any error on the templates. For example
a `KeyError` raised when using `var.value.somemistypedvalue`:

```
        env_vars={
            "password": "{{ conn.test_connection.password }}",
            "VAR2": "{{ var.value.nonexisting}}",
        },
```

This PR uses the `airflow.utils.log.secrets_maker.redact` to remove any
field contained in `DEFAULT_SENSITIVE_FIELDS` or
`sensitive_var_conn_names`.
@eladkal eladkal added this to the Airflow 2.5.2 milestone Feb 22, 2023
@uranusjr uranusjr merged commit 78115c5 into apache:main Feb 23, 2023
@ecerulm ecerulm deleted the vienvvar_leak branch February 23, 2023 11:01
@pierrejeambrun pierrejeambrun added the type:bug-fix Changelog: Bug Fixes label Feb 27, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 7, 2023
Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>
(cherry picked from commit 78115c5)
pierrejeambrun pushed a commit that referenced this pull request Mar 8, 2023
Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>
(cherry picked from commit 78115c5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging area:providers full tests needed We need to run full set of tests for this PR to merge provider:cncf-kubernetes Kubernetes provider related issues type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants