
BROKER_URL_SECRET Not working as of Airflow 2.7 #34612

Closed
2 tasks done
JonnyWaffles opened this issue Sep 25, 2023 · 6 comments · Fixed by #34782

Labels
affected_version:main_branch Issues Reported for main branch area:CLI kind:bug This is a clearly a bug

Comments

@JonnyWaffles
Contributor

JonnyWaffles commented Sep 25, 2023

Apache Airflow version

2.7.1

What happened

Hi team,

In the past you could use AIRFLOW__CELERY__BROKER_URL_SECRET as a way to retrieve the credentials from a secrets backend at runtime. However, as of Airflow 2.7, this technique appears to be broken. I believe this is related to discussion 34030 - Celery configuration elements not shown to be fetched with _CMD pattern. The irony is that the pattern works when using the config get-value command, but not when using the actual airflow celery command. I suspect this has something to do with when the wrapper calls ProvidersManager().initialize_providers_configuration().

unset AIRFLOW__CELERY__BROKER_URL
AIRFLOW__CELERY__BROKER_URL_SECRET=broker_url_east airflow config get-value celery broker_url

This correctly prints the secret from the backend!

redis://:<long password>@<my url>:6379/1

However, actually executing Celery with the same methodology results in the default Redis:

AIRFLOW__CELERY__BROKER_URL_SECRET=broker_url_east airflow celery worker

Relevant output

- ** ---------- [config]
- ** ---------- .> app:         airflow.providers.celery.executors.celery_executor:0x7f4110be1e50
- ** ---------- .> transport:   redis://redis:6379/0

Notice the default redis host and port, with no password.

What you think should happen instead

I would expect the airflow celery command to be able to leverage the _secret API similar to the config command.

How to reproduce

You must use a secrets backend to reproduce as described above. You can also run:

AIRFLOW__CELERY__BROKER_URL_CMD='/usr/bin/env bash -c "echo -n ZZZ"' airflow celery worker

And you will see that the ZZZ is disregarded:

- ** ---------- .> app:         airflow.providers.celery.executors.celery_executor:0x7f0506d49e20
- ** ---------- .> transport:   redis://redis:6379/0

It appears that neither the historical _CMD nor the _SECRET API works after the refactor that moved Celery to the providers.

Operating System

ubuntu20.04

Versions of Apache Airflow Providers

Relevant ones:
apache-airflow-providers-celery 3.3.3
celery 5.3.4

Deployment

Docker-Compose

Deployment details

No response

Anything else

I know this has something to do with when ProvidersManager().initialize_providers_configuration() is executed but I don't know the right place to put the fix.
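To illustrate the timing hypothesis, here is a minimal sketch (reusing the _CMD reproduction from above and assuming a plain Python interpreter inside the same container, with no other broker_url overrides set). Per the behavior described in this issue, the first read should return the built-in default and the second should resolve the command once provider configuration has been initialized:

import os

# Reuse the _CMD reproduction from above (sketch, not a proper test).
os.environ.pop("AIRFLOW__CELERY__BROKER_URL", None)
os.environ["AIRFLOW__CELERY__BROKER_URL_CMD"] = '/usr/bin/env bash -c "echo -n ZZZ"'

from airflow.configuration import conf
from airflow.providers_manager import ProvidersManager

# Before provider configuration is initialized, conf falls back to the built-in default.
print(conf.get("celery", "broker_url"))  # expected per this issue: redis://redis:6379/0

# After initializing provider configuration, the _CMD/_SECRET lookup should work again.
ProvidersManager().initialize_providers_configuration()
print(conf.get("celery", "broker_url"))  # expected per this issue: ZZZ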

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@JonnyWaffles JonnyWaffles added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Sep 25, 2023
@JonnyWaffles
Contributor Author

Playing around with it, I discovered that calling conf.get() directly returns the default, but calling config get-value via the CLI has some side effect that makes everything work.

import os

# Quick hack for the issue instead of using a proper fixture
os.environ.pop("AIRFLOW__CELERY__BROKER_URL", None)
os.environ["AIRFLOW__CELERY__BROKER_URL_SECRET"] = "broker_url_east"


def test_airflow_cli_uses_broker_url_secret(capsys):
    """
    Shows that executing through the CLI changes
    the behavior of the conf.get command
    in the config_command::get_value function
    """
    from airflow.cli import cli_parser
    from airflow.configuration import conf

    default_redis = "redis://redis:6379/0"
    assert default_redis == conf.get("celery", "broker_url")

    test_args = ["config", "get-value", "celery", "broker_url"]

    parser = cli_parser.get_parser()
    args = parser.parse_args(test_args)
    args.func(args)
    sout = capsys.readouterr()
    secret_redis_url = None

    for line in sout.out.splitlines():
        line: str = line.strip()
        if line.startswith("redis"):
            secret_redis_url = line
            break

    if secret_redis_url is None:
        raise AssertionError(
            "Could not get secret from backend!"
        )

    assert secret_redis_url and secret_redis_url != default_redis

@JonnyWaffles
Contributor Author

Now I am very confused, because in my debugger I can see that the CLI's lazy_load_command, which wraps config_command::get_value, is also being wrapped by providers_configuration_loaded, but I cannot see where the latter is being applied to the former! I think maybe my PyCharm introspection is broken from upgrading/downgrading the Airflow container one too many times!

(debugger screenshot omitted)

Regardless, I now think it has something to do with the timing of initialize_providers_configuration() and when the default celery config tries to get the broker_url. I will keep investigating tomorrow, but it would be great if someone more knowledgeable could weigh in!
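For anyone following along, the decorator in question wraps a CLI command so that provider-contributed configuration is registered before the command body reads conf. A rough sketch of that pattern (illustrative only, not Airflow's actual source) looks like this:

import functools

from airflow.providers_manager import ProvidersManager


def providers_configuration_loaded(func):
    """Sketch: initialize provider configuration before the wrapped command runs."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Register provider-contributed config (e.g. the celery section defaults) so that
        # conf.get() inside the wrapped command resolves _CMD/_SECRET variants correctly.
        ProvidersManager().initialize_providers_configuration()
        return func(*args, **kwargs)

    return wrapper

The config get-value path ends up behind such a wrapper, which would explain the side effect observed in the test above.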

@hussein-awala hussein-awala added area:CLI and removed area:core needs-triage label for new issues that we didn't triage yet labels Sep 26, 2023
@hussein-awala
Member

I confirm that there is a problem with the configuration loading in the CLI.

@hussein-awala hussein-awala added the affected_version:main_branch Issues Reported for main branch label Sep 26, 2023
@Aakcht
Contributor

Aakcht commented Sep 28, 2023

Hi! It looks like this issue also breaks the Airflow Helm chart installation when using AIRFLOW__CELERY__BROKER_URL_SECRET. While the workers themselves seem to work fine, the liveness probe for workers fails with ConnectionError(str(exc)) from exc kombu.exceptions.OperationalError: Error -5 connecting to redis:6379. No addres. So it seems that the liveness probe tries to contact the default redis (instead of the one specified in the secrets backend), and the liveness probe failure forces the workers to restart every 5 minutes.

@JonnyWaffles
Contributor Author

I think I have a fix for this ready to go, but I cannot think of a way to properly test it since the patch fixes import-time side effects. I should have a PR up soon, and someone wiser than me can help add a test if necessary.

@potiuk
Member

potiuk commented Oct 12, 2023

While I understand why the liveness probe in the Helm chart mentioned by @Aakcht could fail in this case (because it runs the app directly), I do not think the "worker" should have this problem.

I have no idea how this would not work for the worker. I tried to reproduce it locally and could not:

root@7d9718db3570:/opt/airflow# cat /tmp/x
#!/bin/bash
echo -n "x"

root@7d9718db3570:/opt/airflow# AIRFLOW__CELERY__BROKER_URL_CMD=/tmp/x airflow celery worker

 -------------- celery@7d9718db3570 v5.3.4 (emerald-rush)
--- ***** -----
-- ******* ---- Linux-5.15.49-linuxkit-pr-aarch64-with-glibc2.17 2023-10-12 04:32:34
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         airflow.providers.celery.executors.celery_executor:0xffff9a8438b0
- ** ---------- .> transport:   amqp://guest:**@x:5672//
- ** ---------- .> results:     postgresql://postgres:**@postgres/airflow
- *** --- * --- .> concurrency: 8 (prefork)

The airflow worker command properly has:

@providers_configuration_loaded
def worker(args):

Is there any special scenario where the Celery app is run as a separately started Python interpreter when the worker is started?

Do you have any other specific configuration? @hussein-awala - can you show how you reproduced it?
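For illustration, a separately started interpreter that reads the broker URL directly, without going through a decorated CLI entrypoint, would show the problem, because nothing in it initializes provider configuration first (hypothetical snippet, not the chart's actual probe command):

# check_broker.py - hypothetical standalone script, illustration only.
# Nothing here calls ProvidersManager().initialize_providers_configuration(), unlike
# the decorated airflow celery worker CLI command, so per this issue conf.get() would
# fall back to the built-in default instead of resolving BROKER_URL_SECRET / _CMD.
from airflow.configuration import conf

print(conf.get("celery", "broker_url"))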
