Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment variable for Broker URL is prioritized over app settings #4284

Open
2 tasks done
Checkroth opened this issue Sep 25, 2017 · 13 comments
Open
2 tasks done

Environment variable for Broker URL is prioritized over app settings #4284

Checkroth opened this issue Sep 25, 2017 · 13 comments

Comments

@Checkroth
Copy link

Checkroth commented Sep 25, 2017

Checklist

  • I have included the output of celery -A proj report in the issue.
    (if you are not able to do this, then at least specify the Celery
    version affected).
  • I have verified that the issue exists against the master branch of Celery.

Steps to reproduce

Celery version: 4.0.2 (latentcall)
Issue is about the celery python app, which we are currently using with django (not django-celery).
Issue can be seen here: https://github.com/celery/celery/blob/master/celery/app/utils.py#L85-L106

My team had a misunderstanding about Celery settings and environment variables (we thought the celery app would set the config from environment variables automatically using config_from_object). We had CELERY_RESULT_BACKEND and CELERY_BROKER_URL set in our environment variables. This worked perfectly for a while, until we started doing operations that relied on the result backend.

It took a while to figure out why the broker url was working fine but the backend was not. After some investigation, I realized that the broker url wasn't set at all, and that the call to broker_url was actually using the environment variable we had set instead of the app settings.

Expected behavior

Celery broker settings should take priority. Currently, even if we had properly configured celery, a stray environment variable could break our application.

Ideally, the broker url would not be read from environment variables at all, at least not at the level in which it is currently read, as this only causes confusion. I am not familiar with why it is implemented this way, but it looks like it is probably a bit of legacy code.

Actual behavior

Celery broker url is read from environment variables before app settings.

@Checkroth
Copy link
Author

I have added a test to demonstrate the issue:

#4285

@auvipy auvipy added this to the v4.2 milestone Dec 19, 2017
@auvipy auvipy modified the milestones: v4.2, v5.0.0 Jan 13, 2018
@ant31
Copy link
Contributor

ant31 commented Feb 15, 2018

I don't see an issue here, many software prioritizes environment variable over conffile when both exist.

Things to eventually improve:

  • document what's the precedence order
  • log from where the configuration is taken

@thomas-riccardi
Copy link
Contributor

This is not the current issue: there is no conffile involved in the scenario.
The issue is precedence of env vars over explicit configuration passed as Celery constructor or otherwise edited by explicit code (dev controlled), not a user controlled conf file vs env var.

@Checkroth
Copy link
Author

@thomas-riccardi is correct.

My main concern would be that it seems like your settings are being taken when you explicitly set them, but Celery is "magically" picking up environment variables that you might not even be aware that you have set.

The issue I'm raising is a personal issue I guess, where I feel that any explicit setting in your code should take priority over automatic settings (like those from environment variables). I have since realized that prioritizing Environment variables is by design so I have closed my pull request regarding the issue, however I have a few thoughts on mitigating confusion in the future:

On the PR adding tests demonstrating this issue ( #4285 ) @georgepsarakis brought up a point that would at least remove the confusion for anybody else in the future.

However your test seems to raise another question regarding the behavior of ConfigurationView and properties in sub-classes such as Settings.broker_url; an exception should have been raised when you try to set a value for broker_url and this does not happen. Perhaps AttributeDictMixin is responsible for this.

If your broker_url is set, and you make a manual change that wouldn't actually have any influence on your Celery configuration, Celery should tell you with an exception or a warning log or something instead of just silently eating the manual setting.

@ant31 's improvements ( #4284 (comment) ) would also be much appreciated.

@auvipy
Copy link
Member

auvipy commented Apr 25, 2018

may be doc changes worth mention it?

@auvipy auvipy modified the milestones: v5.0.0, v4.2 Apr 25, 2018
@auvipy auvipy modified the milestones: v4.2, v4.3 May 25, 2018
@xirdneh xirdneh added the Sprint Candidate Candidate for sprint label Oct 11, 2018
@auvipy auvipy modified the milestones: v4.3, v5.0.0 Nov 17, 2018
@auvipy auvipy modified the milestones: v5.0.0, 4.7 May 10, 2019
@auvipy auvipy modified the milestones: 4.7, 4.6 Jul 6, 2019
@auvipy auvipy modified the milestones: 4.6, 4.5 Apr 2, 2020
@tspecht
Copy link

tspecht commented Aug 24, 2020

Sorry for bumping this, but if the CELERY_BROKER_URL (when using the CELERY prefix for Django) environment variable is taking precedence over anything configured in settings, how can we specify multiple Broker URLS then? Tried with comma & newline separated values and no luck so far. Would really hate having to choose a custom name for this just to make this work ...

@thedrow
Copy link
Member

thedrow commented Aug 25, 2020

Multiple broker URLs should be specified with a semicolon.
The first is the active broker and the rest are simply failovers.

@ladmerc
Copy link

ladmerc commented Jun 11, 2021

I also stumbled on this problem when configuring pytest to use another broker. According to the docs here, we can override the broker and backend specifically for testing purposes.

After hours debugging why my changes weren't being picked, I discovered env variables had precedence over any custom settings. In my case, CELERY_BROKER_URL was being picked from my env so it ignored the settings from my pytest fixtures.

For anyone stumbling on this via pytest fixtures, here's the hack I'm using:

@pytest.fixture(scope="session")
def celery_config():
    test_broker = "memory://"
    test_backend = "cache+memory://"

    if original_broker := os.environ.get("CELERY_BROKER_URL"):
        os.environ["CELERY_BROKER_URL"] = test_broker
    if original_backend := os.environ.get("CELERY_BACKEND"):
        os.environ["CELERY_BACKEND"] = test_backend

    yield {
        "broker_url": test_broker,
        "result_backend": test_backend,
    }

    if original_broker:
        os.environ["CELERY_BROKER_URL"] = original_broker
    if original_backend:
        os.environ["CELERY_BACKEND"] = original_backend

It essentially overwrites the env key if it was set, and resets it to the original during cleanup

@auvipy auvipy modified the milestones: 4.5, 5.2 Jun 15, 2021
@auvipy auvipy modified the milestones: 5.2, 5.2.x Oct 31, 2021
@illagrenan
Copy link

I was very surprised by this behavior because I didn't find any mention of prioritizing ENV variables in the documentation.

My use-case is as follows: I need several instances of Celery because I need to connect to multiple VHOSTs (dynamically). So I set the global ENV variable CELERY_BROKER_URL (without VHOST) and append VHOST in the constructor. And then I couldn't figure out why I was getting the error amqp.exceptions.NotAllowed: connection.open: (530) NOT_ALLOWED - access to vhost '/' refused.

Here is a minimal code sample:

import os

from celery import Celery

os.environ["CELERY_BROKER_URL"] = "amqp://aaa:aaa@localhost:5672"

vhost = "bbb"
app = Celery(broker="%s/%s" % (os.environ['CELERY_BROKER_URL'], vhost))

assert app.conf.broker_url == "amqp://aaa:aaa@localhost:5672/bbb"
>>> AssertionError

@tomwojcik
Copy link
Contributor

tomwojcik commented Mar 14, 2022

Thank you @illagrenan for this example. My current worker file looks like that

import os
from celery import Celery
from django.conf import settings


if not settings.configured:
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")  # pragma: no cover

# https://github.com/celery/celery/issues/4284#issuecomment-1067196398
os.environ["CELERY_BROKER_URL"] = os.environ["SOME_ENV_VAR"]
os.environ["CELERY_RESULT_BACKEND"] = os.environ["REDIS_URL"]

app = Celery(
    "project_name",
    # broker=settings.CELERY_BROKER_URL,  # doesn't matter what you set here
)
app.config_from_object("django.conf:settings", namespace="CELERY")

I was so confused as to why it doesn't work as settings.CELERY_BROKER_URL is set correctly and I have no control over env variable name, but this simple "overwrite env var with another env var" when needed fixed this weird issue.

celery==5.2.1
django==4.0

I use django-health-check in my project and when I access the dashboard, it says that the results backend is not configured. Doing the same trick for result backend fixes this issue as well.

@larrybotha
Copy link

larrybotha commented Jun 22, 2022

I'm experiencing the same precedence issues when using Docker secrets, .e.g deploying using docker stack deploy instead of docker compose up

With Docker secrets, values are obtained as follows:

  1. create a secret with Docker
    • secrets are then exposed via /run/secrets/[secret_name] files inside containers
  2. environment variables are assigned the path to the secret, e.g. CELERY_BROKER_URL=/run/secrets/celery_broker_url
  3. the value of the secret is then read from the path assigned to the environment variable

With environment variables having higher precedence, Celery uses the following invalid broker URL: amqp://guest:**@%2Frun%2Fsecrets%2Fcelery_broker_url:5672//

As a workaround, I'm manually overriding the Celery env vars with the values read from the secrets, as suggested in #4284 (comment) and #4284 (comment)

This was surprising to me - I expected the priority to be something along the lines of (highest to lowest):

  1. user-defined configs
  2. command line values
  3. environment variables
  4. config file

As an aside, evaluating the environment variables of a running container shows that the updated values are not exposed.


EDIT: turns out that even official Docker images rewrite secrets to environment variables, and Kubernetes supports this out of the box:

@auvipy auvipy modified the milestones: 5.2.x, 5.3.x Jun 29, 2022
@lafrech
Copy link
Contributor

lafrech commented Aug 30, 2022

I came here looking for the opposite: a way to configure Celery with env vars.

Setting an env var with an absolute path to a config file (#5303) would be fine.

Setting an env var per setting is fine, too.

I'd keep the behaviour and document it.

I don't buy the "celery picks stuff from random forgotten vars in my env" argument. When deploying, environments are supposed to be clean. And using env vars is a common way of overriding stuff.

Sure, logging where each value is picked up could be nice, to save people from the issues described in comments above (not knowing an env var is overriding an explicit value).

@auvipy
Copy link
Member

auvipy commented Dec 8, 2022

I would be happy to review any contributions in this area.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests