Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New 4.3.0 release breaks Redis Sentinel connection #1004

Closed
jobec opened this issue Feb 6, 2019 · 7 comments
Closed

New 4.3.0 release breaks Redis Sentinel connection #1004

jobec opened this issue Feb 6, 2019 · 7 comments

Comments

@jobec
Copy link
Contributor

jobec commented Feb 6, 2019

The new 4.3.0 breaks the Redis Sentinel connection, triggering an error like when queueing a task:

redis.sentinel.MasterNotFoundError: No master found for 'celery-master'

While starting a celery worker gives this error

 ERROR - celery.worker.consumer.consumer - consumer: Cannot connect to sentinel://localhost:26379/0: No master found for 'celery-master'.

kombu==4.2.2-post1 did not have this issue.

Redis library = 3.0.1

@auvipy
Copy link
Member

auvipy commented Feb 7, 2019

can you help to bisect the commit causing this?

@jobec
Copy link
Contributor Author

jobec commented Feb 7, 2019

Looks like it happened in this commit: a59b44e

When running though a debugger, self.connection.client.alt is empty on this code line

for url in self.connection.client.alt:

This causes the next part to connect to an empty list of sentinels:

kombu/kombu/transport/redis.py

Lines 1107 to 1111 in b4be5cf

sentinel_inst = sentinel.Sentinel(
connection_list,
min_other_sentinels=getattr(self, 'min_other_sentinels', 0),
sentinel_kwargs=getattr(self, 'sentinel_kwargs', None),
**additional_params)

@jobec
Copy link
Contributor Author

jobec commented Feb 7, 2019

Celery (Django) setting:

CELERY_BROKER_URL = "sentinel://localhost:26379/0"
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'master_name': 'celery-master',
}

@auvipy
Copy link
Member

auvipy commented Feb 7, 2019

@badcure could you check?

@lithammer
Copy link
Contributor

I'm thinking that maybe something like this ought to fix it. But it's just guesses.

-        connection_list = []
+        connection_list = [(connparams['host'], connparams['port'])]
         for url in self.connection.client.alt:
             url = _parse_url(url)
             if url.scheme == 'sentinel':
                 connection_list.append([url.hostname, str(url.port)])

Or:

         connection_list = []
         for url in self.connection.client.alt:
             url = _parse_url(url)
             if url.scheme == 'sentinel':
                 connection_list.append([url.hostname, str(url.port)])
+        if not connection_list:
+            connection_list = [(connparams['host'], connparams['port'])]

lithammer added a commit to lithammer/kombu that referenced this issue Feb 20, 2019
The problem is that `self.connection.client.alt` is only populated when
there's more than one client URL provided, e.g.
`"sentinel://foo;sentinel://bar"`. It will also always contain all URLs,
including the primary/first entry.

So if the `alt` list is empty, it (usually) means there was only one
client URL provided.

I also took the liberty to perform name and type changes to be more in
line with the examples and documentation in the `redis` library. The
argument is `sentinels`, not `connection_list`. And it's of the type
`List[Tuple[str, int]]`, not `List[List[str, str]]`.

Fixes celery#1004
@lithammer
Copy link
Contributor

Tried to fix this in #1010.

thedrow pushed a commit that referenced this issue Feb 21, 2019
The problem is that `self.connection.client.alt` is only populated when
there's more than one client URL provided, e.g.
`"sentinel://foo;sentinel://bar"`. It will also always contain all URLs,
including the primary/first entry.

So if the `alt` list is empty, it (usually) means there was only one
client URL provided.

I also took the liberty to perform name and type changes to be more in
line with the examples and documentation in the `redis` library. The
argument is `sentinels`, not `connection_list`. And it's of the type
`List[Tuple[str, int]]`, not `List[List[str, str]]`.

Fixes #1004
@thedrow thedrow added this to the 4.3.x Maintenance milestone Feb 21, 2019
@hajarada
Copy link

Hello All, I know this is issue is for 4.3, but it seems that 4.2.1 has that issue as well. and the same workaround applies.
In my setup we have 8 sentinels sitting behind two loadbalancers that use GSLB, so originally all consumers were configured to talk to the GSLB hostname address which translates to one of the two LB IP addresses. two weeks ago following a routine libraries upgrade, all celery applications stopped working with the following trace:

==> /apps/run/logs/app_name.internal.worker.log <==
self.send_packed_command(self.pack_command(*args))
File "/apps/app_name/lib/python3.6/site-packages/redis/connection.py", line 594, in send_packed_command
self.connect()
File "/apps/app_name/lib/python3.6/site-packages/redis/sentinel.py", line 45, in connect
self.connect_to(self.connection_pool.get_master_address())
File "/apps/app_name/lib/python3.6/site-packages/redis/sentinel.py", line 101, in get_master_address
self.service_name)
File "/apps/app_name/lib/python3.6/site-packages/redis/sentinel.py", line 224, in discover_master
raise MasterNotFoundError("No master found for %r" % (service_name,))
kombu.exceptions.OperationalError: No master found for 'app_name'

I changed the broker_url to be the two LB names instead of the GSLB name and that seems to have fixed it.

Here is the pip list
ackage Version


amqp 2.4.1
asn1crypto 0.24.0
bcrypt 3.1.6
billiard 3.6.0.0
celery 4.2.1
certifi 2018.11.29
cffi 1.12.2
chardet 3.0.4
Click 7.0
cryptography 2.5
Flask 1.0.2
Flask-Auth 0.85
Flask-HTTPAuth 3.2.4
Flask-Session 0.3.1
idna 2.8
itsdangerous 1.1.0
Jinja2 2.10
kombu 4.3.0
ldap3 2.5.2
MarkupSafe 1.1.1
mod-wsgi 4.6.5
packaging 19.0
paramiko 2.4.2
pip 19.0.3
pip-review 1.0
pyasn1 0.4.5
pycparser 2.19
pycrypto 2.6.1
pymongo 3.7.2
PyNaCl 1.3.0
pyparsing 2.3.1
pytz 2018.9
redis 3.2.0
requests 2.21.0
scp 0.13.0
setuptools 40.8.0
six 1.12.0
urllib3 1.24.1
vine 1.2.0
Werkzeug 0.14.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants