Unable to connect to Redis due to EOF #694

zetaron · 2021-07-15T17:42:39Z

When configuring a self-hosted kodiak bot to connect to a Redis Instance managed by DigitalOcean it fails with the following error message on repeat:

INFO asyncio_redis:connection.py:116 Connecting to redis
INFO asyncio_redis:protocol.py:897 Redis connection made
INFO asyncio_redis:protocol.py:955 EOF received in RedisProtocol
INFO asyncio_redis:protocol.py:979 Redis connection lost
ERROR asyncio:main.py:306 Task exception was never retrieved
future: <Task finished coro= exception=ConnectionLostError(None)>
Traceback (most recent call last):
  File "/var/app/.venv/src/asyncio-redis/asyncio_redis/protocol.py", line 911, in initialize
    yield from self.auth(self.password)
  File "/var/app/.venv/src/asyncio-redis/asyncio_redis/protocol.py", line 733, in wrapper
    result = yield from method(protocol_self, _NoTransaction, *a, **kw)
  File "/var/app/.venv/src/asyncio-redis/asyncio_redis/protocol.py", line 1231, in _query
    transaction, answer_f, _bypass=_bypass, call=call
  File "/var/app/.venv/src/asyncio-redis/asyncio_redis/protocol.py", line 1155, in _get_answer
    result = yield from answer_f
asyncio_redis.exceptions.ConnectionLostError: None

From the DigitalOcean FAQ I take that the EOF error would usualy happen when the connecting client does not support TLS.

Is there a way to use TLS or would that require switching to the official redis library?
If switching would be required, how hard would that be?
Would you be open to PRs?

How valuable is the data thats stored inside the redis?
Would it be bad to loose it?

We are hosting on Kubernetes and I could deploy redis in memory ... but that would be subject to potential rescheduling and movement accross cluster nodes.

The text was updated successfully, but these errors were encountered:

chdsbd · 2021-07-16T00:57:42Z

Hi @zetaron,

Thanks for opening this issue.

Kodiak uses a slightly modified fork of asyncio-redis, which doesn't have TLS support built in.

I think it's possible to update the library to support TLS. It uses the asyncio loop.create_connection, which supports TLS.

https://github.com/chdsbd/asyncio-redis/blob/7d174e0ebe2dc52340001097eeee48a868454480/asyncio_redis/connection.py#L118-L120

There are other asyncio Redis clients around, but I don't know how hard they would be to integrate. I'm open to PRs that add TLS support.

One alternative is to use stunnel to create a connection using TLS between hosts without needed to update the Redis client.

Another option is using something like Digital Ocean's private networking feature to isolate traffic.

Kodiak uses Redis to store merge queue positions. So if your Redis instance were to disappear, Kodiak would lose track of any PR waiting to be merged.

zetaron · 2021-07-16T06:20:48Z

Thanks for your quick reply :)

I'll attempt to update your fork of the asyncio_redis library to support TLS, huge thanks for the pointers.

DigitalOceans private networking still (from what I see) requires TLS to connect.

zetaron · 2021-07-16T09:59:23Z

I've created two PRs to implement TLS support:

I did test the change locally using docker-compose against my DigitalOcean managed Redis and the above error did not happen again.
Since I'm not too familiar with python and this project in general I was unable to add unit-tests. If that is something you want please feel free to modify my PRs :)

In response to chdsbd/kodiak#694 this PR is the preparation for TLS support in kodiaks bot.

In response to #694 this PR is the preparation for TLS support in kodiaks bot. This PR requires chdsbd/asyncio-redis#6 to be merged first.

chdsbd · 2021-07-16T21:43:53Z

Thanks for the PRs!

You should be all set with Redis and TLS using the latest commit: f487e79

zetaron · 2021-07-17T09:21:01Z

Great :D thanks for the quick merge!

Would you mind also triggering a release for the docker image?

sbdchd · 2021-07-17T14:41:28Z

CI automatically builds and releases a docker image with the latest commit SHA, so if you nav to https://hub.docker.com/r/cdignam/kodiak/tags, the most recently released version is cdignam/kodiak:f487e799a74ae074ee16c9106c9750925d329be4

zetaron · 2021-07-17T15:03:32Z

Ah that slipped me when I checked.

zetaron · 2021-07-18T11:35:35Z

Hi, again :)

I've deployed the new Image and after some time observed the EOF coming back but seemingly recovering.
Maybe you know of the top of your head how many locations there are where a redis connection gets created/recreated?
Or if the recreation of the connection is expected after it got used for handling an event?
I'm currently using a Pool Size of 10.

logs-from-server-in-kodiak-hlztb-deployment-6477778f9b-vwvx7.txt

sbdchd · 2021-07-18T17:24:57Z

I believe the EOF method on the asyncio.Protocol is called when the server sends an EOF

https://github.com/chdsbd/asyncio-redis/blob/388a6ac390ae70ee16bdc45cf19b308d11212566/asyncio_redis/protocol.py#L954-L956

https://docs.python.org/3.9/library/asyncio-protocol.html#asyncio.Protocol.eof_received

So my first thought is Redis is killing the connections after a while and the python client is recreating the connection when it dies

Do you have timestamps to determine how long after the initial pool creation the connections die? Might shed some light on if it's timeout related

zetaron · 2021-07-18T19:38:43Z

I’ll see to get some, in the UI there are timestamps but the downloaded file seems to miss them.

Oddly enough from what I observed it was related to events being handled as it only happened after the first events started coming in no matter how quick or long that took.
But I’ll try to solidify that with some more log data maybe a longer timeframe to be able to spot recurrence.

zetaron · 2021-07-19T07:07:04Z

After following your hint on timeouts and some research on redis timeouts in relation to DigitalOcean Managed Redis I found this Question on how to set it to 0 (disabled)

The default timeout for the managed Redis seems to be: 300

I've gathered some more logs, but apart from an overwhelming amount of the following lines occurring every 2-3 seconds apart, which now makes total sense since the default connection timeout is 300ms.

I'll try and have DigitalOcean that set to 0 on my kodiak instance by opening a Ticket with them.

Sorry for the comotion but maybe this helps someone else stumbling into similar issues :)

zetaron · 2021-08-05T14:09:04Z

After DigitalOcean set the timeout to 0 the issue got resolved :)

zetaron added the bug Something isn't working label Jul 15, 2021

This was referenced Jul 16, 2021

passthrough ssl connection property chdsbd/asyncio-redis#6

Merged

introduce ssl/tls support #695

Merged

kodiakhq bot pushed a commit to chdsbd/asyncio-redis that referenced this issue Jul 16, 2021

passthrough ssl connection property (#6)

388a6ac

In response to chdsbd/kodiak#694 this PR is the preparation for TLS support in kodiaks bot.

kodiakhq bot pushed a commit that referenced this issue Jul 16, 2021

introduce ssl/tls support (#695)

f487e79

In response to #694 this PR is the preparation for TLS support in kodiaks bot. This PR requires chdsbd/asyncio-redis#6 to be merged first.

chdsbd closed this as completed Jul 16, 2021

sbdchd reopened this Jul 18, 2021

zetaron closed this as completed Aug 5, 2021

chdsbd mentioned this issue Aug 18, 2021

self-hosted: redis connection pool fills up after a while and bot enters an unhealthy state #712

Open

sbdchd mentioned this issue Feb 19, 2022

self hosted: EOF received in RedisProtocol after upgrading to release v0.49 #788

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to connect to Redis due to EOF #694

Unable to connect to Redis due to EOF #694

zetaron commented Jul 15, 2021

chdsbd commented Jul 16, 2021

zetaron commented Jul 16, 2021

zetaron commented Jul 16, 2021

chdsbd commented Jul 16, 2021

zetaron commented Jul 17, 2021

sbdchd commented Jul 17, 2021

zetaron commented Jul 17, 2021

zetaron commented Jul 18, 2021 •

edited

sbdchd commented Jul 18, 2021

zetaron commented Jul 18, 2021

zetaron commented Jul 19, 2021 •

edited

zetaron commented Aug 5, 2021

Unable to connect to Redis due to EOF #694

Unable to connect to Redis due to EOF #694

Comments

zetaron commented Jul 15, 2021

chdsbd commented Jul 16, 2021

zetaron commented Jul 16, 2021

zetaron commented Jul 16, 2021

chdsbd commented Jul 16, 2021

zetaron commented Jul 17, 2021

sbdchd commented Jul 17, 2021

zetaron commented Jul 17, 2021

zetaron commented Jul 18, 2021 • edited

sbdchd commented Jul 18, 2021

zetaron commented Jul 18, 2021

zetaron commented Jul 19, 2021 • edited

zetaron commented Aug 5, 2021

zetaron commented Jul 18, 2021 •

edited

zetaron commented Jul 19, 2021 •

edited