Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection pool doesn't recover from aborts or resets #1772

Closed
Enchufa2 opened this issue Dec 2, 2021 · 6 comments · Fixed by #1832
Closed

Connection pool doesn't recover from aborts or resets #1772

Enchufa2 opened this issue Dec 2, 2021 · 6 comments · Fixed by #1832
Assignees
Labels
bug Bug

Comments

@Enchufa2
Copy link
Contributor

Enchufa2 commented Dec 2, 2021

Version: 4.0.2

Platform: Python 3.9.7 on Fedora 34

Description: Steps to reproduce:

  1. Make redis establish a connection. For instance, in a local setting:
>>> from redis import Redis()
>>> redis = Redis()
>>> redis.get("something")
  1. Now there's a connection in the pool. The connection gets killed somehow. For instance:
$ sudo ss -K dst [::1] dport = redis
  1. Execute anything else:
>>> redis.get("something")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/***/.local/lib/python3.9/site-packages/redis/commands/core.py", line 1023, in get
    return self.execute_command('GET', name)
  File "/home/***/.local/lib/python3.9/site-packages/redis/client.py", line 1068, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/home/***/.local/lib/python3.9/site-packages/redis/connection.py", line 1179, in get_connection
    if connection.can_read():
  File "/home/***/.local/lib/python3.9/site-packages/redis/connection.py", line 743, in can_read
    return self._parser.can_read(timeout)
  File "/home/***/.local/lib/python3.9/site-packages/redis/connection.py", line 315, in can_read
    return self._buffer and self._buffer.can_read(timeout)
  File "/home/***/.local/lib/python3.9/site-packages/redis/connection.py", line 224, in can_read
    self._read_from_socket(timeout=timeout,
  File "/home/***/.local/lib/python3.9/site-packages/redis/connection.py", line 192, in _read_from_socket
    data = self._sock.recv(socket_read_size)
ConnectionAbortedError: [Errno 103] Software caused connection abort

The same happens if there's a firewall that resets the connection, but then a ConnectionResetError is obtained instead. This doesn't happen with v3.5.3, which automatically detects the loss of connection and sets up a new one.

@Andrew-Chen-Wang
Copy link
Contributor

does running CLIENT KILL have the same effect as ss -K or is Redis doing something extra on CLIENT KILL that allows for cleanup on redis-py and aioredis's side? Because I'm not able to reproduce...

@Enchufa2
Copy link
Contributor Author

Enchufa2 commented Dec 3, 2021

CLIENT KILL closes the connection gracefully. redis 3.5.3 and 4.0.2 do not fail after a CLIENT KILL, aioredis does fail.

@Enchufa2
Copy link
Contributor Author

Enchufa2 commented Dec 3, 2021

The main problem here is that the built-in ConnectionError has BrokenPipeError, ConnectionAbortedError, ConnectionRefusedError and ConnectionResetError as subclasses. But redis and aioredis are overwriting ConnectionError, and these other errors are not subclasses anymore. Consequently, the pool's get_connection method, which expects a ConnectionError, cannot catch them.

One possible solution would be to catch the superclass OSError too.

@Andrew-Chen-Wang
Copy link
Contributor

Andrew-Chen-Wang commented Dec 3, 2021 via email

@Enchufa2
Copy link
Contributor Author

Enchufa2 commented Dec 3, 2021

Ah good catch! We’re these part of socket.error before?

I think so.

@Andrew-Chen-Wang
Copy link
Contributor

Hm aioredis prior to the current latest commit (fix socket.error) had class ConnectionError inherit from RedisError and builtins.ConnectionError which would still lead to an exception being raised if we use the OSError approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants