Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: SSLConnection __init__() got an unexpected keyword argument 'ssl' #1346

Closed
Fabma opened this issue Sep 25, 2020 · 28 comments
Closed

Comments

@Fabma
Copy link

Fabma commented Sep 25, 2020

Version ==1.5.2

With rq version ==1.5.2 I am running into an __init__() got an unexpected keyword argument 'ssl'.
(with rq version ==1.5.1 the error does not appear)

Traceback (most recent call last):
  File "<>/site-packages/redis/connection.py", line 1185, in get_connection
    connection = self._available_connections.pop()
IndexError: pop from empty list

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<>/site-packages/rq/worker.py", line 931, in perform_job
    self.prepare_job_execution(job, heartbeat_ttl)
  File "<>/site-packages/rq/worker.py", line 826, in prepare_job_execution
    pipeline.execute()
  File "<>/site-packages/redis/client.py", line 4013, in execute
    self.shard_hint)
  File "<>/site-packages/redis/connection.py", line 1187, in get_connection
    connection = self.make_connection()
  File "<>/site-packages/redis/connection.py", line 1227, in make_connection
    return self.connection_class(**self.connection_kwargs)
  File "<>/site-packages/redis/connection.py", line 828, in __init__
    super(SSLConnection, self).__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'ssl'

Unfortunately I am not able to provide a concise code fragment to reproduce the error. It looks like a regression from #1327
where the ssl keyword and re-use of SSLConnection were introduced:
56e756f

@herb
Copy link

herb commented Oct 23, 2020

I've had the same issue. redis==3.5.3. afaict redis never supported a ssl argument to its Connection class. Was #1327 tested against some other redis client implementation/library?

@selwin
Copy link
Collaborator

selwin commented Nov 7, 2020

Redis does indeed accept ssl argument. Check the docs here: https://redis-py.readthedocs.io/en/stable/#redis.Redis

If you can get a failing test case, please post it here.

@selwin selwin closed this as completed Nov 7, 2020
@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

RQ Worker fails with this argument when used with the --with-scheduler options. I also reproduced the error by running the worker directly.

redis = Redis(..., ssl=True)
worker = Worker('default', connection=redis)
worker.work(with_scheduler=True)

Stack Trace

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 970, in perform_job
    self.prepare_job_execution(job, heartbeat_ttl)
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 865, in prepare_job_execution
    pipeline.execute()
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 4013, in execute
    self.shard_hint)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1187, in get_connection
    connection = self.make_connection()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1227, in make_connection
    return self.connection_class(**self.connection_kwargs)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 828, in __init__
    super(SSLConnection, self).__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'ssl'
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1185, in get_connection
    connection = self._available_connections.pop()
IndexError: pop from empty list

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 970, in perform_job
    self.prepare_job_execution(job, heartbeat_ttl)
  File "/usr/local/lib/python3.7/site-packages/rq/worker.py", line 865, in prepare_job_execution
    pipeline.execute()
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 4013, in execute
    self.shard_hint)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1187, in get_connection
    connection = self.make_connection()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1227, in make_connection
    return self.connection_class(**self.connection_kwargs)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 828, in __init__
    super(SSLConnection, self).__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'ssl'

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020

Are you running version 1.6.1?

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

I am running rq v1.6.1 and redis v3.5.3

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

It seems to be something to do with the connection pool being empty, either rq or redis attempts to create a new connection, and somehow ssl gets passed to the SSLConnection class itself. ssl is a supported arg on the Redis client class not the connection class which makes sense since it uses that argument to dynamically select which connection subclass to use.

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020 via email

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020 via email

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

I would be happy to submit a failing test case but at first glance it does not seem trivial because of the ssl requirement. As far as I can tell there is no support for SSL in the test infrastructure.

As far as I can tell we would need to do the following

  1. Install & configure an ssl tunnel service such as stunnel in the test Docker image
  2. Change the way that connections are created in the tests. Currently there is a non-ssl connection set up by the test base class. Either that or I can create a one of connection for the test and just use the same dbnum as the original connection.

Am I on the right track for this or am I missing something?

Do you mind opening a PR with a failing test case for this?

On Nov 25, 2020, at 11:00 AM, Selwin Ong @.> wrote:  Ok, will take a look at this soon. >> On Nov 25, 2020, at 10:56 AM, BobReid @.> wrote: >> >  > I am running rq v1.6.1 and redis v3.5.3 > > — > You are receiving this because you modified the open/close state. > Reply to this email directly, view it on GitHub, or unsubscribe.

@selwin selwin reopened this Nov 25, 2020
@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020

I was thinking maybe we can just create a Redis connection with SSL, pass it into RQScheduler to instantiate a scheduler can call get_connection(). We only need to test these two bits here, I think:

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

Yes I agree we can just create a connection for the specific test case, that part should be trivial. The bigger problem is SSL support in the redis-server itself.

Redis only got first class TLS support in v6, even then it is an optional compile time configuration. On top of that the redis-server apt-get package used by the test infrastructure is still on v5 so it will need a 3rd part ssl tunnel to be able to connect to the server. Which will take me some time to work through configuring. I am using a managed Redis instance at the moment which comes with SSL support so I haven't set that up before.

As for the code you have linked I would suspect that the following line is the problem:
https://github.com/rq/rq/blob/master/rq/scheduler.py#L49

This seems wrong. LIke I said earlier, ssl is not a parameter to the connection class, it is a parameter to the redis client class. The redis client select the connection class based on it. In this case the code already has already gotten an SSLConnection.

Neither SSLConnection nor Connection take ssl as an argument
https://github.com/andymccurdy/redis-py/blob/2c9f41f46991f94f0626598600d1023d4e12f0bc/redis/connection.py#L506
https://github.com/andymccurdy/redis-py/blob/2c9f41f46991f94f0626598600d1023d4e12f0bc/redis/connection.py#L824

I suspect some of the confusion might be coming from the language around connection in the RQ code base. In RQ a connection generally refers to an instance of a Redis class which is probably more correctly referred to as the redis client. In the redis code base a connection is an instance of the Connection class (or subclass) and is held by a redis client instance. The Redis class constructor takes ssl as an argument but the Connection class does not.

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020

Yes I agree we can just create a connection for the specific test case, that part should be trivial. The bigger problem is SSL support in the redis-server itself.

Redis only got first class TLS support in v6, even then it is an optional compile time configuration. On top of that the redis-server apt-get package used by the test infrastructure is still on v5 so it will need a 3rd part ssl tunnel to be able to connect to the server. Which will take me some time to work through configuring. I am using a managed Redis instance at the moment which comes with SSL support so I haven't set that up before.

I was thinking we could do something that doesn't involve setting up Redis with SSL support. So we just try to get it to one of these points:
a. It can create connection successfully without raising TypeError
b. It tries to connect to Redis but fails because Redis doesn't support SSL.

Something like this:

redis = Redis(ssl=True)
scheduler = RQScheduler(connection=redis)
with self.assertRaises(ConnectionError):  # Or whatever error raised by not being able to connect to redis. As it is now currently, it will raise `TypeError`
    scheduler.get_connection()

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

I can try this but it might not capture the error I am seeing entirely. What I am seeing is the initial connection working but then fails when a subsequent connection is required because there are not available. The result is jobs ids are being popped from the queue but then being orphaned in redis.

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

I am pretty sure I understand the problem now. It is the init method of the scheduler.

        self._connection_kwargs = connection.connection_pool.connection_kwargs
        # Redis does not accept parser_class argument which is sometimes present
        # on connection_pool kwargs, for example when hiredis is used
        self._connection_kwargs.pop('parser_class', None)
        self._connection_class = connection.__class__  # client
        connection_class = connection.connection_pool.connection_class
        if issubclass(connection_class, SSLConnection):
            self._connection_kwargs['ssl'] = True

The scheduler is holding a reference to the same connection_kwargs hash instance as the SSLConnection object. It then manipulates that object and adds the ssl parameter. The connection pool then goes to use its connection_kwargs instance and it has been manipulated from under its feet.

One fix would be to just make a copy of the connection_kwargs hash. On reading further I am little confused by the constructor. You have to pass a Redis instance in the connection parameter but it doesn't save that reference or do anything with it other than grab that connection_pool kwargs and the connection class. It uses that to recreate a Redis instance on demand when the connection property is accessed.

    @property
    def connection(self):
        if self._connection:
            return self._connection
        self._connection = self._connection_class(**self._connection_kwargs)
        return self._connection

@BobReid
Copy link
Contributor

BobReid commented Nov 25, 2020

The quick test case you wanted to try does not work because of the issues I raised above.

I went and did a quick and dirty setup of SSL in the docker container and switched all tests over to connect over SSL and the bug shows up in several of the existing test cases.

#1383

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020

On reading further I am little confused by the constructor. You have to pass a Redis instance in the connection parameter but it doesn't save that reference or do anything with it other than grab that connection_pool kwargs and the connection class. It uses that to recreate a Redis instance on demand when the connection property is accessed.

This strange block of code is required because Python >= 3.8.2 is unable to serialize Redis, leading to this fix where we recreate the Redis instance in the scheduler.

@selwin
Copy link
Collaborator

selwin commented Nov 25, 2020

Actually I tested Redis with SSL on version 1.6.1 and it seems to works just fine. It was able to construct the correct Redis instance and tried to connect to Redis.

ipython — 204×50 2020-11-26 06-37-46

@BobReid
Copy link
Contributor

BobReid commented Nov 26, 2020

@selwin RQ can connect to redis over ssl just fine. The problem arises when the connection pool is empty and it tries to instantiate a new connection because the scheduler has mutated the connection_pools kwargs. RQScheduler is being a bad citizen by mutating a hash that is does not own.

Checkout the PR I opened. If you revert the change to the scheduler constructor then 3 or 4 tests fails.

I am pretty confident in my assessment of the problem. Not sure the best way to capture this error without setting up SSL in the test suite. I don't think you can create a failing test case without it.

I am happy to clean up that code to a point where it can be checked in if you want to go down that path.

@BobReid
Copy link
Contributor

BobReid commented Nov 26, 2020

Here are the failures of running the entire suite under SSL

image

@selwin
Copy link
Collaborator

selwin commented Nov 26, 2020

the scheduler has mutated the connection_pools kwargs. RQScheduler is being a bad citizen by mutating a hash that is does not own.

Ok, now I understand.

In that case, something like this should solve the issue. In __init__:

if issubclass(connection_class, SSLConnection):
    self.ssl = True

In scheduler.connection:

    @property
    def connection(self):
        if self._connection:
            return self._connection
        self._connection = self._connection_class(**self._connection_kwargs, ssl=self.ssl)
        return self._connection

@selwin
Copy link
Collaborator

selwin commented Nov 26, 2020

I am pretty confident in my assessment of the problem. Not sure the best way to capture this error without setting up SSL in the test suite. I don't think you can create a failing test case without it. I am happy to clean up that code to a point where it can be checked in if you want to go down that path.

Given your previous comments, it looks like it won’t be that simple to have an SSL capable Redis running in CI, or set it up with stunnel. What do you think would be a reasonable way forward? I’d be happy with both either way.

@BobReid
Copy link
Contributor

BobReid commented Nov 26, 2020

I just cleaned up my PR and then realized why you said it will be difficult to run SSL on CI. I had just assumed it was using that same docker image in the repo under CI. Now I realize you are using a 3rd party action to run Redis.

I do think the right thing to do is to get SSL working on CI as it is a hole in the test coverage. That being said I would like to find a way to get this merged in sooner rather than later as I had to disable the scheduler on my infrastructure for the time being.

One thought is to create a test decorator that skips the SSL test on CI for the time being until the CI infrastructure is sorted. I am open to other thoughts.

@selwin
Copy link
Collaborator

selwin commented Nov 26, 2020

I agree with you. Let’s get this one in first and I’ll try to release a bugfix version this weekend.

I’m also ok with having a test that’s disabled by default, most people probably won’t have Redis with SSL on their dev machines available anyway.

Once we figure out how to get Redis with SSL to run on CI, we can modify our CI script to run it with SSL test enabled.

@BobReid
Copy link
Contributor

BobReid commented Nov 26, 2020

@selwin
I implemented the test skip for SSL. I think the PR is ready for review. The code cov is failing saying my new test code is not covered by tests. I think it can be ignored or a rule added to codecov to ignore the test code.

#1383

@BobReid
Copy link
Contributor

BobReid commented Nov 27, 2020

I see the fix got merged. When will you be able to publish a release?

@selwin
Copy link
Collaborator

selwin commented Nov 28, 2020 via email

@BobReid
Copy link
Contributor

BobReid commented Dec 9, 2020

I have deployed the update into prod and it is working. @selwin I think you can close this issue.

@selwin
Copy link
Collaborator

selwin commented Dec 10, 2020

Thanks!

@selwin selwin closed this as completed Dec 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants