Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection's recv buffer should be emptied before sending new command #52

Closed
myrfy001 opened this issue Aug 28, 2017 · 8 comments
Closed

Comments

@myrfy001
Copy link

In some case, if a command was interrupted by outer exceptions(such as timeout), the connection will be released to the pool containing unconsumed data. The unconsumed data will be read by the next command and thus lead to mistake.

The following code can reproduce this situation (it use async_timeout library from aio-libs):

# coding:utf-8

from aredis import StrictRedisCluster
import asyncio
import random
import async_timeout
import time

redis_cluster_client = StrictRedisCluster(
    startup_nodes=[{"host": "127.0.0.1", "port": 7001}],
    max_connections=40, max_connections_per_node=True)


async def worker():
    while(1):
        try:
            with async_timeout.timeout(0.05):
                for _ in range(2):
                    await redis_cluster_client.exists("test_key")
                    await asyncio.sleep(random.random() / 100)
                await redis_cluster_client.set("test_key", "1")
                await asyncio.sleep(random.random() / 100)
        except Exception as e:
            print(e)


def main():
    loop = asyncio.get_event_loop()
    asyncio.gather(*[worker() for _ in range(30)], loop=loop)
    loop.run_forever()


if __name__ == "__main__":
    main()

And you will get output like this after running for sometime:

'int' object has no attribute 'decode'
'int' object has no attribute 'decode'
'int' object has no attribute 'decode'
'int' object has no attribute 'decode'

In the code above, a very small timeout value (0.05 second) is set, if you can't reproduce the error, you can try to reduce it to a even smaller one.

The exists command will return 0 or 1, and the set command will get the int type value returned for exists command and lead to a error.

In my production code, a worker will execute a job with a timeout limit, if the job timeout, it may cause this error.

@myrfy001
Copy link
Author

Maybe there is another way to solve this problem. Add a state property to the Connection class, and mark a connection as busy, before sending a request and mark it as idle after handled the response. Before sending a new command, check the status.

@NoneGG
Copy link
Owner

NoneGG commented Aug 28, 2017

That sounds reasonable, would you like to make a pr? Or i will try to fix this problem this weekend

@NoneGG
Copy link
Owner

NoneGG commented Aug 30, 2017

@myrfy001
Do you think it should be regarded as reusable if a connection is broken(stop before or during reading response)?

For now i try to fix this issue by adding attribute to connection to indicate if it is waiting for response, if it is waiting then connection will be disconnected and never put back to available connection list again. The parse response error 'int' object has no attribute 'decode' is fixed and never raised again. (but TimeoutError raised by asyncio_timeout is still there)

But i am not sure if this kind of connection should be clean and reused again.
Looking forward to your opinion.

@myrfy001
Copy link
Author

I think close the connection is the safest way. Think this situation: because of network delay, or because of large responses by bulk execution, the response of the pervious command may arrive after cleaning the buffer. So, close the connection is a safe way to avoid this. Besides, this problem is an edge case so it's seldom happens, so, there won't be a lot of reconnect action that will lower the efficiency.

@NoneGG
Copy link
Owner

NoneGG commented Aug 31, 2017

That's good, this patch will be tested and merge to master soon.

@NoneGG
Copy link
Owner

NoneGG commented Aug 31, 2017

@myrfy001
Patch is already merged to master and you can try it with
pip install git+https://github.com/NoneGG/aredis.git

If there are other problems please tell me~

@NoneGG
Copy link
Owner

NoneGG commented Sep 4, 2017

@myrfy001
May i close this issue?

@myrfy001
Copy link
Author

myrfy001 commented Sep 4, 2017

It seems working well.
Looking forward to the next release.

@NoneGG NoneGG closed this as completed Sep 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants