Skip to content

Conversation

@bowensong
Copy link

I've noticed that when using the pyrax to do bulk uploads, it keeps closing and creating TCP connections to the same host, this is very inefficient, and should be avoid.

The solution for this issue is using a connection pool to manage HTTP(TCP) connections, because the requests library has implemented the connection pool (which is in the requests.Session module), I'm changing the pyrax.http module to use it.

For the concern of thread-safe, I'm using a thread local storage to store the sessions for each individual thread, so there will be no interference between multiple threads sharing one pyrax instance.

…y TCP reconnecting

Also updated pyrax.http unit test cases according to the changes
@coveralls
Copy link

Coverage Status

Coverage increased (+0.0%) to 96.72% when pulling 6e05325 on micro-alex:master into 70d5c10 on rackspace:master.

@sivel
Copy link
Contributor

sivel commented Jun 4, 2015

I'm not sure that this is the correct approach. Since pyrax can work within contexts, I think that a session would need to be created for the individual contexts, rather than at a global level.

@bowensong
Copy link
Author

@sivel I'll take a look tomorrow, especially focus on does share connections between context do any harms.

@bowensong
Copy link
Author

I've closely looked the code, seems there are a few opetions:
1, global sessions for each thread in pyrax.http
This is the implemented one in this PR
2, add session to the BaseClient class
In this case, whether user is using the context, the connection pool will working properly as long as the user doesn't create new client object each time they doing something with pyrax. However, if the user has large amount of *Client instance, this will cause the program keep large amount TCP connections. Also, this will increase the cost of creating / destroying any *Client objects.
3, add session to the BaseIdentity class
Users will call ctx.get_client() function to get the actually *Client class, so the session will pass to all *Client objects created from the get_client() function, all of the *Client instance created form the same context will share the same session (connection pool), this maybe NOT thread-safe. See stackoverflow question of this.
Also, obviously, if an user doesn't use context, this will be not helpful at all, the user will still suffering from recreating connections.
@sivel Which solution do you think is better? Or, maybe you have better suggestions?

@bowensong
Copy link
Author

BTW, I think I've made a little mistake - this PR is against the master branch, not the working branch.

@briancurtin
Copy link
Contributor

Would you be able to send me a generic example script that shows the operation(s) that would be benefitted by this change - basically, what were you running that caused you to see a need for this change? We're in the process of building out a replacement for pyrax and we're reasonably close to "done" in a number of areas*, and it's built from the ground up around sessions. I'd like to see that we're doing the right thing in that project, and perhaps have you look at it as well.

* It's an OpenStack project right now, but most of the key areas of it will already work verbatim for Rackspace such as Cloud Servers and Cloud Files. There's still a few other services we'll need Rackspace-specific support added to, but they're coming up soon.

@bowensong
Copy link
Author

@briancurtin here's a simple test script I was using for test, please open a tcpdump / tcptrack or other network monitoring tools when you running this script.

import threading
import random
import pyrax


def randomdata(length):
    return ''.join([chr(random.randint(0, 255)) for i in range(length)])


def thread_func_inf_obj_handle(thread_num, cf):
    counter = 0
    while True:
        filename = 'TEST_%d_%d.test' % (thread_num, counter)
        counter += 1
        content = randomdata(random.randint(1000, 100000))
        cf.store_object(filename, content, return_none=True)


def main():
    pyrax.settings.set('identity_type', 'rackspace')
    pyrax.set_credential_file(MY_CREDENTIAL_FILE)
    cloudfiles = pyrax.cloudfiles
    cloudfiles.timeout = 120
    cf = cloudfiles.get_container('test')

    threads = []
    for i in range(10):
        threads.append(threading.Thread(
            target=thread_func_inf_obj_handle,
            args=(i, cf)))
        threads[-1].start()

    for t in threads:
        t.join()

if __name__ == '__main__':
    main()

In our production environment, the filename and randomdata() is replaced by real data, and the while True is replaced by a for loop to iterate some data.

Just in case you need to know the credential file:

[rackspace_cloud]
username = xxx
api_key = xxx
region = LON
identity_type = rackspace

@bowensong
Copy link
Author

I have been waiting for a month now, I'm hoping someone can review this PR, then decide to merge it, close it or give me some better ideas.

@briancurtin
Copy link
Contributor

I'll leave this one to sivel to decide if he wants to merge it - he's on vacation for a while so that may take some time. This looks maybe ok, but I'd want to be sure from him if it has any implications when used with Ansible. Pyrax being lacking in functional tests and having fragile unit tests means a change like this is fairly risky.

As I mentioned before, we're working on the replacement for pyrax and so are not actively developing this project. We expect to have an announcement soon on what the official plans are for deprecation, release planning, etc.

@bowensong
Copy link
Author

Thanks, I'm looking forward.

@bowensong
Copy link
Author

@briancurtin Hi, has @sivel come back for the vacation?

@idealatom
Copy link

idealatom commented Jul 12, 2017

Any news about this pull request? Almost two years gone already...

@briancurtin
Copy link
Contributor

We are no longer supporting Pyrax.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants