-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up knife vault refresh #178
Conversation
Doing a full search on the node index means the entire node object is downloaded for each hit. As we are only interested in the name, only ask for that.
288e941
to
efc947c
Compare
+1 |
I'm not entirely sure how to get the specs to run properly here, help appreciated. :-) |
Thanks for the idea/contribution. I am however concerned about thread safety here. I don't believe that Chef guarantees that the chef-client code is thread safe, and even if the current implementation is on MRI, there's no guarantee that it will always be that way. Inside of the thread, you're calling add_client, which in turn calls add on the ChefVault::ItemKeys object. That mutates the ItemKeys object (pushing entries onto the clients array), but without any protection around the update. On MRI that may work because all user code is implicitly single threaded, but if MRI ever improved its threading to match that of JRuby or Rubinius, it would be subject to race collisions. It seems like the better way to speed things up is to not search for nodes names, then re-issue the search query to get the public key of each node. Is the public key something we can get from the original search query? Otherwise, I'd want to see race protection and unit tests to exercise the edge cases in order to consider merging this. |
I was thinking about batching as well with the suggestion above (search for nodes, then search for keys from list of nodes). I'm concerned over possible race condition as well in between batch updates as there's no guarantee in between batch that the underlaying client info haven't changed. |
]] James FitzGibbon
Aren't pushing items onto an array thread safe in Ruby? Oh well. :-)
Nodes don't have keys. Clients have keys, so that's basically my other client_name = 'foo' OR client_name = 'bar' OR client_name = 'baz' OR … This could potentially be even quicker, since fewer roundtrips. I can whip up a patch to do this and you can see what you think? Tollef Fog Heen |
In MRI, only one user thread can run at a time (more than one system thread can run simultaneously), so you get synchroniziation "for free", but it's implementation dependent and not guaranteed. There are also edge cases in which MRI isn't thread safe: http://www.rubyinside.com/does-the-gil-make-your-ruby-code-thread-safe-6051.html But even if we put a mutex around the updates to the item keys, there is no guarantee that the chef client code is thread safe or will remain so. The patch as written issues multiple API operations using the same underlying Chef::REST object in multiple threads. I'm not aware of any chef-authored code that does this, and Chef could choose to break this without warning (because it's not guaranteed by the API). So, to do any of this multithreaded, there would need to be a mutex around the chef API operations and an mutex around the item key mutations, at which point why bother trying to multithread? So trying to expand the clients into the "NAME or NAME or NAME" style may be a better route, but what happens with multi-thousand node installations? Does SOLR have a limit on how many clauses can be in the query? What's the algorithmic complexity of SOLR searches? I have no idea off the top of my head. Anything up to linear then your approach is worth exploring, but quadratic or worse is probably a show stopper. I guess what I'm saying is that I don't see a safe, obvious solution to speed this up. Even the new Chef server API v1 doesn't fix this, as you don't get the key back from a client query - you have to pull a sub-resource (e.g. /clients/foo/keys/1) to get it. I'm open to changes that make chef-vault better, but only if they don't introduce stability problems or require people to use Chef client or server v12 or Ruby 2. There's a huge installed base of people still using chef-client v11, and preventing them from using chef-vault would have to be paired with a major version bump. Thanks |
This will use a partial search for the client names for smaller deployments, but use a full client search for large deployments. It's still a win due to the much lower number of roundtrips.
I've pushed a non-threaded solution. This runs in ~30s for me, as compared with ~15 minutes with master. Again, this will likely require test adjustments. Happy for input on the approach. |
Have you had a chance to think about the approach in the latest patch here? I believe it is correct (but will of course require fixing up the tests so it actually passes still) |
@tfheen I'll go ahead and verify the tests with the new patch (is this upstream now or still a branch?). I think it should still work out of the box since partial_search just go through the same chef-zero search mechanism that emulated by test-kitchen databags injecting databags into the local chef-zero environment. |
@sysbot it's not merged yet (even if everything else was great, I don't think upstream would merge a patch with failing tests). As you can see from the status on the pull request, the tests fail (but fixing it should be fairly easy, from what I can see). |
The API for Chef::Search::Query changed between Chef 11 and Chef 12: https://github.com/chef/chef/blob/11.18.14/lib/chef/search/query.rb https://github.com/chef/chef/blob/12.4.1/lib/chef/search/query.rb Using v11-style args against the v12 args works, but I don't think it works the other way. The hash of options that you're passing will be interpreted as the sort arg in 11, and it will attempt to URL-encode the .to_s representation of the Hash. Support for chef-client v11 will not be dropped until the next major release of chef-vault. I appreciate the attempt to solve this problem, and I'm not trying to be purposefully negative, but the speedup you're trying to achive is not trivial to do without breaking existing installations. Perhaps what is required here is some ancillary library for managing chef-vault data bags, with its own cli. Then you could use threads with ridley or chef-api (not sure if that's threadsafe, so don't quote me) and get the speedup you desire without the change also affecting runtime decryption of secrets, which is the part that has to be compatible with Ruby 1.9 and chef-client v11. |
This is based on @tfheen's work in #178, and a suggestion from @rveznaver in #237 Signed-off-by: Thom May <thom@chef.io>
This is based on @tfheen's work in #178, and a suggestion from @rveznaver in #237 Signed-off-by: Thom May <thom@chef.io>
Some of this landed in #240, and it's unlikely we'll rewrite to do threads any time soon. Thanks for the PR! |
Use partial search and threads to speed up.
Fixes: #177