Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Issue/stable/pup 2659 synchronize webrick rest #2708
Puppet had some scattered thread safety code, which was removed in 4da350b, because Puppet itself is not using threads. However, Webrick starts a new thread to handle each incoming request. This was probably causing subtle problems during simultaneous requests to a webrick master, but since 3.6.1 the issue became much more noticeable due to the fix for PUP-2610 merged in 0dc5ad3 which was fixing a case where rack masters would lose track of newly established environment loaders using the master settings. The change to Puppet::Context used a new mark/rollback method for overriding context making it very obvious when concurrent threads would collide attempting to rollback the same marker and exhaust the context stack. We are fixing this by synchronizing around incoming webrick requests, effectively serializing their access to Puppet's innards. Initial manual benchmarking did not show a significant decrease in performance, probably because Ruby uses green threads handled by the vm which cannot take advantage of multiple processors.
It is difficult to test concurrent requests with Beaker as it does not allow parrallel requests to be made. I'm using backgrounded subprocess in a shell script to get around this, and I'm only making direct catalog requests via curl, since an agent will halt if it spots another's pidfile in existence on the same node. Still, this is one of the major calls that the agent makes, and it does effectively expose PUP-2659 on a non-synchronized webrick master.
While it looks like a good idea, the problem is the performance cost of this. Before during a compilation run, another agent could access the CA or the file server subsystem. Now it will have to wait (and timeout if the catalog compilation takes some time) to get served.
I know that webrick is not the recommended setup, but the ticket PUP-2569 shows that there's a lot of people still using it in small/mid-size environments, so it should definitely work well for them.