
(PUP-744) Support persistent HTTP connections #2863

Merged

Conversation

joshcooper
Contributor

This enables puppet agent and puppet apply to use persistent HTTP connections, affecting catalog retrieval, pluginsync, remote file sources, and reports. This is most noticeable when pluginsyncing many small files (e.g. modules, external facts), because previously puppet made a new HTTP connection (including the TCP and SSL handshakes) for every file.

Given a master:

root@puppetmaster:/etc/puppet/manifests# ruby --version
ruby 1.8.7 (2010-08-16 patchlevel 302) [x86_64-linux]
root@puppetmaster:/etc/puppet/manifests# puppet --version
3.6.2
root@puppetmaster:/etc/puppet/manifests# puppet module list
/etc/puppet/modules
├── puppetlabs-acl (v1.0.2)
├── puppetlabs-apache (v1.1.1)
├── puppetlabs-concat (v1.1.0)
├── puppetlabs-dism (v0.2.1)
├── puppetlabs-mysql (v2.3.1)
├── puppetlabs-reboot (v0.1.7)
├── puppetlabs-registry (v1.0.2)
└── puppetlabs-stdlib (v4.3.2)

Given a script to execute the agent 10 times (deleting the pluginsync destination directory each time):

#!/bin/bash
for i in `seq 1 10`
do
  printf .
  rm -rf ~/.puppet/var/lib
  bundle exec puppet agent --onetime --no-daemonize --server puppetmaster.solar.lan &> err.log
done

Without these commits, it takes the agent 1m22s:

$ git rev-parse HEAD
cb4ad1f19b043a70ba17e4103a25f5c97a76948b
$ time ./test.sh
..........
real    1m22.362s
user    0m43.996s
sys 0m7.456s

With these commits, it takes 38s:

$ git rev-parse HEAD
867ea2e86fa263d2ee64db2dda0d9f4004b86878
$ time ./test.sh
..........
real    0m38.357s
user    0m24.662s
sys 0m6.807s

@joshcooper joshcooper changed the title Do Not Merge (PUP-744) Do Not Merge Jul 11, 2014
@puppetcla

CLA signed by all contributors.

@joshcooper
Contributor Author

It appears Net::HTTPS#ca_file has been available since at least 1.8.7

@joshcooper
Contributor Author

HTTP 1.1 assumes keep-alive, and ruby has defaulted to HTTP 1.1 as far back as 1.8.7. I also verified that when the puppet agent explicitly starts the connection via Net::HTTP#start, the server responds with its Keep-Alive header, indicating that it is ready to persist the connection:

<- "GET /production/file_metadatas/pluginfacts?links=manage&recurse=true&ignore=.svn&ignore=CVS&ignore=.git&checksum_type=md5 HTTP/1.1\r\nAccept: pson, msgpack, yaml, b64_zlib_yaml, raw\r\nAccept-Encoding: identity\r\nUser-Agent: Ruby\r\nHost: puppetmaster:8140\r\n\r\n"
-> "HTTP/1.1 200 OK \r\n"
-> "X-Puppet-Version: 3.6.2\r\n"
-> "Content-Type: text/pson\r\n"
-> "Server: WEBrick/1.3.1 (Ruby/1.9.3/2014-02-24) OpenSSL/1.0.1e\r\n"
-> "Date: Fri, 18 Jul 2014 07:37:41 GMT\r\n"
-> "Content-Length: 302\r\n"
-> "Connection: Keep-Alive\r\n"

@joshcooper joshcooper changed the title (PUP-744) Do Not Merge (PUP-744) Support persistent HTTP connections Jul 24, 2014
This commit creates a site value object containing a scheme, host, and
port. A site identifies an HTTP(S) connection. It does not currently
take into account proxy host, port, user, password, as these are global
puppet settings.

The Site class defines the `eql?` and `hash` methods making it suitable
for use as the key in a hash table.
This commit adds a Site#move_to method that creates a new site pointing
to a new location. Note that we don't preserve the path from the new
site, as sites are only concerned with the HTTP connection layer, not
the HTTP request path and query parameters.
This commit adds a site helper method for use_ssl?
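The commit messages above describe the value semantics; a minimal Ruby sketch of such a site object (illustrative only, not Puppet's actual implementation) might look like:

```ruby
require 'uri'

# Illustrative sketch of a site value object. A site identifies an
# HTTP(S) connection by scheme, host, and port, and nothing else.
class Site
  attr_reader :scheme, :host, :port

  def initialize(scheme, host, port)
    @scheme, @host, @port = scheme, host, port
  end

  def use_ssl?
    @scheme == 'https'
  end

  # Point at a new location, keeping only connection-level details;
  # the URI's path and query are deliberately dropped.
  def move_to(uri)
    Site.new(uri.scheme, uri.host, uri.port)
  end

  # Value equality plus a matching #hash makes Site usable as a
  # hash-table key.
  def eql?(other)
    other.is_a?(Site) &&
      scheme == other.scheme && host == other.host && port == other.port
  end
  alias == eql?

  def hash
    [scheme, host, port].hash
  end
end
```

Two sites built from the same scheme, host, and port then index the same pool entry, regardless of which request paths they were derived from.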
An HTTP session binds a puppet HTTP connection and an expiration time.
Previously, puppet connections were directly responsible for creating
Net::HTTP objects. This was okay because there was only one code path
where new Net::HTTP objects needed to be created.

However, in order to support different connection pool behaviors,
caching and non-caching of Net::HTTP objects, we want to have two pool
implementations, and each of them will need the ability to create
Net::HTTP objects.

This commit creates a factory object for creating Net::HTTP objects,
and has the puppet connection delegate to it.
This commit adds a DummyPool that doesn't perform any caching of network
connections. It also adds the pool to the context system, but does so
using a Proc to defer evaluation of the pool. This way we are not
loading networking code while bootstrapping puppet.
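The deferral trick is plain Ruby: store a Proc instead of a value, so nothing network-related is loaded until the binding is first resolved. A generic illustration (not the actual Puppet context API):

```ruby
# The expensive work (requires, object construction) lives inside the
# Proc and runs only when the binding is resolved, not at bootstrap.
loaded = false
bindings = {
  http_pool: proc do
    loaded = true          # stands in for `require 'net/http'` etc.
    :no_cache_pool         # stands in for the pool instance
  end
}

# Bootstrapping can build the bindings table without touching the Proc.
# Only when networking is actually needed does the pool get built:
pool = bindings[:http_pool].call
```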
This commit adds a pool for caching puppet network connections. The pool
maintains a hash indexed by site.  Each site refers to an array of
sessions, where each session refers to a puppet HTTP connection and its
timeout.

Consumers of the pool are expected to follow this pattern:

    conn = pool.take_connection(site)
    # do something with conn
    pool.add_connection(site, conn)

Whenever a connection is (re-)added to the pool, the pool will assign an
expiration timeout. When a caller next requests a connection, the pool
will lazily close any expired connection(s) for that site.

The timeout exists, because if there is a long delay between the time
that an open connection is added to the pool and when it is taken, the
remote peer (e.g. Apache) may close the connection. As a result, we
only want to cache connections for a short time. Currently the timeout
defaults to 15 seconds.
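A hedged sketch of that scheme (names are illustrative, not Puppet's actual pool API); the `now` parameter exists only to make expiry testable:

```ruby
# Illustrative caching pool: sessions are cached per site, and expired
# connections are closed lazily on the next take_connection call.
class CachingPool
  Session = Struct.new(:connection, :expires_at)

  def initialize(keepalive_timeout = 15)
    @keepalive_timeout = keepalive_timeout
    @sessions = Hash.new { |hash, site| hash[site] = [] }
  end

  # (Re-)adding a connection assigns it a fresh expiration time.
  def add_connection(site, connection, now = Time.now)
    @sessions[site] << Session.new(connection, now + @keepalive_timeout)
  end

  def take_connection(site, now = Time.now)
    # Lazily close any expired sessions for this site.
    expired, live = @sessions[site].partition { |s| s.expires_at <= now }
    expired.each { |s| s.connection.close rescue nil }
    @sessions[site] = live
    session = live.pop
    session && session.connection
  end
end
```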
Previously, the dummy and caching pools allowed the caller to take a
connection from the pool and retain ownership of the connection. This is
undesirable because if the caller encounters an exception while using
the connection, then they have to remember to close it, otherwise the
underlying socket will leak.

There is also the issue that a caller may forget to add it back to the
pool before the pool is closed, in which case the socket will again
leak.

This commit changes the API to a block form to ensure connections are
always returned to the pool or closed.

Note the `reuse` flag is used to ensure the connection is returned to
the pool even if the caller does a non-local return:

    with_connection(..) { |conn| return }
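A sketch of that block form (standalone names, not Puppet's exact code). The ensure clause runs even on a non-local return from the block, and the reuse flag keeps a connection that raised mid-use out of the pool:

```ruby
def with_connection(pool, factory, site)
  reuse = true
  conn = pool.take_connection(site) || factory.create_connection(site)
  begin
    yield conn
  rescue
    reuse = false   # don't cache a connection in an unknown state
    raise
  ensure
    # Reached on normal exit, exceptions, and non-local returns alike.
    reuse ? pool.add_connection(site, conn) : conn.close
  end
end
```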
Previously, the puppet connection owned a ruby http connection, and
could create it lazily in a number of different ways, e.g. `#address`.

If an HTTP request resulted in a redirect, the `@connection` instance
variable would be overwritten with a new ruby http object. This made
it difficult to reason about the lifecycle of ruby http objects.

The commit associates a site with each puppet connection. As a result
it is trivial to ask the puppet connection what its `#address` is and
do so without creating a ruby http object.

This commit also removes the `@connection` instance variable. Instead
connections are obtained from the pool for the duration of the request,
e.g. `Connection#get`.

The `Connection#connection` private method is retained for now, as a
number of tests rely on it, via send(:connection). Currently it
delegates to the factory to return a new connection, but will be deleted
in a future commit.

Finally, the redirect loop relied on a non-local jump. This exposed a
bug in the pool implementation, because the connection wasn't returned
to the pool. That bug was fixed in a preceding commit, but to ensure
these sorts of things don't happen in the future, this commit returns
the response outside of the `with_connection` block.
This commit overrides the http_pool binding stored in the context system
with a caching pool implementation. The caching pool is only used for
the duration of the configurer run, which includes both agent and apply
applications.

The pool is closed in an ensure block to ensure any cached connections
are closed. Also the `run_internal` method is marked as private to make
it clear which version callers should use.
DummyPool was a bad name; rename it to describe what it actually does.
Previously, the puppet connection class owned the Net::HTTP factory and
passed it to the pool when requesting a connection. However, the caching
pool only needs the factory when creating a new connection.

This commit makes the factory stateless and pushes it into the
respective pools. The SSL verifier is stateful (contains peer certs)
and remains in puppet's connection class.

When a pool needs to create a connection, it calls back into the puppet
connection object to initialize SSL on the Net::HTTP object. In the case
of the caching pool, it needs to do this before starting the
connection, which will perform the TCP connect and SSL handshake
operations.
The puppet connection class doesn't have an ssl_host instance variable.
Previously, the "Starting connection" message came before the "Creating
connection" message. This commit reorders the messages to appear in the
correct order.
Commit 553b2ad (circa 2007) monkey patched Net::HTTP to set
`Net::HTTP#ca_file=`. This method is needed to specify the file
containing the set of CA cert(s) that the SSL client will use to
authenticate SSL servers.

Note the `ca_file=` method is available in all versions of ruby 1.8.7
and up[1], so it is no longer necessary to monkey patch this.

[1] https://github.com/ruby/ruby/blob/v1_8_7/lib/net/https.rb#L145
When starting a persistent HTTP connection, we do not explicitly specify
`Connection: Keep-Alive`, because that is the default in HTTP/1.1[1].

This commit adds a test to ensure we are using HTTP 1.1 or later.
Amazingly ruby has defaulted to 1.1 as far back as ruby 1.8.7[2].

[1] http://tools.ietf.org/html/rfc7230#appendix-A.1.2
[2] https://github.com/ruby/ruby/blob/v1_8_7/lib/net/http.rb#L282
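A quick way to confirm this in any ruby (Net::HTTP::HTTPVersion is the protocol version the library emits in its request line):

```ruby
require 'net/http'

# HTTP/1.1 connections are persistent by default (RFC 7230, App. A.1.2),
# which is why no explicit Connection: Keep-Alive header is needed.
puts Net::HTTP::HTTPVersion
```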
Previously the http keepalive timeout was hard coded to be 15 seconds.
This commit adds a new puppet setting, `http_keepalive_timeout`, which
defaults to 4 seconds. The smaller value was chosen because Apache 2.2
and 2.4 have a corresponding server-side keepalive timeout, which
defaults to 5 seconds.

If the client timeout were greater than or equal to the server timeout,
then the client may try to reuse a connection from the pool that the
server has almost certainly closed.
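If the default needs tuning, the setting presumably goes in puppet.conf like any other setting (section placement illustrative):

```ini
[agent]
# keep strictly below the server's keepalive timeout (Apache default: 5s)
http_keepalive_timeout = 4
```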
This commit adds a puppet setting, `http_debug`, which enables HTTP requests
and responses to be written to stderr. It is disabled by default, and
should remain so in production environments, because it can leak
sensitive information.

If enabled, it displays information like:

    opening connection to puppetmaster...
    opened
    <- "GET /production/node/node?...\r\n"
    -> "HTTP/1.1 200 OK \r\n"
    -> "X-Puppet-Version: 3.6.2\r\n"
    -> "Content-Type: text/pson\r\n"
    -> "Server: WEBrick/1.3.1 (Ruby/1.9.3/2014-02-24) OpenSSL/1.0.1e\r\n"
    -> "Date: Fri, 18 Jul 2014 07:37:41 GMT\r\n"
    -> "Content-Length: 4202\r\n"
    -> "Connection: Keep-Alive\r\n"
    -> "\r\n"
    reading 4202 bytes...
    -> ""
    -> <response_body>
    read 4202 bytes
    Conn keep-alive
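That trace is the format Net::HTTP itself produces via #set_debug_output, so the setting is presumably wired up along these lines (a sketch, not the actual Puppet code; the ENV check stands in for reading the setting):

```ruby
require 'net/http'

http = Net::HTTP.new('puppetmaster', 8140)
# When enabled, Net::HTTP echoes the raw exchange (the "<-"/"->" lines
# above) to the given IO; pointing it at stderr matches the setting's
# described behavior.
http.set_debug_output($stderr) if ENV['HTTP_DEBUG']
```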
The puppet connection class has request_get, etc. methods that accept a
block. I tried to deprecate them, to eliminate the possibility that a
caller could borrow a connection from the pool indefinitely.

However, I found that we have to pass a block to ruby's request method
in order to stream the response body. If you don't pass a block, ruby
will always read the entire response body into memory. If you later
try to stream the body:

    response.read_body { |chunk| yield uncompressor.uncompress(chunk) }

It will fail with:

    Net::HTTPOK#read_body called twice

This commit updates the comment to make clear why these other
request_* methods exist.
Previously, the `Connection#connection` method was used in
`connection_spec.rb` to validate the state of the Net::HTTP object
created by the puppet connection.

Since the puppet connection no longer owns a Net::HTTP object (it may
use newly created or cached instances), the `#connection` method
doesn't make sense, and since it was private, can be removed.

The tests that relied on the private method have been updated, and in
some cases moved to the factory_spec, e.g. verifying proxy settings.
Previously, the connection would mutate its host, port, etc when
following redirects. This wasn't really an issue, because connections
were always single use.

Now that connections can be cached, we do not want to mutate the
original site information even when following permanent redirects.
Previously, the `redirect_limit` specified the maximum number of HTTP
requests that the connection would make, not the maximum number of
redirects that it would follow. As a result, a limit of 0 would prevent
any HTTP requests from being made. This was introduced in 60f22eb when
the redirect logic was added.

This commit changes the limit so that 0 means don't follow redirects,
1 means follow one redirect, etc.
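The corrected semantics can be sketched with a hypothetical helper (the block stands in for actually issuing the request):

```ruby
# A limit of N permits N redirects, i.e. up to N + 1 requests;
# limit 0 issues exactly one request and follows nothing.
def execute_with_redirects(request, redirect_limit)
  (redirect_limit + 1).times do
    response = yield request
    return response unless response[:redirect]
    request = response[:location]
  end
  raise "redirect limit of #{redirect_limit} exceeded"
end
```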
We do use persistent HTTP connections, sometimes.
Previously, I was concerned that a caller could borrow a connection
from the pool, and perform a long operation before returning it to the
pool. That would make it easy for the caller to return an expired
connection to the pool, causing issues for the next HTTP
request.

However, that is not likely, because connections are only borrowed for
the duration of the `Connection#get`, `#post`, etc calls, and those
methods don't take a block. The connection methods that do take a
block, e.g. `#request_get` are deprecated, and are not used within
puppet.
Remove expectations that were secondary to the actual test, and
clarify the behavior being tested.
Add yard doc for http pooling. Also adds a test to verify multiple
persistent connections can be borrowed from the pool.
Previously, it was possible for puppet to create an HTTPS connection
with VERIFY_NONE, and cache the connection. As a result, the pool code
could hand out an insecure connection, when the caller expected a secure
one (VERIFY_PEER).

In practice this isn't an issue, because puppet only uses insecure HTTPS
connections when bootstrapping its SSL key and certs, and that happens
before `Configurer#run` overrides the default non-caching pool
implementation. However, it could be an issue if a provider were to
request an insecure connection *during* catalog application, as that
connection would be cached, and possibly reused by puppet, e.g. when
sending the report.

This commit modifies the connection pool to only cache HTTP or secure
HTTPS connections. All other connections are closed after one HTTP
request/response.
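The criterion can be sketched as (illustrative names, not the actual pool code):

```ruby
require 'openssl'

# Cache only plain HTTP, or HTTPS that actually verifies the peer;
# VERIFY_NONE connections are used once and closed.
def cachable?(connection)
  !connection.use_ssl? ||
    connection.verify_mode == OpenSSL::SSL::VERIFY_PEER
end
```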
@joshcooper
Contributor Author

rebased onto master to pick up spec test fixes

@kylog

kylog commented Aug 5, 2014

This looks great! You write very clearly - thanks for that. Other than my minor comments inline I'm 👍

Previously, validator_spec.rb was requiring
'puppet/ssl/configuration', because puppet/ssl.rb did not require
it. Commit 658e4fd fixed puppet/ssl.rb, so it is no longer necessary
or desired for validator_spec.rb to require puppet/ssl/configuration.

Also, the default and no_validators were not consistent in expressing
their dependencies on openssl and puppet/ssl.
@kylog

kylog commented Aug 6, 2014

Okay thanks for the backstory on some of the questions above. This looks great! I'm 👍 but will hold off on merging until we get current Jenkins issues sorted out.

kylog pushed a commit that referenced this pull request Aug 7, 2014
…ooling

(PUP-744) Support persistent HTTP connections
@kylog kylog merged commit 789e00b into puppetlabs:master Aug 7, 2014
@joshcooper joshcooper deleted the ticket/master/PUP-744-http-pooling branch August 14, 2018 05:48