Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a high-level HTTP client API #7398

Merged
merged 5 commits into from
Nov 19, 2022
Merged

Conversation

basil
Copy link
Member

@basil basil commented Nov 16, 2022

Background

During the past few weeks we have been rewriting various usages of Commons HttpClient 3.x to Commons HttpClient 4.x or the new HTTP client in the Java Platform. While doing so, we have observed that there is a lot of duplicate code for HTTP client configuration spread across many plugins.

Problem

A common need is to configure the HTTP client with the proxy server and port specified in the Jenkins configuration (optionally with proxy authentication username and password), and another use case we have seen is configuring the timeout or user agent to a configured value. While Jenkins core defines ProxyConfiguration#open and ProxyConfiguration#getInputStream APIs based on HttpURLConnection, the underlying Java API is too low-level to be practical. Some library plugins (e.g., OkHttp) define a Jenkins-specific wrapper API for handling this configuration for a higher-level library, but others (e.g., Apache HttpComponents Client 4.x API) do not, leading to a proliferation of duplicate code in consumers (e.g., https://github.com/jenkinsci/jdk-tool-plugin/blob/62d2fd4b47931c9cb023eedbc33447eeb91d852e/src/main/java/hudson/tools/JDKInstaller.java#L463-L480) and a variety of mechanisms for configuring common options.

Solution

While thinking about the above we began to wish for a standard Jenkins core API that provided a pre-configured and easy-to-use HTTP client. With the completion of JEP-321, Java 11 provides a standardized high-level HTTP client. While not as fully-featured as Apache Commons HttpClient or other libraries, it has the advantage of being simple and easy-to-use at a high level and is part of the Java Platform. This PR adds some wrappers to make this API easier to use in the context of Jenkins core and plugins.

Pros

We believe that plugins should generally prefer this new API to one of the many HTTP client library plugins when targeting Java 11. We have been programming with the Java Platform HTTP client for a month or so now, and we find it much simpler and easier to use than the Apache HttpClient 4.x and 5.x APIs (which we are very familiar with). Using the Java Platform's HTTP client rather than a plethora of third-party libraries makes it more likely that configuration can be consistently applied for consumers in core and plugins.

Cons

The Java Platform HTTP client is not without its share of flaws, though. For example, SSL hostname verification can only be disabled with a JVM-wide system property, not per HttpClient instance. Furthermore, proxy authentication does not seem to work when using a proxy to connect to an HTTPS URL unless HTTPS is removed from the jdk.http.auth.tunneling.disabledSchemes setting via JVM system properties. And there are some use cases not accomodated by the new Java Platform HTTP client, mostly revolving around retries. My opinion is that this type of robustness logic is better implemented by composing HTTP logic with something like Retrier or Spring Retry rather than baking in the concept of retries to every high-level API. Retries (with the associated concepts of exponential backoff, thundering herds, idempotence, etc.) are a complex topic, and this complex topic deserves its own library and API (my all-time favorite in Python is Tenacity).

Implementation

Back to this PR, we added two new APIs to ProxyConfiguration: one to get an instance of HttpClient and one to get an instance of HttpRequest. The former automatically handles proxy configuration and connection timeout, while the latter automatically handles setting the user agent. These are all set to standard Jenkins values that are configurable either in the Jenkins UI or with well-established Jenkins-specific system properties.

We then migrated a handful of consumers to use the new APIs; namely, form validation and telemetry submission. These are low-risk consumers, so if anything goes wrong the impact should be minimal. But we think they are enough to demonstrate the viability of the general approach: providing a very thin wrapper layer around the native Java Platform APIs. We did not attempt to convert some more advanced usages, like actually fetching the JSON blob from the Update Center. Those use cases are currently implemented with a lot of low-level HttpURLConnection-based code, wrapped in Kohsuke's RetryableHttpStream, and do lots of tricky low-level things with retries, last-modified headers, redirects, etc. which seemed like it would be tricky to reason about and test in the context of this change, a prospect made even more challenging by the need to maintain compatibility with existing method signatures. While migrating core usages from the low-level HttpURLConnection to the high-level HttpClient is a worthy endeavor, it is not the purpose of this PR. This PR merely seeks to demonstrate the viability of the concept, which it does by converting some of the simpler use cases. Plugins are expected to use this API for similar simple use cases, in much the same way as the callers that we have migrated in this PR and callers in plugins that use high-level libraries like Apache HttpComponents or OkHttp.

While we were here and testing the various form validation methods, we also improved the form validation to be more consistent and print better error messages. We also noticed that the form validation URLs could be using the HTTP HEAD method more consistently (some were and some were not). These methods are not doing anything with the body of the response, so an HTTP HEAD request is appropriate.

We wrote a custom implementation of java.net.ProxySelector. The Java Platform provides two: sun.net.spi.DefaultProxySelector (which is not programmatically configurable without the use of system properties and therefore not suitable for our use case) and java.net.ProxySelector.StaticProxySelector (which supports exclusions, but only through a complex constellation of legacy system properties that are only present for compatibility with HttpURLConnection, not via its own programmatic API). So we wrote our own, which is essentially a moderately improved version of java.net.ProxySelector.StaticProxySelector with support for exclusions (which we need in order to be able to implement the "no proxy hosts" functionality provided in the Jenkins configuration UI).

Of note is that once the HttpClient wrapper is created, changes to the proxy configuration in the Jenkins UI will not be reflected in it. In this regard it is no different than the existing ProxyConfiguration#open API. Nevertheless, it is something to be aware of. The returned instances of HttpClient and HttpRequest are intended to be short-lived, not to be stored in memory in some sort of cache where their authentication details could become stale.

Testing done

To test this change we set up an authenticated proxy by running docker run -e SQUID_USERNAME=foo -e SQUID_PASSWORD=bar -p 3128:3128 lifenz/squid-docker-simple-auth and configured the proxy in the UI. We then stepped through the new logic in hudson.ProxyConfiguration#newHttpClientBuilder and hudson.ProxyConfiguration.JenkinsProxySelector in several scenarios:

  • Proxy disabled (validating both HTTP and HTTPS URLs)
  • Proxy enabled, no authentication (validating both HTTP and HTTPS URLs)
  • Proxy enabled with proxy authentication (validating both HTTP and HTTPS URLs)

The last test (proxy authentication for an HTTPS URL) required setting the -Djdk.http.auth.tunneling.disabledSchemes= option to the empty string to prevent the JDK from blocking the authentication for an HTTPS connection.

To test telemetry submission, we went to the Script Console and ran

import jenkins.telemetry.Telemetry

ExtensionList.lookupSingleton(Telemetry.TelemetryReporter.class).doRun()

to exercise the code with a proxy (with proxy authentication) while stepping through the code in the debugger.

To test the form valdation changes we went to each of the three modified forms and entered a variety of values (invalid URLs, valid HTTP URLs, valid HTTPS URLs, etc) and stepped through each form validation function in the debugger to ensure that the proxy configuration was being configured properly and taking effect when the request was made (by putting breakpoints in internal JDK methods as well as hudson.ProxyConfiguration.JenkinsProxySelector). We ensured that invalid values printed a graceful error message that could be expanded to view the full exception if desired.

As far as automated testing goes, we ran mvn clean verify -Dtest=hudson.core.PluginManagerOverrideTest,hudson.CustomPluginManagerTest,hudson.model.UpdateCenter2Test,hudson.model.UpdateCenterConnectionStatusTest,hudson.model.UpdateCenterCustomTest,hudson.model.UpdateCenterPluginInstallTest,hudson.model.UpdateCenterTest,hudson.PluginManagerCheckUpdateCenterTest,hudson.PluginManagerInstalledGUITest,hudson.PluginManagerTest,hudson.PluginManagerUtil,hudson.PluginManagerWhichTest,hudson.ProxyConfigurationTest,hudson.tools.ZipExtractionInstallerTest,jenkins.telemetry.TelemetryTest locally.

Proposed changelog entries

Developer: Introduce a high-level HTTP client API.

Proposed upgrade guidelines

N/A

Submitter checklist

  • The Jira issue, if it exists, is well-described.
  • The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see examples).
    • Fill in the Proposed upgrade guidelines section only if there are breaking changes or changes that may require extra steps from users during upgrade.
  • There is automated testing or an explanation as to why this change has no tests.
  • New public classes, fields, and methods are annotated with @Restricted or have @since TODO Javadocs, as appropriate.
  • New deprecations are annotated with @Deprecated(since = "TODO") or @Deprecated(forRemoval = true, since = "TODO"), if applicable.
  • New or substantially changed JavaScript is not defined inline and does not call eval to ease future introduction of Content Security Policy (CSP) directives (see documentation).
  • For dependency updates, there are links to external changelogs and, if possible, full differentials.
  • For new APIs and extension points, there is a link to at least one consumer.

Desired reviewers

@mention

Maintainer checklist

Before the changes are marked as ready-for-merge:

  • There are at least two (2) approvals for the pull request and no outstanding requests for change.
  • Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
  • Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
  • Proper changelog labels are set so that the changelog can be generated automatically.
  • If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
  • If it would make sense to backport the change to LTS, a Jira issue must exist, be a Bug or Improvement, and be labeled as lts-candidate to be considered (see query).

@basil basil added the developer Changes which impact plugin developers label Nov 16, 2022
@daniel-beck
Copy link
Member

SSL hostname verification can only be disabled with a JVM-wide system property, not per HttpClient instance.

I expect this might become a problem for adoption, given how common it is for individual build steps etc. to allow disabling this.

we

Were others involved in this project?

To test telemetry submission, we went to the Script Console and ran … to exercise the code with a proxy (with proxy authentication) while stepping through the code in the debugger.

jenkins.telemetry.Telemetry.endpoint exists which might help with this.

@olamy
Copy link
Member

olamy commented Nov 17, 2022

Very good idea to avoid so much duplicated code in so many plugins!!

Pros

We believe that plugins should generally prefer this new API to one of the many HTTP client library plugins when targeting Java 11. We have been programming with the Java Platform HTTP client for a month or so now, and we find it much simpler and easier to use than the Apache HttpClient 4.x and 5.x APIs (which we are very familiar with).

because you never tried Jetty Http Client which have a very nice designed fluent api and it's very performant as well :P

@basil basil requested a review from a team November 17, 2022 16:29
@basil
Copy link
Member Author

basil commented Nov 18, 2022

This PR is now ready for merge. We will merge it after approximately 24 hours if there is no negative feedback. Please see the merge process documentation for more information about the merge process. Thanks!

@basil basil added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Nov 18, 2022
@basil basil merged commit 9e3c780 into jenkinsci:master Nov 19, 2022
@basil basil deleted the httpClient branch November 19, 2022 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
developer Changes which impact plugin developers ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback
Projects
None yet
4 participants