Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate the code that strips the user info from authority #5490

Closed
trustin opened this issue Mar 8, 2024 · 6 comments · Fixed by #5561
Closed

Deduplicate the code that strips the user info from authority #5490

trustin opened this issue Mar 8, 2024 · 6 comments · Fixed by #5561
Labels
improvement sprint Issues for OSS Sprint participants

Comments

@trustin
Copy link
Member

trustin commented Mar 8, 2024

When a user sends an HTTP request using Armeria client, a user can specify the user info part in a request URI:

WebClient.of()
         .get("http://myuserinfo@foo.com/");

However, the user info part is not actually used at all and stripped by removeUserInfo() method in Endpoint and DefaultRequestContext:

We see the following problems with the above code:

  • They are duplicate,
  • They both call HostAndPort.fromString() with the stripped authority; and
  • DefaultRequestTarget.normalizeSchemeAndAuthority() (code) doesn't use HostAndPort.fromString(), unlike Endpoint and DefaultClientRequestContext do.

To ensure the consistency in authority parsing, we should:

  • Always strip the user info part from authority before validation and normalization,
  • Always use new URI(String) to parse an authority string; and
  • Make sure we move this logic into ArmeriaHttpUtil so it is shared by DefaultRequestTarget, Endpoint and DefaultClientRequestContext.

For Endpoint and DefaultClientRequestContext, we could:

// e.g. //foo:1234
var uri = new URI("//" + authorityWithoutUserInfo);
uri.getRawAuthority(); // foo:1234
uri.getHost(); // foo
uri.getPort(); // 1234

For DefaultRequestTarget, we could:

// e.g. "http://foo:1234"
var uri = new URI(scheme + "://" + authorityWithoutUserInfo);
uri.getScheme(); // http
uri.getRawAuthority(); // foo:1234
uri.getHost(); // foo
uri.getPort(); // 1234
@kratosmy
Copy link
Contributor

Hi, @ikhoon, could I try this issue?

@jrhee17
Copy link
Contributor

jrhee17 commented Mar 30, 2024

Hi sorry, we're doing an internal open source sprint and this issue has been assigned 😅 Let us assign labels on such issues so that it is clear which issues are already being worked on 🙇

@kratosmy
Copy link
Contributor

Hi sorry, we're doing an internal open source sprint and this issue has been assigned 😅 Let us assign labels on such issues so that it is clear which issues are already being worked on 🙇

sure, so what kind of label could I take on?

@jrhee17
Copy link
Contributor

jrhee17 commented Apr 1, 2024

Armeria provides a HTTP client WebClient, which is able to send HTTP requests.

Usually, a user provides a String format URI and WebClient needs to parse the different parts of the URI to determine the correct protocol, endpoint, and various headers.

  URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

  hier-part   = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty

ref: https://www.rfc-editor.org/rfc/rfc3986.html#section-3

Amongst the different URI components, the authority determines 1) Where to send the request 2) What to set to the Host or :authority header. Within Armeria, we 1) strip the userinfo 2) separate the host and port 3) and validate the host and port values.

authority = [ userinfo "@" ] host [ ":" port ]

ref: https://www.rfc-editor.org/rfc/rfc3986.html#section-3.2

However, the validation/parsing logic of the authority is fragmented across the code base:

  1. public static Endpoint parse(String authority) {
    requireNonNull(authority, "authority");
    checkArgument(!authority.isEmpty(), "authority is empty");
    return cache.get(authority, key -> {
    if (key.charAt(key.length() - 1) == ':') {
    // HostAndPort.fromString() does not validate an authority that ends with ':' such as "0.0.0.0:"
    throw new IllegalArgumentException("Missing port number: " + key);
    }
    final HostAndPort hostAndPort = HostAndPort.fromString(removeUserInfo(key)).withDefaultPort(0);
    return create(hostAndPort.getHost(), hostAndPort.getPort(), true);
    });
    }
  2. // The connection will be established with the IP address but `host` set to the `Endpoint`
    // could be used for SNI. It would make users send HTTPS requests with CSLB or configure a reverse
    // proxy based on an authority.
    final String host = HostAndPort.fromString(removeUserInfo(authority)).getHost();
    if (!NetUtil.isValidIpV4Address(host) && !NetUtil.isValidIpV6Address(host)) {
    endpoint = endpoint.withHost(host);
    }
  3. schemeAndAuthority = normalizeSchemeAndAuthority(scheme, authority);

The objective of this issue is to unify the validation/parsing logic above

To get jump-started with the current code base, it's probably best one checks from unit tests how the above logic works.
Here are some code snippets which should get one started with how the Armeria public APIs interact with the above logic.

i.e. When sending the request from WebClient

class WebClientTest {

    @RegisterExtension
    static final ServerExtension server = new ServerExtension() {
        @Override
        protected void configure(ServerBuilder sb) throws Exception {
            sb.service("/", (ctx, req) -> HttpResponse.of(200));
        }
    };

    @Test
    void test1() {
        final AggregatedHttpResponse res = WebClient.of().blocking().get(server.httpUri().toString());
        assertThat(res.status().code()).isEqualTo(200);
    }

    @Test
    void test2() {
        final HttpRequest req = HttpRequest.builder()
                                           .header(HttpHeaderNames.SCHEME, "http")
                                           .header(HttpHeaderNames.AUTHORITY, server.httpUri().getRawAuthority())
                                           .get("/")
                                           .build();
        final AggregatedHttpResponse res = WebClient.of().blocking().execute(req);
        assertThat(res.status().code()).isEqualTo(200);
    }

    @Test
    void test3() {
        final AggregatedHttpResponse res = WebClient.of(server.httpUri()).blocking().get("/");
        assertThat(res.status().code()).isEqualTo(200);
    }

    @Test
    void test4() {
        final Endpoint endpoint = Endpoint.of(server.httpUri().getRawAuthority());
        final AggregatedHttpResponse res = WebClient.of(SessionProtocol.HTTP, endpoint).blocking().get("/");
        assertThat(res.status().code()).isEqualTo(200);
    }
}

i.e. When altering headers from decorators

    @Test
    void test5() {
        final AggregatedHttpResponse res = server
                .blockingWebClient(cb -> cb.decorator((delegate, ctx, req) -> {
                    ctx.addAdditionalRequestHeader(HttpHeaderNames.AUTHORITY, "my-authority:8080");
                    assertThat(ctx.host()).isEqualTo("my-authority");
                    return delegate.execute(ctx, req);
                }))
                .get("/");
        assertThat(res.status().code()).isEqualTo(200);
    }

i.e. When parsing an arbitrary target string (this method is used in RedirectingClient/gRPC clients)

    @Test
    void test6() {
        final RequestTarget requestTarget = RequestTarget.forClient("http://my-authority:8080/hello");
        assertThat(requestTarget.authority()).isEqualTo("my-authority:8080");
    }

Afterwards, I recommend that we add a public method in ArmeriaHttpUtil and try to unify the above logic. The skeletal API would probably look something like the following:

public static ... parseAuthority(...) {
  ...
}

@jrhee17
Copy link
Contributor

jrhee17 commented Apr 1, 2024

sure, so what kind of label could I take on?

Let me add a sprint label to all affected issues

@jrhee17 jrhee17 added the sprint Issues for OSS Sprint participants label Apr 1, 2024
@jrhee17
Copy link
Contributor

jrhee17 commented Apr 1, 2024

Also note the following bullet points from the initial report:

  • Always strip the user info part from authority before validation and
  • Always use new URI(String) to parse an authority string; and
  • Make sure we move this logic into ArmeriaHttpUtil so it is shared by DefaultRequestTarget, Endpoint and DefaultClientRequestContext.

For starters, we may want to make this the initial implementation and trying to run the build. We could work on fixing each test case as we move forward.

minwoox pushed a commit that referenced this issue Apr 23, 2024
…ation (#5561)

Motivation:

- Make util method to deduplicate the code that strips the user info from authority

Modifications:

- ~Add `normalizeAuthority()`, `normalizeSchemeAndAuthority()` static method in `ArmeriaHttpUtil`~
- Add SchemeAndAuthorit class which is in charge of authority normalization


Result:

- Closes #5490. (If this resolves the issue.)

<!--
Visit this URL to learn more about how to write a pull request description:
https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->
Dogacel pushed a commit to Dogacel/armeria that referenced this issue Jun 8, 2024
…ation (line#5561)

Motivation:

- Make util method to deduplicate the code that strips the user info from authority

Modifications:

- ~Add `normalizeAuthority()`, `normalizeSchemeAndAuthority()` static method in `ArmeriaHttpUtil`~
- Add SchemeAndAuthorit class which is in charge of authority normalization


Result:

- Closes line#5490. (If this resolves the issue.)

<!--
Visit this URL to learn more about how to write a pull request description:
https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement sprint Issues for OSS Sprint participants
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants