Skip to content

Fix issue with LLv6 literals in URLs#418

Merged
ok2c merged 1 commit intoapache:5.3.xfrom
arturobernalg:feature/stripsIPv6ZoneId
Mar 6, 2023
Merged

Fix issue with LLv6 literals in URLs#418
ok2c merged 1 commit intoapache:5.3.xfrom
arturobernalg:feature/stripsIPv6ZoneId

Conversation

@arturobernalg
Copy link
Copy Markdown
Member

This pull request addresses issue LLv6 literals in URLs, where LLv6 literals in URLs are not properly handled by HttpClient due to a limitation in the Java standard method InetAddress.getAllByName(host). Specifically, when constructing a URL with an LLv6 literal, the % sign that prefixes the ZoneID must be quoted as %25 according to RFC 6874. However, HttpClient does not support quoted host literals.

To work around this limitation, this pull request modifies the SystemDefaultDnsResolver class to strip the IPv6 zone identifier from the host string before calling InetAddress.getAllByName(host). This allows HttpClient to handle LLv6 literals in URLs that conform to RFC 6874.

@arturobernalg arturobernalg force-pushed the feature/stripsIPv6ZoneId branch from 027b5dc to 128f126 Compare February 28, 2023 18:37
@Override
public InetAddress[] resolve(final String host) throws UnknownHostException {
return InetAddress.getAllByName(host);
try {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg This approach has a substantial drawback: DNS lookup failures completely unrelated to LLv6 will not generate two DNS queries. For some applications it may be not a big deal but for some it may. What is the problem with always calling #stripsIPv6ZoneId prior to calling InetAddress#getAllByName?

Copy link
Copy Markdown
Member Author

@arturobernalg arturobernalg Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @ok2c
The reason why we don't want to call stripsIPv6ZoneId all the time is that it involves additional processing and may not be necessary for hostnames that are not LLv6. This additional processing can cause a slight delay in the overall execution time of the resolve method.
In addition, if we always call stripsIPv6ZoneId before calling InetAddress.getAllByName, it means that we will be making two DNS queries for every hostname, even those that are not LLv6. This can result in unnecessary network traffic and may cause performance issues in certain environments.
Therefore, the approach of only calling stripsIPv6ZoneId when a UnknownHostException is caught and the hostname is an LLv6 address is a more targeted and efficient solution.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg Repeating the same DNS query with the same input makes no sense and a lot of folks will be upset due to unnecessary execution costs. Please make sure the second query gets executed only if anything got stripped and the input is not the same

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ok2c That's a valid point.
Changed.
TY

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg I suggest #stripsIPv6ZoneId be inlined as it is not being used anywhere else and extra hostname comparison be eliminated.

@arturobernalg arturobernalg force-pushed the feature/stripsIPv6ZoneId branch from b8248cf to 90f4c50 Compare March 1, 2023 20:38
@arturobernalg arturobernalg requested a review from ok2c March 2, 2023 13:14
@Override
public InetAddress[] resolve(final String host) throws UnknownHostException {
return InetAddress.getAllByName(host);
try {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg I suggest #stripsIPv6ZoneId be inlined as it is not being used anywhere else and extra hostname comparison be eliminated.

void resolve() throws UnknownHostException {
final SystemDefaultDnsResolver resolver = SystemDefaultDnsResolver.INSTANCE;

final InetAddress[] result1 = resolver.resolve("example.com");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg While extra test coverage is nice, in this particular case these tests introduce dependency on DNS infrastructure with public network access. These tests may fail when run in a private network where 'example.com' does not resolve. Please re-write the tests that do not require public network access or remote them.

@arturobernalg arturobernalg force-pushed the feature/stripsIPv6ZoneId branch from 906ff1d to fcfb09d Compare March 2, 2023 20:11
@arturobernalg arturobernalg requested a review from ok2c March 2, 2023 20:11
@arturobernalg
Copy link
Copy Markdown
Member Author

arturobernalg commented Mar 2, 2023

HI @ok2c
While I don't entirely agree with placing logic in the catch block, I still made the change. In my opinion, it's preferable to keep the logic separate.
TY

@arturobernalg arturobernalg force-pushed the feature/stripsIPv6ZoneId branch from 7d6db33 to 9923d99 Compare March 3, 2023 06:51
strippedHost = host.substring(0, i) + "]";
}
}
if (!strippedHost.equals(host)) {
Copy link
Copy Markdown
Member

@ok2c ok2c Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg The whole point of inlining the method was to avoid this extra comparison! Please use a boolean variable instead or set strippedHost to null by default!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg In fact if you use null to signify a case when the original host is unchanged (not stripped) and no second DNS is not needed you can put this code into a separate method as before, if you like it better that way.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmm @ok2c would you mind give me an example? had chased some many time this method

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg How about this?

String strippedHost = null;
if (host.charAt(0) == '[') {
  final int i = host.lastIndexOf('%');
  if (i != -1) {
    strippedHost = host.substring(0, i) + "]";
  }
}
if (strippedHost != null) {
  return InetAddress.getAllByName(strippedHost);
}
throw e;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ok2c , there is no significant difference between the two code snippets you provided. Both snippets attempt to strip the IPv6 zone ID from the host name if it is present and retry the resolution if the default resolver fails.
But I any case, I made the change

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg There is a biiiig difference in terms of performance between if (strippedHost != null) and if (!strippedHost.equals(host))

@arturobernalg arturobernalg requested a review from ok2c March 3, 2023 17:33
…ssue with URL quoting for LLv6 host literals, the original implementation of SystemDefaultDnsResolver does not support constructing URLs with LLv6 literals that have a ZoneID. This commit implements a workaround that strips the ZoneID from LLv6 literals before resolving them. This allows URLs with LLv6 literals to be constructed without quoting the ZoneID. Note that this workaround does not fully comply with RFC 6874, but it should work in most cases.
@arturobernalg arturobernalg force-pushed the feature/stripsIPv6ZoneId branch from cd71654 to 3c5adc5 Compare March 3, 2023 19:45
@ok2c ok2c merged commit 4dde83b into apache:5.3.x Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants