
HDDS-2950. Upgrade jetty to the latest 9.4 release #508

Merged: 8 commits into apache:master on Feb 7, 2020

Conversation

@elek (Member) commented Jan 30, 2020

What changes were proposed in this pull request?

The Jetty version used by the Ozone web interfaces dates from September 2018.

Since then, many bug fixes and security fixes have been added to the Jetty project. I suggest using the latest Jetty release (from January 2020).

As HttpServer2 (a Hadoop 3.2 class) has a strong dependency on the older Jetty version (it uses SessionManager, which has since been removed), it seems easier to clone HttpServer2 and move it into the Ozone project.

This gives us the flexibility:

  • To upgrade Jetty independently of Hadoop releases
  • To remove unused features and make our HTTP stack more reliable (e.g. the Yarn and WebHDFS parts of the code)
  • To customize our HTTP server usage (e.g. SPNEGO should be ignored for the S3 REST interface)
  • To support Ozone-style configuration and maintain all our configuration settings
  • To add additional insights/metrics to our own HttpServer
  • To simplify the current server initialization (the current BaseHttpServer class of hdds/ozone is a wrapper around the Hadoop class, but we can combine the two functionalities)

What is included in this patch

  • HttpServer2 and its lightweight dependencies are cloned from Hadoop 3.2
  • HADOOP-16152 (the Jetty upgrade made since the 3.2 release)
  • Checkstyle / findbugs fixes (to follow our coding standards)
  • As we have more than 5 HTTP-related classes, I moved them to a dedicated package in the framework project.
  • **The main logic has not been modified by this patch.** This is mainly code organization and code import.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2950

How was this patch tested?

There are related unit tests, but I also checked with a docker-compose based cluster and verified the UI pages from a browser.

@arp7 (Contributor) commented Jan 30, 2020

@smengcl can you take a look at this pull request? This is forking the HttpWebServer from Hadoop and applying your changes to port to 9.4.26.

I think long term it makes sense to fork the Hadoop HttpWebServer.

@smengcl (Contributor) commented Jan 30, 2020


Sure. Looking at this.

@smengcl (Contributor) left a review:

I did a quick review of this. Overall it looks good. Most classes are taken straight from Hadoop 3.2 (and are mostly the same as on the 3.3 trunk). The only potential questions are in the HttpServer2 class.

As noted in the inline comment, there are a few improvements since Hadoop 3.2, e.g. HADOOP-16718, HADOOP-16727, HADOOP-16398. Please take a look at whether we need those newer commits as well. (Maybe not.)


public static final String FILTER_INITIALIZER_PROPERTY
= "ozone.http.filter.initializers";
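For context, filter initializers in the Hadoop-style HTTP server follow a simple pattern: the property holds a comma-separated list of class names, and each class is instantiated reflectively and asked to register its servlet filter. The following is a minimal self-contained sketch of that pattern; the interface shape and the `AuthFilterInitializer` name are illustrative stand-ins, not the actual Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterInitializerDemo {
  public static final String FILTER_INITIALIZER_PROPERTY =
      "ozone.http.filter.initializers";

  /** Stand-in for Hadoop's abstract FilterInitializer (hypothetical shape). */
  public interface FilterInitializer {
    void initFilter(List<String> container);
  }

  /** Example initializer; the name and filter it registers are illustrative. */
  public static class AuthFilterInitializer implements FilterInitializer {
    @Override
    public void initFilter(List<String> container) {
      container.add("authentication-filter");
    }
  }

  /** Instantiate each comma-separated class name and let it register a filter. */
  public static List<String> applyInitializers(String propertyValue)
      throws Exception {
    List<String> container = new ArrayList<>();
    for (String className : propertyValue.split(",")) {
      FilterInitializer init = (FilterInitializer) Class
          .forName(className.trim()).getDeclaredConstructor().newInstance();
      init.initFilter(container);
    }
    return container;
  }

  public static void main(String[] args) throws Exception {
    // Simulate reading the property value from configuration.
    System.out.println(applyInitializers(
        AuthFilterInitializer.class.getName()));
    // prints: [authentication-filter]
  }
}
```

Because the property is just a list of class names, forking HttpServer2 lets Ozone point it at its own initializers (hence the `ozone.`-prefixed key above) without depending on Hadoop's.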

Inline review comment:

hadoop.http.sni.host.check.enabled was added by HADOOP-16718 on trunk. We might want it here as well?
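For reference, if that key were adopted, it would presumably be toggled like any other Hadoop-style boolean setting in the site configuration. The key name below is the one cited from HADOOP-16718; the value shown is only an illustration, not the actual default.

```xml
<!-- Hypothetical site-config entry; key name from HADOOP-16718,
     value shown only as an illustration. -->
<property>
  <name>hadoop.http.sni.host.check.enabled</name>
  <value>false</value>
</property>
```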

@elek (Member, author) commented Feb 3, 2020

Thank you very much @smengcl

As background: as of now we use Hadoop 3.2. The safest approach is to import exactly the same HttpServer2.java that we have in the latest Hadoop 3.2 release.

I tried to use HttpServer2 from the Hadoop trunk, but found that multiple authorization-related changes broke the S3 gateway.

After that I decided to keep the same HttpServer2 that is used now (with the Jetty patch included).

Therefore (as a rule of thumb) I prefer to import only minimal changes here and bring in newer patches in separate jiras (where we can carefully test s3g).

I checked the suggested patches (thanks for the suggestions):

  • HADOOP-16727: It's a small null check; it would be good to add as soon as possible. I added it to the patch.
  • HADOOP-16398: Prometheus support backported from Ozone to Hadoop. We don't need this patch, as we already have the original solution. (Later we can simplify our initialization to make it more similar to the other default servlets, but that's a bigger refactor.)
  • HADOOP-16718: I tried to understand why this is required and didn't find any information. If you have something, let me know. (It might be required for WebHDFS, which shouldn't be supported on our side.) To me it seems optional, and s3g and the other Ozone components work well based on the tests. Unless we have a strong reason, I would prefer to keep the default Jetty behavior (and have one less config key).

@smengcl (Contributor) commented Feb 5, 2020


Thanks for digging into this. I'm good with using the same HttpServer2.java as Hadoop 3.2.

The two new commits lgtm.

As for HADOOP-16718, I believe the scope is larger than WebHDFS. Web UIs serving HTTPS via Jetty may all be impacted in certain cases. In short, this can cause a connection failure in some circumstances:

If the server's JKS file has a valid private/public key/cert pairing, but it also contains another trustedCertEntry certificate that has the hostname in its subjectAltName extension, the trusted cert gets picked. This triggers an internal failure to determine a common cipher to use, and the server returns the following exception to the client:

fatal error: 40: no cipher suites in common

For details, please take a look at the link I sent over to you on Slack.

@arp7 (Contributor) left a review:

+1 lgtm. Thanks @smengcl for the review.

@smengcl (Contributor) commented Feb 6, 2020

I am good with backporting HADOOP-16718 in another jira. So far lgtm+1

Thanks @elek @arp7

@xiaoyuyao (Contributor) commented:

Thanks @elek for the update. The latest change LGTM, +1.

@xiaoyuyao xiaoyuyao merged commit 1ac8263 into apache:master Feb 7, 2020