HWKMETRICS-330 Update filter to be async #481

Merged
merged 4 commits into from Apr 29, 2016

tsegismont
Contributor

HWKMETRICS-330 Update filter to be async

A bit of context first. Hawkular Metrics is a reactive application and
calls to a blocking API should be avoided as much as possible.

The Openshift integration library had a servlet filter mechanism to
authenticate users. The filter code used the JDK's HTTP client to call
Kubernetes' master server. While this was alright as a first
implementation (in order to provide the feature as quickly as possible),
it seriously limits our ability to reach our performance goals.

There were two problems to tackle:

  1. choose an async HTTP client API
  2. make the filter code asynchronous

For the HTTP client, I considered a few options: Netty, vert.x,
Undertow. I picked Undertow because it does not add a new dependency
and we can reuse the io threads (instead of creating yet another thread
pool). The downside is that the Undertow client API is low-level (no pool
implementation) and not well documented.

For the filter code, my first try was to use the servlet async API from the
filter. But RestEasy throws an exception because it wants to start the
async exchange itself. I wrote to the resteasy-user list but got no
reply. And even if we had a fix, we would have to wait for a new
RestEasy and WildFly release. So I wrote an Undertow extension which is
executed in io threads before the servlet handler is involved.

The implementation consists of pooling Undertow HTTP client connections and
filtering the Metrics client requests (only dispatch to the servlet
handler if the user is authenticated).
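
To illustrate the "only dispatch if authenticated" flow, here is a minimal sketch in plain Java. It is not the code from this PR and does not use the Undertow API: `AsyncAuthGate`, `Outcome` and the `next` callback are hypothetical names, a `CompletableFuture` stands in for the non-blocking call to the Kubernetes master, and the status codes are only illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch: gate a request on an asynchronous authentication check. */
final class AsyncAuthGate {

    /** Result of handling one request: an HTTP-like status and whether the next handler ran. */
    static final class Outcome {
        final int status;
        final boolean dispatched;

        Outcome(int status, boolean dispatched) {
            this.status = status;
            this.dispatched = dispatched;
        }
    }

    // Stand-in for the async token check (assumption: it completes with true when the user is allowed).
    private final Function<String, CompletableFuture<Boolean>> authenticator;
    // Stand-in for the servlet handler; it is only invoked for authenticated requests.
    private final Consumer<String> next;

    AsyncAuthGate(Function<String, CompletableFuture<Boolean>> authenticator, Consumer<String> next) {
        this.authenticator = authenticator;
        this.next = next;
    }

    /** Returns immediately; the outcome is produced when the authentication check completes. */
    CompletableFuture<Outcome> handle(String token) {
        if (token == null || token.isEmpty()) {
            // No credentials: reject without calling the authenticator at all.
            return CompletableFuture.completedFuture(new Outcome(401, false));
        }
        return authenticator.apply(token).handle((allowed, error) -> {
            if (error != null) {
                return new Outcome(500, false);
            }
            if (!Boolean.TRUE.equals(allowed)) {
                return new Outcome(403, false);
            }
            next.accept(token); // dispatch to the next handler only now
            return new Outcome(200, true);
        });
    }

    public static void main(String[] args) {
        AsyncAuthGate gate = new AsyncAuthGate(
                t -> CompletableFuture.supplyAsync(() -> "good-token".equals(t)),
                t -> System.out.println("dispatched request for " + t));
        System.out.println(gate.handle("good-token").join().status);
        System.out.println(gate.handle("bad-token").join().status);
    }
}
```

The point of the design is that `handle` never waits for the authentication result, so the calling thread is free to serve other requests in the meantime.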

While working on the implementation I had to find a way to share the
MetricRegistry between the webapp code and the authenticator code. So
there's a MetricRegistryProvider class in core-util now.
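
As a rough illustration of the provider idea, the sketch below uses an enum singleton so that any code loaded from the same class gets the same registry instance. It is an assumption-laden stand-in, not the class added in this PR: `SimpleRegistry` and `RegistryProvider` are hypothetical names, and `SimpleRegistry` only mimics a counter registry rather than the Dropwizard one.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Demo entry point: two unrelated call sites observe the same registry. */
final class RegistryProviderDemo {
    public static void main(String[] args) {
        // "Webapp" side records a request.
        RegistryProvider.INSTANCE.getRegistry().increment("requests");
        // "Authenticator" side records one too, through the same shared instance.
        RegistryProvider.INSTANCE.getRegistry().increment("requests");
        System.out.println(RegistryProvider.INSTANCE.getRegistry().count("requests"));
    }
}

/** Hypothetical stand-in for a metric registry; it only tracks named counters. */
final class SimpleRegistry {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String name) {
        counters.computeIfAbsent(name, key -> new LongAdder()).increment();
    }

    long count(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }
}

/** Enum singleton: the JVM guarantees a single, lazily and safely initialized INSTANCE. */
enum RegistryProvider {
    INSTANCE;

    private final SimpleRegistry registry = new SimpleRegistry();

    SimpleRegistry getRegistry() {
        return registry;
    }
}
```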

Also, I enhanced the Gatling scenario file to support multiple
authentication mechanisms (none, Hawkular integration, Openshift
htpasswd file, Openshift token). I took the opportunity to document the
scenario options in the project README.

Note that performance enhancements will be more visible in environments
where Kubernetes response time is minimal.

@tsegismont
Contributor Author

I tested the changes with a mock Kubernetes master server.

Here are the results.

Before patch:

[tsegismont@sombrero load-tests]$ mvn gatling:execute -Dclients=1000 -Dloops=100 -Dramp=5 -DauthType=openshiftToken -Dtoken="blaslkdkslkqj=="
[INFO] Scanning for projects...
[INFO] srcdeps-maven-plugin not triggered by any of the goals [[gatling:execute]]
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hawkular Metrics Load Tests 0.15.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- gatling-maven-plugin:2.1.7:execute (default-cli) @ hawkular-metrics-load-tests ---
Simulation org.hawkular.metrics.loadtest.MetricsSimulation started...
Simulation finished
Parsing log file(s)...
Parsing log file(s) done
Generating reports...

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     100000 (OK=100000 KO=0     )
> min response time                                     44 (OK=44     KO=-     )
> max response time                                   7754 (OK=7754   KO=-     )
> mean response time                                  1195 (OK=1195   KO=-     )
> std deviation                                        425 (OK=425    KO=-     )
> response time 50th percentile                       1165 (OK=1165   KO=-     )
> response time 75th percentile                       1194 (OK=1194   KO=-     )
> mean requests/sec                                438.681 (OK=438.681 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                          4423 (  4%)
> 800 ms < t < 1200 ms                               73038 ( 73%)
> t > 1200 ms                                        22539 ( 23%)
> failed                                                 0 (  0%)
================================================================================

Reports generated in 2s.
Please open the following file: /home/tsegismont/Projets/hawkular/hawkular-metrics/integration-tests/load-tests/target/gatling/results/metricssimulation-1460560895394/index.html
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:55 min
[INFO] Finished at: 2016-04-13T17:25:25+02:00
[INFO] Final Memory: 18M/322M
[INFO] ------------------------------------------------------------------------

After patch:

[tsegismont@sombrero load-tests]$ mvn gatling:execute -Dclients=1000 -Dloops=100 -Dramp=5 -DauthType=openshiftToken -Dtoken="blaslkdkslkqj=="
[INFO] Scanning for projects...
[INFO] srcdeps-maven-plugin not triggered by any of the goals [[gatling:execute]]
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hawkular Metrics Load Tests 0.15.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- gatling-maven-plugin:2.1.7:execute (default-cli) @ hawkular-metrics-load-tests ---
Simulation org.hawkular.metrics.loadtest.MetricsSimulation started...
Simulation finished
Parsing log file(s)...
Parsing log file(s) done
Generating reports...

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     100000 (OK=100000 KO=0     )
> min response time                                     43 (OK=43     KO=-     )
> max response time                                   1711 (OK=1711   KO=-     )
> mean response time                                   123 (OK=123    KO=-     )
> std deviation                                        188 (OK=188    KO=-     )
> response time 50th percentile                         74 (OK=74     KO=-     )
> response time 75th percentile                        106 (OK=106    KO=-     )
> mean requests/sec                                857.067 (OK=857.067 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                         97702 ( 98%)
> 800 ms < t < 1200 ms                                 901 (  1%)
> t > 1200 ms                                         1397 (  1%)
> failed                                                 0 (  0%)
================================================================================

Reports generated in 2s.
Please open the following file: /home/tsegismont/Projets/hawkular/hawkular-metrics/integration-tests/load-tests/target/gatling/results/metricssimulation-1460554981267/index.html
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:03 min
[INFO] Finished at: 2016-04-13T15:45:00+02:00
[INFO] Final Memory: 18M/322M
[INFO] ------------------------------------------------------------------------

The throughput is nearly twice as high as before, the response time 75th percentile is more than ten times lower and the response time distribution is much narrower.

The tests ran on my laptop, with 1 Metrics server, 1 C* node.
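
For anyone who wants to double-check the summary, the ratios can be recomputed from the two Gatling reports above. This is only a small arithmetic sketch; `GatlingComparison` is a hypothetical helper and the numbers are copied from the "Global Information" sections.

```java
import java.util.Locale;

/** Hypothetical helper that recomputes the before/after ratios from the reports. */
final class GatlingComparison {

    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // mean requests/sec: after patch divided by before patch
        double throughputGain = ratio(857.067, 438.681);
        // 75th percentile response time in ms: before patch divided by after patch
        double p75Gain = ratio(1194, 106);
        // mean response time in ms: before patch divided by after patch
        double meanGain = ratio(1195, 123);

        System.out.println(String.format(Locale.ROOT, "throughput gain:   x%.2f", throughputGain));
        System.out.println(String.format(Locale.ROOT, "75th pct gain:     x%.2f", p75Gain));
        System.out.println(String.format(Locale.ROOT, "mean latency gain: x%.2f", meanGain));
    }
}
```

This gives roughly 1.95 for throughput and 11.3 for the 75th percentile, which matches "nearly twice" and "more than ten times".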

@tsegismont
Contributor Author

@mwringe @burmanm this is Openshift related, would you mind reviewing? Thank you guys.

@burmanm
Contributor

burmanm commented Apr 15, 2016

AFAICT LGTM (enough acronyms ?) ;)

@tsegismont
Contributor Author

2016-04-15 22:56 GMT+02:00 Michael Burman notifications@github.com:

(enough acronyms ?) ;)

FWIW, IANAL but you need more acronyms to approve a PR. At least for a RH
DEV on HWKMETRICS :P

@stefannegrea
Contributor

Still reviewing the PR.

@mwringe
Contributor

mwringe commented Apr 26, 2016

This PR does not work for me: it fails to function properly. I just deployed the war into the deployment directory of WildFly, so if there are any other steps I need to do, please let me know.
hawkular-metrics.txt

if (SECURITY_OPTION.contains(OPENSHIFT_OAUTH)) {
    authenticator = new TokenAuthenticator(containerHandler);
} else if (SECURITY_OPTION.contains(HTPASSWD)) {
    authenticator = new BasicAuthenticator(containerHandler);

Contributor

We can have both basic auth and/or token auth enabled at the same time. It should not be considered one or the other.

Contributor Author

Looking at https://git.io/vwwlB, my understanding is that SECURITY_OPTION
allows just one mechanism at a time. Or am I missing something?

2016-04-26 21:49 GMT+02:00 Matthew Wringe notifications@github.com:

In containers/hawkular-metrics-openshift-integration/src/main/java/org/hawkular/openshift/auth/OpenshiftAuthHandler.java:

+public class OpenshiftAuthHandler implements HttpHandler {
+
+    private static final String OPENSHIFT_OAUTH = "openshift-oauth";
+    private static final String HTPASSWD = "htpasswd";
+    private static final String DISABLED = "disabled";
+    private static final String SECURITY_OPTION_SYSPROP = "hawkular-metrics.openshift.auth-methods";
+    private static final String SECURITY_OPTION = System.getProperty(SECURITY_OPTION_SYSPROP, OPENSHIFT_OAUTH);
+
+    private final HttpHandler containerHandler;
+    private final Authenticator authenticator;
+
+    public OpenshiftAuthHandler(HttpHandler containerHandler) {
+        this.containerHandler = containerHandler;
+
+        if (SECURITY_OPTION.contains(OPENSHIFT_OAUTH)) {
+            authenticator = new TokenAuthenticator(containerHandler);
+        } else if (SECURITY_OPTION.contains(HTPASSWD)) {
+            authenticator = new BasicAuthenticator(containerHandler);

We can have both basic auth and/or token auth enabled at the same time.
It should not be considered one or the other.
Contributor

Not exactly, that code selects which one to use based on the request's auth header and on whether the corresponding SECURITY_OPTION is specified. There is nothing in there that prevents multiple SECURITY_OPTION values from being available.

We need to support multiple SECURITY_OPTIONS, and we do use both in our current implementation: https://github.com/openshift/origin-metrics/blob/master/deployer/templates/hawkular-metrics.yaml#L75
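
To make the expected behaviour concrete, here is a minimal sketch of per-request selection. It is not the existing filter code: `AuthMethodSelector` is a hypothetical class, and it assumes that the enabled methods are given as one comma-separated value and that token requests use the `Bearer` scheme while htpasswd requests use the `Basic` scheme.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch: pick the auth method per request among all enabled methods. */
final class AuthMethodSelector {

    static final String OPENSHIFT_OAUTH = "openshift-oauth";
    static final String HTPASSWD = "htpasswd";

    private final Set<String> enabled;

    AuthMethodSelector(String securityOption) {
        // Assumption: several methods are listed in one comma-separated value.
        this.enabled = Arrays.stream(securityOption.split(","))
                .map(String::trim)
                .filter(method -> !method.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Returns the method to use for this request, or empty if no enabled method matches. */
    Optional<String> select(String authorizationHeader) {
        if (authorizationHeader == null) {
            return Optional.empty();
        }
        // HTTP auth scheme names are case-insensitive, so compare in lower case.
        String header = authorizationHeader.trim().toLowerCase(Locale.ROOT);
        if (header.startsWith("bearer ") && enabled.contains(OPENSHIFT_OAUTH)) {
            return Optional.of(OPENSHIFT_OAUTH);
        }
        if (header.startsWith("basic ") && enabled.contains(HTPASSWD)) {
            return Optional.of(HTPASSWD);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        AuthMethodSelector selector = new AuthMethodSelector("openshift-oauth,htpasswd");
        System.out.println(selector.select("Bearer some-token"));
        System.out.println(selector.select("Basic dXNlcjpwYXNz"));
    }
}
```

The difference with an if/else chain in the constructor is that the choice is made for each request, so both mechanisms stay usable when both are enabled.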

Contributor Author

Ooops, right. I overlooked the "contains" method. Fixing it now.


@burmanm
Contributor

burmanm commented Apr 26, 2016

@mwringe I have no problems deploying it.

@mwringe
Contributor

mwringe commented Apr 26, 2016

@burmanm Did you update our containers to use it and then run it on OpenShift?

@tsegismont
Contributor Author

Looks like something I missed as I was working with a mock server. I will
have a look and will come back to you if I need more info.

2016-04-26 21:37 GMT+02:00 Matthew Wringe notifications@github.com:

This PR does not work for me and it fails to function properly. I just
deployed the war into the deployment directory of WildFly, so if there is
any other steps I need to do, please let me know.
hawkular-metrics.txt
https://github.com/hawkular/hawkular-metrics/files/237329/hawkular-metrics.txt



@tsegismont
Contributor Author

Actually it is a communication level problem. Is there any way for me to
reproduce?

I would also like to understand the difference between your environment and
Micke's.


@mwringe
Contributor

mwringe commented Apr 27, 2016

I would also like to understand the difference between your environment and
Micke's.

me too, I am wondering if I made a mistake with my test containers.

@burmanm what was your setup for your test environment?

// com.codahale.metrics.Reporter instances.
metricsService.startUp(session, keyspace, false, false, new MetricRegistry());
MetricRegistry metricRegistry = MetricRegistryProvider.INSTANCE.getMetricRegistry();
jmxReporter = JmxReporter.forRegistry(metricRegistry).inDomain("hawkular.metrics").build();

Contributor

Can you make JmxReporter configurable? So that it does not start unless it is explicitly configured to start?

Contributor Author

I'd prefer not to add another config var unless it can be useful. Why would we want to have the Metrics collected but not accessible? It is very convenient to start jconsole or Mission Control and have a view of what's going on (threads, MBeans, memory, etc.).

Contributor

I'm late to the game on this PR, but I tend to agree with Stefan. Unless we need to always have JMXReporter running, I would prefer to keep it off or at least be able to turn it off. This made me think of a bug with an early version of the DataStax driver that involved JMXReporter that we hit in RHQ. It may have been a thread leak. I do not recall the specifics. What I do recall though is that disabling the reporter worked around the issue.

Contributor Author

There's a new config flag in the last commit.
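
For readers following along, an opt-in flag of this kind could look like the sketch below. It is not the code from the commit: `ReporterToggle` and the property name `example.metrics.jmx-reporting` are hypothetical, and a `Runnable` stands in for building and starting the reporter.

```java
import java.util.Properties;

/** Hypothetical sketch: start a reporter only when a flag is explicitly set to true. */
final class ReporterToggle {

    // Hypothetical property name, chosen for this example only.
    static final String PROP = "example.metrics.jmx-reporting";

    /** The flag defaults to false, so reporting stays off unless it is requested. */
    static boolean isEnabled(Properties config) {
        return Boolean.parseBoolean(config.getProperty(PROP, "false"));
    }

    /** Runs the start action only if the flag is set; returns whether it ran. */
    static boolean startIfEnabled(Properties config, Runnable startReporter) {
        if (!isEnabled(config)) {
            return false;
        }
        startReporter.run();
        return true;
    }

    public static void main(String[] args) {
        boolean started = startIfEnabled(System.getProperties(),
                () -> System.out.println("reporter started"));
        System.out.println("started: " + started);
    }
}
```

Running it with `-Dexample.metrics.jmx-reporting=true` would start the stand-in reporter; without the property nothing is started.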

@burmanm
Contributor

burmanm commented Apr 28, 2016

Based on your comments @mwringe and @tsegismont I became suspicious and rebuilt everything.. and still can't get those errors?

Newest origin (pulled yesterday), modified the Dockerfile of origin-metrics to use "COPY" instead of fetching from repository (to be able to use locally built .war of PR481).

openjdk version "1.8.0_77"

@tsegismont tsegismont force-pushed the jira/HWKMETRICS-330 branch 2 times, most recently from 08ca1fa to 3d3c5f5 Compare April 28, 2016 08:42
@tsegismont
Contributor Author

@burmanm good news

@mwringe
Contributor

mwringe commented Apr 28, 2016

@burmanm did you verify that the metrics were showing in the console at all? The components should not have worked since before it was only using one type of authentication, and the system requires 2

@tsegismont
Contributor Author

@mwringe the fix for this is in the PR since this morning


@burmanm
Contributor

burmanm commented Apr 28, 2016

@mwringe No, I did not test console

Multiple authenticators can be active at the same time
@tsegismont
Contributor Author

@burmanm @mwringe any chance you can test the latest version which fixes authentication selection today?

Disables reporting for both the driver's registry and our registry
@burmanm
Contributor

burmanm commented Apr 29, 2016

With the newest version I can see metrics in the console.

@tsegismont
Contributor Author

Sorry I'm lost. So do you see the same exceptions as @mwringe or not?


@burmanm
Contributor

burmanm commented Apr 29, 2016

As said on the IRC, no I did not see any exceptions. I see the metrics charts being drawn correctly on the console.

@stefannegrea
Contributor

Thanks @tsegismont!

@stefannegrea stefannegrea merged commit 0a1aace into hawkular:master Apr 29, 2016
@tsegismont tsegismont deleted the jira/HWKMETRICS-330 branch May 2, 2016 12:02