Unable to configure Hystrix dashboard #68

Closed
sureshbabuanguluri opened this issue Jan 3, 2013 · 34 comments
Comments

@sureshbabuanguluri

Hi,

I am new to Hystrix and am trying to configure the Hystrix dashboard for a sample application that I am developing. I went through the documentation on the GitHub wiki and was able to configure the application properly in my local environment. When I hit the Hystrix stream URL "http://hostname:port/application/hystrix.stream", it returns JSON data. But if I use the same URL in the Hystrix dashboard, no data is displayed.

I traced the request and found that the link below fails without returning any response.

http://localhost:8090/hystrix-dashboard/proxy.stream?origin=http://localhost:8090/showcase/hystrix.stream

Error message:

EventSource's response has a MIME type ("text/plain") that is not "text/event-stream". Aborting the connection.
Connection was closed on error: [object Event]
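For anyone hitting the same message: the browser's EventSource only accepts responses whose Content-Type is text/event-stream, so a first step is to check which header the endpoint actually sends. The following is a minimal sketch using only the JDK; the class name and the default URL are placeholders chosen for illustration, and it only reports the header, it does not explain why a proxy might answer with text/plain.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

class EventStreamContentType {

    // Returns true when a Content-Type header value names the text/event-stream
    // media type, ignoring case and parameters such as ";charset=UTF-8".
    public static boolean isEventStream(String contentType) {
        if (contentType == null) {
            return false;
        }
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("text/event-stream");
    }

    public static void main(String[] args) throws Exception {
        // Placeholder default; pass your own stream or proxy URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8090/showcase/hystrix.stream";
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        try {
            String contentType = conn.getContentType(); // null if the request failed
            System.out.println("Content-Type: " + contentType);
            System.out.println("Acceptable to EventSource: " + isEventStream(contentType));
        } finally {
            conn.disconnect();
        }
    }
}
```

Running it once against the hystrix.stream URL and once against the dashboard's proxy.stream URL shows which of the two is sending the unexpected type.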

I searched the internet for this issue but could not find a solution. In addition, when the Hystrix dashboard is unable to establish the connection, it chews up all the available connections in no time (maybe the metrics poller) and displays a 503 error page.

Can you please help me with the problems described above? Thanks.

-Suresh

@benjchristensen
Contributor

Hi Suresh,

That's an odd error and I don't know off the top of my head what would be happening.

I'll try replicating it. If I can't I'll reach out for more information from you. If I can I'll submit a fix.

If I need to discuss with you interactively would you be open to getting onto IRC on the #HystrixOSS channel to chat?

@sureshbabuanguluri
Author

Hi Ben,

Thanks for the response. I think you are talking about IRC on Twitter, right? I don't have access to Twitter from my office network. Is there an alternative?

@benjchristensen
Contributor

I am on IRC at http://freenode.net/irc_servers.shtml

The channel is #HystrixOSS

If you can't get on that we can correspond here :-)

@sureshbabuanguluri
Author

Ok. I am unable to connect to it from my office network. I will wait for your further updates here. Thanks.

@benjchristensen
Contributor

I haven't yet been able to replicate this scenario. The hystrix-example-webapp and hystrix-dashboard are working together without issue and in the production systems I tried against it is working.

Can you capture the output of http://localhost:8090/showcase/hystrix.stream for 30+ seconds and put it somewhere for me (perhaps a Gist?) so I can replay your stream?

The part about chewing through connections is also interesting ... that suggests that something is in a loop connecting and holding connections (which is the reason for the throttle).

@sureshbabuanguluri
Author

Hi Ben,

I am not able to attach the hystrix stream here. I am adding hystrix stream output from "http://localhost:8090/showcase/hystrix.stream" below:

data: {"type":"HystrixCommand","name":"TestService2","group":"TestGroup2","currentTime":1357529435770,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixCommand","name":"TestService1","group":"TestGroup1","currentTime":1357529435771,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":2,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":2,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":78,"latencyExecute":{"0":0,"25":3,"50":15,"75":273,"90":359,"95":359,"99":359,"99.5":359,"100":359},"latencyTotal_mean":78,"latencyTotal":{"0":0,"25":3,"50":15,"75":273,"90":359,"95":359,"99":359,"99.5":359,"100":359},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

Since you mentioned that the Hystrix dashboard works fine with the example webapp, I will download it tomorrow and try with that. I will let you know the results.

Note: I am using the JBoss EAP 6 application server.

Thanks,
Suresh

@benjchristensen
Contributor

@sureshbabuanguluri Are you using Turbine or pointing directly at a Hystrix instance?

Several people have reported issues with Turbine that we're trying to track down.

@sureshbabuanguluri
Author

Hi Ben,

I am using the Hystrix stream directly (not Turbine), as I deployed my application on a single node.

@sureshbabuanguluri
Author

Hi Ben,

Just to let you know, I tried configuring the Hystrix stream with a simple Spring MVC application and the Hystrix dashboard is still not capturing the metrics. However, when I hit the stream URL, I am able to view the metrics data in JSON format.

I am really not sure what is going on. I will continue troubleshooting it from my end.

@benjchristensen
Contributor

When you access the stream via curl you should see it continually output metrics.

I assume that is what you're seeing and that in your example above you only pasted the first 2 lines of output?
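If it helps to put a number on "continually", here is a rough sketch that counts the `data:` lines arriving on a stream. It uses only the JDK; the class name, method name and the sample events in `main` are made up for illustration and are not part of Hystrix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SseEventCounter {

    // Counts lines starting with "data:" from the source, stopping after
    // maxEvents such lines or at end of stream, whichever comes first.
    public static int countDataEvents(Reader source, int maxEvents) {
        try {
            BufferedReader reader = new BufferedReader(source);
            int count = 0;
            String line;
            while (count < maxEvents && (line = reader.readLine()) != null) {
                if (line.startsWith("data:")) {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two shortened, made-up events separated by blank lines, as in the stream format.
        String sample = "data: {\"type\":\"HystrixCommand\",\"name\":\"TestService1\"}\n\n"
                + "data: {\"type\":\"HystrixThreadPool\",\"name\":\"TestGroup1\"}\n\n";
        System.out.println(countDataEvents(new StringReader(sample), 20));
    }
}
```

To run it against a live stream, the `StringReader` could be replaced by an `InputStreamReader` over the stream URL's input stream. A healthy stream should reach 10-20 events within a few seconds, while a stalled one will block after the first couple.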

Can you please try the updated dashboard as part of version 1.2.0 that I released a couple hours ago?

It has a fix on a similar bug that I'd like to make sure isn't being triggered by your use case.

You can download at: http://search.maven.org/remotecontent?filepath=com/netflix/hystrix/hystrix-dashboard/1.2.0/hystrix-dashboard-1.2.0.war

Or you can now easily run from command-line for quick testing without deploying a WAR:

$ pwd
/Hystrix/hystrix-dashboard
$ ../gradlew jettyRun

@benjchristensen
Contributor

@sureshbabuanguluri By chance is the Spring MVC application doing anything to prevent the servlet outputstream from flushing?

I remember an issue we had in Netflix where an app wasn't working like this and it ended up being that there was a servlet filter configured on the system that intercepted and prevented response.flush() from being called.

One way you can see if this is happening is by accessing the stream and keeping it running for 10+ seconds. You should constantly see data being written out every 500-1000ms.

If you only see output every once in a while (when buffers fill) or at the end of the connection when it's closing then this is potentially the issue.
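To illustrate why a suppressed flush produces exactly this symptom, here is a small JDK-only sketch. A `ByteArrayOutputStream` stands in for the client socket, and the class and method names are invented for the example; it is not the servlet's actual code.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class FlushDemo {

    // Writes one small SSE-style event through a buffered writer and returns
    // how many bytes have reached the underlying sink at that point.
    public static int bytesDelivered(boolean flushAfterWrite) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket
        try {
            Writer out = new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8), 8192);
            out.write("data: {\"type\":\"HystrixCommand\"}\n\n");
            if (flushAfterWrite) {
                out.flush(); // pushes the buffered event down to the sink
            }
            return sink.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + bytesDelivered(false) + " bytes delivered");
        System.out.println("with flush:    " + bytesDelivered(true) + " bytes delivered");
    }
}
```

Without the flush, the event sits in the buffer and nothing is delivered until the buffer fills or the writer is closed, which matches the "only every once in a while" behaviour described above.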

Can you also send me a longer sample of the stream? The previous example only included 2 lines of data. If the stream does continue outputting data let it go until you see 10-20 lines of data and then paste that here.

@sureshbabuanguluri
Author

Hi Ben,

I am not seeing more than 2 lines of data whenever I try to access the Hystrix stream URL. As you mentioned, I was also expecting to see continuous data in the stream rather than just two lines.

The Spring MVC example that I created is a simple application with just one controller in it:

Controller:

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.servlet.ModelAndView;

import com.genworth.framework.service.TestService1;
import com.genworth.framework.service.TestService2;

@Controller
public class HelloWorldController {

@RequestMapping("/dummy")
public ModelAndView dummy() {
    return new ModelAndView("dummy");
}

@RequestMapping("/testService1")
public ModelAndView testWebService1() {
    new TestService1(3, 4).execute();
    return new ModelAndView("dummy");
}

@RequestMapping("/testService2")
public ModelAndView testWebService2() {
    new TestService2("suresh").execute();
    return new ModelAndView("dummy");
}

}

TestService1:

import java.rmi.RemoteException;

import javax.xml.rpc.ServiceException;

import service.TestWebService1;
import service.TestWebService1ServiceLocator;

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class TestService1 extends HystrixCommand<Integer> {

private final int x;
private final int y;
private int sum;

public TestService1(final int x, final int y) {
    super(HystrixCommandGroupKey.Factory.asKey("TestGroup1"));
    this.x = x;
    this.y = y;
}

@Override
protected Integer run() {
    final TestWebService1ServiceLocator serviceLocator = new TestWebService1ServiceLocator();
    TestWebService1 request;
    try {
        request = serviceLocator.getTestWebService1();
        sum = request.addIntegers(x, y);

    } catch (final ServiceException e1) {
        e1.printStackTrace();
    } catch (final RemoteException e) {
        e.printStackTrace();
    }

    System.out.println(sum);
    return sum;
}

}

TestService2:

import java.rmi.RemoteException;

import javax.xml.rpc.ServiceException;

import service.TestWebService2;
import service.TestWebService2ServiceLocator;

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class TestService2 extends HystrixCommand<String> {

private final String name;
private String reversedName;

public TestService2(final String name) {
    super(HystrixCommandGroupKey.Factory.asKey("TestGroup2"));
    this.name = name;
}

@Override
protected String run() {
    final TestWebService2ServiceLocator serviceLocator = new TestWebService2ServiceLocator();
    TestWebService2 request;
    try {
        request = serviceLocator.getTestWebService2();
        System.out.println(name);
        reversedName = request.reverseName(name);
        System.out.println(reversedName);

    } catch (final ServiceException e1) {
        e1.printStackTrace();
    } catch (final RemoteException e) {
        e.printStackTrace();
    }

    System.out.println(reversedName);
    return reversedName;
}

}

web.xml:

<servlet>
    <description></description>
    <display-name>HystrixMetricsStreamServlet</display-name>
    <servlet-name>HystrixMetricsStreamServlet</servlet-name>
    <servlet-class>com.netflix.hystrix.contrib.metrics.eventstream.HystrixMetricsStreamServlet</servlet-class>
</servlet>

<servlet-mapping>
    <servlet-name>HystrixMetricsStreamServlet</servlet-name>
    <url-pattern>/hystrix.stream</url-pattern>
</servlet-mapping>

<servlet>
    <servlet-name>spring</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>jsp</servlet-name>
    <url-pattern>/WEB-INF/jsp/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
    <servlet-name>spring</servlet-name>
    <url-pattern>/*</url-pattern>
</servlet-mapping>

I created a dummy page which contains two links to invoke the two TestServices that I wrote. I pull up this page, click on the links to invoke the services, and at the same time try to capture the data from the stream. I tried the new Hystrix dashboard WAR file but had no luck. I also tried deploying this application in another app server but it didn't work there either.

Please let me know if i am missing anything. Thanks.

@benjchristensen
Contributor

The code and web.xml config look fine.

Does the connection close immediately after outputting the 2 lines or does it just sit there with an open connection and never output anything more?

@benjchristensen
Contributor

By chance do you have any configurations such as this set?

-Dorg.jboss.ws.domwriter.FlushOnlyOnce=true

Are there any servlet filters configured in your web.xml?

@sureshbabuanguluri
Author

It doesn't close the connection immediately. After returning the first two lines of JSON data it keeps trying to fetch more, and after some time the connection gets closed. If I try to refresh the browser window, it gives me a 503 error saying "reached maximum number of open connections".
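The 503 after a refresh is consistent with a cap on concurrent stream connections: each stalled connection keeps its slot until it finally closes, so new requests are turned away. Below is a sketch of that general pattern using only the JDK. The class name and the limit of 2 are invented for illustration, and this is not taken from the Hystrix servlet source.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionThrottle {
    private final int maxConnections;
    private final AtomicInteger open = new AtomicInteger();

    public ConnectionThrottle(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    // Reserves a slot and returns true, or returns false when the cap is
    // reached (the point where a servlet would answer with HTTP 503).
    public boolean tryAcquire() {
        if (open.incrementAndGet() > maxConnections) {
            open.decrementAndGet(); // undo the reservation we could not keep
            return false;
        }
        return true;
    }

    // Must be called when a connection ends, otherwise its slot is never freed.
    public void release() {
        open.decrementAndGet();
    }

    public int openConnections() {
        return open.get();
    }

    public static void main(String[] args) {
        ConnectionThrottle throttle = new ConnectionThrottle(2); // illustrative limit
        System.out.println(throttle.tryAcquire()); // first connection
        System.out.println(throttle.tryAcquire()); // second connection
        System.out.println(throttle.tryAcquire()); // third is rejected while the others hang
        throttle.release();                        // one stalled connection finally closes
        System.out.println(throttle.tryAcquire()); // a slot is available again
    }
}
```

With a pattern like this, a handful of browser refreshes against a stream that never finishes is enough to exhaust the slots until the stalled connections time out.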

@benjchristensen
Contributor

And this is all being done with JBoss EAP? What was the other app server you said you tried?

@sureshbabuanguluri
Author

Hi Ben,

I have gone through all the JBoss configuration files and they don't contain the setting that you mentioned above. There are no filters in the sample application that I created. I tried with both the JBoss EAP 6 and Tomcat 7 application servers.

@benjchristensen
Contributor

Tomcat 7 should definitely be working as that's what we use. I'm going to implement an app with your code from above and see if I can replicate.

@sureshbabuanguluri
Author

Sure, Ben. I used Apache Tomcat version 7.0.14 and deployed the sample Spring MVC app, the Hystrix dashboard and the sample web services on this Tomcat instance. I observed the same behavior there as well.

@benjchristensen
Contributor

I used your sample code above to create a basic webapp without Spring.

I uploaded it here: https://github.com/benjchristensen/HystrixWebPlayground/tree/master/src/testing

You can see the web.xml here: https://github.com/benjchristensen/HystrixWebPlayground/blob/master/WebContent/WEB-INF/web.xml

I access it like this:

curl http://localhost:8888/SampleHystrixApp/testService2?name=benjamin
curl "http://localhost:8888/SampleHystrixApp/testService1?x=5&y=9"

I access the stream like this:

curl http://localhost:8888/SampleHystrixApp/hystrix.stream

Data such as this comes from the stream:

data: {"type":"HystrixThreadPool","name":"TestGroup2","currentTime":1357681038017,"currentActiveCount":0,"currentCompletedTaskCount":7,"currentCorePoolSize":10,"currentLargestPoolSize":7,"currentMaximumPoolSize":10,"currentPoolSize":7,"currentQueueSize":0,"currentTaskCount":7,"rollingCountThreadsExecuted":0,"rollingMaxActiveThreads":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000}

data: {"type":"HystrixCommand","name":"TestService2","group":"TestGroup2","currentTime":1357681038518,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":1,"latencyExecute":{"0":0,"25":0,"50":0,"75":2,"90":9,"95":9,"99":9,"99.5":9,"100":9},"latencyTotal_mean":1,"latencyTotal":{"0":0,"25":0,"50":0,"75":3,"90":9,"95":9,"99":9,"99.5":9,"100":9},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixCommand","name":"TestService1","group":"TestGroup1","currentTime":1357681038519,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":1,"90":1,"95":1,"99":1,"99.5":1,"100":1},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":1,"90":1,"95":1,"99":1,"99.5":1,"100":1},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixThreadPool","name":"TestGroup1","currentTime":1357681038519,"currentActiveCount":0,"currentCompletedTaskCount":15,"currentCorePoolSize":10,"currentLargestPoolSize":10,"currentMaximumPoolSize":10,"currentPoolSize":10,"currentQueueSize":0,"currentTaskCount":15,"rollingCountThreadsExecuted":0,"rollingMaxActiveThreads":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000}

data: {"type":"HystrixThreadPool","name":"TestGroup2","currentTime":1357681038519,"currentActiveCount":0,"currentCompletedTaskCount":7,"currentCorePoolSize":10,"currentLargestPoolSize":7,"currentMaximumPoolSize":10,"currentPoolSize":7,"currentQueueSize":0,"currentTaskCount":7,"rollingCountThreadsExecuted":0,"rollingMaxActiveThreads":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000}

data: {"type":"HystrixCommand","name":"TestService2","group":"TestGroup2","currentTime":1357681039020,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":1,"latencyExecute":{"0":0,"25":0,"50":0,"75":2,"90":9,"95":9,"99":9,"99.5":9,"100":9},"latencyTotal_mean":1,"latencyTotal":{"0":0,"25":0,"50":0,"75":3,"90":9,"95":9,"99":9,"99.5":9,"100":9},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixCommand","name":"TestService1","group":"TestGroup1","currentTime":1357681039021,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":1,"90":1,"95":1,"99":1,"99.5":1,"100":1},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":1,"90":1,"95":1,"99":1,"99.5":1,"100":1},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixThreadPool","name":"TestGroup1","currentTime":1357681039021,"currentActiveCount":0,"currentCompletedTaskCount":15,"currentCorePoolSize":10,"currentLargestPoolSize":10,"currentMaximumPoolSize":10,"currentPoolSize":10,"currentQueueSize":0,"currentTaskCount":15,"rollingCountThreadsExecuted":0,"rollingMaxActiveThreads":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000}

data: {"type":"HystrixThreadPool","name":"TestGroup2","currentTime":1357681039021,"currentActiveCount":0,"currentCompletedTaskCount":7,"currentCorePoolSize":10,"currentLargestPoolSize":7,"currentMaximumPoolSize":10,"currentPoolSize":7,"currentQueueSize":0,"currentTaskCount":7,"rollingCountThreadsExecuted":0,"rollingMaxActiveThreads":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000}

Note that the stream will NOT output any data for a HystrixCommand until that command has been invoked at least once.

The only difference between this and what you're doing is the involvement of Spring as far as I can tell.

@sureshbabuanguluri
Author

Hi Ben,

Thanks for your response. I created a simple servlet example as you described above and tried to check the stream. I am still not able to see continuous data output in the Hystrix stream, even if I invoke the Hystrix command services multiple times using curl (as shown below) or using the web browser.

curl http://localhost:8080/SampleHystrixWebApp/testService2?name=benjamin
curl "http://localhost:8080/SampleHystrixWebApp/testService1?x=5&y=9"

If I enter the stream URL, http://localhost:8080/SampleHystrixWebApp/hystrix.stream, in a web browser, I see the following JSON response.

data: {"type":"HystrixCommand","name":"TestService2","group":"TestGroup2","currentTime":1357705766850,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

data: {"type":"HystrixCommand","name":"TestService1","group":"TestGroup1","currentTime":1357705766866,"isCircuitBreakerOpen":false,"errorPercentage":0,"errorCount":0,"requestCount":0,"rollingCountCollapsedRequests":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackFailure":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":0,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":0,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":0,"currentConcurrentExecutionCount":0,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1}

As the stream is not continuous, I tried refreshing the web browser and it gave me a "maximum number of connections reached" (HTTP 503) error. Did you make any changes to the Tomcat app server configuration before deploying these applications?

Tomorrow, I will try to deploy the same app and the Hystrix dashboard WAR file on a different machine, just to make sure that this is not an issue with my machine.

Thanks,
Suresh

@sureshbabuanguluri
Author

Hi Ben,

I tried deploying the application on a different machine but it didn't make any difference. It looks like the issue is related to threading. When I stop the Tomcat application server, I see the following log trace:


SEVERE: The web application [/SampleHystrixWebApp] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-1] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-1] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-2] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-2] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-3] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-3] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-4] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-4] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-5] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-5] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-6] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-6] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-7] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-7] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-8] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-8] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-9] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-9] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup2-10] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [hystrix-TestGroup1-10] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [HystrixMetricPoller] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [HystrixMetricPoller] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [HystrixMetricPoller] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [HystrixMetricPoller] but has failed to stop it. This is very likely to create a memory leak.
Jan 09, 2013 3:21:38 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/SampleHystrixWebApp] appears to have started a thread named [HystrixMetricPoller] but has failed to stop it. This is very likely to create a memory leak.


It is creating threads for each call to HystrixCommand, i.e. TestService1 and TestService2, and for some reason these threads are not getting closed. In the log trace above, you can see about 10 threads created for each of TestService1 and TestService2 (named hystrix-TestGroup1-N and hystrix-TestGroup2-N) and 5 threads created for HystrixMetricPoller. It seems they are waiting on something and not getting closed properly.

I have gone through the HystrixMetricsStreamServlet code, and it looks like it creates an instance of HystrixMetricsPoller and adds a delay to it. This poller continuously checks jsonListener (with a delay of 500 ms) and writes the JSON metrics data to the HTTP response. In this case, only one HystrixMetricsPoller thread should be writing the JSON data to the HTTP response.
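
For readers following along, the polling loop described above can be sketched roughly as follows. This is a minimal illustration and not the actual servlet source: the class `MetricsStreamSketch`, its method names, the in-memory queue standing in for the JSON listener, and the sample payload are all assumptions made for the example. It only shows the general shape: drain pending JSON, write each message as a server-sent event, flush, then wait for the delay.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; not the real HystrixMetricsStreamServlet.
class MetricsStreamSketch {

    // Drain every queued JSON message and write each one as a server-sent event.
    // Returns the number of events written.
    static int writePending(Queue<String> pending, PrintWriter out) {
        int written = 0;
        String json;
        while ((json = pending.poll()) != null) {
            out.print("data: " + json + "\n\n"); // an SSE event ends with a blank line
            written++;
        }
        out.flush(); // push the events out now instead of waiting for a buffer to fill
        return written;
    }

    // The request thread repeats the drain-and-flush step, sleeping between rounds.
    // A real servlet would loop until the client disconnects; maxRounds keeps this demo finite.
    static void stream(Queue<String> pending, PrintWriter out, long delayMs, int maxRounds)
            throws InterruptedException {
        for (int round = 0; round < maxRounds; round++) {
            writePending(pending, out);
            Thread.sleep(delayMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Queue<String> pending = new ConcurrentLinkedQueue<>();
        pending.add("{\"type\":\"HystrixCommand\",\"name\":\"TestService1\"}"); // placeholder payload
        StringWriter sink = new StringWriter(); // stands in for the HTTP response
        stream(pending, new PrintWriter(sink), 500, 2);
        System.out.print(sink);
    }
}
```

In a loop of this shape, each open stream request holds one container thread for as long as the client stays connected, which is one plausible way a container could run out of connections if clients keep reconnecting.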

Do I have to explicitly close the threads that were created by HystrixCommand or HystrixMetricPoller?

@benjchristensen
Contributor

The issue with 503s is probably related to this bug: #85

It happens when a system comes up with no traffic but something tries to start monitoring it (such as the dashboard).

We fixed it yesterday and I'll release today in 1.2.3 or you can build from source before I do that.

@benjchristensen
Contributor

There are lots of threads created in Hystrix - by default every command is associated with a threadpool that is used for isolation. That's why you see all of the threads.

You can learn more about that here: https://github.com/Netflix/Hystrix/wiki/How-it-Works#wiki-Isolation

No, you do not need to explicitly close threads.

I am curious about the Tomcat shutdown issue. Are you trying to do dynamic deploy/shutdown/redeploy without actually terminating the JVM? If so, that is honestly not something I or Netflix do, so we have not tested those use cases at all, and it's very likely we have threads that won't shut down if only the WAR is removed while the JVM keeps running.
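
To make the thread-pool point concrete, here is a small sketch using only `java.util.concurrent`. It is not Hystrix's own code: the class `IsolationPoolSketch` and its helpers are hypothetical, and the `hystrix-<group>-<n>` naming is simply copied from the thread names in the Tomcat log above. It shows why a pool-per-group design leaves named worker threads alive after a command finishes, which is what a container reports when only the web application is stopped.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration built on java.util.concurrent only; this is NOT Hystrix's implementation.
class IsolationPoolSketch {

    // Fixed pool whose worker threads are named "hystrix-<group>-<n>".
    static ExecutorService newGroupPool(String group, int size) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = task ->
                new Thread(task, "hystrix-" + group + "-" + counter.incrementAndGet());
        return Executors.newFixedThreadPool(size, factory);
    }

    // Run one task on the pool and return the name of the worker thread that executed it.
    static String runOnPool(ExecutorService pool) {
        Callable<String> whoAmI = () -> Thread.currentThread().getName();
        try {
            return pool.submit(whoAmI).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Count live threads whose name starts with the given prefix.
    static long liveThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) {
        ExecutorService pool = newGroupPool("TestGroup1", 10);
        System.out.println(runOnPool(pool));
        // The task has finished, but its worker thread stays parked in the pool.
        System.out.println(liveThreads("hystrix-TestGroup1-"));
        pool.shutdown(); // until something does this, the worker thread keeps running
    }
}
```

In this sketch the worker threads only go away when the pool is shut down or the JVM exits, which matches the observation that a full JVM restart avoids the leftover-thread warnings.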

@benjchristensen
Contributor

Going back to the root question and your comment after your latest test "the stream is not continuous".

Can you please deploy the hystrix-examples-webapp and prove that it's working for you with a continuous stream?

It seems like something about your app (perhaps the involvement of Spring) is affecting how the output stream buffers and flushes responses.
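
As a rough illustration of how buffering can make a stream look "not continuous", the sketch below writes one event through a buffered writer and checks how many bytes the receiving side has seen before and after an explicit flush. Everything here is an assumption for demonstration purposes: the class `FlushSketch` is hypothetical, and a `ByteArrayOutputStream` stands in for the client socket.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of buffered output; not taken from Hystrix or any servlet container.
class FlushSketch {

    // Returns {bytes the "client" has received before flush, bytes received after flush}.
    static int[] bytesSeenByClient(String event) {
        ByteArrayOutputStream client = new ByteArrayOutputStream(); // stands in for the socket
        PrintWriter writer = new PrintWriter(new OutputStreamWriter(
                new BufferedOutputStream(client, 8192), StandardCharsets.UTF_8));

        writer.print(event);          // lands in the buffers, not yet at the client
        int before = client.size();

        writer.flush();               // forces the buffered bytes down to the client
        int after = client.size();

        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] seen = bytesSeenByClient("data: {}\n\n");
        System.out.println("before flush: " + seen[0] + ", after flush: " + seen[1]);
    }
}
```

If a filter, framework, or proxy in front of the servlet adds a buffer like this and nothing flushes it per event, the browser's EventSource may receive data only in large, delayed chunks.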

You can run the example webapp either using Gradle from command line or by deploying the WAR to your container.

Documentation on the Gradle approach is here: https://github.com/Netflix/Hystrix/tree/master/hystrix-examples-webapp

@benjchristensen
Contributor

I just released 1.2.3 to Maven Central, so it should show up there in a couple of hours.

@sureshbabuanguluri
Author

Hi Ben

Sorry for the delay in getting back to you. I downloaded Hystrix dashboard 1.2.3 and deployed it to my Tomcat app server. I also deployed the Hystrix examples web app, and it is working fine. I can see the continuous stream, and in the dashboard I can see the success and failure rates and percentages change whenever I hit the "http://localhost:8989/hystrix-examples-webapp/" URL. I am not seeing any HTTP 503 errors when I hit the stream URL here.

The sample app that I am trying to build doesn't have a Spring dependency. It is the simple servlet application that you created a week ago. I used the same code from the links that you provided to create a sample web app. However, it is not working the way you described. I am not sure if I am missing any additional configuration steps.

As for your question about the Tomcat shutdown process, I am not doing dynamic deployment. When I get the HTTP 503 error from the stream, I have to restart the Tomcat application server. When I shut down and restart the server, I get the errors that I mentioned above.

I have uploaded the sample that I created here (the WAR file includes the source code as well) - https://docs.google.com/file/d/0BzVSq9aruWdNWUxKQXNvOVFGQ3M/edit

URLs to invoke HystrixCommand -

http://localhost:8080/SampleHystrixWebApp/testService2?name=suresh
http://localhost:8080/SampleHystrixWebApp/testService1?x=7&y=10

Hystrix Stream URL -

http://localhost:8080/SampleHystrixWebApp/hystrix.stream

Could you please go through it and let me know if I am doing anything wrong?

Thanks,
Suresh

@sureshbabuanguluri
Author

Hi Ben,

I just want to let you know that I found the solution to the problem I have been facing. I downloaded the Hystrix examples web app and compared it with the sample application that I created. My sample application was missing "hystrix-request-servlet-1.1.7.jar".

As per the Hystrix wiki documentation, we have to include the following two dependencies:

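    <dependency>
        <groupId>com.netflix.hystrix</groupId>
        <artifactId>hystrix-core</artifactId>
        <version>1.2.2</version>
    </dependency>
    <dependency>
        <groupId>com.netflix.hystrix</groupId>
        <artifactId>hystrix-metrics-event-stream</artifactId>
        <version>1.1.2</version>
    </dependency>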

but neither of them contains the "hystrix-request-servlet-1.1.7.jar" file, which is why my sample code was not working. Should this be part of the "hystrix-core" dependency in Maven?

@benjchristensen
Contributor

hystrix-request-servlet doesn't contain anything that hystrix-metrics-event-stream needs. They are independent of each other.

If you included references to it inside web.xml (as documented here https://github.com/Netflix/Hystrix/tree/master/hystrix-contrib/hystrix-request-servlet) then yes, the jar would be needed, but it's not necessary for using Hystrix or using hystrix-metrics-event-stream.

@benjchristensen
Contributor

Also, the docs.google.com link is not working so I can't download that file.

@sureshbabuanguluri
Author

Hi Ben,

Thanks for providing the information on hystrix-request-servlet. As my web.xml doesn't contain any references to hystrix-request-servlet, I went back and checked what other changes I made to the project on Jan 16th and realized that I had upgraded hystrix-core from 1.1.1 to 1.2.2. So upgrading the hystrix-core library is what should have fixed my problem.

I removed hystrix-request-servlet from the project and also integrated it with Spring MVC. It is working fine. Thank you for your assistance on this problem.

@benjchristensen
Contributor

Glad to hear it's working for you now.

mattrjacobs added a commit that referenced this issue Apr 6, 2015
BatchHystrixCommand has no sence and doesn't collapse requests #68
@michael-barker

I had this same issue but mine turned out to be caused by nginx with buffering enabled. Setting 'proxy_buffering off' for the location containing the hystrix.stream fixed the issue.
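
For anyone hitting the same symptom behind nginx, a configuration fragment along these lines is one way to apply that setting. The location path and upstream address are placeholders borrowed from the sample app earlier in this thread, not the commenter's actual configuration; `proxy_buffering` and `proxy_pass` are standard nginx proxy directives.

```nginx
# Hypothetical location and upstream; adjust both to your deployment.
location /SampleHystrixWebApp/hystrix.stream {
    proxy_pass http://localhost:8080;
    proxy_buffering off;   # hand each event to the client as soon as it arrives
}
```

Scoping the directive to the stream location leaves buffering enabled for the rest of the site.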

@khushbuv

Hi, I am trying to configure Hystrix for my project. The circuit breaker is working, but I am not able to see the hystrix.stream endpoint. I have included Actuator and the Spring Cloud Hystrix starter.
