Wrong version of jolokia #1725

Closed
hchehade opened this Issue Nov 10, 2014 · 52 comments

@hchehade

Jolokia 1.2.2 is causing a multitude of errors. Many entries in JMX and Camel are missing.
Setting the dependency to Jolokia 1.2.1 resolves these problems.

@davsclaus
Member

What version of hawtio are you using?

And are you sure it's a jolokia problem?
Maybe check what 1.2.1 returns vs 1.2.2 returns?

@hchehade
  • We are using hawtio 1.4.29
  • The jolokia 1.2.2 JSON is more compact, and the response is nearly half the size of the 1.2.1 response.

We have run tests using remote agents:

  • Remote connection to a jolokia 1.2.1 agent resulted in a full set of camel routes and JMX entries.
  • Remote connection to a jolokia 1.2.2 agent resulted in a reduced set of camel routes and JMX entries, as well as a lot of angular and javascript errors.
  • Using hawtio with the embedded jolokia agent gives the same results as the remote connection to a jolokia 1.2.2 agent. This was expected, since hawtio 1.4.29 integrates jolokia 1.2.2.

@rhuss
Contributor
rhuss commented Nov 10, 2014

That sounds strange, since jolokia by design is completely agnostic about the MBeans registered, i.e. about the content it serves. So I doubt that Jolokia has any influence on the list of MBeans returned (regardless of version).

Could you post somehow (pastebin, gist or so) the output of 'http://host:port/hawtio/jolokia/list' of both, the 1.2.1 and 1.2.2 installations ? (BTW, is it the same server only with the agents exchanged ?)

(BTW, 1.2.3 is out since yesterday)

@jstrachan
Member

I've upgraded hawtio to 1.2.3 BTW (I wonder if that fixes the login to secure jolokia machines using the Google Chrome Extension? :).

We'll be cutting a release soon I expect...

@rhuss
Contributor
rhuss commented Nov 10, 2014

Sorry, not yet. But it is on my Christmas Wishlist (which is the list where I put things which I'm going to tackle in the 'calm' time before Christmas ;-)

@jstrachan
Member

@rhuss OK thanks! I'll put it on my list for Santa :)

@hchehade

POST .../hawtio-default-offline-1.4.29-orig/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false
Content-Type: application/json
{"type":"list"}
-- response --
200 OK
Cache-Control: no-cache
Date: Mon, 10 Nov 2014 13:54:08 GMT
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/plain; charset=UTF-8
Expires: Mon, 10 Nov 2014 12:54:10 GMT
X-Frame-Options: SAMEORIGIN
Access-Control-Allow-Origin: *
=> response size = 4015102 Bytes

whereas

POST .../hawtio-default-offline-1.4.29-orig/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false
Content-Type: application/json
{"type":"list"}
-- response --
200 OK
Cache-Control: no-cache
Date: Mon, 10 Nov 2014 13:54:08 GMT
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/plain; charset=UTF-8
Expires: Mon, 10 Nov 2014 12:54:10 GMT
X-Frame-Options: SAMEORIGIN
Access-Control-Allow-Origin: *
=> response size = 16955227 Bytes

@hchehade

sorry :
POST .../hawtio-default-offline-1.4.29-orig/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false
Content-Type: application/json
{"type":"list"}
Response size = 4015102 Bytes

whereas

POST .../hawtio-1.4.29-modified/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false
Content-Type: application/json
{"type":"list"}
Response size 16955227 Bytes

@rhuss
Contributor
rhuss commented Nov 10, 2014

Hmm, it's hard to say what happens. Did you only exchange jolokia.jar in the modified version? Could you try with ignoreErrors=false? Could you please vary the maxDepth param, starting with 1, and see when the output size differs?

@hchehade

setting ignoreErrors=false didn't change anything in the resulted json

@rhuss
Contributor
rhuss commented Nov 10, 2014

Can we start from the top level (maxDepth=1) and go down until there is a difference? Could you then post the content somewhere (if it is not too large and not confidential)?

@hchehade

the size is never the same when comparing between versions.
The size of the json response is 16MB , and I can't post it for security reasons.

@hchehade

version 1.4.30 of hawtio which has just been released doesn't resolve the issue of reduced camel set and missing jmx entries although it uses jolokia v 1.2.3.

@gashcrumb
Member

Have you tried bumping up the maxDepth setting in hawtio's preferences? This and the maxCollectionSize setting have a direct correlation to what hawtio fetches when it grabs the JMX tree and other JMX responses. Also, how is your camel context configured, or more importantly can you supply an example that replicates the problem so it can be investigated by the team? Thanks!

@rhuss
Contributor
rhuss commented Nov 12, 2014

@hchehade could you please try the following approach:

  • i=1
  • Fetch http://host:port/jolokia/list?maxDepth=$i on both variants.
  • Does the output differ ?
  • If yes, could you examine the output of both lists for differences. Should be feasible if $i is small.
  • If no, increase $i and repeat.

Sorry, since your data is confidential I have no other approach yet. As @gashcrumb said, ideally an isolated example would be perfect in order to reproduce the issue.
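
The recipe above could be automated roughly as follows. This is only a sketch under assumptions: `ListDepthDiff` and `firstDifferingDepth` are made-up names, and the two fetchers stand in for whatever HTTP call retrieves `/jolokia/list?maxDepth=N` from each installation. Because a Jolokia response also carries a timestamp, the fetchers are assumed to return only the `value` part of the response, otherwise every comparison would differ.

```java
import java.util.function.IntFunction;

/** Finds the smallest maxDepth at which two installations answer differently. */
final class ListDepthDiff {

    /**
     * Calls both fetchers with maxDepth = 1..limit and returns the first depth
     * at which the two results differ, or -1 if they are equal up to the limit.
     */
    static int firstDifferingDepth(IntFunction<String> fetchA,
                                   IntFunction<String> fetchB,
                                   int limit) {
        for (int depth = 1; depth <= limit; depth++) {
            String a = fetchA.apply(depth);
            String b = fetchB.apply(depth);
            if (!a.equals(b)) {
                return depth;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for real HTTP calls against the two agents.
        IntFunction<String> agentA = depth -> "{\"depth\":" + depth + "}";
        IntFunction<String> agentB = depth -> depth < 3 ? "{\"depth\":" + depth + "}" : "{}";
        System.out.println("first differing maxDepth: "
                + firstDifferingDepth(agentA, agentB, 7));
    }
}
```

Once the first differing depth is known, the two outputs at that depth should be small enough to diff by hand.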

@hchehade

you can reproduce the issue if you install hawtio on a bare weblogic server.
You will see that there are some entries missing in the JMX page section.

@rhuss
Contributor
rhuss commented Nov 13, 2014

Which version of Weblogic ?

@hchehade

12.1

@davsclaus
Member

I wonder if this is a WebLogic-only issue, e.g. running hawtio on Tomcat et al. seems to work fine.

@hchehade

it is not an issue of hawtio. It is rather an issue of jolokia. If you install hawtio on tomcat and access a remote jolokia agent on weblogic, you will have the same issue.

@rhuss
Contributor
rhuss commented Nov 13, 2014

At the beginning of next week I can have a look at a Weblogic installation 12.1 and will check this out.

@hchehade

So were you able to reproduce the issue??

@rhuss
Contributor
rhuss commented Nov 22, 2014

Sorry not yet. Still have to install a weblogic 12.1 since I only have a WLS 11g (10.3.6.0) at hand.
For this it works perfectly fine, I get for both 1.2.1 and 1.2.2 the exact same list of MBeans.

Unfortunately I don't have much time right now to install a Weblogic 12.1, so it would be really helpful if you could look at your output and examine what and where the difference is. As I said above, starting with maxDepth = 1 it shouldn't be too difficult to examine the output. Could you please try the recipe I described previously?

@chrislovecnm

I have recreated this with Jetty as well. Downgrading jolokia to 1.2.1 and voila I have routes. With jolokia 1.2.3 I get no routes if I increase the number of routes in Camel. We have a camelContext which contains about 1200 routes, and when I reduce the number to 100 with jolokia 1.2.3 I can see the servers routes. When I load more, nothing under camel at all. With 1.2.1 everything is happy.

@fliot
fliot commented Dec 22, 2014

I observed the same (tomcat 7.0.52 with Hawtio 1.4.27),
With Jolokia 1.2.2 I wasn't seeing my 200 camel routes in Hawtio.
Just as a test, following this thread's suggestion,
I switched back to Jolokia 1.2.1, and, surprise, all my camel routes are now visible in Hawtio !!!!

@rhuss
Contributor
rhuss commented Dec 22, 2014

Ok, ok. I really need a reproducible test case. Can you share your code somehow so that I can reproduce it myself? It's really hard to fish in muddy waters ....

@rhuss
Contributor
rhuss commented Dec 22, 2014

I will have a thorough look at the diff between 1.2.1 and 1.2.2 but of course having it in my debugger would be far easier .... 'hope we can nail it down till the year changes ;-)

@chrislovecnm

Recreating this is going to be fun.

Here is sorta how we create our 1200 routes :)

We have a bean that is fired by a timer that runs once. The bean itself creates a new class that is a route. That class is then added to the camelContext. Camel in essence creates its own routes. The kicker is that the routes are polling consumers, so we are creating a bunch of threads as well.

Here is some example code that runs inside of the bean that a timer could fire off just once, after the camelContext starts:

MyRoute routeBuilder = new MyRoute();
routeBuilder.setRouteId(id);  // set it to a unique id
routeBuilder.setOffset(ms);   // offset the time so that the routes do not all run at the same time
routeBuilder.setRoute(...);   // set to a unique poller

try {
  camelContext.addRoutes(routeBuilder);
} catch (Exception e) {
  logger.error("unable to add route", e);
}

You could look at creating timers that fire every few ms and just log messages.

import org.apache.camel.builder.RouteBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MyRoute extends RouteBuilder {

  private String id;
  private String ms;

  private Logger logger = LoggerFactory.getLogger(getClass());

  @Override
  public void configure() {
    from(id) // setup the ms in here ... not the full code
        .to("log:somethingHere");
  }

  // setters used above are omitted here
}

Our code is firing off polling consumers, so a thread is added for each. I am not sure if timers will add enough load, but it is a start. @davsclaus is probably asking wth we are doing, but running this across multiple routes is going to eventually require almost sharding the load, which is not a bridge we want to cross now.

@michaelmeire

In addition to this: we noticed that the maxDepth parameter changed in Hawtio->preferences->Jolokia is not used in the actual XMLHttpRequest to a remote jolokia. We noticed this by using Firebug to inspect the params that are sent.
The maxCollectionSize that is changed in the preferences, is correctly used in the xmlhttprequest.

Could this explain the problem, especially when the number of objects at a certain level in the jmxtree is very large?

We are connecting from hawtio version 1.4.37 towards a remote jolokia (running in activemq).

@empie empie referenced this issue in rhuss/jolokia Jan 7, 2015
Closed

maxCollectionSize not used? #176

@davidflam

Same problem as @hchehade: most of the entries inside the org.apache.camel JMX domain are missing. We are running about 1000 routes inside Apache ServiceMix 5.4.0 with hawt.io version 1.4.46. When I downgrade hawt.io to version 1.4.4 it works as expected. One more piece of information: when issuing a GET request to http://:8181/hawtio/jolokia/list I get different response sizes. Jolokia serialization issues?

@davsclaus
Member

Is this problem related to jolokia 1.2.2+ having been upgraded to a newer json simple?

master
https://github.com/rhuss/jolokia/blob/master/pom.xml#L384

1.2.1
https://github.com/rhuss/jolokia/blob/v1.2.1/pom.xml#L385

1.2.2
https://github.com/rhuss/jolokia/blob/v1.2.2/pom.xml#L384

So maybe upgrading from json simple 1.1 to 1.1.1 caused this problem?

If anyone is able to reproduce this consistently, you are welcome to try jolokia 1.2.1 and force it to use json simple 1.1.1, so that this is the only change, and then see if that causes the problem.

If so, we could get it reported to json simple, and in the meantime jolokia could maybe downgrade to 1.1 so it works.

@chrislovecnm

Can we get it documented which version of jolokia to use with which version of hawtio :) I will open another bug

@hchehade

actually it is working fine with 1.2.1

@rhuss
Contributor
rhuss commented Mar 4, 2015

I just created a SNAPSHOT from 1.2.2 with json-simple-1.1.1 replaced by json-simple-1.1 (nothing changed otherwise). You can refer to it as jolokia-1.2.2-json-simple-1.1-SNAPSHOT.

The WAR agent can be directly downloaded from maven central.

@chrislovecnm, @davidflam, @hchehade, @fliot: Since I really can't reproduce it myself (sorry, I don't have the appropriate number of Camel routes ;-) I need your help. Could you please try this agent (or the JVM variant from the same location, which also contains json-simple-1.1.jar) with your setup? This would be awesome !

Thanks a lot !

@rhuss rhuss self-assigned this Mar 5, 2015
@rhuss
Contributor
rhuss commented Mar 6, 2015

Since this is a Jolokia issue I opened an issue over there (rhuss/jolokia#187). Please let's continue there.

@fliot
fliot commented Mar 6, 2015

Camel routes display (around 200 routes), perfectly works on my build (hawtio 1.4.45, jolokia 1.2.1).

I just tested with jolokia 1.2.2-json-simple-1.1-SNAPSHOT, and unfortunately I must report that it doesn't work (just as it doesn't work with jolokia 1.2.2).

@rhuss
Contributor
rhuss commented Mar 10, 2015

Thanks for the feedback, this is quite helpful. I will have a closer look at the code diff then.

@rhuss
Contributor
rhuss commented Mar 10, 2015

Indeed the serialization stuff has been changed slightly between 1.2.1 and 1.2.2. However, I still can't reproduce the issue. I introduced large lists in the AttributeChecking MBean and tried it with really large lists, but they are only truncated if maxCollectionSize is exceeded. BTW, maxCollectionSize for the WAR agent has a hard limit of 1000 which cannot be overridden. But this was already the case for 1.2.1, so this shouldn't be the problem.

Since I'm not that fluent in Camel yet, could someone please create a small, self-contained WAR which creates a large number of routes during init (1000 or so) so that I could examine the situation locally ? 'would be very helpful and shouldn't be much work.

thanks ....

@davsclaus
Member

I created a big camel project based on the Apache Camel Servlet Tomcat example.

It's hosted on GitHub at
https://github.com/davsclaus/bigcamel

You can build from source and deploy the war to tomcat.

I also noticed that hawtio does not list all of the routes, and in a slightly strange way: it shows routes from the beginning, and then a chunk of routes from the end, etc.

The app has one main route, which is the helloRoute. When you add new routes they are named route-1, route-2 ...

@davsclaus
Member

And you can also list all the routes using the following URL, and then compare with jolokia:
http://localhost:8080/bigcamel-1.0/camel/hello?list=true

@rhuss
Contributor
rhuss commented Mar 13, 2015

Thanks a lot, @davsclaus !

Good news: I could reproduce the problem with 500 routes. Interestingly, the list of route MBeans is not truncated but there are gaps in the list (e.g route-103 is missing ...). Very interesting ;-)

Hope I can nail it down over the weekend so that we finally have a solution for that.

@rhuss
Contributor
rhuss commented Mar 13, 2015

BTW, it starts with 167 added routes and is completely deterministic and reproducible (route-92 is then missing in 1.2.2 but not in 1.2.1).

Nice somehow.

@davsclaus
Member

@rhuss and I thought only OSGi could drive you mad with non-deterministic behavior based on factors like the current lunar position and whether it's a leap year

@rhuss
Contributor
rhuss commented Mar 13, 2015

Ok, got it finally.

The point is that between 1.2.1 and 1.2.2 a bug was fixed accidentally (well, doesn't happen very often ;-): Jolokia has some options which influence the JSON serialization. One of them is maxCollectionSize, which is used when serializing collections. By default this is set to 1000, so that collections are truncated at that size. If this value is 0, no truncation happens. This was introduced as a safety net.

This truncation worked for any collection in 1.2.1 except for JSONObject and JSONArray (which are a special map and list that know how to serialize themselves to JSON). That was a bug. In 1.2.2 I unified the serialization of JSONObject and Map and only do a serialization for a Map (where truncation works).

Since the LIST operation internally creates a JSONObject for the meta data to return, which then gets serialized to JSON like any other value, this bug fix affects the LIST operation, too:

As soon as the number of MBeans within a domain exceeds maxCollectionSize, which has a default of 1000, the list of MBeans is truncated (the point of truncation is arbitrary since it happens during a loop over a map's keys).
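
To illustrate why the result shows gaps rather than a clean cut-off, here is a minimal sketch of size-limited copying over a hash map. It is not Jolokia's actual serializer code; `TruncationSketch` and `truncate` are hypothetical names, and the only behaviour taken from the description above is "stop after maxCollectionSize entries, 0 means no limit".

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class TruncationSketch {

    /** Copies at most maxCollectionSize entries in iteration order; 0 means no limit. */
    static <K, V> Map<K, V> truncate(Map<K, V> source, int maxCollectionSize) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            if (maxCollectionSize > 0 && result.size() >= maxCollectionSize) {
                break; // stops silently: the caller cannot tell that entries were dropped
            }
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // A HashMap iterates in hash order, not in insertion or name order.
        Map<String, String> mbeans = new HashMap<>();
        for (int i = 1; i <= 1200; i++) {
            mbeans.put("route-" + i, "...");
        }
        Map<String, String> listed = truncate(mbeans, 1000);
        System.out.println("listed: " + listed.size() + " of " + mbeans.size());
        System.out.println("route-92 present: " + listed.containsKey("route-92"));
    }
}
```

Which keys survive depends only on the map's iteration order, so the missing entries look scattered but are the same on every run.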

Silently truncating the list like this is of course not a good solution. There are several ways to fix that:

  • Increase the default limit to 10000 (or something smaller). Of course, the same problem will then occur at the new limit. But I wonder whether this is a problem in reality; having more than 10000 MBeans registered in a JMX domain is really insane.
  • No limitation for LIST operations (or a separate limitation).
  • No hard limit for collections, so that a "?maxCollectionSize=0" can be added to a single request. Currently the default limit of 1000 (which can of course be changed, e.g. in web.xml or during startup of the agent) is hard, so that a query parameter can only reduce this number but cannot increase it.

Any ideas or suggestions on this ?
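
The "hard limit" behaviour described in the last option can be sketched as follows. This is an illustration of the rule as stated in this comment, not code taken from Jolokia; `CollectionLimit` and `effectiveLimit` are hypothetical names, and non-negative inputs are assumed.

```java
final class CollectionLimit {

    /**
     * 0 means "unlimited". A per-request value may lower the agent-wide limit
     * but can never raise it or switch it off.
     */
    static int effectiveLimit(int agentLimit, int requestLimit) {
        if (agentLimit == 0) {
            return requestLimit; // no hard limit configured: the request decides
        }
        if (requestLimit == 0 || requestLimit > agentLimit) {
            return agentLimit;   // the request may not lift or exceed the hard limit
        }
        return requestLimit;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(1000, 0));    // request asks for "unlimited"
        System.out.println(effectiveLimit(1000, 500));  // request lowers the limit
        System.out.println(effectiveLimit(0, 0));       // nothing is limited
    }
}
```

Under this rule a client such as hawtio cannot work around the limit on its own; only the agent configuration can.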

@rhuss
Contributor
rhuss commented Mar 13, 2015

For a quick workaround with jolokia 1.2.2 and 1.2.3 you can change the maxCollectionSize hard limit:

  • For the WAR agent, set init parameter maxCollectionSize to 0 (or a very large value)
  • For the JVM agent, start the agent with --maxCollectionSize=0 or use a property maxCollectionSize when using a configuration file with --config
  • For the OSGi agent, set a bundle context property org.jolokia.maxCollectionSize to value 0
@davsclaus
Member

Yay

Yeah, I think increasing the default to 10000 is better, IMHO, as people likely don't know about this limit and then wonder why some data is not there.

With ActiveMQ or Camel they can be verbose in JMX, eg for Camel a route can result in N mbeans, 1 for the route, N for its processors, N for its endpoints.

And for ActiveMQ you have a mbean per queue / topic.

So hitting 1000 is actually common for production usage.

So having a limit of 10000 is better, but still on the low side. A broker could potentially have many queues.
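
To check whether a given JVM is anywhere near such a limit, the MBeans per JMX domain can be counted with the standard `javax.management` API. A minimal sketch, assuming it runs inside the JVM in question (for a Camel or ActiveMQ application it would have to be run in that process or pointed at a remote MBean server connection instead); `MBeanDomainCount` is a made-up name.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class MBeanDomainCount {

    /** Counts the registered MBeans per JMX domain, sorted by domain name. */
    static Map<String, Integer> countPerDomain(MBeanServer server) {
        Map<String, Integer> counts = new TreeMap<>();
        // queryNames(null, null) returns the names of all registered MBeans.
        for (ObjectName name : server.queryNames(null, null)) {
            counts.merge(name.getDomain(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        countPerDomain(server).forEach((domain, count) ->
                System.out.println(domain + ": " + count
                        + (count > 1000 ? "  (more than 1000)" : "")));
    }
}
```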

@davsclaus
Member

I think LIST should not have any limit by default, as it's there to list all the mbeans. It's a more special operation that you know can return a lot of data. And if you want to limit it, maybe the list should have (if it does not already) a nice filter api so you can select only the mbean kinds you want. And yeah, maybe also a separate upper limit option you can use, and then some way to tell from the response whether the upper limit was hit.

@rhuss
Contributor
rhuss commented Mar 16, 2015

Yes, you are right, one should really exclude the LIST operation from the maxCollectionSize and maxObjects limit checks. The real problem indeed is that there is no error indication when a collection gets truncated. For the other limits, maxDepth and maxObjects, a string with the error message is used for the values, but that's not so easy for collections (should one add an extra element? What about the type of the error? etc.). (Remember, truncation for collections can happen on any level.)

However, restricting LIST in the serialization is not so easy, since the serialization mechanism for a return value is completely decoupled from the actual operation.

What about the following, super-easy-to-implement solution: removing the hard limit on the collection size (which is BTW the only hard limit currently set by default):

  • No truncation will happen by default.
  • If someone wants to secure a request, she can simply add ?maxCollectionSize=1000 to the request.
  • If someone really needs a hard limit, he can repackage the WAR agent or start the JVM agent with the appropriate option. (BTW, as far as I can see, it's only the WAR agent which is affected by this issue, because this is the only one with a hard limit set in the web.xml.)

If there are no objections, I will choose this route, since this is the easiest, backwards compatible, solution for this pain point.

@davsclaus
Member

+1

@davsclaus
Member

@rhuss did you have time to work on jolokia recently?

@rhuss
Contributor
rhuss commented May 4, 2015

I'm just back today and plan to do a release this week with the proposed fix.

@rhuss rhuss added a commit to rhuss/jolokia that referenced this issue May 6, 2015
@rhuss rhuss No default truncation of collections for WAR agent
Truncation of collections can be tuned with the 'maxCollectionSize' config parameter. The default was infinite for the JVM agent and Mule agent but 1000 for the WAR agent. This turned out to be too less (see hawtio/hawtio#1725 (comment)). Now the limitation has been removed, which shouldn't be a big problem for most. You can always add your own limitation by setting the parameter in web.xml. Fixes #187.
339f20e
@davsclaus
Member

Fixed now

@davsclaus davsclaus closed this Aug 21, 2015