New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues on embedded ARM (e.g. Raspberry Pi 2) #10

Closed
kaikreuzer opened this Issue Jan 3, 2016 · 44 comments

Comments

Projects
None yet
8 participants
@kaikreuzer
Member

kaikreuzer commented Jan 3, 2016

This is a follow up of openhab/openhab2-addons#514, where we assumed that switching to G1 GC fixes most of the performance issues (see openhab/openhab2-addons#514 (comment)).
Unfortunately, G1 is not supported on ARM, if you run the openHAB runtime on a Raspberry Pi 2, you will get

Java HotSpot(TM) Client VM warning: G1 GC is disabled in this release.

See also https://blogs.oracle.com/henrik/entry/oracle_releases_jdk_for_linux which states:
"Some examples of missing features include the G1 GC".

I did now tests on a Raspberry Pi 2 (with java 1.8.0_40-b27) and measured these times for startup (until the dashboard and all UIs of the demo are available and the CPU usage dropped below 95%):

  • first start: 22 minutes
  • subsequent start: 11 minutes

Even the installation of an additional binding through the Paper UI takes already more than 30 seconds - this is just adding a small bundle to the OSGi container...

The current openHAB 2 distribution (non-karaf) takes less than 30 seconds for startup, so the issue is clearly with Karaf.
It seems that hardly anybody before us has ever tried to use Karaf 4 on a Raspberry Pi. The "plain" Karaf 4.0.3 e.g. sets the "-server" jvm flag, which isn't available on ARM. And even this empty Karaf container alone takes more than a minute for the startup.

THIS IS A SEVERE BLOCKER FOR MOVING AHEAD WITH OPENHAB 2!!!

One of the goals of openHAB 2 was to make it better suited for embedded environments - the current solution is in contrast UNUSABLE on it.

I guess we might need some input and suggestions from the Karaf team in order to take a decision on how to move forward. @ANierbeck, @cschneider, @jbonofre: Was it a bad idea from the start to use Karaf on embedded ARM computers or do you see some hope for us?

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 3, 2016

This is a rather strange thing. Did you run a vanilla Karaf on a PI before, any measurements there?
I know that @jgoodyear did some testing of OpenDayLight (ODL) on a PI before, afaik he didn't spot such an issue.

ANierbeck commented Jan 3, 2016

This is a rather strange thing. Did you run a vanilla Karaf on a PI before, any measurements there?
I know that @jgoodyear did some testing of OpenDayLight (ODL) on a PI before, afaik he didn't spot such an issue.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 3, 2016

Member

Did you run a vanilla Karaf on a PI before, any measurements there?

Not before, but the "empty Karaf container alone takes more than a minute for the startup" was a test of vanilla Karaf today.

Member

kaikreuzer commented Jan 3, 2016

Did you run a vanilla Karaf on a PI before, any measurements there?

Not before, but the "empty Karaf container alone takes more than a minute for the startup" was a test of vanilla Karaf today.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 3, 2016

Member

Ok, the only test I did not do today was the non-karaf openHAB runtime. It turns out that my Raspberry might have some general issues, since even openHAB 1.7.1 demo takes 2-3 minutes for startup.

So I guess I will have to flash my sd card and rerun the tests. @maggu2810, @dvanherbergen: Do you maybe have a board ready where you could briefly test the offline-distro?

Member

kaikreuzer commented Jan 3, 2016

Ok, the only test I did not do today was the non-karaf openHAB runtime. It turns out that my Raspberry might have some general issues, since even openHAB 1.7.1 demo takes 2-3 minutes for startup.

So I guess I will have to flash my sd card and rerun the tests. @maggu2810, @dvanherbergen: Do you maybe have a board ready where you could briefly test the offline-distro?

@ANierbeck

This comment has been minimized.

Show comment
Hide comment

ANierbeck commented Jan 3, 2016

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 3, 2016

Contributor

I will do a test on a Raspberry 1.

Contributor

dvanherbergen commented Jan 3, 2016

I will do a test on a Raspberry 1.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 3, 2016

Member

I will do a test on a Raspberry 1.

Cool, thanks. Note that the RasPi1 will anyhow be about 4x slower than the RasPi2, so startup times of a few minutes are normal there.

this is a performance measurement by @jgoodyear:

Thanks @ANierbeck, I had seen this blog post, but couldn't really extract any information regarding startup times from it.

Member

kaikreuzer commented Jan 3, 2016

I will do a test on a Raspberry 1.

Cool, thanks. Note that the RasPi1 will anyhow be about 4x slower than the RasPi2, so startup times of a few minutes are normal there.

this is a performance measurement by @jgoodyear:

Thanks @ANierbeck, I had seen this blog post, but couldn't really extract any information regarding startup times from it.

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 3, 2016

Contributor

After a few restarts, I'm seeing about 5 minutes on a pi1 to get to the dashboard started.

Jan 03, 2016 10:48:09 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Trying to lock /home/pi/userdata/lock
Jan 03, 2016 10:48:09 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Lock acquired
Jan 03, 2016 10:48:09 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
INFO: Lock acquired. Setting startlevel to 100
2016-01-03 22:51:48.847 [INFO ] [smarthome.event.InboxUpdatedEvent   ] - Discovery Result with UID 'ntp:ntp:local' has been updated.
2016-01-03 22:52:00.500 [INFO ] [smarthome.event.InboxUpdatedEvent   ] - Discovery Result with UID 'yahooweather:weather:12817655' has been updated.
2016-01-03 22:52:29.512 [INFO ] [basic.internal.servlet.WebAppServlet] - Started Basic UI at /basicui/app
2016-01-03 22:53:01.458 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /ui
2016-01-03 22:53:01.461 [INFO ] [assic.internal.servlet.WebAppServlet] - Started Classic UI at /classicui/app
2016-01-03 22:53:01.524 [INFO ] [assic.internal.servlet.WebAppServlet] - Stopped Classic UI
2016-01-03 22:53:11.389 [INFO ] [.dashboard.internal.DashboardService] - Started dashboard at /start

My first impression is that the startup is mainly CPU bound. It runs at 100% for while. I'll investigate some more tomorrow.

Contributor

dvanherbergen commented Jan 3, 2016

After a few restarts, I'm seeing about 5 minutes on a pi1 to get to the dashboard started.

Jan 03, 2016 10:48:09 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Trying to lock /home/pi/userdata/lock
Jan 03, 2016 10:48:09 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Lock acquired
Jan 03, 2016 10:48:09 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
INFO: Lock acquired. Setting startlevel to 100
2016-01-03 22:51:48.847 [INFO ] [smarthome.event.InboxUpdatedEvent   ] - Discovery Result with UID 'ntp:ntp:local' has been updated.
2016-01-03 22:52:00.500 [INFO ] [smarthome.event.InboxUpdatedEvent   ] - Discovery Result with UID 'yahooweather:weather:12817655' has been updated.
2016-01-03 22:52:29.512 [INFO ] [basic.internal.servlet.WebAppServlet] - Started Basic UI at /basicui/app
2016-01-03 22:53:01.458 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /ui
2016-01-03 22:53:01.461 [INFO ] [assic.internal.servlet.WebAppServlet] - Started Classic UI at /classicui/app
2016-01-03 22:53:01.524 [INFO ] [assic.internal.servlet.WebAppServlet] - Stopped Classic UI
2016-01-03 22:53:11.389 [INFO ] [.dashboard.internal.DashboardService] - Started dashboard at /start

My first impression is that the startup is mainly CPU bound. It runs at 100% for while. I'll investigate some more tomorrow.

@maggu2810

This comment has been minimized.

Show comment
Hide comment
@maggu2810

maggu2810 Jan 3, 2016

Contributor

I did now tests on a Raspberry Pi 2 (with java 1.8.0_40-b27) and measured these times for startup (until the dashboard and all UIs of the demo are available and the CPU usage dropped below 95%):

  • first start: 22 minutes
  • subsequent start: 11 minutes

Even the installation of an additional binding through the Paper UI takes already more than 30 seconds - this is just adding a small bundle to the OSGi container...

I have not tested OH2 but another Eclipse SmartHome based product. I have not done a time measurement, but it need perhaps some minutes, but IIRC never above a few minutes.

Contributor

maggu2810 commented Jan 3, 2016

I did now tests on a Raspberry Pi 2 (with java 1.8.0_40-b27) and measured these times for startup (until the dashboard and all UIs of the demo are available and the CPU usage dropped below 95%):

  • first start: 22 minutes
  • subsequent start: 11 minutes

Even the installation of an additional binding through the Paper UI takes already more than 30 seconds - this is just adding a small bundle to the OSGi container...

I have not tested OH2 but another Eclipse SmartHome based product. I have not done a time measurement, but it need perhaps some minutes, but IIRC never above a few minutes.

@cschneider

This comment has been minimized.

Show comment
Hide comment
@cschneider

cschneider Jan 3, 2016

I just did a test to start a plain karaf 4.0.3 on a raspaberry pi 2 with java version build 1.8.0-b132. It takes about 40 seconds. This looks good to me. Will test the openhab2 distro with karaf next.

cschneider commented Jan 3, 2016

I just did a test to start a plain karaf 4.0.3 on a raspaberry pi 2 with java version build 1.8.0-b132. It takes about 40 seconds. This looks good to me. Will test the openhab2 distro with karaf next.

@jgoodyear

This comment has been minimized.

Show comment
Hide comment
@jgoodyear

jgoodyear Jan 4, 2016

Interesting...

Which JVM are you running?
When i worked on getting ODL to run on RPI, I did a number of performance
related tweaks to the system - those are outlined on my git repo:
https://github.com/jgoodyear/ODLHeliumOnRPi/tree/master/rpi-configs

I'd also note that not all SD cards have similar performance - some are
faster read/write than others.

Cheers,
Jamie

On Sun, Jan 3, 2016 at 7:50 PM, Christian Schneider <
notifications@github.com> wrote:

I just did a test to start a plain karaf 4.0.3 on a raspaberry pi 2 with
java version build 1.8.0-b132. It takes about 40 seconds. This looks good
to me. Will test the openhab2 distro with karaf next.


Reply to this email directly or view it on GitHub
#10 (comment)
.

jgoodyear commented Jan 4, 2016

Interesting...

Which JVM are you running?
When i worked on getting ODL to run on RPI, I did a number of performance
related tweaks to the system - those are outlined on my git repo:
https://github.com/jgoodyear/ODLHeliumOnRPi/tree/master/rpi-configs

I'd also note that not all SD cards have similar performance - some are
faster read/write than others.

Cheers,
Jamie

On Sun, Jan 3, 2016 at 7:50 PM, Christian Schneider <
notifications@github.com> wrote:

I just did a test to start a plain karaf 4.0.3 on a raspaberry pi 2 with
java version build 1.8.0-b132. It takes about 40 seconds. This looks good
to me. Will test the openhab2 distro with karaf next.


Reply to this email directly or view it on GitHub
#10 (comment)
.

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 4, 2016

I also tested on a PI 1, took around 60 seconds to start a vanilla Karaf, I also tried to run the minimal distribution to see if it does have an effect it's also starting in about 60s. Switching to Equinox as basis didn't change anything in timing. At this point I'm quite satisfied :-)

ANierbeck commented Jan 4, 2016

I also tested on a PI 1, took around 60 seconds to start a vanilla Karaf, I also tried to run the minimal distribution to see if it does have an effect it's also starting in about 60s. Switching to Equinox as basis didn't change anything in timing. At this point I'm quite satisfied :-)

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 4, 2016

Contributor

I believe this could be related to openhab/openhab2-addons#563. It looks like a lot of time is being spent in binding the httpcontexts. The openhab offline version with the demo package binds about a dozen of them.
@ANierbeck in the JettyServerWrapper.java, it scans all the bundles on the classpath for ServletContainerInitializer services every time a new httpcontext is added. The openhab offline distribution is quite large (120mb), and scanning all the jars on the pi can be time consuming. Is there a way to disable this behaviour for testing purposes?

Contributor

dvanherbergen commented Jan 4, 2016

I believe this could be related to openhab/openhab2-addons#563. It looks like a lot of time is being spent in binding the httpcontexts. The openhab offline version with the demo package binds about a dozen of them.
@ANierbeck in the JettyServerWrapper.java, it scans all the bundles on the classpath for ServletContainerInitializer services every time a new httpcontext is added. The openhab offline distribution is quite large (120mb), and scanning all the jars on the pi can be time consuming. Is there a way to disable this behaviour for testing purposes?

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 4, 2016

@dvanherbergen right now, no this isn't possible. But the scanning of the bundles is only bound to bundle which are in the "classpath" of the current or the jetty bundle, so if you don't import all other bundles, the scanning shouldn't be of much issue at this point.

see: https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-jetty/src/main/java/org/ops4j/pax/web/service/jetty/internal/JettyServerWrapper.java#L283-L287
and: https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-api/src/main/java/org/ops4j/pax/web/utils/ClassPathUtil.java#L140-L201

ANierbeck commented Jan 4, 2016

@dvanherbergen right now, no this isn't possible. But the scanning of the bundles is only bound to bundle which are in the "classpath" of the current or the jetty bundle, so if you don't import all other bundles, the scanning shouldn't be of much issue at this point.

see: https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-jetty/src/main/java/org/ops4j/pax/web/service/jetty/internal/JettyServerWrapper.java#L283-L287
and: https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-api/src/main/java/org/ops4j/pax/web/utils/ClassPathUtil.java#L140-L201

@kaikreuzer kaikreuzer changed the title from Severe performance issues on embedded ARM (e.g. Raspberry Pi 2) to Performance issues on embedded ARM (e.g. Raspberry Pi 2) Jan 4, 2016

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 4, 2016

Member

Thank you all for testing, so it seems indeed that I have some issue with my Pi, which I will have to investigate.
I have removed the "severe" from the title, but as Karaf still adds some burden on the startup behaviour, we must to continue to look for optimizations.

Member

kaikreuzer commented Jan 4, 2016

Thank you all for testing, so it seems indeed that I have some issue with my Pi, which I will have to investigate.
I have removed the "severe" from the title, but as Karaf still adds some burden on the startup behaviour, we must to continue to look for optimizations.

@jbonofre

This comment has been minimized.

Show comment
Hide comment
@jbonofre

jbonofre Jan 4, 2016

I will take a look, but starting from a minimal distribution might help.

jbonofre commented Jan 4, 2016

I will take a look, but starting from a minimal distribution might help.

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 4, 2016

Contributor

so if you don't import all other bundles, the scanning shouldn't be of much issue at this point.

I added a small timer to the addContext method in the JettyServerWrapper. It times from the start of the method until after the scan for the ServletContainerInitializer is completed. These are the results during the startup:

2016-01-04 09:06:59.352 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6241 ms to check initializer
2016-01-04 09:09:12.615 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 45559 ms to check initializer
2016-01-04 09:09:25.823 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 12192 ms to check initializer
2016-01-04 09:10:23.594 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 37265 ms to check initializer
2016-01-04 09:10:59.443 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 33981 ms to check initializer
2016-01-04 09:11:05.959 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6391 ms to check initializer
2016-01-04 09:11:09.190 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 2747 ms to check initializer
2016-01-04 09:11:10.043 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 745 ms to check initializer
2016-01-04 09:11:16.973 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6367 ms to check initializer
2016-01-04 09:11:23.204 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6116 ms to check initializer
2016-01-04 09:13:44.703 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 135450 ms to check initializer

The time taken varies greatly depending on which httpcontext is being added, but it is adding up to a large amount of time being spent on scanning.

Contributor

dvanherbergen commented Jan 4, 2016

so if you don't import all other bundles, the scanning shouldn't be of much issue at this point.

I added a small timer to the addContext method in the JettyServerWrapper. It times from the start of the method until after the scan for the ServletContainerInitializer is completed. These are the results during the startup:

2016-01-04 09:06:59.352 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6241 ms to check initializer
2016-01-04 09:09:12.615 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 45559 ms to check initializer
2016-01-04 09:09:25.823 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 12192 ms to check initializer
2016-01-04 09:10:23.594 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 37265 ms to check initializer
2016-01-04 09:10:59.443 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 33981 ms to check initializer
2016-01-04 09:11:05.959 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6391 ms to check initializer
2016-01-04 09:11:09.190 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 2747 ms to check initializer
2016-01-04 09:11:10.043 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 745 ms to check initializer
2016-01-04 09:11:16.973 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6367 ms to check initializer
2016-01-04 09:11:23.204 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 6116 ms to check initializer
2016-01-04 09:13:44.703 [WARN ] [ce.jetty.internal.JettyServerWrapper] - # 135450 ms to check initializer

The time taken varies greatly depending on which httpcontext is being added, but it is adding up to a large amount of time being spent on scanning.

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 4, 2016

Contributor

I did another test, this time I removed the scan completely. On my raspberry pi 1, openHAB now starts in under 4 minutes (compared to 8 minutes with the scan).
@ANierbeck any suggestions on how to proceed?

Contributor

dvanherbergen commented Jan 4, 2016

I did another test, this time I removed the scan completely. On my raspberry pi 1, openHAB now starts in under 4 minutes (compared to 8 minutes with the scan).
@ANierbeck any suggestions on how to proceed?

@cschneider

This comment has been minimized.

Show comment
Hide comment
@cschneider

cschneider Jan 4, 2016

I looked a bit into https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-jetty/src/main/java/org/ops4j/pax/web/service/jetty/internal/JettyServerWrapper.java. It only seems to scan for "META-INF/services". There should not be much below that path so I wonder why this seems to have such a big impact.

cschneider commented Jan 4, 2016

I looked a bit into https://github.com/ops4j/org.ops4j.pax.web/blob/master/pax-web-jetty/src/main/java/org/ops4j/pax/web/service/jetty/internal/JettyServerWrapper.java. It only seems to scan for "META-INF/services". There should not be much below that path so I wonder why this seems to have such a big impact.

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 4, 2016

Contributor

If I remember correctly, there is something wrong with the classloaders. The JettyServerWrapper will throw a ClassNotFoundException in some cases when it is trying to instantiate the ServletContainerInitializer class, even though it is available on the classpath. Maybe this has an impact on the time.

Contributor

dvanherbergen commented Jan 4, 2016

If I remember correctly, there is something wrong with the classloaders. The JettyServerWrapper will throw a ClassNotFoundException in some cases when it is trying to instantiate the ServletContainerInitializer class, even though it is available on the classpath. Maybe this has an impact on the time.

@jbonofre

This comment has been minimized.

Show comment
Hide comment
@jbonofre

jbonofre Jan 4, 2016

It could be related to bunch of META-INF/services to seek depending of the classloader.

jbonofre commented Jan 4, 2016

It could be related to bunch of META-INF/services to seek depending of the classloader.

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 4, 2016

@jbonofre I already tested the vanilla minimal distribution, it took as long as the std. distribution on a PI 1. So in that case I doubt that it'll help here.

@dvanherbergen ... about the META-INF/service scanning, there are different ways to approach this.

  1. remove or disable the service scanning
  2. make this configurable
  3. make sure the application itself doesn't import to much unneeded, unwanted bundles
  4. make sure the web-applications use a shared HttpContext

right now I would highly favor 3 and 4 because 1 isn't acceptable at all, and 2 is just another flag which needs to be propagated throughout the application. That'll take some time till the next version does have it :-)

ANierbeck commented Jan 4, 2016

@jbonofre I already tested the vanilla minimal distribution, it took as long as the std. distribution on a PI 1. So in that case I doubt that it'll help here.

@dvanherbergen ... about the META-INF/service scanning, there are different ways to approach this.

  1. remove or disable the service scanning
  2. make this configurable
  3. make sure the application itself doesn't import to much unneeded, unwanted bundles
  4. make sure the web-applications use a shared HttpContext

right now I would highly favor 3 and 4 because 1 isn't acceptable at all, and 2 is just another flag which needs to be propagated throughout the application. That'll take some time till the next version does have it :-)

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 4, 2016

Contributor

@ANierbeck is there any drawback/risk to using a shared HttpContext?

Contributor

dvanherbergen commented Jan 4, 2016

@ANierbeck is there any drawback/risk to using a shared HttpContext?

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 4, 2016

@dvanherbergen, not to my knowledge. It's as if a bunch of servlets are registered from within the same Activator.

ANierbeck commented Jan 4, 2016

@dvanherbergen, not to my knowledge. It's as if a bunch of servlets are registered from within the same Activator.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 4, 2016

Member

FTR: I have flashed my RPi2 with a new Rasbian image and the latest JDK 8.
(First) Startup time for openHAB-Karaf-offline distro is now 4.5 min compared to 2 minutes of the non-karaf demo distro.
So there is a major issue, which I still consider a blocker - but it seems that the http scanning is a major culprit here, so I am looking forward to a solution of it.

Even the installation of an additional binding through the Paper UI takes already more than 30 seconds - this is just adding a small bundle to the OSGi container...

This is now down to 10 secs, but even this feels like a long time for adding and starting a single bundle. Maybe it is due to some complex dependency resolution that is done by the feature installer?

Member

kaikreuzer commented Jan 4, 2016

FTR: I have flashed my RPi2 with a new Rasbian image and the latest JDK 8.
(First) Startup time for openHAB-Karaf-offline distro is now 4.5 min compared to 2 minutes of the non-karaf demo distro.
So there is a major issue, which I still consider a blocker - but it seems that the http scanning is a major culprit here, so I am looking forward to a solution of it.

Even the installation of an additional binding through the Paper UI takes already more than 30 seconds - this is just adding a small bundle to the OSGi container...

This is now down to 10 secs, but even this feels like a long time for adding and starting a single bundle. Maybe it is due to some complex dependency resolution that is done by the feature installer?

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 5, 2016

Contributor

@ANierbeck, when trying to reuse the same httpcontext, it fails with the following error:

2016-01-05 11:45:28.175 [ERROR] [org.eclipse.smarthome.ui            ] - [org.eclipse.smarthome.ui.chart(101)] The activate method has thrown an exception
java.lang.IllegalStateException: Http context DefaultHttpContext [bundle=org.eclipse.smarthome.io.net_0.8.0.201601050935 [109], contextID=default] is already associated to bundle org.eclipse.smarthome.ui.icon_0.8.0.201601050935 [129]
    at org.ops4j.pax.web.service.spi.model.ServerModel.associateHttpContext(ServerModel.java:278)
    at org.ops4j.pax.web.service.internal.HttpServiceStarted.getOrCreateContext(HttpServiceStarted.java:1003)
    at org.ops4j.pax.web.service.internal.HttpServiceStarted.registerServlet(HttpServiceStarted.java:185)

Is there a certain configuration property that needs to be changed?

Contributor

dvanherbergen commented Jan 5, 2016

@ANierbeck, when trying to reuse the same httpcontext, it fails with the following error:

2016-01-05 11:45:28.175 [ERROR] [org.eclipse.smarthome.ui            ] - [org.eclipse.smarthome.ui.chart(101)] The activate method has thrown an exception
java.lang.IllegalStateException: Http context DefaultHttpContext [bundle=org.eclipse.smarthome.io.net_0.8.0.201601050935 [109], contextID=default] is already associated to bundle org.eclipse.smarthome.ui.icon_0.8.0.201601050935 [129]
    at org.ops4j.pax.web.service.spi.model.ServerModel.associateHttpContext(ServerModel.java:278)
    at org.ops4j.pax.web.service.internal.HttpServiceStarted.getOrCreateContext(HttpServiceStarted.java:1003)
    at org.ops4j.pax.web.service.internal.HttpServiceStarted.registerServlet(HttpServiceStarted.java:185)

Is there a certain configuration property that needs to be changed?

@ANierbeck

This comment has been minimized.

Show comment
Hide comment

ANierbeck commented Jan 5, 2016

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 5, 2016

Contributor

mmm. That's not going to work. I don't think we should add dependencies from eclipse smarthome to pax-web.

Contributor

dvanherbergen commented Jan 5, 2016

mmm. That's not going to work. I don't think we should add dependencies from eclipse smarthome to pax-web.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 5, 2016

Member

I don't think we should add dependencies from eclipse smarthome to pax-web.

Definitely not. So option 1 might be the better choice for us after all?

Member

kaikreuzer commented Jan 5, 2016

I don't think we should add dependencies from eclipse smarthome to pax-web.

Definitely not. So option 1 might be the better choice for us after all?

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Jan 5, 2016

Contributor

I think so. At least temporarily until option 2 is available.

Contributor

dvanherbergen commented Jan 5, 2016

I think so. At least temporarily until option 2 is available.

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 5, 2016

If you can wait a bit for Pax-Web 6, it'll support the OSGi R6 specification for the WhiteBoard extender, then you can do it the "std." way.

Definitely not. So option 1 might be the better choice for us after all?

Not at all.

It may be possible to implement some caching, but I'm not done with that thought yet.

ANierbeck commented Jan 5, 2016

If you can wait a bit for Pax-Web 6, it'll support the OSGi R6 specification for the WhiteBoard extender, then you can do it the "std." way.

Definitely not. So option 1 might be the better choice for us after all?

Not at all.

It may be possible to implement some caching, but I'm not done with that thought yet.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 5, 2016

Member

Not at all.

@ANierbeck: We need a solution until end of this week and the scanning adds a few minutes to the startup time. So what other option do you see than to deactivate it for the moment?

Member

kaikreuzer commented Jan 5, 2016

Not at all.

@ANierbeck: We need a solution until end of this week and the scanning adds a few minutes to the startup time. So what other option do you see than to deactivate it for the moment?

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 5, 2016

@kaikreuzer, as openhab isn't the only application using Pax-Web, it just can't be disabled. Every application using JSF needs it. Even the WebSocketServer is registered with ServletContainerInitializers in Jetty.

Make sure to have a reduced dependency tree, which needs to be parsed. Till the end of the week I don't see anything that can be done here.

ANierbeck commented Jan 5, 2016

@kaikreuzer, as openhab isn't the only application using Pax-Web, it just can't be disabled. Every application using JSF needs it. Even the WebSocketServer is registered with ServletContainerInitializers in Jetty.

Make sure to have a reduced dependency tree, which needs to be parsed. Till the end of the week I don't see anything that can be done here.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jan 5, 2016

Member

I am not asking to change anything in Pax-Web by the end of this week. But @dvanherbergen suggested to patch it specifically for openHAB as long as we have no other solution.

Member

kaikreuzer commented Jan 5, 2016

I am not asking to change anything in Pax-Web by the end of this week. But @dvanherbergen suggested to patch it specifically for openHAB as long as we have no other solution.

@ANierbeck

This comment has been minimized.

Show comment
Hide comment
@ANierbeck

ANierbeck Jan 5, 2016

In that case, feel free :-)

ANierbeck commented Jan 5, 2016

In that case, feel free :-)

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Feb 18, 2016

Member

@dvanherbergen Any idea why we still have user reports where the startup takes 20 minutes and more? See e.g. #19 (comment) or https://community.openhab.org/t/closed-very-slow-oh2-starting-up-on-synology-dsm6/7447/8. This is pretty weird, isn't it?

Member

kaikreuzer commented Feb 18, 2016

@dvanherbergen Any idea why we still have user reports where the startup takes 20 minutes and more? See e.g. #19 (comment) or https://community.openhab.org/t/closed-very-slow-oh2-starting-up-on-synology-dsm6/7447/8. This is pretty weird, isn't it?

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Feb 18, 2016

Contributor

No idea. I will need to do some testing to verify. I'm a bit overloaded at work at the moment, I should be able to check this during the weekend.

Contributor

dvanherbergen commented Feb 18, 2016

No idea. I will need to do some testing to verify. I'm a bit overloaded at work at the moment, I should be able to check this during the weekend.

digitaldan pushed a commit to digitaldan/openhab-distro that referenced this issue Feb 22, 2016

@dvanherbergen

This comment has been minimized.

Show comment
Hide comment
@dvanherbergen

dvanherbergen Feb 23, 2016

Contributor

I just did a fresh offline install on a raspberry pi 1 using the standard & demo package.

  • initial startup, until "Installed 'openhab-package-demo'" : 8 min 30 secs.
  • second startup, until "Started dashboard at /start" : 3 min 11 secs.

This looks OK to me, given that it's only a pi 1.

Contributor

dvanherbergen commented Feb 23, 2016

I just did a fresh offline install on a raspberry pi 1 using the standard & demo package.

  • initial startup, until "Installed 'openhab-package-demo'" : 8 min 30 secs.
  • second startup, until "Started dashboard at /start" : 3 min 11 secs.

This looks OK to me, given that it's only a pi 1.

@jgoodyear

This comment has been minimized.

Show comment
Hide comment
@jgoodyear

jgoodyear Feb 23, 2016

Have you tried any of the performance tweaks I've outlined on my
OpenDaylight Helium on RPi project?
https://github.com/jgoodyear/ODLHeliumOnRPi/tree/master/rpi-configs

Cheers,
Jamie

On Tue, Feb 23, 2016 at 3:29 PM, dvanherbergen notifications@github.com
wrote:

I just did a fresh offline install on a raspberry pi 1 using the standard
& demo package.

  • initial startup, until "Installed 'openhab-package-demo'" : 8 min 30
    secs.
  • second startup, until "Started dashboard at /start" : 3 min 11 secs.

This looks OK to me, given that it's only a pi 1.


Reply to this email directly or view it on GitHub
#10 (comment)
.

jgoodyear commented Feb 23, 2016

Have you tried any of the performance tweaks I've outlined on my
OpenDaylight Helium on RPi project?
https://github.com/jgoodyear/ODLHeliumOnRPi/tree/master/rpi-configs

Cheers,
Jamie

On Tue, Feb 23, 2016 at 3:29 PM, dvanherbergen notifications@github.com
wrote:

I just did a fresh offline install on a raspberry pi 1 using the standard
& demo package.

  • initial startup, until "Installed 'openhab-package-demo'" : 8 min 30
    secs.
  • second startup, until "Started dashboard at /start" : 3 min 11 secs.

This looks OK to me, given that it's only a pi 1.


Reply to this email directly or view it on GitHub
#10 (comment)
.

@haunma

This comment has been minimized.

Show comment
Hide comment
@haunma

haunma Jun 5, 2016

Just as another data point, this issue is still very real. I installed the "online" version of openHAB2 via the debian jessie package in
deb https://openhab.ci.cloudbees.com/job/openHAB-Distribution/ws/distributions/openhab-online/target/apt-repo/ /
on my Jetson K1 board yesterday. This is a much faster system than any of the RPis (multicore Tegra K1, 2 GBytes RAM, SATA SSD).

Initial launch (until I could bring up the web page) took 20 minutes, 30 seconds. (I haven't timed a second or third launch yet.) Installing any binding takes ~ 5 minutes each. One or more CPUs were maxed out during all of this. The times would have been longer on a single-core machine.

I ssh'ed into the console. That was quite an adventure as the first time it took about one hour of max CPU before calming down. In the meantime the ssh client times out waiting for the password prompt. Googling for the timeout message takes me to the Karaf mailing list!! Not a promising sign. But eventually I was able to get in (it now takes "only" thirty seconds to reach a password prompt). "info" gives

openhab> info
Karaf
  Karaf version               4.0.4
  Karaf home                  /usr/share/openhab2/runtime/karaf
  Karaf base                  /var/lib/openhab2
  OSGi Framework              org.eclipse.osgi-3.10.2.v20150203-1939

JVM
  Java Virtual Machine        OpenJDK Zero VM version 25.91-b14
  Version                     1.8.0_91
  Vendor                      Oracle Corporation
  Pid                         25736
  Uptime                      19 hours 4 minutes
Error executing command: java.lang.NullPointerException

Perhaps the nightly build I'm running has some other issues? I never saw errors in the Paper UI. Well, except for the fact that the spinners never quit spinning until you reload the page. But I take it that is expected?

Coming from an embedded SW background this has been a little discouraging so far. I had no idea openHAB was so heavyweight :(

haunma commented Jun 5, 2016

Just as another data point, this issue is still very real. I installed the "online" version of openHAB2 via the debian jessie package in
deb https://openhab.ci.cloudbees.com/job/openHAB-Distribution/ws/distributions/openhab-online/target/apt-repo/ /
on my Jetson K1 board yesterday. This is a much faster system than any of the RPis (multicore Tegra K1, 2 GBytes RAM, SATA SSD).

Initial launch (until I could bring up the web page) took 20 minutes, 30 seconds. (I haven't timed a second or third launch yet.) Installing any binding takes ~ 5 minutes each. One or more CPUs were maxed out during all of this. The times would have been longer on a single-core machine.

I ssh'ed into the console. That was quite an adventure as the first time it took about one hour of max CPU before calming down. In the meantime the ssh client times out waiting for the password prompt. Googling for the timeout message takes me to the Karaf mailing list!! Not a promising sign. But eventually I was able to get in (it now takes "only" thirty seconds to reach a password prompt). "info" gives

openhab> info
Karaf
  Karaf version               4.0.4
  Karaf home                  /usr/share/openhab2/runtime/karaf
  Karaf base                  /var/lib/openhab2
  OSGi Framework              org.eclipse.osgi-3.10.2.v20150203-1939

JVM
  Java Virtual Machine        OpenJDK Zero VM version 25.91-b14
  Version                     1.8.0_91
  Vendor                      Oracle Corporation
  Pid                         25736
  Uptime                      19 hours 4 minutes
Error executing command: java.lang.NullPointerException

Perhaps the nightly build I'm running has some other issues? I never saw errors in the Paper UI. Well, except for the fact that the spinners never quit spinning until you reload the page. But I take it that is expected?

Coming from an embedded SW background this has been a little discouraging so far. I had no idea openHAB was so heavyweight :(

@haunma

This comment has been minimized.

Show comment
Hide comment
@haunma

haunma Jun 5, 2016

On subsequent startups, things are better... "only" 1 minute 20 seconds of CPU time. Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

haunma commented Jun 5, 2016

On subsequent startups, things are better... "only" 1 minute 20 seconds of CPU time. Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Jun 5, 2016

Member

Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

From my experience OpenJDK is not usable at all, because it indeed shows the performance issues you mention. With OracleJDK, the situation is much better.

Member

kaikreuzer commented Jun 5, 2016

Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

From my experience OpenJDK is not usable at all, because it indeed shows the performance issues you mention. With OracleJDK, the situation is much better.

@jgoodyear

This comment has been minimized.

Show comment
Hide comment
@jgoodyear

jgoodyear Jun 5, 2016

Hi Mark,

Just for comparison purposes, could you try installing a plain Karaf
distribution (not OpenHab) just to see what the base container's boot up
time is like?

On Sun, Jun 5, 2016 at 4:14 AM, Kai Kreuzer notifications@github.com
wrote:

Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

From my experience OpenJDK is not usable at all, because it indeed shows
the performance issues you mention. With OracleJDK, the situation is much
better.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAYlIYawgO_dpNCe9hctRoJGjSoIiZS2ks5qInBRgaJpZM4G9xtg
.

jgoodyear commented Jun 5, 2016

Hi Mark,

Just for comparison purposes, could you try installing a plain Karaf
distribution (not OpenHab) just to see what the base container's boot up
time is like?

On Sun, Jun 5, 2016 at 4:14 AM, Kai Kreuzer notifications@github.com
wrote:

Some of the improvement may be due to switching from OpenJDK to Oracle JDK.

From my experience OpenJDK is not usable at all, because it indeed shows
the performance issues you mention. With OracleJDK, the situation is much
better.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAYlIYawgO_dpNCe9hctRoJGjSoIiZS2ks5qInBRgaJpZM4G9xtg
.

@maggu2810

This comment has been minimized.

Show comment
Hide comment
@maggu2810

maggu2810 Jun 5, 2016

Contributor

I ssh'ed into the console. That was quite an adventure as the first time it took about one hour of max CPU before calming down. In the meantime the ssh client times out waiting for the password prompt. Googling for the timeout message takes me to the Karaf mailing list!!

The host key is generated on the first login.
On my embedded systems it took around 5 minutes using the Oracle JVM.
You can use a pre-generated one if you do not want this one-time time consuming step.
But I don't think the distro should generate the same key for all machines.

The Jetty keystore is generated on the first start, too.

Contributor

maggu2810 commented Jun 5, 2016

I ssh'ed into the console. That was quite an adventure as the first time it took about one hour of max CPU before calming down. In the meantime the ssh client times out waiting for the password prompt. Googling for the timeout message takes me to the Karaf mailing list!!

The host key is generated on the first login.
On my embedded systems it took around 5 minutes using the Oracle JVM.
You can use a pre-generated one if you do not want this one-time time consuming step.
But I don't think the distro should generate the same key for all machines.

The Jetty keystore is generated on the first start, too.

@kaikreuzer

This comment has been minimized.

Show comment
Hide comment
@kaikreuzer

kaikreuzer Oct 15, 2016

Member

I would suggest to close this issue as the startup times seem to be acceptible by now - it is mainly the certificate generation on a very first startup that slows things down; nonetheless, I think this is a valuable feature that we should not drop.

Member

kaikreuzer commented Oct 15, 2016

I would suggest to close this issue as the startup times seem to be acceptible by now - it is mainly the certificate generation on a very first startup that slows things down; nonetheless, I think this is a valuable feature that we should not drop.

@kaikreuzer kaikreuzer closed this Oct 15, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment