Why is Logstash so slow to start? #5507

Open
jsvd opened this Issue Jun 16, 2016 · 10 comments

6 participants
@jsvd
Member

jsvd commented Jun 16, 2016

Here is some profiling I did on master, where a simple pipeline takes 9 seconds to start executing on my MacBook:

% bin/logstash -e "input { generator { count => 1 } } output { stdout { codec => dots} }"
bash script Thu Jun 16 11:07:41 WEST 2016
started setting up bundler at 2016-06-16 11:07:43 +0100
require bundler took: 0.359
patching bundler took: 0.56
bundler reset took 0.017
bundler setup took 3.763
done setting bundler 2016-06-16 11:07:48 +0100
runner#run 2016-06-16 11:07:50 +0100
agent#execute 2016-06-16 11:07:50 +0100
Pipeline main started
.Pipeline main has been shutdown

This means that, of the 9 seconds it takes to execute the pipeline, we spend:

  • ~2 seconds going from BASH to JRuby
  • ~5 seconds executing LogStash::Bundler.setup!({:without => [:build, :development]}): https://github.com/elastic/logstash/blob/master/lib/bootstrap/environment.rb#L64
  • ~2 seconds processing cli args, creating the agent, compiling the pipeline and other setup tasks
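
As a rough cross-check, the breakdown above can be recomputed from the whole-second timestamps in the log. This is only a sketch: the labels and times are copied from the output above, and the helper name `phase_durations` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log output above (whole seconds, same day, same UTC offset).
MARKS = [
    ("bash script", "11:07:41"),
    ("started setting up bundler", "11:07:43"),
    ("done setting bundler", "11:07:48"),
    ("runner#run", "11:07:50"),
]


def phase_durations(marks):
    """Return (from_label, to_label, seconds) for each consecutive pair of marks."""
    times = [(label, datetime.strptime(ts, "%H:%M:%S")) for label, ts in marks]
    return [
        (a_label, b_label, int((b - a).total_seconds()))
        for (a_label, a), (b_label, b) in zip(times, times[1:])
    ]


for start, end, secs in phase_durations(MARKS):
    print(f"{start} -> {end}: {secs}s")
```

Because the log only has one-second resolution, each figure is accurate to about a second at best.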

@jsvd jsvd added the discuss label Jun 16, 2016

@jsvd


jsvd Jun 16, 2016

Member

Also, if you're seeing startup times on the order of minutes, you could be hitting https://github.com/jruby/jruby/wiki/Improving-startup-time#ensure-your-system-has-adequate-entropy
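
To check whether a Linux host is short on entropy, one option is to read the kernel's estimate from procfs. The sketch below assumes a Linux system exposing `/proc/sys/kernel/random/entropy_avail`; the function name `read_entropy` is hypothetical and not part of Logstash or JRuby.

```python
from pathlib import Path

# Standard Linux procfs location for the kernel's entropy estimate (in bits).
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path=ENTROPY_PATH):
    """Return the kernel's available-entropy estimate in bits, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (e.g. not Linux) or content not an integer.
        return None


if __name__ == "__main__":
    bits = read_entropy()
    print("entropy_avail:", "unavailable" if bits is None else f"{bits} bits")
```

Running it before and after starting an entropy daemon such as haveged gives a quick before/after comparison.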

@ipconfiger


ipconfiger Nov 2, 2016

I ran it on a 2-core, 2 GB VPS, and it takes 7 minutes to start 😂😂😂😂😂😂😂

@jsvd


jsvd Nov 2, 2016

Member

@ipconfiger see my comment above; this looks like a lack-of-entropy issue: https://github.com/jruby/jruby/wiki/Improving-startup-time#ensure-your-system-has-adequate-entropy
I often run Logstash on Amazon's free-tier t2.micro instances, and it definitely starts in under 1 minute.

@lajoie


lajoie Jan 4, 2017

I'm experiencing this as well: startup times are around 5-6 minutes. Here's what info I can provide.

logstash 5.1.1 w/o additional plugins w/ configuration provided below
Java: OpenJDK 1.8.0_111
Host: CentOS 6.7, 1 vCPU, 512M ram, virtual machine

Following some of the recommendations in the previously mentioned wiki page I tried the following items:

  • installed haveged to address the possible lack of entropy - no result
  • forced the JVM to use only the first tier of tiered compilation - no result
  • (re)generated the class data archive - no result

For comparison purposes I downloaded logstash 2.4.1 and ran with the same config. Got the same results.

Then I ran the same config directly on my workstation and got a startup time of ~5s, which is more what I'm used to seeing.

Seeing the stark difference between the VM and my laptop I then tried logstash 5.1.1 on one of our AWS instances and saw a startup time of 10-15s.

So clearly there is some sort of environmental component to this. I'm not sure how to go further to figure out what it might be, but I'm happy to run any tests from, or provide any data to, someone with a better understanding of Logstash internals.

My configuration:

input {
  file {
    path => "/var/log/zookeeper/zookeeper.log"
    add_field => { "[@metadata][type]" => "zookeeper" }
    codec => multiline {
      pattern => "^\d\d\d\d"
      negate => true
      what => "previous"
    }
  }
}

filter {
  if [@metadata][type] == "zookeeper" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s*%{WORD:level}\s*%{NOTSPACE:category}\s*%{GREEDYDATA:log_message}" }
    }
  }
}
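
For readers unfamiliar with the multiline settings in this configuration: with `negate => true` and `what => "previous"`, any line that does not match the pattern is appended to the preceding event. The sketch below is a simplified model of that grouping, not the codec's actual implementation (it ignores buffering and flushing); the function name and sample lines are invented for illustration.

```python
import re


def join_multiline(lines, pattern=r"^\d\d\d\d"):
    """Group lines into events; a non-matching line is appended to the previous event."""
    starts_event = re.compile(pattern)
    events = []
    for line in lines:
        if starts_event.search(line) or not events:
            # Line begins with four digits (a year): start a new event.
            events.append(line)
        else:
            # Continuation line (e.g. a stack trace): attach to the previous event.
            events[-1] += "\n" + line
    return events


# Invented sample input resembling a log with a stack trace.
sample = [
    "2017-01-04 10:00:00,123 ERROR Something failed",
    "java.lang.RuntimeException: boom",
    "    at Example.run(Example.java:1)",
    "2017-01-04 10:00:01,456 INFO Recovered",
]
events = join_multiline(sample)
print(len(events), "events")
```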

@ksiv


ksiv Feb 17, 2017

@jsvd thanks a lot! I solved an issue like this by installing AND running haveged.
Entropy rose from 34-56 to 2000+.
Startup time went from 5 min to 15 sec.
http://passbe.com/2016/12/21/logstash-slow-start-up-times-and-exhausting-entropy.html

@CowboyTim


CowboyTim Jun 9, 2017

That's IMO not a solution. An application shouldn't need that much entropy just to start up.

@untergeek


untergeek Jun 9, 2017

Member

@CowboyTim, entropy exhaustion is a very real concern if you are doing anything with encryption. Logstash happens to spin up its encryption bits at initialization, and unfortunately this makes for a bad experience if you have insufficient entropy.

@jsvd


jsvd Jun 9, 2017

Member

I haven't yet tested Logstash with JRuby 1.7.27, which introduced a backport of a fix around the SecureRandom code. I'm not sure whether it affects this problem.

[edit] wrong link, backport pr is jruby/jruby#4149

@CowboyTim


CowboyTim Jun 9, 2017

@untergeek it's indeed a real concern if encryption is needed. But when it's not, the real concern is the time wasted debugging a problem that shouldn't be a problem, and the time wasted during application start on things it never needs. At the very least, a warning could be logged that the entropy is too low. That would even benefit applications that actually need entropy.
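
A low-entropy warning of the kind suggested here could look roughly like the following. This is only a sketch of the idea, not anything Logstash or JRuby does: it assumes a Linux host exposing `/proc/sys/kernel/random/entropy_avail`, and the function name, logger name and threshold are all hypothetical.

```python
import logging
from pathlib import Path

# Hypothetical threshold in bits, chosen only for illustration.
LOW_ENTROPY_BITS = 200


def warn_if_low_entropy(
    path="/proc/sys/kernel/random/entropy_avail",
    threshold=LOW_ENTROPY_BITS,
    logger=logging.getLogger("startup"),
):
    """Log a warning and return True when the entropy estimate is below threshold."""
    try:
        bits = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Estimate unavailable (e.g. non-Linux host): stay silent.
        return False
    if bits < threshold:
        logger.warning("available entropy is low (%d bits); startup may block", bits)
        return True
    return False
```

Called once early in startup, a check like this would at least point the user at the likely cause instead of leaving a silent multi-minute stall.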

@untergeek


untergeek Jun 9, 2017

Member

As @jsvd correctly points out, the real problem is upstream of Logstash, in the JRuby code. We are migrating soon to JRuby 9000, where the backported fix originated.
