Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is Logstash so slow to start? #5507

Open
jsvd opened this issue Jun 16, 2016 · 30 comments
Open

Why is Logstash so slow to start? #5507

jsvd opened this issue Jun 16, 2016 · 30 comments
Labels

Comments

@jsvd
Copy link
Member

@jsvd jsvd commented Jun 16, 2016

Here are some profilling I did on master, where a simple pipeline takes 9 seconds to start executing on my macbook:

% bin/logstash -e "input { generator { count => 1 } } output { stdout { codec => dots} }"
bash script Thu Jun 16 11:07:41 WEST 2016
started setting up bundler at 2016-06-16 11:07:43 +0100
require bundler took: 0.359
patching bundler took: 0.56
bundler reset took 0.017
bundler setup took 3.763
done setting bundler 2016-06-16 11:07:48 +0100
runner#run 2016-06-16 11:07:50 +0100
agent#execute 2016-06-16 11:07:50 +0100
Pipeline main started
.Pipeline main has been shutdown

This means that, of the 9 seconds it takes to execute the pipeline, we spend:

~2 seconds going from BASH to JRuby
~5 seconds executing LogStash::Bundler.setup!({:without => [:build, :development]}): https://github.com/elastic/logstash/blob/master/lib/bootstrap/environment.rb#L64
~2 seconds processing cli args, creating the agent, compiling the pipeline and other setup tasks

@jsvd jsvd added the discuss label Jun 16, 2016
@jsvd
Copy link
Member Author

@jsvd jsvd commented Jun 16, 2016

Also, if you're seeing startup times in the magnitude of minutes, you could be hitting https://github.com/jruby/jruby/wiki/Improving-startup-time#ensure-your-system-has-adequate-entropy

@ipconfiger
Copy link

@ipconfiger ipconfiger commented Nov 2, 2016

I used on a 2 core 2G mem VPS, i will table 7min to start 😂😂😂😂😂😂😂

@jsvd
Copy link
Member Author

@jsvd jsvd commented Nov 2, 2016

@ipconfiger see my comment above, this seems to be a lack of entropy issue https://github.com/jruby/jruby/wiki/Improving-startup-time#ensure-your-system-has-adequate-entropy
I use logstash often in amazon's free tier t2.micros and logstash definitely starts under 1 min

@lajoie
Copy link

@lajoie lajoie commented Jan 4, 2017

I'm experiencing this as well: startup times are around 5-6 minutes. Here's what info I can provide.

logstash 5.1.1 w/o additional plugins w/ configuration provided below
Java: OpenJDK 1.8.0_111
Host: CentOS 6.7, 1 vCPU, 512M ram, virtual machine

Following some of the recommendations in the previously mentioned wiki page I tried the following items:

  • install haveged to address the possible lack of entropy - no result
  • forced JVM to use only the first tier of the tiered compilation - no result
  • (re)generated the class data archive - no result

For comparison purposes I downloaded logstash 2.4.1 and ran with the same config. Got the same results.

Then I ran the same config directly on my workstation and got a startup time of ~5s, which is more what I'm used to seeing.

Seeing the stark difference between the VM and my laptop I then tried logstash 5.1.1 on one of our AWS instances and saw a startup time of 10-15s.

So clearly there is some sort of environmental component to this. Not sure how to go further to figure out what it might be. But happy to run any tests from, or provide any data to, some one with a better understanding of logstash internals.

My configuration:

input {
  file {
    path => "/var/log/zookeeper/zookeeper.log"
    add_field => { "[@metadata][type]" => "zookeeper" }
    codec => multiline {
      pattern => "^\d\d\d\d"
      negate => true
      what => "previous"
    }
  }
}

filter {
  if [@metadata][type] == "zookeeper" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s*%{WORD:level}\s*%{NOTSPACE:category}\s*%{GREEDYDATA:log_message}" }
    }
  }
}
@ksiv
Copy link

@ksiv ksiv commented Feb 17, 2017

@jsvd thanks a lot ! Solved an issue like this by installing AND running haveged .
Rise in entropia like 2r plus from 34-56 to 2000+
Increase in speed 5 min turned into 15 sec.
http://passbe.com/2016/12/21/logstash-slow-start-up-times-and-exhausting-entropy.html

@CowboyTim
Copy link

@CowboyTim CowboyTim commented Jun 9, 2017

That's IMO not a solution. There's no need to have so many entropy needed for an application to just start up,

@untergeek
Copy link
Member

@untergeek untergeek commented Jun 9, 2017

@CowboyTim, entropy exhaustion is a very real concern if you are doing anything with encryption. Logstash happens to spin up its encryption bits at initialization, and unfortunately this makes for a bad experience if you have insufficient entropy.

@jsvd
Copy link
Member Author

@jsvd jsvd commented Jun 9, 2017

I haven't tested yet logstash with jruby 1.7.27, which has introduced a backport of a fix around the SecureRandom code. I'm not sure if this can contribute to this problem.

[edit] wrong link, backport pr is jruby/jruby#4149

@CowboyTim
Copy link

@CowboyTim CowboyTim commented Jun 9, 2017

@untergeek it's indeed a real concern if there's encryption needed. But when it's not, the real concern is the time wasted trying to debug a problem which shouldn't be a problem - and the time wasted during application start for things it never needs. At the very least this can be just logged that the entropy is too low. This would even benefit applications that actually need entropy too.

@untergeek
Copy link
Member

@untergeek untergeek commented Jun 9, 2017

As @jsvd correctly points out, the real problem is upstream from Logstash in the JRuby code. We are migrating to JRuby 9000 soon, where the backported fix originated from.

@fenchu
Copy link

@fenchu fenchu commented Oct 12, 2018

logstash 6.4.2:
takes around 15sec to start on my i7workstation under wsl (which is considered a very slow ubuntu)
but on our production servers in azure it takes:

PS C:\dist\logstash-6.4.2\testing> date; C:\dist\logstash-6.4.2\bin\logstash.bat -f .\zcat-proxylogs.grok
Fri Oct 12 12:36:28 CUT 2018
Sending Logstash logs to C:/dist/logstash-6.4.2/logs which is now configured via log4j2.properties
[2018-10-12T12:37:57,041][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-10-12T12:38:03,239][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.4.2"}
[2018-10-12T12:38:39,886][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-10-12T12:38:44,767][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x350d08f4 run>"}
[2018-10-12T12:38:45,421][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-10-12T12:38:51,575][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

~90 sec, is there a workaround for this, because it is scheduled to parse a lot of gz files, and powershell ends up reading in 15GB in the pipeline before logstash/jruby is ready.

@fenchu
Copy link

@fenchu fenchu commented Oct 12, 2018

Also see the very slow stage from jruby ready: 37:57 to logstash started: 38:51 (56Sec), when it runs it is fast parsing around 5000+ lines a sec. Can I have this resident? it is actually faster to copy files across the net (around 20GB/day) and run logstash locally on the ELK-box.

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

Also seeing this slowness on windows, with a simple config it takes 90-120 seconds to start

@jsvd
Copy link
Member Author

@jsvd jsvd commented Oct 25, 2018

@guylando can you run with --log.level=debug and see the time to the first logging message?
Which version of logstash are you running? and what are the specs on the windows machine?

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

@jsvd logstash version 6.4.2 windows server 2012, takes 90 seconds.
I ran process monitor (sysinternals) and it seems that all this time it just loads more and more gems and java jar libraries.

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

I see one problem causing slowness: When it loads ruby gems, it looks for them in all subfolders of \vendor\bundle\jruby\2.3.0\gems\ and finds them only after more than 20 failed lookups. This happens for many gems, for example:
clamp.rb, connection.rb, collection.rb, faraday.rb, mvar.rb, definition.rb

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

@jsvd Its suggested here to try jruby with --dev flag: https://github.com/jruby/jruby/wiki/Improving-startup-time#use-the---dev-flag
Do you know where can I add a flag to the call to jruby in logstash?

@jsvd
Copy link
Member Author

@jsvd jsvd commented Oct 25, 2018

does the majority of the 90 seconds happen before the first log entry?
The use of --dev will slow down logstash processing, and it's not clear that it will help in this situation.
Did you start seeing this after an upgrade or was it always an issue?

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

90 seconds before first log.
seeing this when installing a fresh system, not aware of other situations.

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

changing logstash memory from 3gb to 4gb didnt help.

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

@jsvd I want to add --dev flag to just test if it helps to find where the problem is. How do I add it?

@guylando
Copy link

@guylando guylando commented Oct 25, 2018

logstash uses jruby version 9.1.13 which is more than 1 years old however seems there were major performance improvemenets for jruby on windows in newer versions, see last comments in: jruby/jruby#5060

UPDATE: Tested the 9.1.17 version (had to remove the resolv.rb patch) and it didnt improve the startup time. Still 90 seconds unfortunately.

@fozboz
Copy link

@fozboz fozboz commented Nov 9, 2018

I've noticed this since at least Logstash 6.3.0. It will always take around 40-60 seconds to start.

With version 6.4.2 on Ubuntu 16.04 running with --debug, the delay occurs before any logs appear. As soon as the first line appears, it starts very quickly and then performs fine.

@fozboz
Copy link

@fozboz fozboz commented Nov 9, 2018

@guylando you can recreate --dev flag with these lines in your jvm.options file.

-Djruby.compile.invokedynamic=false
-Djruby.compile.mode=OFF
-XX:+TieredCompilation
-XX:TieredStopAtLevel=1

This improved start times for me considerably, but this is not recommended for long running threads as it will not optimise code before starting.

I am configuring Logstash with Chef and want to test pipelines before they are published using --config.test_and_exit, and having > 30 second start times testing 20 pipelines is not viable. I can at least now change JVM settings just for these tests and get the start time down to ~ 10 seconds.

@guylando
Copy link

@guylando guylando commented Nov 11, 2018

@fozboz thanks
in my case too, it works fine after the first line appears
and from investigation with sysinternals procmon I saw that before the first line appears what happens is multiple libraries loading with many libraries being searched in wrong location slowing down the whole thing

@celesteking
Copy link

@celesteking celesteking commented Dec 10, 2018

30 secs to wait for config syntax check is way too long, you need either to optimize it out, or think out a separate config parser and checker. Look at bison for that.

@untergeek
Copy link
Member

@untergeek untergeek commented Dec 10, 2018

That's a dependency that's not likely to be added. It's a JVM-based install, and everything that Logstash needs must be available within the JVM to make it available on any platform. Sure, bison would be great, if it were running on Linux, but Logstash has to run on Windows, Linux, MacOS, etc.

Future releases may be more tuned to have a JVM start, and then remain up, and then have pipelines added afterward—kind of how centralized pipeline management works now, but in a broader sense.

Or not. It depends on how the development team chooses to tackle it.

@lubobill1990
Copy link

@lubobill1990 lubobill1990 commented Mar 4, 2019

+1 it's hard to debug the pipelines when logstash is slow to start.

@fenchu
Copy link

@fenchu fenchu commented Mar 5, 2019

logstash developers should consider a resident mode, so we can start up logstash and let it parse via sockets. We have stopped using logstash, we now use elastics _bulk api directly, with parsing and preparing in python, we manage 10K rows as sec. Even before logstash is operational we have inserted almost 900K rows.

@M0rdecay
Copy link

@M0rdecay M0rdecay commented Jun 18, 2020

+1 to this. A long Logstash launch creates difficulties in debugging the pipeline and negatively affects production if the ELK servers are periodically restarted.

Logstash version - 7.7.1 in container.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.