This repository has been archived by the owner on Mar 21, 2023. It is now read-only.

hsbeat causes high I/O wait! #2

Closed
kkpbbb opened this issue Jan 17, 2016 · 11 comments

@kkpbbb

kkpbbb commented Jan 17, 2016

Hi, I tried to use hsbeat to collect JVM info.
I found that it raises the CPU wa% shown by the top command from 1% to 70% or even more!
My system is CentOS 6 and the Java version is:
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)

Can we use a javaagent, as hsflowd does (http://blog.sflow.com/2011/09/java-virtual-machine.html), to monitor JVM performance?

Thanks.

Sorry for my bad English.

@YaSuenag
Owner

Thank you for reporting!

I've fixed the performance issue in file access:
1215256

However, we might still encounter performance issues.
When I ran the (new) hsbeat under a profiler, it spent CPU time in the libbeat publisher and in JSON processing.

This might be improved by changing the index structure in ES.
I will continue working on it in the performance-improvement branch.

YaSuenag reopened this Jan 17, 2016
@YaSuenag
Owner

I've improved the performance of HSBeat:
4031b8b

profile.zip

I've attached a CPU profile call graph for it.

  • Machine: Fedora23 x86_64 Hyper-V virtual machine (2 vcpu / 2GB mem)
  • Host CPU: Intel Core i3 2367M 1.4GHz
  • HSBeat collection interval: 10 sec

This profile shows that the top overhead is the GC routine.
To reduce it, I implemented an entry cache for hsperfdata entries.

It is difficult to improve performance much further.
But I will try :-)
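
As a rough illustration of the entry cache idea mentioned above, here is a minimal sketch. It is not the code from commit 4031b8b. The types `cachedEntry` and `EntryCache`, the fixed 8-byte little-endian value layout, and the offsets are all hypothetical simplifications; the real hsperfdata format is more involved. The point is only that names and offsets are parsed once, so each collection interval reads values into a reused map instead of allocating new strings and maps for the GC to clean up.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"log"
)

// cachedEntry is a hypothetical, simplified record: the counter name and the
// byte offset of its 8-byte little-endian value inside the raw buffer.
type cachedEntry struct {
	Name   string
	Offset int
}

// EntryCache keeps the parsed entry list and one reusable result map, so a
// refresh only re-reads values and does not allocate new strings or maps.
type EntryCache struct {
	entries []cachedEntry
	values  map[string]int64
}

// NewEntryCache stores the entry list that was parsed once up front.
func NewEntryCache(entries []cachedEntry) *EntryCache {
	return &EntryCache{
		entries: entries,
		values:  make(map[string]int64, len(entries)),
	}
}

// Refresh re-reads every cached entry from raw. The returned map is owned by
// the cache and is overwritten by the next call to Refresh.
func (c *EntryCache) Refresh(raw []byte) (map[string]int64, error) {
	for _, e := range c.entries {
		if e.Offset < 0 || e.Offset+8 > len(raw) {
			return nil, fmt.Errorf("entry %q: offset %d out of range", e.Name, e.Offset)
		}
		c.values[e.Name] = int64(binary.LittleEndian.Uint64(raw[e.Offset : e.Offset+8]))
	}
	return c.values, nil
}

func main() {
	// Stand-in for the memory-mapped hsperfdata contents (assumed layout).
	raw := make([]byte, 16)
	binary.LittleEndian.PutUint64(raw[0:], 42)
	binary.LittleEndian.PutUint64(raw[8:], 7)

	// Counter names are used here purely as illustrative keys.
	cache := NewEntryCache([]cachedEntry{
		{Name: "sun.gc.collector.0.invocations", Offset: 0},
		{Name: "java.threads.live", Offset: 8},
	})

	vals, err := cache.Refresh(raw)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(vals["sun.gc.collector.0.invocations"], vals["java.threads.live"])
}
```

The trade-off in this sketch is that callers must not keep the returned map across refreshes, since it is reused on purpose.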

@kkpbbb
Author

kkpbbb commented Feb 17, 2016

Thanks.
I gave it a try and found that I/O wait is now low, but user and system CPU usage are very high, and it uses so much memory that the JVM was finally killed by OOM!

@YaSuenag
Owner

I guess that the high CPU usage is caused by the Go runtime.

diff --git a/hsbeat/hsbeat.go b/hsbeat/hsbeat.go
index 4dede96..c2a5632 100644
--- a/hsbeat/hsbeat.go
+++ b/hsbeat/hsbeat.go
@@ -135,18 +135,22 @@ func (this *HSBeat) publishCached(b *beat.Beat) error {
 }

 func (this *HSBeat) Run(b *beat.Beat) error {
+/*
   err := this.publishAll(b)
   if err != nil {
     return err
   }
+*/

   for !this.ShouldTerminate {
     time.Sleep(this.Interval)

+/*
     err := this.publishCached(b)
     if err != nil {
       return err
     }
+*/

   }

diff --git a/main.go b/main.go
index f2d0c83..3912898 100644
--- a/main.go
+++ b/main.go
@@ -28,7 +28,7 @@ import (

   hsbeat "github.com/YaSuenag/hsbeat/hsbeat"

-  //"runtime/pprof"
+  "runtime/pprof"
 )


@@ -38,14 +38,18 @@ func main() {
     log.Fatal(err)
   }

-/*
-  prof, err := os.Create("hsbeat.prof")
+  prof, err := os.Create("hsbeat.pprof")
   if err != nil {
     log.Fatal(err)
   }
   pprof.StartCPUProfile(prof)
   defer pprof.StopCPUProfile()
-*/
+
+  mprof, err := os.Create("hsbeat.mprof")
+  if err != nil {
+    log.Fatal(err)
+  }
+  defer pprof.WriteHeapProfile(mprof)

   hb :=&hsbeat.HSBeat{os.Args[1], time.Duration(interval), "",
                                                        false, nil, nil, nil}

I applied the above patch (it turns the Beat's main loop into an empty loop), but I still saw high CPU usage.
(attached screenshot: top)

CPU profiling shows that the futex call is the most expensive one in the profiling session.
profile.zip

Conclusion:
We have to work on further performance improvements. However, we might not be able to avoid the overhead in the Go runtime.
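
For readers who want to repeat this kind of measurement without patching hsbeat, the following is a minimal sketch of profiling an empty sleep loop with `runtime/pprof`. The function name `profileIdleLoop` is made up for this example, and the profile is written to an in-memory buffer rather than to `hsbeat.pprof` as in the patch above. It only demonstrates the measurement setup and does not reproduce the futex result.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/pprof"
	"time"
)

// profileIdleLoop runs a loop that does nothing but sleep, with the CPU
// profiler enabled, and returns the size of the collected profile in bytes.
func profileIdleLoop(iterations int, interval time.Duration) (int, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, err
	}
	for i := 0; i < iterations; i++ {
		time.Sleep(interval) // same shape as the emptied Run loop in the patch
	}
	// StopCPUProfile flushes the profile data into buf before returning.
	pprof.StopCPUProfile()
	return buf.Len(), nil
}

func main() {
	n, err := profileIdleLoop(5, 100*time.Millisecond)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("captured %d bytes of CPU profile data\n", n)
}
```

To inspect a profile like this with `go tool pprof`, write it to a file instead of a buffer, as the patch does with `os.Create`.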

@ruflin

ruflin commented Feb 17, 2016

@YaSuenag Which libbeat version are you currently using for hsbeat?

@YaSuenag
Owner

@ruflin I use v1.0.0 (be66518)

@ruflin

ruflin commented Feb 18, 2016

@YaSuenag I didn't check the issue in detail, but in general it could help to upgrade to the most recent version of libbeat.

@YaSuenag
Owner

I fixed this issue in e02c891!
I'm closing this issue, but kibana.json in the current HEAD (the above commit) does not support these changes yet.

So I will fix it soon in another commit.

@YaSuenag
Owner

CPU usage in e02c891 is < 1% on my machine. Try it!

@kkpbbb
Author

kkpbbb commented Feb 22, 2016

Hi,
Thanks for your fix.
I gave it a try and everything is fine.
But there is no data in Elasticsearch except the common header!
Is that correct? Did you comment out too much code?
(image attached)

@YaSuenag
Owner

It is correct.
I've disabled _source in the dynamic template to reduce the cost on Elasticsearch:
c0a9dff

You can see the data on the Kibana dashboard if you import etc/kibana.json.
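
To make the `_source` setting above more concrete, here is a minimal sketch that builds an index template body with `_source` disabled and prints it as JSON. It is not the template from commit c0a9dff. The function name `templateBody`, the index pattern `hsbeat-*`, and the `template` / `_default_` layout (as used by Elasticsearch releases of that period) are assumptions for illustration. With `_source` disabled, search hits carry only document metadata, which would match the "common header only" view reported above, while the indexed fields can still be queried and aggregated.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// templateBody returns a hypothetical index template that disables _source
// for every type in indices matching pattern.
func templateBody(pattern string) map[string]interface{} {
	return map[string]interface{}{
		"template": pattern, // assumed index pattern, e.g. "hsbeat-*"
		"mappings": map[string]interface{}{
			"_default_": map[string]interface{}{
				// The original JSON document is not stored, which saves
				// disk space and indexing cost on the Elasticsearch side.
				"_source": map[string]interface{}{"enabled": false},
			},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(templateBody("hsbeat-*"), "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```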
