Skip to content
This repository has been archived by the owner on Jun 29, 2023. It is now read-only.

Logging stuck in infinite loop with gelf ssl appender #129

Closed
rbjorklin opened this issue Jan 29, 2018 · 4 comments
Closed

Logging stuck in infinite loop with gelf ssl appender #129

rbjorklin opened this issue Jan 29, 2018 · 4 comments
Labels
type: bug A general bug

Comments

@rbjorklin
Copy link

We have seen our Java application get stuck in what seems to be an infinite loop where it consumes 100% CPU on one core when using the below configuration.

<appender name="gelf" class="biz.paluch.logging.gelf.logback.GelfLogbackAppender">                    
    <host>ssl:graylog.company.com</host>   
    <version>1.1</version>                       
    <extractStackTrace>true</extractStackTrace>  
    <additionalFields>service=product</additionalFields>                                               
    <mdcFields>organization,cluster,api-id,x-forwarded-for</mdcFields>                            
</appender>

The last time it happened it correlated somewhat with our Graylog server going offline. We've let it run for awhile to see if it would recover but it doesn't seem like it, is there some kind of request/read/write timeout missing somewhere?

Ran strace on the process and could observe this:

> sudo strace -c -f -p 31000
strace: Process 31000 attached with 32 threads
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.11  129.794311       20463      6343      2881 futex
  8.85   14.709935      119593       123           accept
  8.85   14.701845       39101       376           epoll_wait
  3.37    5.603505           3   2210622   2210303 read
  0.78    1.304115      144902         9         7 restart_syscall
  0.02    0.034064           3     12195           clock_gettime
  0.00    0.003425           3      1260           gettimeofday
  0.00    0.001792           6       311           write
  0.00    0.001219           3       369           fcntl
  0.00    0.001136           2       499           epoll_ctl
  0.00    0.000703           3       246           getsockname
  0.00    0.000682           3       246           setsockopt
  0.00    0.000474           4       123           dup2
  0.00    0.000458           4       123           shutdown
  0.00    0.000259           2       116           stat
  0.00    0.000094           1       123           close
  0.00    0.000033           8         4           mmap
  0.00    0.000033          17         2           mprotect
  0.00    0.000032           8         4           sched_getaffinity
  0.00    0.000032          16         2           set_robust_list
  0.00    0.000000           0         8           rt_sigprocmask
  0.00    0.000000           0         2           madvise
  0.00    0.000000           0         2           clone
  0.00    0.000000           0         2           gettid
------ ----------- ----------- --------- --------- ----------------
100.00  166.158147               2233110   2213191 total

Stacktrace

https://gist.github.com/rbjorklin/28cf847c1bb673620c62cf3781b92737

Version

logstash-gelf: 1.11.1
logback: 1.2.3
slf4j: 1.7.25

@mp911de mp911de added the type: bug A general bug label Jan 29, 2018
@mp911de
Copy link
Owner

mp911de commented Jan 29, 2018

Thanks for reporting the issue. The connector got stuck in the SSL handshake attempting to read more data from the socket. Looks like a bug.

mp911de added a commit that referenced this issue Jan 29, 2018
SSL handshake is now guarded against getting stuck because of an invalid socket state by applying the connect timeout. SSL handshake is also canceled if the socket got disconnected in during the handshake.
@mp911de mp911de added this to the logstash-gelf 1.11.2 milestone Jan 29, 2018
@mp911de
Copy link
Owner

mp911de commented Jan 29, 2018

That's fixed now. You can give it a spin with 1.11.2-SNAPSHOT.

@mp911de mp911de closed this as completed Jan 29, 2018
@rbjorklin
Copy link
Author

Looks like it's working fine! Any plans for a new patch release soonish?

@mp911de
Copy link
Owner

mp911de commented Jan 31, 2018

Thanks for trying snapshots. I think I'll do a release today.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants