Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent closing of UDP channel in input for messages with bogus GELF header. #6263

Merged
merged 1 commit into from Aug 9, 2019

Conversation

@dennisoelkers
Copy link
Member

commented Aug 8, 2019

Description

Motivation and Context

Before this change, whenever a UDP message came in with a 0 or 1 byte header (smaller than the 2 byte GELF header), the thread handling this message throws an exception in GELFMessage#getGELFType which is caught in EnvelopeMessageHandler#exceptionCaught. In there, the channel is closed. This is correct for a TCP-based input, as the corresponding TCP-connection would be closed, as each channel corresponds to a TCP connection. This is not the case for UDP pipelines. Closing a channel of a UDP pipeline leads to the handler/thread losing the overall connection to the socket multiplexing. If this happens for a number of times equaling the number of working threads, the whole input stops processing any UDP messages as there is no thread left which is able to receive channel events.

This change adds EnvelopeMessageAggregationHandler#exceptionCaught which is only logging the exception and increasing the counter for invalid GELF chunks, without propagating the exception to other handlers (preventing the channel closure).

A remaining issue is that for UDP messages with no/invalid GELF headers, the EnvelopeMessageAggregationHandler#channelRead0 is called four times, for correct messages only once. The first three times it is called with a single byte buffer, the last time with a buffer containing the complete message. It is unclear to me why this happens and leads to the exception being logged four times, but this seems to be still way better than causing situations where the complete input ceases to work. Any insight how to fix this is welcome.

Fixes #5701.

How Has This Been Tested?

  • Send 12 messages (actual count depends on the number of configured threads) with no header to the GELF UDP port, e.g. like this:
          for i in `seq 1 12`; do nc -vzu localhost 12201; done
    
  • Send proper GELF UDP message:
      echo '{"version": "1.1","host":"example.org","short_message":"A short message that helps you identify what is going on","full_message":"Backtrace here\n\nmore stuff","level":1,"_user_id":9001,"_some_info":"foo","_some_env_var":"bar"}' | gzip | nc -u -w 1 localhost 12201
    
  • Ensure that message was processed and indexed

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
Prevent closing of UDP channel in input for messages with bogus GELF …
…header.

Before this change, whenever a UDP message came in with a 0 or 1 byte
header (smaller than the 2 byte GELF header), the thread handling this
message throws an exception in `GELFMessage#getGELFType` which is caught
in `EnvelopeMessageHandler#exceptionCaught`. In there, the channel is
closed. This is correct for a TCP-based input, as the corresponding
TCP-connection would be closed, as each channel corresponds to a TCP
connection. This is not the case for UDP pipelines. Closing a channel of
a UDP pipeline leads to the handler/thread losing the overall connection
to the socket multiplexing. If this happens for a number of times
equaling the number of working threads, the whole input stops processing
any UDP messages as there is no thread left which is able to receive
channel events.

This change adds `EnvelopeMessageAggregationHandler#exceptionCaught`
which is only logging the exception and increasing the counter for
invalid GELF chunks, without propagating the exception to other handlers
(preventing the channel closure).

A remaining issue is that for UDP messages with no/invalid GELF headers,
the `EnvelopeMessageAggregationHandler#channelRead0` is called four
times, for correct messages only once. The first three times it is
called with a single byte buffer, the last time with a buffer containing
the complete message. It is unclear to me why this happens and leads to
the exception being logged four times, but this seems to be still way
better than causing situations where the complete input ceases to work.
Any insight how to fix this is welcome.

Fixes #5701.

@dennisoelkers dennisoelkers added this to the 3.1.0 milestone Aug 8, 2019

@dennisoelkers dennisoelkers requested a review from bernd Aug 8, 2019

@bernd bernd self-assigned this Aug 8, 2019

@bernd
bernd approved these changes Aug 9, 2019
Copy link
Member

left a comment

Thanks for the investigation! Good work! 👍

The issue with calling EnvelopeMessageAggregationHandler#channelRead0() four times seems to be related to the nc command you are using for testing. On my Linux machine I actually see it being called 5 times. When using ncat though, I only see a single log message. (printf "\0" | ncat -uv 127.0.0.1 12201)

So I would say it's actually not an issue in the netty stack but some behavior in the necat tool.

@bernd bernd merged commit 2042b29 into master Aug 9, 2019

4 checks passed

ci-web-linter Jenkins build graylog-pr-linter-check 4082 has succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
graylog-project/pr Jenkins build graylog-project-pr-snapshot 5110 has succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@bernd bernd deleted the issue-5701 branch Aug 9, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.