Handle Request Entity Too Large errors in ElasticSearchOutput #7071

Merged
bernd merged 9 commits into master from issue-5091 on Jan 10, 2020

Member

mpfz0r commented Jan 2, 2020

If we try to bulk index a batch of messages that exceeds the
Elasticsearch `http.max_content_length` setting (default 100MB),
Elasticsearch will respond with an HTTP 413 Entity Too Large error.

In this case we retry the request by splitting the message batch
in half.
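
A minimal sketch of the halving retry described above, assuming a hypothetical `bulkSend` callback that throws a hypothetical `TooLargeException` when the server answers with HTTP 413. None of these names are taken from the Graylog code base; the actual change may be structured differently.

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: all names here are hypothetical, not Graylog's API.
final class SplitRetrySketch {
    /** Stand-in for the failure signalled by an HTTP 413 response. */
    static final class TooLargeException extends RuntimeException {
        TooLargeException(String message) {
            super(message);
        }
    }

    /**
     * Tries to send the whole batch in one request. If the request is
     * rejected as too large, splits the batch in half and retries each half.
     */
    static <T> void sendWithSplit(List<T> batch, Consumer<List<T>> bulkSend) {
        if (batch.isEmpty()) {
            return;
        }
        try {
            bulkSend.accept(batch);
        } catch (TooLargeException e) {
            if (batch.size() == 1) {
                // A single message above the limit cannot be split further.
                throw e;
            }
            int mid = batch.size() / 2;
            sendWithSplit(batch.subList(0, mid), bulkSend);
            sendWithSplit(batch.subList(mid, batch.size()), bulkSend);
        }
    }
}
```

The sketch assumes a rejected request indexes nothing, so retrying both halves cannot duplicate messages.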

When responding with an HTTP 413 error, the server is allowed to close the connection
immediately. This means that our HTTP client (Jest) will simply report
an IOException (Broken pipe) instead of the actual error.
This can be avoided by sending the request with an `Expect: 100-continue`
header, which also avoids sending data that would be discarded anyway.
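
To illustrate the header in isolation, here is a sketch that uses the JDK's built-in `java.net.http` client (Java 11+) rather than Jest, which this PR actually configures. The class name, the endpoint URL in the usage note, and the `useExpectContinue` flag are assumptions made for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: uses the JDK HTTP client, not Jest.
final class ExpectContinueSketch {
    /**
     * Builds a bulk request. With expectContinue enabled, the client sends
     * the headers first and waits for "100 Continue" before sending the
     * body, so a 413 can arrive before any payload is transmitted.
     */
    static HttpRequest bulkRequest(URI bulkEndpoint, String ndjsonBody, boolean useExpectContinue) {
        return HttpRequest.newBuilder(bulkEndpoint)
                .header("Content-Type", "application/x-ndjson")
                .expectContinue(useExpectContinue)
                .POST(HttpRequest.BodyPublishers.ofString(ndjsonBody))
                .build();
    }
}
```

For example, `ExpectContinueSketch.bulkRequest(URI.create("http://localhost:9200/_bulk"), body, true)` builds a request with the expectation enabled; passing `false` gives the plain behaviour.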

Fixes #5091
Fixes #6965

@mpfz0r mpfz0r force-pushed the issue-5091 branch from 4a88d56 to 7facbb2 Jan 3, 2020
@mpfz0r mpfz0r requested a review from bernd Jan 3, 2020
@mpfz0r mpfz0r added ready-for-review and removed in progress labels Jan 3, 2020
Member

bernd left a comment

The build failed:

[ERROR] Forbidden method invocation: org.joda.time.DateTime#now() [Constructing a DateTime without a time zone is dangerous]
[ERROR]   in org.graylog2.indexer.messages.MessagesIT (MessagesIT.java:120)
@mpfz0r mpfz0r requested a review from bernd Jan 7, 2020
@bernd bernd self-assigned this Jan 8, 2020
mpfz0r added 7 commits Dec 19, 2019
If only the messages at the end of a batch exceed the
Entity Too Large limit, we could end up duplicating
messages that were already indexed.
Thus, keep track of the already indexed offset and report it within the
EntityTooLargeException.
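
The offset tracking can be sketched as follows. The exception name follows the commit message, but its field, the `sendChunked` and `sendAll` helpers, and the `bulkSend` predicate (returning `false` on an HTTP 413) are hypothetical stand-ins, not the actual Graylog implementation.

```java
import java.util.List;
import java.util.function.Predicate;

// Illustrative only: helper names and the exception's field are assumptions.
final class OffsetRetrySketch {
    static final class EntityTooLargeException extends RuntimeException {
        /** Number of messages indexed before the rejected chunk. */
        final int indexedSuccessfully;

        EntityTooLargeException(int indexedSuccessfully) {
            super("Request entity too large after " + indexedSuccessfully + " messages");
            this.indexedSuccessfully = indexedSuccessfully;
        }
    }

    /** Sends messages in chunks; bulkSend returns false when a chunk is rejected. */
    static <T> void sendChunked(List<T> messages, int chunkSize, Predicate<List<T>> bulkSend) {
        int offset = 0;
        while (offset < messages.size()) {
            int end = Math.min(offset + chunkSize, messages.size());
            if (!bulkSend.test(messages.subList(offset, end))) {
                // Report how far we got so the caller does not resend earlier chunks.
                throw new EntityTooLargeException(offset);
            }
            offset = end;
        }
    }

    /** Retries with half the chunk size, resuming after the already indexed messages. */
    static <T> void sendAll(List<T> messages, Predicate<List<T>> bulkSend) {
        int offset = 0;
        int chunkSize = messages.size();
        while (offset < messages.size()) {
            try {
                sendChunked(messages.subList(offset, messages.size()), chunkSize, bulkSend);
                return;
            } catch (EntityTooLargeException e) {
                if (chunkSize == 1) {
                    throw e; // a single message exceeds the limit
                }
                offset += e.indexedSuccessfully; // skip what is already indexed
                chunkSize = Math.max(1, chunkSize / 2);
            }
        }
    }
}
```

Without the reported offset, a retry would start again at the beginning of the batch and resend the chunks that had already succeeded.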
@mpfz0r mpfz0r force-pushed the issue-5091 branch from 9d4a0e1 to 69edda2 Jan 10, 2020
@mpfz0r mpfz0r requested a review from bernd Jan 10, 2020
bernd approved these changes Jan 10, 2020
Member

bernd left a comment

LGTM and my tests have been successful. Good job! 👍

@bernd bernd merged commit 085930a into master Jan 10, 2020
4 checks passed
ci-web-linter: Jenkins build graylog-pr-linter-check 5848 has succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
graylog-project/pr: Jenkins build graylog-project-pr-snapshot 7824 has succeeded
license/cla: Contributor License Agreement is signed.
@bernd bernd deleted the issue-5091 branch Jan 10, 2020
bernd added a commit that referenced this pull request Jan 10, 2020
* Move JestClient execution with RequestConfig into JestUtils
* Please the forbiddenapis checker
* Correctly handle batches with unevenly sized messages
  If only the messages at the end of a batch exceed the
  Entity Too Large limit, we could end up duplicating
  messages that were already indexed.
  Thus, keep track of the already indexed offset and report it within the
  EntityTooLargeException.
* Make use of Expect: 100-continue header configurable

(cherry picked from commit 085930a)
mpfz0r added a commit that referenced this pull request Jan 13, 2020
…#7148)

* Handle Request Entity Too Large errors in ElasticSearchOutput (#7071)

(cherry picked from commit 085930a)

* Adapt MessagesIT to old IT framework

Also change test to run with a specific index

* Skip memory intensive MessagesIT tests on Travis

* use getenv not getProperty

Co-authored-by: Marco Pfatschbacher <marco@graylog.com>