
Clearing a scroll occasionally raises a 'too_long_frame_exception' #141

Closed

JAndritsch opened this issue Nov 23, 2020 · 7 comments

@JAndritsch

I am using Logstash to pull documents from Elasticsearch via a cron schedule (see config below). The process to close the scrolls once they're done sometimes results in a too_long_frame_exception error, leaving the scroll(s) open until they time out.

  • Version/Operating System: Using Docker image from docker.elastic.co/logstash/logstash:7.9.2
  • Config File:
input {
	elasticsearch {
		hosts => "elasticsearch"
		index => "log-*"
		query => '{ "terminate_after": 12000, "query": { "bool": { "must_not": { "exists": { "field": "transferred" } } } } }'
		size => 3000
		schedule => "* * * * *"
		scroll => "5m"
		docinfo => true
	}
	elasticsearch {
		hosts => "elasticsearch"
		index => "metric-*"
		query => '{ "terminate_after": 12000, "query": { "bool": { "must_not": { "exists": { "field": "transferred" } } } } }'
		size => 3000
		schedule => "* * * * *"
		scroll => "5m"
		docinfo => true
	}
	elasticsearch {
		hosts => "elasticsearch"
		index => "alarm-*"
		query => '{ "terminate_after": 12000, "query": { "bool": { "must_not": { "exists": { "field": "transferred" } } } } }'
		size => 3000
		schedule => "* * * * *"
		scroll => "5m"
		docinfo => true
	}
}
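
For context, this is roughly the scroll lifecycle each scheduled run drives against Elasticsearch. It's only a sketch using the elasticsearch-ruby client, not the plugin's actual code; the host, index, size, and query are copied from my config above, and everything else is a placeholder:

require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://elasticsearch:9200')

# Open a scroll: the initial search returns the first page of hits plus a scroll ID.
response = client.search(
  index:  'log-*',
  scroll: '5m',
  size:   3000,
  body:   { query: { bool: { must_not: { exists: { field: 'transferred' } } } } }
)
scroll_id = response['_scroll_id']

# Page through the remaining hits until a page comes back empty.
while response.dig('hits', 'hits')&.any?
  response  = client.scroll(body: { scroll_id: scroll_id }, scroll: '5m')
  scroll_id = response['_scroll_id']
end

# Clear the scroll so Elasticsearch can free the search context.
# This is the step that sometimes fails for me.
client.clear_scroll(scroll_id: scroll_id)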

Below is an example of the error message thrown by Logstash when it tries to close a scroll and fails:

[2020-11-12T21:00:00,950][WARN ][logstash.inputs.elasticsearch][9474ce0fd20e69301dbd6f96eaf8f719e7359fd74546e845755fa76aca96dfff] Ignoring clear_scroll exception {:message=>"[400] {\"error\":{\"root_cause\":[{\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"}],\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"},\"status\":400}"},
[2020-11-12T21:01:00,058][WARN ][logstash.inputs.elasticsearch][9474ce0fd20e69301dbd6f96eaf8f719e7359fd74546e845755fa76aca96dfff] Ignoring clear_scroll exception {:message=>"[400] {\"error\":{\"root_cause\":[{\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"}],\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"},\"status\":400}"},
[2020-11-12T21:02:00,200][WARN ][logstash.inputs.elasticsearch][9474ce0fd20e69301dbd6f96eaf8f719e7359fd74546e845755fa76aca96dfff] Ignoring clear_scroll exception {:message=>"[400] {\"error\":{\"root_cause\":[{\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"}],\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"},\"status\":400}"},

I'm not sure what the exact steps are to reproduce this, aside from opening a bunch of scrolls and seeing if they all close properly.

@juan-domenech

I found the same issue while trying to use Logstash (7.10.1) to export documents from Elasticsearch 5.6 into Elasticsearch 7.10 for migration purposes.

This plugin constantly stopped after hitting the limit on how many bytes a URL can have. I tried changing the number of documents pulled in each request, but to no avail.

[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Connection: Keep-Alive
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Authorization: Basic **REDACTED**
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Content-Type: application/json
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Content-Length: 0
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Host: hostname:9443
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> User-Agent: Manticore 0.7.0
[DEBUG] 2021-01-28 15:50:46.308 [[main]<elasticsearch] headers - http-outgoing-0 >> Accept-Encoding: gzip,deflate
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "DELETE /_search/scroll/**REDACTED**4369Bytes** HTTP/1.1[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Authorization: Basic **REDACTED**[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Content-Type: application/json[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Content-Length: 0[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Host: hostnamet:9443[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "User-Agent: Manticore 0.7.0[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "Accept-Encoding: gzip,deflate[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "HTTP/1.1 400 Bad Request[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "Date: Thu, 28 Jan 2021 15:50:46 GMT[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "Server: Apache[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "content-type: application/json; charset=UTF-8[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "content-length: 211[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "Connection: close[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "[\r][\n]"
[DEBUG] 2021-01-28 15:50:46.313 [[main]<elasticsearch] wire - http-outgoing-0 << "{"error":{"root_cause":[{"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."}],"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."},"status":400}"
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << HTTP/1.1 400 Bad Request
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << Date: Thu, 28 Jan 2021 15:50:46 GMT
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << Server: Apache
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << content-type: application/json; charset=UTF-8
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << content-length: 211
[DEBUG] 2021-01-28 15:50:46.314 [[main]<elasticsearch] headers - http-outgoing-0 << Connection: close

Workaround:
Instead of pulling with Logstash 7.x, I'm running Logstash 5.6, getting documents from Elasticsearch 5.6 and indexing them into 7.10. This works fine.

@Dani9814

I have the same error. I am using version 7.12.

[2021-06-28T20:34:00,424][WARN ][logstash.inputs.elasticsearch][main][7a954ab2ef611db5ce0e3a2fd1dcb214cd1202eeebdaaccfcd90f30b6a87be7f] Ignoring clear_scroll exception {:message=>"[400] {"error":{"root_cause":[{"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."}],"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."},"status":400}"}

Does anyone know how to see the HTTP request that Logstash is sending?

@juan-domenech

Does anyone know how to see the HTTP request that Logstash is sending?

@Dani9814 You can run Logstash in debug mode like I did (see my message above).
Example from command line (your system paths might be different):
/usr/share/logstash/bin/logstash --path.settings /etc/logstash --log.level debug

Logstash is going over the limit (in the output above I've substituted the real URL with "**REDACTED**4369Bytes**"):

[DEBUG] 2021-01-28 15:50:46.309 [[main]<elasticsearch] wire - http-outgoing-0 >> "DELETE /_search/scroll/**REDACTED**4369Bytes** HTTP/1.1[\r][\n]"

@Dani9814

Thanks @juan-domenech

I was able to see the request, and I saw the same thing as you: the delete-scroll request has a huge scroll ID in the URL, and because of this the URL is more than 4096 bytes long.

I was searching for how to solve it and I found this post: https://discuss.elastic.co/t/ridiculously-long-scroll-id/12913/3. In that thread they say the scroll ID should go in the body, not in the URL, so I think this is a bug. The odd part is that currently I only get this message for one index.
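
For reference, this is what the body-based form described in that thread would look like, as a rough sketch with plain Net::HTTP (the host and the scroll ID are placeholders). With the ID in the JSON body, the request line is just "DELETE /_search/scroll HTTP/1.1" no matter how long the ID is:

require 'net/http'
require 'json'

# Placeholder scroll ID; real ones can be several KB long.
scroll_id = 'DXF1ZXJ5QW5kRmV0Y2gB...'

uri = URI('http://elasticsearch:9200/_search/scroll')
req = Net::HTTP::Delete.new(uri, 'Content-Type' => 'application/json')
req.body = { scroll_id: scroll_id }.to_json

res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
puts res.code # "200" once the scroll is cleared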

I think I am going to test Logstash v5.6.

@rgmz
Contributor

rgmz commented Oct 11, 2021

I'm encountering this issue as well, using the same Docker image as @JAndritsch.

The error is caused by Logstash sending a DELETE request with an absurdly long id (7,676 characters) in the request path:

[DEBUG][org.apache.http.headers  ][redacted-pipeline-id][REDACTED] http-outgoing-10 >> DELETE /_search/scroll/REDACTED_ID_THAT_IS_7000+_CHARS_LONG HTTP/1.1
[DEBUG][org.apache.http.headers  ][redacted-pipeline-id][REDACTED] http-outgoing-10 << HTTP/1.1 400 Bad Request
[WARN ][logstash.inputs.elasticsearch][redacted-pipeline-id][REDACTED] Ignoring clear_scroll exception {:message=>"[400] {\"error\":{\"root_cause\":[{\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"}],\"type\":\"too_long_frame_exception\",\"reason\":\"An HTTP line is larger than 4096 bytes.\"},\"status\":400}"}

As mentioned, the fix is to send the scroll ID in the body instead of the URL. I'll poke around to see if there's an easy fix in this repository.

I was searching for how to solve it and I found this post: https://discuss.elastic.co/t/ridiculously-long-scroll-id/12913/3. In that thread they say the scroll ID should go in the body, not in the URL.

@rgmz
Contributor

rgmz commented Oct 11, 2021

As mentioned, the fix is to send the scroll ID in the body instead of the URL. I'll poke around to see if there's an easy fix in this repository.

The current usage of clear_scroll is deprecated [1][2]:

def clear_scroll(scroll_id)
  @client.clear_scroll(scroll_id: scroll_id) if scroll_id
rescue => e
  # ignore & log any clear_scroll errors
  logger.warn("Ignoring clear_scroll exception", message: e.message, exception: e.class)
end

Passing the scroll_id in the body instead should fix this issue:

 def clear_scroll(scroll_id) 
-   @client.clear_scroll(scroll_id: scroll_id) if scroll_id 
+   @client.clear_scroll(body: { :scroll_id => scroll_id }) if scroll_id 
 rescue => e 
   # ignore & log any clear_scroll errors 
   logger.warn("Ignoring clear_scroll exception", message: e.message, exception: e.class) 
 end 
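
As a quick sanity check, the patched call can be exercised in isolation. This is only a sketch, not part of the actual patch; the URL and the scroll ID below are made up. With the ID in the body, even a multi-kilobyte ID keeps the request line well under the 4096-byte default:

require 'elasticsearch'

client    = Elasticsearch::Client.new(url: 'http://localhost:9200')
scroll_id = 'D' * 8_000 # stand-in for a 7,000+ character scroll ID

begin
  client.clear_scroll(body: { scroll_id: scroll_id })
rescue => e
  # A bogus ID now fails (if at all) in Elasticsearch's body parsing,
  # not at the HTTP request-line length limit (too_long_frame_exception).
  warn "#{e.class}: #{e.message}"
end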

jsvd pushed a commit to jsvd/logstash-input-elasticsearch that referenced this issue Oct 15, 2021
This should resolve the `too_long_frame_exception` issue described in logstash-plugins#141.
jsvd pushed a commit that referenced this issue Oct 15, 2021
This should resolve the `too_long_frame_exception` issue described in #141.
@roaksoax

Closing this issue as it has been fixed by #159 and released in logstash-input-elasticsearch 4.12.1
