Unable to send data to elasticsearch due to error - 'net/http: request canceled' #22

Closed
krishan-agrawal-guavus opened this issue Feb 8, 2019 · 7 comments

@krishan-agrawal-guavus

krishan-agrawal-guavus commented Feb 8, 2019

Error observed in prometheus beat logs:
2019-02-07T13:40:56.848Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":140,"time":141},"total":{"ticks":530,"time":539,"value":530},"user":{"ticks":390,"time":398}},"info":{"ephemeral_id":"16da06d7-11fa-4512-93f0-5fe4eb48ca2b","uptime":{"ms":180010}},"memstats":{"gc_next":22617376,"memory_alloc":18169120,"memory_total":38017664,"rss":1413120}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":10168,"published":1688,"total":1688}}},"system":{"load":{"1":2.12,"15":2.42,"5":2.29,"norm":{"1":0.0883,"15":0.1008,"5":0.0954}}}}}}
2019-02-07T13:39:29.731Z ERROR elasticsearch/client.go:299 Failed to perform any bulk index operations: Post http://192.168.194.143:9200/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-02-07T13:39:30.732Z ERROR pipeline/output.go:92 Failed to publish events: Post http://192.168.194.143:9200/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

Error observed in prometheus:
Feb 08 05:29:50 miq-sap-prod-mgt-01.gvs.ggn prometheus[16515]: level=warn ts=2019-02-08T05:29:50.593010061Z caller=queue_manager.go:531 component=remote queue=0:http://192.168.162.71:10080/prometheus msg="Error sending samples to remote storage" count=20 err="context deadline exceeded"
Feb 08 05:29:53 miq-sap-prod-mgt-01.gvs.ggn prometheus[16515]: level=warn ts=2019-02-08T05:29:53.342480882Z caller=queue_manager.go:230 component=remote queue=0:http://192.168.162.71:10080/prometheus msg="Remote storage queue full, discarding sample. Multiple subsequent messages of this kind may be suppressed."

Prometheus beat logs in debug mode:
2019-02-08T05:37:47.903Z DEBUG [event] common/event.go:55 Dropped nil value from event where key=name
2019-02-08T05:37:47.903Z DEBUG [event] common/event.go:55 Dropped nil value from event where key=tags
2019-02-08T05:37:47.904Z DEBUG [publish] pipeline/processor.go:275 Publish event: {
  "@timestamp": "2019-02-08T05:37:47.780Z",
  "@metadata": {
    "beat": "prometheusbeat",
    "type": "doc",
    "version": "6.2.4"
  },
  "value": 2.147483648e+09,
  "labels": {
    "name": "jvm_memory_bytes_max",
    "environment": "****",
    "instance": "",
    "job": "kubernetes-pods"
  },
  "beat": {
    "version": "6.2.4",
    "name": "",
    "hostname": "*****"
  }
}

ES version is also 6.2.4

Can you help with resolving the issue?

@boernd
Contributor

boernd commented Feb 8, 2019

Are you able to connect to Elasticsearch in general from within the prometheusbeat container (e.g. issuing curl -XGET http://<ip>:9200/_cluster/health?pretty)?
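
For reference, a minimal sketch of that kind of check, run from the host where prometheusbeat runs; the ES address is taken from the error logs above and may need adjusting:

  # Basic reachability and latency check against the ES endpoint from the beat's host
  # (IP taken from the error logs above; adjust to your setup).
  curl -s -o /dev/null -w 'HTTP %{http_code} in %{time_total}s\n' \
    'http://192.168.194.143:9200/_cluster/health'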

@krishan-agrawal-guavus
Author

krishan-agrawal-guavus commented Feb 8, 2019

The prometheus beat is running as a process. ES is up, and other beats are able to send data from the same node.

{
"cluster_name" : "my-application",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 192,
"active_shards" : 192,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 191,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 50.13054830287206
}

@boernd
Contributor

boernd commented Feb 8, 2019

Maybe ES is overloaded; you can check the thread pools: curl -XGET localhost:9200/_cat/thread_pool?v
The write thread pool is the one that is used when prometheusbeat sends the metrics via the _bulk API.

Any suspicious logs in ES itself? Garbage collection logs?
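
A hedged sketch of that thread pool check; note that on ES 6.2 the pool handling _bulk requests is still named bulk, while newer versions call it write:

  # Illustrative only: show the bulk/write thread pool and watch the "rejected" column;
  # a growing rejected count would indicate ES cannot keep up with indexing.
  curl -s -XGET 'localhost:9200/_cat/thread_pool?v&h=node_name,name,active,queue,rejected' \
    | grep -E 'node_name|bulk|write'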

@boernd
Contributor

boernd commented Feb 8, 2019

Is there no prometheusbeat data at all in ES? curl -XGET localhost:9200/_cat/indices | grep prometheusbeat

@krishan-agrawal-guavus
Author

I am able to push data into the same Elasticsearch instance with the same prometheusbeat config from another setup.
The only difference between the Prometheus instances of the two setups is that pushgateway is enabled in the non-working setup.

Can that cause the issue?

@krishan-agrawal-guavus
Author

I am not observing any suspicious logs on the ES side.

@boernd
Contributor

boernd commented Feb 8, 2019

I am able to push data into the same Elasticsearch instance with the same prometheusbeat config from another setup.
The only difference between the Prometheus instances of the two setups is that pushgateway is enabled in the non-working setup.

Can that cause the issue?

I'm not that familiar with the pushgateway, but the prometheusbeat logs rather indicate that it cannot connect to ES and send data. So it looks like the issue is on the ES side or in the network between prometheusbeat and ES.
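
One way to narrow that down, sketched below, is to send a tiny test document through the same _bulk endpoint directly from the host running prometheusbeat; the endpoint is taken from the error logs above, and the index name is made up for the test:

  # Illustrative only: a single-document bulk request against the same endpoint the beat uses.
  # A long time_total (or a timeout) here would point at ES or the network, not at prometheusbeat.
  curl -s -o /dev/null -w 'HTTP %{http_code} in %{time_total}s\n' \
    -H 'Content-Type: application/x-ndjson' \
    -XPOST 'http://192.168.194.143:9200/_bulk' \
    --data-binary $'{"index":{"_index":"prometheusbeat-test","_type":"doc"}}\n{"probe":1}\n'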

@boernd boernd closed this as completed Jun 24, 2019