ElasticsearchWriter gives "Unexpected response code 400" with Elasticsearch 6.x #5795

Closed
BjoernRobbe opened this issue Nov 23, 2017 · 32 comments · Fixed by #5840
Labels: area/elastic (Events to Elasticsearch), bug (Something isn't working)

Comments

@BjoernRobbe

Using Elasticsearch 6.0.0, I get error code 400 from the elasticsearch feature.

My elasticsearch.conf:

library "perfdata"

object ElasticsearchWriter "elasticsearch" {
  host = "srv-log02.example.com"
  port = 9200

  index = "icinga2"

  enable_send_perfdata = false

  flush_threshold = 1024
  flush_interval = 10s
}

Enabling debuglog shows:

 [2017-11-23 15:09:43 +0100] information/ElasticsearchWriter: 'elasticsearch' started.
 [2017-11-23 15:09:43 +0100] notice/WorkQueue: Spawning WorkQueue threads for 'ElasticsearchWriter, elasticsearch'
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:43 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:46 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:46 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:47 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:47 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:47 +0100] debug/ElasticsearchWriter: Processing notification for 'icinga1.example.com!cluster-ha'
 [2017-11-23 15:09:47 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:47 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:50 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:51 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:51 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:52 +0100] debug/ElasticsearchWriter: Add to fields to message list: '...
 [2017-11-23 15:09:52 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 0, rate: 0.266667/s (16/min 16/5min 16/15min);
 [2017-11-23 15:09:53 +0100] debug/ElasticsearchWriter: Timer expired writing 16 data points
 [2017-11-23 15:09:53 +0100] notice/ElasticsearchWriter: Connecting to Elasticsearch on host 'srv-log02.example.com' port '9200'.
 [2017-11-23 15:09:53 +0100] debug/ElasticsearchWriter: Sending POST request to 'http://srv-log02.example.com:9200/icinga2-2017.11.23/_bulk'.
 [2017-11-23 15:09:53 +0100] warning/ElasticsearchWriter: Unexpected response code 400
 [2017-11-23 15:10:02 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 5, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 9 seconds
 [2017-11-23 15:10:12 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 10, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 19 seconds
 [2017-11-23 15:10:22 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 19, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 21 seconds
 [2017-11-23 15:10:32 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 25, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 41 seconds
 [2017-11-23 15:10:42 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 31, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 51 seconds
 [2017-11-23 15:10:52 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 35, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 1 minute and 27 seconds
 [2017-11-23 15:11:02 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 41, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 1 minute and 8 seconds
 [2017-11-23 15:11:12 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 46, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 1 minute and 31 seconds
 [2017-11-23 15:11:22 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 53, rate: 0.266667/s (16/min 16/5min 16/15min); empty in 1 minute and 15 seconds

Reading the docs shows:

Note
Elasticsearch 5.x+ is required.

Does this feature work on Elasticsearch 6.0.0?

BTW, my first try was with nginx in front of Elasticsearch, but after I got the HTTP error 400, I stopped nginx and put Elasticsearch directly on the LAN.

My Environment

  • Version used (icinga2 --version): icinga2 - The Icinga 2 network monitoring daemon (version: r2.8.0-1)
  • Operating System and version: CentOS 6.9
  • Enabled features (icinga2 feature list): api checker elasticsearch graphite ido-mysql livestatus mainlog notification perfdata
  • Elasticsearch version: 6.0.0
@Crunsher (Contributor)

It should work; I see nothing in the breaking changes that should affect the ElasticsearchWriter. But when we implemented this feature, there was no Elasticsearch 6 yet, so it wasn't tested with that version.

@BjoernRobbe (Author)

BjoernRobbe commented Nov 23, 2017

Dear Crunsher,

thanks for the reply. I looked into the "breaking changes" of 6.0.0 and found this:

Content-Type Auto-detection
  In previous versions of Elasticsearch, having a proper Content-Type for the data in a request
  was not enforced. Elasticsearch 6.0.0 enforces that all requests with a body must have a
  supported Content-Type and this type will be used when parsing the data.

The Bulk API says:

NOTE: the final line of data must end with a newline character \n. Each newline character may
be preceded by a carriage return \r. When sending requests to this endpoint the Content-Type
header should be set to application/x-ndjson

So the Content-Type header should be set to application/x-ndjson.

Can this be the problem?
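
For illustration, a minimal request that satisfies both requirements would look like this (a sketch only; the host is from my config above, the payload and file name are hypothetical):

# The heredoc guarantees the trailing newline the Bulk API requires.
cat > bulk.ndjson <<'EOF'
{ "index" : { "_index" : "icinga2-2017.11.23", "_type" : "icinga2.event.checkresult" } }
{ "@timestamp" : "2017-11-23T15:09:43+0100", "host" : "icinga1.example.com" }
EOF
# --data-binary preserves that newline; plain -d/--data would strip it.
curl -s -X POST 'http://srv-log02.example.com:9200/_bulk' \
     -H 'Content-Type: application/x-ndjson' \
     --data-binary @bulk.ndjson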

@lukasertl

The same error happens on ES 5.6, too, so this is not ES 6.0 specific.

@gunnarbeutner gunnarbeutner added bug Something isn't working area/elastic Events to Elasticsearch labels Nov 24, 2017
@Crunsher (Contributor)

@BjoernRobbe Oh, I overlooked this; we are using application/json.

When developing, I was using 5.4.

@BjoernRobbe (Author)

@Crunsher Looking at the docs for the Bulk API, this Content-Type was introduced in version 5.3. So, if version 5.6 has this problem too, what has changed between version 5.4 and version 5.6?

@Crunsher (Contributor)

¯\_(ツ)_/¯

I'll investigate once I'm on a better connection and can update my Elastic Vagrant box.

@dnsmichi (Contributor)

5.6.x works for me; I did the final release tests for 2.8 inside the Vagrant box.

[root@icinga2-elastic ~]# curl http://localhost:9200
{
  "name" : "icinga2-elastic-elastic-es",
  "cluster_name" : "elastic",
  "cluster_uuid" : "mONmyI7STiKoiNn4lz7Aag",
  "version" : {
    "number" : "5.6.4",
    "build_hash" : "8bbedf5",
    "build_date" : "2017-10-31T18:55:38.105Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}

@dnsmichi (Contributor)

Content-Type: application/json should still be supported; according to this issue, it is unclear whether the new content type is really the way to go.

elastic/elasticsearch#25718

rsyslog reverted the content-type change in rsyslog/rsyslog#1743

I don't think that this is related to this issue then. Still, proper tests with v6 are required; that's to be done in #5801 ... I would guess this issue is about something "different".

It might be related to incorrect parsing of the response body. @gunnarbeutner told me about this when looking into the InfluxDBWriter code.

Still, I'd like to see the Elasticsearch log when the status code 400 is returned.

@dnsmichi dnsmichi added the needs feedback We'll only proceed once we hear from you again label Nov 27, 2017
@BjoernRobbe (Author)

@dnsmichi I set up a dev ES 6 and enabled debug logging; the log shows:

[2017-11-28T07:59:38,872][DEBUG][r.suppressed             ] path: /icinga2-2017.11.28/_bulk, params: {index=icinga2-2017.11.28}
java.lang.IllegalArgumentException: The bulk request must be terminated by a newline [
]
        at org.elasticsearch.action.bulk.BulkRequest.findNextMarker(BulkRequest.java:522) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.action.bulk.BulkRequest.add(BulkRequest.java:389) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.rest.action.document.RestBulkAction.prepareRequest(RestBulkAction.java:94) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:80) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:240) [elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:336) [elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174) [elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.http.netty4.Netty4HttpServerTransport.dispatchRequest(Netty4HttpServerTransport.java:491) [transport-netty4-6.0.0.jar:6.0.0]
        ...
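
For reference, the 400 can be reproduced against any 6.x node (a sketch, assuming one listens on localhost:9200) simply by omitting the trailing newline:

# The body lacks the final \n, so 6.x answers with HTTP 400 and the
# "must be terminated by a newline" error shown above.
printf '{"index":{"_index":"test","_type":"doc"}}\n{"value":1}' \
  | curl -s -X POST 'http://localhost:9200/_bulk' \
         -H 'Content-Type: application/x-ndjson' --data-binary @-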

Hope that helps.

@dnsmichi (Contributor)

Gotcha. Thanks for the heads up, I was expecting something like that.

@dnsmichi dnsmichi changed the title ElasticsearchWriter gives "Unexpected response code 400" ElasticsearchWriter gives "Unexpected response code 400" with Elasticsearch 6.x Nov 28, 2017
@dnsmichi dnsmichi self-assigned this Nov 28, 2017
@dnsmichi dnsmichi removed the needs feedback We'll only proceed once we hear from you again label Nov 28, 2017
@dnsmichi dnsmichi added this to the 2.8.1 milestone Nov 28, 2017
@dnsmichi (Contributor)

dnsmichi commented Dec 5, 2017

It is kind of a ride to bring the components together for Elastic Stack 6.x inside Puppet 4, but now I see it too with a local version of Icinga/icinga-vagrant#56.

[2017-12-05 09:06:05 -0800] debug/ElasticsearchWriter: Processing notification for 'icinga2-puppet4!random-005'
[2017-12-05 09:06:05 -0800] debug/ElasticsearchWriter: Add to fields to message list: '{"@timestamp":"2017-12-05T09:06:05.921-0800","author":"","check_command":"random","check_result.check_source":"icinga2-puppet4","check_result.command":null,"check_result.execution_end":"2017-12-05T09:06:05.921-0800","check_result.execution_start":"2017-12-05T09:06:05.920-0800","check_result.execution_time":0.00010204315185546875,"check_result.exit_status":0.0,"check_result.latency":0.0009329319000244140625,"check_result.output":"Hello from icinga2-puppet4","check_result.perfdata.time.value":1512493565.9210278988,"check_result.schedule_end":"2017-12-05T09:06:05.921-0800","check_result.schedule_start":"2017-12-05T09:06:05.920-0800","check_result.state":1.0,"check_result.vars_after":{"attempt":1.0,"reachable":true,"state":1.0,"state_type":1.0},"check_result.vars_before":{"attempt":2.0,"reachable":true,"state":1.0,"state_type":0.0},"host":"icinga2-puppet4","last_hard_state":1.0,"last_state":1.0,"notification_type":"PROBLEM","service":"random-005","state":1.0,"text":"","timestamp":"2017-12-05T09:06:05.921-0800","type":"icinga2.event.notification","users":["icingaadmin"]}'.
[2017-12-05 09:06:06 -0800] debug/ElasticsearchWriter: Add to fields to message list: '{"@timestamp":"2017-12-05T09:06:06.270-0800","check_command":"random","check_result.check_source":"icinga2-puppet4","check_result.command":null,"check_result.execution_end":"2017-12-05T09:06:06.270-0800","check_result.execution_start":"2017-12-05T09:06:06.270-0800","check_result.execution_time":3.79085540771484375e-05,"check_result.exit_status":0.0,"check_result.latency":4.41074371337890625e-05,"check_result.output":"Hello from icinga2-puppet4","check_result.perfdata.time.value":1512493566.2706019878,"check_result.schedule_end":"2017-12-05T09:06:06.270-0800","check_result.schedule_start":"2017-12-05T09:06:06.270-0800","check_result.state":2.0,"check_result.vars_after":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":0.0},"check_result.vars_before":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0},"current_check_attempt":1.0,"host":"icinga2-puppet4","last_hard_state":0.0,"last_state":0.0,"max_check_attempts":3.0,"reachable":true,"service":"random-001","state":2.0,"state_type":0.0,"timestamp":"2017-12-05T09:06:06.270-0800","type":"icinga2.event.checkresult"}'.
[2017-12-05 09:06:06 -0800] debug/ElasticsearchWriter: Add to fields to message list: '{"@timestamp":"2017-12-05T09:06:06.270-0800","check_command":"random","check_result.check_source":"icinga2-puppet4","check_result.command":null,"check_result.execution_end":"2017-12-05T09:06:06.270-0800","check_result.execution_start":"2017-12-05T09:06:06.270-0800","check_result.execution_time":3.79085540771484375e-05,"check_result.exit_status":0.0,"check_result.latency":4.41074371337890625e-05,"check_result.output":"Hello from icinga2-puppet4","check_result.perfdata.time.value":1512493566.2706019878,"check_result.schedule_end":"2017-12-05T09:06:06.270-0800","check_result.schedule_start":"2017-12-05T09:06:06.270-0800","check_result.state":2.0,"check_result.vars_after":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":0.0},"check_result.vars_before":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0},"current_check_attempt":1.0,"host":"icinga2-puppet4","last_hard_state":0.0,"last_state":0.0,"max_check_attempts":3.0,"service":"random-001","state":2.0,"timestamp":"2017-12-05T09:06:06.270-0800","type":"icinga2.event.statechange"}'.
[2017-12-05 09:06:06 -0800] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 0, rate: 0.483333/s (29/min 29/5min 29/15min);
[2017-12-05 09:06:06 -0800] debug/ElasticsearchWriter: Timer expired writing 29 data points
[2017-12-05 09:06:06 -0800] notice/ElasticsearchWriter: Connecting to Elasticsearch on host '127.0.0.1' port '9200'.
[2017-12-05 09:06:06 -0800] debug/ElasticsearchWriter: Sending POST request to 'http://127.0.0.1:9200/icinga2-2017.12.05/_bulk'.
[2017-12-05 09:06:06 -0800] warning/ElasticsearchWriter: Unexpected response code 400

@guytwena

guytwena commented Dec 6, 2017

Hi,

Any update on this issue? Is there any workaround?

Another question: I tried to send the _bulk through the Kibana "Dev Tools" and it worked. Doesn't Kibana use the REST API to POST data?

Thanks,

Guy

@dnsmichi (Contributor)

dnsmichi commented Dec 6, 2017

I don't think that there is a workaround, as you need to add a final "\n" to the sent body. I am working on this, but I still have other projects at the moment. I am testing a local fix.

Kibana likely adds the missing newline on its own. It might be the case that Kibana developers added this in 5.x already as they probably knew about this breaking change.
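
For anyone calling the Bulk API directly in the meantime, a minimal workaround sketch (hypothetical file name) that appends the missing newline before sending:

# Append a final newline only if the last byte is not one already.
[ -n "$(tail -c1 bulk.ndjson)" ] && echo >> bulk.ndjson
curl -s -X POST 'http://localhost:9200/_bulk' \
     -H 'Content-Type: application/x-ndjson' --data-binary @bulk.ndjson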

Two things for this issue:

  • if the newline solves the issue, it is mandatory to keep compatibility with 5.x intact; this might need a configuration flag if it breaks.
  • the HTTP response handling is buggy; a similar issue was fixed in the InfluxDBWriter code.

@dnsmichi (Contributor)

dnsmichi commented Dec 7, 2017

References

Newline terminator

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

NOTE: the final line of data must end with a newline character \n. Each newline character may be preceded by a carriage return \r. When sending requests to this endpoint the Content-Type header should be set to application/x-ndjson.

While researching the matter, I found that this newline terminator was expected in 5.x too. The API error handling improved in 6.x and now surfaces this as a visible error.

Could be related: elastic/elasticsearch@c898e8a#diff-09b5159da8eeeb38a2773ab951cfa4ef

It also mentions this:

+                final String lowercaseMediaType = restRequest.header("Content-Type").toLowerCase(Locale.ROOT);
 +                // we also support line-delimited JSON, which isn't official and has a few variations
 +                // http://specs.okfnlabs.org/ndjson/
 +                // https://github.com/ndjson/ndjson-spec/blob/48ea03cea6796b614cfbff4d4eb921f0b1d35c26/specification.md
 +                if (lowercaseMediaType.equals("application/x-ldjson") || lowercaseMediaType.equals("application/x-ndjson")) {
 +                    restRequest.setXContentType(XContentType.JSON);

This makes me believe that "application/json" is the current safe way to go.

Strict content-type

elastic/elasticsearch@7520a10#diff-09b5159da8eeeb38a2773ab951cfa4ef

This commit also enforces a stricter Content-Type header check. The check is turned off in 5.x but enabled in 6.x.

Further reading: https://www.elastic.co/blog/strict-content-type-checking-for-elasticsearch-rest-requests - this cannot be disabled in 6.x, so there is no possible workaround.
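
The strict checking is easy to see (a sketch against a hypothetical local 6.x node): the very same, valid bulk body goes through with application/json or application/x-ndjson, but is rejected with a content-type error when sent as anything else:

# Swap text/plain for application/json and the request succeeds;
# with text/plain, 6.x rejects the request regardless of the body.
printf '{"index":{"_index":"test","_type":"doc"}}\n{"value":1}\n' \
  | curl -s -X POST 'http://localhost:9200/_bulk' \
         -H 'Content-Type: text/plain' --data-binary @-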

So I'd say we were relying on a bug in Elasticsearch 5.x which has now been fixed. There's no way around it other than testing the snapshot packages once the coming PR is merged, and then waiting for 2.8.1.

Tests

5.x

$ curl -u icinga:icinga http://192.168.33.50:9200
{
  "name" : "icinga2-puppet4-elastic-es",
  "cluster_name" : "elastic",
  "cluster_uuid" : "u_nlGyWsR82fxrleHEG7mg",
  "version" : {
    "number" : "5.6.4",
    "build_hash" : "8bbedf5",
    "build_date" : "2017-10-31T18:55:38.105Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}

(screenshot: test output omitted)

6.x

$ curl -k -u icinga:icinga http://192.168.33.50:9200
{
  "name" : "icinga2-puppet4-elastic-es",
  "cluster_name" : "elastic",
  "cluster_uuid" : "RMvf33YWS0aq_xl5uQVkYQ",
  "version" : {
    "number" : "6.0.0",
    "build_hash" : "8f0685b",
    "build_date" : "2017-11-10T18:41:22.859Z",
    "build_snapshot" : false,
    "lucene_version" : "7.0.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

(screenshot: test output omitted)

@guytwena

guytwena commented Dec 7, 2017

At first, I tried running the curl command without success; I assume it didn't work because it was the Windows version of curl.

After your response, I added the \n in my C# code when calling the REST API, and that solved the issue.

Thanks,

Guy

dnsmichi pushed a commit that referenced this issue Dec 7, 2017
This enables compatibility with 6.x.

This commit also fixes an incorrect HTTP response parsing
which could lead into false positives.

Analysis and fix in #5795 (comment)

fixes #5795
@dnsmichi (Contributor)

dnsmichi commented Dec 7, 2017

@bobapple @lippserd FYI

dnsmichi pushed a commit that referenced this issue Dec 7, 2017
This enables compatibility with 6.x.

This commit also fixes an incorrect HTTP response parsing
which could lead into false positives.

Analysis and fix in #5795 (comment)

fixes #5795

refs #5840
@dnsmichi (Contributor)

dnsmichi commented Dec 7, 2017

Please test the snapshot packages once available.

@myownhost

myownhost commented Dec 21, 2017

@dnsmichi Thanks for the fix.

I tested the new package (version: v2.8.0-164-g9b75548) in my test environment.

A few seconds after starting or reloading the Icinga 2 service, the ElasticsearchWriter stops writing. Here is my Icinga 2 ElasticsearchWriter log:

[2017-12-21 12:57:10 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 283, rate:  0/s (0/min 717/5min 724/15min); empty in 4 minutes and 42 seconds
[2017-12-21 12:57:20 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 287, rate:  0/s (0/min 717/5min 724/15min); empty in 11 minutes and 57 seconds
[2017-12-21 12:57:30 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 300, rate:  0/s (0/min 76/5min 724/15min); empty in 3 minutes and 50 seconds
[2017-12-21 12:57:40 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 316, rate:  0/s (0/min 76/5min 724/15min); empty in 3 minutes and 17 seconds
[2017-12-21 12:57:49 +0100] information/ElasticsearchWriter: 'elasticsearch' stopped.
[2017-12-21 12:57:50 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 329, rate:  0/s (0/min 76/5min 724/15min); empty in 4 minutes and 13 seconds
[2017-12-21 12:58:00 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 335, rate:  0/s (0/min 23/5min 724/15min); empty in 9 minutes and 18 seconds
[2017-12-21 12:58:10 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 335, rate:  0/s (0/min 11/5min 724/15min); empty in -2147483648 days
[2017-12-21 12:58:19 +0100] information/ElasticsearchWriter: 'elasticsearch' started.
[2017-12-21 13:01:19 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 35, rate: 0.616667/s (37/min 149/5min 149/15min); empty in 9 seconds
[2017-12-21 13:01:29 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 66, rate: 0.05/s (3/min 149/5min 149/15min); empty in 21 seconds
[2017-12-21 13:01:39 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 69, rate: 0.05/s (3/min 149/5min 149/15min); empty in 3 minutes and 49 seconds
[2017-12-21 13:01:49 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 69, rate: 0.05/s (3/min 149/5min 149/15min); empty in -2147483648 days
[2017-12-21 13:01:59 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 69, rate: 0.05/s (3/min 149/5min 149/15min); empty in -2147483648 days
[2017-12-21 13:02:09 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 77, rate:  0/s (0/min 149/5min 149/15min); empty in 1 minute and 36 seconds
[2017-12-21 13:02:19 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 108, rate:  0/s (0/min 149/5min 149/15min); empty in 34 seconds
[2017-12-21 13:02:29 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 134, rate:  0/s (0/min 149/5min 149/15min); empty in 51 seconds
[2017-12-21 13:02:39 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 142, rate:  0/s (0/min 149/5min 149/15min); empty in 2 minutes and 57 seconds
[2017-12-21 13:02:49 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 142, rate:  0/s (0/min 149/5min 149/15min); empty in -2147483648 days
[2017-12-21 13:02:59 +0100] information/WorkQueue: #8 (ElasticsearchWriter, elasticsearch) items: 142, rate:  0/s (0/min 149/5min 149/15min); empty in -2147483648 days

cat /etc/icinga2/features-enabled/elasticsearch.conf

library "perfdata"

object ElasticsearchWriter "elasticsearch" {
  host = "x.x.x.x"
  port = 9200
  index = "test-icinga2"
  enable_send_perfdata = true
  flush_threshold = 1024
  flush_interval = 10s
}

@dnsmichi (Contributor)

Anything interesting in the debug log? Which Elasticsearch version is used?

@BjoernRobbe (Author)

Hi,
using the latest snapshot (icinga2-2.8.0.164.g9b75548) on CentOS 6, there seems to be a delay in writing the logs, and there are problems connecting to Elasticsearch. Should I blame the snapshot release?

After starting icinga2, I use

tail -f /var/log/icinga2/icinga2.log

nothing shows up (only after stopping Icinga 2)

tail -f /var/log/icinga2/debug.log

debug.log shows something after a while (as if buffered somewhere)

Using gdb (as described in the Icinga 2 docs)

gdb --args /usr/lib64/icinga2/sbin/icinga2 daemon -x debug --no-stack-rlimit

the ElasticsearchWriter connects to Elasticsearch without any problem and without the bulk API error.

It's a fresh installation on an updated CentOS6 VM.

@myownhost

myownhost commented Dec 21, 2017

Elasticsearch version 6.1.1.

The following entries are found in my debug log:

[2017-12-21 14:00:58 +0100] debug/ElasticsearchWriter: Timer expired writing 34 data points
[2017-12-21 14:00:58 +0100] notice/ElasticsearchWriter: Connecting to Elasticsearch on host 'x.x.x.x' port '9200'.
[2017-12-21 14:00:58 +0100] debug/ElasticsearchWriter: Sending POST request to 'http://x.x.x.x:9200/test-icinga2-2017.12.21/_bulk'.

@dnsmichi (Contributor)

dnsmichi commented Dec 21, 2017

@BjoernRobbe But you do get data into Elasticsearch?

@myownhost I did not test 6.1 yet; I was still on 6.0.0.

@BjoernRobbe (Author)

@dnsmichi Sorry, I just noticed that only some of the data arrived in Elasticsearch (version 6.0).

Looking deeper shows errors in the Elasticsearch log:

[2017-12-21T14:45:01,481][DEBUG][o.e.a.b.TransportShardBulkAction] [icinga2-2017.12.21][1] failed to execute bulk item (index) BulkShardRequest [[icinga2-2017.12.21][1]] containing [index {[icinga2-2017.12.21][icinga2.event.notification][3_NReWAB3ViLs-g_8Mj2], source[{"@timestamp":"2017-12-21T14:43:48.864+0100","author":"","check_command":"http","check_result.check_source":"centos6.silber.rb.de","check_result.command":["/usr/lib64/nagios/plugins/check_http","-I","127.0.0.1","-u","/"],"check_result.execution_end":"2017-12-21T14:43:48.864+0100","check_result.execution_start":"2017-12-21T14:43:48.860+0100","check_result.execution_time":0.0038650035858154296875,"check_result.exit_status":2.0,"check_result.latency":0.0009529590606689453125,"check_result.output":"connect to address 127.0.0.1 and port 80: Verbindungsaufbau abgelehnt\nHTTP CRITICAL - Konnte TCP socket nicht öffnen","check_result.schedule_end":"2017-12-21T14:43:48.864+0100","check_result.schedule_start":"2017-12-21T14:43:48.859+0100","check_result.state":2.0,"check_result.vars_after":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":1.0},"check_result.vars_before":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":1.0},"host":"centos6.silber.rb.de","last_hard_state":2.0,"last_state":2.0,"notification_type":"PROBLEM","service":"http","state":2.0,"text":"","timestamp":"2017-12-21T14:43:48.864+0100","type":"icinga2.event.notification","users":["icingaadmin"]}]}]
java.lang.IllegalArgumentException: Rejecting mapping update to [icinga2-2017.12.21] as the final mapping would have more than 1 type: [icinga2.event.notification, icinga2.event.checkresult]
        at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:494) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:350) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:282) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:311) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:640) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:270) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:195) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:130) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) ~[elasticsearch-6.0.0.jar:6.0.0]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) ~[elasticsearch-6.0.0.jar:6.0.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]

So, I think my last post was incorrect about the difference between a normal start and a gdb start. But the lag in writing log files is real.

Thanks

@BjoernRobbe (Author)

@dnsmichi
Breaking changes in 6.0:
Multiple mapping types are not supported in indices created in 6.0

https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-6.0.html
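
This matches the stack trace above; the rejection is easy to reproduce on a 6.x node (a sketch with a hypothetical index name):

# The first request auto-creates the index with one mapping type;
# the second type is then rejected ("... more than 1 type").
curl -s -X PUT 'http://localhost:9200/icinga2-test/icinga2.event.checkresult/1' \
     -H 'Content-Type: application/json' -d '{"state":0}'
curl -s -X PUT 'http://localhost:9200/icinga2-test/icinga2.event.notification/1' \
     -H 'Content-Type: application/json' -d '{"state":0}'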

@dnsmichi (Contributor)

To be honest, I have no idea what would cause multiple mapping types. I did not see that with the events passed to 6.0 either.

@BjoernRobbe (Author)

BTW, after deleting all indices in Elasticsearch and starting icinga2 "normally", nothing arrived in Elasticsearch; no error/debug log on the Elasticsearch side, nothing.

After starting icinga2 with gdb, message(s) arrived in Elasticsearch and the log shows:

[2017-12-21T15:07:44,648][INFO ][o.e.c.m.MetaDataCreateIndexService] [dev-log] [icinga2-2017.12.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []
[2017-12-21T15:07:44,809][INFO ][o.e.c.m.MetaDataMappingService] [dev-log] [icinga2-2017.12.21/lS3ygE0ORK-LLu1OCoRwDw] create_mapping [icinga2.event.notification]

In Kibana, only one message is shown:
(screenshot: Kibana message omitted)

I'm confused.

@dnsmichi (Contributor)

Please continue with the mapping problem in #5905; it is a whole new problem compared to the bulk newline issue.

@dnsmichi (Contributor)

@myownhost do you see similar issues in your Elasticsearch log?

@myownhost

@dnsmichi Yes, I do, but I see the same errors with icingabeat, and icingabeat is working fine. Hopefully this helps:

java.lang.IllegalArgumentException: Rejecting mapping update to [test-icinga2-2017.12.21] as the final mapping would have more than 1 type: [icinga2.event.checkresult, icinga2.event.statechange]

java.lang.IllegalArgumentException: Rejecting mapping update to [icingabeat-2017.12.21] as the final mapping would have more than 1 type: [icingabeat.event.statechange, icingabeat.event.checkresult]

@dnsmichi (Contributor)

@myownhost Ok then please continue in #5905 as well, and add your findings there too as we proceed.

@dnsmichi (Contributor)

Reverting this for 2.8.1, as #5905 implies breaking changes for Elasticsearch 6.

@dnsmichi dnsmichi modified the milestones: 2.8.1, 2.9.0 Dec 22, 2017
@dnsmichi (Contributor)

dnsmichi commented Dec 22, 2017

Oops, wrong ticket.
