
serilog-sinks-elasticsearch ElasticsearchLogShipper has the wrong success criteria for Bulk operations #599

Closed
jarlelin opened this issue Nov 24, 2015 · 16 comments


@jarlelin

When reading from a durable file and uploading bulk actions to the Elasticsearch server, the log shipper checks the HTTP response for a success code and, if it indicates success, writes the bookmark and moves on.

We have also had an issue where a document does successfully write to the index, but for some reason (we haven't figured out why yet) the response code does not indicate success. This leads to the log shipper continuously writing the same document to the index multiple times.

This is the relevant code that seems to be wrong (starting at line 179 in ElasticsearchLogShipper.cs):

if (count > 0)
{
    var response = _state.Client.Bulk(payload);

    if (response.Success)
    {
        WriteBookmark(bookmark, nextLineBeginsAtOffset, currentFilePath);
    }
    else
    {
        SelfLog.WriteLine("Received failed ElasticSearch shipping result {0}: {1}", response.HttpStatusCode, response.OriginalException);
    }
}

The bulk operation returns per-item success indicators in the body of the response, and it is this body that should be checked for success status, not just the HTTP status code.
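For context, a _bulk call reports per-item outcomes in the response body even when the HTTP status is 200 OK: a top-level errors flag plus a status (and, on failure, an error) per item. A response with one failed item looks roughly like this (index names, ids and the error text are illustrative; in ES 2.x the error field is a structured object rather than a string):

```
{
  "took": 30,
  "errors": true,
  "items": [
    { "index": { "_index": "logstash-2015.11.24", "_type": "logevent",
                 "_id": "AVE1", "status": 201 } },
    { "index": { "_index": "logstash-2015.11.24", "_type": "logevent",
                 "_id": "AVE2", "status": 429,
                 "error": "EsRejectedExecutionException[rejected execution ...]" } }
  ]
}
```

So the shipper would need to inspect errors (or each item's status) before writing the bookmark.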

@mivano
Member

mivano commented Nov 25, 2015

Interesting issue. You will indeed get a success even when almost all of the bulk operations failed. @konste is also looking into this (without the durable sink), see here: serilog-contrib/serilog-sinks-elasticsearch#17
The question is what to do with the failing items; offer them again, or raise them upwards to another logger? What is your idea?

@konste

konste commented Nov 29, 2015

Indeed, I've spent substantial time looking at the ES response. Attached is my class derived from ElasticsearchSink, which handles that response by reporting the errors to a "log of the log", which in my case is implemented using Seq.
ElasticsearchSinkMod.cs.txt

There are two pain points though (mostly for @mivano):

  1. I don't use the durable sink, because to my surprise it is not a subclass of the regular sink and has a different interface!
  2. Response handling looks really clumsy and involves a lot of fiddling with dynamic objects. I attribute this to the fact that the ES sink uses the low-level ES client instead of Nest. Is there any particular reason for that? If it used Nest, the response would be a nice strongly typed object and all the dynamics would go away.

@mivano
Member

mivano commented Dec 14, 2015

Sorry for not getting back quicker. I have been looking into this and I agree that it is not optimal. The durable sink should call the underlying Elasticsearch sink instead and act based on the actual result it gets from Elasticsearch. I have been debugging this and yes, the dynamic objects are a bit nasty. I think @Mpdreamz uses the low level client to avoid some overhead.

But still, I can work with the returned data and see that there are errors. If I look at your code, it seems that you also just raise all the events in that batch to another sink. I took your code and made it a bit more configurable. See https://github.com/serilog/serilog-sinks-elasticsearch/tree/handlefailures and https://github.com/serilog/serilog-sinks-elasticsearch/blob/handlefailures/src/Serilog.Sinks.Elasticsearch/Sinks/ElasticSearch/ElasticSearchSink.cs#L60 in particular. It is not complete yet, but you might already have some ideas on how we can improve this. There is also the durable sink, which I want to tackle next.

@konste

konste commented Dec 16, 2015

Thank you for looking into it! I will try to get out from under the daily burden and participate as much as I can. A real (technical) answer will follow, but it may take time.

@konste

konste commented Dec 18, 2015

It seems to me that the ES response can be parsed efficiently without dynamics, something like this:

ElasticsearchResponse elasticsearchResponse = this.client.Bulk(payload);
string strBulkResponse = elasticsearchResponse.Response;
BulkResponse bulkResponse = JsonConvert.DeserializeObject<BulkResponse>(strBulkResponse);
if (bulkResponse.Errors) ...

Unfortunately it does not work right now, because BulkOperationResponseItem used to be a string, but ES currently returns it as an object, which confuses the deserializer. I believe this is an ES 2.0 breaking change, and Elasticsearch.NET for 2.0 has not been released yet, as far as I know.

So I believe we need to wait for it, and then I will redo the response parsing in a much more sensible way.
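For illustration, a minimal hand-rolled mapping for that approach might look like the sketch below (the class and property names are hypothetical, not the Nest types; the JSON property names follow the _bulk response body):

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical minimal mapping of the _bulk response body;
// just enough to detect failed items, not a full Nest BulkResponse.
public class BulkShipResponse
{
    [JsonProperty("errors")]
    public bool Errors { get; set; }

    // Each item is an object keyed by the operation name ("index", "create", ...).
    [JsonProperty("items")]
    public List<Dictionary<string, BulkShipItem>> Items { get; set; }
}

public class BulkShipItem
{
    [JsonProperty("status")]
    public int Status { get; set; }

    // Declared as object because ES 1.x returns a string here
    // while ES 2.x returns a structured error object.
    [JsonProperty("error")]
    public object Error { get; set; }
}
```

With something like this in place, JsonConvert.DeserializeObject<BulkShipResponse>(strBulkResponse) gives a typed object and the dynamic fiddling goes away.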

@mivano
Member

mivano commented Dec 22, 2015

Do you have the feeling that the way I access the dynamic object in that feature branch is too slow? I do not need to deserialize anything, as I can just access the properties of the response. So I just check if there are any errors and, if so, start processing which ones failed.

@guillaume86

Is there any progress on this one? Any help needed?

@Mpdreamz
Member

Mpdreamz commented Jan 8, 2016

Just FYI, the low-level client can deserialize straight from the response stream too if you pass an object as T. The sink could ship with a mapping of the bulk response and use that when calling bulk, e.g.

client.Bulk<BulkResponse>(...)
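A sketch of how the shipper snippet from the issue description could use this, assuming a BulkResponse mapping class with an Errors property bound to the body-level "errors" field (the property holding the typed body differs between client versions, so treat the names below as illustrative):

```csharp
// Let the low-level client deserialize the bulk body into a typed
// mapping, then gate the bookmark write on the body-level errors flag.
var response = _state.Client.Bulk<BulkResponse>(payload);

// HTTP-level success alone is not enough; the body-level errors
// flag also covers partially failed batches.
if (response.Success && !response.Response.Errors) // .Response in 1.x, .Body in 2.x
{
    WriteBookmark(bookmark, nextLineBeginsAtOffset, currentFilePath);
}
else
{
    SelfLog.WriteLine(
        "Received failed ElasticSearch shipping result {0}: {1}",
        response.HttpStatusCode, response.OriginalException);
}
```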

@konste

konste commented Jan 8, 2016

I'm waiting for a 2.0-compatible version of the client (a stable one, on NuGet) to try non-dynamic parsing of the ES response.

@mivano
Member

mivano commented Jan 8, 2016

I'm working on some improvements (see https://github.com/serilog/serilog-sinks-elasticsearch/tree/handlefailures) that allow parsing and handling of the bulk response.
@konste do you see an issue with using the current dynamics?

@konste

konste commented Jan 8, 2016

I will try to put together a non-dynamic version today. IMHO it is cleaner, safer and faster, but it requires the 2.0 version of the libraries.

@Mpdreamz
Member

Mpdreamz commented Jan 8, 2016

We can keep using the older low-level client for now with a forward-compatible mapping:

1.x: https://github.com/elastic/elasticsearch-net/blob/1.x/src/Nest/Domain/Responses/BulkOperationResponseItem.cs

2.x: https://github.com/elastic/elasticsearch-net/blob/master/src/Nest/Document/Multiple/Bulk/BulkResponseItem/BulkResponseItem.cs

The actual API around the bulk request and response did not change much, other than the structured exceptions on bulk items (which could be dynamic in the sink's mapping?).

@jarlelin
Author

It might not be worth fixing this issue until the sink is updated to use NEST 2.0.
I made a separate issue about NEST 2.0 because it is needed to support Elasticsearch 2.x. Has anyone looked into upgrading?

@konste

konste commented Feb 12, 2016

Thanks for letting us know that NEST 2.0 is finally available! I was waiting for it and missed it. I will look into how hard it is to migrate the sink to it.

@konste

konste commented Feb 15, 2016

I just submitted a PR for the Elasticsearch sink which is supposed to upgrade it to NEST 2.0 compatibility.

@nblumhardt
Member
