Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please repleace encoding/json with a more efficient JSON package #667

Closed
qinhao opened this issue Dec 28, 2017 · 11 comments
Closed

Please repleace encoding/json with a more efficient JSON package #667

qinhao opened this issue Dec 28, 2017 · 11 comments
Assignees

Comments

@qinhao
Copy link

qinhao commented Dec 28, 2017

Hi, there! I'm using logkit with elastic.v6 to insert data into my ES node, and I use Flame Graphs to find the fact that CPU spends a lot of time dealing with encoding/json Marsh/Unmarsh methods in elastic.v6. I think it will be much better if we replace encoding/json with a more efficient JSON parse package. Here is the Flame Graphs SVGs:

logkit_perf_flame_graphs.zip

In the perf1.svg I replace encoding/json with jsoniter for logkit's json_parse Parse, and the perf2.svg is the original logkit using encoding/json in json_parse.

I'll try to replace encoding/json with jsoniter first myself ;-)

@olivere
Copy link
Owner

olivere commented Dec 28, 2017

Hey there! Thanks for digging into this.

Have you tried to implement alternatives to encoding/json and use a custom Decoder? Does that make a difference?

Besides that I really want to keep using the standard library. So the best way to optimize JSON performance is to solve this at the root level, i.e. the standard library. Anybody would benefit from that, I suppose.

@qinhao
Copy link
Author

qinhao commented Dec 28, 2017

OK, I will try to implement it (replace standard encoding/json) with jsoniter or easyjson in my local repo, and then make a detail comparison. And as to optimize the standard encoding/json package, I think it's too difficult for me at present ;-)

@olivere
Copy link
Owner

olivere commented Dec 28, 2017

If the results are positive and you feel like contributing to Elastic, please add a Wiki page with a detailed setup of what you did to change the JSON encoder/decoder and a short summary of your results. That'd be nice.

@brentahughes
Copy link

We have recently started to do our own encoding using easyjson for large bulk documents by passing the encoded string to Doc(). It has removed a ton of processing time and cpu load. However easyjson and many of the more efficient json encoders require gen'd code. Implementing it into ES would be great but it would likely need to be opt in as it would add more complexity having to gen code for each struct.

@olivere
Copy link
Owner

olivere commented Jan 3, 2018

@bah2830 Do you have any benchmark results so I can dig into this? Maybe, if there is a way to automate it, we can provide a way to opt in.

@brentahughes
Copy link

@olivere Hopefully this will work https://pastebin.com/ZkUJcSRm

This was tested by altering our existing implementation to add timing around only the encoding portion during a bulk add request.

Test was performed on a 2016 Macbook Pro (2.7GHz I7, 16gb Ram).

Looking at these results a normal workload with smaller documents performs just under 2x faster on average. While that performance increases as the document size rises. With large 3MB documents it gets closer to 4x faster.

For a simple system that only injects a few thousand documents every so often will likely not notice the speed increase as it's already so fast that halving that time isn't really noticeable to us humans. But some of us are pushing billions of documents every hour with fairly large documents and this performance increase is incredible.

Json encoding used to be our biggest time hog, after moving to easyjson it is barely a blip on the map.

We can currently only implement this on an index request since it accepts a string that is already encoded. If we could find a way to autogen or use some other system that doesn't require codegen then the query results could potentially be another big benefit.

It may even be worth it to just call MarshalJson on it (If it matches the interface type, otherwise use the old encoding/json) allowing whatever json encoding interface they have implemented. This same could potentially be used by allowing a struct to be sent into the unMarshaller that implements it's own decoding. This would allow users to implement any encoding/decoding system they want as long as it matches the existing encoding/json interface. (MarshalJSON
UnmarshalJSON) and assuming they have the proper structure for the expected returned documents.

@olivere
Copy link
Owner

olivere commented Jan 6, 2018

@bah2830 That's looking very promising. I will try to find a way to enable these optimizations via an opt-in. We can surely focus on the performance critical parts first, e.g. Bulk API and Search API. This work will probably be part of the release-branch.v6, but can surely be backported to v5 later.

@olivere olivere self-assigned this Jan 6, 2018
olivere added a commit that referenced this issue Jan 8, 2018
This commit has a few performance-related changes. The results are
promising, but we need to validate on production load first.

First, I changed the way that JSON got serialized from using a
`map[string]interface{}` to a specific Go type. That already had the
following impact:

```
benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          2285          -31.75%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             18             -47.06%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1744
-29.22%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          2091          -33.22%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             19             -51.28%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          2624
+0.00%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          1158          -48.14%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             11             -60.71%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1664
-4.59%
```

Next, I enable a setting that makes Bulk API use
`github.com/mailru/easyjson` instead of `encoding/json`. This yields
the following results (compared to the baseline):

```
benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          1692          -49.46%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             12             -64.71%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1328
-46.10%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          1072          -65.76%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             10             -74.36%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          1952
-25.61%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          593           -73.44%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             5              -82.14%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1280
-26.61%
```

You can enable serialization with `UseEasyJSON(true)` on the individual
`BulkIndexRequest`, `BulkUpdateRequest`, and `BulkDeleteRequest`. Using
easyjson is experimental for now. As easyjson relies on code
generation, you can use `go generate` to create the underlying code.

See #667 for discussion.
@olivere
Copy link
Owner

olivere commented Jan 8, 2018

Hello!

I made some performance-related changes in the easyjson.issue-667 branch (see 13899f2). The results look rather promising.

First, I optimized usage of encoding/json by switching from map[string]interface{} to a "real" type, which already had some impact:

benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          2285          -31.75%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             18             -47.06%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1744
-29.22%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          2091          -33.22%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             19             -51.28%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          2624
+0.00%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          1158          -48.14%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             11             -60.71%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1664
-4.59%

Then I added experimental support for easyjson in the Bulk API. You can enable easyjson with UseEasyJSON(true) on BulkIndexRequest, BulkUpdateRequest, and BulkDeleteRequest (see e.g. here). The structures that easyjson relies on are generated with go generate. Results are even better then:

benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          1692          -49.46%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             12             -64.71%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1328
-46.10%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          1072          -65.76%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             10             -74.36%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          1952
-25.61%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          593           -73.44%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             5              -82.14%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1280
-26.61%

If any of you have any time to check these changes on a performance load, both in terms of performance and correctness--that would be very much appreciated.

@olivere
Copy link
Owner

olivere commented Jan 8, 2018

Please notice that elastic cannot optimize the document you pass in e.g. BulkIndexRequest. But you can optimize that. If you want to ensure that elastic does not try to marshal your struct with encoding/json, pass a document as e.g. a string or a json.RawMessage. There is special code in elastic that checks for these types, and only uses encoding/json as a fallback.

So with the combination of the above changes and you e.g. serializing the document with easyjson before passing it into BulkIndexRequest, you can be almost sure to not use encoding/json at all with Bulk API.

olivere added a commit that referenced this issue Jan 9, 2018
This is a backport of 13899f2 for
version 5. See #667.
@olivere
Copy link
Owner

olivere commented Jan 9, 2018

Backport for v5 is on the easyjson.issue-667.v5 branch. Benchcmp for v5 are like this:

$ benchcmp tmp/bulk-index.old tmp/bulk-index.new
benchmark                                             old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization/stdlib-4       3783          2644          -30.11%
BenchmarkBulkIndexRequestSerialization/easyjson-4     3781          1987          -47.45%

benchmark                                             old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization/stdlib-4       34             16             -52.94%
BenchmarkBulkIndexRequestSerialization/easyjson-4     34             12             -64.71%

benchmark                                             old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization/stdlib-4       2464          1624          -34.09%
BenchmarkBulkIndexRequestSerialization/easyjson-4     2464          1328          -46.10%

$ benchcmp tmp/bulk-update.old tmp/bulk-update.new
benchmark                                              old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization/stdlib-4       3599          2444          -32.09%
BenchmarkBulkUpdateRequestSerialization/easyjson-4     3574          1265          -64.61%

benchmark                                              old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization/stdlib-4       39             19             -51.28%
BenchmarkBulkUpdateRequestSerialization/easyjson-4     39             10             -74.36%

benchmark                                              old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization/stdlib-4       2624          2624          +0.00%
BenchmarkBulkUpdateRequestSerialization/easyjson-4     2624          1952          -25.61%

$ benchcmp tmp/bulk-delete.old tmp/bulk-delete.new
benchmark                                              old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization/stdlib-4       2476          1526          -38.37%
BenchmarkBulkDeleteRequestSerialization/easyjson-4     2456          804           -67.26%

benchmark                                              old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization/stdlib-4       28             9              -67.86%
BenchmarkBulkDeleteRequestSerialization/easyjson-4     28             5              -82.14%

benchmark                                              old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization/stdlib-4       1744          1576          -9.63%
BenchmarkBulkDeleteRequestSerialization/easyjson-4     1744          1280          -26.61%

olivere added a commit that referenced this issue Jan 12, 2018
This is a backport of 13899f2 for
version 5. See #667.
olivere added a commit that referenced this issue Jan 12, 2018
This commit has a few performance-related changes. It comes in two
flavours.

First, the way that JSON got serialized was changed from using a simple
`map[string]interface{}` to a specific Go type. That had the following impact:

```
benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          2285          -31.75%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             18             -47.06%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1744
-29.22%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          2091          -33.22%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             19             -51.28%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          2624
+0.00%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          1158          -48.14%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             11             -60.71%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1664
-4.59%
```

Next, we enabled a setting that makes Bulk API use
`github.com/mailru/easyjson` instead of `encoding/json`. This yields
the following results (compared to the baseline):

```
benchmark                                    old ns/op     new ns/op     delta
BenchmarkBulkIndexRequestSerialization-8     3348          1692          -49.46%

benchmark                                    old allocs     new allocs     delta
BenchmarkBulkIndexRequestSerialization-8     34             12             -64.71%

benchmark                                    old bytes     new bytes     delta
BenchmarkBulkIndexRequestSerialization-8     2464          1328
-46.10%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkUpdateRequestSerialization-8     3131          1072          -65.76%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkUpdateRequestSerialization-8     39             10             -74.36%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkUpdateRequestSerialization-8     2624          1952
-25.61%

benchmark                                     old ns/op     new ns/op     delta
BenchmarkBulkDeleteRequestSerialization-8     2233          593           -73.44%

benchmark                                     old allocs     new allocs     delta
BenchmarkBulkDeleteRequestSerialization-8     28             5              -82.14%

benchmark                                     old bytes     new bytes     delta
BenchmarkBulkDeleteRequestSerialization-8     1744          1280
-26.61%
```

You can enable serialization with `UseEasyJSON(true)` on the individual
`BulkIndexRequest`, `BulkUpdateRequest`, and `BulkDeleteRequest`. Using
easyjson is **experimental** for now. As easyjson relies on code
generation, you need to use `go generate` when structs change, in order
to create the underlying structs that easyjson makes use of.

See #667 for discussion.
@olivere
Copy link
Owner

olivere commented Jan 12, 2018

Released in 5.0.61 and 6.1.2.

@olivere olivere closed this as completed Jan 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants