switch from json-iterator to forked stdlib json decoder #105030

liggitt · 2021-09-14T22:29:34Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

Explores dropping json-iterator in favor of a fork of the stdlib json decoder with case-sensitivity, preservation of int/float distinctions, and strict decoding added

initial benchmarks on small-to-normal-sized inputs show a significant improvement in allocations over json-iterator (~30-45%), slightly improved memory use (~0-10%), and worse CPU time (~100% worse on typed decoding, 15% worse on untyped decoding):

BenchmarkUnmarshalTyped/jsoniter-12            45061     26203 ns/op   13001 B/op     296 allocs/op
BenchmarkUnmarshalTyped/std-unmarshal-12       20564     58080 ns/op   12672 B/op     161 allocs/op
BenchmarkUnmarshalTyped/stdfork-12             20234     59311 ns/op   12704 B/op     161 allocs/op
BenchmarkUnmarshalTyped/stdfork-strict-12      20050     60268 ns/op   13112 B/op     168 allocs/op

BenchmarkUnmarshalUntyped/jsoniter-12          23341     51112 ns/op   30107 B/op     735 allocs/op
BenchmarkUnmarshalUntyped/std-unmarshal-12     20031     59595 ns/op   27031 B/op     521 allocs/op
BenchmarkUnmarshalUntyped/stdfork-12           19938     60242 ns/op   27062 B/op     519 allocs/op
BenchmarkUnmarshalUntyped/stdfork-strict-12    19140     63390 ns/op   27303 B/op     526 allocs/op

At least some of the decreased CPU cost of json-iterator comes at the expense of correctness (e.g. json-iterator/go#413). Combined with the memory-safe implementation, I would argue that the switch is worth it, but I'd like to see what the scale tests here show.

As a side benefit, performance on pathological json inputs is significantly better as well for existing users of our utiljson package, and for users of jsoniter switching to the new unmarshaler:

benchmark                                                           old ns/op     new ns/op     delta
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/utiljson-12         114299        110468        -3.35%
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/jsoniter-12         2293171       108268        -95.28%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/jsoniter-12     2468106       108550        -95.60%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/utiljson-12     122042        108035        -11.48%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/utiljson-12           233003        196469        -15.68%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/jsoniter-12           3213641       194241        -93.96%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/utiljson-12       237060        193801        -18.25%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/jsoniter-12       3090086       193626        -93.73%
BenchmarkJSONLimits/3MB_of_empty_slices/jsoniter-12                 174992881     108469250     -38.02%
BenchmarkJSONLimits/3MB_of_empty_slices/utiljson-12                 146149559     106589800     -27.07%
BenchmarkJSONLimits/3MB_of_slices/utiljson-12                       175704073     152824837     -13.02%
BenchmarkJSONLimits/3MB_of_slices/jsoniter-12                       304611177     151610026     -50.23%
BenchmarkJSONLimits/3MB_of_empty_maps/utiljson-12                   130585803     118354588     -9.37%
BenchmarkJSONLimits/3MB_of_empty_maps/jsoniter-12                   164419283     117482512     -28.55%
BenchmarkJSONLimits/3MB_of_maps/utiljson-12                         146393189     112563979     -23.11%
BenchmarkJSONLimits/3MB_of_maps/jsoniter-12                         180325824     111732517     -38.04%
BenchmarkJSONLimits/3MB_of_ints/utiljson-12                         152788760     101553963     -33.53%
BenchmarkJSONLimits/3MB_of_ints/jsoniter-12                         346425489     100988741     -70.85%
BenchmarkJSONLimits/3MB_of_floats/utiljson-12                       150291060     77597318      -48.37%
BenchmarkJSONLimits/3MB_of_floats/jsoniter-12                       242287511     85161081      -64.85%
BenchmarkJSONLimits/3MB_of_bools/utiljson-12                        37381403      33416338      -10.61%
BenchmarkJSONLimits/3MB_of_bools/jsoniter-12                        72874262      34177785      -53.10%
BenchmarkJSONLimits/3MB_of_empty_strings/jsoniter-12                115253244     52435771      -54.50%
BenchmarkJSONLimits/3MB_of_empty_strings/utiljson-12                61241291      53121281      -13.26%
BenchmarkJSONLimits/3MB_of_strings/utiljson-12                      26483025      24183176      -8.68%
BenchmarkJSONLimits/3MB_of_strings/jsoniter-12                      19639819      24540115      +24.95%
BenchmarkJSONLimits/3MB_of_nulls/utiljson-12                        33911457      32239513      -4.93%
BenchmarkJSONLimits/3MB_of_nulls/jsoniter-12                        67738250      31341188      -53.73%

benchmark                                                           old allocs     new allocs     delta
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/utiljson-12         34             28             -17.65%
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/jsoniter-12         40015          28             -99.93%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/jsoniter-12     40015          28             -99.93%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/utiljson-12     34             28             -17.65%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/utiljson-12           36             28             -22.22%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/jsoniter-12           30015          28             -99.91%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/utiljson-12       36             28             -22.22%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/jsoniter-12       30015          28             -99.91%
BenchmarkJSONLimits/3MB_of_empty_slices/jsoniter-12                 2097199        1048626        -50.00%
BenchmarkJSONLimits/3MB_of_empty_slices/utiljson-12                 1048643        1048626        -0.00%
BenchmarkJSONLimits/3MB_of_slices/utiljson-12                       2359360        1572912        -33.33%
BenchmarkJSONLimits/3MB_of_slices/jsoniter-12                       4718634        1572912        -66.67%
BenchmarkJSONLimits/3MB_of_empty_maps/utiljson-12                   1048643        1048626        -0.00%
BenchmarkJSONLimits/3MB_of_empty_maps/jsoniter-12                   2097199        1048626        -50.00%
BenchmarkJSONLimits/3MB_of_maps/utiljson-12                         1179709        786477         -33.33%
BenchmarkJSONLimits/3MB_of_maps/jsoniter-12                         2359335        786477         -66.67%
BenchmarkJSONLimits/3MB_of_ints/utiljson-12                         1572931        52             -100.00%
BenchmarkJSONLimits/3MB_of_ints/jsoniter-12                         4718639        52             -100.00%
BenchmarkJSONLimits/3MB_of_floats/utiljson-12                       2359358        786480         -66.67%
BenchmarkJSONLimits/3MB_of_floats/jsoniter-12                       3932203        786480         -80.00%
BenchmarkJSONLimits/3MB_of_bools/utiljson-12                        64             48             -25.00%
BenchmarkJSONLimits/3MB_of_bools/jsoniter-12                        629190         48             -99.99%
BenchmarkJSONLimits/3MB_of_empty_strings/jsoniter-12                1048623        50             -100.00%
BenchmarkJSONLimits/3MB_of_empty_strings/utiljson-12                66             50             -24.24%
BenchmarkJSONLimits/3MB_of_strings/utiljson-12                      209768         209752         -0.01%
BenchmarkJSONLimits/3MB_of_strings/jsoniter-12                      314604         209752         -33.33%
BenchmarkJSONLimits/3MB_of_nulls/utiljson-12                        64             48             -25.00%
BenchmarkJSONLimits/3MB_of_nulls/jsoniter-12                        629190         48             -99.99%

benchmark                                                           old bytes     new bytes     delta
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/utiljson-12         417536        386624        -7.40%
BenchmarkJSONLimits/3MB_of_deeply_nested_slices/jsoniter-12         880949        386624        -56.11%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/jsoniter-12     880951        386624        -56.11%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_slices/utiljson-12     417536        386624        -7.40%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/utiljson-12           515840        386624        -25.05%
BenchmarkJSONLimits/3MB_of_deeply_nested_maps/jsoniter-12           3521184       386624        -89.02%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/utiljson-12       515840        386624        -25.05%
BenchmarkJSONLimits/3MB_of_unbalanced_nested_maps/jsoniter-12       3521205       386624        -89.02%
BenchmarkJSONLimits/3MB_of_empty_slices/jsoniter-12                 130450518     113671408     -12.86%
BenchmarkJSONLimits/3MB_of_empty_slices/utiljson-12                 122058220     113671436     -6.87%
BenchmarkJSONLimits/3MB_of_slices/utiljson-12                       123164096     102194413     -17.03%
BenchmarkJSONLimits/3MB_of_slices/jsoniter-12                       152527710     102194400     -33.00%
BenchmarkJSONLimits/3MB_of_empty_maps/utiljson-12                   147224008     138837218     -5.70%
BenchmarkJSONLimits/3MB_of_empty_maps/jsoniter-12                   155615984     138837218     -10.78%
BenchmarkJSONLimits/3MB_of_maps/utiljson-12                         182858905     168180685     -8.03%
BenchmarkJSONLimits/3MB_of_maps/jsoniter-12                         193348033     168180662     -13.02%
BenchmarkJSONLimits/3MB_of_ints/utiljson-12                         172045749     138493160     -19.50%
BenchmarkJSONLimits/3MB_of_ints/jsoniter-12                         213992410     138493162     -35.28%
BenchmarkJSONLimits/3MB_of_floats/utiljson-12                       131972131     73253732      -44.49%
BenchmarkJSONLimits/3MB_of_floats/jsoniter-12                       148752819     73253732      -50.75%
BenchmarkJSONLimits/3MB_of_bools/utiljson-12                        64910776      56524008      -12.92%
BenchmarkJSONLimits/3MB_of_bools/jsoniter-12                        66591720      56524008      -15.12%
BenchmarkJSONLimits/3MB_of_empty_strings/jsoniter-12                105284260     88505576      -15.94%
BenchmarkJSONLimits/3MB_of_empty_strings/utiljson-12                96892346      88505576      -8.66%
BenchmarkJSONLimits/3MB_of_strings/utiljson-12                      22667842      14281066      -37.00%
BenchmarkJSONLimits/3MB_of_strings/jsoniter-12                      15959095      14281060      -10.51%
BenchmarkJSONLimits/3MB_of_nulls/utiljson-12                        64910776      56524008      -12.92%
BenchmarkJSONLimits/3MB_of_nulls/jsoniter-12                        66591718      56524013      -15.12%

cc @lavalamp @deads2k @smarterclayton @wojtek-t

Builds/depends on these PRs/issues:

NONE

mkumatag · 2021-09-15T00:46:08Z

@liggitt golang/go#46293 may give us better results post 1.18, but meanwhile I'm wondering if we can merge #104949 to catch any other issue we may hit with the tip version.

liggitt · 2021-09-15T00:50:05Z

Beyond the potential improvements that would let reflect2 drop some unsafe aspects, there have been quite a few correctness issues with json-iterator in the past, and the current allocations benchmarks make me question whether we want to continue to use it long term

liggitt · 2021-09-15T00:51:00Z

I left some comments on #104949, I find it hard to reason about the correctness of the reflect2 changes

wojtek-t · 2021-09-15T06:06:53Z

I have more confidence in the correctness of the stdlib implementation (e.g. json-iterator/go#413). Combined with the memory-safe implementation, I would argue that the switch is worth it, but I'd like to see what the scale tests here show.

Given that most of the load that we have on kube-apiserver is actually protobuf-based (all components talk via protobuf) and the only json-originating load are test-originating requests, this won't be significantly visible there.
For the similar reason, while I'm not huge fan just wearing my scalability hat, from the wider context I think that from the whole project perspective this change makes sense. [i didn't review the code though.]

liggitt · 2021-10-21T15:24:51Z

/retest

liggitt · 2021-10-21T15:38:36Z

/hold cancel
/assign @apelisse @deads2k

k8s-triage-robot · 2021-10-21T16:00:03Z

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

deads2k · 2021-10-21T16:46:18Z

Correctness is worth it.

/lgtm

apelisse · 2021-10-21T17:16:02Z

Thanks!
/lgtm
/approve

We're still using extensively jsoniter in structured-merge-diff, mostly because we don't deserialize into a struct, but into our own datastructure. We should probably fix that but we also need to be super careful about performance. Let's try to talk about that.

k8s-ci-robot · 2021-10-21T17:16:36Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: apelisse, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [liggitt]
~~pkg/api/OWNERS~~ [liggitt]
~~staging/src/k8s.io/api/OWNERS~~ [liggitt]
~~staging/src/k8s.io/apimachinery/pkg/apis/OWNERS~~ [liggitt]
~~staging/src/k8s.io/legacy-cloud-providers/OWNERS~~ [liggitt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-triage-robot · 2021-10-21T20:25:37Z

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

The PR does have any do-not-merge/* labels
The PR does not have the needs-ok-to-test label
The PR is mergeable (does not have a needs-rebase label)
The PR is approved (has cncf-cla: yes, lgtm, approved labels)
The PR is failing tests required for merge

You can:

Review the full test history for this PR
Prevent this bot from retesting with /lgtm cancel or /hold
Help make our tests less flaky by following our Flaky Tests Guide

/retest

liggitt · 2021-10-21T21:50:30Z

fun... node bootstrap failure on package sha mismatch:

Oct 21 20:50:01.064798 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]: E: Failed to fetch http://us-west1.gce.archive.ubuntu.com/ubuntu/dists/focal-updates/main/binary-amd64/Packages.xz  File has unexpected size (1294184 != 1294112). Mirror sync in progress? [IP: 104.198.11.147 80]
Oct 21 20:50:01.064798 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]:    Hashes of expected file:
Oct 21 20:50:01.064798 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]:     - Filesize:1294112 [weak]
Oct 21 20:50:01.064798 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]:     - SHA256:fb68bda9cb0902b8e2e4e26c8b14b18fe5d2f46c2e6c9a5808fc0292f32c2157
Oct 21 20:50:01.064798 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]:     - SHA1:deedbbed68c9deec554658bcd6383a54e02a32df [weak]
Oct 21 20:50:01.065504 e2e-0b9a92e009-a7d53-minion-group-5t70 configure.sh[8399]:     - MD5Sum:0bebf164fbb5f931c63fef34a379a4e4 [weak]

/retest

k8s-triage-robot · 2021-10-22T03:01:19Z

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

The PR does have any do-not-merge/* labels
The PR does not have the needs-ok-to-test label
The PR is mergeable (does not have a needs-rebase label)
The PR is approved (has cncf-cla: yes, lgtm, approved labels)
The PR is failing tests required for merge

You can:

Review the full test history for this PR
Prevent this bot from retesting with /lgtm cancel or /hold
Help make our tests less flaky by following our Flaky Tests Guide

/retest

That module uses github.com/modern-go/reflect2 which is broken. First, it is brittle because it relies on golang's implementation details that are not stable across golang releases. For example, reflect2 was broken when go 1.18 came out. Second, reflect2 is effectively unmaintained: modern-go/reflect2#24 (comment) Json-iterator itself has had correctness issues like json-iterator/go#413. The Kubernetes project has mostly removed the dependency on json-iterator in patches like kubernetes/kubernetes#105030. Moreover, the Kubernetes authors found it out that json-iterator puts a lot of load on the allocator, so even the performance gains are questionable. Let us remove dependencies on json-iterator and reflect2.

k8s-ci-robot requested review from aveshagarwal, davidopp and a team September 14, 2021 22:30

liggitt mentioned this pull request Sep 14, 2021

reconsider use of json-iterator / reflect2 kubernetes-sigs/structured-merge-diff#202

Open

liggitt removed request for davidopp and aveshagarwal September 14, 2021 22:34

liggitt mentioned this pull request Sep 15, 2021

Switch to stdlib json encoding kubernetes/kube-openapi#246

Merged

liggitt force-pushed the json-stdlib branch from b4c635e to d60b1e7 Compare September 15, 2021 02:27

k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Sep 15, 2021

neolit123 mentioned this pull request Sep 15, 2021

Should this library use json-iterator? kubernetes-sigs/yaml#17

Open

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 20, 2021

vendor sigs.k8s.io/json

434ce43

liggitt force-pushed the json-stdlib branch from fbe25a5 to 434ce43 Compare October 20, 2021 17:12

liggitt assigned deads2k Oct 21, 2021

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 21, 2021

k8s-ci-robot assigned apelisse Oct 21, 2021

liggitt mentioned this pull request Oct 21, 2021

fix disallow unknown fields when decoding into object #105822

Closed

k8s-ci-robot merged commit cc25656 into kubernetes:master Oct 22, 2021

SIG Auth Old automation moved this from Needs Triage to Closed / Done Oct 22, 2021

k8s-ci-robot added this to the v1.23 milestone Oct 22, 2021

liggitt mentioned this pull request Oct 22, 2021

Server Side Field Validation liggitt/kubernetes#2

Closed

liggitt deleted the json-stdlib branch October 23, 2021 13:43

liggitt mentioned this pull request Jun 21, 2022

Reduce memory consumption of apiserver by replacing encoding/json with jsoniter #110623

Closed

pacoxu mentioned this pull request Nov 2, 2022

use k8s.io/apimachinery/pkg/util/json instead of json-iterator/go kubernetes/kube-openapi#326

Closed

likakuli mentioned this pull request Feb 26, 2023

feat: performance improvement for init the world k-cloud-labs/kluster-capacity#34

Closed

csuzhangxc mentioned this pull request Aug 17, 2023

bump K8s to v1.23.17 pingcap/tidb-operator#5250

Merged

10 tasks

dominiquelefevre mentioned this pull request Jan 3, 2024

remove the dependency on github.com/json-iterator/go. emicklei/go-restful#539

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

switch from json-iterator to forked stdlib json decoder #105030

switch from json-iterator to forked stdlib json decoder #105030

liggitt commented Sep 14, 2021 •

edited

mkumatag commented Sep 15, 2021

liggitt commented Sep 15, 2021

liggitt commented Sep 15, 2021

wojtek-t commented Sep 15, 2021

liggitt commented Oct 21, 2021

liggitt commented Oct 21, 2021

k8s-triage-robot commented Oct 21, 2021

deads2k commented Oct 21, 2021 •

edited

apelisse commented Oct 21, 2021

k8s-ci-robot commented Oct 21, 2021

k8s-triage-robot commented Oct 21, 2021

liggitt commented Oct 21, 2021

k8s-triage-robot commented Oct 22, 2021

switch from json-iterator to forked stdlib json decoder #105030

switch from json-iterator to forked stdlib json decoder #105030

Conversation

liggitt commented Sep 14, 2021 • edited

What type of PR is this?

What this PR does / why we need it:

mkumatag commented Sep 15, 2021

liggitt commented Sep 15, 2021

liggitt commented Sep 15, 2021

wojtek-t commented Sep 15, 2021

liggitt commented Oct 21, 2021

liggitt commented Oct 21, 2021

k8s-triage-robot commented Oct 21, 2021

deads2k commented Oct 21, 2021 • edited

apelisse commented Oct 21, 2021

k8s-ci-robot commented Oct 21, 2021

k8s-triage-robot commented Oct 21, 2021

liggitt commented Oct 21, 2021

k8s-triage-robot commented Oct 22, 2021

liggitt commented Sep 14, 2021 •

edited

deads2k commented Oct 21, 2021 •

edited