Replace the benchmarking data files and bump version to 0.9.19.
Replaced the benchmarking data files that came from chromium.org with
three files obtained from other datasets on GitHub. Since this repo
is vendored into the chromium/src repo it was occasionally confusing
people who thought the data was actually used for non-benchmarking
purposes and thus updating it for whatever reason.

No code changes.
dpranke committed Mar 3, 2024
1 parent 129b88b commit 3deee87
Showing 12 changed files with 84,820 additions and 119,110 deletions.
7 changes: 7 additions & 0 deletions README.md
@@ -76,6 +76,13 @@ $ python3 -m twine upload dist/*

## Version History / Release Notes

* v0.9.19 (2024-03-03)
    * Replaced the benchmarking data files that came from chromium.org with
      three files obtained from other datasets on GitHub. Since this repo
      is vendored into the chromium/src repo it was occasionally confusing
      people who thought the data was actually used for non-benchmarking
      purposes and thus updating it for whatever reason.
    * No code changes.
* v0.9.18 (2024-02-29)
    * Add typing information to the module. This is kind of a big change,
      but there should be no functional differences.
2 changes: 2 additions & 0 deletions benchmarks/64KB-min.json

Large diffs are not rendered by default.

67 changes: 67 additions & 0 deletions benchmarks/LICENSE
@@ -0,0 +1,67 @@
MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE

Code examples from "Python for Data Analysis", 2nd Edition

The MIT License (MIT)

Copyright (c) 2017 Wes McKinney

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

The MIT License (MIT)

Copyright (c) 2014 Milo Yip

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
34 changes: 34 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,34 @@
# JSON5 Benchmarking data

This directory contains a simple command line program that compares the
speed of JSON5 with the builtin JSON decoder.

On a 2018 Mac Mini with a 3 GHz 6-core Intel Core i5 and 64 GB of memory
running macOS 14.2.1, JSON5 is 800-1200x slower than JSON.
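
As a rough illustration of the kind of comparison described above, here is a
minimal sketch. It is not the benchmark program in this directory: the helper
names `time_decoder` and `compare` and the synthetic sample document are
assumptions made for the example, and `json5` is only timed if it happens to
be installed.

```python
import json
import timeit


def time_decoder(loads, text, repeat=5):
    """Return the best wall-clock time in seconds of `repeat` calls to loads(text)."""
    return min(timeit.repeat(lambda: loads(text), number=1, repeat=repeat))


def compare(text, decoders, repeat=5):
    """Time every decoder on the same text and return {name: seconds}."""
    return {name: time_decoder(loads, text, repeat) for name, loads in decoders.items()}


# The built-in decoder is always available.
decoders = {"json": json.loads}
try:
    import json5  # third-party package; skipped when not installed

    decoders["json5"] = json5.loads
except ImportError:
    pass

# Synthetic sample document (hypothetical, not one of the benchmark files).
sample = json.dumps([{"id": i, "name": f"item-{i}"} for i in range(200)])

results = compare(sample, decoders)
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f}s")
```

Taking the minimum over several runs is one common way to reduce noise from
other processes; the actual ratio will depend on the machine and the input.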

The three datasets come from MIT-licensed data grabbed off the web on
Mar 3, 2024 around 21:30 GMT. Their accompanying licenses are contained
in the [LICENSE](LICENSE) file.

[64KB-min.json](64KB-min.json) was retrieved from
<https://raw.githubusercontent.com/MicrosoftEdge/Demos/e3b81daee151a225c1d8f24bf82d31c464b0f737/json-dummy-data/64KB-min.json>.
It looks like that is part of a set of sample data used to benchmark
Microsoft Edge's JSON viewer, and I'm guessing that the data was
synthetically generated.

[bitly-usa-gov.json](bitly-usa-gov.json) was retrieved from
<https://raw.githubusercontent.com/wesm/pydata-book/b992071876bb4324b0323170061c886760289d4d/datasets/bitly_usagov/example.txt>
and is a data set that was part of the sample data for *Python for Data
Analysis, 3rd Edition*. Apparently this data came from bit.ly data for
USA.gov, although I have been unable to find either the original source data
or a description of the schema for it. The book only references the 'tz'
field, which appears to be a timezone, and the 'a' field, which looks like a
web browser User-Agent string. The data was originally a file of
newline-separated JSON records, but I merged them into a single array.
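
The merge described above can be sketched roughly as follows. This is an
assumption about how such a conversion might be done, not the script that was
actually used; `merge_ndjson` is a hypothetical helper and the two records are
made-up placeholders rather than rows from the dataset.

```python
import json


def merge_ndjson(text):
    """Parse newline-separated JSON records into a single list.

    Blank lines are skipped; every other line must be a complete JSON value.
    """
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Placeholder records using the two field names mentioned above (hypothetical values).
ndjson = (
    '{"tz": "Example/Zone", "a": "ExampleAgent/1.0"}\n'
    '\n'
    '{"tz": "", "a": "OtherAgent/2.0"}\n'
)

merged = merge_ndjson(ndjson)

# Serializing the list yields one JSON array that a single loads() call can read.
array_text = json.dumps(merged)
print(array_text)
```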

[twitter.json](twitter.json) was retrieved from
<https://raw.githubusercontent.com/miloyip/nativejson-benchmark/5dbf5a933a850652bf059cde64cc1f0c8d2c5d6f/data/twitter.json>.
It comes from what appears to be a repo used for benchmarking various
native C/C++ JSON parsers. I do not know where this file was retrieved from
or what the schema is, but this file has a more complicated schema and contains
data in multiple languages.