Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move more parsing in stage2 #74

Merged
merged 27 commits into from Nov 4, 2019
Merged

Move more parsing in stage2 #74

merged 27 commits into from Nov 4, 2019

Conversation

Licenser
Copy link
Member

We can move some of the parsing logic into stage 2 instead of checking them after that. This simplifies the code for parse() .

Performance wise it looks like a small improvement too.

image

0.2

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         450 MB/s   490 MB/s   630 MB/s   350 MB/s
data/citm_catalog.json  1170 MB/s   600 MB/s  1450 MB/s   760 MB/s
data/twitter.json        950 MB/s   700 MB/s   950 MB/s   750 MB/s
data/log.json           1080 MB/s  2170 MB/s  1080 MB/s  1080 MB/s

this

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         460 MB/s   490 MB/s   630 MB/s   350 MB/s
data/citm_catalog.json  1120 MB/s   600 MB/s  1450 MB/s   770 MB/s
data/twitter.json        960 MB/s   700 MB/s   970 MB/s   750 MB/s
data/log.json           1080 MB/s  2170 MB/s  1080 MB/s  1080 MB/s

@codecov
Copy link

codecov bot commented Oct 31, 2019

Codecov Report

Merging #74 into 0.2 will decrease coverage by 0.58%.
The diff coverage is 80.31%.

Impacted file tree graph

@@            Coverage Diff             @@
##              0.2      #74      +/-   ##
==========================================
- Coverage   84.67%   84.08%   -0.59%     
==========================================
  Files          34       35       +1     
  Lines        4410     4462      +52     
==========================================
+ Hits         3734     3752      +18     
- Misses        676      710      +34
Flag Coverage Δ
#avx2 79.36% <78.08%> (-0.56%) ⬇️
#avx2KnownKey 79.36% <78.08%> (-0.56%) ⬇️
#e 78.34% <77.66%> (-0.53%) ⬇️
#sseKnownKey 79.44% <77.92%> (-0.56%) ⬇️
Impacted Files Coverage Δ
src/value.rs 87.35% <ø> (ø) ⬆️
src/serde/value/borrowed/se.rs 0% <0%> (ø) ⬆️
src/value/borrowed/cmp.rs 89.36% <100%> (+0.68%) ⬆️
src/value/borrowed/serialize.rs 85.07% <100%> (ø) ⬆️
src/macros.rs 100% <100%> (ø) ⬆️
src/value/borrowed/from.rs 91.83% <100%> (+2.18%) ⬆️
src/error.rs 100% <100%> (ø) ⬆️
src/known_key.rs 97.87% <100%> (ø) ⬆️
src/value/owned/from.rs 88.23% <100%> (-0.1%) ⬇️
src/value/owned/cmp.rs 87.5% <100%> (+0.73%) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed0b814...0bb7403. Read the comment docs.

@Licenser
Copy link
Member Author

Licenser commented Nov 1, 2019

This starts to look like a worthwhile improvement:
image

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         470 MB/s   500 MB/s   650 MB/s   350 MB/s
data/citm_catalog.json  1150 MB/s   600 MB/s  1510 MB/s   770 MB/s
data/twitter.json        990 MB/s   710 MB/s  1000 MB/s   740 MB/s
data/log.json           1080 MB/s  2170 MB/s  1080 MB/s  1080 MB/s

@Licenser
Copy link
Member Author

Licenser commented Nov 1, 2019

image

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         500 MB/s   470 MB/s   670 MB/s   360 MB/s
data/citm_catalog.json  1210 MB/s   610 MB/s  1520 MB/s   780 MB/s
data/twitter.json        970 MB/s   720 MB/s   990 MB/s   760 MB/s
data/log.json           1080 MB/s  2170 MB/s  1080 MB/s  1080 MB/s

@Licenser Licenser marked this pull request as ready for review November 1, 2019 22:10
@Licenser
Copy link
Member Author

Licenser commented Nov 1, 2019

@sunnygleason I think this part of 0.2 is ready to take a peek :)

@Licenser
Copy link
Member Author

Licenser commented Nov 2, 2019

The latest change isn't a clear win or loss but it moves the results a little:

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         540 MB/s   460 MB/s   730 MB/s   360 MB/s
data/citm_catalog.json  1160 MB/s   610 MB/s  1610 MB/s   770 MB/s
data/twitter.json        920 MB/s   700 MB/s  1020 MB/s   760 MB/s
data/log.json           1080 MB/s  2170 MB/s  1080 MB/s  1080 MB/s

Canada improves (which was our slowest case) but citm and twitter degrade slightly for parsing the Dom. On structs it is a win throughout the bench.

I think even so two of the benchmarks degrade a bit, it's still a win at the end.

@Licenser
Copy link
Member Author

Licenser commented Nov 2, 2019

Seems we're getting some memory corruption with this - any help with bug hunting welcome I've so far come up blank :/

@Licenser
Copy link
Member Author

Licenser commented Nov 3, 2019

Found the error it was a simple <= vs < -.-

@Licenser Licenser added this to To Do in 0.2 Nov 4, 2019
@Licenser
Copy link
Member Author

Licenser commented Nov 4, 2019

merging into 0.2 branch - we can review there.

@Licenser Licenser merged commit d72294c into 0.2 Nov 4, 2019
@Licenser Licenser mentioned this pull request Nov 5, 2019
5 tasks
@Licenser Licenser moved this from To Do to D in 0.2 Dec 1, 2019
@Licenser Licenser deleted the new-parse branch March 2, 2020 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
0.2
  
D
Development

Successfully merging this pull request may close these issues.

None yet

1 participant