Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

128 bit numbers #88

Merged
merged 29 commits into from Jan 26, 2020
Merged

128 bit numbers #88

merged 29 commits into from Jan 26, 2020

Conversation

Licenser
Copy link
Member

@Licenser Licenser commented Dec 4, 2019

See #59

todo:

  • bench
  • implement 128bit parsing
  • remove 128bit as default flag

@Licenser
Copy link
Member Author

Licenser commented Dec 4, 2019

Initial perf (just memory layout updates & functions, no parsing).

Impact is measurable but not terrible, nothing to enable by default but as a flag for those who need it it's acceptable performance I think.

0.2.2 (master)

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         530 MB/s   420 MB/s   710 MB/s   320 MB/s
data/citm_catalog.json  1130 MB/s   480 MB/s  1550 MB/s   610 MB/s
data/twitter.json        920 MB/s   600 MB/s  1000 MB/s   640 MB/s
data/log.json           1080 MB/s  1080 MB/s  1080 MB/s  1080 MB/s

===== simd_json_tape ===== parse|stringify ===== parse|stringify ====
data/canada.json         770 MB/s
data/citm_catalog.json  1890 MB/s
data/twitter.json       1670 MB/s
data/log.json           2170 MB/s

this

======= simd_json ======== parse|stringify ===== parse|stringify ====
data/canada.json         510 MB/s   430 MB/s   670 MB/s   330 MB/s
data/citm_catalog.json  1110 MB/s   500 MB/s  1520 MB/s   650 MB/s
data/twitter.json        900 MB/s   610 MB/s   970 MB/s   650 MB/s
data/log.json           1080 MB/s  1080 MB/s  1080 MB/s  1080 MB/s

===== simd_json_tape ===== parse|stringify ===== parse|stringify ====
data/canada.json         760 MB/s
data/citm_catalog.json  1880 MB/s
data/twitter.json       1640 MB/s
data/log.json           2170 MB/s

image

@Licenser Licenser added this to In progress in 0.3 via automation Dec 4, 2019
@Licenser
Copy link
Member Author

Licenser commented Dec 4, 2019

@mstump do you want to give this branch a try and see if it solves your issue?

@Licenser Licenser marked this pull request as ready for review December 4, 2019 22:11
@Licenser
Copy link
Member Author

Licenser commented Dec 4, 2019

(marking this as non draft to get codecov to post it's stuff) - not ready to be merged yet but input is more then welcome

@codecov
Copy link

codecov bot commented Dec 4, 2019

Codecov Report

Merging #88 into master will increase coverage by 1.07%.
The diff coverage is 86.87%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #88      +/-   ##
==========================================
+ Coverage   84.65%   85.73%   +1.07%     
==========================================
  Files          36       39       +3     
  Lines        4385     5284     +899     
==========================================
+ Hits         3712     4530     +818     
- Misses        673      754      +81
Flag Coverage Δ
#avx2 79.93% <86.87%> (+2.26%) ⬆️
#avx2KnownKey 78.68% <83.68%> (+1.04%) ⬆️
#e ?
#sse 80.57% <86.87%> (?)
#sseKnownKey 79.36% <83.68%> (+0.92%) ⬆️
Impacted Files Coverage Δ
tests/jsonchecker.rs 100% <ø> (ø) ⬆️
src/value/borrowed/cmp.rs 89.36% <ø> (ø) ⬆️
src/value.rs 98.19% <ø> (+8.63%) ⬆️
src/value/owned/cmp.rs 87.5% <ø> (ø) ⬆️
src/value/borrowed/serialize.rs 80.28% <0%> (-4.8%) ⬇️
src/serde/value/borrowed/se.rs 0% <0%> (ø) ⬆️
src/value/owned.rs 100% <100%> (+4.48%) ⬆️
src/macros.rs 94.11% <100%> (-5.89%) ⬇️
src/value/owned/from.rs 89.83% <100%> (+1.15%) ⬆️
src/value/borrowed/from.rs 92.72% <100%> (+0.57%) ⬆️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6292255...51a2835. Read the comment docs.

@Licenser
Copy link
Member Author

Licenser commented Dec 4, 2019

image
😂 well no this definetly needs work!

@Licenser
Copy link
Member Author

@sunnygleason this should be ready for review :)

@Licenser
Copy link
Member Author

@sunnygleason could I bug you for a review so we can merge this?

@mstump
Copy link

mstump commented Jan 13, 2020

I finally climbed out from under my pile of demos and I'm sorry to report that I'm still getting weird behavior with this branch.

Cargo snipit:

[dependencies.simd-json]
git = "https://github.com/simd-lite/simdjson-rs"
branch = "128-bit-numbers"
features = ["128bit"]

With the 128bit feature enabled I get the error message:

thread 'reporter_message::raw_message::tests::test_deserialize_elastic_search' panicked at 'JsonSimdError(Error { index: 0, character: '💩', error: Serde("invalid type: i128, expected any value") })', src/reporter_message/raw_message.rs:398:25

Where previously I got:

thread 'reporter_message::raw_message::tests::test_deserialize_elastic_search' panicked at 'JsonSimdError(Error { index: 191048, character: '\n', error: Overflow })', src/reporter_message/raw_message.rs:398:25

The enum that it's decoding is

#[serde(untagged)]
#[derive(Clone, Debug, PartialEq, Deserialize, Serialize)]
pub enum RawMessageMetricValue {
    Boolean(bool),
    Float(f64),
    Long(i128),
    Integer(i64),
    Metric(Vec<RawMessageMetricNode>),
    String(String),
}

The offending JSON is below, it cleanly deserializes with serde_json.

{
  "name": "max_unsafe_auto_id_timestamp",
  "value": -9223372036854776000
}

@Licenser
Copy link
Member Author

Licenser commented Jan 13, 2020 via email

@Licenser
Copy link
Member Author

Turns out this is a bug in serde, it doesn't support untagged fields with 128 bit values :/

https://github.com/serde-rs/serde/blob/ff70409215b62fa8cd138ce47534c432352c85bb/serde/src/private/de.rs#L230

@sunnygleason
Copy link
Member

@Licenser what should the next step for this PR be?

@Licenser
Copy link
Member Author

I opened a ticket with serde: serde-rs/serde#1717

I'm honestly not 100% sure, merging it would already allow 128 bit values outside of enums and should work there once it is fixed in serde.

Since no code changes will be required I think we could still merge it, there is little reason to hold it back just because of a bug in serde.

other then that clippy is throwing spanners in the works :P only 177 new errors! yay

@Licenser
Copy link
Member Author

@sunnygleason can I bug you for a review so we can merge this? The remaining issue is out of our control and it will work once serde allows 128bit integers in untagged enums. So nothing we can do about that, our code is correct.

@sunnygleason
Copy link
Member

@Licenser sure! Were you able to run fuzz testing for the feature? Just curious...

@Licenser
Copy link
Member Author

I haven't it has been broken for a bit, and it feels with #95 in the todo list it's not worth investing time into unbreak it. We'll scrap the current fuzz stuff anyway.

@Licenser Licenser merged commit ad0c959 into master Jan 26, 2020
0.3 automation moved this from In progress to Done Jan 26, 2020
@Licenser Licenser deleted the 128-bit-numbers branch June 17, 2020 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
0.3
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants