-
-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use fast_float to parse reals #2332
Conversation
2e1a30e
to
ff7e46e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to work overall. I have some minor comments only.
@dominiklohmann, could you review the CMake? @dispanser, maybe the nix 🧙?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve of the CMake and Nix changes except for the missing SBOM and docs update.
ff7e46e
to
651f47c
Compare
The summarize unit test fails because of a mismatch. I tried to reproduce but I get very different results: $ build/gcc/debug/bin/vast -N import zeek < libvast_test/artifacts/logs/zeek/conn.log
$ build/gcc/debug/bin/vast -N export json --numeric-durations > all.jsonl
$ cat all.jsonl | jq -s 'map(.ts |= .[:-16]) | group_by(.ts)[] | map(.duration) | add'
0.877989
1909.2635140000007
64567.938712
303.8997809999999
83.041812
238.96152800000002
188.27766699999995
279.100037
105.96806399999998
122.157704
8431.890083999999
19559.80153700003
27274.072281000037
4112.946189000001
4008.919659000001
5156.49655
6213.606477000004
4524.929005999997
278.3215950000001
1519.8252199999986
28236.221959000002
2149.467891
303.1990909999997
355.74103599999984
467.3227489999999
165.38586599999996
246.9962219999997 @dominiklohmann can you clarify what to do to get the correct values? |
Exactly what you did. You just need to go back from the numeric durations to the integral value in nanoseconds. Another easy way to do this for durations is to just change |
651f47c
to
48dfc78
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me! Thanks for the rounding bug fix as well in the summarize plugin that we discussed earlier.
Real rounding introduces issues. Consider a summarization to day granularity, real rounding would put events after noon into the bucket of the next day. This is very unintuitive and should be considered a bug.
48dfc78
to
7f71207
Compare
Do not merge before #2321
This replaces the internal float parsing code with the https://github.com/fastfloat/fast_float library. The main benefit is more correct parsing and support for scientific notation.
📝 Checklist
🎯 Review Instructions
File by file.
parseable/numeric/real.{ch}pp
are probably the most interesting starting point. Be aware thereal
is semantically the oldreal_opt_dot
. A parser that requires a dot does not exist infast_float
so we don't have that any more. The only code that really wanted this was thedata
parser, but a small modification was enough to change to the regular real parser.