Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for Node's data constructors #13

Closed
andrewthad opened this issue May 6, 2016 · 7 comments
Closed

Suggestion for Node's data constructors #13

andrewthad opened this issue May 6, 2016 · 7 comments

Comments

@andrewthad
Copy link

The Value data type is currently defined:

data Node = VTable    Table
          | VTArray   [Table]
          | VString   !Text
          | VInteger  !Int64
          | VFloat    !Double
          | VBoolean  !Bool
          | VDatetime !UTCTime
          | VArray    [Node]

The bangs are inconsistent. I would recommend putting them on all of them, as aeson does. Also, like in aeson, I would recommend changing VArray and VTArray to use Vectors instead of lists. This tends to be more performant, although I don't have any benchmarks to verify this.

@cies
Copy link
Owner

cies commented May 6, 2016

Thanks. I have an open TODO item also for going to Vector, I'll give it a try right away.

@cies
Copy link
Owner

cies commented May 6, 2016

Benchmarks are in! The old code (using Lists and less bangs):

benchmarking string/assignment
time                 97.87 μs   (97.68 μs .. 98.13 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 97.79 μs   (97.55 μs .. 98.00 μs)
std dev              733.7 ns   (584.1 ns .. 976.4 ns)

benchmarking string/headers
time                 32.06 μs   (29.92 μs .. 34.11 μs)
                     0.983 R²   (0.970 R² .. 0.999 R²)
mean                 30.30 μs   (29.82 μs .. 31.35 μs)
std dev              2.247 μs   (1.014 μs .. 4.117 μs)
variance introduced by outliers: 75% (severely inflated)

benchmarking string/mixed
time                 81.25 μs   (81.07 μs .. 81.45 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 81.87 μs   (81.30 μs .. 83.67 μs)
std dev              2.967 μs   (1.200 μs .. 5.811 μs)
variance introduced by outliers: 37% (moderately inflated)

benchmarking file/example
time                 742.0 μs   (740.7 μs .. 743.4 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 742.8 μs   (741.3 μs .. 744.9 μs)
std dev              6.286 μs   (4.656 μs .. 9.094 μs)

benchmarking file/repeated-4x
time                 3.656 ms   (3.642 ms .. 3.669 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 3.676 ms   (3.657 ms .. 3.708 ms)
std dev              80.20 μs   (46.92 μs .. 118.9 μs)

The new code (using Vectors and more bangs):

benchmarking string/assignment
time                 99.22 μs   (98.80 μs .. 99.69 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 100.1 μs   (99.19 μs .. 102.3 μs)
std dev              4.581 μs   (1.119 μs .. 8.270 μs)
variance introduced by outliers: 47% (moderately inflated)

benchmarking string/headers
time                 31.20 μs   (31.10 μs .. 31.34 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 31.19 μs   (31.07 μs .. 31.33 μs)
std dev              430.3 ns   (341.0 ns .. 657.8 ns)

benchmarking string/mixed
time                 86.52 μs   (86.09 μs .. 87.19 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 86.31 μs   (86.06 μs .. 86.78 μs)
std dev              1.106 μs   (668.0 ns .. 1.695 μs)

benchmarking file/example
time                 777.5 μs   (763.6 μs .. 802.1 μs)
                     0.997 R²   (0.994 R² .. 1.000 R²)
mean                 768.0 μs   (764.5 μs .. 776.5 μs)
std dev              17.54 μs   (8.454 μs .. 32.88 μs)
variance introduced by outliers: 12% (moderately inflated)

benchmarking file/repeated-4x
time                 3.736 ms   (3.715 ms .. 3.761 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 3.739 ms   (3.727 ms .. 3.756 ms)
std dev              44.57 μs   (33.57 μs .. 63.03 μs)

@cies
Copy link
Owner

cies commented May 6, 2016

It seems that Vectors and more bangs made the code slower! Maybe the real benefit of this approach shows when using also some INLINE pragmas -- which I have no experience with.

Could anyone explain why it got slower?

@cies
Copy link
Owner

cies commented May 6, 2016

I've pushed the changes to a branch: https://github.com/cies/htoml/tree/improve-performance

So we can all have a look at the same code.

@cies
Copy link
Owner

cies commented May 7, 2016

@andrewthad So I had a look at it again, and I guess is that either my benchmark suite is inadequate (very possible, as I never put much thought into it), or the implementation with Vector is somehow slower.

@andrewthad
Copy link
Author

You probably didn't see any increase in speed because your benchmarks only measure parsing speed. And parsing the content into vectors isn't really any faster because the code the arrayOf parser creates a list which is then converted to a vector at the end. I wouldn't recommend changing this because the resulting implementation could be pretty complicated (involving some kind of growable mutable vector), and it might not improve things at all. It looks like even aeson does the Vector.fromList conversion at the end.

If your had other benchmarks that measured the time it took to render a Node, I bet that those would show a speedup. Parsing into a Vector won't be faster, but mapping over it to render its contents is almost certainly faster.

@cies
Copy link
Owner

cies commented May 9, 2016

Since it is in line with how Aeson does it --and that had quite a few eyes-- I'll merge it in and release 1.0 + push to Stackage. Thanks @andrewthad !

We can later maybe tune some things up with UNPACKs or INLINEs -- but that won't affect the pub-API.

@cies cies closed this as completed May 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants