ARROW-6038: [C++] Faster type equality#4983
Conversation
|
@emkornfield @fsaintjacques @wesm Looking for design feedback on this. Faster type equality may allow us to enable more runtime checks and make Arrow more user-friendly (see e.g. ARROW-6038). |
|
(note a similar optimization can be added to |
fsaintjacques
left a comment
There was a problem hiding this comment.
Do we want to keep type.h as small as possible, maybe the ComputeFingerprint could be a seperate VisitorType class?
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
"I" (capital i) is not something I've seen often, the standard is often u for unsigned and i for signed. I also wonder if we should use bit-width instead of byte. This would allow you to use i1 as bool.
There was a problem hiding this comment.
Or we could simply use the type id... I think it's gonna be a long time before it goes past 128 :-)
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
Delete the boolean flag, make fingerprint a shared_ptr and use atomic_compare_exchange (upon failure, let the first writer win) in the write path. All read path should use atomic_load.
I'd use unique_ptr, but there's no atomic ops on them.
There was a problem hiding this comment.
That's the plan indeed. This PR is a draft.
There was a problem hiding this comment.
I may use a raw pointer actually, because atomic ops on shared_ptr make copies of the latter.
There was a problem hiding this comment.
Another possibility is to implement our own simplified atomic unique_ptr. What do you think?
There was a problem hiding this comment.
Another solution, why not pay this cost in the type constructor? Types are (should) rarely be created in a frequency where this optimization (lazy evaluation) matter.
There was a problem hiding this comment.
I don't know, I'm a bit wary of piling up costs for optional functionality in the constructor.
There was a problem hiding this comment.
I'm also wary of adding extra work in constructors if it isn't always needed
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
str, bin are prefixes that are self-explanatory (su/sb are not), albeit one character longer.
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
ss << nullable_ ? 'n' : 'N'
|
@fsaintjacques try running ursabot benchmark? |
|
@kszucs why? |
|
Simply just to start using it regularly :) |
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
do stringstreams have some sort of small buffer allocated ahead of time (ala small string optimizations in standard strings?)
There was a problem hiding this comment.
I have no idea. I suspect it depends on the implementation.
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
this seems like it would make the check expensive again if clients made heavy use of metadata. I wonder if it makes sense to maybe make a first class TypeFingerprint class that can do the comparison incrementally?
There was a problem hiding this comment.
The fingerprint is cached on the field, so I'm not sure what you mean here.
I have no strong preference, but why not. |
4a7a855 to
83a2286
Compare
|
Ok, I've updated the PR to handle |
01c6805 to
782cb85
Compare
|
Note: still need to handle UnionType and ExtensionType, and I think that's all. Edit: done. Fingerprint can be implemented by each ExtensionType subclass individually. |
782cb85 to
8245882
Compare
|
The main downside here is that this adds a bit of complexity and fragility ( |
6f21730 to
b58f8d8
Compare
When checking for type equality, compute and cache a fingerprint of the type so as to avoid costly nested type walking and multiple comparisons. Before: ---------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------- TypeEqualsSimple 13 ns 13 ns 55242976 150.558M items/s TypeEqualsComplex 430 ns 430 ns 1637275 4.43634M items/s TypeEqualsWithMetadata 595 ns 595 ns 1199216 3.20778M items/s SchemaEquals 1465 ns 1465 ns 479512 1.30226M items/s SchemaEqualsWithMetadata 922 ns 922 ns 763752 2.0683M items/s After: ---------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------- TypeEqualsSimple 11 ns 11 ns 65531752 178.723M items/s TypeEqualsComplex 20 ns 20 ns 33939830 95.1497M items/s TypeEqualsWithMetadata 31 ns 31 ns 22979555 62.4052M items/s SchemaEquals 40 ns 40 ns 17786532 48.1683M items/s SchemaEqualsWithMetadata 46 ns 46 ns 15173158 41.3242M items/s
b58f8d8 to
2fdaf4a
Compare
wesm
left a comment
There was a problem hiding this comment.
+1. Not able to scrutinize the details closely but this is a big performance improvement, so thank you for taking great care with it =)
When checking for type equality, compute and cache a fingerprint of the type so as to avoid costly nested type walking and multiple comparisons.
Before:
After: