Improved performance in cast Primitive to Binary/String again (4x) #651

sundy-li · 2021-12-01T04:29:07Z

Memcpy style write, no extra copy.

cast int32 to binary 512                                                                             
                        time:   [2.9503 us 2.9581 us 2.9664 us]
                        change: [-70.283% -70.127% -69.980%] (p = 0.00 < 0.05)
                        Performance has improved.

codecov · 2021-12-01T04:38:15Z

Codecov Report

Merging #651 (52be4d7) into main (8300684) will decrease coverage by 0.30%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   69.89%   69.59%   -0.31%     
==========================================
  Files         299      299              
  Lines       16634    16746     +112     
==========================================
+ Hits        11626    11654      +28     
- Misses       5008     5092      +84

Impacted Files	Coverage Δ
src/array/binary/mod.rs	`81.48% <100.00%> (ø)`
src/compute/cast/primitive_to.rs	`79.68% <100.00%> (+3.44%)`	⬆️
src/compute/arithmetics/mod.rs	`69.04% <0.00%> (-27.62%)`	⬇️
src/compute/arithmetics/time.rs	`26.60% <0.00%> (-17.63%)`	⬇️
src/compute/arithmetics/decimal/mul.rs	`75.00% <0.00%> (-17.31%)`	⬇️
src/compute/arithmetics/decimal/div.rs	`75.94% <0.00%> (-16.36%)`	⬇️
src/array/binary/mutable.rs	`79.35% <0.00%> (-0.81%)`	⬇️
src/array/utf8/mutable.rs	`85.09% <0.00%> (-0.68%)`	⬇️
src/bitmap/immutable.rs	`86.48% <0.00%> (ø)`
src/io/csv/read/deserialize.rs	`100.00% <0.00%> (ø)`
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sundy-li · 2021-12-01T10:12:21Z

Why does this PR make the MIRI tests fail?

jorgecarleitao · 2021-12-01T16:37:55Z

It is unrelated. Something is going on with the MIRI dependencies that is causing some CI runs to fail. Could you change the "key" parameter of the cache action in .github/test.yaml to force a cache miss? I need to investigate what is causing this.

sundy-li · 2021-12-02T04:21:00Z

Mergify is a good bot to have.

jorgecarleitao · 2021-12-02T04:57:41Z

I will take a bit more time to review this, since it uses unsafe.

jorgecarleitao · 2021-12-02T19:56:28Z

src/compute/cast/primitive_to.rs

-    let mut buffer = vec![];
    let builder = from.iter().fold(
        MutableBinaryArray::<O>::with_capacity(from.len()),
        |mut builder, x| {
            match x {
-                Some(x) => {
-                    lexical_to_bytes_mut(*x, &mut buffer);
-                    builder.push(Some(buffer.as_slice()));
-                }
+                Some(x) => unsafe {
+                    builder.reserve(1, T::FORMATTED_SIZE_DECIMAL);
+                    builder.write_values(|bytes| lexical_core::write(*x, bytes).len());
+                },
                None => builder.push_null(),


this is a really cool idea!

I think we may go a step further, though: since the size is constant, the offsets will be [0, N, 2N, ..., M*N] and the values can be constructed directly from lexical_core::write, e.g. via extend. We also do not need to check for utf8 below because lexical_core guarantees this. We can even ignore the validity of the primitive array and continue writing whatever is in the null slot, and clone the validity.

I think this implementation is best done without a MutableBinaryArray, though: we benefit from operating on the buffers directly in this case.

sundy-li · 2021-12-03

> since the size is constant, the offsets will be [0, N, 2N, ..., M*N]

It's not constant: T::FORMATTED_SIZE_DECIMAL is only the maximum size to reserve.
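
To illustrate why the offsets are not a fixed stride, here is a minimal std-only sketch of building the offsets and values buffers directly. It is not the PR's code: `ints_to_binary_parts` is a hypothetical name, `write!` stands in for `lexical_core::write`, and `MAX_DECIMAL_LEN` plays the role that `T::FORMATTED_SIZE_DECIMAL` plays for `i32`.

```rust
use std::io::Write;

/// Formats every value as decimal text into one contiguous `values` buffer and
/// records where each entry ends in `offsets` (one more entry than `from`).
/// `write!` is a stand-in for `lexical_core::write` so the sketch needs only std.
fn ints_to_binary_parts(from: &[i32]) -> (Vec<i32>, Vec<u8>) {
    // "-2147483648" is 11 bytes: an upper bound used for reserving, not an exact size.
    const MAX_DECIMAL_LEN: usize = 11;

    let mut offsets = Vec::with_capacity(from.len() + 1);
    let mut values = Vec::with_capacity(from.len() * MAX_DECIMAL_LEN);
    offsets.push(0i32);

    for x in from {
        // Writing into a Vec<u8> cannot fail, so this expect is never hit.
        write!(&mut values, "{}", x).expect("writing to a Vec never fails");
        // Each offset depends on how many digits were actually written.
        offsets.push(values.len() as i32);
    }
    (offsets, values)
}
```

For the input `[7, -42, 1000]` the offsets come out as `[0, 1, 4, 8]`, so the maximum size is useful as a capacity hint but cannot be used to precompute the offsets.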

jorgecarleitao · 2021-12-02T19:57:38Z

src/array/utf8/mutable.rs

+    {
+        // ensure values has enough capacity and size to write
+        self.values.set_len(self.values.capacity());
+        let buffer = &mut self.values.as_mut_slice()[self.offsets.last().unwrap().to_usize()..];


I think that this is unsound, even in unsafe code: a slice must always have initialized data on it. I propose a different implementation below that avoids introducing another API to the MutableBuffer. LMK what you think.

sundy-li · 2021-12-03

Yes, I agree. But right now MutableBuffer is the only way to construct a BinaryArray, so maybe we should expose values and offsets to the outside.

We can keep temporary values and offsets vectors in the cast kernel and then construct the MutableBuffer from these two vectors.

See ClickHouse's approach:
https://github.com/ClickHouse/ClickHouse/blob/515cc74530d11e1b2b18a63141b66a15b94748ba/src/Columns/ColumnString.h
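
One way to avoid handing out a slice over uninitialized memory is sketched below. It is an assumption-laden illustration, not the implementation that was merged: `write_values` here is a hypothetical free function over a plain `Vec<u8>`, and it pays for an extra zero-fill in exchange for staying entirely in safe code.

```rust
/// Appends at most `max_len` bytes produced by `write` to `values`.
/// `write` receives an initialized (zeroed) slice and returns how many bytes it used.
fn write_values<F>(values: &mut Vec<u8>, max_len: usize, write: F) -> usize
where
    F: FnOnce(&mut [u8]) -> usize,
{
    let start = values.len();
    // Zero-fill the scratch region so the slice only ever covers initialized bytes.
    values.resize(start + max_len, 0);
    let written = write(&mut values[start..]);
    assert!(written <= max_len, "writer reported more bytes than it was given");
    // Drop the unused tail; the bytes that remain are exactly what `write` produced.
    values.truncate(start + written);
    written
}
```

The zero-fill is a cost the `set_len` version was trying to avoid. A version without it would likely need `Vec::spare_capacity_mut` and `MaybeUninit`, with the unsafe part confined to a final `set_len` over bytes that are known to have been written.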

sundy-li · 2021-12-05T11:27:34Z

Performance improved again:

cast int32 to binary 512                                                                             
        time:   [3.4333 us 3.4434 us 3.4536 us]
        change: [-52.301% -52.104% -51.908%] (p = 0.00 < 0.05)
        Performance has improved.

jorgecarleitao

Looks great! Left some minor comments, but overall it is ready to merge.

I think that Windows does not support some of the dev dependencies, unfortunately :(

src/compute/cast/primitive_to.rs

src/array/binary/mutable.rs

sundy-li · 2021-12-05T12:12:02Z

After using from_data_unchecked

cast int32 to binary 512                                                                             
                        time:   [2.9751 us 2.9841 us 2.9944 us]
                        change: [-12.521% -11.660% -10.883%] (p = 0.00 < 0.05)
                        Performance has improved.
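
The gain from an unchecked constructor presumably comes from skipping validation of bytes that are already known to be valid. The sketch below shows the same trade-off with std only; it does not use arrow2's `from_data_unchecked`, and `i32s_to_utf8_values` is a hypothetical name.

```rust
use std::io::Write;

/// Concatenates the decimal text of every value and returns it as a `String`
/// without re-validating the bytes as UTF-8.
fn i32s_to_utf8_values(from: &[i32]) -> String {
    let mut bytes = Vec::new();
    for x in from {
        write!(&mut bytes, "{}", x).expect("writing to a Vec never fails");
    }
    // SAFETY: `bytes` only holds the output of integer `Display`, which is
    // ASCII digits plus an optional leading '-', so it is valid UTF-8.
    unsafe { String::from_utf8_unchecked(bytes) }
}
```

The checked counterpart, `String::from_utf8`, scans every byte before returning, which is redundant work when the producer already guarantees ASCII output.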

sundy-li · 2021-12-05T12:44:07Z

Some thoughts: do we have to write this unsafe code in every XXXToBinary or XXXToUtf8 kernel? :(

In databend, there are many functions working with strings that still need to be implemented, so I think it would be better to have a wrapper function to simplify the logic.

jorgecarleitao · 2021-12-05T13:14:44Z

Yes, that would be ideal, so that we only have to write (and test) the unsafe bit once. An inline function should be enough, though.
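
A rough sketch of what such a shared helper could look like follows. All names are hypothetical, and the version shown stays in safe code by zero-initializing the scratch space; a tuned version would put its single unsafe block inside this one function.

```rust
/// Hypothetical helper: builds `(offsets, values)` from any slice, given an
/// upper bound on the bytes per item and a closure that writes one item.
#[inline]
fn transform_to_binary<T, F>(from: &[T], max_len: usize, mut write: F) -> (Vec<i64>, Vec<u8>)
where
    F: FnMut(&T, &mut [u8]) -> usize,
{
    let mut offsets = Vec::with_capacity(from.len() + 1);
    // Initialized scratch space large enough for the worst case.
    let mut values = vec![0u8; from.len() * max_len];
    let mut end = 0usize;
    offsets.push(0i64);

    for x in from {
        // Each item gets a window of `max_len` bytes starting at the current end.
        let written = write(x, &mut values[end..end + max_len]);
        assert!(written <= max_len, "writer exceeded its window");
        end += written;
        offsets.push(end as i64);
    }
    // Keep only the bytes that were actually written.
    values.truncate(end);
    (offsets, values)
}
```

A kernel then only supplies the per-item writer, for example a closure that copies `b"true"` or `b"false"` for a boolean input with `max_len` set to 5, and never touches the buffer management itself.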

jorgecarleitao · 2021-12-05T13:16:03Z

The PR description is outdated, it is more like -75% now ^_^

Improved performance in cast Primitive to Binary/String again (2x)

f81fb27

Add key to rust cache

219fa2c

sundy-li force-pushed the primitive-cast3 branch from 0cabcc6 to 219fa2c Compare December 1, 2021 22:30

Merge branch 'main' into primitive-cast3

fcccb35

jorgecarleitao reviewed Dec 2, 2021

sundy-li added 2 commits December 5, 2021 19:25

Refactor by comments

e6192f5

Merge branch 'primitive-cast3' of github.com:datafuse-extras/arrow2 into primitive-cast3

7b22720

jorgecarleitao approved these changes Dec 5, 2021

Refactor by comments

7e5d73c

sundy-li changed the title ~~Improved performance in cast Primitive to Binary/String again (2x)~~ Improved performance in cast Primitive to Binary/String again (4x) Dec 5, 2021

Remove pprof dev-dep

52be4d7

jorgecarleitao merged commit a18555c into jorgecarleitao:main Dec 5, 2021

sundy-li mentioned this pull request Dec 6, 2021

Improve performance in functions working with String datafuselabs/databend#3259

Closed

Veeupup mentioned this pull request Dec 9, 2021

And I think it's time to add transfrom_from_primitive function. datafuselabs/databend#3304

Closed
