@sbinet sbinet commented Dec 3, 2018

This CL adds a set of benchmarks for the CSV reader type.
E.g.:

```
$> go test -run=NONE -bench=Read/rows=.*_cols=.*_chunks=-1 -benchmem
goos: linux
goarch: amd64
pkg: github.com/apache/arrow/go/arrow/csv
BenchmarkRead/rows=10_cols=1_chunks=-1-8         	  200000	     10219 ns/op	    9560 B/op	      73 allocs/op
BenchmarkRead/rows=10_cols=10_chunks=-1-8        	   30000	     75434 ns/op	   47264 B/op	     368 allocs/op
BenchmarkRead/rows=10_cols=100_chunks=-1-8       	    3000	    489027 ns/op	  426960 B/op	    3255 allocs/op
BenchmarkRead/rows=10_cols=1000_chunks=-1-8      	     200	   5400913 ns/op	 4308912 B/op	   32072 allocs/op
BenchmarkRead/rows=100_cols=1_chunks=-1-8        	   50000	     45297 ns/op	   30552 B/op	     268 allocs/op
BenchmarkRead/rows=100_cols=10_chunks=-1-8       	    5000	    333999 ns/op	  195520 B/op	     661 allocs/op
BenchmarkRead/rows=100_cols=100_chunks=-1-8      	     500	   2660322 ns/op	 1869777 B/op	    4538 allocs/op
BenchmarkRead/rows=100_cols=1000_chunks=-1-8     	      50	  25683147 ns/op	18805425 B/op	   43256 allocs/op
BenchmarkRead/rows=1000_cols=1_chunks=-1-8       	    5000	    423213 ns/op	  218968 B/op	    2086 allocs/op
BenchmarkRead/rows=1000_cols=10_chunks=-1-8      	     500	   2420959 ns/op	 1591808 B/op	    2614 allocs/op
BenchmarkRead/rows=1000_cols=100_chunks=-1-8     	      50	  21765485 ns/op	15474384 B/op	    7841 allocs/op
BenchmarkRead/rows=1000_cols=1000_chunks=-1-8    	       5	 222083917 ns/op	154949808 B/op	   60060 allocs/op
BenchmarkRead/rows=10000_cols=1_chunks=-1-8      	     500	   3938427 ns/op	 3083224 B/op	   20123 allocs/op
BenchmarkRead/rows=10000_cols=10_chunks=-1-8     	      50	  22066971 ns/op	20298368 B/op	   20903 allocs/op
BenchmarkRead/rows=10000_cols=100_chunks=-1-8    	       5	 209542066 ns/op	193038672 B/op	   28651 allocs/op
BenchmarkRead/rows=10000_cols=1000_chunks=-1-8   	       1	2696959353 ns/op	1939814576 B/op	  106070 allocs/op
BenchmarkRead/rows=100000_cols=1_chunks=-1-8     	      30	  35208837 ns/op	31869150 B/op	  200155 allocs/op
BenchmarkRead/rows=100000_cols=10_chunks=-1-8    	       5	 219030269 ns/op	183553152 B/op	  201125 allocs/op
BenchmarkRead/rows=100000_cols=100_chunks=-1-8   	       1	2421018029 ns/op	1692336464 B/op	  210762 allocs/op
BenchmarkRead/rows=100000_cols=1000_chunks=-1-8  	       1	28196721844 ns/op	16891740336 B/op	  307082 allocs/op
PASS
ok  	github.com/apache/arrow/go/arrow/csv	107.802s
```
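For readers who want to reproduce the shape of this benchmark grid without the Arrow package, here is a minimal sketch. It is not the benchmark added by this CL: it drives the standard library `encoding/csv` reader directly, drops the `chunks` dimension, and the helper names `genCSV`, `benchName` and `benchRead` are made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"testing"
)

// genCSV builds an in-memory CSV payload with the given shape.
// (Hypothetical helper, not part of the Arrow csv package.)
func genCSV(rows, cols int) []byte {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	rec := make([]string, cols)
	for i := 0; i < rows; i++ {
		for j := range rec {
			rec[j] = fmt.Sprintf("%d", i*cols+j)
		}
		if err := w.Write(rec); err != nil {
			panic(err)
		}
	}
	w.Flush()
	if err := w.Error(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// benchName mimics the rows/cols part of the sub-benchmark names above.
func benchName(rows, cols int) string {
	return fmt.Sprintf("rows=%d_cols=%d", rows, cols)
}

// benchRead reads the whole payload b.N times and reports allocations.
func benchRead(b *testing.B, data []byte, reuse bool) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		r := csv.NewReader(bytes.NewReader(data))
		r.ReuseRecord = reuse
		for {
			_, err := r.Read()
			if err == io.EOF {
				break
			}
			if err != nil {
				b.Fatal(err)
			}
		}
	}
}

func main() {
	// A much smaller grid than the one in the log above.
	for _, rows := range []int{10, 100} {
		for _, cols := range []int{1, 10} {
			data := genCSV(rows, cols)
			res := testing.Benchmark(func(b *testing.B) { benchRead(b, data, true) })
			fmt.Println(benchName(rows, cols), res.String(), res.MemString())
		}
	}
}
```

In a real package this would more likely live in a `_test.go` file with `b.Run` sub-benchmarks so that `go test -bench` and `benchstat` can be used as shown in this PR; `testing.Benchmark` is used here only to keep the sketch runnable as a single program.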
This CL enables the encoding/csv reader to reuse the memory used by
records, from row to row, and thus reduce memory pressure on Go's GC.

```
$> go test -run=NONE -bench='Read/rows=.*_cols=.*_chunks=10$' -benchmem -count=10 |& tee old.txt
$> go test -run=NONE -bench='Read/rows=.*_cols=.*_chunks=10$' -benchmem -count=10 |& tee new.txt
$> benchstat old.txt new.txt
name                                    old time/op    new time/op    delta
Read/rows=10_cols=1_chunks=10-8           39.4µs ±18%    42.6µs ±24%     ~     (p=0.218 n=10+10)
Read/rows=10_cols=10_chunks=10-8           293µs ±23%     280µs ±24%     ~     (p=0.400 n=10+9)
Read/rows=10_cols=100_chunks=10-8         2.72ms ±24%    2.56ms ±20%     ~     (p=0.353 n=10+10)
Read/rows=10_cols=1000_chunks=10-8        24.3ms ± 2%    24.0ms ± 3%     ~     (p=0.059 n=8+9)
Read/rows=100_cols=1_chunks=10-8          74.9µs ±11%    62.1µs ±19%  -17.21%  (p=0.004 n=10+10)
Read/rows=100_cols=10_chunks=10-8          559µs ±21%     474µs ±21%  -15.12%  (p=0.009 n=10+10)
Read/rows=100_cols=100_chunks=10-8        5.53ms ±21%    4.36ms ±16%  -21.27%  (p=0.000 n=10+9)
Read/rows=100_cols=1000_chunks=10-8       41.9ms ± 3%    42.2ms ±13%     ~     (p=0.684 n=10+10)
Read/rows=1000_cols=1_chunks=10-8          421µs ±13%     320µs ±10%  -23.98%  (p=0.000 n=10+10)
Read/rows=1000_cols=10_chunks=10-8        3.24ms ±24%    2.63ms ±15%  -18.77%  (p=0.007 n=10+10)
Read/rows=1000_cols=100_chunks=10-8       33.0ms ±17%    27.0ms ±19%  -18.09%  (p=0.001 n=10+10)
Read/rows=1000_cols=1000_chunks=10-8       219ms ± 1%     211ms ± 2%   -3.81%  (p=0.000 n=9+10)
Read/rows=10000_cols=1_chunks=10-8        3.66ms ±11%    2.91ms ±10%  -20.27%  (p=0.000 n=10+10)
Read/rows=10000_cols=10_chunks=10-8       31.8ms ±16%    25.6ms ±15%  -19.66%  (p=0.000 n=10+10)
Read/rows=10000_cols=100_chunks=10-8       192ms ± 1%     182ms ± 1%   -5.19%  (p=0.000 n=10+10)
Read/rows=10000_cols=1000_chunks=10-8      1.99s ± 1%     1.93s ± 2%   -3.26%  (p=0.000 n=9+9)
Read/rows=100000_cols=1_chunks=10-8       32.9ms ± 4%    26.1ms ± 4%  -20.75%  (p=0.000 n=10+10)
Read/rows=100000_cols=10_chunks=10-8       203ms ± 1%     198ms ± 7%     ~     (p=0.123 n=10+10)
Read/rows=100000_cols=100_chunks=10-8      2.00s ± 1%     1.92s ± 1%   -4.24%  (p=0.000 n=10+8)
Read/rows=100000_cols=1000_chunks=10-8     22.7s ± 2%     22.0s ± 2%   -3.31%  (p=0.000 n=9+10)

name                                    old alloc/op   new alloc/op   delta
Read/rows=10_cols=1_chunks=10-8           32.7kB ± 0%    32.2kB ± 0%   -1.32%  (p=0.000 n=10+10)
Read/rows=10_cols=10_chunks=10-8           281kB ± 0%     277kB ± 0%   -1.54%  (p=0.000 n=10+10)
Read/rows=10_cols=100_chunks=10-8         2.77MB ± 0%    2.73MB ± 0%   -1.58%  (p=0.000 n=10+10)
Read/rows=10_cols=1000_chunks=10-8        27.8MB ± 0%    27.3MB ± 0%   -1.59%  (p=0.000 n=9+9)
Read/rows=100_cols=1_chunks=10-8          44.0kB ± 0%    39.3kB ± 0%  -10.80%  (p=0.000 n=10+10)
Read/rows=100_cols=10_chunks=10-8          381kB ± 0%     333kB ± 0%  -12.48%  (p=0.000 n=10+10)
Read/rows=100_cols=100_chunks=10-8        3.78MB ± 0%    3.29MB ± 0%  -12.75%  (p=0.000 n=10+10)
Read/rows=100_cols=1000_chunks=10-8       37.9MB ± 0%    33.1MB ± 0%  -12.83%  (p=0.000 n=10+9)
Read/rows=1000_cols=1_chunks=10-8          200kB ± 0%     152kB ± 0%  -23.99%  (p=0.000 n=10+10)
Read/rows=1000_cols=10_chunks=10-8        1.84MB ± 0%    1.36MB ± 0%  -26.08%  (p=0.000 n=10+9)
Read/rows=1000_cols=100_chunks=10-8       18.4MB ± 0%    13.5MB ± 0%  -26.44%  (p=0.000 n=9+10)
Read/rows=1000_cols=1000_chunks=10-8       184MB ± 0%     135MB ± 0%  -26.62%  (p=0.000 n=10+10)
Read/rows=10000_cols=1_chunks=10-8        1.65MB ± 0%    1.17MB ± 0%  -29.02%  (p=0.000 n=10+10)
Read/rows=10000_cols=10_chunks=10-8       15.7MB ± 0%    10.9MB ± 0%  -30.65%  (p=0.000 n=10+10)
Read/rows=10000_cols=100_chunks=10-8       156MB ± 0%     108MB ± 0%  -31.12%  (p=0.000 n=10+8)
Read/rows=10000_cols=1000_chunks=10-8     1.58GB ± 0%    1.09GB ± 0%  -31.06%  (p=0.000 n=10+10)
Read/rows=100000_cols=1_chunks=10-8       20.1MB ± 0%    15.3MB ± 0%  -23.93%  (p=0.000 n=10+9)
Read/rows=100000_cols=10_chunks=10-8       197MB ± 0%     149MB ± 0%  -24.39%  (p=0.000 n=10+8)
Read/rows=100000_cols=100_chunks=10-8     1.96GB ± 0%    1.47GB ± 0%  -24.86%  (p=0.000 n=10+10)
Read/rows=100000_cols=1000_chunks=10-8    19.7GB ± 0%    14.7GB ± 0%  -25.00%  (p=0.000 n=10+10)

name                                    old allocs/op  new allocs/op  delta
Read/rows=10_cols=1_chunks=10-8              319 ± 0%       310 ± 0%   -2.82%  (p=0.000 n=10+10)
Read/rows=10_cols=10_chunks=10-8           2.63k ± 0%     2.62k ± 0%   -0.34%  (p=0.000 n=10+10)
Read/rows=10_cols=100_chunks=10-8          25.7k ± 0%     25.7k ± 0%   -0.04%  (p=0.000 n=10+10)
Read/rows=10_cols=1000_chunks=10-8          256k ± 0%      256k ± 0%   -0.00%  (p=0.000 n=10+10)
Read/rows=100_cols=1_chunks=10-8             524 ± 0%       425 ± 0%  -18.89%  (p=0.000 n=10+10)
Read/rows=100_cols=10_chunks=10-8          3.02k ± 0%     2.92k ± 0%   -3.27%  (p=0.000 n=10+10)
Read/rows=100_cols=100_chunks=10-8         28.0k ± 0%     27.9k ± 0%   -0.35%  (p=0.000 n=10+10)
Read/rows=100_cols=1000_chunks=10-8         277k ± 0%      277k ± 0%   -0.04%  (p=0.000 n=10+10)
Read/rows=1000_cols=1_chunks=10-8          2.43k ± 0%     1.44k ± 0%  -41.04%  (p=0.000 n=10+10)
Read/rows=1000_cols=10_chunks=10-8         5.92k ± 0%     4.92k ± 0%  -16.87%  (p=0.000 n=10+10)
Read/rows=1000_cols=100_chunks=10-8        40.8k ± 0%     39.8k ± 0%   -2.45%  (p=0.000 n=10+10)
Read/rows=1000_cols=1000_chunks=10-8        389k ± 0%      388k ± 0%   -0.26%  (p=0.000 n=10+10)
Read/rows=10000_cols=1_chunks=10-8         20.6k ± 0%     10.6k ± 0%  -48.58%  (p=0.000 n=10+10)
Read/rows=10000_cols=10_chunks=10-8        25.4k ± 0%     15.4k ± 0%  -39.33%  (p=0.000 n=10+10)
Read/rows=10000_cols=100_chunks=10-8       73.8k ± 0%     63.8k ± 0%  -13.56%  (p=0.000 n=10+10)
Read/rows=10000_cols=1000_chunks=10-8       557k ± 0%      547k ± 0%   -1.79%  (p=0.000 n=10+10)
Read/rows=100000_cols=1_chunks=10-8         201k ± 0%      101k ± 0%  -49.78%  (p=0.000 n=10+10)
Read/rows=100000_cols=10_chunks=10-8        208k ± 0%      108k ± 0%  -48.02%  (p=0.000 n=10+10)
Read/rows=100000_cols=100_chunks=10-8       282k ± 0%      182k ± 0%  -35.49%  (p=0.000 n=10+10)
Read/rows=100000_cols=1000_chunks=10-8     1.02M ± 0%     0.92M ± 0%   -9.83%  (p=0.000 n=10+10)
```
@codecov-io

Codecov Report

Merging #3073 into master will decrease coverage by 19.33%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           master    #3073       +/-   ##
===========================================
- Coverage   87.06%   67.73%   -19.34%     
===========================================
  Files         489       58      -431     
  Lines       68974     3766    -65208     
===========================================
- Hits        60055     2551    -57504     
+ Misses       8818     1114     -7704     
  Partials      101      101

| Impacted Files | Coverage Δ |
|---|---|
| go/arrow/csv/csv.go | 81.51% <100%> (+0.08%) ⬆️ |
| python/pyarrow/ipc.pxi | |
| cpp/src/parquet/column_page.h | |
| cpp/src/parquet/bloom_filter-test.cc | |
| cpp/src/plasma/client.cc | |
| cpp/src/arrow/io/test-common.h | |
| cpp/src/gandiva/function_registry.h | |
| cpp/src/arrow/util/int-util-test.cc | |
| cpp/src/arrow/python/io.cc | |
| python/pyarrow/hdfs.py | |
| ... and 422 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 98bdde8...67a3272.

sbinet commented Dec 3, 2018

PTAL @alexandreyc @stuartcarnie

needs #3071

@stuartcarnie stuartcarnie left a comment

LGTM, @sbinet 👍
