
PARQUET-2204: [parquet-cpp] TypedColumnReaderImpl::Skip should reuse scratch space #14509

Merged
merged 7 commits into apache:master on Dec 8, 2022

Conversation

fatemehp
Contributor

TypedColumnReaderImpl::Skip allocates scratch space on every call. The scratch space is used to read repetition/definition levels and values that are then thrown away. Microbenchmarks show that this repeated allocation slows Skip down. The scratch space can instead be allocated once and reused.
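
A minimal sketch of the idea in C++ (illustrative names and sizes, not the actual parquet-cpp code): the reader owns the scratch buffers and allocates them lazily on the first Skip call, then reuses them on later calls.

#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TypedColumnReaderImpl<DType>.
class SkippingReader {
 public:
  // Skip decodes rep/def levels and values into scratch buffers that are
  // immediately discarded; the buffers are allocated once and reused.
  void Skip(int64_t num_values) {
    if (scratch_levels_.empty()) {
      // First Skip call: allocate scratch once and keep it for the lifetime
      // of the reader instead of re-allocating on every call.
      scratch_levels_.resize(kSkipBatchSize);
      scratch_values_.resize(kSkipBatchSize * value_width_);
    }
    while (num_values > 0) {
      const int64_t batch = std::min<int64_t>(num_values, kSkipBatchSize);
      // ... decode `batch` levels/values into the scratch buffers and
      //     throw them away ...
      num_values -= batch;
    }
  }

 private:
  static constexpr int64_t kSkipBatchSize = 1024;  // hypothetical fixed size
  int64_t value_width_ = sizeof(int32_t);          // example fixed-width type
  std::vector<int16_t> scratch_levels_;            // rep/def level scratch
  std::vector<uint8_t> scratch_values_;            // value scratch
};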

@fatemehp
Contributor Author

@emkornfield could you take a look at this pull request? Thanks!

@github-actions

⚠️ Ticket has no components in JIRA, make sure you assign one.

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@fatemehp
Contributor Author

I am planning to check in the microbenchmark as a separate pull request.

@emkornfield
Contributor

Looks like this change is the same as the one in the new PR with the microbenchmark; can we close this one in favor of that?

@fatemehp
Contributor Author

I removed these commits from the benchmark pull request.

@fatemehp
Contributor Author

fatemehp commented Oct 26, 2022

Here are benchmark results before and after the change proposed in this pull request. Only the Skip numbers are included, since this change does not affect read performance. We see up to a 15x reduction in time when the batch size (the last parameter) is 1, i.e. when the scratch space is repeatedly re-allocated. The parameter key is below; a rough sketch of how such a parameterized benchmark might be registered follows the list.

PARAMETERS:
1) definition level
2) repetition level
3) 1 for Skip, 0 for ReadBatch (here we only have Skip)
4) batch size
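
For readers unfamiliar with the BM_Skip/... naming, here is a rough sketch of how a benchmark parameterized this way might be registered with Google Benchmark (hypothetical code; the actual microbenchmark is checked in via a separate pull request):

#include <benchmark/benchmark.h>

// Hypothetical benchmark body; only the argument plumbing is shown.
static void BM_Skip(benchmark::State& state) {
  const int64_t def_level = state.range(0);   // 1) definition level
  const int64_t rep_level = state.range(1);   // 2) repetition level
  const bool do_skip = state.range(2) == 1;   // 3) 1 for Skip, 0 for ReadBatch
  const int64_t batch_size = state.range(3);  // 4) batch size
  for (auto _ : state) {
    // ... build a column of data and Skip (or ReadBatch) `batch_size`
    //     values at a time until the column is exhausted ...
  }
}
// Produces the BM_Skip/<def>/<rep>/<skip>/<batch_size> names shown below.
BENCHMARK(BM_Skip)->Args({0, 0, 1, 1})->Args({0, 0, 1, 1000})->Args({1, 1, 1, 100000});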

BEFORE
-------------------------------------------------------------------------------
Benchmark                Time             CPU              Iterations
-------------------------------------------------------------------------------
BM_Skip/0/0/1/1       150319842 ns    149567587 ns         1000
BM_Skip/0/0/1/1000       244565 ns       244931 ns         1000
BM_Skip/0/0/1/10000      115395 ns       115924 ns         1000
BM_Skip/0/0/1/100000     115241 ns       115916 ns         1000
BM_Skip/1/0/1/1       149224507 ns    148683644 ns         1000
BM_Skip/1/0/1/1000       805812 ns       805417 ns         1000
BM_Skip/1/0/1/10000      702999 ns       700108 ns         1000
BM_Skip/1/0/1/100000     654163 ns       651947 ns         1000
BM_Skip/1/1/1/1       165600118 ns    164864530 ns         1000
BM_Skip/1/1/1/1000      1130975 ns      1130252 ns         1000
BM_Skip/1/1/1/10000     1009628 ns      1009589 ns         1000
BM_Skip/1/1/1/100000    1029064 ns      1028726 ns         1000

AFTER
-------------------------------------------------------------------------------
Benchmark                 Time             CPU             Iterations
-------------------------------------------------------------------------------
BM_Skip/0/0/1/1        10280337 ns     10234495 ns         1000
BM_Skip/0/0/1/1000       101228 ns       101436 ns         1000
BM_Skip/0/0/1/10000       96565 ns        96648 ns         1000
BM_Skip/0/0/1/100000      96598 ns        96814 ns         1000
BM_Skip/1/0/1/1        25771605 ns     25718891 ns         1000
BM_Skip/1/0/1/1000       651609 ns       650940 ns         1000
BM_Skip/1/0/1/10000      660890 ns       654217 ns         1000
BM_Skip/1/0/1/100000     640417 ns       636855 ns         1000
BM_Skip/1/1/1/1        36639124 ns     36433537 ns         1000
BM_Skip/1/1/1/1000       978058 ns       976403 ns         1000
BM_Skip/1/1/1/10000      997193 ns       996529 ns         1000
BM_Skip/1/1/1/100000     999080 ns       993296 ns         1000

@fatemehp
Contributor Author

We need to consider the memory implications of this change. This means that if at least one Skip is requested, the scratch space will be allocated on the heap and kept until the column reader is destroyed. The scratch space can be as big as 12 KB. If we have 1024 column readers open at one time, that means a 12 MB overhead. Is it common to have this many readers open at the same time? If yes, is this overhead acceptable?

// value type for batch_size.
void InitScratchForSkip(int64_t batch_size);

// Scrtach space for reading and throwing away rep/def levels and values when
Contributor

Suggested change
// Scrtach space for reading and throwing away rep/def levels and values when
// Scratch space for reading and throwing away rep/def levels and values when

Contributor Author

Done.

@emkornfield
Contributor

We need to consider the memory implications of this change. This means that if at least one Skip is requested, the scratch space will be allocated on the heap and kept until the column reader is destroyed. The scratch space can be as big as 12 KB. If we have 1024 column readers open at one time, that means a 12 MB overhead. Is it common to have this many readers open at the same time? If yes, is this overhead acceptable?

That seems OK to me; another option would be to put this behind a feature flag. This breaks only after a second call to skip?

@fatemehp
Contributor Author

fatemehp commented Nov 1, 2022

@emkornfield Could you give some context about when we normally use the feature flags? If we want to control this using a flag, that means we will support both cases of 1) allocating within the Skip function and 2) in the column reader. I am wondering if that is doing more than what we need.

Also, I don't fully understand your question here: "This breaks only after a second call to skip?"
The scratch space is allocated on the "first" call to skip and will be retained until the column reader is destroyed.

@@ -1151,6 +1159,14 @@ int64_t TypedColumnReaderImpl<DType>::ReadBatchSpaced(
return total_values;
}

template <typename DType>
void TypedColumnReaderImpl<DType>::InitScratchForSkip(int64_t batch_size) {
Member

What if the batch size is not the same between calls?

Contributor Author

I removed this argument and clarified this in the code. The batch size is constant and will not change.
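
For illustration, the shape of the change being described might be roughly this (hypothetical names and value, not the exact patch):

#include <cstdint>

class ColumnReaderSketch {  // hypothetical stand-in for TypedColumnReaderImpl
 public:
  // Before: the caller passed a batch size, which could in principle change
  // between calls and leave a previously sized scratch buffer too small:
  //   void InitScratchForSkip(int64_t batch_size);
  //
  // After: the scratch is sized from one fixed constant, so repeated calls
  // are harmless and the function is a no-op once the buffer exists.
  void InitScratchForSkip();

 private:
  static constexpr int64_t kSkipScratchBatchSize = 1024;  // hypothetical value
};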

@pitrou
Member

pitrou commented Nov 15, 2022

Can you merge the latest changes from git master?

@emkornfield
Contributor

@emkornfield Could you give some context about when we normally use the feature flags? If we want to control this using a flag, that means we will support both cases of 1) allocating within the Skip function and 2) in the column reader. I am wondering if that is doing more than what we need.

Also, I don't fully understand your question here: "This breaks only after a second call to skip?"
The scratch space is allocated on the "first" call to skip and will be retained until the column reader is destroyed.

Sorry, I think this answers the question; I think I meant 'help' instead of 'breaks'.

@fatemehp
Contributor Author

fatemehp commented Dec 7, 2022

@pitrou, @emkornfield I have pulled in the latest changes, and updated the code so that the RecordReader uses the same scratch space. Please take a look.

@pitrou
Member

pitrou commented Dec 8, 2022

I get the following benchmark numbers:

  • before:
-------------------------------------------------------------------------------------------------------------------
Benchmark                                                         Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------------
ColumnReaderSkipInt32/Repetition:0/BatchSize:100            2256949 ns      2256574 ns          310 bytes_per_second=2.1131G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:1000            322274 ns       322302 ns         2147 bytes_per_second=14.7947G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:10000           123075 ns       123135 ns         5695 bytes_per_second=38.7247G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:100000           34654 ns        34707 ns        19814 bytes_per_second=137.388G/s

ColumnReaderSkipInt32/Repetition:1/BatchSize:100            4483059 ns      4482271 ns          159 bytes_per_second=579.976M/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:1000            974701 ns       974606 ns          719 bytes_per_second=2.60483G/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:10000           630965 ns       630913 ns         1111 bytes_per_second=4.02382G/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:100000          191333 ns       191349 ns         3553 bytes_per_second=13.2672G/s

ColumnReaderSkipInt32/Repetition:2/BatchSize:100            5919031 ns      5917853 ns          119 bytes_per_second=465.768M/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:1000           1422720 ns      1422501 ns          488 bytes_per_second=1.89226G/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:10000          1008342 ns      1008178 ns          698 bytes_per_second=2.66991G/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:100000          306069 ns       306087 ns         2281 bytes_per_second=8.79405G/s
  • after:
ColumnReaderSkipInt32/Repetition:0/BatchSize:100             246589 ns       246512 ns         2869 bytes_per_second=19.3434G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:1000            114571 ns       114636 ns         6122 bytes_per_second=41.5957G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:10000            98790 ns        98844 ns         7101 bytes_per_second=48.2414G/s
ColumnReaderSkipInt32/Repetition:0/BatchSize:100000           32585 ns        32662 ns        22074 bytes_per_second=145.992G/s

ColumnReaderSkipInt32/Repetition:1/BatchSize:100            1716748 ns      1716514 ns          408 bytes_per_second=1.47897G/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:1000            644526 ns       644500 ns         1091 bytes_per_second=3.93899G/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:10000           567605 ns       567578 ns         1241 bytes_per_second=4.47283G/s
ColumnReaderSkipInt32/Repetition:1/BatchSize:100000          180075 ns       180121 ns         3862 bytes_per_second=14.0943G/s

ColumnReaderSkipInt32/Repetition:2/BatchSize:100            2880737 ns      2880209 ns          243 bytes_per_second=956.995M/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:1000           1040618 ns      1040539 ns          676 bytes_per_second=2.58687G/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:10000           916319 ns       916171 ns          767 bytes_per_second=2.93803G/s
ColumnReaderSkipInt32/Repetition:2/BatchSize:100000          288829 ns       288867 ns         2418 bytes_per_second=9.31826G/s

Nice improvement!

Member

@pitrou pitrou left a comment

+1, thank you @fatemehp !

@pitrou pitrou merged commit 4bc9355 into apache:master Dec 8, 2022