ARROW-8311: [C++] Add push style stream format reader
This change adds the following push style reader classes:

  * ipc::MessageEmitter
  * ipc::RecordBatchStreamEmitter

Push style readers don't read data from a stream directly. Instead,
users pass them data that has already been read. This style is useful
with event-driven IO APIs, where we can't read from the stream
directly and only receive already-read data through a callback, like:

    void on_read(const uint8_t* data, size_t data_size) {
       process_data(data, data_size);
    }
    register_read_event(on_read);
    run_event_loop();

We can't use the current reader API with an event-driven IO API, but
we can use these push style readers with one.
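
As a rough illustration of the push style, here is a minimal,
self-contained sketch that does not use Arrow at all. The class name
PushFrameDecoder, its Consume() method, and the framing (a 4-byte
little-endian length followed by the payload) are assumptions made up
for this example. They are not the API or wire format of
ipc::MessageEmitter. The point is only that the decoder never reads
from a stream: the caller pushes chunks of any size, and each complete
frame is handed to a listener.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical push style decoder, NOT Arrow's API. Assumed framing:
// a 4-byte little-endian payload length followed by the payload bytes.
class PushFrameDecoder {
 public:
  using Listener = std::function<void(std::string)>;

  explicit PushFrameDecoder(Listener listener)
      : listener_(std::move(listener)) {}

  // Receives a chunk that the caller has already read. Chunks may split
  // a frame anywhere; every frame completed by this chunk is emitted.
  void Consume(const uint8_t* data, size_t data_size) {
    buffer_.insert(buffer_.end(), data, data + data_size);
    size_t offset = 0;
    while (buffer_.size() - offset >= kHeaderSize) {
      const uint8_t* header = buffer_.data() + offset;
      const size_t length = static_cast<size_t>(header[0]) |
                            (static_cast<size_t>(header[1]) << 8) |
                            (static_cast<size_t>(header[2]) << 16) |
                            (static_cast<size_t>(header[3]) << 24);
      if (buffer_.size() - offset - kHeaderSize < length) {
        break;  // Payload is incomplete: wait for the next chunk.
      }
      const char* payload =
          reinterpret_cast<const char*>(header + kHeaderSize);
      listener_(std::string(payload, length));
      offset += kHeaderSize + length;
    }
    // Keep only the bytes of the frame that is still incomplete.
    buffer_.erase(buffer_.begin(),
                  buffer_.begin() + static_cast<std::ptrdiff_t>(offset));
  }

 private:
  static constexpr size_t kHeaderSize = 4;
  Listener listener_;
  std::vector<uint8_t> buffer_;
};
```

With an event loop like the one above, on_read() would simply forward
each chunk to Consume(data, data_size). The decoder buffers partial
frames itself, so the callback does not need to know where frame
boundaries fall.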

The current Message reader is changed to use ipc::MessageEmitter
internally, so the reader implementation isn't duplicated. Our
benchmark shows no performance regression.

Before:

    Running release/arrow-ipc-read-write-benchmark
    Run on (12 X 4600 MHz CPU s)
    CPU Caches:
      L1 Data 32K (x6)
      L1 Instruction 32K (x6)
      L2 Unified 256K (x6)
      L3 Unified 12288K (x1)
    Load Average: 0.85, 0.84, 0.65
    -----------------------------------------------------------------------------------------
    Benchmark                               Time             CPU   Iterations UserCounters...
    -----------------------------------------------------------------------------------------
    ReadRecordBatch/1/real_time           886 ns          886 ns       774286 bytes_per_second=1102.15G/s
    ReadRecordBatch/4/real_time          1601 ns         1601 ns       436258 bytes_per_second=610.078G/s
    ReadRecordBatch/16/real_time         4819 ns         4820 ns       143568 bytes_per_second=202.663G/s
    ReadRecordBatch/64/real_time        18291 ns        18296 ns        38586 bytes_per_second=53.3893G/s
    ReadRecordBatch/256/real_time       84852 ns        84872 ns         8317 bytes_per_second=11.5091G/s
    ReadRecordBatch/1024/real_time     341091 ns       341168 ns         2049 bytes_per_second=2.86306G/s
    ReadRecordBatch/4096/real_time    1368049 ns      1368361 ns          511 bytes_per_second=730.968M/s
    ReadRecordBatch/8192/real_time    2676778 ns      2677341 ns          265 bytes_per_second=373.584M/s

After:

    Running release/arrow-ipc-read-write-benchmark
    Run on (12 X 4600 MHz CPU s)
    CPU Caches:
      L1 Data 32K (x6)
      L1 Instruction 32K (x6)
      L2 Unified 256K (x6)
      L3 Unified 12288K (x1)
    Load Average: 0.88, 0.85, 0.66
    -----------------------------------------------------------------------------------------
    Benchmark                               Time             CPU   Iterations UserCounters...
    -----------------------------------------------------------------------------------------
    ReadRecordBatch/1/real_time           891 ns          891 ns       769579 bytes_per_second=1095.57G/s
    ReadRecordBatch/4/real_time          1599 ns         1599 ns       435756 bytes_per_second=610.746G/s
    ReadRecordBatch/16/real_time         4834 ns         4835 ns       144374 bytes_per_second=202.027G/s
    ReadRecordBatch/64/real_time        18204 ns        18206 ns        38190 bytes_per_second=53.6465G/s
    ReadRecordBatch/256/real_time       84142 ns        84154 ns         8309 bytes_per_second=11.6061G/s
    ReadRecordBatch/1024/real_time     343105 ns       343148 ns         2035 bytes_per_second=2.84625G/s
    ReadRecordBatch/4096/real_time    1399287 ns      1399484 ns          511 bytes_per_second=714.65M/s
    ReadRecordBatch/8192/real_time    2641529 ns      2641845 ns          263 bytes_per_second=378.569M/s

Closes #6804 from kou/cpp-record-batch-emitter

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
kou authored and wesm committed Apr 10, 2020
1 parent e570db9 commit 866e6a8
Showing 6 changed files with 1,421 additions and 147 deletions.