[C++] [Parquet] FLBA reader does not pre-reserve memory #39122

Hattonuri · 2023-12-07T11:06:16Z

Describe the enhancement requested

We can reserve memory before running the read loops.
We can also check for a zero null count, so that the validity bitmap is not consulted when there are no nulls.

arrow/cpp/src/parquet/column_reader.cc

Lines 2074 to 2090 in f7286a9

    
    void ReadValuesSpaced(int64_t values_to_read, int64_t null_count) override {
      uint8_t* valid_bits = valid_bits_->mutable_data();
      const int64_t valid_bits_offset = values_written_;
      auto values = ValuesHead<FLBA>();
      int64_t num_decoded = this->current_decoder_->DecodeSpaced(
          values, static_cast<int>(values_to_read), static_cast<int>(null_count),
          valid_bits, valid_bits_offset);
      ARROW_DCHECK_EQ(num_decoded, values_to_read);
      for (int64_t i = 0; i < num_decoded; i++) {
        if (::arrow::bit_util::GetBit(valid_bits, valid_bits_offset + i)) {
          PARQUET_THROW_NOT_OK(builder_->Append(values[i].ptr));
        } else {
          PARQUET_THROW_NOT_OK(builder_->AppendNull());
        }
      }
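The two suggestions above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Arrow code: `ToyFLBA`, `ToyFlbaBuilder`, `GetBit`, and `AppendSpaced` are hypothetical stand-ins for the FLBA value struct, the fixed-size binary builder, and the loop in `ReadValuesSpaced`. It reserves capacity once, then skips the validity bitmap entirely when the batch reports zero nulls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the FLBA value struct (only the pointer matters here).
struct ToyFLBA {
  const uint8_t* ptr;
};

// Hypothetical stand-in for a fixed-size binary builder; NOT the Arrow API.
class ToyFlbaBuilder {
 public:
  explicit ToyFlbaBuilder(int byte_width)
      : byte_width_(static_cast<std::size_t>(byte_width)) {}

  // Grow capacity once so the appends that follow do not reallocate.
  void Reserve(int64_t additional) {
    const std::size_t n = static_cast<std::size_t>(additional);
    data_.reserve(data_.size() + n * byte_width_);
    valid_.reserve(valid_.size() + n);
  }
  void Append(const uint8_t* value) {
    data_.insert(data_.end(), value, value + byte_width_);
    valid_.push_back(true);
  }
  void AppendNull() {
    data_.insert(data_.end(), byte_width_, uint8_t{0});  // zero-filled slot
    valid_.push_back(false);
    ++null_count_;
  }
  int64_t length() const { return static_cast<int64_t>(valid_.size()); }
  int64_t null_count() const { return null_count_; }

 private:
  std::size_t byte_width_;
  std::vector<uint8_t> data_;
  std::vector<bool> valid_;
  int64_t null_count_ = 0;
};

// LSB-first bit lookup, the bit order used by Arrow validity bitmaps.
inline bool GetBit(const uint8_t* bits, int64_t i) {
  return ((bits[i >> 3] >> (i & 7)) & 1) != 0;
}

// Append `num_values` decoded (spaced) values to the builder.
void AppendSpaced(ToyFlbaBuilder* builder, const ToyFLBA* values,
                  int64_t num_values, int64_t null_count,
                  const uint8_t* valid_bits, int64_t valid_bits_offset) {
  assert(null_count >= 0 && null_count <= num_values);
  builder->Reserve(num_values);  // suggestion 1: reserve before the loop
  if (null_count == 0) {
    // suggestion 2: no nulls in this batch, so the bitmap is never read
    for (int64_t i = 0; i < num_values; ++i) builder->Append(values[i].ptr);
    return;
  }
  for (int64_t i = 0; i < num_values; ++i) {
    if (GetBit(valid_bits, valid_bits_offset + i)) {
      builder->Append(values[i].ptr);
    } else {
      builder->AppendNull();
    }
  }
}
```

In this sketch the fast path never dereferences `valid_bits`, so the branch per value disappears for null-free batches; whether that is a measurable win in the real reader would need benchmarking.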

We hit this case when a column has optional fields but the batch being read contains no nulls, since the spaced path is selected from the column descriptor alone:

arrow/cpp/src/parquet/column_reader.cc

Lines 77 to 93 in ef3797d

    
    inline bool HasSpacedValues(const ColumnDescriptor* descr) {
      if (descr->max_repetition_level() > 0) {
        // repeated+flat case
        return !descr->schema_node()->is_required();
      } else {
        // non-repeated+nested case
        // Find if a node forces nulls in the lowest level along the hierarchy
        const schema::Node* node = descr->schema_node().get();
        while (node) {
          if (node->is_optional()) {
            return true;
          }
          node = node->parent();
        }
        return false;
      }
    }
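To make the point concrete, here is a small sketch of the non-repeated branch above using a hypothetical `ToyNode` in place of `schema::Node`. The helper `NeedsValidityScan` is an illustrative name, not an Arrow function: it shows that the schema decides whether the spaced path is taken, while the per-batch null count decides whether the bitmap actually has to be inspected.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for parquet::schema::Node.
struct ToyNode {
  bool optional;
  const ToyNode* parent;
};

// Mirrors the non-repeated branch of the excerpt: walking from the leaf to
// the root, any optional node selects the spaced path.
inline bool AnyOptionalAncestor(const ToyNode* node) {
  while (node != nullptr) {
    if (node->optional) return true;
    node = node->parent;
  }
  return false;
}

// Illustrative only: the spaced path is a per-column (schema) property, but
// scanning the validity bitmap is only needed when this batch has nulls.
inline bool NeedsValidityScan(const ToyNode* leaf, int64_t batch_null_count) {
  return AnyOptionalAncestor(leaf) && batch_null_count > 0;
}
```

For a required leaf nested under an optional group, `AnyOptionalAncestor` returns true, yet a batch with `batch_null_count == 0` needs no bitmap scan.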

Component(s)

C++, Parquet

### Rationale for this change

The FLBA implementation of RecordReader is suboptimal:

* it doesn't preallocate the output array
* it reads the decoded validity bitmap one bit at a time and recreates it, one bit at a time

### What changes are included in this PR?

Optimize the FLBA implementation of RecordReader so as to avoid the aforementioned inefficiencies.

I did a quick-and-dirty benchmark on a Parquet file with two columns:

* column 1: uncompressed, PLAIN-encoded, FLBA<3> with no nulls
* column 2: uncompressed, PLAIN-encoded, FLBA<3> with 25% nulls

With git main, the file can be read at 465 MB/s. With this PR, the file can be read at 700 MB/s.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: #39122

Lead-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Antoine Pitrou <antoine@python.org>
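As an illustration of the second inefficiency named in the rationale, the sketch below copies a run of validity bits in bulk when both offsets are byte-aligned and falls back to a per-bit loop otherwise. `CopyBits` is a hypothetical helper written for this example; it is not taken from the PR, and Arrow ships its own bitmap utilities that a real implementation would presumably use instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `length` bits from `src` (starting at bit `src_offset`) into `dst`
// (starting at bit `dst_offset`). Bits are LSB-first within each byte.
// Hypothetical helper for illustration; not an Arrow function.
void CopyBits(const uint8_t* src, int64_t src_offset, int64_t length,
              uint8_t* dst, int64_t dst_offset) {
  if (length <= 0) return;
  if (src_offset % 8 == 0 && dst_offset % 8 == 0) {
    // Aligned case: move whole bytes at once instead of testing each bit.
    const int64_t full_bytes = length / 8;
    std::memcpy(dst + dst_offset / 8, src + src_offset / 8,
                static_cast<std::size_t>(full_bytes));
    src_offset += full_bytes * 8;
    dst_offset += full_bytes * 8;
    length -= full_bytes * 8;
  }
  // Remaining bits (or the whole run when unaligned): one bit at a time.
  for (int64_t i = 0; i < length; ++i) {
    const int64_t s = src_offset + i;
    const int64_t d = dst_offset + i;
    const bool bit = ((src[s >> 3] >> (s & 7)) & 1) != 0;
    const uint8_t mask = static_cast<uint8_t>(1u << (d & 7));
    if (bit) {
      dst[d >> 3] = static_cast<uint8_t>(dst[d >> 3] | mask);
    } else {
      dst[d >> 3] = static_cast<uint8_t>(dst[d >> 3] & ~mask);
    }
  }
}
```

The unaligned fallback keeps the sketch short; a production version would shift and merge whole words rather than loop per bit.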


Hattonuri added the Type: enhancement label Dec 7, 2023

github-actions bot added Component: Parquet Component: C++ labels Dec 7, 2023

github-actions bot assigned Hattonuri Dec 7, 2023

github-actions bot mentioned this issue Dec 7, 2023

GH-39122: [C++] [Parquet] FLBA reader reading preallocs and null count check #39120

Closed

pitrou added a commit to pitrou/arrow that referenced this issue Dec 7, 2023

apacheGH-39122: [C++][Parquet] Optimize FLBA record reader

e0b5609

github-actions bot mentioned this issue Dec 7, 2023

GH-39122: [C++][Parquet] Optimize FLBA record reader #39124

Merged

pitrou closed this as completed in #39124 Dec 9, 2023

pitrou added this to the 15.0.0 milestone Dec 9, 2023
