ARROW-16983: [Go][Parquet] fix EstimatedDataEncodedSize of DeltaByteArrayEncoder (#13522)

`DeltaByteArrayEncoder` extends `encoder`, which calculates `EstimatedDataEncodedSize()` by calling `Len()` on its `sink`. However, `DeltaByteArrayEncoder` does not write its data to `sink`; it writes to `prefixEncoder` and `suffixEncoder` instead. As a result, `EstimatedDataEncodedSize()` always returned zero, and `FlushCurrentPage` was never called.
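The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Arrow code: `baseEncoder`, `buggyDeltaEncoder`, `fixedDeltaEncoder`, and `Put` are hypothetical stand-ins, and a `bytes.Buffer` is assumed in place of the real sink. It shows how a size estimate inherited from an embedded type stays at zero when data only goes to sub-encoders, and how overriding the method to sum the sub-encoders fixes it.

```go
package main

import (
	"bytes"
	"fmt"
)

// baseEncoder is a hypothetical stand-in for the embedded `encoder` type:
// it estimates its encoded size from the length of its own sink.
type baseEncoder struct {
	sink bytes.Buffer
}

func (b *baseEncoder) EstimatedDataEncodedSize() int64 {
	return int64(b.sink.Len())
}

// buggyDeltaEncoder embeds baseEncoder but writes only to its two
// sub-encoders, so the promoted method always sees an empty sink.
type buggyDeltaEncoder struct {
	baseEncoder
	prefix baseEncoder
	suffix baseEncoder
}

// Put is a hypothetical write path: nothing ever reaches the embedded sink.
func (e *buggyDeltaEncoder) Put(prefix, suffix []byte) {
	e.prefix.sink.Write(prefix)
	e.suffix.sink.Write(suffix)
}

// fixedDeltaEncoder shadows the promoted method and sums the sub-encoders,
// mirroring the shape of the fix in this commit.
type fixedDeltaEncoder struct {
	buggyDeltaEncoder
}

func (e *fixedDeltaEncoder) EstimatedDataEncodedSize() int64 {
	return e.prefix.EstimatedDataEncodedSize() + e.suffix.EstimatedDataEncodedSize()
}

func main() {
	buggy := &buggyDeltaEncoder{}
	buggy.Put([]byte{0, 0}, []byte("hello"))

	fixed := &fixedDeltaEncoder{}
	fixed.Put([]byte{0, 0}, []byte("hello"))

	// The buggy estimate stays at 0; the fixed one reports 2 + 5 bytes.
	fmt.Println(buggy.EstimatedDataEncodedSize(), fixed.EstimatedDataEncodedSize())
}
```

Any caller that compares the estimate against a page-size threshold would never see that threshold crossed with the buggy variant, which is consistent with the missing `FlushCurrentPage` calls reported here.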

Authored-by: Matt DePero <depero@neeva.co>
Signed-off-by: Matthew Topol <mtopol@factset.com>
mdepero committed Jul 6, 2022
1 parent 6f9674a commit 8abb941
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions go/parquet/internal/encoding/delta_byte_array.go
@@ -39,6 +39,10 @@ type DeltaByteArrayEncoder struct {
 	lastVal parquet.ByteArray
 }
 
+func (enc *DeltaByteArrayEncoder) EstimatedDataEncodedSize() int64 {
+	return enc.prefixEncoder.EstimatedDataEncodedSize() + enc.suffixEncoder.EstimatedDataEncodedSize()
+}
+
 func (enc *DeltaByteArrayEncoder) initEncoders() {
 	enc.prefixEncoder = &DeltaBitPackInt32Encoder{
 		deltaBitPackEncoder: &deltaBitPackEncoder{encoder: newEncoderBase(enc.encoding, nil, enc.mem)}}
