Merged
33 changes: 19 additions & 14 deletions parquet/src/arrow/arrow_reader/mod.rs
@@ -56,9 +56,9 @@ pub mod statistics;
///
/// Most users should use one of the following specializations:
///
-/// * synchronous API: [`ParquetRecordBatchReaderBuilder::try_new`]
-/// * `async` API: [`ParquetRecordBatchStreamBuilder::new`]
-/// * decoder API: [`ParquetDecoderBuilder::new`]
+/// * synchronous API: [`ParquetRecordBatchReaderBuilder`]
+/// * `async` API: [`ParquetRecordBatchStreamBuilder`]
+/// * decoder API: [`ParquetPushDecoderBuilder`]
///
/// # Features
/// * Projection pushdown: [`Self::with_projection`]
@@ -93,8 +93,8 @@ pub mod statistics;
/// You can read more about this design in the [Querying Parquet with
/// Millisecond Latency] Arrow blog post.
///
-/// [`ParquetRecordBatchStreamBuilder::new`]: crate::arrow::async_reader::ParquetRecordBatchStreamBuilder::new
-/// [`ParquetDecoderBuilder::new`]: crate::arrow::push_decoder::ParquetPushDecoderBuilder::new
+/// [`ParquetRecordBatchStreamBuilder`]: crate::arrow::async_reader::ParquetRecordBatchStreamBuilder
+/// [`ParquetPushDecoderBuilder`]: crate::arrow::push_decoder::ParquetPushDecoderBuilder
/// [Apache Arrow]: https://arrow.apache.org/
/// [`StatisticsConverter`]: statistics::StatisticsConverter
/// [Querying Parquet with Millisecond Latency]: https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/
@@ -719,11 +719,12 @@ impl<T: Debug + ChunkReader> Debug for SyncReader<T> {
}
}

-/// A synchronous builder used to construct [`ParquetRecordBatchReader`] for a file
+/// Creates [`ParquetRecordBatchReader`] for reading Parquet files into Arrow [`RecordBatch`]es
///
-/// For an async API see [`crate::arrow::async_reader::ParquetRecordBatchStreamBuilder`]
-///
-/// See [`ArrowReaderBuilder`] for additional member functions
+/// # See Also
+/// * [`crate::arrow::async_reader::ParquetRecordBatchStreamBuilder`] for an async API
+/// * [`crate::arrow::push_decoder::ParquetPushDecoderBuilder`] for a SansIO decoder API
+/// * [`ArrowReaderBuilder`] for additional member functions
pub type ParquetRecordBatchReaderBuilder<T> = ArrowReaderBuilder<SyncReader<T>>;

impl<T: ChunkReader + 'static> ParquetRecordBatchReaderBuilder<T> {
@@ -1010,12 +1011,16 @@ impl<T: ChunkReader + 'static> Iterator for ReaderPageIterator<T> {

impl<T: ChunkReader + 'static> PageIterator for ReaderPageIterator<T> {}

-/// An `Iterator<Item = ArrowResult<RecordBatch>>` that yields [`RecordBatch`]
-/// read from a parquet data source
+/// Reads Parquet data as Arrow [`RecordBatch`]es
///
+/// This struct implements the [`RecordBatchReader`] trait and is an
+/// `Iterator<Item = ArrowResult<RecordBatch>>` that yields [`RecordBatch`]es.
+///
+/// Typically, either reads from a file or an in memory buffer [`Bytes`]
+///
+/// Created by [`ParquetRecordBatchReaderBuilder`]
+///
-/// This reader is created by [`ParquetRecordBatchReaderBuilder`], and has all
-/// the buffered state (DataPages, etc) necessary to decode the parquet data into
-/// Arrow arrays.
+/// [`Bytes`]: bytes::Bytes
pub struct ParquetRecordBatchReader {
array_reader: Box<dyn ArrayReader>,
schema: SchemaRef,
3 changes: 1 addition & 2 deletions parquet/src/arrow/push_decoder/mod.rs
@@ -37,8 +37,7 @@ use std::sync::Arc;

/// A builder for [`ParquetPushDecoder`].
///
-/// To create a new decoder, use [`ParquetPushDecoderBuilder::try_new_decoder`] and pass
-/// the file length and metadata of the Parquet file to decode.
+/// To create a new decoder, use [`ParquetPushDecoderBuilder::try_new_decoder`].
///
/// You can decode the metadata from a Parquet file using either
/// [`ParquetMetadataReader`] or [`ParquetMetaDataPushDecoder`].
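
For readers unfamiliar with the "SansIO" term used in the updated doc comments, the sketch below illustrates the general push-decoder idea with a toy format. It is not the parquet crate's API: `ToyPushDecoder`, `DecodeResult`, `push_data`, `try_decode`, and the `[len: u8][payload]` record layout are all hypothetical names and assumptions chosen for illustration. The point it tries to show is that the decoder never performs IO itself; the caller fetches bytes however it likes and hands them over, and the decoder reports whether it needs more.

```rust
use std::collections::VecDeque;

/// Outcome of asking the toy decoder for the next record (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum DecodeResult {
    /// A complete record payload was decoded from buffered bytes.
    Record(Vec<u8>),
    /// Not enough bytes are buffered; at least this many more are required.
    NeedsData(usize),
}

/// Toy sans-IO decoder for records laid out as `[len: u8][payload: len bytes]`.
/// This layout is an assumption for the sketch, unrelated to the Parquet format.
#[derive(Default)]
pub struct ToyPushDecoder {
    buffer: VecDeque<u8>,
}

impl ToyPushDecoder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Accept bytes that the caller obtained by any means (file, network, test data).
    pub fn push_data(&mut self, bytes: &[u8]) {
        self.buffer.extend(bytes.iter().copied());
    }

    /// Try to decode one record using only the bytes already buffered.
    pub fn try_decode(&mut self) -> DecodeResult {
        // The first buffered byte, if any, is the payload length.
        let Some(&len) = self.buffer.front() else {
            return DecodeResult::NeedsData(1);
        };
        let needed = 1 + len as usize;
        if self.buffer.len() < needed {
            return DecodeResult::NeedsData(needed - self.buffer.len());
        }
        self.buffer.pop_front(); // consume the length prefix
        let payload: Vec<u8> = self.buffer.drain(..len as usize).collect();
        DecodeResult::Record(payload)
    }
}
```

A caller would loop on `try_decode`, performing its own reads and calling `push_data` whenever `NeedsData` comes back. Because the decoder owns no file handle or socket, the same decoding logic can sit behind synchronous, async, or test-only drivers.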