doc(cookbook): Include more details on Partial parsing
Inspired by
- rust-bakery/nom#1160
- rust-bakery/nom#1582
- rust-bakery/nom#1145#issuecomment-678788326
epage committed Feb 17, 2023
1 parent d7695ed commit 8608032
Showing 3 changed files with 45 additions and 1 deletion.
2 changes: 2 additions & 0 deletions src/_cookbook/mod.rs
@@ -4,6 +4,7 @@
//!
//! - [Elements of Programming Languages][language]
//! - [Implementing `FromStr`][fromstr]
//! - [Parsing Partial Input][partial]
//! - [Custom stream][stream]
//! - [Custom errors][error]
//!
@@ -14,4 +15,5 @@
pub mod error;
pub mod fromstr;
pub mod language;
pub mod partial;
pub mod stream;
40 changes: 40 additions & 0 deletions src/_cookbook/partial.rs
@@ -0,0 +1,40 @@
//! # Parsing Partial Input
//!
//! Typically, the input being parsed is complete and held in memory. Some data sources are too
//! large to fit into memory, so only an incomplete, [`Partial`] subset of the data is available
//! at any one time, requiring incremental parsing.
//!
//! By wrapping a stream, like `&[u8]`, with [`Partial`], parsers will report when the data is
//! [`Incomplete`] and more input is [`Needed`], allowing the caller to stream in additional data
//! to be parsed. The data is then parsed a chunk at a time.
//!
//! Chunks are typically defined by either:
//! - A header reporting the number of bytes, like with [`length_value`]
//!   - [`Partial`] can explicitly be marked as complete, once the specified number of bytes has
//!     been acquired, via [`StreamIsPartial::complete`].
//! - A delimiter, like with [ndjson](http://ndjson.org/)
//!   - You can parse up to the delimiter or do a `take_until0(delim).and_then(parser)`
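The length-prefixed strategy can be sketched without winnow at all. The names below (`Status`, `parse_chunk`, `NeedMore`) are hypothetical, std-only stand-ins: a one-byte header plays the role of `length_value`'s length field, and `NeedMore` mirrors reporting `Incomplete`/`Needed` back to the caller:

```rust
// Hypothetical std-only sketch of length-prefixed chunking; not winnow's API.

#[derive(Debug, PartialEq)]
enum Status<'a> {
    /// A full chunk was framed; `rest` is whatever followed it.
    Done { payload: &'a [u8], rest: &'a [u8] },
    /// The buffer ends mid-chunk; this many more bytes are required.
    NeedMore(usize),
}

fn parse_chunk(input: &[u8]) -> Status<'_> {
    // A 1-byte header gives the payload length, like `length_value`.
    if input.is_empty() {
        return Status::NeedMore(1);
    }
    let len = input[0] as usize;
    let body = &input[1..];
    if body.len() < len {
        Status::NeedMore(len - body.len())
    } else {
        Status::Done { payload: &body[..len], rest: &body[len..] }
    }
}

fn main() {
    // The first read only delivered part of the chunk...
    assert_eq!(parse_chunk(&[3, b'a']), Status::NeedMore(2));
    // ...so the caller streams in more data and parses again.
    assert_eq!(
        parse_chunk(&[3, b'a', b'b', b'c', 9]),
        Status::Done { payload: b"abc", rest: &[9] }
    );
}
```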
//!
//! If the chunks are not homogeneous, a state machine will be needed to track which parser is
//! expected for the next chunk.
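Such a state machine can be sketched in plain Rust; the `Expect`/`step` names and the header-then-values framing are hypothetical, chosen only to show the caller tracking which parser runs on the next chunk:

```rust
// Hypothetical sketch: heterogeneous chunks, where a header chunk announces
// how many value chunks follow, so the caller must remember what comes next.

#[derive(Debug, PartialEq)]
enum Expect {
    Header,
    Values { remaining: usize },
}

fn step(state: Expect, chunk: &[u8]) -> (Expect, Option<u8>) {
    match state {
        Expect::Header => {
            // Header chunk: first byte is the number of value chunks to come.
            let count = chunk[0] as usize;
            (Expect::Values { remaining: count }, None)
        }
        Expect::Values { remaining } => {
            // Value chunk: a single-byte payload.
            let value = chunk[0];
            let next = if remaining > 1 {
                Expect::Values { remaining: remaining - 1 }
            } else {
                Expect::Header
            };
            (next, Some(value))
        }
    }
}

fn main() {
    let mut state = Expect::Header;
    let mut out = Vec::new();
    for chunk in [&[2u8][..], &[10u8][..], &[20u8][..]] {
        let (next, value) = step(state, chunk);
        state = next;
        out.extend(value);
    }
    assert_eq!(out, vec![10, 20]);
    assert_eq!(state, Expect::Header); // ready for the next message
}
```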
//!
//! Caveats:
//! - `winnow` takes the approach of re-parsing from scratch. Chunks should be relatively small to
//! prevent the re-parsing overhead from dominating.
//! - Parsers like [`many0`] cannot tell whether an `eof` is due to insufficient data or the true
//!   end of the stream, causing them to always report [`Incomplete`].
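The re-parsing caveat can be made concrete with a std-only sketch (the `parse_message` name and newline framing are hypothetical): every time a byte arrives, the whole buffer is scanned again from the start, which is why chunks should stay small:

```rust
// Hypothetical sketch: the caller re-parses the entire buffer from scratch
// on each arrival, so work grows with buffer size.

fn parse_message(input: &[u8]) -> Option<&[u8]> {
    // A message is only complete once its newline delimiter has arrived.
    let pos = input.iter().position(|&b| b == b'\n')?;
    Some(&input[..pos])
}

fn main() {
    let mut buffer = Vec::new();
    let mut attempts = 0;
    for byte in *b"hello\n" {
        buffer.push(byte);
        attempts += 1;
        if let Some(msg) = parse_message(&buffer) {
            assert_eq!(msg, b"hello");
            break;
        }
    }
    // Six one-byte arrivals meant six full scans of the buffer.
    assert_eq!(attempts, 6);
}
```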
//!
//! # Example
//!
//! ```rust
#![doc = include_str!("../../examples/json/parser_partial.rs")]
//! ```

#![allow(unused_imports)] // Used for intra-doc links

use crate::error::ErrMode::Incomplete;
use crate::error::Needed;
use crate::multi::length_value;
use crate::multi::many0;
use crate::stream::Partial;
use crate::stream::StreamIsPartial;
4 changes: 3 additions & 1 deletion src/stream/mod.rs
@@ -204,6 +204,8 @@ impl<I, S> crate::lib::std::ops::Deref for Stateful<I, S> {
///
/// See also [`StreamIsPartial`] to tell whether the input supports complete or partial parsing.
///
/// See also [Cookbook: Parsing Partial Input][crate::_cookbook::partial].
///
/// # Example
///
/// Here is how it works in practice:
@@ -921,7 +923,7 @@ where

/// Marks the input as being the complete buffer or a partial buffer for streaming input
///
/// See [Partial] for marking a presumed complete buffer type as a streaming buffer.
/// See [`Partial`] for marking a presumed complete buffer type as a streaming buffer.
pub trait StreamIsPartial: Sized {
/// Whether the stream is currently partial or complete
type PartialState;
