linesep

Splitter Classes

linesep provides a set of classes (called splitters) for splitting strings that arrive in chunks, inspired by the codecs.IncrementalEncoder and codecs.IncrementalDecoder classes of the codecs module. Input is fed to a splitter instance one piece at a time, and the segments split from the input so far are (depending on the methods used) either returned immediately or retrievable from the splitter afterwards. This is useful when you have a data source that is neither a string nor a filehandle.

If the input is in the form of an iterable, a splitter can be used to iterate over it and yield each segment:

>>> import linesep
>>> splitter = linesep.SeparatedSplitter("twofour' 'two' '' 'four' '' '' '|' 'six'
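The idea of walking an iterable of chunks and yielding each segment as soon as it is complete can be sketched in plain Python. This is only an illustration of the technique, not linesep's implementation: the helper name iter_separated and the sample input are assumptions made for this sketch.

```python
from typing import Iterable, Iterator


def iter_separated(chunks: Iterable[str], sep: str) -> Iterator[str]:
    """Yield the segments of the concatenated chunks, split on `sep`.

    Hypothetical helper for illustration; it is not part of linesep.
    """
    buf = ""
    for chunk in chunks:
        buf += chunk
        # Every piece before the last separator seen so far is complete;
        # the final piece may still grow, so it stays in the buffer.
        *done, buf = buf.split(sep)
        yield from done
    # Whatever follows the final separator is the last segment.
    yield buf


# Chunk boundaries do not need to line up with separators.
pieces = ["one|tw", "o||thr", "ee|", "four"]
segments = list(iter_separated(pieces, "|"))
print(segments)
```

Because the unfinished tail is carried over between chunks, a separator that is itself split across two chunks is still recognized once the second chunk arrives.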

Alternatively, input can be provided to the splitter one piece at a time by passing it to the Splitter.split() method, which returns all newly split-off items:

>>> splitter = linesep.TerminatedSplitter("0", retain=False)
>>> splitter.split("foo0bar0baz")
['foo', 'bar']
>>> splitter.split("0quux0gnusto0", final=True)
['baz', 'quux', 'gnusto']

At a lower level, input can be provided to the Splitter.feed() method, and the output can be retrieved with Splitter.get() or Splitter.getall():

>>> splitter = linesep.UniversalNewlineSplitter(retain=True, translate=True)
>>> splitter.feed("foo\nbar\r\nbaz")
>>> splitter.nonempty
True
>>> splitter.get()
'foo\n'
>>> splitter.nonempty
True
>>> splitter.get()
'bar\n'
>>> splitter.nonempty
False
>>> splitter.get()
Traceback (most recent call last):
    ...
SplitterEmptyError: No items available in splitter
>>> splitter.close()
>>> splitter.nonempty
True
>>> splitter.get()
'baz'
>>> splitter.nonempty
False
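To show how the two levels relate, here is a minimal sketch of a terminator-based splitter in which split() is built on top of feed(), close(), and getall(). It is a toy model under stated assumptions, not linesep's code: the class name MiniTerminatedSplitter is hypothetical, it handles text only, it always drops the terminator, and it raises the built-in ValueError and LookupError where the real classes would presumably use SplitterClosedError and SplitterEmptyError.

```python
from collections import deque


class MiniTerminatedSplitter:
    """Toy splitter for terminated text (hypothetical, not linesep's code)."""

    def __init__(self, terminator: str) -> None:
        self._terminator = terminator
        self._buffer = ""        # input not yet ended by a terminator
        self._items = deque()    # completed segments awaiting retrieval
        self._closed = False

    @property
    def nonempty(self) -> bool:
        """True if at least one completed segment can be retrieved."""
        return bool(self._items)

    def feed(self, data: str) -> None:
        """Add input; completed segments are queued, the tail is buffered."""
        if self._closed:
            raise ValueError("splitter is closed")  # stand-in exception
        *done, self._buffer = (self._buffer + data).split(self._terminator)
        self._items.extend(done)

    def close(self) -> None:
        """Mark the end of input and flush any unterminated tail."""
        if not self._closed:
            self._closed = True
            if self._buffer:
                self._items.append(self._buffer)
                self._buffer = ""

    def get(self) -> str:
        """Return the oldest queued segment."""
        if not self._items:
            raise LookupError("no items available")  # stand-in exception
        return self._items.popleft()

    def getall(self) -> list:
        """Return and clear every queued segment."""
        items = list(self._items)
        self._items.clear()
        return items

    def split(self, data: str, final: bool = False) -> list:
        """High-level call: feed, optionally close, then drain the queue."""
        self.feed(data)
        if final:
            self.close()
        return self.getall()


splitter = MiniTerminatedSplitter("0")
first = splitter.split("foo0bar0baz")
second = splitter.split("0quux0gnusto0", final=True)
print(first, second)
```

Keeping the queue of finished items separate from the buffer of unfinished input is what lets get() fail while data is still pending, and lets close() make a trailing unterminated segment available.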

Like the *_preceded, *_separated, and *_terminated functions, strings passed to splitters may be either binary or text. However, the input to a single instance of a splitter must be either all binary or all text, and the output type will match.
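One way to picture the all-binary-or-all-text rule is a small guard that remembers the type of the first input and rejects anything that does not match. This is a sketch of the constraint only: the class InputTypeGuard is hypothetical, and raising TypeError on mixed input is this example's choice, since the text above does not say how linesep itself reacts.

```python
class InputTypeGuard:
    """Remember whether input is str or bytes and reject mixing (illustrative)."""

    def __init__(self) -> None:
        self._kind = None  # becomes str or bytes on first use

    def check(self, data):
        """Return `data` unchanged if its type matches earlier input."""
        if isinstance(data, str):
            kind = str
        elif isinstance(data, bytes):
            kind = bytes
        else:
            raise TypeError("input must be str or bytes")
        if self._kind is None:
            self._kind = kind  # first input fixes the type for this instance
        elif kind is not self._kind:
            raise TypeError(
                f"expected {self._kind.__name__}, got {kind.__name__}"
            )
        return data


guard = InputTypeGuard()
guard.check(b"foo\0bar")      # the first input is binary
try:
    guard.check("baz")        # text after binary is rejected in this sketch
    mixed_rejected = False
except TypeError:
    mixed_rejected = True
print(mixed_rejected)
```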

Splitters

Splitter

ParagraphSplitter

PrecededSplitter

SeparatedSplitter

TerminatedSplitter

UnicodeNewlineSplitter

UniversalNewlineSplitter

Utilities

get_newline_splitter

SplitterState()

SplitterClosedError

SplitterEmptyError
