Split
Operator split offers a versatile way of partitioning an iterable object. It is an indispensable unit of logic, particularly when processing binary files as an iterable buffer, as it offers an easy and high-performance way of splitting any such buffer into blocks, so you can process it further, all through a single iteration.
- the operator takes a predicate, to signal when values are to be split (default split logic);
- it supports option toggle, so one signal from the predicate starts a new selection, and the next one ends it.
By default, split values themselves are skipped. But options carryStart and carryEnd can change that, to indicate if you want start or end split values carried back or forward. Note that in standard split mode, only carryEnd is used, while toggle mode uses both carryStart and carryEnd.
The example below uses the default split logic, to split an array of numbers:
import {pipe, split} from 'iter-ops';

const data = [0, 1, 2, 0, 0, 3, 4, 5];

const i = pipe(
    data,
    split(v => v === 0) // split on value 0
);

console.log([...i]); //=> [ [], [ 1, 2 ], [], [ 3, 4, 5 ] ]
The output is, by design, consistent with the logic of String.split, where start/end and middle gaps produce empty elements. And since operator split always produces arrays of values, you get empty arrays for gaps.
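For comparison, here is the same shape of input run through the standard String.prototype.split. This is plain JavaScript, independent of iter-ops; the string 'a12aa345' is only an illustrative stand-in for the number array above, with 'a' playing the role of 0.

```javascript
// Plain JavaScript, no iter-ops involved: 'a' plays the role of the split value 0.
const text = 'a12aa345';
const parts = text.split('a');

console.log(parts); //=> [ '', '12', '', '345' ]
```

The empty strings sit in the same positions as the empty arrays produced by operator split.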
If you do not want any such gaps, you can simply filter them out by length, as shown below:
import {pipe, split, filter} from 'iter-ops';

const i = pipe(
    data, // same array as in the previous example
    split(v => v === 0), // split on value 0
    filter(a => !!a.length) // skip empty arrays
);

console.log([...i]); //=> [ [ 1, 2 ], [ 3, 4, 5 ] ]
In the default split scenario, we only know one value, by which we split the sequence. But when we know two split values, a start and an end, we need to use toggle logic instead, whereby one return of true from the predicate marks the beginning, and the next one marks the end of each block.
Below, let's consider 0 as the start of each block, and 1 as the end of each block:
const data = [0, 33, 22, 1, 77, 44, 0, 55, 88];

const i = pipe(
    data,
    split(v => v === 0 || v === 1, {toggle: true}), // toggle-split in blocks with border values 0 and 1
);

console.log([...i]); //=> [ [ 33, 22 ], [ 55, 88 ] ]
Above, we skipped [77, 44] as being outside any toggle block. And the last block [55, 88] was included, even though it didn't close, which is consistent with the general split logic.
Let's say we want both 0 and 1 included in the same block, because they also represent valid block values:
const i = pipe(
    data,
    // toggle-split in blocks with border values 0 and 1,
    // plus carry each block start forward, and carry each block end back:
    split(v => v === 0 || v === 1, {toggle: true, carryStart: 1, carryEnd: -1}),
);

console.log([...i]); //=> [ [ 0, 33, 22, 1 ], [ 0, 55, 88 ] ]
See SplitValueCarry.
To further demonstrate the logic and flexibility of operator split, here is how it can replicate operator page.
Splitting data into pages of fixed size, by calling page(size), is the same as:
- splitting by internal list index, and carrying the split value forward:
  split((_, index) => index.list >= size, {carryEnd: 1})
- or, splitting one value earlier, at list index size - 1, and carrying the split value back:
  split((_, index) => index.list >= size - 1, {carryEnd: -1})
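The first rule can be traced with a small library-free model. The helper splitCarryForward below is hypothetical and is not iter-ops code; it only imitates the behaviour described above, under the assumption that index.list equals the number of values already in the current list (a carried value included).

```javascript
// Hypothetical model of "split + carry the split value forward"; NOT the iter-ops implementation.
// Assumption: index.list is the number of values already collected into the current list.
function* splitCarryForward(iterable, predicate) {
    let list = [];
    for (const value of iterable) {
        if (predicate(value, {list: list.length})) {
            yield list;     // the split value closes the current list...
            list = [value]; // ...and is carried forward, to start the next one
        } else {
            list.push(value);
        }
    }
    yield list; // the last list is emitted even though nothing closed it
}

const size = 2;
const pages = [...splitCarryForward([1, 2, 3, 4, 5], (_, index) => index.list >= size)];

console.log(pages); //=> [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Each time the current list already holds size values, the next value triggers the split and opens the following page.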
Although the general split logic works only with single-value splits, it is possible to work around that and handle multi-value splits, by using parameter state (the third parameter of the predicate, which is the iteration session state) to buffer extra split values.
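As one possible reading of this, the sketch below tracks the previous value in state, so that the predicate fires only on the second value of a two-value delimiter (13 followed by 10, i.e. CR LF). The name isCrLfEnd and the property state.prev are illustrative assumptions, as is treating state as a plain object that starts out empty. The predicate is exercised by hand here, instead of being passed to split. Since the first delimiter value has already been let through by the time the predicate fires, it would presumably remain at the end of the preceding block and need trimming afterwards.

```javascript
// Hypothetical predicate for a two-value delimiter: 13 followed by 10 (CR LF).
// It follows the (value, index, state) shape described above; state.prev is our own property.
function isCrLfEnd(value, index, state) {
    const hit = state.prev === 13 && value === 10;
    state.prev = value; // remember the current value for the next call
    return hit;
}

// Driving the predicate by hand, with one shared state object for the whole pass.
// The {start} object is only a partial stand-in for the real index parameter.
const bytes = [65, 66, 13, 10, 67, 10, 68, 13, 10];
const state = {};
const flags = bytes.map((value, start) => isCrLfEnd(value, {start}, state));

// Only the two 10s that directly follow a 13 trigger a split:
console.log(flags.indexOf(true), flags.lastIndexOf(true)); //=> 3 8

// With iter-ops, the predicate would be passed in as: split(isCrLfEnd)
```

Note that the lone 10 at position 5 does not trigger a split, because the value before it was not 13.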