Rationale

Key reasons why you might want to use this library...

Native JavaScript Protocol

This library operates on native JavaScript types (sync and async iterables) and outputs the same. This means no integration commitment: you can use the library in any context, without creating any compatibility concerns.
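
For example, the result of a pipeline is itself a standard iterable that any native construct can consume. A minimal sketch, using only the pipe and map operators already shown on this page:

import {pipe, map} from 'iter-ops';

const i = pipe([1, 2, 3], map(a => a * 2)); // input is a plain array

console.log([...i]); // => [2, 4, 6]
// the result is a regular Iterable, so spread, for...of and Array.from all work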

This separation of synchronous and asynchronous processing also has a profound impact on performance, as explained below.

Clear separation of synchronous and asynchronous processing

If you look at the Benchmarks, synchronous iteration outperforms asynchronous iteration many times over. This tells us that mixing synchronous and asynchronous processing into one isn't a good idea. Yet this is the path many frameworks take, sacrificing performance for the convenience of a unified processing model.

What makes matters worse is that in real-world applications, the amount of asynchronous processing is significantly lower than the synchronous one.

To design a good product, you need a clear picture of your data flow, so you can improve performance and scalability efficiently, and that requires separating the synchronous and asynchronous layers of your data processing.

To illustrate this, let's start with a bad code example:

import {pipe, toAsync, filter, distinct, map, wait} from 'iter-ops';

const data = [12, 32, 357, ...]; // million items or so

const i = pipe(
    toAsync(data), // make asynchronous
    filter(a => a % 3 === 0), // take only numbers divisible by 3
    distinct(), // remove duplicates
    map(a => service.process(a)), // use async service, which returns Promise
    wait() // resolve each promise
); // inferred type = AsyncIterableExt

for await(const a of i) {
    console.log(a); // show resolved data
}

And here's what a good version of the same code should look like:

import {pipe, toAsync, filter, distinct, map, wait} from 'iter-ops';

const data = [12, 32, 357, ...]; // million items or so

// synchronous pipeline:
const i = pipe(
    data,
    filter(a => a % 3 === 0),
    distinct()
); // inferred type = IterableExt

// asynchronous pipeline:
const k = pipe(
    toAsync(i), // enable async processing
    map(a => service.process(a)),
    wait()
); // inferred type = AsyncIterableExt

for await(const a of k) {
    console.log(a); // show resolved data
}

Just by separating the synchronous pipeline from the asynchronous one, in the above scenario of filtering a large amount of initial data before the asynchronous processing, we can easily achieve a 10x performance increase.

This library keeps the two strictly separate, both through explicit type control and at run time, so there is never any confusion about whether you are doing synchronous or asynchronous data processing at any given point.
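
To illustrate the compile-time side of this separation, here is a small sketch (assuming TypeScript; the inferred types match the comments in the examples above):

import {pipe, toAsync, map} from 'iter-ops';

const s = pipe([1, 2, 3], map(a => a + 1)); // inferred type = IterableExt<number>

const q = pipe(toAsync([1, 2, 3]), map(a => a + 1)); // inferred type = AsyncIterableExt<number>

// s can be consumed synchronously (spread, for...of),
// while q can only be consumed asynchronously (for await...of)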

Iteration Sessions

Operators in this library support an iteration state/session (IterationState), which lets you persist additional state during an iteration session, for more complex processing logic.

In the example below, we use the iteration state of the filter operator to detect and remove repeated values (not to be confused with distinct, which removes all duplicates).

import {pipe, filter} from 'iter-ops';

const i = pipe(
    iterable, // any source iterable
    filter((value, index, state) => {
        if(value === state.previousValue) {
            return false;
        }
        state.previousValue = value;
        return true;
    })
);

Here's a more generic distinctUntilChanged implementation as a custom operator.
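
For illustration, a simplified variant can also be built directly on top of the filter operator, as shown above. This is just a sketch, not the library's implementation; the equals parameter is a hypothetical addition, defaulting to strict equality:

import {pipe, filter} from 'iter-ops';

// skips consecutive repetitions, using the iteration state to remember the previous value
function distinctUntilChanged<T>(equals: (a: T, b: T) => boolean = (a, b) => a === b) {
    return filter<T>((value, index, state) => {
        const repeated = index > 0 && equals(value, state.prev);
        state.prev = value; // remember the last seen value in the iteration state
        return !repeated;
    });
}

const r = pipe([1, 1, 2, 2, 3, 1], distinctUntilChanged<number>());

console.log([...r]); // => [1, 2, 3, 1]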
