multiflow

About

multiflow is a Python multithreading library for data processing pipelines/workflows, streaming, and similar tasks. It extends concurrent.futures by allowing the input and output to be generator objects, and it makes it easy to string multiple thread pools together to create a multithreaded pipeline.

Additionally, multiflow comes with periodic logging, automatic retries, error handling, and argument expansion.

Why?

The ability to accept an input generator object while yielding an output generator object makes multiflow ideal for running multiple jobs concurrently where the output of the first job is the input of the second. This means it can start working on the second job before the first job completes, finishing the total work sooner.

A great use case for this is streaming data. For example, with multiflow and smart_open, you could stream images from S3 and process them in a multithreaded environment before exporting them elsewhere.
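
For instance, here is a minimal sketch of that idea. The bucket name, object keys, and processing stage are hypothetical stand-ins; it only assumes the consume/add_function API shown in the Quickstart below.

from multiflow import MultithreadedFlow
from smart_open import open as s3_open


def download(key):
    # smart_open streams the object directly from S3
    with s3_open(f's3://my-bucket/{key}', 'rb') as f:  # hypothetical bucket
        return f.read()


def measure(data):
    # stand-in for real image processing
    return f'{len(data)} bytes'


with MultithreadedFlow() as flow:
    flow.consume(['cat.jpg', 'dog.jpg'])  # hypothetical object keys
    flow.add_function(download)  # first thread pool: fetch from S3
    flow.add_function(measure)   # second thread pool: runs as downloads finish

    for output in flow:
        print(output)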

Install

pip install multiflow

Quickstart

from multiflow import MultithreadedFlow


image_paths = []  # list of image paths


def transform(image_path):
    # do some work; here we just derive the output path
    new_path = image_path.replace('.jpg', '_out.jpg')
    return new_path


with MultithreadedFlow() as flow:
    flow.consume(image_paths)  # accepts a generator object or an iterable (see the generator example below)
    flow.add_function(transform)

    for output in flow:
        if output:  # if successful
            print(output)  # new_path
        else:
            e = output.get_exception()

    success = flow.get_successful_job_count()
    failed = flow.get_failed_job_count()
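
As the comment above notes, consume can also take a generator object. Here is a sketch that chains two thread pools over a generated input; the download and resize functions are placeholders, and the calls mirror the Quickstart.

from multiflow import MultithreadedFlow


def generate_paths():
    # any generator object works as the input
    for i in range(100):
        yield f'image_{i}.jpg'


def download(path):
    # placeholder: fetch the image
    return path


def resize(path):
    # placeholder: starts running as soon as download results arrive
    return path


with MultithreadedFlow() as flow:
    flow.consume(generate_paths())  # pass the generator object itself
    flow.add_function(download)     # first thread pool
    flow.add_function(resize)       # second thread pool, fed by the first

    for output in flow:
        if output:
            print(output)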

Examples

For a working program using multiflow, see this example, which resizes the images in an S3 bucket to 50% of their original size and saves the resized images locally.

Documentation

The documentation is still a work in progress, but for the most up-to-date documentation, please see this page.