Fluent data pipelines for python and your shell

  • Process big data in Python using method chaining built on generators
  • Cross-platform CLI



flupy implements a fluent interface for chaining multiple method calls into a single Python expression. All flupy methods return generators and are evaluated lazily in depth-first order, which lets flupy expressions transform arbitrarily large data in very little memory.
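
As a rough illustration of what lazy, depth-first evaluation means, the sketch below uses only standard-library generators rather than flupy itself. The names `squares`, `evens`, and `log` are hypothetical and exist only for this example. Each element travels through every stage before the next one is read from the source, so only one element is in flight at a time.

```python
from itertools import count, islice

log = []  # records the order in which stages touch each element


def squares(source):
    # Stage 1: square each element as it is requested
    for x in source:
        log.append(f"square {x}")
        yield x * x


def evens(source):
    # Stage 2: pass through even values only
    for x in source:
        log.append(f"check {x}")
        if x % 2 == 0:
            yield x


# Building the pipeline does no work; nothing runs until values are requested
pipeline = evens(squares(count()))
assert log == []

# Pull the first two results from an infinite source
result = list(islice(pipeline, 2))
print(result)
for entry in log:
    print(entry)
```

The log interleaves the two stages ("square 0", "check 0", "square 1", ...) instead of finishing one stage before starting the next, which is why an infinite source such as `count()` can be processed at all.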



  • Python 3.6+


Install flupy with pip:

$ pip install flupy


from itertools import count
from flupy import flu

# Processing an infinite sequence in constant memory
pipeline = (
    flu(count())
    .map(lambda x: x**2)
    .filter(lambda x: x % 517 == 0)
    .chunk(5)
    .take(3)  # stop after three chunks, matching the output below
)

for item in pipeline:
    print(item)

# Prints:
# [0, 267289, 1069156, 2405601, 4276624]
# [6682225, 9622404, 13097161, 17106496, 21650409]
# [26728900, 32341969, 38489616, 45171841, 52388644]


The flupy command line interface brings the same syntax for lazy pipelines to your shell. Inputs to the flu command are auto-populated into a Fluent context named _.

$ flu -h
usage: flu [-h] [-f FILE] [-i [IMPORT [IMPORT ...]]] command

flupy: a fluent interface for python

positional arguments:
  command               flupy command to execute on input

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  path to input file
  -i [IMPORT [IMPORT ...]], --import [IMPORT [IMPORT ...]]
                        modules to import
                        Syntax: <module>:<object>:<alias>
                                'import os' = '-i os'
                                'import os as op_sys' = '-i os::op_sys'
                                'from os import environ' = '-i os:environ'
                                'from os import environ as env' = '-i os:environ:env'
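
To make the `<module>:<object>:<alias>` syntax concrete, here is a minimal sketch of how such a spec could be resolved into a name and an object. This is not flupy's own parser; `resolve_import` is a hypothetical helper written for illustration, and it assumes the three colon-separated fields behave as the help text above describes.

```python
import importlib
import os


def resolve_import(spec):
    """Return (bound_name, object) for a '<module>:<object>:<alias>' spec."""
    parts = spec.split(":")
    module_name = parts[0]
    object_name = parts[1] if len(parts) > 1 else ""
    alias = parts[2] if len(parts) > 2 else ""

    module = importlib.import_module(module_name)
    # An empty object field means the module itself is the target
    target = getattr(module, object_name) if object_name else module
    # The alias wins if given, then the object name, then the module name
    bound_name = alias or object_name or module_name
    return bound_name, target


# The four forms listed in the help text
for spec in ["os", "os::op_sys", "os:environ", "os:environ:env"]:
    name, obj = resolve_import(spec)
    print(f"-i {spec!r} binds {name!r}")
```

Under these assumptions, `-i os::op_sys` leaves the object field empty, so the module itself is bound under the alias `op_sys`.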
