Composable data transformation pipelines. Build reusable chains of key mapping, value transforms, and filtering operations on record lists.
pip install datautil-transform

With dev tools:

pip install datautil-transform[dev]

from datautil_transform import (
Pipeline, map_keys, map_values, rename_keys, where, reject, unique_by,
)
# Build a text-processing pipeline
clean = Pipeline().then(str.strip).then(str.lower).then(str.title)
clean(" hello world ")
# 'Hello World'
# Rename keys across records
data = [{"first_name": "Alice", "age": 30}, {"first_name": "Bob", "age": 25}]
rename_keys({"first_name": "name"}, data)
# [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
# Uppercase all keys
map_keys(str.upper, data)
# [{'FIRST_NAME': 'Alice', 'AGE': 30}, ...]
# Filter records
where(lambda r: r["age"] >= 30, data)
# [{'first_name': 'Alice', 'age': 30}]
# Deduplicate by a field
records = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
unique_by(lambda r: r["id"], records)
# [{'id': 1, 'v': 'a'}, {'id': 2, 'v': 'c'}]

Pipeline() creates an empty pipeline. Chain steps with .then(fn), then execute with .run(data) or by calling the pipeline directly.
p = Pipeline().then(int).then(lambda x: x * 2)
p("21")
# 42

map_keys: Apply a function to every key in each dict.
map_values: Apply a function to every value in each dict.
rename_keys: Rename keys using an old-to-new mapping. Unmapped keys are kept.
where: Keep records matching the predicate.
reject: Remove records matching the predicate.
unique_by: Deduplicate by a key function, keeping the first occurrence.
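
The record helpers take the records as their last argument, so they can be turned into single-argument pipeline steps with functools.partial. The sketch below illustrates that composition for the two helpers the examples above do not exercise, map_values and reject. It is self-contained: the Pipeline class and both functions are local stand-ins written from the descriptions above, not the library's source, and the (fn, records) argument order for map_values and reject is an assumption made by analogy with map_keys and where. With the package installed, the stand-ins would presumably be replaced by imports from datautil_transform.

```python
from functools import partial
from typing import Any, Callable

Record = dict[str, Any]


class Pipeline:
    """Local stand-in: an ordered list of one-argument steps."""

    def __init__(self) -> None:
        self._steps: list[Callable[[Any], Any]] = []

    def then(self, fn: Callable[[Any], Any]) -> "Pipeline":
        # Assumption: then() returns a pipeline so calls can be chained.
        # This sketch mutates and returns self; the real class may differ.
        self._steps.append(fn)
        return self

    def run(self, data: Any) -> Any:
        # Feed the output of each step into the next one.
        for step in self._steps:
            data = step(data)
        return data

    def __call__(self, data: Any) -> Any:
        return self.run(data)


def map_values(fn: Callable[[Any], Any], records: list[Record]) -> list[Record]:
    # Stand-in: apply fn to every value in each dict (assumed argument order).
    return [{k: fn(v) for k, v in r.items()} for r in records]


def reject(pred: Callable[[Record], bool], records: list[Record]) -> list[Record]:
    # Stand-in: drop records matching the predicate (assumed argument order).
    return [r for r in records if not pred(r)]


data = [
    {"name": " alice ", "team": "ops "},
    {"name": "BOB", "team": " dev"},
    {"name": "", "team": "dev"},
]

# partial() fixes the first argument, leaving a function of the records only.
tidy = (
    Pipeline()
    .then(partial(map_values, str.strip))
    .then(partial(reject, lambda r: r["name"] == ""))
    .then(partial(map_values, str.title))
)

result = tidy(data)
print(result)
# [{'name': 'Alice', 'team': 'Ops'}, {'name': 'Bob', 'team': 'Dev'}]
```

Step order matters here: stripping whitespace first lets the reject step catch names that are empty only after trimming.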
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
mypy src/
ruff check src/ tests/

src/datautil_transform/
├── __init__.py # Public API re-exports
├── pipeline.py # Composable transformation pipeline
├── mapper.py # Key and value mapping utilities
└── filter.py # Filtering and deduplication
- Python 3.10+
- No runtime dependencies