String encoding, slugification, and sanitization helpers. Zero dependencies — just small, composable functions for everyday text processing.
pip install datautil-extras
With dev tools:
pip install datautil-extras[dev]
from datautil_extras import (
to_base32, from_base32,
to_base62, from_base62,
slugify,
strip_ansi, strip_html, strip_control,
)
# Base32 encoding (no padding)
to_base32(b"hello world")
# 'NBSWY3DPEB3W64TMMQ'
from_base32("NBSWY3DPEB3W64TMMQ")
# b'hello world'
# Base62 encoding for compact integer IDs
to_base62(123456789)
# '8M0kX'
# URL-safe slugs
slugify("Hello, World! 123")
# 'hello-world-123'
# Strip ANSI escape codes from terminal output
strip_ansi("\x1b[31mError\x1b[0m: something broke")
# 'Error: something broke'
# Remove HTML tags
strip_html("<p>Hello <b>world</b></p>")
# 'Hello world'
to_base32: Encode bytes as base32 without padding.
from_base32: Decode a base32 string back to bytes. Padding is optional.
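To illustrate what "without padding" and "padding is optional" can mean in practice, here is a minimal sketch built on the standard library's `base64` module. The names `to_base32_sketch` and `from_base32_sketch` are hypothetical and are not the package's actual implementation; the sketch only assumes the standard RFC 4648 alphabet, which matches the example output shown earlier.

```python
import base64


def to_base32_sketch(data: bytes) -> str:
    """Hypothetical stand-in: RFC 4648 base32 with trailing '=' removed."""
    return base64.b32encode(data).decode("ascii").rstrip("=")


def from_base32_sketch(text: str) -> bytes:
    """Hypothetical stand-in: decode base32 whether or not it is padded."""
    # Drop any padding the caller supplied, then restore exactly the amount
    # the decoder requires (input length must be a multiple of 8).
    stripped = text.rstrip("=")
    padded = stripped + "=" * (-len(stripped) % 8)
    return base64.b32decode(padded)


print(to_base32_sketch(b"hello world"))            # NBSWY3DPEB3W64TMMQ
print(from_base32_sketch("NBSWY3DPEB3W64TMMQ"))    # b'hello world'
```

Stripping the padding is safe because the number of `=` characters can always be recomputed from the length of the remaining text.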
to_base62: Encode a non-negative integer as a base62 string (digits + letters).
from_base62: Decode a base62 string to an integer.
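Base62 encoding is repeated division by 62, with each remainder mapped to a character. The sketch below is an assumption-laden illustration, not the package's code: the function names are hypothetical, and the alphabet order (digits, then uppercase, then lowercase) is assumed because it reproduces the `'8M0kX'` example above.

```python
import string

# Assumed alphabet order: 0-9, then A-Z, then a-z (62 characters total).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62_sketch(n: int) -> str:
    """Hypothetical stand-in: encode a non-negative integer as base62."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)       # peel off the least significant digit
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def from_base62_sketch(text: str) -> int:
    """Hypothetical stand-in: decode base62 text back to an integer."""
    value = 0
    for ch in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 62 + ALPHABET.index(ch)
    return value


print(to_base62_sketch(123456789))   # 8M0kX
print(from_base62_sketch("8M0kX"))   # 123456789
```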
slugify: Convert a string into a URL-safe slug. Unicode is transliterated to ASCII.
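One common way to build such a slug with only the standard library is sketched below. This is an approximation under stated assumptions: `slugify_sketch` is a hypothetical name, and NFKD normalization only strips accents from Latin letters; it is not full transliteration, so characters with no ASCII decomposition are simply dropped. The package's actual behavior may differ.

```python
import re
import unicodedata


def slugify_sketch(text: str) -> str:
    """Hypothetical stand-in: lowercase ASCII words joined by hyphens."""
    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with errors="ignore" then discards the marks.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Remove hyphens left at either end by leading/trailing punctuation.
    return slug.strip("-")


print(slugify_sketch("Hello, World! 123"))  # hello-world-123
print(slugify_sketch("Crème brûlée"))       # creme-brulee
```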
strip_ansi: Remove ANSI escape sequences.
strip_html: Remove HTML tags.
strip_control: Remove ASCII control characters (except tab, newline, CR).
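The three sanitizers can be approximated with the standard library alone. The sketch below uses hypothetical `*_sketch` names and makes a few assumptions the source does not confirm: the ANSI pattern covers only CSI sequences (such as color codes), DEL (0x7f) is treated as a control character, and HTML is handled by collecting text nodes with `html.parser` rather than by whatever method the package uses.

```python
import re
from html.parser import HTMLParser

# CSI sequences: ESC [ parameter bytes, intermediate bytes, one final byte.
_ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

# ASCII control characters except tab (\x09), newline (\x0a), and CR (\x0d).
# Including DEL (\x7f) is an assumption.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_ansi_sketch(text: str) -> str:
    """Hypothetical stand-in: remove CSI escape sequences."""
    return _ANSI_CSI.sub("", text)


def strip_control_sketch(text: str) -> str:
    """Hypothetical stand-in: remove control characters, keep \\t \\n \\r."""
    return _CONTROL.sub("", text)


class _TextCollector(HTMLParser):
    """Collects text nodes and ignores all tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def strip_html_sketch(text: str) -> str:
    """Hypothetical stand-in: keep only the text between tags."""
    collector = _TextCollector()
    collector.feed(text)
    collector.close()
    return "".join(collector.parts)


print(strip_ansi_sketch("\x1b[31mError\x1b[0m: something broke"))
# Error: something broke
print(strip_html_sketch("<p>Hello <b>world</b></p>"))
# Hello world
```

Because each function takes a string and returns a string, they compose directly, for example `strip_control_sketch(strip_ansi_sketch(raw))` when cleaning captured terminal output.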
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
mypy src/
ruff check src/ tests/
src/datautil_extras/
├── __init__.py # Public API re-exports
├── encoding.py # Base32 and base62 encoding
├── slugify.py # URL-safe slug generation
└── sanitize.py # String cleanup utilities
- Python 3.10+
- No runtime dependencies