Lightweight, configurable engine for normalizing Excel/CSV workbooks. Supported input formats are .xlsx, .xlsm, .xls, and .csv; output is always written as .xlsx.
- Python 3.11+
- pip (or uv)
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
uv sync
# Stable
pip install "git+https://github.com/clac-ca/automatic-data-extractor.git#subdirectory=apps/ade-engine"
# Development branch
pip install "git+https://github.com/clac-ca/automatic-data-extractor@development#subdirectory=apps/ade-engine"
pytest -q
# 1) Create a starter config package (uses bundled template)
ade-engine config init my-config --package-name ade_config
# 2) Validate the config package
ade-engine config validate --config-package my-config
# 3) Process a single file
ade-engine process file \
--input data/samples/CaressantWRH_251130__ORIGINAL.xls \
--output-dir output \
--config-package my-config
# 4) Process an entire directory
ade-engine process batch \
--input-dir data/samples \
--output-dir output/batch \
--config-package my-config
Notes:
- The --config-package option can point to your generated folder (e.g., my-config) or any config package path; it is required unless set via ADE_ENGINE_CONFIG_PACKAGE or settings.toml.
- process batch --include acts as an allowlist; if provided, only matching files run. --exclude patterns always prune recursively.
- process file requires either --output or --output-dir (mutually exclusive).
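The include/exclude behaviour described in the notes can be illustrated with a small Python sketch. This is not the engine's matching code: the helper `select_inputs` is hypothetical, and matching patterns with `fnmatch` against file names and path components is an assumption made only to show the allowlist and prune semantics.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


# Hypothetical helper: illustrates the allowlist/prune semantics only.
# The real engine may match patterns differently.
def select_inputs(paths, include=(), exclude=()):
    """Return the paths that pass the include/exclude filters."""
    selected = []
    for raw in paths:
        path = PurePosixPath(raw)
        # Exclude prunes recursively: a match on any path component
        # (directory or file name) drops the path.
        if any(fnmatch(part, pat) for part in path.parts for pat in exclude):
            continue
        # Include is an allowlist, applied only when patterns are given.
        if include and not any(fnmatch(path.name, pat) for pat in include):
            continue
        selected.append(raw)
    return selected


files = [
    "data/samples/a.xlsx",
    "data/samples/notes.txt",
    "data/samples/archive/b.csv",
    "data/samples/c.csv",
]
print(select_inputs(files, include=["*.xlsx", "*.csv"], exclude=["archive"]))
# → ['data/samples/a.xlsx', 'data/samples/c.csv']
```

With no include patterns every non-excluded file is kept, which mirrors the "if provided" wording above.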
- ade-engine process file – run the engine on one input file.
- ade-engine process batch – recurse a directory of inputs.
- ade-engine config init – scaffold a config package from the bundled template.
- ade-engine config validate – import and register a config package to ensure it’s wired correctly.
- ade-engine version – print the CLI version.
- Logs and outputs default to ./logs and ./output when not provided.
- To change defaults globally, set environment variables with the ADE_ENGINE_ prefix or add a settings.toml alongside your runs.
- Need types for the web app? From the repo root, run ade types (if working in the full monorepo).
MIT License. See LICENSE.