This project automates the training of machine learning models for wireless localization using a low-code, configuration-first framework in which 1) experiments are declared in human-readable configuration, 2) a workflow orchestrator runs standardized pipelines from data preparation to reporting, and 3) all artifacts (datasets, models, metrics, and reports) are versioned. The pre-configured, versioned datasets reduce initial setup and boilerplate, speeding up model development and evaluation. The design, with clear extension points, lets experts add components without reworking the infrastructure.
- A low-code, configuration-first framework that bridges ease of use with scientific rigor by making reproducibility its default operating mode, integrating version control, execution isolation, and transparent artifact tracking
- Automated Training Pipelines: Five specialized pipelines for CTW2019, CTW2020, Log-a-Tec, Lumos5G and UMU datasets.
- Easy Setup: Minimal setup required with Conda dependencies.
- DVC Integration: Leveraging DVC for efficient data and model versioning, ensuring reproducibility and traceability.
- Automated Report Generation: Consistent, comparable evaluation by applying a standardized set of metrics and reporting procedures across all methods and datasets, eliminating glue code and improving the credibility of results.
- `artifacts/<dataset>/data/{raw,interim,splits,prepared}` contains the dataset at different stages of the data preparation pipeline.
- `configs/<dataset>/dvc.yaml` contains pipeline instructions for the DVC tool.
- `configs/<dataset>/params.yaml` contains ML model configurations.
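The split between `dvc.yaml` (pipeline stages) and `params.yaml` (model settings) can be illustrated with a hypothetical training stage. All stage names, scripts, and parameter keys below are assumptions for illustration, not the repository's actual files:

```yaml
# Hypothetical configs/<dataset>/dvc.yaml stage (illustrative names only)
stages:
  train:
    cmd: python train.py              # training entry point (assumed)
    deps:
      - ../../artifacts/ctw2019/data/prepared
    params:                           # keys read from the adjacent params.yaml
      - model.hidden_units
      - train.epochs
    outs:
      - ../../artifacts/ctw2019/models
```

Because DVC reruns a stage only when its declared `deps` or `params` change, editing a value in `params.yaml` is enough to trigger retraining on the next `dvc repro`.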
Before you begin, ensure you have the following installed:
- Conda for managing dependencies.
- Clone the repository to your local machine:
```shell
git clone https://github.com/sensorlab/nancy-saas-localization
```

- Navigate to the cloned directory and install the required Conda dependencies:
```shell
conda env create -f environment.yaml
conda activate nancy
```

To build all models for all datasets, run the `./run_pipelines.sh` script. Grab a ☕ as it takes some time to build models from scratch.
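For orientation, `./run_pipelines.sh` presumably iterates over the per-dataset configurations. The sketch below only echoes the commands such a script would run; the dataset list and the `--pull` flag are assumptions, not the actual script contents:

```shell
#!/usr/bin/env bash
# Dry-run sketch of what run_pipelines.sh might do (assumed behavior):
# reproduce each dataset's DVC pipeline from its config directory.
set -euo pipefail

run_all() {
  for dataset in ctw2019 ctw2020 logatec lumos5g umu; do
    # A real run would execute: (cd "configs/${dataset}" && dvc repro --pull)
    echo "would build models for ${dataset}"
  done
}

run_all
```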
Warning
In the current setup, we don't have an artifact cache available, so it will take some time to build all models from scratch.
If you wish to work on one particular dataset, follow these steps:
- Activate the conda environment with the `conda activate nancy` command.
- Enter the subfolder `configs/<dataset>` containing the configurations for a dataset. Replace `<dataset>` with one of `ctw2019`, `ctw2020`, `logatec`, or `lumos5g`.
- (Optional) Tune/change the ML model parameters in the `params.yaml` file.
- Run the following command to start the model training process:

```shell
# On the first run, also pull the dataset dependencies with `--pull`.
# If DVC complains about something related to the cache, add the `--force` flag.
dvc repro --pull

# On any subsequent run, it should be enough to run
dvc repro
```

When submitting a pull request, we suggest running the pre-commit checks. If you don't have pre-commit installed yet, follow these steps:
- Run `pip install pre-commit` to install the pre-commit hooks tool.
- Run `pre-commit install` to make the tool part of the `git commit` step.
Now run `pre-commit run --all-files` to see if your changes comply with the code rules.
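pre-commit reads its hook list from a `.pre-commit-config.yaml` file at the repository root. A minimal illustrative example is shown below; the repository's actual hook set may differ:

```yaml
# Illustrative .pre-commit-config.yaml; the real hook set may differ.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace   # strip trailing spaces
      - id: end-of-file-fixer     # ensure files end with a newline
```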
This project is licensed under the BSD-3 Clause License - see the LICENSE file for details.
If you use this tool, please cite our paper:
@article{strnad2025configuration,
title={A Configuration-First Framework for Reproducible, Low-Code Localization},
author={Strnad, Tim and Bertalani{\v{c}}, Bla{\v{z}} and Fortuna, Carolina},
journal={arXiv preprint arXiv:2510.25692},
year={2025}
}
This project has received funding from the European Union's Horizon Europe Framework Programme under grant agreement No. 101096456 (NANCY). The project is supported by the Smart Networks and Services Joint Undertaking and its members.