MAQS: Model-Augmented Q-Learning Scheduler for Energy-Harvesting IoT Devices

This repository contains the simulation code accompanying the paper:

MAQS: Model-Augmented Q-Learning Scheduler for Energy-Harvesting IoT Devices
Brendan J. Mackenzie, Max Sebrechts, Koustabh Dolui, Sam Michiels, Danny Hughes
DistriNet, KU Leuven - Accepted at IEEE WoWMoM 2026

Overview

MAQS is a fully on-device Reinforcement Learning (RL) approach for dynamic task scheduling on energy-harvesting Internet of Things (IoT) devices. Unlike existing RL methods that require off-device training in simulated environments, MAQS devices continually learn and adapt during deployment. The energy management problem is formulated as a Markov Decision Process (MDP), where the IoT device acts as an agent that simultaneously performs tasks and learns an optimal scheduling policy through Q-learning.

Key contributions:

On-device RL: MAQS operates entirely on-device, requiring only 324 bytes of RAM for its Q-table and negligible computation, making it viable on the most resource-constrained hardware.
Phantom Q-updates: A novel mechanism that updates multiple Q-values from a single experience, significantly accelerating convergence.
No manual tuning: Unlike AsTAR/AsTAR++, which require careful scenario-specific calibration, MAQS replaces expert-driven parameter tuning with automated learning.
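
The Q-learning loop behind these contributions can be sketched as follows. This is a minimal illustration, not the code in Qlearning.py. The 9 × 9 table shape is an assumption, chosen only because 81 four-byte floats occupy 324 bytes. The hyperparameters are also assumptions, as is the `phantom_update` helper and the `model` callable it takes, which show one plausible form of a model-augmented update.

```python
import random
from array import array

# Hypothetical shape: 9 states x 9 actions x 4-byte floats = 324 bytes.
N_STATES, N_ACTIONS = 9, 9
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative hyperparameters

# Flat float32 Q-table, indexed as q[state * N_ACTIONS + action].
q = array("f", [0.0] * (N_STATES * N_ACTIONS))


def q_row(state):
    """Return the Q-values of all actions for one state."""
    start = state * N_ACTIONS
    return q[start:start + N_ACTIONS]


def choose_action(state, rng=random):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit."""
    if rng.random() < EPSILON:
        return rng.randrange(N_ACTIONS)
    row = q_row(state)
    return max(range(N_ACTIONS), key=lambda a: row[a])


def q_update(state, action, reward, next_state):
    """Standard tabular Q-learning update for one (s, a, r, s') experience."""
    idx = state * N_ACTIONS + action
    target = reward + GAMMA * max(q_row(next_state))
    q[idx] += ALPHA * (target - q[idx])


def phantom_update(state, model):
    """Assumed form of a model-augmented update: `model(state, action)`
    predicts (reward, next_state) for every action, and each prediction
    is applied as if it had been experienced."""
    for action in range(N_ACTIONS):
        reward, next_state = model(state, action)
        q_update(state, action, reward, next_state)
```

With this layout, `q.itemsize * len(q)` evaluates to 324 on platforms where a C float is four bytes. A single call to `phantom_update` touches every action of the current state, which is the sense in which one experience updates multiple Q-values.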

Four scheduling algorithms are compared in this simulation:

MAQS: Q-learning based scheduler with model-augmented updates
MAQS++: MAQS with diurnal (day/night) optimisation using separate Q-tables
AsTAR: State-of-the-art prediction-free baseline
AsTAR++: AsTAR with diurnal optimisation
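
The idea of separate day and night Q-tables in MAQS++ might look like the sketch below. How the repository actually detects night is not described here, so the irradiance threshold, the table shape, and the function names are all hypothetical.

```python
# Hypothetical threshold, in the same units as the irradiance dataset.
NIGHT_IRRADIANCE_THRESHOLD = 1.0


def make_table(n_states=9, n_actions=9):
    """Create an all-zero Q-table (shape is an assumption)."""
    return [[0.0] * n_actions for _ in range(n_states)]


# One independent table per period, so night-time learning does not
# overwrite what was learned during the day, and vice versa.
q_tables = {"day": make_table(), "night": make_table()}


def active_table(irradiance):
    """Pick the Q-table for the current period based on measured irradiance."""
    period = "night" if irradiance < NIGHT_IRRADIANCE_THRESHOLD else "day"
    return period, q_tables[period]
```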

Simulation Environment

The simulation framework uses a discrete-time model populated with real-world environmental data and hardware specifications. The hardware model is based on the Circuit Dojo nRF9160 Feather, powered by monocrystalline solar panels (7 cm × 5 cm, 17% efficiency) and supercapacitors (5F or 60F depending on the application). The model accounts for equivalent series resistance (ESR), leakage current, and DC-DC converter efficiency (Texas Instruments TPS62840). Environmental conditions use the EnHANTs long-term light irradiance dataset from Columbia University.
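
As a rough illustration of such a discrete-time model, the sketch below advances a supercapacitor voltage by one step using an energy balance. The panel area and efficiency come from the description above. The converter efficiency, leakage current, and ESR values are placeholders, not datasheet figures, and the actual model in simulation_func.py may be structured differently.

```python
import math

PANEL_AREA_M2 = 0.07 * 0.05   # 7 cm x 5 cm panel
PANEL_EFFICIENCY = 0.17       # 17% efficiency
CONVERTER_EFFICIENCY = 0.90   # placeholder, not a TPS62840 datasheet value
LEAKAGE_CURRENT_A = 10e-6     # placeholder
ESR_OHM = 0.1                 # placeholder


def step_voltage(v, irradiance_w_m2, load_current_a, dt_s, capacitance_f):
    """Advance the capacitor voltage by one time step via an energy balance.

    Simplification: the load current is treated as drawn at the capacitor
    voltage, with converter losses applied as a single efficiency factor.
    """
    energy = 0.5 * capacitance_f * v * v
    harvested = irradiance_w_m2 * PANEL_AREA_M2 * PANEL_EFFICIENCY * dt_s
    load = v * load_current_a / CONVERTER_EFFICIENCY * dt_s
    leakage = v * LEAKAGE_CURRENT_A * dt_s
    esr_loss = load_current_a ** 2 * ESR_OHM * dt_s
    energy = max(0.0, energy + harvested - load - leakage - esr_loss)
    return math.sqrt(2.0 * energy / capacitance_f)


def terminal_voltage(v, load_current_a):
    """Voltage at the terminals under load: ESR causes an I * R drop."""
    return v - load_current_a * ESR_OHM
```

For example, `step_voltage(3.0, 0.0, 0.0, 900, 60)` models a 60 F capacitor sitting idle in the dark for one 15-minute interval, where only leakage lowers the voltage.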

Repository Structure

Qlearning.py: Q-learning agent implementation (Q-table, reward function, epsilon-greedy exploration, phantom Q-update)
simulation_func.py: Core simulation functions for MAQS: voltage computation with ESR model during task/sleep cycles
AsTAR_sim_func.py: Core simulation functions for AsTAR: continuous-time task+sleep simulation with irradiance lookup
MAQS_simulation.py: MAQS simulation runner
MAQS++_simulation.py: MAQS++ simulation runner (adds night-mode optimisation)
AsTAR_simulation.py: AsTAR simulation runner
AsTAR++_simulation.py: AsTAR++ simulation runner (adds night-mode optimisation)
dataexpl.py: Dataset preprocessing script (handles missing values, aggregates to 15-min intervals)
histogram.py: Generates task interval histograms from saved results
voltage_graphs.py: Generates voltage and irradiance comparison plots from saved results
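
The preprocessing performed by dataexpl.py (missing-value handling and 15-minute aggregation) could be approximated as below. The input format, the forward-fill policy for missing readings, and the function name are assumptions made for illustration. The script itself may use a different strategy.

```python
from collections import defaultdict

BIN_SECONDS = 15 * 60


def aggregate_15min(samples):
    """Average irradiance readings into 15-minute bins.

    samples: iterable of (timestamp_s, irradiance or None), sorted by time.
    Missing readings (None) are replaced by the last valid value, which is
    an assumed policy; readings before the first valid value become 0.0.
    """
    bins = defaultdict(list)
    last_valid = 0.0
    for timestamp_s, value in samples:
        if value is None:
            value = last_valid
        else:
            last_valid = value
        bins[int(timestamp_s // BIN_SECONDS)].append(value)
    # Return (bin start time, mean irradiance) pairs in chronological order.
    return [(b * BIN_SECONDS, sum(v) / len(v)) for b, v in sorted(bins.items())]
```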

Data & Results

data/: Raw irradiance measurement files (Setups A–F from the EnHANTs dataset)
datasets/: Processed datasets (15-min averaged irradiance values)
res_small/, res_lidar/, res_ltem/: Saved simulation results (voltages, intervals) for each application
histograms/: Generated histogram PDFs
voltage_graphs/: Generated voltage comparison PDFs

Application Scenarios

Three application scenarios are defined, as described in Table II of the paper:

Specification	LiDAR	LTE-M	Small load
Sensing current	122 mA	218 μA	5 mA
Sensing duration	1.5 s	0.271 s	1.5 s
Transmission current	0	316.99 mA	0
Transmission duration	0	1 s	0
Sleep current	2.2 μA	61 μA	2.2 μA
Capacitance	60 F	60 F	5 F
Dataset	Setup C	Setup C	Setup A
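
To make the relative cost of each scenario concrete, the sketch below computes the charge drawn per task and the mean current for a given task interval from the values in the table. The dictionary layout and function names are illustrative only and do not mirror the repository code.

```python
# Currents in amperes, durations in seconds, taken from the table above.
SCENARIOS = {
    "LiDAR": dict(sense_a=122e-3, sense_s=1.5, tx_a=0.0, tx_s=0.0, sleep_a=2.2e-6),
    "LTE-M": dict(sense_a=218e-6, sense_s=0.271, tx_a=316.99e-3, tx_s=1.0, sleep_a=61e-6),
    "Small load": dict(sense_a=5e-3, sense_s=1.5, tx_a=0.0, tx_s=0.0, sleep_a=2.2e-6),
}


def task_charge_c(s):
    """Charge in coulombs drawn by one task (sensing plus transmission)."""
    return s["sense_a"] * s["sense_s"] + s["tx_a"] * s["tx_s"]


def average_current_a(s, interval_s):
    """Mean current when one task runs every interval_s seconds.

    Assumes interval_s is at least as long as the task itself; the device
    sleeps for the remainder of the interval.
    """
    active_s = s["sense_s"] + s["tx_s"]
    sleep_s = max(0.0, interval_s - active_s)
    return (task_charge_c(s) + s["sleep_a"] * sleep_s) / interval_s
```

Per task, this gives about 183 mC for LiDAR, about 317 mC for LTE-M, and 7.5 mC for the small load.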

Selecting an Application

To change the application scenario, you must update the boolean flags at the top of two files: simulation_func.py (used by MAQS/MAQS++) and AsTAR_sim_func.py (used by AsTAR/AsTAR++). Set exactly one of the three flags to True and the others to False:

# In simulation_func.py AND AsTAR_sim_func.py:
SMALL_LOAD_APPLICATION = False
LIDAR_APPLICATION = True
LTEM_APPLICATION = False

You must also update the dataset loaded in each simulation runner script (e.g. MAQS_simulation.py, AsTAR_simulation.py, etc.) to match the application's dataset:

LiDAR / LTE-M: use datasets/datasetC_processed
Small load: use datasets/datasetA_processed
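
Because the flags and the dataset path are edited by hand in several files, a small consistency check can catch mismatches. The helper below is a hypothetical sketch and not part of the repository: it verifies that exactly one flag is set and looks up the matching dataset path from the list above.

```python
SMALL_LOAD_APPLICATION = False
LIDAR_APPLICATION = True
LTEM_APPLICATION = False

# Dataset paths as listed above, keyed by an illustrative application name.
DATASET_FOR_APPLICATION = {
    "small_load": "datasets/datasetA_processed",
    "lidar": "datasets/datasetC_processed",
    "ltem": "datasets/datasetC_processed",
}


def selected_application(small_load, lidar, ltem):
    """Return the name of the single active application, or raise."""
    flags = {"small_load": small_load, "lidar": lidar, "ltem": ltem}
    active = [name for name, enabled in flags.items() if enabled]
    if len(active) != 1:
        raise ValueError(f"expected exactly one flag set to True, got {active}")
    return active[0]


app = selected_application(SMALL_LOAD_APPLICATION, LIDAR_APPLICATION, LTEM_APPLICATION)
dataset_path = DATASET_FOR_APPLICATION[app]
```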

Usage

Preprocess data (only needed once):
```
python dataexpl.py
```
Select the application by editing the flags in simulation_func.py and AsTAR_sim_func.py, and the dataset path in the simulation runner script (see Selecting an Application above).
Run a simulation (e.g., MAQS):
```
python MAQS_simulation.py
```
Generate plots from saved results:
```
python histogram.py
python voltage_graphs.py
```
NOTE: these scripts read the results stored in the res_*/ folders, whereas new simulation runs save their results in the top-level folder.

MAQS++ Bug fix

A bug was found in the MAQS++ implementation that affected the night-time derating of the optimum voltage. Fixing it shifts the simulation results slightly, but MAQS++ retains its relative position among the four scheduling methods: it still delivers higher throughput than baseline MAQS in every scenario, and it still trails baseline MAQS on reliability in the energy-tight LTE-M and small-load profiles. The most visible effect of the fix is that the time MAQS++ spends below the minimum and shutoff voltages drops substantially across all three applications, in relative terms most markedly post-training for LiDAR. The one exception is the small-load application with the first 30 days excluded, where time under V_shutoff rises from 0.59% to 0.76%.

Changes to the results from the paper can be seen in the tables below, one per application:

LiDAR application — dataset C

Period	Measure	Old	New	Δ
Full	Time under V_min	8.53%	4.06%	−4.47 pp
Full	Time under V_shutoff	1.16%	0.68%	−0.48 pp
Full	Time above V_max	0.40%	0.20%	−0.20 pp
Full	Time above V_rating	0.20%	0.09%	−0.11 pp
Full	Avg. num. tasks/hour	46.70	46.21	−0.49
Full	Task interval std. dev. (s)	192.22	195.06	+2.84
Excl. first 10 days	Time under V_min	6.57%	2.42%	−4.15 pp
Excl. first 10 days	Time under V_shutoff	1.08%	0.33%	−0.75 pp
Excl. first 10 days	Time above V_max	0.42%	0.20%	−0.22 pp
Excl. first 10 days	Time above V_rating	0.21%	0.09%	−0.12 pp
Excl. first 10 days	Avg. num. tasks/hour	47.65	47.18	−0.47
Excl. first 10 days	Task interval std. dev. (s)	192.87	195.77	+2.90
Excl. first 30 days	Time under V_min	4.03%	1.38%	−2.65 pp
Excl. first 30 days	Time under V_shutoff	0.44%	0.04%	−0.40 pp
Excl. first 30 days	Time above V_max	0.45%	0.22%	−0.23 pp
Excl. first 30 days	Time above V_rating	0.23%	0.10%	−0.13 pp
Excl. first 30 days	Avg. num. tasks/hour	50.39	49.94	−0.45
Excl. first 30 days	Task interval std. dev. (s)	190.02	190.94	+0.92

LTE-M application — dataset C

Period	Measure	Old	New	Δ
Full	Time under V_min	14.36%	10.56%	−3.80 pp
Full	Time under V_shutoff	3.62%	1.73%	−1.89 pp
Full	Time above V_max	0.14%	0.14%	0.00 pp
Full	Time above V_rating	0.01%	0.04%	+0.03 pp
Full	Avg. num. tasks/hour	26.39	26.31	−0.08
Full	Task interval std. dev. (s)	249.15	270.39	+21.24
Excl. first 10 days	Time under V_min	12.86%	8.63%	−4.23 pp
Excl. first 10 days	Time under V_shutoff	3.51%	1.63%	−1.88 pp
Excl. first 10 days	Time above V_max	0.15%	0.14%	−0.01 pp
Excl. first 10 days	Time above V_rating	0.01%	0.04%	+0.03 pp
Excl. first 10 days	Avg. num. tasks/hour	26.96	26.86	−0.10
Excl. first 10 days	Task interval std. dev. (s)	249.63	273.11	+23.48
Excl. first 30 days	Time under V_min	9.44%	6.11%	−3.33 pp
Excl. first 30 days	Time under V_shutoff	2.17%	1.15%	−1.02 pp
Excl. first 30 days	Time above V_max	0.16%	0.16%	0.00 pp
Excl. first 30 days	Time above V_rating	0.01%	0.05%	+0.04 pp
Excl. first 30 days	Avg. num. tasks/hour	28.52	28.45	−0.07
Excl. first 30 days	Task interval std. dev. (s)	247.70	270.57	+22.87

Small load application — dataset A

Period	Measure	Old	New	Δ
Full	Time under V_min	10.68%	7.76%	−2.92 pp
Full	Time under V_shutoff	1.22%	0.78%	−0.44 pp
Full	Time above V_max	0.00%	0.00%	0.00 pp
Full	Time above V_rating	0.00%	0.00%	0.00 pp
Full	Avg. num. tasks/hour	17.91	17.15	−0.76
Full	Task interval std. dev. (s)	314.09	310.69	−3.40
Excl. first 10 days	Time under V_min	9.41%	5.77%	−3.64 pp
Excl. first 10 days	Time under V_shutoff	1.01%	0.80%	−0.21 pp
Excl. first 10 days	Time above V_max	0.00%	0.00%	0.00 pp
Excl. first 10 days	Time above V_rating	0.00%	0.00%	0.00 pp
Excl. first 10 days	Avg. num. tasks/hour	17.56	16.74	−0.82
Excl. first 10 days	Task interval std. dev. (s)	321.13	317.75	−3.38
Excl. first 30 days	Time under V_min	5.99%	4.53%	−1.46 pp
Excl. first 30 days	Time under V_shutoff	0.59%	0.76%	+0.17 pp
Excl. first 30 days	Time above V_max	0.00%	0.00%	0.00 pp
Excl. first 30 days	Time above V_rating	0.00%	0.00%	0.00 pp
Excl. first 30 days	Avg. num. tasks/hour	16.75	16.03	−0.72
Excl. first 30 days	Task interval std. dev. (s)	337.21	324.77	−12.44
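
The Δ column for the percentage rows is an absolute difference in percentage points (new minus old), which is not the same as a relative change. The helper functions below are illustrative and not part of the repository; they show both calculations.

```python
def delta_pp(old_percent, new_percent):
    """Absolute change in percentage points (new minus old)."""
    return round(new_percent - old_percent, 2)


def relative_change(old, new):
    """Change relative to the old value, in percent; None if old is zero."""
    if old == 0:
        return None
    return 100.0 * (new - old) / old
```

For the LiDAR full-period time under V_min, `delta_pp(8.53, 4.06)` gives −4.47 pp, while `relative_change(8.53, 4.06)` shows that this corresponds to a reduction of roughly 52% relative to the old value.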
