A mixed-integer linear programming model for optimizing operator-task-site assignments in an autonomous systems data collection operation.
Modern autonomous systems depend on large volumes of high-quality training data collected by human operators. The challenge: assign operators to tasks and sites to maximize quality-adjusted data output, subject to:
- Limited autonomous unit availability
- Site capacity constraints
- Minimum staffing requirements per task
- Varying operator skill levels across different task types
Rather than optimizing raw volume, we define a quality-scaled rate for each operator-task pair:
r_ij = (average volume per hour) × (QA pass rate)
This captures both speed and quality in a single metric.
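As a minimal sketch of the metric, the snippet below computes r_ij for two hypothetical operators on the same task. The function name `quality_scaled_rate` and the input numbers are illustrative assumptions, not values from the project data.

```python
def quality_scaled_rate(volume_per_hour: float, qa_pass_rate: float) -> float:
    """Return r_ij: hourly volume discounted by the fraction that passes QA."""
    if not 0.0 <= qa_pass_rate <= 1.0:
        raise ValueError("qa_pass_rate must be between 0 and 1")
    return volume_per_hour * qa_pass_rate


# Hypothetical operators on the same task (made-up numbers).
fast_but_sloppy = quality_scaled_rate(volume_per_hour=2.0, qa_pass_rate=0.5)
slow_but_careful = quality_scaled_rate(volume_per_hour=1.5, qa_pass_rate=0.75)

print(fast_but_sloppy)   # 1.0
print(slow_but_careful)  # 1.125
```

In this example the slower operator has the higher quality-scaled rate, which is the trade-off the metric is meant to expose.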
x_ijk ∈ {0, 1}
where x_ijk = 1 if operator i is assigned to task j at site k, and 0 otherwise.
Maximize total quality-scaled output:
V = Σ_i Σ_j Σ_k r_ij · x_ijk
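To make the objective concrete, here is a small sketch that evaluates V for a fixed assignment. The operator, task, and site names and the rates are hypothetical; the sparse representation (a set of triples where x_ijk = 1) is one convenient choice, not necessarily how `optimize.py` stores the variables.

```python
# Hypothetical rates r_ij keyed by (operator, task); values are illustrative only.
rates = {
    ("op_a", "task_1"): 1.0,
    ("op_a", "task_2"): 0.5,
    ("op_b", "task_1"): 0.75,
    ("op_b", "task_2"): 1.25,
}

# Sparse form of x_ijk: list only the (operator, task, site) triples where x_ijk = 1.
assignment = {("op_a", "task_1", "site_x"), ("op_b", "task_2", "site_y")}


def total_output(rates, assignment):
    """Compute V as the sum of r_ij over every triple with x_ijk = 1."""
    return sum(rates[(i, j)] for (i, j, _k) in assignment)


V = total_output(rates, assignment)
print(V)  # 2.25
```

Because r_ij carries no site index, the site affects V only through the constraints, not through the objective.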
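- Each operator is assigned at most once: Σ_j Σ_k x_ijk ≤ 1 for every operator i
- Minimum staffing per task type: Σ_i Σ_k x_ijk ≥ (minimum staff for task j) for every task j
- Maximum capacity per site: Σ_i Σ_j x_ijk ≤ (capacity of site k) for every site k
- Total assignments ≤ available autonomous units: Σ_i Σ_j Σ_k x_ijk ≤ (available units)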
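The four constraints can be checked for any candidate assignment with a few counters. The sketch below is a standalone feasibility check, not the PuLP model itself; the function name, argument names, and example data are assumptions made for illustration.

```python
from collections import Counter


def violated_constraints(assignment, min_staff, site_capacity, units_available):
    """Return the constraints broken by a set of (operator, task, site) triples."""
    per_operator = Counter(i for i, _j, _k in assignment)
    per_task = Counter(j for _i, j, _k in assignment)
    per_site = Counter(k for _i, _j, k in assignment)

    violations = []
    if any(n > 1 for n in per_operator.values()):
        violations.append("operator assigned more than once")
    if any(per_task[j] < need for j, need in min_staff.items()):
        violations.append("task below minimum staffing")
    # Sites missing from site_capacity are treated as uncapped in this sketch.
    if any(per_site[k] > cap for k, cap in site_capacity.items()):
        violations.append("site over capacity")
    if len(assignment) > units_available:
        violations.append("more assignments than units")
    return violations


# Hypothetical toy data.
min_staff = {"task_1": 1, "task_2": 1}
site_capacity = {"site_x": 1, "site_y": 2}

ok = {("op_a", "task_1", "site_x"), ("op_b", "task_2", "site_y")}
bad = {
    ("op_a", "task_1", "site_x"),
    ("op_b", "task_1", "site_x"),
    ("op_c", "task_1", "site_y"),
}

print(violated_constraints(ok, min_staff, site_capacity, units_available=2))
print(violated_constraints(bad, min_staff, site_capacity, units_available=2))
```

The first call reports no violations. The second leaves `task_2` unstaffed, puts two operators at a site with capacity 1, and uses three units when only two are available.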
For a sample problem with:
- 32 operators
- 21 tasks
- 6 sites
- 30 available autonomous units
The solver finds the optimal assignment in under 1 second, producing 53+ quality-hours of usable data per 8-hour shift, a 60% improvement over random assignment.
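One way to estimate a comparison against random assignment is sketched below on a toy instance. This is an illustrative assumption about how such a baseline could be built, not the procedure behind the figures above: sites are omitted, the rates are made up, random plans are drawn by rejection sampling, and the optimum is found by brute-force enumeration rather than by the MILP solver.

```python
import itertools
import random

# Hypothetical toy instance; rates r_ij are made up for illustration.
operators = ["a", "b", "c", "d"]
tasks = ["t1", "t2"]
rates = {
    ("a", "t1"): 1.0, ("a", "t2"): 0.25,
    ("b", "t1"): 0.5, ("b", "t2"): 1.0,
    ("c", "t1"): 0.75, ("c", "t2"): 0.5,
    ("d", "t1"): 0.25, ("d", "t2"): 0.75,
}
units = 3  # at most this many operators can work at once


def value(plan):
    """Total quality-scaled rate of a plan mapping operator -> task."""
    return sum(rates[(i, j)] for i, j in plan.items())


def feasible(plan):
    """Every task needs at least one operator; plan size is capped by units."""
    return len(plan) <= units and set(plan.values()) == set(tasks)


def random_plan(rng):
    """Draw a random feasible plan by rejection sampling."""
    while True:
        chosen = rng.sample(operators, units)
        plan = {i: rng.choice(tasks) for i in chosen}
        if feasible(plan):
            return plan


def best_plan():
    """Enumerate every plan (each operator idle or on one task); keep the best feasible one."""
    options = [None] + tasks
    best = None
    for combo in itertools.product(options, repeat=len(operators)):
        plan = {i: j for i, j in zip(operators, combo) if j is not None}
        if feasible(plan) and (best is None or value(plan) > value(best)):
            best = plan
    return best


rng = random.Random(0)  # fixed seed so the baseline estimate is repeatable
baseline = sum(value(random_plan(rng)) for _ in range(2000)) / 2000
optimum = value(best_plan())
improvement = 100 * (optimum - baseline) / baseline
print(f"optimum={optimum:.2f} baseline={baseline:.2f} improvement={improvement:.0f}%")
```

Brute force is only practical at this toy size; the full problem needs the solver.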
Install dependencies:
pip install -r requirements.txt
Run the optimizer:
cd src
python optimize.py
Launch the dashboard:
cd src
streamlit run Home.py
Project structure:
├── data/
│   ├── input_data.xlsx          # Default operator/task/site data
│   ├── operators.csv            # Operator list
│   ├── tasks.csv                # Tasks and minimum requirements
│   ├── sites.csv                # Sites and capacities
│   └── rates.csv                # Quality-scaled rates r_ij
├── src/
│   ├── Home.py                  # Streamlit dashboard
│   ├── dashboard.py             # Dashboard utilities
│   ├── optimize.py              # PuLP optimization script
│   ├── monte_carlo.py           # Monte Carlo simulation
│   ├── generate_mock_data.py    # Synthetic data generator
│   └── pages/
│       ├── 1_Operator_Skills.py
│       ├── 2_Assignment_Matrix.py
│       └── 3_Documentation.py
├── paper_figures/ # Publication-quality plots
├── paper_plots.py # Plot generation script
└── requirements.txt
- Multi-block assignments: Operators complete multiple task blocks per shift
- Dynamic re-optimization: Re-run when unit availability changes mid-shift
- Preference constraints: Accommodate supervisor preferences for specific operators
Kevin P. Rodriguez
December 2025