Large language models (LLMs) have advanced code generation from single-function tasks to competitive-programming problems, but existing multi-agent solutions either rely on costly large-scale (>30B) models or collapse when downsized to small open-source models. We present MapCoder-Lite, which upgrades a single 7B model into four role-specialised agents—retriever, planner, coder, and debugger—using only rank-32, role-specific LoRA adapters (<3% extra parameters).
Three lightweight techniques make this possible: (i) trajectory distillation from strong LLMs fixes format fragility in retrieval and debugging, (ii) supervisor-guided correction strengthens planning and coding agents, and (iii) agent-wise LoRA fine-tuning delivers memory-efficient specialisation. Comprehensive evaluation on xCodeEval, APPS, and CodeContests shows that MapCoder-Lite more than doubles xCodeEval accuracy (13.2% → 28.3%), eliminates all format failures, and closes to within six points of a 32B baseline while cutting GPU memory and token-generation time by 4×.
MapCoder-Lite extends a single 7B backbone model (Qwen2.5-7B-Instruct) with four specialized agents, each fine-tuned using lightweight LoRA adapters:
- Retrieval Agent: Identifies relevant algorithms and generates XML-formatted tutorials
- Planning Agent: Creates step-by-step execution plans with confidence scores
- Coding Agent: Implements the plan into executable code
- Debugging Agent: Iteratively fixes bugs based on test outcomes
All agents share the same frozen 7B base model, with each agent having its own rank-32 LoRA adapter, adding less than 3% of the base model's parameters.
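The multi-adapter setup can be pictured with Hugging Face PEFT: one frozen Qwen2.5-7B-Instruct backbone hosts a separate LoRA adapter per role, and the active adapter is switched before each agent call. The sketch below is illustrative only (the repository's own inference path goes through src/main.py with --lora_path_* flags), and the adapter paths are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")

# Attach one rank-32 LoRA adapter per agent role; adapter paths are placeholders.
model = PeftModel.from_pretrained(model, "lora/retrieval", adapter_name="retrieval")
model.load_adapter("lora/planning", adapter_name="planning")
model.load_adapter("lora/coding", adapter_name="coding")
model.load_adapter("lora/debugging", adapter_name="debugging")

# Activate the adapter for whichever agent is about to run.
model.set_adapter("planning")
```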
# Clone the repository
git clone https://github.com/your-repo/MapCoder-Lite.git
cd MapCoder-Lite
# Create a virtual environment
conda create -n mapcoder-lite python=3.10
conda activate mapcoder-lite
# Install dependencies
pip install -r requirements.txt

MapCoder-Lite uses trajectory distillation from strong LLMs (Qwen2.5-32B, DeepSeek-V3) and supervisor-guided refinement to create high-quality training datasets for each agent. The collected datasets are filtered based on execution tests to ensure format correctness and semantic accuracy.
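In spirit, the execution-test filter keeps only trajectory records whose generated code passed the tests. The following is a minimal sketch of that step; the JSONL field names (e.g. "passed") are assumptions, not the repository's actual schema.

```python
import json

def filter_trajectories(in_path: str, out_path: str) -> None:
    """Keep only trajectory records whose code passed the execution tests.

    The "passed" flag is an assumed field name, used here for illustration only.
    """
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            record = json.loads(line)
            if record.get("passed"):
                fout.write(json.dumps(record) + "\n")

filter_trajectories(
    "outputs/trajectory/traj_ret.jsonl",
    "train/retrieval/data/ret_filtered.jsonl",
)
```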
To collect retrieval agent training data, run:
python src/main.py \
--track traj_ret \
--strategy MapCoderRPC \
--ret Qwen32B \
--plan Qwen32B \
--conf Qwen32B \
--code Qwen32B \
--debug Qwen32B \
--dataset xCodeEvalTrain

The tracked data will be saved to ./outputs/trajectory/traj_ret.jsonl.
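Since the retrieval agent is expected to emit XML-formatted tutorials, a quick well-formedness check over the collected file can catch format failures early. This is a hypothetical sanity check, and the "response" field name is an assumption about the JSONL schema.

```python
import json
import xml.etree.ElementTree as ET

# Verify that each collected retrieval tutorial parses as well-formed XML.
with open("outputs/trajectory/traj_ret.jsonl") as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        try:
            ET.fromstring(record["response"])
        except (ET.ParseError, KeyError) as err:
            print(f"record {i}: format problem ({err})")
```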
To collect debugging agent training data, run:
python src/main.py \
--track traj_debug \
--strategy MapCoder \
--ret Qwen7B \
--plan Qwen7B \
--conf Qwen7B \
--code Qwen7B \
--debug Qwen32B \
--dataset xCodeEvalTrain

The tracked data will be saved to ./outputs/trajectory/traj_debug.jsonl.
To collect planning and coding agent training data using supervisor-guided refinement, run:
python src/main.py \
--track traj_plan_code_supervised \
--strategy RPC_Supervisor \
--ret Qwen7B \
--plan Qwen7B \
--conf Qwen7B \
--code Qwen7B \
--debug Qwen7B \
--dataset xCodeEvalTrain

The supervisor model (DeepSeek-V3) will analyze failures, identify responsible agents, and provide targeted feedback. The tracked data will be saved to ./outputs/trajectory/traj_plan_code_supervised.jsonl.
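Conceptually, the supervisor loop blames either the planner or the coder for a failed run and has the blamed 7B agent retry with targeted feedback. The sketch below is a conceptual outline only; all objects (planner, coder, supervisor, run_tests) are hypothetical stand-ins, not the repository's classes.

```python
def supervised_refinement(problem, planner, coder, supervisor, run_tests, max_rounds=3):
    """Conceptual sketch of supervisor-guided correction (not the repository's code)."""
    plan = planner.generate(problem)
    code = coder.generate(problem, plan)
    for _ in range(max_rounds):
        result = run_tests(problem, code)
        if result.passed:
            return plan, code  # successful trajectory, usable as training data
        # Supervisor inspects the failure and names the responsible agent.
        blame, feedback = supervisor.diagnose(problem, plan, code, result)
        if blame == "planner":
            plan = planner.generate(problem, feedback=feedback)
            code = coder.generate(problem, plan)
        else:
            code = coder.generate(problem, plan, feedback=feedback)
    return None  # unrecoverable trajectory, discarded
```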
TBD
The training code is organized in the train/ directory, with separate subdirectories for each agent:
train/
├── retrieval/ # Retrieval agent training
│ ├── train_sft_lora.py
│ ├── data/ # Training data directory
│ └── lora/ # Trained LoRA adapters saved here
├── planning/ # Planning agent training - plan generation
├── confidence/ # Planning agent training - Confidence score calculation
├── coding/ # Coding agent training
└── debugging/ # Debugging agent training
All agents use the same training script, train_sft_lora.py, with similar parameters. The general training command is:
cd train/<agent_directory>
python train_sft_lora.py \
--data_files <data_file_name> \
--model_name Qwen/Qwen2.5-7B-Instruct \
--batch 16 \
--rank 32 \
--alpha 32 \
--epochs 3 \
--module qkvo \
--lr 20

Agent-specific directories:
- Retrieval: train/retrieval/ - Train the retrieval agent for algorithm identification and XML tutorial generation
- Planning: train/planning/ - Train the planning agent for step-by-step plan generation
- Confidence: train/confidence/ - Train confidence score calculation
- Coding: train/coding/ - Train the coding agent for code implementation
- Debugging: train/debugging/ - Train the debugging agent for bug fixing
Trained LoRA adapters are saved in each agent's lora/ subdirectory. Training data should be placed in each agent's data/ subdirectory in JSONL format.
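The training command above maps naturally onto a standard LoRA SFT setup (rank 32, alpha 32, adapters on the q/k/v/o attention projections). The following is an illustrative sketch using TRL and PEFT, not the repository's actual train_sft_lora.py; file paths, the "text" data column, and the learning-rate value are assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Rank-32 LoRA on the attention projections, mirroring --rank 32 / --alpha 32 / --module qkvo.
peft_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Placeholder data file; each record is assumed to hold the formatted trajectory in a "text" column.
dataset = load_dataset("json", data_files="data/ret_filtered.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    args=SFTConfig(
        output_dir="lora/ret-sft-sketch",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-5,  # illustrative value; the script's --lr flag uses its own convention
    ),
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
trainer.save_model("lora/ret-sft-sketch")
```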
After training all agents, you can run MapCoder-Lite inference using the fine-tuned LoRA adapters:
python src/main.py \
--ret Qwen7BFTRet \
--lora_path_ret train/retrieval/lora/ret-sft-merged_true-qkvo-r32-s427-lr20-b16 \
--plan Qwen7BFTPlan \
--lora_path_plan train/planning/lora/plan-sft-verify_7b-qkvo-r32-s211-lr20-b16 \
--conf Qwen7BFTConf \
--code Qwen7BFTCode \
--lora_path_code train/coding/lora/code-sft-verify_7b_coding-qkvo-r32-s211-lr20-b16 \
--debug Qwen7BFTDebug \
--lora_path_debug train/debugging/lora/debug-sft-all_true_4156-qkvo-r32-s779-lr20-b16 \
--dataset xCodeEval \
--strategy MapCoder

Or use the provided script:

bash run.sh

MapCoder-Lite achieves the following results on competitive programming benchmarks:
| Benchmark | MapCoder (7B) | MapCoder-Lite (7B) | Improvement |
|---|---|---|---|
| xCodeEval | 13.21% | 28.30% | +114% |
| APPS | 6.00% | 8.00% | +33% |
| CodeContests | 6.06% | 13.33% | +120% |
Key improvements:
- Format failures eliminated: 29 → 0 on xCodeEval
- Memory efficiency: 4× reduction in GPU memory usage
- Performance gap: Within 6 points of 32B MapCoder baseline
If you use MapCoder-Lite in your research, please cite:
@article{lee2025mapcoderlite,
title={MapCoder-Lite: Distilling Multi-Agent Coding into a Single Small LLM},
author={Woongkyu Lee and Junhee Cho and Jungwook Choi},
journal={arXiv preprint arXiv:2509.17489},
year={2025}
}

This project is licensed under the MIT License - see the LICENSE file for details.
This work builds upon the original MapCoder framework. We thank the authors for their foundational contributions to multi-agent code generation.