---
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-8B
pipeline_tag: text-generation
---
This model is a fine-tuned version of Qwen/Qwen3-8B, trained to understand and answer questions about a given private or newly created project repository. The example here is Laddr, a framework for building scalable multi-agent systems.
Fine-tuning was performed with LoRA (Low-Rank Adaptation) using a training-data generation approach that does not rely on LLM-generated synthetic data, avoiding circular dependencies and hallucination issues.
- ✅ Project-Specific Knowledge: Deep understanding of Laddr's architecture, codebase, and APIs
- ✅ Code Location: Accurately locates functions, classes, and modules (+30% improvement)
- ✅ Code Understanding: Explains code functionality with detailed context (+19.3% improvement)
- ✅ Maintains General Abilities: Retains base model's general knowledge capabilities
- ✅ Zero Hallucination Training Data: Generated from real code via AST parsing, not LLM synthesis
- Model: Qwen/Qwen3-8B
- Parameters: 8 Billion
- Architecture: Transformer-based causal language model
- Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 64
- LoRA Alpha: 128
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Framework: DeepSpeed ZeRO-3
- Precision: BF16
- Epochs: 3
- Training Samples: 650+
- Training Time: ~2-3 hours on 2x GPUs (48GB each)
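For reference, the hyperparameters above map onto a PEFT adapter configuration roughly as sketched below. This is a minimal illustration, not the project's actual training script; the dropout value is an assumption not stated in this card.

```python
# Minimal sketch of the LoRA setup implied by the hyperparameters above,
# using HuggingFace PEFT. Script structure and dropout are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16")

lora_config = LoraConfig(
    r=64,            # LoRA rank
    lora_alpha=128,  # scaling factor
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    lora_dropout=0.05,  # assumption: not stated in this card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```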
The training dataset was automatically generated from the Laddr repository using:
- Python AST parsing for code structure extraction
- Real docstrings and code comments
- Function signatures and parameter information
- Call graph relationships
- Project statistics and module structure
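To make the extraction step concrete, here is a minimal sketch using only Python's standard `ast` module; the function name, repository path, and output schema are illustrative, and the project's actual analyzer may differ.

```python
# Sketch: extract functions, classes, locations, and real docstrings via AST.
# Standard library only; no LLM is involved at any point.
import ast
from pathlib import Path

def extract_elements(py_file: Path) -> list[dict]:
    """Collect code elements with their source locations and docstrings."""
    tree = ast.parse(py_file.read_text(encoding="utf-8"))
    elements = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            elements.append({
                "name": node.name,
                "kind": type(node).__name__,
                "file": str(py_file),
                "lineno": node.lineno,                       # enables location answers
                "docstring": ast.get_docstring(node) or "",  # real docs, not synthesized
            })
    return elements

# "laddr" is a placeholder for the cloned repository root.
repo_elements = [e for f in Path("laddr").rglob("*.py") for e in extract_elements(f)]
```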
Data Composition:
- Code Explanation: 300+ samples (46%)
- API Usage: 150+ samples (23%)
- Code Location: 100+ samples (15%)
- Project Overview: 50+ samples (8%)
- Design Proposals: 50+ samples (8%)
Data Split:
- Training: 80% (520+ samples)
- Validation: 10% (65+ samples)
- Test: 10% (65+ samples)
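A seeded shuffle is enough to reproduce a split like this one; the sketch below is illustrative rather than the project's exact splitting code.

```python
# Sketch: deterministic 80/10/10 train/validation/test split.
import random

def split_dataset(samples: list, seed: int = 42) -> tuple[list, list, list]:
    rng = random.Random(seed)  # fixed seed for reproducibility
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return (
        shuffled[:n_train],                 # training: 80%
        shuffled[n_train:n_train + n_val],  # validation: 10%
        shuffled[n_train + n_val:],         # test: 10%
    )
```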
| Metric | Base Model | Fine-tuned | Improvement |
|---|---|---|---|
| Overall Score | 49.4% | 71.5% | +22.1% ✅ |
| Code Location | 60.0% | 90.0% | +30.0% ⭐ |
| Code Understanding | 59.3% | 78.6% | +19.3% |
| Project Overview | 35.0% | 51.7% | +16.7% |
| General Knowledge | 10.0% | 30.0% | +20.0% |
Code Location Tasks (+30.0%):
- Accurately identifies file locations of functions/classes
- Provides complete file paths with line numbers
- Eliminates uncertainty in location queries
Code Understanding Tasks (+19.3%):
- Explains code functionality with context
- Includes function signatures and parameters
- Extracts and presents real docstrings
Project Overview Tasks (+16.7%):
- Describes project purpose and architecture
- Lists technology stack and dependencies
- Provides project statistics and structure
General Knowledge (+20.0%):
- Maintains ability to answer general programming questions
- Improves conciseness (from verbose to clear)
- Preserves base model capabilities
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "tensense/code_repo_finetuning"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

question = "Where is the create_system_tools function located in Laddr?"
messages = [{"role": "user", "content": question}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
print(response)
```

Use cases:
- Query function usage, parameters, and return values.
- Help new developers quickly understand the Laddr project structure and key components.
- Provide context and explanations for code review processes.
Question: "Where is create_system_tools located in Laddr?"
Response:
create_system_toolsis located inlib/laddr/src/laddr/core/system_tools.py.Function Description: Create system tools for task delegation and artifact management. This function checks for user-provided overrides before registering the default implementations.
Function Signature:
def create_system_tools(message_bus, storage_backend=None, agent=None) -> dict[str, tuple[Any, list[str]]]:Parameters:
message_bus(Any): Message queue backendstorage_backend(Any): Optional storage backendagent(Any): Optional agent instanceReturns:
dict[str, tuple[Any, list[str]]]
Question: "What is Laddr's main functionality?"
Response:
Laddr is a Python project that serves as a framework for building scalable multi-agent systems.
Main Technologies: bandit, black, boto3-stubs, build, click, etc.
The project contains 279 code elements, consisting of 67 classes and 212 functions/methods.
Core Modules:
- `core` (279 elements)
- `cli` (52 elements)
- `llms` (39 elements)
- Project-Specific: Optimized for the Laddr project; may not perform as well on other codebases
- Knowledge Cutoff: Based on the Laddr repository as of training time (2025-01)
- Language Focus: Primarily trained on Python code and English/Chinese documentation
- Limited General Coding: While it maintains general knowledge, it's optimized for Laddr-specific queries
Unlike traditional approaches that use LLMs to generate synthetic training data, this project employs a novel methodology:
- AST-Based Code Parsing: Python Abstract Syntax Tree analysis extracts accurate code structure
- Real Documentation: Utilizes actual docstrings, comments, and code signatures
- Call Graph Analysis: Builds function dependency relationships
- Pattern Extraction: Identifies code patterns (implementation, usage, interaction)
- Template-Based QA: Generates question-answer pairs using templates with real code context
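To make step 5 concrete, the sketch below shows template-based QA generation; the templates and field names are illustrative and assume the element records produced by the AST extraction described above.

```python
# Sketch: fill QA templates with real extracted facts (no LLM synthesis).
LOCATION_QUESTION = "Where is the {name} function located in {project}?"
LOCATION_ANSWER = "`{name}` is located in `{file}`.\n\n{docstring}"

def make_location_qa(element: dict, project: str = "Laddr") -> dict:
    """Build one QA pair from an extracted code element."""
    return {
        "question": LOCATION_QUESTION.format(name=element["name"], project=project),
        "answer": LOCATION_ANSWER.format(**element),  # unused keys are ignored
    }
```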
Benefits:
- ✅ Avoids circular dependency (using LLM data to train LLM)
- ✅ Eliminates hallucination in training data
- ✅ Ensures factual accuracy
- ✅ Provides complete reasoning traces
```
GitHub Repository
        ↓
[1. Repository Analyzer]
    → Extracts code elements, patterns, call graph
        ↓
[2. Data Generator]
    → Creates QA pairs with code context
        ↓
[3. Model Fine-tuner]
    → LoRA + DeepSpeed ZeRO-3 training
        ↓
[4. LoRA Merger]
    → Merges adapter into base model
        ↓
[5. Model Evaluator]
    → Compares base vs. fine-tuned
        ↓
Fine-tuned Model
```
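Stage 4 (LoRA Merger) can be implemented with PEFT's standard merge utilities, roughly as follows; the adapter and output paths are placeholders.

```python
# Sketch of stage 4: fold the trained LoRA adapter into the base weights.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16")
merged = PeftModel.from_pretrained(base, "output/lora_adapter").merge_and_unload()

merged.save_pretrained("output/merged_model")  # standalone weights, PEFT no longer needed
AutoTokenizer.from_pretrained("Qwen/Qwen3-8B").save_pretrained("output/merged_model")
```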
The training methodology is repository-agnostic and can be applied to any codebase:
```bash
# 1. Update configuration
python utils/config_manager.py https://github.com/your-org/your-repo

# 2. Analyze repository
python scripts/01_analyze_repo.py

# 3. Generate training data
python scripts/02_generate_data.py

# 4. Fine-tune model
deepspeed --num_gpus=2 scripts/03_train_model.py

# 5. Merge LoRA weights
python scripts/04_merge_weights.py

# 6. Evaluate
python scripts/05_evaluate.py
```

Supported Languages (currently):
- Python (primary)
- Markdown (documentation)
Extensible to:
- JavaScript/TypeScript
- Java
- Go
- Rust
- Code Attribution: All training data comes from the open-source Laddr repository
- License Compliance: Respects Apache 2.0 license of both base model and Laddr project
- No Private Data: Only uses publicly available code
- Reproducibility: Complete methodology documented for transparency
If you use this model or methodology in your research, please cite:
```bibtex
@misc{qwen3-code-repo-finetuned-2025,
  title={Finetune any base model (e.g. Qwen3-8B) on any given code repository},
  author={Renjun XU},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/tensense/code_repo_finetuning}
}
```

- Base Model: Qwen Team for Qwen3-8B
- Laddr Project: AgnetLabs for the multi-agent framework
- Training Framework: HuggingFace Transformers, DeepSpeed, PEFT (LoRA)
This model is released under the Apache 2.0 License, consistent with:
- Qwen3-8B base model license
- Laddr project license
Renjun XU
For questions or issues, please contact:
- Email: xu@tensense.org
- GitHub: TopologyApplied
- HuggingFace: tensense
- Base Model: Qwen/Qwen3-8B
- Training Code: GitHub Repository
- Checkpoint & Finetuned Model: Huggingface
- Laddr Project: GitHub
- Evaluation Report: [Link to comparison_report.json]
- Design Documentation: [Link to design docs]
- Initial release
- Fine-tuned on Laddr repository
- 650+ training samples
- LoRA rank 64, alpha 128
- 3 epochs training
- Overall improvement: +22.1%
Note: This is a demonstration of repository-specific fine-tuning methodology. The approach can be adapted to any codebase for creating custom code assistants.