Fine-tune language models to automatically generate documentation for different programming languages (Python, Java, Go, etc.)

fastbatchai/docstring-generation

🚀 AutoDoc Course

MIT License Python Course

Learn how to fine-tune language models to automatically generate high-quality docstrings across multiple programming languages.

🎯 What You'll Learn

  • Multi-task Fine-tuning: Train models to generate docstrings across multiple programming languages simultaneously
  • LLM Fine-tuning Techniques: Instruction fine-tuning and reinforcement learning (RL) fine-tuning with GRPO (Group Relative Policy Optimization)
  • Hands-on Experience: Work with different fine-tuning libraries (PEFT, TRL, Unsloth)
  • Cloud Infrastructure: Deploy scalable training with Modal
  • Performance Evaluation: Compare models using automated metrics and evaluation frameworks
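
To make the multi-task idea concrete, here is a minimal sketch of how code/docstring pairs from several languages might be turned into one instruction-tuning dataset. The prompt template, the `Sample` class, and the `build_example` and `build_dataset` helpers are all hypothetical names chosen for illustration; the course's actual data format may differ.

```python
from dataclasses import dataclass

# Hypothetical prompt template (an assumption, not the course's actual format).
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Write a docstring for the following {language} function.\n\n"
    "### Code:\n{code}\n\n"
    "### Docstring:\n"
)


@dataclass
class Sample:
    """One training pair: a function and its reference docstring."""
    language: str
    code: str
    docstring: str


def build_example(sample: Sample) -> dict:
    """Turn one (code, docstring) pair into a prompt/completion record."""
    prompt = PROMPT_TEMPLATE.format(
        language=sample.language, code=sample.code.strip()
    )
    return {"prompt": prompt, "completion": sample.docstring.strip()}


def build_dataset(samples: list) -> list:
    """Mix samples from every language into a single instruction dataset."""
    return [build_example(s) for s in samples]


# Illustrative samples only; real training data would come from a code corpus.
samples = [
    Sample("Python", "def add(a, b):\n    return a + b",
           "Return the sum of a and b."),
    Sample("Go", "func Add(a, b int) int { return a + b }",
           "Add returns the sum of a and b."),
]
dataset = build_dataset(samples)
print(dataset[0]["prompt"])
```

Naming the language in the instruction is one simple way to let a single model serve all languages at once; records in this prompt/completion shape can then be handed to whichever fine-tuning library is in use.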

🚀 Quick Start

# Clone and install
git clone https://github.com/fastbatchai/docstring-generation.git
cd docstring-generation
uv pip install -e .

# Setup Modal
modal setup

# Run training
modal run -i -m autoDoc.train --training-type sft --use-unsloth

📖 Course Lessons

📊 Results

Fine-tuning Performance: CodeGemma vs CodeGemma+LoRA

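| Language   | CodeGemma | CodeGemma+LoRA | Improvement |
|------------|-----------|----------------|-------------|
| Python     | 0.47      | 0.52           | +11%        |
| Java       | 0.57      | 0.55           | -4%         |
| JavaScript | 0.43      | 0.48           | +12%        |
| Go         | 0.49      | 0.54           | +10%        |
| PHP        | 0.42      | 0.63           | +50%        |
| Ruby       | 0.52      | 0.60           | +15%        |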

NOTE: These are preliminary results based on training with a small subset (1K samples for each programming language).
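
The Improvement column appears to be the relative change of the LoRA score over the base score, rounded to the nearest percent. The sketch below recomputes it from the two score columns under that assumption; the `scores` dictionary and the `relative_improvement` helper are illustrative names, not part of the repository.

```python
# Scores copied from the results table: (CodeGemma, CodeGemma+LoRA).
scores = {
    "Python": (0.47, 0.52),
    "Java": (0.57, 0.55),
    "JavaScript": (0.43, 0.48),
    "Go": (0.49, 0.54),
    "PHP": (0.42, 0.63),
    "Ruby": (0.52, 0.60),
}


def relative_improvement(base: float, tuned: float) -> float:
    """Relative change of the tuned score over the base score, in percent."""
    return (tuned - base) / base * 100


for language, (base, tuned) in scores.items():
    change = relative_improvement(base, tuned)
    print(f"{language:<12}{base:.2f} -> {tuned:.2f}  {change:+.0f}%")
```

Because the change is relative, PHP's +50% reflects a low starting score (0.42) as much as a large absolute gain.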

Instruction fine-tuning results

Model Comparison Across Different Base Models
LoRA Configuration Impact on Performance

More results are available in Lesson 5: Evaluation and Comparison

🤝 Community

📄 License

MIT License - see LICENSE file for details.


⭐ Star this repository if you found it helpful!
