An educational, end-to-end open-source knowledge base for LLM fine-tuning, dataset distillation, reinforcement learning, and local deployment.
Languages: English | 中文 (Chinese) | 한국어 (Korean) | 日本語 (Japanese)
Hugging Face: Jackrong
This repository is a growing educational resource portal for beginners and developers who want reproducible training pipelines, SFT and RL workflows including GRPO and GSPO, data preparation and distillation recipes, 16-bit export and GGUF deployment workflows, and agent-ready Qwen MTP GGUF conversion tools.
- Start Here
- Repository Map
- Training Recipes
- Supported Workflows
- Model Support Roadmap
- Qwen MTP GGUF Conversion Skill
- Guides and Reports
- High-Fidelity Dataset Catalog
- Open-Source Commitment
- Citation
| I want to... | Recommended entry |
|---|---|
| Fine-tune my first model in a browser | Open the training recipe catalog |
| Run the Qwopus3.6 27B GSPO tutorial | Open the GSPO Python tutorial |
| Prepare or distill training data | Browse data-processing recipes |
| Find curated reasoning, coding, and conversation datasets | Open the dataset catalog |
| Convert a Qwen model to MTP-enabled GGUF | Open the Qwen MTP GGUF Skill |
| Read full beginner guides and reports | Open the PDF guide library |
| Automate repeatable Codex workflows | Open the Codex Goal templates |
| Resource | What you will find | Entry |
|---|---|---|
| Training Recipes | SFT, GRPO, and GSPO notebooks and Python tutorials | Open |
| Data Processing | Distillation, preprocessing, filtering, and sampling workflows | Open |
| Dataset Catalog | Curated high-fidelity datasets and download helpers | Open |
| Qwen MTP GGUF Skill | Agent-ready MTP extraction, injection, conversion, validation, quantization, and upload pipeline | Open |
| Guides and Reports | Long-form PDF tutorials and technical reports | Open |
| Multilingual Docs | Chinese, Korean, and Japanese landing pages plus documentation indexes | Open |
| Codex Goal Templates | Editable goal templates for RL training, MTP GGUF conversion, and repository maintenance | Open |
| Model | Method | Environment | Quick setup |
|---|---|---|---|
| Qwopus3.5 27B | SFT | Google Colab | |
| Qwopus3.6 27B | GSPO | Python script | |
| Qwen3.5 9B | SFT | Kaggle | |
| Qwopus3.5 35B | SFT | Kaggle | |
| Llama3.2-R1 3B | GRPO | Kaggle | |
Browse the full catalog in train_code/README.md.
| Workflow | Status | Documentation |
|---|---|---|
| SFT with LoRA / QLoRA | Released | Training recipes |
| GRPO reinforcement learning | Released | Training recipes |
| GSPO reinforcement learning | Released | Qwopus3.6 27B GSPO tutorial |
| Dataset distillation and preprocessing | Released | Data-processing recipes |
| LoRA adapter save and merged 16-bit export | Released | Training recipes |
| GGUF quantization | Released | Training recipes |
| Qwen MTP GGUF conversion | Released | MTP conversion skill |
Released RL recipes may use GRPO or GSPO depending on the model and training objective.
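For readers new to the two methods, here is a toy sketch of where they differ. Both score a group of sampled responses against each other; GRPO weights each token with its own importance ratio, while GSPO uses one length-normalized ratio per sequence. The function names and numbers are hypothetical, and this is not the code used in the recipes here. Real implementations also add clipping and may use a different standard-deviation convention.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


def grpo_token_ratios(new_logps, old_logps):
    """GRPO-style ratios: one importance ratio per generated token."""
    return [math.exp(n - o) for n, o in zip(new_logps, old_logps)]


def gspo_sequence_ratio(new_logps, old_logps):
    """GSPO-style ratio: a single length-normalized ratio for the sequence."""
    length = len(new_logps)
    return math.exp((sum(new_logps) - sum(old_logps)) / length)


if __name__ == "__main__":
    # Four sampled responses to one prompt, scored 1 (correct) or 0 (wrong).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))

    # Per-token log-probabilities under the new and old policies.
    new_logps, old_logps = [-0.5, -1.5], [-1.0, -1.0]
    print(grpo_token_ratios(new_logps, old_logps))
    print(gspo_sequence_ratio(new_logps, old_logps))
```

In this example the two token-level ratios move in opposite directions while the sequence-level ratio stays at 1, which illustrates why the sequence-level form tends to be less sensitive to per-token noise.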
| Model Family | SFT Support | RL Support |
|---|---|---|
| Qwen 3.5 | Released | Scheduled |
| Qwen 3.6 | Released | Released |
| Qwen 3 | Scheduled | Scheduled |
| Llama3.2-R1 3B | Included | Released |
| Llama 3.1 / 3.3 | Scheduled | Scheduled |
The qwen-mtp-gguf subproject supports Qwen-family MTP / nextn GGUF release workflows. It performs disk, RAM, tooling, token-access, and compatibility preflight checks, extracts compatible MTP tensors, injects them into the target model, converts with llama.cpp, smoke-tests outputs, quantizes releases, and supports safer upload/resume workflows.
Open the MTP Skill · Read the Pipeline Guide · Read the Agent Usage Guide
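To give a feel for what a preflight step can look like, the sketch below checks free disk space, required command-line tools, and the presence of a Hugging Face token using only the Python standard library. It is an illustration, not the subproject's actual implementation: `PreflightReport`, `run_preflight`, the 50 GB threshold, and the default tool list are all hypothetical, and RAM and compatibility checks are left out.

```python
import os
import shutil
from dataclasses import dataclass, field


@dataclass
class PreflightReport:
    free_disk_gb: float
    min_free_gb: float
    missing_tools: list = field(default_factory=list)
    has_token: bool = False

    @property
    def ok(self):
        """True only when every individual check passed."""
        enough_disk = self.free_disk_gb >= self.min_free_gb
        return enough_disk and not self.missing_tools and self.has_token


def run_preflight(workdir=".", min_free_gb=50.0,
                  tools=("llama-quantize",), token_env="HF_TOKEN"):
    """Collect basic environment checks before a long conversion job."""
    # Free space on the filesystem that holds the working directory.
    free_gb = shutil.disk_usage(workdir).free / 1e9
    # Tools that cannot be found on PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # Only checks that a token is set, not that it is valid.
    has_token = bool(os.environ.get(token_env))
    return PreflightReport(free_gb, min_free_gb, missing, has_token)


if __name__ == "__main__":
    report = run_preflight()
    print(f"free disk: {report.free_disk_gb:.1f} GB")
    print(f"missing tools: {report.missing_tools or 'none'}")
    print(f"token present: {report.has_token}")
    print("ready" if report.ok else "not ready")
```

Failing fast on these checks is cheaper than discovering a full disk or a missing binary partway through a multi-hour conversion.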
Long-form PDFs live in the guide and technical report library.
| Guide | Topic | File |
|---|---|---|
| Qwopus3.5 27B Colab complete guide | Beginner-friendly end-to-end fine-tuning walkthrough | |
| Qwopus GLM 18B technical report | Model design and training notes | |
The repository includes 24 curated high-fidelity datasets for reasoning, mathematics, coding, instruction following, conversation, and domain-specific distillation. Browse the full dataset catalog, or use download_datasets.py to batch download the suite for local training.
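If you want to see roughly how a batch download can be structured, the sketch below loops over a list of dataset IDs and fetches each one with `snapshot_download` from `huggingface_hub`. It does not reproduce `download_datasets.py`, whose options are not described here: the dataset IDs are placeholders, and `download_all` and `default_fetch` are hypothetical names.

```python
from pathlib import Path

# Placeholder IDs for illustration; take the real ones from the dataset catalog.
DATASET_IDS = ["your-namespace/dataset-a", "your-namespace/dataset-b"]


def default_fetch(repo_id, local_dir):
    """Download one dataset repository from the Hugging Face Hub."""
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, repo_type="dataset",
                             local_dir=local_dir)


def download_all(dataset_ids, root="datasets", fetch=default_fetch):
    """Fetch every dataset, recording a path on success or an error message."""
    results = {}
    for repo_id in dataset_ids:
        # One folder per dataset, e.g. datasets/your-namespace__dataset-a
        target = Path(root) / repo_id.replace("/", "__")
        target.mkdir(parents=True, exist_ok=True)
        try:
            results[repo_id] = str(fetch(repo_id, str(target)))
        except Exception as exc:  # keep going so one failure does not stop the batch
            results[repo_id] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    for repo_id, outcome in download_all(DATASET_IDS).items():
        print(f"{repo_id}: {outcome}")
```

Passing the fetch function as a parameter keeps the loop easy to test offline and lets you swap in a different downloader later.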
This project keeps the training source code and documentation for released fine-tuned models available so learners can reproduce, inspect, and adapt the workflows. The longer project philosophy and original message to builders are preserved in docs/PROJECT_PHILOSOPHY.md.
If you find this repository helpful in your learning or research, please consider citing it:
@misc{jackrong-llm-finetuning,
author = {Jackrong},
title = {Jackrong LLM Fine-Tuning Guide: An Educational LLM Fine-Tuning Knowledge Base},
year = {2026},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/R6410418/Jackrong-llm-finetuning-guide}}
}