Jackrong LLM Fine-Tuning Guide

An educational, end-to-end open-source knowledge base for LLM fine-tuning, dataset distillation, reinforcement learning, and local deployment.

🌐 Languages: English | 中文 | 한국어 | 日本語

🤗 Hugging Face: Jackrong

This repository is a growing educational resource for beginners and developers. It provides reproducible training pipelines, SFT and RL workflows (including GRPO and GSPO), data preparation and distillation recipes, 16-bit export and GGUF deployment workflows, and agent-ready Qwen MTP GGUF conversion tools.

🚀 Start Here

I want to...	Recommended entry
Fine-tune my first model in a browser	Open the training recipe catalog
Run the Qwopus3.6 27B GSPO tutorial	Open the GSPO Python tutorial
Prepare or distill training data	Browse data-processing recipes
Find curated reasoning, coding, and conversation datasets	Open the dataset catalog
Convert a Qwen model to MTP-enabled GGUF	Open the Qwen MTP GGUF Skill
Read full beginner guides and reports	Open the PDF guide library
Automate repeatable Codex workflows	Open the Codex Goal templates

🗺️ Repository Map

Resource	What you will find	Entry
🏋️ Training Recipes	SFT, GRPO, and GSPO notebooks and Python tutorials	Open
🧪 Data Processing	Distillation, preprocessing, filtering, and sampling workflows	Open
🧠 Dataset Catalog	Curated high-fidelity datasets and download helpers	Open
⚙️ Qwen MTP GGUF Skill	Agent-ready MTP extraction, injection, conversion, validation, quantization, and upload pipeline	Open
📘 Guides and Reports	Long-form PDF tutorials and technical reports	Open
🌐 Multilingual Docs	Chinese, Korean, and Japanese landing pages plus documentation indexes	Open
🤖 Codex Goal Templates	Editable goal templates for RL training, MTP GGUF conversion, and repository maintenance	Open

🏋️ Training Recipes

Model	Method	Environment
Qwopus3.5 27B	SFT	Google Colab
Qwopus3.6 27B	GSPO	Python script
Qwen3.5 9B	SFT	Kaggle
Qwopus3.5 35B	SFT	Kaggle
Llama3.2-R1 3B	GRPO	Kaggle

Browse the full catalog in train_code/README.md.

✅ Supported Workflows

Workflow	Status	Documentation
SFT with LoRA / QLoRA	✅ Released	Training recipes
GRPO reinforcement learning	✅ Released	Training recipes
GSPO reinforcement learning	✅ Released	Qwopus3.6 27B GSPO tutorial
Dataset distillation and preprocessing	✅ Released	Data-processing recipes
LoRA adapter save and merged 16-bit export	✅ Released	Training recipes
GGUF quantization	✅ Released	Training recipes
Qwen MTP GGUF conversion	✅ Released	MTP conversion skill
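
To make the "merged 16-bit export" row more concrete, here is a minimal sketch of the arithmetic behind merging a LoRA adapter into a base weight, using the standard LoRA formulation W' = W + (alpha / r) · B · A. The function name `merge_lora` and the plain-list matrices are illustrative assumptions, not code from the training recipes, which may rely on a library's own merge utility and then cast the result to a 16-bit dtype.

```python
def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a) using plain lists.

    Shapes (hypothetical toy sizes):
      base:   d_out x d_in
      lora_b: d_out x rank
      lora_a: rank  x d_in
    """
    scale = alpha / rank
    d_out, d_in = len(base), len(base[0])
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for i in range(d_out):
        for j in range(d_in):
            # (B @ A)[i][j] is the low-rank update for this weight entry
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    lora_a = [[1.0, 2.0]]        # rank 1, d_in 2
    lora_b = [[0.5], [1.0]]      # d_out 2, rank 1
    print(merge_lora(base, lora_a, lora_b, alpha=2, rank=1))
```

Saving only the adapter keeps `lora_a` and `lora_b` separate, while a merged export stores the single combined matrix, which is why a merged checkpoint no longer needs the adapter at inference time.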

🛣️ Model Support Roadmap

Released RL recipes may use GRPO or GSPO depending on the model and training objective.

Model Family	SFT Support	RL Support
Qwen 3.5	✅ Released	Scheduled
Qwen 3.6	✅ Released	✅ Released
Qwen 3	Scheduled	Scheduled
Llama3.2-R1 3B	✅ Included	✅ Released
Llama 3.1 / 3.3	Scheduled	Scheduled
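
For readers new to the two RL methods named above, the sketch below contrasts their core quantities as they are commonly described: GRPO normalizes each sampled response's reward against the other responses in its group, and GSPO replaces per-token importance ratios with one length-normalized ratio per sequence. The function names, the use of the population standard deviation, and the `eps` value are assumptions made for illustration; the released recipes may differ in these details.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: standardize rewards within one prompt's group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


def token_level_ratios(logp_new, logp_old):
    """Per-token importance ratios pi_new / pi_old, one per generated token."""
    return [math.exp(n - o) for n, o in zip(logp_new, logp_old)]


def sequence_level_ratio(logp_new, logp_old):
    """GSPO-style ratio: geometric mean of the token ratios for one sequence,
    i.e. exp(mean of per-token log-probability differences)."""
    diffs = [n - o for n, o in zip(logp_new, logp_old)]
    return math.exp(sum(diffs) / len(diffs))


if __name__ == "__main__":
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
    print(token_level_ratios([-1.0, -0.5], [-1.2, -0.9]))
    print(sequence_level_ratio([-1.0, -0.5], [-1.2, -0.9]))
```

In both methods the resulting ratio is typically clipped before being multiplied by the advantage; the clipping range is a training hyperparameter and is left out here.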

⚙️ Qwen MTP GGUF Conversion Skill

The qwen-mtp-gguf subproject supports release workflows for Qwen-family MTP (multi-token prediction) / nextn GGUF models. The pipeline runs preflight checks (disk, RAM, tooling, token access, and compatibility), extracts compatible MTP tensors, injects them into the target model, converts with llama.cpp, smoke-tests the outputs, quantizes releases, and supports safer upload and resume workflows.

🚀 Open the MTP Skill · 📖 Read the Pipeline Guide · 🤖 Read the Agent Usage Guide
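
As a rough illustration of what a preflight stage like the one described above might check, here is a minimal standard-library sketch. It is not the skill's actual implementation: the function name `preflight`, the tool names, the threshold, and the `HF_TOKEN` variable name are all assumptions. The RAM and compatibility checks are omitted because the standard library has no portable way to express them.

```python
import os
import shutil


def preflight(workdir, min_free_gb, required_tools, token_env="HF_TOKEN"):
    """Return a list of human-readable problems; an empty list means 'go'."""
    problems = []

    # Disk: conversion and quantization write large intermediate files.
    free_gb = shutil.disk_usage(workdir).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(f"disk: {free_gb:.1f} GiB free, need {min_free_gb} GiB")

    # Tooling: every external executable must be resolvable on PATH.
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append(f"tooling: '{tool}' not found on PATH")

    # Token access: only checks that a token is present, not that it is valid.
    if not os.environ.get(token_env):
        problems.append(f"token: environment variable {token_env} is not set")

    return problems


if __name__ == "__main__":
    # Tool names below are placeholders for whatever the pipeline invokes.
    for line in preflight(".", min_free_gb=100, required_tools=["git", "python3"]):
        print("PRECHECK FAILED:", line)
```

Collecting every problem before stopping, rather than failing on the first one, lets a user or an agent fix the environment in one pass.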

📘 Guides and Reports

Long-form PDFs live in the guide and technical report library.

Guide	Topic	File
Qwopus3.5 27B Colab complete guide	Beginner-friendly end-to-end fine-tuning walkthrough	PDF
Qwopus GLM 18B technical report	Model design and training notes	PDF

🧠 High-Fidelity Dataset Catalog

The repository includes 24 curated high-fidelity datasets for reasoning, mathematics, coding, instruction following, conversation, and domain-specific distillation. Browse the full dataset catalog, or use download_datasets.py to batch download the suite for local training.
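
The batch-download idea can be sketched as a loop that fetches each dataset into its own folder and records failures instead of aborting. This is not the contents of download_datasets.py: the function names, the folder naming scheme, and the dataset IDs in the usage example are hypothetical, and the default fetcher assumes the `huggingface_hub` package is installed.

```python
from pathlib import Path


def default_fetch(repo_id, target_dir):
    """Download one dataset repo; imported lazily so the loop is testable offline."""
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id, repo_type="dataset", local_dir=str(target_dir)
    )


def batch_download(repo_ids, out_root, fetch=default_fetch):
    """Fetch every dataset in repo_ids under out_root and report per-item status."""
    out_root = Path(out_root)
    results = {}
    for repo_id in repo_ids:
        # "owner/name" becomes "owner__name" so each dataset gets a flat folder.
        target = out_root / repo_id.replace("/", "__")
        target.mkdir(parents=True, exist_ok=True)
        try:
            fetch(repo_id, target)
            results[repo_id] = "ok"
        except Exception as exc:  # keep going so one failure does not stop the batch
            results[repo_id] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    # Placeholder IDs; substitute entries from the dataset catalog.
    print(batch_download(["example-owner/example-dataset"], "datasets"))
```

Passing the fetcher as a parameter keeps the loop independent of the network, so it can be dry-run with a stub before committing to a large download.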

🤝 Open-Source Commitment

This project keeps the training source code and documentation for released fine-tuned models available so learners can reproduce, inspect, and adapt the workflows. The longer project philosophy and original message to builders are preserved in docs/PROJECT_PHILOSOPHY.md.

📚 Citation

If you find this repository helpful in your learning or research, please consider citing it:

@misc{jackrong-llm-finetuning,
  author = {Jackrong},
  title = {Jackrong LLM Fine-Tuning Guide: An Educational LLM Fine-Tuning Knowledge Base},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/R6410418/Jackrong-llm-finetuning-guide}}
}
