Jackrong LLM Fine-Tuning Guide

An educational, end-to-end open-source knowledge base for LLM fine-tuning, dataset distillation, reinforcement learning, and local deployment.

🌐 Languages: English | δΈ­ζ–‡ | ν•œκ΅­μ–΄ | ζ—₯本θͺž

πŸ€— Hugging Face: Jackrong


Unsloth · Google Colab · PyTorch · Hugging Face · LoRA · PEFT · Beginner Friendly


This repository is a growing educational resource portal for beginners and developers. It covers reproducible training pipelines, SFT and RL workflows (including GRPO and GSPO), data preparation and distillation recipes, 16-bit export and GGUF deployment workflows, and agent-ready Qwen MTP GGUF conversion tools.

πŸ“š Table of Contents

πŸš€ Start Here

I want to... Recommended entry
Fine-tune my first model in a browser Open the training recipe catalog
Run the Qwopus3.6 27B GSPO tutorial Open the GSPO Python tutorial
Prepare or distill training data Browse data-processing recipes
Find curated reasoning, coding, and conversation datasets Open the dataset catalog
Convert a Qwen model to MTP-enabled GGUF Open the Qwen MTP GGUF Skill
Read full beginner guides and reports Open the PDF guide library
Automate repeatable Codex workflows Open the Codex Goal templates

πŸ—ΊοΈ Repository Map

Resource What you will find Entry
πŸ‹οΈ Training Recipes SFT, GRPO, and GSPO notebooks and Python tutorials Open
πŸ§ͺ Data Processing Distillation, preprocessing, filtering, and sampling workflows Open
🧠 Dataset Catalog Curated high-fidelity datasets and download helpers Open
βš™οΈ Qwen MTP GGUF Skill Agent-ready MTP extraction, injection, conversion, validation, quantization, and upload pipeline Open
πŸ“˜ Guides and Reports Long-form PDF tutorials and technical reports Open
🌐 Multilingual Docs Chinese, Korean, and Japanese landing pages plus documentation indexes Open
πŸ€– Codex Goal Templates Editable goal templates for RL training, MTP GGUF conversion, and repository maintenance Open

πŸ‹οΈ Training Recipes

Model Method Environment Quick setup
Qwopus3.5 27B SFT Google Colab Open In Colab
Qwopus3.6 27B GSPO Python script Python Code
Qwen3.5 9B SFT Kaggle Open In Kaggle
Qwopus3.5 35B SFT Kaggle Open In Kaggle
Llama3.2-R1 3B GRPO Kaggle Open In Kaggle

Browse the full catalog in train_code/README.md.

βœ… Supported Workflows

Workflow Status Documentation
SFT with LoRA / QLoRA βœ… Released Training recipes
GRPO reinforcement learning βœ… Released Training recipes
GSPO reinforcement learning βœ… Released Qwopus3.6 27B GSPO tutorial
Dataset distillation and preprocessing βœ… Released Data-processing recipes
LoRA adapter save and merged 16-bit export βœ… Released Training recipes
GGUF quantization βœ… Released Training recipes
Qwen MTP GGUF conversion βœ… Released MTP conversion skill
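
To give a feel for why the LoRA / QLoRA workflows listed above fit on Colab- and Kaggle-class hardware, here is a minimal arithmetic sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix `W` of shape `d_out x d_in` receives a trainable low-rank update `(alpha / r) * B @ A`. The function name and the 4096 x 4096 layer size are illustrative assumptions, not values taken from this repository's recipes.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for a single weight matrix.

    Standard LoRA keeps W frozen and trains two small matrices:
    B with shape (d_out, rank) and A with shape (rank, d_in).
    The merged weight used at export time is W + (alpha / rank) * B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer size, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, rank=16)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank 16):   {lora:,} trainable parameters")
print(f"fraction trained: {lora / full:.6f}")
```

For this hypothetical layer the adapter trains 1/128 of the parameters that full fine-tuning would. The actual savings in a recipe depend on which modules are adapted and on the chosen rank.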

πŸ›£οΈ Model Support Roadmap

Released RL recipes may use GRPO or GSPO depending on the model and training objective.

| Model Family | SFT Support | RL Support |
| --- | --- | --- |
| Qwen 3.5 | ✅ Released | Scheduled |
| Qwen 3.6 | ✅ Released | ✅ Released |
| Qwen 3 | Scheduled | Scheduled |
| Llama3.2-R1 3B | ✅ Included | ✅ Released |
| Llama 3.1 / 3.3 | Scheduled | Scheduled |
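
For readers choosing between the GRPO and GSPO recipes, the sketch below illustrates the main difference as the two methods are commonly described: both normalize rewards within a group of sampled completions, but GRPO applies an importance ratio per token, while GSPO uses one length-normalized ratio per sequence. This is a simplified pure-Python illustration under those assumptions. The function names, the population standard deviation, and the clip value are illustrative choices and may not match what the repository's training scripts do.

```python
import math
import statistics


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: standardize rewards within one prompt's group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std plus epsilon
    return [(r - mean) / (std + eps) for r in rewards]


def token_ratios(new_logps, old_logps):
    """GRPO-style: one importance ratio per generated token."""
    return [math.exp(n - o) for n, o in zip(new_logps, old_logps)]


def sequence_ratio(new_logps, old_logps):
    """GSPO-style: one ratio per sequence, the geometric mean of token ratios."""
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))


def clipped_objective(ratio, advantage, clip=0.2):
    """PPO-style clipped surrogate for a single ratio (clip value is illustrative)."""
    clipped = max(1.0 - clip, min(ratio, 1.0 + clip))
    return min(ratio * advantage, clipped * advantage)


# Toy group of four completions scored 1 (correct) or 0 (incorrect).
advantages = group_advantages([1.0, 0.0, 1.0, 0.0])

# Toy per-token log-probabilities under the new and old policies.
new_lp, old_lp = [-1.0, -2.0], [-1.5, -2.5]
per_token = token_ratios(new_lp, old_lp)
per_sequence = sequence_ratio(new_lp, old_lp)
```

In the GRPO-style variant the clipped objective would be evaluated once per token with that token's ratio. In the GSPO-style variant it would be evaluated once per completion with `per_sequence`.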

βš™οΈ Qwen MTP GGUF Conversion Skill

The qwen-mtp-gguf subproject supports Qwen-family MTP / nextn GGUF release workflows. It performs disk, RAM, tooling, token-access, and compatibility preflight checks, extracts compatible MTP tensors, injects them into the target model, converts with llama.cpp, smoke-tests outputs, quantizes releases, and supports safer upload/resume workflows.
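
The preflight stage described above can be pictured with a small standard-library sketch. It is not the subproject's actual implementation: the function names, the thresholds, the `HF_TOKEN` environment variable, and the tool names passed by a caller are all assumptions made for illustration.

```python
import os
import shutil
from dataclasses import dataclass


@dataclass
class PreflightResult:
    name: str
    ok: bool
    detail: str


def check_disk(path: str, required_gb: float) -> PreflightResult:
    """Verify that the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    detail = f"{free_gb:.1f} GiB free, {required_gb} GiB required"
    return PreflightResult("disk", free_gb >= required_gb, detail)


def check_ram(required_gb: float) -> PreflightResult:
    """Verify total physical memory (POSIX only; fails closed elsewhere)."""
    try:
        total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return PreflightResult("ram", False, "could not determine physical memory")
    total_gb = total / 1024**3
    detail = f"{total_gb:.1f} GiB total, {required_gb} GiB required"
    return PreflightResult("ram", total_gb >= required_gb, detail)


def check_tools(tools) -> PreflightResult:
    """Verify that every required executable is on PATH."""
    missing = [t for t in tools if shutil.which(t) is None]
    detail = "all found" if not missing else "missing: " + ", ".join(missing)
    return PreflightResult("tooling", not missing, detail)


def check_token(env_var: str = "HF_TOKEN") -> PreflightResult:
    """Verify that an access token is present (assumed env var name)."""
    present = bool(os.environ.get(env_var))
    return PreflightResult("token", present, f"{env_var} {'set' if present else 'not set'}")


def run_preflight(workdir, disk_gb, ram_gb, tools):
    """Run all checks and return their results; callers decide whether to abort."""
    return [
        check_disk(workdir, disk_gb),
        check_ram(ram_gb),
        check_tools(tools),
        check_token(),
    ]
```

A caller might invoke `run_preflight(".", disk_gb=200, ram_gb=64, tools=["git"])` with its own thresholds and tool list, then stop before any download or conversion if a result has `ok` set to `False`. Running every check before reporting, instead of stopping at the first failure, lets a user fix all problems in one pass.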

πŸš€ Open the MTP Skill Β· πŸ“– Read the Pipeline Guide Β· πŸ€– Read the Agent Usage Guide

πŸ“˜ Guides and Reports

Long-form PDFs live in the guide and technical report library.

| Guide | Topic | File |
| --- | --- | --- |
| Qwopus3.5 27B Colab complete guide | Beginner-friendly end-to-end fine-tuning walkthrough | PDF |
| Qwopus GLM 18B technical report | Model design and training notes | PDF |

🧠 High-Fidelity Dataset Catalog

The repository includes 24 curated high-fidelity datasets for reasoning, mathematics, coding, instruction following, conversation, and domain-specific distillation. Browse the full dataset catalog, or use download_datasets.py to batch download the suite for local training.
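
As a rough picture of what batch downloading a dataset suite can look like, here is a minimal sketch built on `snapshot_download` from the `huggingface_hub` library. It is not the code in `download_datasets.py`, whose interface is not shown here. The function names, the directory layout, and the example dataset IDs are hypothetical.

```python
from pathlib import Path


def hf_download(repo_id: str, target: Path) -> None:
    """Download one dataset repository from the Hugging Face Hub."""
    # Imported lazily so the rest of the module works without the dependency.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=str(target))


def batch_download(repo_ids, out_dir, download=hf_download):
    """Download each dataset into its own folder and report a status per ID.

    Non-empty folders are skipped so an interrupted run can be restarted,
    and one failing dataset does not stop the rest of the batch.
    """
    out = Path(out_dir)
    results = {}
    for repo_id in repo_ids:
        target = out / repo_id.replace("/", "__")
        if target.exists() and any(target.iterdir()):
            results[repo_id] = "skipped"
            continue
        target.mkdir(parents=True, exist_ok=True)
        try:
            download(repo_id, target)
            results[repo_id] = "ok"
        except Exception as exc:  # keep going; surface the error in the report
            results[repo_id] = f"failed: {exc}"
    return results


# Example call with hypothetical dataset IDs (requires network access):
# print(batch_download(["some-org/dataset-a", "some-org/dataset-b"], "datasets"))
```

Passing the downloader as a parameter keeps the loop easy to test offline and easy to swap for a mirror or a different client.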

🀝 Open-Source Commitment

This project keeps the training source code and documentation for released fine-tuned models available so learners can reproduce, inspect, and adapt the workflows. The longer project philosophy and original message to builders are preserved in docs/PROJECT_PHILOSOPHY.md.

πŸ“š Citation

If you find this repository helpful in your learning or research, please consider citing it:

@misc{jackrong-llm-finetuning,
  author = {Jackrong},
  title = {Jackrong LLM Fine-Tuning Guide: An Educational LLM Fine-Tuning Knowledge Base},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/R6410418/Jackrong-llm-finetuning-guide}}
}