# 🧠 LLM-Trainer: FunctionGemma Router Pipeline

Welcome to the LLM-Trainer. This project provides a "Full Stack" workflow to take Google's tiny-but-mighty FunctionGemma-270M-it and turn it into a specialized "Router" for your local applications.

Most small models struggle with tool calling, but by utilizing official Google formatting, specifically the `<start_function_call>` and `<start_function_declaration>` tokens, we've created a pipeline that ensures high accuracy even at the 270M parameter scale.

## ✨ Features

- **Synthetic Dataset Generation:** Creates 100% compliant FunctionGemma training data.
- **Unsloth Integration:** 2x faster training and 70% less memory usage on Google Colab.
- **GGUF Export:** Directly outputs quantized `.gguf` files for use in llama.cpp or local Python scripts.
- **Strict Routing:** Optimized to distinguish between tool calls and general conversation (`NONE`).

## 📂 Project Structure

| File | Role | Environment |
|---|---|---|
| `generate_training_data.py` | Generates the synthetic JSONL dataset. | Local |
| `train_LLM.py` | Fine-tunes the model using LoRA. | Google Colab / GPU |
| `main.py` | Local inference script to test your `.gguf`. | Local |
| `data/` | Directory containing your training samples. | - |

## 🛠️ How to Run It

Follow this three-step pipeline to get your router up and running.

### Step 1: Generate the Data (Local)

Before training, you need a dataset. This script generates 500+ examples of tool calls and casual chat variations.

```bash
python generate_training_data.py
```

This will create a file at `data/training_data.jsonl`.

### Step 2: Train the Model (Google Colab)

1. Open a new notebook on Google Colab.
2. Set your Runtime to T4 GPU.
3. Upload the `train_LLM.py` code and your `training_data.jsonl` file.
4. Run the cells. The script will:
   - Download FunctionGemma-270M.
   - Apply LoRA adapters.
   - Train for 450 steps.
   - Export the model as `function_gemma_router-Q8_0.gguf`.
5. Download the `.gguf` file to your local project folder.

### Step 3: Deploy & Chat (Local)

Ensure you have the requirements installed:

```bash
pip install llama-cpp-python
```

Then, run the interactive chat script:

```bash
python main.py
```

## 🧩 The Magic Syntax

This project works because it strictly follows the Gemma Function Calling specification. If you are modifying the tools, ensure the developer instruction stays consistent:

```plaintext
<start_of_turn>developer
You are a model that can do function calling...
<start_function_declaration>...<end_function_declaration>
<end_of_turn>
```

## ⚖️ Requirements

- **Training:** NVIDIA GPU (8GB+ VRAM recommended).
- **Inference:** Any modern CPU (FunctionGemma-270M is incredibly light).
- **Libraries:** `unsloth`, `trl`, `transformers`, `llama-cpp-python`.

## 📜 License

MIT - Feel free to use this for your own local AI projects!
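
To make the developer-turn format from "The Magic Syntax" more concrete, here is a minimal sketch of how a prompt could be assembled and how a router reply could be sorted into "tool call" versus general conversation (`NONE`). It is not taken from `main.py`. The helper names `build_developer_turn` and `classify_output` are hypothetical, the `<end_function_call>` closing token and the exact whitespace between parts are assumptions, and the declaration bodies are passed in as opaque strings because this README elides their contents.

```python
from __future__ import annotations

# Tokens named in this README.
START_DECL = "<start_function_declaration>"
END_DECL = "<end_function_declaration>"
START_CALL = "<start_function_call>"
# Assumption: a matching closing token exists; it is not shown in this README.
END_CALL = "<end_function_call>"


def build_developer_turn(instruction: str, declarations: list[str]) -> str:
    """Wrap already-formatted declaration bodies in a developer turn.

    The body of each declaration is left untouched: this sketch does not
    know (or guess) the declaration grammar, it only adds the delimiters.
    """
    wrapped = "".join(f"{START_DECL}{body}{END_DECL}" for body in declarations)
    # Assumption: newline after the role name and after <end_of_turn>.
    return f"<start_of_turn>developer\n{instruction}{wrapped}<end_of_turn>\n"


def classify_output(text: str) -> tuple[str, str]:
    """Return ("tool", payload) if the reply contains a function call,
    otherwise ("none", reply) for general conversation."""
    start = text.find(START_CALL)
    if start == -1:
        return ("none", text.strip())
    payload = text[start + len(START_CALL):]
    end = payload.find(END_CALL)
    if end != -1:
        # Drop the closing token and anything after it.
        payload = payload[:end]
    return ("tool", payload.strip())


if __name__ == "__main__":
    prompt = build_developer_turn(
        "You are a model that can do function calling...",
        ["..."],  # placeholder body, kept elided as in the README
    )
    print(prompt)
    print(classify_output("Hi! How can I help?"))
```

Keeping the delimiters in one place like this makes it easier to keep the training data and the inference prompt consistent when you add or rename tools.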