# Fine-tuning FunctionGemma with AutoTrain

This notebook demonstrates how to fine-tune FunctionGemma with **AutoTrain Advanced**, a no-code training tool from Hugging Face.

**Objectives:**
- Train the model to call `set_square_color` when the user wants to change the square's color
- Train the model to call `get_square_color` when the user asks about the square's current color

**Reference:** [AutoTrain Advanced](https://github.com/huggingface/autotrain-advanced)


In [None]:
%pip install -q autotrain-advanced torch torchvision torchaudio


In [None]:
from huggingface_hub import login
login()


In [None]:
import json

def format_function_call_output(tool_name: str, tool_arguments: dict) -> str:
    """Serialize a tool call into the function-call string used as the training target.

    String values are wrapped in <escape> markers; other values are written as-is.
    """
    # A call without arguments still needs an empty argument block.
    if not tool_arguments:
        return f"<start_function_call>call:{tool_name}{{}}<end_function_call>"

    args_parts = []
    for key, value in tool_arguments.items():
        if isinstance(value, str):
            args_parts.append(f"{key}:<escape>{value}<escape>")
        else:
            args_parts.append(f"{key}:{value}")

    args_str = ",".join(args_parts)
    return f"<start_function_call>call:{tool_name}{{{args_str}}}<end_function_call>"

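# Load the raw samples. Each one provides user_content, tool_name,
# and tool_arguments (a JSON-encoded string).
with open("dataset/square_color_dataset.json", "r", encoding="utf-8") as f:
    raw_data = json.load(f)

# Convert every sample into a two-turn chat: the user request,
# followed by the function call the model should produce.
autotrain_data = []
for sample in raw_data:
    tool_args = json.loads(sample["tool_arguments"])
    assistant_response = format_function_call_output(sample["tool_name"], tool_args)

    autotrain_data.append({
        "messages": [
            {"role": "user", "content": sample["user_content"]},
            {"role": "assistant", "content": assistant_response}
        ]
    })

# Write one JSON object per line (JSONL) for the training split.
with open("dataset/train.jsonl", "w", encoding="utf-8") as f:
    for item in autotrain_data:
        f.write(json.dumps(item) + "\n")

print(f"Converted {len(autotrain_data)} examples to dataset/train.jsonl")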
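
To make the target format concrete, here is a minimal, self-contained sketch of what converted records could look like. The sample values and the `color` argument name are assumptions for illustration only; the real dataset's contents are not shown in this notebook. `format_call` is a compact restatement of `format_function_call_output` so the snippet runs on its own.

```python
import json

def format_call(tool_name: str, tool_arguments: dict) -> str:
    # Compact restatement of format_function_call_output (same output format).
    parts = [
        f"{k}:<escape>{v}<escape>" if isinstance(v, str) else f"{k}:{v}"
        for k, v in tool_arguments.items()
    ]
    return f"<start_function_call>call:{tool_name}{{{','.join(parts)}}}<end_function_call>"

# Hypothetical raw samples: the field names match those read by the conversion
# loop, but the values and the "color" argument name are assumptions.
samples = [
    {"user_content": "Make the square red", "tool_name": "set_square_color",
     "tool_arguments": '{"color": "red"}'},
    {"user_content": "What color is the square?", "tool_name": "get_square_color",
     "tool_arguments": "{}"},
]

# Build the same "messages" structure that is written to train.jsonl.
records = [
    {"messages": [
        {"role": "user", "content": s["user_content"]},
        {"role": "assistant",
         "content": format_call(s["tool_name"], json.loads(s["tool_arguments"]))},
    ]}
    for s in samples
]

for r in records:
    print(json.dumps(r))
```

Each printed line corresponds to one line of `dataset/train.jsonl`: a user turn followed by an assistant turn whose content is the serialized function call.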


In [None]:
config = """
task: llm-sft
base_model: google/functiongemma-270m-it
project_name: functiongemma-square-color-autotrain
log: tensorboard
backend: local

data:
  path: dataset/
  train_split: train
  valid_split: null
  chat_template: tokenizer
  column_mapping:
    text_column: messages

params:
  block_size: 512
  model_max_length: 512
  epochs: 8
  batch_size: 4
  lr: 5e-5
  peft: false
  quantization: null
  padding: right
  optimizer: adamw_torch
  scheduler: constant
  gradient_accumulation: 1
  mixed_precision: bf16
  merge_adapter: false

hub:
  username: [HF USERNAME]
  token: [HF TOKEN]
  push_to_hub: true
"""

with open("autotrain_config.yaml", "w") as f:
    f.write(config.strip())

print("Created autotrain_config.yaml")
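
The `hub` section above still contains the bracketed placeholders `[HF USERNAME]` and `[HF TOKEN]`. If they are left unreplaced, YAML reads a bracketed value as a one-element list rather than a string, so the run is unlikely to push to the Hub as intended. The following is a minimal sketch of a pre-flight check; the helper name `find_placeholders` and the regular expression are hypothetical and not part of AutoTrain.

```python
import re

# Matches bracketed all-caps placeholders such as "[HF USERNAME]".
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z _]*\]")

def find_placeholders(config_text: str) -> list[str]:
    """Return the bracketed placeholders still present in the config text."""
    return PLACEHOLDER.findall(config_text)

# A fragment mirroring the hub section of the config above.
sample_config = """
hub:
  username: [HF USERNAME]
  token: [HF TOKEN]
  push_to_hub: true
"""

leftover = find_placeholders(sample_config)
if leftover:
    print("Replace these placeholders before training:", leftover)
```

In the notebook, the same check can be run on the `config` string before it is written to `autotrain_config.yaml`.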


In [None]:
!autotrain --config autotrain_config.yaml
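
Once training finishes, one way to spot-check generations is to parse them back into a tool name and arguments. The sketch below simply inverts the string format produced by `format_function_call_output` in this notebook; it is not an official FunctionGemma parser, and `parse_function_call` is a hypothetical helper. Values that were not wrapped in `<escape>` markers are returned as raw strings, without type conversion.

```python
import re

# Pattern for the call string produced by format_function_call_output.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.S
)
# One argument: either an <escape>-wrapped string or a bare value up to the next comma.
ARG_RE = re.compile(r"(\w+):(?:<escape>(.*?)<escape>|([^,]*))", re.S)

def parse_function_call(text: str):
    """Return (tool_name, arguments) for the first call found, or None."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, body = match.group(1), match.group(2)
    args = {}
    for arg in ARG_RE.finditer(body):
        key, quoted, bare = arg.group(1), arg.group(2), arg.group(3)
        # Escaped values are strings; bare values are kept as raw text.
        args[key] = quoted if quoted is not None else bare
    return name, args

example = "<start_function_call>call:set_square_color{color:<escape>red<escape>}<end_function_call>"
print(parse_function_call(example))
```

Comparing the parsed name and arguments against the expected ones from the dataset gives a simple exact-match check for each prompt.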
