##### Copyright 2024 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Finetuning Gemma Using LitGPT

[Gemma](https://ai.google.dev/gemma) is a family of lightweight, state-of-the-art open-source language models from Google. Built from the same research and technology used to create the Gemini models, Gemma models are text-to-text, decoder-only large language models (LLMs), available in English, with open weights, pre-trained variants, and instruction-tuned variants.
Gemma models are well-suited for various text-generation tasks, including question-answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

[LitGPT](https://github.com/Lightning-AI/litgpt) is a framework for working with large language models (LLMs). Beyond running LLMs, it provides a toolkit for the entire LLM lifecycle: pre-training new models, fine-tuning existing ones for specific tasks, evaluating their performance, and deploying them for real-world use.

This notebook guides you through fine-tuning, prompting, and deploying Gemma 2 using LitGPT on Google Colab. You'll also upload your fine-tuned model to the Hugging Face Hub.

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Gemma/[Gemma_2]Finetune_with_LitGPT.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

## Setup

### Select the Colab runtime
To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run and fine-tune the Gemma model. In this case, you can use a T4 GPU with High RAM:

1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.
2. Select **Change runtime type**.
3. Under **Hardware accelerator**, select **T4 GPU**. Toggle the High RAM option on.

### Set up Hugging Face

**Before you dive into the tutorial, let's get you set up with Hugging Face:**

#### Hugging Face setup

1. **Hugging Face Account:**  If you don't already have one, you can create a free Hugging Face account by clicking [here](https://huggingface.co/join).

2. **Hugging Face Token:**  Generate a Hugging Face access token (with `write` permission) by clicking [here](https://huggingface.co/settings/tokens). You'll need this token later in the tutorial.

**Once you've completed these steps, you're ready to move on to the next section where you'll set up environment variables in your Colab environment.**

### Configure your HF token

Add your Hugging Face token to the Colab Secrets manager to securely store it.

1. Open your Google Colab notebook and click on the 🔑 Secrets tab in the left panel. <img src="https://storage.googleapis.com/generativeai-downloads/images/secrets.jpg" alt="The Secrets tab is found on the left panel." width=50%>
2. Create a new secret with the name `HF_TOKEN`.
3. Copy/paste your HF token key into the Value input box of `HF_TOKEN`.
4. Toggle the button on the left to allow notebook access to the secret.

In [1]:
import os
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.
os.environ["HF_TOKEN"] = userdata.get("HF_TOKEN")
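
Outside Colab, `google.colab` cannot be imported. The following is a minimal sketch of a fallback, assuming the token has already been exported as an `HF_TOKEN` environment variable. The helper name `resolve_hf_token` is hypothetical and not part of Colab or LitGPT.

```python
import os


def resolve_hf_token(env=os.environ):
    """Return the HF token from Colab Secrets, else from the environment."""
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        # Not on Colab: fall back to a variable exported in the shell.
        return env.get("HF_TOKEN")
    return userdata.get("HF_TOKEN")


token = resolve_hf_token()
print("HF token found" if token else "HF token not set")
```

The function returns `None` when no token is available, so you can fail early with a clear message instead of hitting an authentication error during the download step.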

### Install dependencies

First, install the Python package for LitGPT.

In [2]:
!pip install "litgpt[all]==0.5.3"

Collecting litgpt==0.5.3 (from litgpt[all]==0.5.3)
  Downloading litgpt-0.5.3-py3-none-any.whl.metadata (41 kB)
Collecting torch<=2.4.1,>=2.2.0 (from litgpt==0.5.3->litgpt[all]==0.5.3)
  Downloading torch-2.4.1-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting lightning==2.4.0 (from litgpt==0.5.3->litgpt[all]==0.5.3)
  Downloading lightning-2.4.0-py3-none-any.whl.metadata (38 kB)
Collecting jsonargparse<=4.32.1,>=4.30.1 (from jsonargparse[signatures]<=4.32.1,>=4.30.1->litgpt==0.5.3->litgpt[all]==0.5.3)
  Downloading jsonargparse-4.32.1-py3-none-any.whl.metadata (12 kB)
Collecting bitsandbytes==0.42.0 (from litgpt[all]==0.5.3)
  Downloading bitsandbytes-0.42.0-py3-none-any.whl.metadata (9.9 kB)
Collecting litdata==0.2.17 (from litgpt[all]=

Installing `litgpt` downgrades the pre-installed PyTorch version to 2.4.1, causing compatibility issues with `torchvision` and `torchaudio`. To avoid errors when pushing the fine-tuned model to the Hugging Face Hub, you must install versions of `torchvision` and `torchaudio` compatible with PyTorch 2.4.1.


In [19]:
!pip uninstall -y torchvision torchaudio
!pip install 'torchaudio==2.4.1' 'torchvision==0.19.1' --index-url https://download.pytorch.org/whl/cu121

Found existing installation: torchvision 0.20.1+cu121
Uninstalling torchvision-0.20.1+cu121:
  Successfully uninstalled torchvision-0.20.1+cu121
Found existing installation: torchaudio 2.5.1+cu121
Uninstalling torchaudio-2.5.1+cu121:
  Successfully uninstalled torchaudio-2.5.1+cu121
Looking in indexes: https://download.pytorch.org/whl/cu121
Collecting torchaudio==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torchaudio-2.4.1%2Bcu121-cp310-cp310-linux_x86_64.whl (3.4 MB)
Collecting torchvision==0.19.1
  Downloading https://download.pytorch.org/whl/cu121/torchvision-0.19.1%2Bcu121-cp310-cp310-linux_x86_64.whl (7.1 MB)
Installing collected packages: torchvision, torchaudio
Successfully installed torchaudio-2.4.1+cu121 torchvision-0.19.1
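
To confirm that the three packages ended up on matching versions, you can compare what is installed against the versions named above. This is a minimal sketch using only the standard library. The `EXPECTED` table and the helper name `version_mismatches` are illustrative assumptions, not part of LitGPT or PyTorch.

```python
from importlib import metadata

# Versions taken from the install commands in this notebook.
EXPECTED = {"torch": "2.4.1", "torchvision": "0.19.1", "torchaudio": "2.4.1"}


def version_mismatches(expected, get_version=metadata.version):
    """Map each package whose installed version differs to what was found."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            # Drop local build tags such as "+cu121" before comparing.
            found = get_version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            found = None  # Package is not installed at all.
        if found != wanted:
            mismatches[name] = found
    return mismatches


print(version_mismatches(EXPECTED) or "All versions match")
```

An empty result means the environment matches the versions this notebook was run with. Any entry in the result names a package to reinstall.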

## Overview

LitGPT supports working with multiple local LLMs. It implements LLMs from scratch without any abstractions, giving users full control.

In this notebook, you'll implement the following workflows on Gemma 2 using LitGPT:

1. Fine-tune Gemma 2 on a small subset of the Alpaca dataset.
2. Perform inference using the fine-tuned model.
3. Deploy the fine-tuned model and send inference requests to the server using Python `requests`.
4. Upload the fine-tuned model to the Hugging Face Hub repository.

In this notebook, you'll use LitGPT's command-line interface to implement the aforementioned tasks. LitGPT also has an experimental Python API. You can explore its capabilities by visiting the [LitGPT Python API tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/python-api.md).

## 1. Fine-tune Gemma 2 using LitGPT

In this section, you will fine-tune Gemma 2 on a small subset of the Alpaca dataset using the LitGPT command-line interface.


### Download the Gemma 2 model

LitGPT supports a variety of open source models including Gemma. To list the supported models, run the following command:

In [3]:
!litgpt download list

Please specify --repo_id <repo_id>. Available values:
codellama/CodeLlama-13b-hf
codellama/CodeLlama-13b-Instruct-hf
codellama/CodeLlama-13b-Python-hf
codellama/CodeLlama-34b-hf
codellama/CodeLlama-34b-Instruct-hf
codellama/CodeLlama-34b-Python-hf
codellama/CodeLlama-70b-hf
codellama/CodeLlama-70b-Instruct-hf
codellama/CodeLlama-70b-Python-hf
codellama/CodeLlama-7b-hf
codellama/CodeLlama-7b-Instruct-hf
codellama/CodeLlama-7b-Python-hf
databricks/dolly-v2-12b
databricks/dolly-v2-3b
databricks/dolly-v2-7b
EleutherAI/pythia-1.4b
EleutherAI/pythia-1.4b-deduped
EleutherAI/pythia-12b
EleutherAI/pythia-12b-deduped
EleutherAI/pythia-14m
EleutherAI/pythia-160m
EleutherAI/pythia-160m-deduped
EleutherAI/pythia-1b
EleutherAI/pythia-1b-deduped
EleutherAI/pythia-2.8b
EleutherAI/pythia-2.8b-deduped
EleutherAI/pythia-31m
EleutherAI/pythia-410m
EleutherAI/pythia-410m-deduped
EleutherAI/pythia-6.9b
EleutherAI/pythia-6.9b-deduped
EleutherAI/pythia-70m
EleutherAI/pythia-70m-deduped
garage-bAInd/Camel-Plat

In this notebook, you will use the 2B variant of Gemma 2. Download the model weights using the following command:

In [4]:
!litgpt download google/gemma-2-2b

Setting HF_HUB_ENABLE_HF_TRANSFER=1
config.json: 100% 818/818 [00:00<00:00, 5.36MB/s]
generation_config.json: 100% 168/168 [00:00<00:00, 1.06MB/s]
model-00001-of-00003.safetensors: 100% 4.99G/4.99G [00:12<00:00, 400MB/s]
model-00002-of-00003.safetensors: 100% 4.98G/4.98G [00:14<00:00, 346MB/s]
model-00003-of-00003.safetensors: 100% 481M/481M [00:01<00:00, 458MB/s]
model.safetensors.index.json: 100% 24.2k/24.2k [00:00<00:00, 48.1MB/s]
tokenizer.json: 100% 17.5M/17.5M [00:00<00:00, 42.6MB/s]
tokenizer.model: 100% 4.24M/4.24M [00:00<00:00, 48.3MB/s]
tokenizer_config.json: 100% 46.4k/46.4k [00:00<00:00, 48.9MB/s]
Converting .safetensor files to PyTorch binaries (.bin)
checkpoints/google/gemma-2-2b/model-00003-of-00003.safetensors --> checkpoints/google/gemma-2-2b/model-00003-of-00003.bin
checkpoints/google/gemma-2-2b/model-00001-of-00003.safetensors --> checkpoints/google/gemma-2-2b/model-00001-of-00003.bin
checkpoints/google/gemma-2-2b/model-00002-of-00003.safetensors --> checkpoints/goog

### Fine-tune Gemma 2 on Alpaca dataset
You will now fine-tune Gemma 2 on a subset of the Alpaca dataset.


**Alpaca dataset**

LitGPT supports instruction-tuning models on many popular open-source datasets, like Alpaca, Dolly, FLAN, etc., using a simple command-line interface. No need to download or prepare datasets separately; LitGPT handles this automatically.

The full [Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html) dataset contains 52,000 instruction-response pairs, suitable for fine-tuning language models to follow instructions. For this task, however, you'll use a smaller subset of 2,000 samples, [Alpaca2k](https://github.com/Lightning-AI/litgpt/blob/7449dad90740c4b0947a6ccb474b869ef969e110/tutorials/prepare_dataset.md#alpaca-2k).

Credits:
[mhenrichsen/alpaca_2k_test](https://huggingface.co/datasets/mhenrichsen/alpaca_2k_test) (This dataset provides the 2,000-sample Alpaca2k subset supported by LitGPT).

**LoRA fine-tuning**

LitGPT supports various fine-tuning methods, including full fine-tuning, LoRA, QLoRA, and adapter fine-tuning.

While full fine-tuning trains all model weight parameters, it's memory-intensive. For this reason, you'll use the LoRA technique to fine-tune Gemma 2 in this notebook.

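LoRA (Low-Rank Adaptation) freezes the original model's weights and adds small, trainable low-rank matrices to selected layers. With the LitGPT defaults used in this notebook, these are the attention query and value projections. This significantly reduces the number of trainable parameters, which lowers the compute and memory needed for fine-tuning.

Because only the small LoRA matrices need to be stored for each fine-tuned task, LoRA also reduces storage requirements. The LoRA update can be merged into the original weights after training, so it adds no inference latency.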
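
To make the savings concrete: LoRA replaces the update to a weight matrix `W` with `(alpha / r) * B @ A`, where `A` has shape `r x d_in` and `B` has shape `d_out x r`. The sketch below is a back-of-the-envelope estimate of the trainable parameter count, using the model dimensions and LoRA settings that LitGPT prints in this notebook. It assumes one `A`/`B` pair per adapted projection and derives the projection widths from the printed config; the count LitGPT reports may differ.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Values taken from the config and LoRA settings printed in this notebook.
n_layer, n_embd, head_size = 26, 2304, 256
n_head, n_query_groups, lora_r = 8, 4, 8

# Assumed projection widths under grouped-query attention.
query_out = n_head * head_size          # 2048
value_out = n_query_groups * head_size  # 1024

# Only the query and value projections are adapted (lora_query, lora_value).
per_layer = (lora_param_count(n_embd, query_out, lora_r)
             + lora_param_count(n_embd, value_out, lora_r))
total = n_layer * per_layer

# Size of the frozen query and value projections, for comparison.
full_per_layer = n_embd * query_out + n_embd * value_out

print(f"Estimated LoRA parameters: {total:,}")
print(f"Share of the adapted projections: {per_layer / full_per_layer:.2%}")
```

Under these assumptions the estimate is about 1.6 million trainable parameters, a small fraction of the projections they adapt and far smaller than the full model.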


You can read more about LoRA by visiting the [official LoRA Github repository](https://github.com/microsoft/LoRA).


**Command-line arguments**

Use the `litgpt finetune_lora` command to fine-tune Gemma 2.
The following command line arguments are specified:
1. `--data`: Specifies the dataset to be used for fine-tuning. While LitGPT supports various datasets, you'll be using `Alpaca2k` for this task. For more details about the supported datasets and data-specific command line arguments please refer to the [LitGPT preparing datasets tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/prepare_dataset.md).
2. `--train.max_seq_length`: The maximum sequence length of the tokenized training samples. Samples that exceed this length are truncated, which reduces compute and memory requirements. You can choose the value from the length distribution of the training samples. In this tutorial, it is set to 512. You can read more about this parameter in the [LitGPT preparing datasets tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/prepare_dataset.md#truncating-datasets) and explore the length distribution of the `Alpaca2k` dataset in that tutorial's [Alpaca2k section](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/prepare_dataset.md#alpaca-2k).
3. `--train.micro_batch_size`: Determines the number of samples processed per iteration. This value is set to 2 in this tutorial to avoid out-of-memory errors on the T4 GPU. You can adjust this based on your GPU's memory capacity.
4. `--train.epochs`: Specifies the number of epochs to fine-tune the model for. In this example, the model will be fine-tuned for one epoch. For better results, you can increase the number of epochs.
5. `--out_dir`: Specifies the directory where checkpoints are periodically saved during fine-tuning.
6. `--precision`: Sets the precision to `bf16-true`. Using lower precision (bf16) reduces memory usage compared to 32-bit precision. You can find more details on this in the [LitGPT handling out-of-memory errors guide](https://github.com/Lightning-AI/litgpt/blob/7449dad90740c4b0947a6ccb474b869ef969e110/tutorials/oom.md#use-lower-precision).
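
As an illustration of choosing `--train.max_seq_length` from a length distribution, the sketch below computes a percentile of token counts and the fraction of samples a given limit would truncate. The token counts are hypothetical, not measured on `Alpaca2k`, and both helper names are made up for this example.

```python
import math


def length_at_percentile(lengths, pct):
    """Nearest-rank percentile: smallest length covering pct% of the samples."""
    ordered = sorted(lengths)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def truncated_fraction(lengths, max_seq_length):
    """Fraction of samples longer than max_seq_length (these get truncated)."""
    return sum(n > max_seq_length for n in lengths) / len(lengths)


# Hypothetical token counts, for illustration only.
token_lengths = [64, 80, 96, 120, 150, 200, 256, 300, 480, 900]

print(length_at_percentile(token_lengths, 90))
print(truncated_fraction(token_lengths, 512))
```

A limit near a high percentile keeps most samples intact while capping the memory cost of the few very long ones.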

In addition to these parameters, you can customize other train, evaluation, LoRA, and dataset-specific settings in LitGPT. Run `litgpt finetune_lora --help` to see all configurable parameters.

Note: Fine-tuning Gemma 2 on Alpaca 2K with the specified hyperparameters takes about 45-50 minutes on a T4 GPU. For better results, you can adjust the training configuration, fine-tune for longer periods or use the full Alpaca dataset.


In [7]:
os.environ["FINETUNED_MODEL_DIR"] = "out/lit-finetuned/gemma-2-alpaca-it"

!litgpt finetune_lora google/gemma-2-2b \
  --data Alpaca2k \
  --train.max_seq_length 512 \
  --train.micro_batch_size 2 \
  --train.epochs 1 \
  --out_dir $FINETUNED_MODEL_DIR \
  --precision bf16-true

{'access_token': None,
 'checkpoint_dir': PosixPath('checkpoints/google/gemma-2-2b'),
 'data': Alpaca2k(mask_prompt=False,
                  val_split_fraction=0.05,
                  prompt_style=<litgpt.prompts.Alpaca object at 0x7cc16adf00a0>,
                  ignore_index=-100,
                  seed=42,
                  num_workers=4,
                  download_dir=PosixPath('data/alpaca2k')),
 'devices': 1,
 'eval': EvalArgs(interval=100,
                  max_new_tokens=100,
                  max_iters=100,
                  initial_validation=False,
                  final_validation=True,
                  evaluate_example='first'),
 'logger_name': 'csv',
 'lora_alpha': 16,
 'lora_dropout': 0.05,
 'lora_head': False,
 'lora_key': False,
 'lora_mlp': False,
 'lora_projection': False,
 'lora_query': True,
 'lora_r': 8,
 'lora_value': True,
 'num_nodes': 1,
 'optimizer': 'AdamW',
 'out_dir': PosixPath('out/lit-finetuned/gemma-2-alpaca-it'),
 'precision': 'bf16-true',
 'quantize

LitGPT's Python API supports pre-training and fine-tuning using the [PyTorch Lightning Trainer](https://lightning.ai/docs/pytorch/stable/common/trainer.html). You can read more about this in the [LitGPT Python API tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/python-api.md#pytorch-lightning-trainer-support).

To customize fine-tuning further you can refer to the [LitGPT custom fine-tuning documentation](https://lightning.ai/lightning-ai/studios/litgpt-quick-start?section=featured#custom-finetuning).

## 2. Prompt the fine-tuned model

Next, you will test the fine-tuned model using the LitGPT command-line interface.

Use `litgpt generate` to prompt the fine-tuned Gemma 2 model. Specify the path to the fine-tuned model checkpoint in the command.

Use the `--prompt` argument to specify the query you want the model to answer.

You can also set sampling parameters such as `top_k`, `top_p`, `temperature`, and `max_new_tokens`.

Run `litgpt generate --help` to see all configurable parameters.
Please refer to the [LitGPT inference tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/inference.md) for more details.
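
To show how `temperature`, `top_k`, and `top_p` interact, here is a minimal pure-Python sketch of sampling one token from a list of logits. It illustrates the general technique only. LitGPT's own sampling code may differ in details, and the function name `sample_next_token` is hypothetical. The default values mirror those LitGPT prints for `litgpt generate` in this notebook.

```python
import math
import random


def sample_next_token(logits, temperature=0.8, top_k=50, top_p=1.0, rng=random):
    """Sample a token index using temperature, top-k, and top-p filtering."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(range(len(logits)), key=logits.__getitem__)

    # Top-k: keep only the k highest-scoring tokens, best first.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]

    # Temperature: divide logits before the softmax (lower = sharper).
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept_ids, kept_probs, cumulative = [], [], 0.0
    for idx, p in zip(ranked, probs):
        kept_ids.append(idx)
        kept_probs.append(p)
        cumulative += p
        if cumulative >= top_p:
            break

    # Draw one token from the remaining candidates.
    return rng.choices(kept_ids, weights=kept_probs, k=1)[0]


print(sample_next_token([1.0, 3.0, 2.0]))
```

Lower `temperature`, smaller `top_k`, or smaller `top_p` each narrow the set of likely tokens, which makes the output more deterministic.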

In [8]:
!litgpt generate $FINETUNED_MODEL_DIR/final \
  --prompt "Generate the next number in the Fibonacci series: 1, 1, 2, 3, 5"

{'checkpoint_dir': PosixPath('out/lit-finetuned/gemma-2-alpaca-it/final'),
 'compile': False,
 'max_new_tokens': 50,
 'num_samples': 1,
 'precision': None,
 'prompt': 'Generate the next number in the Fibonacci series: 1, 1, 2, 3, 5',
 'quantize': None,
 'temperature': 0.8,
 'top_k': 50,
 'top_p': 1.0}
Loading model 'out/lit-finetuned/gemma-2-alpaca-it/final/lit_model.pth' with {'name': 'Gemma-2-2b', 'hf_config': {'name': 'gemma-2-2b', 'org': 'google'}, 'scale_embeddings': True, 'attention_scores_scalar': 256, 'block_size': 8192, 'sliding_window_size': 4096, 'sliding_window_layer_placing': 2, 'vocab_size': 256000, 'padding_multiple': 512, 'padded_vocab_size': 256000, 'n_layer': 26, 'n_head': 8, 'head_size': 256, 'n_embd': 2304, 'rotary_percentage': 1.0, 'parallel_residual': False, 'bias': False, 'lm_head_bias': False, 'n_query_groups': 4, 'shared_attention_norm': False, 'norm_class_name': 'RMSNorm', 'post_attention_norm': True, 'post_mlp_norm': True, 'norm_eps': 1e-05, 'mlp_class_name':

## 3. Serve the fine-tuned model
You will now serve the fine-tuned model using LitGPT.

To deploy your fine-tuned model, use the `litgpt serve` command. Specify the port number using the `--port` argument.

When running in a Colab environment, you'll need to manage the LitGPT inference server as a Python subprocess using the `subprocess` package.

For more details on deploying LLMs with LitGPT, refer to the [LitGPT Serve and Deploy LLMs tutorial](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/deploy.md).


In [9]:
import subprocess
import time

command = [
    "litgpt", "serve", "out/lit-finetuned/gemma-2-alpaca-it/final", "--port", "30000"
]

# Create a file to write logs
with open("litgpt_serve.log", "w") as logfile:
  # Use subprocess.Popen to run the command with nohup-like behavior
  server_process = subprocess.Popen(
    command,
    stdout=logfile,
    stderr=subprocess.STDOUT,
    stdin=subprocess.PIPE,
    start_new_session=True  # This is similar to nohup behavior, detaches from terminal
  )

  # Send an Enter key (\n) to the process to accept the terms
  server_process.stdin.write(b'\n')
  server_process.stdin.flush()

# Wait 60 seconds for the server to load the model before sending requests
time.sleep(60)

Once the model has loaded, the server can be reached at http://localhost:30000/ from within this notebook. If requests fail, check `litgpt_serve.log` to confirm that startup has finished.
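
A fixed sleep can be too short on a slow runtime and wastefully long on a fast one. As an alternative, the sketch below polls the port until it accepts TCP connections. The helper `wait_for_port` is hypothetical and uses only the standard library. An open port shows that the server process is listening, which this sketch assumes is a good enough signal that the model has loaded.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=120.0, interval_s=2.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly, then retry.
            time.sleep(interval_s)
    return False
```

For example, calling `wait_for_port("localhost", 30000)` after starting the server process returns `True` as soon as the port opens, or `False` after two minutes.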

### Query the inference server

You can prompt the fine-tuned Gemma 2 model deployed via the inference server using Python's `requests` library.

You can craft your prompt to adhere to the format of the samples in the Alpaca dataset. Import `prompts` from `litgpt` and instantiate the Alpaca prompt style. Use this prompt style to convert any prompt to the Alpaca prompt format as demonstrated below.

In [10]:
from litgpt import prompts

alpaca_prompt_style = prompts.Alpaca()
prompt_text = alpaca_prompt_style.apply(prompt="Generate the next number in the Fibonacci series.",
                                        input="1, 1, 2, 3, 5, 8")
print(prompt_text)

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Generate the next number in the Fibonacci series.

### Input:
1, 1, 2, 3, 5, 8

### Response:
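
For readers who want to see the template itself, the following is a standalone sketch of the Alpaca prompt format shown above, including the variant used when there is no input. It re-implements the format for illustration only. The function name `format_alpaca_prompt` is hypothetical, and `prompts.Alpaca` remains the source of truth for what LitGPT uses.

```python
def format_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Build an Alpaca-style prompt, with or without an input section."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    # Without an input, the preamble is shorter and the Input section is omitted.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


print(format_alpaca_prompt("Generate the next number in the Fibonacci series.",
                           "1, 1, 2, 3, 5, 8"))
```

Prompting the fine-tuned model in the same format it saw during training generally gives more reliable responses than free-form text.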



Use Python's `requests` library to send a prediction request with your prompt to the inference server.

In [12]:
import json

import requests

server_url = "http://localhost:30000/"

response = requests.post(
    server_url + "predict",
    json={
      "prompt": prompt_text,
    }
)

print(json.dumps(response.json(), indent=2))

{
  "output": "The next number in the Fibonacci series is 8 as follows:\n\n1, 1, 2, 3, 5, 8, 13, 21 and 34, 55, 89"
}


You can stop the inference server by killing the server process you started earlier.

In [13]:
server_process.kill()

## 4. Push fine-tuned model to Hugging Face Hub
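To push the fine-tuned model to the Hugging Face Hub, you must first convert it to a format compatible with Hugging Face Transformers.

Since the model was fine-tuned with LoRA, the LoRA weights must be merged into the base model weights before conversion. Run `litgpt merge_lora` on the fine-tuned model directory. If the checkpoint has already been merged, the command simply reports this, as in the output below: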


In [14]:
!litgpt merge_lora $FINETUNED_MODEL_DIR/final

{'checkpoint_dir': PosixPath('out/lit-finetuned/gemma-2-alpaca-it/final'),
 'precision': None,
 'pretrained_checkpoint_dir': None}
LoRA weights have already been merged in this checkpoint.
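
Merging works because the LoRA update is just an additive term on the weight matrix. The sketch below checks this on tiny hand-written matrices in pure Python: applying the adapter separately and applying the merged weight `W + (alpha / r) * B @ A` give the same output. The matrices and helper names are made up for illustration, and this is not LitGPT's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, element by element."""
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Frozen weight W (2 x 3) and a rank-1 adapter: A is (1 x 3), B is (2 x 1).
W = [[1, 0, 2], [0, 1, -1]]
A = [[1, 2, 3]]
B = [[1], [-1]]
rank, alpha = 1, 2
scale = alpha / rank

x = [[1], [1], [1]]  # Input as a column vector.

# Unmerged: base output plus the scaled adapter output.
unmerged = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Merged: fold the adapter into the weight once, then use a single matmul.
W_merged = add_scaled(W, matmul(B, A), scale)
merged = matmul(W_merged, x)

print(unmerged == merged)
```

After merging, the checkpoint has the same layout as an ordinary model, which is why the merged model can be converted and served without any LoRA-specific code.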


Next, convert the fine-tuned model to Hugging Face format using the following command. Specify the path to the LitGPT model to be converted and the desired output directory as arguments.


In [15]:
# Set an environment variable for the output directory where the model must be
# saved after conversion to Hugging Face compatible format.
# This will be used later to push the model to the hub.
os.environ["FINETUNED_HF_MODEL"] = "out/hf-format/gemma2-finetuned-it"

!litgpt convert_from_litgpt $FINETUNED_MODEL_DIR/final/ $FINETUNED_HF_MODEL

{'checkpoint_dir': PosixPath('out/lit-finetuned/gemma-2-alpaca-it/final'),
 'output_dir': PosixPath('out/hf-format/gemma2-finetuned-it')}


To push the model to Hugging Face Hub, you must load the fine-tuned model weights into a `Transformers` model.

First, a state dictionary is loaded from the fine-tuned model weights. Then, an instance of `AutoModel` is created from the configuration of the pre-trained `gemma-2-2b` model, loaded with the weights of the fine-tuned model.

To learn more about converting LitGPT models to Hugging Face Transformers, refer to the [converting LitGPT weights to Hugging Face Transformers tutorial](https://github.com/Lightning-AI/litgpt/blob/7449dad90740c4b0947a6ccb474b869ef969e110/tutorials/convert_lit_models.md).


In [1]:
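import torch
from transformers import AutoModel

# Load the converted weights produced by `litgpt convert_from_litgpt`.
state_dict = torch.load("out/hf-format/gemma2-finetuned-it/model.pth")

# Build the model from the pre-trained `gemma-2-2b` configuration, then
# populate it with the fine-tuned weights instead of the pre-trained ones.
model = AutoModel.from_pretrained("google/gemma-2-2b", state_dict=state_dict)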

2.4.1+cu121



model.safetensors.index.json:   0%|          | 0.00/24.2k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/3 [00:00<?, ?it/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/4.99G [00:00<?, ?B/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/4.98G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/481M [00:00<?, ?B/s]

Use the model's `push_to_hub()` method to upload the model to Hugging Face Hub.

**Notes**:
1. In the following code snippet, replace "your_hf_username" with your Hugging Face username.
2. Your Hugging Face token needs to have write permissions.


In [3]:
model.push_to_hub("your_hf_username/gemma_finetuned_alpaca_it")

README.md:   0%|          | 0.00/5.17k [00:00<?, ?B/s]

No files have been modified since last commit. Skipping to prevent empty commit.


CommitInfo(commit_url='https://huggingface.co/prianka-kariat/gemma_finetuned_alpaca_it/commit/445cdc9120aacc566cd9582249095f94b64a98dd', commit_message='Upload model', commit_description='', oid='445cdc9120aacc566cd9582249095f94b64a98dd', pr_url=None, repo_url=RepoUrl('https://huggingface.co/prianka-kariat/gemma_finetuned_alpaca_it', endpoint='https://huggingface.co', repo_type='model', repo_id='prianka-kariat/gemma_finetuned_alpaca_it'), pr_revision=None, pr_num=None)

Congratulations! You've successfully fine-tuned, run, and served Gemma 2 using LitGPT.

What's next?

Your next steps could include the following:

**Experiment with different datasets**: Try fine-tuning on other instruction-tuning datasets supported by LitGPT. Implement custom workflows to fine-tune the model on other datasets from Hugging Face Hub or your data to adapt the model to various tasks or domains.

**Tune hyperparameters**: Adjust training parameters (e.g., learning rate, batch size, epochs, LoRA settings) to optimize performance and improve training efficiency.

**Evaluate the fine-tuned model**: Evaluate the fine-tuned model using LitGPT's command-line interface.
