# **PaliGemma2 Demo Notebook**


---
This notebook demonstrates how to use the PaliGemma2 class for computer vision tasks, specifically using the ball dataset.
## Pre-work
Let's make sure that we have access to a GPU.

In [None]:
!nvidia-smi
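
If you would rather have later cells branch on the result than read the `nvidia-smi` table by eye, the same check can be done from Python. This is only a sketch using the standard library: the helper name `gpu_available` is made up for this notebook, and it assumes `nvidia-smi -L` (which lists one GPU per line) is a good enough signal.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU.

    Hypothetical helper for this notebook, not part of BaseballCV.
    """
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L prints one line per detected GPU
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_available())
```

If this prints `False` in Colab, switching the runtime type to a GPU runtime is the usual fix.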

## Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Create a working directory and cd into it

In [None]:
import os
working_dir = "paligemma2"
working_path = os.path.join('/content/drive/MyDrive', working_dir)
os.makedirs(working_path, exist_ok=True)

# Change directory to the working directory
%cd {working_path}

## Clone BaseballCV Repo

In [None]:
!git clone -b 68-add-paligemma2-class-and-notebook https://github.com/dylandru/BaseballCV.git
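
Because the working directory lives on Google Drive, it persists between sessions, and `git clone` refuses to clone into a directory that already exists and is not empty. A small check like the one below can tell you whether the clone cell needs to run at all. The helper name `repo_already_cloned` is hypothetical, and the sketch assumes the repository was cloned into a folder named `BaseballCV` inside the current directory.

```python
import os


def repo_already_cloned(repo_dir: str) -> bool:
    """Return True if repo_dir looks like an existing git checkout.

    Hypothetical helper: it only checks for a .git folder.
    """
    return os.path.isdir(os.path.join(repo_dir, ".git"))


if repo_already_cloned("BaseballCV"):
    print("BaseballCV is already cloned; the git clone cell can be skipped.")
else:
    print("BaseballCV not found; run the git clone cell.")
```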

## Set as current directory and install requirements for BaseballCV

In [None]:
%cd BaseballCV
!pip install -r requirements.txt

## Because the session has to be restarted after the previous step, we need to redo this:

In [None]:
import os
working_dir = "paligemma2"
working_path = os.path.join('/content/drive/MyDrive', working_dir)
os.makedirs(working_path, exist_ok=True)

# Change directory to the working directory
%cd {working_path}
%cd BaseballCV

# Initialize PaliGemma2 Model


---
## Now let's initialize our PaliGemma2 model:

*(You will need a Hugging Face token, and it needs to be added to the Secrets section (the "key" icon on the left) under the name "HF_TOKEN". You also need to request permission to use the PaliGemma model at: https://huggingface.co/google/paligemma2-3b-pt-224)*


In [None]:
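import os
from datetime import datetime

from baseballcv.models import PaliGemma2
from google.colab import userdata
from huggingface_hub import login, hf_hub_download

# userdata.get raises if the secret is missing or notebook access is not granted,
# so fall back to the environment variable in that case.
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
except (userdata.SecretNotFoundError, userdata.NotebookAccessError):
    HF_TOKEN = os.environ.get('HF_TOKEN')

# Log in to Hugging Face Hub
if HF_TOKEN:
    login(token=HF_TOKEN)
else:
    print("Warning: HF_TOKEN not found. You may need to request access to the model manually.")

# Initialize the model
batch_size = 4
model = PaliGemma2(batch_size=batch_size)
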


## Let's load the ball dataset:

In [None]:
from baseballcv.functions import LoadTools

# Initialize LoadTools
load_tools = LoadTools()

# Load the ball dataset
dataset_path = load_tools.load_dataset('baseball')

# Define classes for the ball dataset
classes = {
    2: "baseball"
}
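
Before fine-tuning, it can help to confirm that the download actually produced files. The notebook does not say how the dataset folder is laid out, so the sketch below makes no assumption about it and simply counts files by extension under a directory. The helper name `count_files_by_extension` is made up for illustration.

```python
import os
from collections import Counter


def count_files_by_extension(root: str) -> Counter:
    """Walk root recursively and count files per lowercase extension.

    Hypothetical helper; it does not know anything about the dataset format.
    """
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "<no extension>"
            counts[ext] += 1
    return counts
```

In this notebook it could be called as `count_files_by_extension(dataset_path)`, assuming `dataset_path` is a path to a local folder.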

# Fine-tuning


---


## Let's fine-tune the model on the ball dataset:

In [None]:
# Fine-tune the model
training_results = model.finetune(
    dataset=dataset_path,
    classes=classes,
    train_test_split=(80, 10, 10),
    epochs=1,  # 1 epoch for brevity
    lr=1e-06,
    save_dir="model_checkpoints",
    num_workers=4,
    lora_r=8,
    lora_scaling=12,
    lora_dropout=0.05
)

print("Training Results:")
print(f"Best Metric: {training_results['best_metric']}")
print(f"Final Training Loss: {training_results['final_train_loss']}")
print(f"Final Validation Loss: {training_results['final_val_loss']}")
print(f"Model saved at: {training_results['model_path']}")
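
To get a feel for what `train_test_split=(80, 10, 10)` means in sample counts, here is a small arithmetic sketch. It assumes the tuple is read as (train, validation, test) percentages, which the notebook does not state, and it is not how BaseballCV necessarily rounds or shuffles; `split_counts` is a hypothetical helper.

```python
def split_counts(n_samples: int, percentages=(80, 10, 10)) -> tuple:
    """Turn split percentages into sample counts.

    Assumption: the order is (train, validation, test). Any remainder
    from integer rounding is assigned to the last split.
    """
    if sum(percentages) != 100:
        raise ValueError("percentages must sum to 100")
    train = n_samples * percentages[0] // 100
    val = n_samples * percentages[1] // 100
    test = n_samples - train - val  # remainder goes to the last split
    return train, val, test


print(split_counts(1000))  # (800, 100, 100)
```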

# Evaluation


---


## Let's evaluate the model's performance:

In [None]:
# Evaluate the model
evaluation_results = model.evaluate(
    base_path=dataset_path,
    classes=classes,
    num_workers=4
)

print("\nEvaluation Results:")
print(f"mAP@50: {evaluation_results.map50}")
print(f"mAP@50:95: {evaluation_results.map}")
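
For context on the two numbers: mAP@50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while mAP@50:95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU calculation for boxes in `(x1, y1, x2, y2)` format. It is an illustration, not the evaluation code used by `model.evaluate`.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


score = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
print(score >= 0.5)  # False
```

A small object such as a baseball covers few pixels, so a shift of a pixel or two can change IoU a lot, which is one reason mAP@50:95 tends to be noticeably lower than mAP@50.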

# Visualizing Results with TensorBoard


---


## You can visualize the training metrics using TensorBoard:

In [None]:
# Load TensorBoard extension
%load_ext tensorboard

# Launch TensorBoard
%tensorboard --logdir {training_results['tensorboard_dir']}
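
If TensorBoard opens but shows no data, a quick check is whether the log directory contains any event files at all. TensorBoard writers name these files `events.out.tfevents.*`, so the sketch below searches for that pattern recursively. The helper name `find_event_files` is hypothetical.

```python
import glob
import os


def find_event_files(logdir: str) -> list:
    """Return sorted paths of TensorBoard event files under logdir."""
    # With recursive=True, "**" matches logdir itself and any subfolder
    pattern = os.path.join(logdir, "**", "events.out.tfevents.*")
    return sorted(glob.glob(pattern, recursive=True))
```

In this notebook it could be called as `find_event_files(training_results['tensorboard_dir'])`; an empty list suggests that training did not write logs to that location.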