# Log in to KaggleHub
The code below imports the `kagglehub` library and authenticates your session so you can access datasets, models, and other KaggleHub resources.


In [None]:
import kagglehub
kagglehub.login()

# Environment and GPU Configuration 

This cell prepares the runtime for fine-tuning.
We set the Keras backend to TensorFlow, choose which GPUs are visible (devices 0 and 1), enable dynamic GPU memory growth so TensorFlow does not reserve all GPU memory up front, and suppress TensorFlow info and warning logs.

These environment variables must be set before TensorFlow is imported. We then import TensorFlow, count the available GPUs, and enable memory growth on each one. This keeps GPU usage stable when loading and training the Gemma 3 model.

In [None]:
import os
os.environ["KERAS_BACKEND"] = "tensorflow"
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# Import TensorFlow FIRST to lock in GPU
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
print(f"Initial GPU check: {len(gpus)} GPUs")

if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    print("✓ GPUs configured")
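
The cell above overwrites any value already present in the environment, for example a `CUDA_VISIBLE_DEVICES` setting on a single-GPU machine. As a minimal sketch, assuming you would rather keep values that are already set, the same settings can be applied as defaults. `apply_env_defaults` is a hypothetical helper, not part of TensorFlow or Keras, and it would need to run before TensorFlow is imported.

```python
import os

# Same values as the cell above; they only take effect if set before TensorFlow is imported.
DEFAULTS = {
    "KERAS_BACKEND": "tensorflow",
    "CUDA_VISIBLE_DEVICES": "0,1",
    "TF_FORCE_GPU_ALLOW_GROWTH": "true",
    "TF_CPP_MIN_LOG_LEVEL": "2",
}


def apply_env_defaults(defaults, environ=os.environ):
    """Set each variable only if it is not already defined; return what was set."""
    applied = {}
    for name, value in defaults.items():
        if name not in environ:
            environ[name] = value
            applied[name] = value
    return applied


applied = apply_env_defaults(DEFAULTS)
print("Applied defaults:", applied)
```

Variables that were already defined are left untouched and do not appear in the returned dictionary.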

# Upgrade KerasNLP to the Latest Version

This cell installs the latest version of KerasNLP, which includes full support for the Gemma 3 model family.
We run a simple pip upgrade command, and then print a confirmation.
Do not restart the runtime after installing, because TensorFlow and the GPU setup from the previous cell would reset.

In [None]:
# Upgrade to latest KerasNLP for Gemma3 support
!pip install -q --upgrade keras-nlp

print("✓ KerasNLP upgraded to latest - continue to next cell (do NOT restart)")
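
Because the runtime is not restarted, a package that was imported before the upgrade would still report its old `__version__`. One way to confirm what pip actually installed, sketched here with the standard library's `importlib.metadata`, is to read the installed distribution metadata instead of importing the package. `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution name as used in the pip command above.
print("keras-nlp:", installed_version("keras-nlp"))
```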

# Verify Installation and Environment

This cell performs several checks before we start fine-tuning:

1. Imports the required libraries: `keras`, `keras_nlp`, `TopKSampler`, `time`, `csv`, and `logging`.
2. Suppresses verbose logs from `sentencepiece`.
3. Prints the current versions of Keras and KerasNLP.
4. Re-checks that GPUs are still available.
5. Verifies that the `Gemma3CausalLM` model is present in KerasNLP.

This ensures the environment is correctly set up and ready for model fine-tuning.

In [None]:
import keras
import keras_nlp
from keras_nlp.samplers import TopKSampler
from time import time
import csv
import logging

# Suppress messages
logging.getLogger("sentencepiece").setLevel(logging.ERROR)

print("="*60)
print("KerasNLP version:", keras_nlp.__version__)
print("Keras version:", keras.__version__)

# Re-verify GPU
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
print(f"Num GPUs: {len(gpus)}")

if gpus:
    print("✓✓✓ GPU STILL DETECTED! ✓✓✓")
else:
    print("⚠️ GPU lost")
    
# Check Gemma3
if hasattr(keras_nlp.models, 'Gemma3CausalLM'):
    print("✓ Gemma3CausalLM available!")
else:
    print("✗ Gemma3CausalLM NOT available")
    print(f"Available: {[x for x in dir(keras_nlp.models) if 'Gemma' in x]}")
print("="*60)
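
The checks above only print a message when something is missing, so later cells would still run and fail with a less obvious error. A minimal sketch of a stricter variant, assuming you prefer to stop early: `require_attr` is a hypothetical helper, not a KerasNLP function, and the example uses a stand-in object so it runs without KerasNLP installed.

```python
from types import SimpleNamespace


def require_attr(module, name):
    """Return module.<name>, or raise ImportError listing similarly named attributes."""
    if hasattr(module, name):
        return getattr(module, name)
    prefix = name[:5].lower()
    similar = [x for x in dir(module) if prefix in x.lower()]
    raise ImportError(f"{name} not found; similar names available: {similar}")


# Stand-in for demonstration; in the notebook you would pass keras_nlp.models.
fake_models = SimpleNamespace(GemmaCausalLM=object)
try:
    require_attr(fake_models, "Gemma3CausalLM")
except ImportError as err:
    print(err)
```

In the notebook, `require_attr(keras_nlp.models, "Gemma3CausalLM")` would halt the cell with an `ImportError` if the class is missing.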

# Load Gemma3 270M Model

The next two cells load the Gemma 3 270M causal language model.
The first downloads the Kaggle-hosted preset (pre-trained weights and configuration) with `kagglehub.model_download`; the second passes the downloaded path to KerasNLP's `from_preset` method.
Once loaded, the model is ready for fine-tuning.

In [None]:
import kagglehub

path = kagglehub.model_download("keras/gemma3/keras/gemma3_270m")
print("Path to model files:", path)

In [None]:
# Load the Gemma 3 270M model from the downloaded preset directory
print("Loading Gemma3 270M model...")
gemma_lm = keras_nlp.models.Gemma3CausalLM.from_preset(path)
print("✓ Model loaded successfully!")
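
Before fine-tuning, it can help to run one short generation to confirm the model responds and to get a baseline latency. The sketch below assumes the loaded model exposes `generate(prompt, max_length=...)`, as KerasNLP causal language models do. `timed_generate` and the sample prompt are illustrative only.

```python
from time import time


def timed_generate(model, prompt, max_length=64):
    """Call model.generate and return (generated text, seconds elapsed)."""
    start = time()
    text = model.generate(prompt, max_length=max_length)
    return text, time() - start


# In the notebook, with the model loaded above (illustrative prompt):
# text, seconds = timed_generate(gemma_lm, "What are the symptoms of anemia?")
# print(f"Generated in {seconds:.1f}s:\n{text}")
```

The first call is typically slower than later ones because the generation function is compiled on first use.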


# Load and Prepare Dataset

This cell downloads the `gpreda/medquad` dataset with `kagglehub.dataset_download`, reads its CSV file of medical question-answer pairs, and converts it into a format suitable for fine-tuning.

The cell uses two CSV columns: `question` and `answer`.

Each row is transformed into a dictionary with the keys `prompts` (from `question`) and `responses` (from `answer`).

All examples are collected in a list called `data`.

In [None]:
import csv
import os
import kagglehub

path = kagglehub.dataset_download("gpreda/medquad")
print("Dataset downloaded to:", path)

csv_path = os.path.join(path, "medquad.csv")

data = []

# Read the 'question' and 'answer' columns from the CSV.
# newline='' lets the csv module handle line breaks inside quoted fields.
with open(csv_path, mode='r', encoding='utf-8', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        # Store each pair under the 'prompts' and 'responses' keys
        data.append({"prompts": row['question'], "responses": row['answer']})
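
The loop above keeps every row, including any with an empty question or answer. As a minimal sketch, assuming such rows should be dropped before fine-tuning, the same conversion can be wrapped in a function that also reports how many rows were skipped. `load_qa_pairs` is a hypothetical helper, and whether the dataset actually contains empty fields is not verified here.

```python
import csv


def load_qa_pairs(csv_path):
    """Read 'question'/'answer' columns and return (examples, skipped_row_count)."""
    data = []
    skipped = 0
    with open(csv_path, mode="r", encoding="utf-8", newline="") as file:
        for row in csv.DictReader(file):
            question = (row.get("question") or "").strip()
            answer = (row.get("answer") or "").strip()
            if not question or not answer:
                skipped += 1  # drop incomplete pairs
                continue
            data.append({"prompts": question, "responses": answer})
    return data, skipped


# In the notebook, with csv_path defined above:
# data, skipped = load_qa_pairs(csv_path)
# print(f"Loaded {len(data)} examples, skipped {skipped} incomplete rows")
```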