# Gemma3



This notebook requires several libraries that are not pre-installed in the CoreAI environment. The steps below install them into a dedicated virtual environment.

## Pre-requisites
Open a terminal from within Jupyter via `File -> New -> Terminal`.
Type `bash` to start a shell compatible with the following commands.
Change to the project directory where you want to set up the environment, then run:

```bash
export PROJECT_NAME="Gemma3"
export PIP_CACHE_DIR="$(pwd)/.cache/pip"
mkdir -p "$PIP_CACHE_DIR"
python -m venv --system-site-packages myvenv
source myvenv/bin/activate
pip install ipykernel
python -m ipykernel install --user --name=${PROJECT_NAME}-myvenv --display-name="Python (${PROJECT_NAME}-myvenv)"
echo ""; echo "Before continuing load the created Python kernel: Python (${PROJECT_NAME}-myvenv)"
```

Select the kernel created above, `Python (Gemma3-myvenv)`, before running the cells below. It may take a few seconds for the new kernel to appear in the list of kernels.
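
One way to confirm that the notebook is actually running on the new kernel is to inspect the interpreter from a cell. The sketch below is not part of the original notebook; the helper name `in_virtualenv` is hypothetical, and it relies only on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a `venv`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


# The interpreter path should point inside the myvenv directory.
print("Interpreter:", sys.executable)
print("Running inside a virtual environment:", in_virtualenv())
```

If the interpreter path does not point into `myvenv`, the wrong kernel is probably still selected.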


## Install Required Libraries

Before running the following commands, load the `Python (Gemma3-myvenv)` kernel.

Ensure you are in the directory that contains both the Jupyter notebook and the `myvenv` directory.

In [None]:
!. ./myvenv/bin/activate; pip install -r requirements.txt
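
After the installation finishes, it can be useful to check which versions ended up in the environment. This is a minimal sketch using the standard-library `importlib.metadata` module; the helper name `installed_versions` is hypothetical, and the package list is an assumption that should be adjusted to match `requirements.txt`.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed package list; adjust it to match the contents of requirements.txt.
for name, version in installed_versions(["transformers", "torch", "accelerate"]).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```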

Add the virtual environment's `bin` directory to `PATH` so that the `accelerate` command can be found:

In [None]:
import os
pwd = os.getcwd()
os.environ['PATH'] =  os.path.join(pwd, 'myvenv/bin') + os.pathsep + os.environ['PATH']

! echo $PATH
! which accelerate
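
The same lookup can be done in pure Python, which avoids depending on the shell. The sketch below uses `shutil.which` from the standard library; the helper name `find_command` is hypothetical and the `myvenv/bin` location is taken from the setup steps above.

```python
import os
import shutil


def find_command(name, extra_dir=None):
    """Look up an executable on PATH, optionally searching extra_dir first."""
    search_path = os.environ.get("PATH", "")
    if extra_dir is not None:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(name, path=search_path)


# Assumes the virtual environment lives in ./myvenv, as created earlier.
venv_bin = os.path.join(os.getcwd(), "myvenv", "bin")
location = find_command("accelerate", extra_dir=venv_bin)
print(location or "accelerate not found; check that requirements.txt was installed")
```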

**Make sure `HF_HOME` is set BEFORE `transformers` is loaded**

In [None]:
import os
os.makedirs('HF_HOME', exist_ok=True)
os.environ['HF_HOME'] = 'HF_HOME'
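
The ordering matters because the Hugging Face libraries resolve their cache location from `HF_HOME` when they are first imported, so a value set afterwards is not picked up by the running process. The following sketch, which is not part of the original notebook, wraps the cell above in a hypothetical helper `set_hf_home` that warns when the libraries are already loaded and stores an absolute path so the cache location does not depend on later changes of the working directory.

```python
import os
import sys
import warnings


def set_hf_home(path="HF_HOME"):
    """Create the cache directory and point HF_HOME at its absolute path."""
    # Libraries that read HF_HOME at import time will not see a later change.
    already_loaded = [m for m in ("transformers", "huggingface_hub") if m in sys.modules]
    if already_loaded:
        warnings.warn(
            f"{', '.join(already_loaded)} already imported; "
            "restart the kernel for HF_HOME to take effect"
        )
    abs_path = os.path.abspath(path)
    os.makedirs(abs_path, exist_ok=True)
    os.environ["HF_HOME"] = abs_path
    return abs_path


print("HF_HOME =", set_hf_home())
```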

In [None]:
# Load environment variables from a local .env file, if one is present.
# The Hugging Face access token is expected in HF_TOKEN.

import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

if 'HF_TOKEN' not in os.environ:
    # Raising stops the cell with a clear message instead of shutting down the kernel
    raise RuntimeError("No HF_TOKEN set, will not be able to download the model")

hf_token=os.environ['HF_TOKEN']
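
When debugging credential problems it is tempting to print the token, which would leave it in the saved notebook output. A small redaction helper avoids that. This is an illustrative sketch only; `mask_secret` is a hypothetical name and the example value is made up.

```python
def mask_secret(value, visible=4):
    """Return a redacted form of a secret that is safe to print."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


# Made-up example value; in the notebook the argument would be hf_token.
print(mask_secret("hf_exampletoken123"))
```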

# LLM: Large Language Model queries

## LLM: Obtain model

In [None]:
from transformers import pipeline
import torch

pipe = pipeline(
    "text-generation",
    model="google/gemma-3-4b-it",
    device="cuda",
    torch_dtype=torch.bfloat16,
    token=hf_token
)
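
The cell above hard-codes `device="cuda"` and `torch.bfloat16`, which fails on machines without a suitable GPU. A hedged alternative is to derive both values from capability flags. The helper `choose_device_and_dtype` below is hypothetical, and falling back to `float32` whenever bfloat16 is unavailable is an assumption made for this sketch, not something the original notebook prescribes.

```python
def choose_device_and_dtype(cuda_available, bf16_supported):
    """Pick a device string and a dtype name from two capability flags."""
    if not cuda_available:
        # CPU fallback: slow for a 4B-parameter model, but avoids a hard failure.
        return ("cpu", "float32")
    if bf16_supported:
        return ("cuda", "bfloat16")
    # Assumption for this sketch: use full precision on GPUs without bfloat16.
    return ("cuda", "float32")


# In the notebook, the flags would come from torch, for example:
#   cuda_ok = torch.cuda.is_available()
#   bf16_ok = cuda_ok and torch.cuda.is_bf16_supported()
#   device, dtype_name = choose_device_and_dtype(cuda_ok, bf16_ok)
#   torch_dtype = getattr(torch, dtype_name)
print(choose_device_and_dtype(True, True))
print(choose_device_and_dtype(False, False))
```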


## LLM: Question Answering

In [None]:
question = "What is OpenStack? What is its landscape? What is the scientific SIG?"

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "you are a helpful assistant"}]
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": f"{question}"}]
    },
]
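# Only the user's question is passed to the pipeline, as a plain string;
# the system message defined above is not used in this call.
user_message = messages[1]["content"][0]["text"]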

output = pipe(text_inputs=user_message, max_new_tokens=4000)
print(output[0]["generated_text"])
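
With a plain string as input, the `text-generation` pipeline by default returns the prompt followed by the completion, whereas chat-style input returns the message list with the assistant reply appended. The sketch below shows one way to pull out just the reply in either case. `extract_reply` is a hypothetical helper, and the two sample outputs are hand-written stand-ins that only mimic the shape of real pipeline results.

```python
def extract_reply(output, prompt=None):
    """Return the newly generated text from a text-generation pipeline result."""
    generated = output[0]["generated_text"]
    if isinstance(generated, list):
        # Chat-style input: the result is the message list with the reply appended.
        return generated[-1]["content"]
    if prompt is not None and generated.startswith(prompt):
        # String input: the prompt is echoed before the completion.
        return generated[len(prompt):].lstrip()
    return generated


# Hand-written stand-ins for pipeline output, used only to illustrate the two shapes.
fake_string_output = [{"generated_text": "What is OpenStack? It is a cloud platform."}]
fake_chat_output = [{"generated_text": [
    {"role": "user", "content": "What is OpenStack?"},
    {"role": "assistant", "content": "It is a cloud platform."},
]}]

print(extract_reply(fake_string_output, prompt="What is OpenStack?"))
print(extract_reply(fake_chat_output))
```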

### When done using the LLM: free up GPU memory

Note: the same variable (`pipe`) holds both the text and the image pipeline, so this cell can be run whenever you switch from one to the other.

In [None]:
# Release the model before loading the VLM: drop the reference,
# run garbage collection, then clear the CUDA cache
import gc

# check memory
print(torch.cuda.memory_allocated())

del pipe

gc.collect()
torch.cuda.empty_cache()

# check memory again
print(torch.cuda.memory_allocated())
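
`torch.cuda.memory_allocated()` returns a raw byte count, which is hard to read for multi-gigabyte models. A small formatting helper makes the before and after values easier to compare. This is a sketch; `format_bytes` is a hypothetical name and the sample value is arbitrary.

```python
def format_bytes(num_bytes):
    """Render a byte count using binary units (B, KiB, MiB, GiB)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit that fits, or at GiB as the largest unit.
        if value < 1024 or unit == "GiB":
            return f"{value:.2f} {unit}"
        value /= 1024


# Arbitrary sample value; in the notebook, pass torch.cuda.memory_allocated().
print(format_bytes(8 * 1024**3))
```

Note that `memory_allocated()` counts memory held by live tensors, while the effect of `torch.cuda.empty_cache()` is visible in `torch.cuda.memory_reserved()`.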

# VLM: Image Understanding

## VLM: Obtain model

In [None]:
from transformers import pipeline
import torch

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-4b-it",
    device="cuda",
    torch_dtype=torch.bfloat16,
    token=hf_token
)

## VLM: Extract information from images

Modify the text prompt in the user message below to ask different questions about the image.

In [None]:
import PIL.Image  # the Image submodule must be imported explicitly
import base64
import io
from IPython.display import display

img_file = "data/openstack-map-v20250401.png"  # image to analyze


def base64_image(img_file):
    img_type = "png"
    img_b64 = None
    img_str = None
    img_bytes = io.BytesIO()
    with PIL.Image.open(img_file) as image:
        display(image)
        image.save(img_bytes, format=img_type)
        img_b64 = base64.b64encode(img_bytes.getvalue()).decode('utf-8')
    if img_b64 is not None:
        img_str = f"data:image/{img_type};base64,{img_b64}"
    
    if img_str is None:
        # Raising stops the cell with a clear message instead of shutting down the kernel
        raise ValueError("No valid image data")

    return img_str

img_str = base64_image(img_file)

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": { "url": img_str } },
            {"type": "text", "text": "Describe the image. What field of expertise is needed to understand it? Explain the idea behind the image."}
        ]
    }
]

output = pipe(text=messages, max_new_tokens=1000)
print(output[0]["generated_text"][-1]["content"])
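
The `base64_image` helper above embeds the image in a `data:` URL. To illustrate that encoding without needing an image file, the sketch below builds such a URL from raw bytes and decodes it again using only the standard library. Both function names are hypothetical, and the PNG signature bytes stand in for real image data.

```python
import base64


def to_data_url(raw, mime="image/png"):
    """Encode raw bytes as a base64 data URL."""
    return f"data:{mime};base64,{base64.b64encode(raw).decode('utf-8')}"


def from_data_url(url):
    """Split a base64 data URL into its MIME type and decoded bytes."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(payload)


# The first eight bytes of any PNG file, used here as stand-in image data.
png_signature = b"\x89PNG\r\n\x1a\n"
url = to_data_url(png_signature)
mime, decoded = from_data_url(url)
print(mime, decoded == png_signature)
```

A round trip like this is a quick way to check that an encoded image was not truncated before it is sent to the model.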