# Exercise: Setting Up and Running a VLM

This notebook shows how to install the dependencies, download a vision-language model (VLM)
from Hugging Face, and load it with the `llama_cpp` library. The steps below cover how to:
- Install the necessary Python packages.
- Download the GGUF model files (the text model and the multimodal projector) from Hugging Face.
- Load the model and prompt it with an image.

Run the installation cell before any cell that imports `llama_cpp`.

In [None]:
# Check GPU availability
!nvidia-smi
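
The `!nvidia-smi` call above prints a report for a human to read. As a minimal sketch, the same check can be done from Python so that later cells can branch on the result. The helper name `gpu_names` is an assumption made for this illustration; it relies only on the `nvidia-smi` query flags and the standard library.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    # nvidia-smi is missing on machines without the NVIDIA driver installed.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code usually means the driver could not reach a GPU.
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(gpu_names() or "No NVIDIA GPU detected")
```

An empty list suggests the runtime has no usable NVIDIA GPU, in which case the CUDA wheel installed in the next cell would bring no speed-up.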

In [None]:
# Install llama-cpp-python (prebuilt CUDA 12.4 wheel) to run the model and huggingface_hub to download it
!pip3 install llama-cpp-python==0.3.4 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
!pip3 install huggingface_hub==0.28.0
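
After the install cell finishes, it can help to confirm that the pinned versions are the ones the kernel actually sees. The sketch below uses `importlib.metadata` from the standard library; the helper name `installed_version` is an assumption made for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names as they were passed to pip in the install cell.
for name in ("llama-cpp-python", "huggingface_hub"):
    print(name, installed_version(name))
```

A value of `None` means the package is not visible to the current kernel, which is a common sign that the install cell failed or ran in a different environment.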

In [None]:
# Import the model class and the Moondream chat handler used to load and prompt the VLM
from llama_cpp import Llama
from llama_cpp.llama_chat_format import MoondreamChatHandler

In [None]:
# Create the chat handler; this downloads the multimodal projector (mmproj) file used to encode images
chat_handler = MoondreamChatHandler.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="*mmproj*",
)

In [None]:
# Download the text model and load it together with the chat handler
llm = Llama.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="*text-model*",
    chat_handler=chat_handler,
    n_ctx=4096,  # context window; must be large enough to hold the image embedding plus the prompt
)
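
The load cell above does not set `n_gpu_layers`, and `llama_cpp` keeps all layers on the CPU by default, so the CUDA wheel alone does not move the model to the GPU. The sketch below shows one way to build the load arguments so that GPU offloading is explicit. The helper `llama_load_kwargs` is hypothetical; `n_gpu_layers=-1` is the `Llama` setting that requests offloading of all layers. Whether offloading helps for this particular model is an assumption to verify on your own hardware.

```python
def llama_load_kwargs(use_gpu, n_ctx=4096):
    """Build keyword arguments for Llama.from_pretrained (hypothetical helper)."""
    return {
        "repo_id": "vikhyatk/moondream2",
        "filename": "*text-model*",
        "n_ctx": n_ctx,
        # -1 asks llama_cpp to offload every layer; 0 keeps the model on the CPU.
        "n_gpu_layers": -1 if use_gpu else 0,
    }


kwargs = llama_load_kwargs(use_gpu=True)
print(kwargs)

# In the notebook, the arguments would be passed as:
# llm = Llama.from_pretrained(chat_handler=chat_handler, **kwargs)
```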

In [None]:
# Ask the VLM a question about the Frieren image
llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is the girl in the image eating?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://media.tenor.com/p78411-TVP4AAAAe/hamburger-frieren.png"
                    },
                },
            ],
        }
    ]
)["choices"][0]["message"]["content"]
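
The prompt cell above passes the image as a remote URL. To use a local file instead, the image can be embedded as a base64 data URI in the same `image_url` field. The following is a minimal sketch; the helper name `image_to_data_uri` and the file name `frieren.png` in the usage comment are assumptions made for illustration.

```python
import base64
import mimetypes


def image_to_data_uri(path):
    """Encode a local image file as a base64 data URI."""
    # Guess the MIME type from the file extension, e.g. image/png for .png.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image type for {path!r}")
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Usage in the message content (hypothetical local file):
# {"type": "image_url", "image_url": {"url": image_to_data_uri("frieren.png")}}
```

Embedding the image this way avoids a network request at prompt time, at the cost of a longer request payload.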