# Ollama advanced exercises

**Running LLMs locally - Master Applied AI - Michiel Bontenbal - 12 December 2024**

Ollama is a tool for running open-source large language models (LLMs) locally on your laptop. It supports a variety of models, including Llama2, Mistral, CodeLlama and many others.

You'll need to install Ollama first. Download it from www.ollama.com.
You'll also need to work through the notebook 'ollama.ipynb' first to get a basic understanding of Ollama.


### Contents
0. Installs, checks and imports
1. Download GEITje from Huggingface & run it
2. Download FIETje from Huggingface & run it
3. Compare two models in a Gradio frontend

## 0. Installs, checks and imports

In [None]:
%pip install --upgrade ollama

In [None]:
# Check your version of python. To run ollama with python you will need Python 3.8 or higher.
from platform import python_version
print(python_version())

In [None]:
# Make sure you run this notebook from a local hard disk. Running it from OneDrive makes it much slower.
import os
print(f"Current working directory: {os.getcwd()}")

Current working directory: /Users/michielbontenbal/Desktop
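
The working-directory check above can be automated with a small heuristic. The sketch below assumes that a synced folder has "OneDrive" somewhere in its path, which is the usual folder naming but is not guaranteed. `is_in_onedrive` is a hypothetical helper written for this notebook, not part of Ollama.

```python
import os
from pathlib import PurePath


def is_in_onedrive(path):
    """Heuristic: True if any path component contains 'onedrive' (assumed naming)."""
    parts = PurePath(path).parts
    return any("onedrive" in part.lower() for part in parts)


cwd = os.getcwd()
if is_in_onedrive(cwd):
    print(f"Warning: {cwd} looks like a OneDrive folder; consider moving the notebook.")
else:
    print(f"OK: {cwd} does not look like a OneDrive folder.")
```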


In [None]:
import ollama
ollama.list()

## 1. Download and run GEITje from Huggingface

#### Exercise 1: Download a model from the Huggingface Hub

Steps:
1. Go to the Huggingface Hub.
2. Find the model 'GEITje-7B-ultra-GGUF'.
3. Check the different versions that are available.
4. Under 'Files', also check the different versions and their sizes.
5. Click the button 'Use this model' to use it with Ollama.
6. Pick one version that is suitable for your laptop (small and fast enough to run).
7. Make sure you change the suggested `ollama run` command to `ollama pull <modelname>`.
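
The model name used in the pull command follows a fixed pattern: `hf.co/`, then the Hub repository, then a colon and the quantization tag. A minimal sketch of that pattern is shown here; `ollama_model_ref` is a hypothetical helper for illustration, and the repository and tag are the ones used in this notebook.

```python
def ollama_model_ref(repo_id, quantization=None):
    """Build the name Ollama uses for a GGUF model hosted on the Hub."""
    ref = f"hf.co/{repo_id.strip('/')}"
    if quantization:
        # The quantization tag selects one of the GGUF files in the repository.
        ref += f":{quantization}"
    return ref


ref = ollama_model_ref("BramVanroy/GEITje-7B-ultra-GGUF", "Q3_K_M")
print(f"ollama pull {ref}")
```

Smaller quantizations (such as Q3) need less memory and run faster, at some cost in answer quality, so the tag is the part you are most likely to change.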

In [None]:
!ollama pull hf.co/BramVanroy/GEITje-7B-ultra-GGUF:Q3_K_M

#### Exercise 2: Run this model using python

Copy and paste the code from the notebook 'ollama.ipynb' to do this.


In [8]:
# List all the models on your device
models = ollama.list()
model_list = []
for m in models.models:
    print(m.model)
    model_list.append(m.model)

hf.co/BramVanroy/fietje-2-chat-gguf:Q3_K_M
hf.co/BramVanroy/GEITje-7B-ultra-GGUF:Q3_K_M
llama3.2:latest
llama3.2:1b
moondream:1.8b
nomic-embed-text:latest
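
Not every installed model is meant for chatting: `nomic-embed-text` in the list above is an embedding model. The sketch below filters such models out before they are offered in a comparison. It assumes embedding models can be recognised by "embed" in their name, which is only a naming convention; `chat_capable` is a hypothetical helper, not an Ollama function.

```python
def chat_capable(model_names, skip_keywords=("embed",)):
    """Return the names that contain none of the skip keywords (assumed heuristic)."""
    return [
        name for name in model_names
        if not any(keyword in name.lower() for keyword in skip_keywords)
    ]


# The model names printed by the cell above.
installed = [
    "hf.co/BramVanroy/fietje-2-chat-gguf:Q3_K_M",
    "hf.co/BramVanroy/GEITje-7B-ultra-GGUF:Q3_K_M",
    "llama3.2:latest",
    "llama3.2:1b",
    "moondream:1.8b",
    "nomic-embed-text:latest",
]

for name in chat_capable(installed):
    print(name)
```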


In [6]:
# Select the model: copy and paste one of the names from the list above.
model = 'hf.co/BramVanroy/fietje-2-chat-gguf:Q3_K_M'
model

'hf.co/BramVanroy/fietje-2-chat-gguf:Q3_K_M'

In [7]:
#YOUR SOLUTION HERE
import ollama

def ask_ollama(question, model):
    """Send a question to the Ollama API and return the response text."""
    response = ollama.chat(
        model=model,
        messages=[
            {'role': 'user', 'content': question},
        ],
    )

    return response['message']['content']

ask_ollama('vertel een mop', model)  # Dutch for 'tell a joke'

'Wat dacht je van deze humoristische mop: "Het was een nogal saaie zaterdagmiddag voor mij, en ik wilde wat opfleuren. Toen besloot ik om wat gedichten te schrijven die ik al lang had verzameld. Ik had net een kop thee gehad toen ik eindelijk iets had dat ik zeker wist: het was de beste thee ooit!"'

## 2. Download and run FIETje

Find Fietje at https://huggingface.co/BramVanroy/fietje-2-chat-gguf

In [4]:
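# Use `pull` rather than `run`: `run` opens an interactive chat session, which blocks the notebook cell.
!ollama pull hf.co/BramVanroy/fietje-2-chat-gguf:Q3_K_M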

pulling manifest
pulling 3344578b4f11...   0% ▕                ▏    0 B/1.4 GB

## 3. Compare two models
We can compare two models side by side by running the code below. 

In [11]:

import gradio as gr

# Add or modify this list based on the models available on your device.
models = model_list  # TODO: change models!

def compare_models(prompt, model1, model2):
    """Send the same prompt to both models and return both responses."""
    response1 = ask_ollama(prompt, model1)
    response2 = ask_ollama(prompt, model2)
    return response1, response2

iface = gr.Interface(
    fn=compare_models,
    inputs=[
        gr.Textbox(label="Enter your prompt"),
        gr.Dropdown(choices=models, label="Select Model 1"),
        gr.Dropdown(choices=models, label="Select Model 2")
    ],
    outputs=[
        gr.Textbox(label="Model 1 Response"),
        gr.Textbox(label="Model 2 Response")
    ],
    title="LLM Model Comparison Arena",
    description="Compare responses from two different LLM models side by side."
)

iface.launch()

* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
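
Besides reading the two answers, it can help to see how long each model takes, since speed is one reason to pick a smaller version. The sketch below is one possible way to do that, not part of the original exercise. `timed_compare` is a hypothetical helper, and `fake_ask` is a stand-in so the cell runs without an Ollama server; in the notebook you would pass `ask_ollama` and real model names instead.

```python
import time


def timed_compare(prompt, model_names, ask):
    """Call ask(prompt, model) for each model; return {model: (answer, seconds)}."""
    results = {}
    for name in model_names:
        start = time.perf_counter()
        answer = ask(prompt, name)
        results[name] = (answer, time.perf_counter() - start)
    return results


def fake_ask(prompt, model):
    """Stand-in for ask_ollama so this sketch runs without an Ollama server."""
    return f"[{model}] {prompt}"


results = timed_compare("vertel een mop", ["model-a", "model-b"], fake_ask)
for name, (answer, seconds) in results.items():
    print(f"{name}: {seconds:.3f}s -> {answer}")
```

The first call to a model usually includes the time needed to load it into memory, so a fair comparison runs each model more than once.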


