# ollama

[![Index](https://img.shields.io/badge/Index-blue)](../index.ipynb)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/digillia/Digillia-Colab/blob/main/tools/ollama.ipynb)

Like [vLLM](./vllm.ipynb), ollama is a fast, easy-to-use tool for LLM inference and serving, aimed at running models locally.

Docs:
- https://github.com/ollama/ollama
- https://ollama.com/

## Installation

> <p style="color:red;">ollama should be installed and running.</p>

Download and install from https://ollama.com/download.

In [15]:
import os
import sys

# Uncomment to install
# !pip3 install -qU -r ../requirements.txt

# Always install on Google Colab and GitHub (CI)
if 'google.colab' in sys.modules or 'CI' in os.environ:
    !wget https://ollama.ai/install.sh -O /tmp/ollama.sh
    !bash /tmp/ollama.sh
    !pip3 install -qU ollama

## Running in the Background

In [None]:
import subprocess
import time
import os

def kill_ollama():
    # Kill any existing Ollama processes
    try:
        subprocess.run(['pkill', 'ollama'], check=False)
        time.sleep(2)
        print("Cleaned up any existing Ollama processes")
    except OSError:
        # pkill is not available (e.g. on Windows); nothing to clean up
        pass

# Launch Ollama in the background
def start_ollama():
    try:
        # Start Ollama serve in background
        process = subprocess.Popen(['ollama', 'serve'],
                                 # Discard server logs: an unread PIPE can fill up and block the server
                                 stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL,
                                 preexec_fn=os.setsid if os.name != 'nt' else None)
        
        # Give it a moment to start
        time.sleep(10)
        
        print("Ollama started in background")
        return process
    except Exception as e:
        print(f"Error starting Ollama: {e}")
        return None

if 'google.colab' in sys.modules or 'CI' in os.environ:
    # Clean up any existing Ollama processes
    kill_ollama()
    # Start Ollama
    ollama_process = start_ollama()
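
The fixed `time.sleep(10)` in `start_ollama` can be too short on a slow machine and wastes time on a fast one. A minimal sketch of polling the server instead, assuming it listens on its default address `http://127.0.0.1:11434`; `wait_until_ready` is a hypothetical helper, not part of ollama.

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server is not accepting connections yet; retry after a short pause
            pass
        time.sleep(interval)
    return False
```

Calling `wait_until_ready()` right after `subprocess.Popen(...)` would return `True` as soon as the server responds, and `False` if it never comes up within the timeout.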

In [17]:
!ollama --version

ollama version is 0.9.6


## Loading the Model

In [18]:
import ollama

In [19]:
ollama.list()

ListResponse(models=[])

In [20]:
# Pull the Gemma3n model
# https://ollama.com/search
ollama.pull('gemma3n:latest')

ProgressResponse(status='success', completed=None, total=None, digest=None)
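
The call above blocks silently until the download finishes. As a sketch, `ollama.pull` can also be called with `stream=True`, in which case it yields progress updates carrying the same `status`, `completed` and `total` fields as the response shown above. The helpers `progress_line` and `pull_with_progress` are hypothetical names introduced for illustration.

```python
def progress_line(status, completed, total):
    """Format one pull progress update as a single text line."""
    if completed is not None and total:
        return f"{status}: {100 * completed / total:.1f}%"
    # No byte counts available (e.g. the final 'success' update)
    return status

def pull_with_progress(model):
    """Pull `model` and print progress updates as they arrive."""
    import ollama  # imported here so progress_line works without the package
    for update in ollama.pull(model, stream=True):
        print(progress_line(update.status, update.completed, update.total))
```

For example, `pull_with_progress('gemma3n:latest')` would print one line per update instead of waiting silently.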

In [21]:
ollama.list()

ListResponse(models=[Model(model='gemma3n:latest', modified_at=datetime.datetime(2025, 7, 27, 13, 6, 38, 43613, tzinfo=TzInfo(+02:00)), digest='15cb39fd9394fd2549f6df9081cfc84dd134ecf2c9c5be911e5629920489ac32', size=7547589116, details=ModelDetails(parent_model='', format='gguf', family='gemma3n', families=['gemma3n'], parameter_size='6.9B', quantization_level='Q4_K_M'))])
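
The listing above can also be used to avoid pulling a model that is already present. A minimal sketch, assuming only the `models` list and the per-model `model` field visible in the output; `is_installed` is a hypothetical helper.

```python
def is_installed(name, list_response):
    """Return True if a model called `name` appears in an ollama.list() response."""
    return any(entry.model == name for entry in list_response.models)
```

A guarded pull would then read `if not is_installed('gemma3n:latest', ollama.list()): ollama.pull('gemma3n:latest')`.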

## Interacting with the Model

In [22]:
import nest_asyncio
nest_asyncio.apply()

class Client:
    def __init__(self, model="gemma3n:latest"):
        self.client = ollama.AsyncClient()
        self.model = model
        self.messages = []

    async def chat(self, content: str):
        self.messages.append({"role": "user", "content": content})
        response = await self.client.chat(self.model, messages=self.messages)
        msg = response.message.model_dump() # pydantic
        self.messages.append(msg)
        return msg["content"]

client = Client()
text = await client.chat("Quand la France a-t-elle gagné la Coupe du Monde de football pour la première fois ?")
print(text)

La France a gagné la Coupe du Monde de football pour la première fois en **1998**. 

Ils ont remporté le tournoi qui s'est déroulé en France, en battant l'Italie en finale sur le score de 1-0.
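
The `chat` method above waits for the complete reply before returning. As a sketch, the same `AsyncClient.chat` call accepts `stream=True` and then returns an async iterator of partial responses. The hypothetical helper `stream_reply` below consumes such an iterator, assuming each chunk exposes `message.content` as in the non-streaming response.

```python
async def stream_reply(chunks, on_text=print):
    """Forward streamed text fragments to `on_text` and return the full reply."""
    pieces = []
    async for chunk in chunks:
        text = chunk.message.content or ""  # content may be empty on some chunks
        on_text(text)
        pieces.append(text)
    return "".join(pieces)

# Possible use in a notebook cell (requires a running ollama server):
# stream = await ollama.AsyncClient().chat(
#     model="gemma3n:latest", messages=messages, stream=True)
# text = await stream_reply(stream, on_text=lambda t: print(t, end="", flush=True))
```

Returning the joined text lets the caller append the full assistant message to the history once the stream ends.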



In [23]:
import pprint
pprint.pp(client.messages)

[{'role': 'user',
  'content': 'Quand la France a-t-elle gagné la Coupe du Monde de football '
             'pour la première fois ?'},
 {'role': 'assistant',
  'content': 'La France a gagné la Coupe du Monde de football pour la première '
             'fois en **1998**. \n'
             '\n'
             "Ils ont remporté le tournoi qui s'est déroulé en France, en "
             "battant l'Italie en finale sur le score de 1-0.\n",
  'thinking': None,
  'images': None,
  'tool_calls': None}]
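
The stored assistant message carries `thinking`, `images` and `tool_calls` fields that are all `None`. A minimal sketch of dropping such fields before appending to the history; `compact` is a hypothetical helper. Since the message is a pydantic model, `model_dump(exclude_none=True)` should give a similar result directly.

```python
def compact(message):
    """Return a copy of a message dict without its None-valued fields."""
    return {key: value for key, value in message.items() if value is not None}

message = {"role": "assistant", "content": "...", "thinking": None,
           "images": None, "tool_calls": None}
print(compact(message))  # {'role': 'assistant', 'content': '...'}
```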


## Cleanup

In [24]:
ollama.delete('gemma3n:latest')
ollama.list()

ListResponse(models=[])

In [None]:
if 'google.colab' in sys.modules or 'CI' in os.environ:
    # Terminate Ollama
    if ollama_process and ollama_process.poll() is None:
        ollama_process.kill()
        ollama_process.wait(timeout=5)
        print("Ollama process killed")
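
The cell above calls `kill()` straight away, which gives the server no chance to shut down cleanly. A minimal sketch of a gentler alternative that tries `terminate()` first and only falls back to `kill()` on timeout; `stop_process` is a hypothetical helper that works on any `subprocess.Popen` object.

```python
import subprocess

def stop_process(process, timeout=5.0):
    """Ask `process` to exit with terminate(); fall back to kill() on timeout."""
    if process.poll() is not None:
        return process.returncode  # already exited
    process.terminate()  # SIGTERM on POSIX
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX
        return process.wait()
```

With this helper, the cleanup would become `stop_process(ollama_process)` inside the same `if ollama_process` guard.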