# ollama Colab demo

Version 2: in-thread

## Step 1: runtime

Set your Colab runtime to "T4 GPU" through the _Runtime_ > _Change runtime type_ menu.

After that, execute the next cells.

## Step 2: install pciutils and ollama

The installer's warning about systemd can be ignored: Step 3 starts the server by hand instead of relying on a systemd service.

In [None]:
!sudo apt update
!sudo apt install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh

## Step 3: start ollama in a thread, from Python

You can launch a long-running process right from within Python and leave it running while you move on to the next cells:

In [None]:
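import threading
import subprocess
import time

def run_ollama_serve():
    # Popen returns immediately; the server keeps running as a child process.
    subprocess.Popen(["ollama", "serve"])

# Start the server from a background thread so this cell does not block.
thread = threading.Thread(target=run_ollama_serve)
thread.start()
# Give the server a few seconds to start listening before the next cells run.
time.sleep(5)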
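
The fixed `time.sleep(5)` is a guess at how long the server needs to come up. A minimal sketch of an alternative is to poll the server until it answers. It assumes the server listens on ollama's default address, `http://127.0.0.1:11434`; the helper name `wait_for_ollama` and its parameters are made up for this illustration and are not part of ollama.

```python
import time
import urllib.request


def wait_for_ollama(url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    The default URL assumes ollama's default host and port.
    Returns True if the server answered in time, False otherwise.
    """
    # Bypass any proxy settings from the environment: the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=2) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers refused connections and urllib errors:
            # the server is not accepting requests yet.
            pass
        time.sleep(interval)
    return False
```

In the notebook, calling `wait_for_ollama()` right after `thread.start()` could stand in for the sleep; a `False` result would mean the server did not come up within the timeout.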

## Step 4: launch ollama commands

Even with ollama started from Python, regular bang-commands reach the listening server and work:

In [None]:
!ollama pull llama3.2
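
The same listening server can also be queried from Python over HTTP. As a rough sketch, the helper below asks the server's `/api/tags` endpoint which models are available locally. It assumes the default address `http://127.0.0.1:11434`; the function name `list_local_models` is hypothetical and chosen only for this example.

```python
import json
import urllib.request


def list_local_models(base_url="http://127.0.0.1:11434"):
    """Return the names of the models known to the server at `base_url`.

    The default URL assumes ollama's default host and port.
    """
    # Bypass any proxy settings from the environment: the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/api/tags", timeout=10) as response:
        payload = json.load(response)
    # The response is expected to hold a "models" list of objects with a "name".
    return [model["name"] for model in payload.get("models", [])]
```

Once the pull above has finished, `list_local_models()` should include an entry for `llama3.2`.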

## Step 5: use ollama from Python

A small demonstration using LangChain (one of several possible ways):

In [None]:
!pip install -qU langchain-ollama

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM

template = """Question: {question}

Answer as concisely as possible.

Answer:"""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="llama3.2")

chain = prompt | model

In [None]:
chain.invoke({"question": "What are the evolutionary reasons for the dikaryophase in higher Eumycota?"})