# Simple Ollama Environment – Binder-compatible Notebook

This notebook sets up a **local Ollama server** inside a Jupyter session (for example, on [**mybinder.org**](https://mybinder.org)) without requiring `sudo` or Docker.

It will:

1. Install the **Ollama Python client** (`ollama`) and `requests`.
2. Download the **Linux AMD64 Ollama tarball** (`ollama-linux-amd64.tgz`).
3. Extract it into your home directory (e.g. `~/ollama`).
4. Start `ollama serve` locally on `http://127.0.0.1:11434`.
5. Pull a small default model: `qwen2.5:0.5b-instruct`.
6. Run a minimal **Python chat example** using `ollama.chat()`.

The setup works on Binder and in any other Jupyter environment running on x86-64 (AMD64) Linux, since it downloads the Linux AMD64 build of Ollama.
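Because the tarball targets Linux on AMD64, it can help to check the platform before running the remaining cells. The following is a minimal sketch using only the standard library; the helper name `is_linux_amd64` is hypothetical and not part of Ollama or this notebook.

```python
import platform


def is_linux_amd64(system: str, machine: str) -> bool:
    """Return True when the OS/CPU pair matches the linux-amd64 tarball."""
    # platform.machine() usually reports "x86_64" on Linux; "amd64" is
    # accepted as well because some platforms use that spelling.
    return system.lower() == "linux" and machine.lower() in {"x86_64", "amd64"}


if is_linux_amd64(platform.system(), platform.machine()):
    print("Platform looks compatible with ollama-linux-amd64.tgz")
else:
    print("This notebook expects Linux on AMD64, found:",
          platform.system(), platform.machine())
```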


## 0. How to use this notebook with Binder

**A. Add the notebook to your repo**

1. Save this file as `simple-ollama-environment-binder.ipynb`.
2. Commit and push it to your Git repository (e.g. on GitHub or GitLab).

**B. (Optional but recommended) Add `requirements.txt`**

In the root of your repo, create a `requirements.txt` with at least:

```text
ollama
requests
```

Binder will then preinstall these dependencies when building the image. (If you skip this,
the notebook will still `pip install` them at runtime, but that is a bit slower.)

**C. Create a Binder link**

You can use a URL like this (replace placeholders in ALL CAPS):

```text
https://mybinder.org/v2/gh/USER/REPO/BRANCH?labpath=simple-ollama-environment-binder.ipynb
```

Example:

```text
https://mybinder.org/v2/gh/myname/simple-ollama-environment/HEAD?labpath=simple-ollama-environment-binder.ipynb
```
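If you generate such links for several repositories, the URL pattern above can be assembled programmatically. This is a small sketch, assuming a GitHub-hosted repo; `binder_url` is a hypothetical helper, not part of any Binder tooling.

```python
from urllib.parse import quote


def binder_url(user: str, repo: str, ref: str, notebook: str) -> str:
    """Build a mybinder.org link that opens `notebook` in JupyterLab."""
    # The notebook path is a query-string value, so percent-encode it fully
    # (a "/" in a nested path becomes "%2F").
    labpath = quote(notebook, safe="")
    return f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}?labpath={labpath}"


print(binder_url("myname", "simple-ollama-environment", "HEAD",
                 "simple-ollama-environment-binder.ipynb"))
```

With these arguments the function reproduces the example link shown above.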

**D. Once Binder launches**

1. Wait for the Binder image to build and JupyterLab to start.
2. The notebook should open automatically (or click it in the file browser).
3. Run the cells from top to bottom:
   - **Cell 1** – installs Python dependencies.
   - **Cells 2–4** – configure paths, then download and start the local Ollama server.
   - **Cell 5** – pulls the `qwen2.5:0.5b-instruct` model.
   - **Cell 6** – sends a test chat to the model.
4. You can then adapt the last cell(s) for your own prompts and use-cases.


## 1. Install Python dependencies

We install:

- `ollama` – the official Python client
- `requests` – to download the server tarball and poll the API

If you're on Binder **and** you added these to `requirements.txt`, this step will be quick.
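To see whether Binder already preinstalled the packages, you can query the installed distributions before running `pip`. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("ollama", "requests"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```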


In [None]:
!pip install -q ollama requests

import ollama
import requests

# Falls back to "unknown" if the package does not expose __version__.
print("ollama Python client version:", getattr(ollama, "__version__", "unknown"))


## 2. Configure paths and environment variables

We install the Ollama binary into `~/ollama` (so the executable ends up at `~/ollama/bin/ollama`)
and store models under `~/ollama-models`.

This follows the same pattern commonly used on HPC / no-sudo environments:

```bash
mkdir -p ollama
tar -C ollama -xzvf ollama-linux-amd64.tgz
./ollama/bin/ollama --version
```


In [None]:
import os
from pathlib import Path

HOME = Path.home()
OLLAMA_DIR = HOME / "ollama"
OLLAMA_MODELS_DIR = HOME / "ollama-models"
OLLAMA_TARBALL = HOME / "ollama-linux-amd64.tgz"

OLLAMA_DIR.mkdir(parents=True, exist_ok=True)
OLLAMA_MODELS_DIR.mkdir(parents=True, exist_ok=True)

# Environment variables used by the Ollama server
os.environ["OLLAMA_MODELS"] = str(OLLAMA_MODELS_DIR)
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"

OLLAMA_BIN = OLLAMA_DIR / "bin" / "ollama"

print("HOME              =", HOME)
print("OLLAMA_DIR        =", OLLAMA_DIR)
print("OLLAMA_MODELS_DIR =", OLLAMA_MODELS_DIR)
print("OLLAMA_BIN        =", OLLAMA_BIN)
print("OLLAMA_HOST       =", os.environ["OLLAMA_HOST"])


## 3. Download and unpack the Ollama Linux AMD64 tarball

This cell downloads `ollama-linux-amd64.tgz` and extracts it into `OLLAMA_DIR`.
If the binary already exists, it will skip the download.
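The extraction step, and the reason the binary ends up under `bin/ollama`, can be tried out offline with a tiny stand-in archive. The sketch below builds a placeholder `.tgz` in memory (it is not the real Ollama tarball) and extracts it into a temporary directory; `make_demo_tarball` and `extract_tarball` are hypothetical helper names.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_demo_tarball() -> bytes:
    """Build a tiny in-memory .tgz containing a placeholder bin/ollama file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        payload = b"#!/bin/sh\necho placeholder\n"
        info = tarfile.TarInfo(name="bin/ollama")
        info.size = len(payload)
        info.mode = 0o755  # executable bit, as a real binary would have
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def extract_tarball(data: bytes, target_dir: Path) -> Path:
    """Extract gzip-compressed tar bytes into target_dir; return the binary path."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        tar.extractall(path=target_dir)
    return target_dir / "bin" / "ollama"


with tempfile.TemporaryDirectory() as tmp:
    binary = extract_tarball(make_demo_tarball(), Path(tmp))
    print("Extracted:", binary.relative_to(tmp), "exists =", binary.exists())
```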


In [None]:
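import subprocess
import tarfile

OLLAMA_URL = "https://ollama.com/download/ollama-linux-amd64.tgz"

if not OLLAMA_BIN.exists():
    print("Downloading:", OLLAMA_URL)
    # Stream the tarball to disk in chunks instead of holding it all in
    # memory; the archive can be large and Binder sessions have limited RAM.
    with requests.get(OLLAMA_URL, stream=True, timeout=60) as r:
        r.raise_for_status()
        with open(OLLAMA_TARBALL, "wb") as f:
            for chunk in r.iter_content(chunk_size=1024 * 1024):
                f.write(chunk)
    print("Download size (bytes):", OLLAMA_TARBALL.stat().st_size)

    print("Extracting to", OLLAMA_DIR)
    with tarfile.open(OLLAMA_TARBALL, mode="r:gz") as tar:
        tar.extractall(path=OLLAMA_DIR)
    OLLAMA_TARBALL.unlink()  # free disk space once extracted
else:
    print("Ollama binary already exists at:", OLLAMA_BIN)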

print("Verifying Ollama binary and version...")
if not OLLAMA_BIN.exists():
    raise FileNotFoundError(f"Expected Ollama binary at {OLLAMA_BIN}, but it was not found.")

result = subprocess.run([str(OLLAMA_BIN), "--version"], capture_output=True, text=True)
print("Return code:", result.returncode)
print(result.stdout or result.stderr)


## 4. Start the Ollama server inside this Jupyter session

This starts `ollama serve` in the background and waits until the API at
`http://127.0.0.1:11434/api/tags` responds.
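The readiness wait is a plain poll-until-ready loop. Here is a self-contained sketch of that pattern with a stand-in check function, so it runs without a server; `wait_until_ready` is a hypothetical helper, not part of the Ollama client.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool], timeout: float = 60.0,
                     interval: float = 1.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            # Treat any error (e.g. connection refused) as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in check that succeeds on the third attempt.
attempts = {"count": 0}


def fake_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


print("ready:", wait_until_ready(fake_check, timeout=5, interval=0.01))
```

In this notebook, the check could be something like `lambda: requests.get(tags_url, timeout=2).ok`.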


In [None]:
import time

base_url = os.environ.get("OLLAMA_HOST", "http://127.0.0.1:11434")
tags_url = base_url.rstrip("/") + "/api/tags"

# Reuse the server from an earlier run of this cell if it is still alive.
server = globals().get("OLLAMA_SERVER_PROCESS")
if server is not None and server.poll() is None:
    print("Ollama server is already running with PID", server.pid)
else:
    if server is not None:
        print("Previous Ollama server process has exited; starting a new one...")
    print("Starting Ollama server...")
    OLLAMA_SERVER_PROCESS = subprocess.Popen(
        [str(OLLAMA_BIN), "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )
    time.sleep(2)  # give the process a moment before polling

print("Waiting for Ollama server to become ready at", tags_url)
for i in range(60):
    try:
        r = requests.get(tags_url, timeout=2)
        if r.ok:
            print("✅ Ollama server is up!")
            break
    except requests.RequestException:
        pass  # server is not accepting connections yet; retry
    time.sleep(1)
else:
    print("⚠️ Server did not become ready within 60 seconds.")
    print("You may want to rerun this cell or check that the binary is executable.")


## 5. Pull a small default model: `qwen2.5:0.5b-instruct`

This pulls a relatively small model from the Ollama library. You can change
`MODEL_NAME` to any other Ollama model tag you prefer.
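After a pull, you can confirm the model is present by inspecting the JSON returned by `/api/tags`. The sketch below assumes the response body contains a `models` list whose entries carry a `name` field; the sample payload is hand-written for illustration, not real server output, and `model_is_available` is a hypothetical helper.

```python
def model_is_available(tags_payload: dict, model_name: str) -> bool:
    """Return True if `model_name` is listed in an /api/tags response body."""
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return model_name in names


# Hand-written sample payload (assumed shape, for illustration only).
sample = {"models": [{"name": "qwen2.5:0.5b-instruct"}]}
print(model_is_available(sample, "qwen2.5:0.5b-instruct"))
print(model_is_available(sample, "some-other-model:latest"))
```

Against the running server, the payload would come from `requests.get(tags_url, timeout=5).json()`.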


In [None]:
MODEL_NAME = "qwen2.5:0.5b-instruct"  # change this tag if you want a different model

print(f"Pulling model: {MODEL_NAME} ...")
pull_proc = subprocess.run([str(OLLAMA_BIN), "pull", MODEL_NAME])
if pull_proc.returncode == 0:
    print("✅ Model pulled successfully.")
else:
    print("⚠️ Model pull returned a non-zero exit code.")


## 6. Quickstart: chat with the model via Python

Now that the server is running and the model is pulled, we can send a simple
chat message using the **Ollama Python client**.
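For longer answers, the client can also stream the reply piece by piece when `stream=True` is passed to `ollama.chat()`. The sketch below shows one way to print and collect such fragments; it runs on stand-in chunks shaped like streamed chat responses, so no server is needed, and `collect_stream` is a hypothetical helper.

```python
from typing import Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping]) -> str:
    """Print streamed chat fragments as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        piece = chunk["message"]["content"]
        print(piece, end="", flush=True)
        parts.append(piece)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks (illustrative only, not real model output).
fake_chunks = [
    {"message": {"role": "assistant", "content": "Ciao"}},
    {"message": {"role": "assistant", "content": "!"}},
]
full_text = collect_stream(fake_chunks)
```

With a live server, the same function could consume `ollama.chat(model=MODEL_NAME, messages=[...], stream=True)`.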


In [None]:
import ollama

# Default to the local server if OLLAMA_HOST is not set yet
# (section 2 normally sets it before this cell runs).
os.environ.setdefault("OLLAMA_HOST", "http://127.0.0.1:11434")

MODEL_NAME = globals().get("MODEL_NAME", "qwen2.5:0.5b-instruct")

prompt = (
    "Say only 'Ciao!' in Italian, then give me 1 very short tip "
    "for studying better (in Italian)."
)

response = ollama.chat(
    model=MODEL_NAME,
    messages=[{"role": "user", "content": prompt}],
)

print("--- Model response ---\n")
print(response["message"]["content"])


## 7. (Optional) Stop the Ollama server

This will gracefully stop the background `ollama serve` process. If you are on Binder,
the entire container will also be discarded when you close the session.
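The shutdown follows a terminate-then-kill pattern: ask the process to exit, and force it only if it does not comply in time. Here is a self-contained sketch of that pattern using a sleeping child process as a stand-in for the server; `stop_process` is a hypothetical helper.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 10.0) -> int:
    """Terminate `proc`, escalating to kill if it does not exit in time."""
    if proc.poll() is None:
        proc.terminate()  # polite request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # forceful stop (SIGKILL on POSIX)
            proc.wait()  # reap the process so it does not linger as a zombie
    return proc.returncode


# Stand-in for the server: a child Python process that just sleeps.
sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print("exit code:", stop_process(sleeper))
```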


In [None]:
if 'OLLAMA_SERVER_PROCESS' in globals() and OLLAMA_SERVER_PROCESS.poll() is None:
    print("Stopping Ollama server (PID", OLLAMA_SERVER_PROCESS.pid, ")...")
    OLLAMA_SERVER_PROCESS.terminate()
    try:
        OLLAMA_SERVER_PROCESS.wait(timeout=10)
        print("Ollama server stopped.")
    except subprocess.TimeoutExpired:
        print("Process did not exit in time, killing...")
        OLLAMA_SERVER_PROCESS.kill()
        OLLAMA_SERVER_PROCESS.wait()  # reap the killed process
else:
    print("No running Ollama server process found.")
