# Homework 2: Open-Source Models with Ollama

In this notebook, we run open-source LLMs with Ollama, a tool for running LLMs locally. Ollama handles downloading, installing, and interacting with a range of models, so once a model is downloaded it can be used offline or in places with limited internet connectivity. It also exposes a local API that applications and workflows can call, which makes it a convenient tool for experimentation, learning, and practical deployment.

To run Ollama locally, we use the official container image from Docker Hub. The following command pulls the image if it is not already present and starts the container in the background:

```bash
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

The `/root/.ollama` directory is where Ollama stores downloaded models and their metadata. We mount a named volume at this path so the models persist when the container is removed, and so the same volume can be attached to another Ollama container, giving every container access to the same model storage.

After executing the previous command, we confirm that the container is running by asking it for the Ollama version:

In [1]:
!docker exec ollama ollama -v

ollama version is 0.1.48
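
Because port 11434 is published to the host, the same check can also be made over HTTP. The following is a minimal sketch, assuming the server exposes a `GET /api/version` endpoint that returns a JSON object with a `version` field; the helper names `parse_version` and `get_ollama_version` are illustrative and not part of Ollama.

```python
import json
import urllib.request


def parse_version(body: bytes) -> str:
    """Extract the version string from the JSON body of /api/version."""
    return json.loads(body)["version"]


def get_ollama_version(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the server version, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return parse_version(resp.read())
    except OSError:  # URLError and socket timeouts are both OSError subclasses
        return None
```

Calling `get_ollama_version()` while the container is running should return the same version string that `ollama -v` prints; a return value of `None` suggests the container is not up or the port is not published.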


We will download a small LLM called `gemma:2b`. To do this, we run the following command inside the Ollama container (or prefix it with `docker exec ollama` to run it from the host):

```bash
ollama pull gemma:2b
```

Inside the container, the downloaded files are saved under `/root/.ollama`. For this homework, we are particularly interested in the model's metadata (its manifest), which is stored under `models/manifests/registry.ollama.ai/library`.

For a list of available models, visit [Ollama's model library](https://ollama.com/library).

In [2]:
import subprocess
import json

# Pull the model; progress output is discarded, but a failed pull raises an error
subprocess.run("docker exec ollama ollama pull gemma:2b".split(),
               stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=True)

# Read the model's manifest from inside the container
result = subprocess.run(
    "docker exec ollama cat /root/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b".split(),
    capture_output=True,
    text=True,
    check=True,
)

# Parse the manifest (a JSON document) from standard output
gemma_metadata = json.loads(result.stdout)

# Print the formatted JSON
print("Formatted JSON Output:")
print(json.dumps(gemma_metadata, indent=4))


Formatted JSON Output:
{
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "digest": "sha256:887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290",
        "size": 483
    },
    "layers": [
        {
            "mediaType": "application/vnd.ollama.image.model",
            "digest": "sha256:c1864a5eb19305c40519da12cc543519e48a0697ecd30e15d5ac228644957d12",
            "size": 1678447520
        },
        {
            "mediaType": "application/vnd.ollama.image.license",
            "digest": "sha256:097a36493f718248845233af1d3fefe7a303f864fae13bc31a3a9704229378ca",
            "size": 8433
        },
        {
            "mediaType": "application/vnd.ollama.image.template",
            "digest": "sha256:109037bec39c0becc8221222ae23557559bc594290945a2c4221ab4f303b8871",
            "size": 136
        },
        {
            "medi

We can now test this LLM by executing `ollama run gemma:2b` inside the container. This starts an interactive session in which we can ask the model questions. Here is an example where we ask the model what ten times ten is:

![](./img/question_3.png)

Instead of a named volume, we can also create the container with a bind mount. This gives us control over where the files are stored on the host, rather than relying on the default location Docker uses for named volumes.

A practical use case is keeping the model data in a specific folder after running `ollama pull`. This is particularly useful when building custom Ollama images: instead of downloading the model from the Ollama library every time, we can copy the files that the bind mount saved on the host.

Let's run a container with this new configuration:

First, create a directory to store the files:

```bash
mkdir ollama_files
```
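
Then, run the Docker container with the bind mount and pull the model:

```bash
# An absolute host path is used because older Docker versions reject relative bind-mount paths
docker run -it \
    --rm \
    -v "$(pwd)/ollama_files":/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama

# The container above runs in the foreground, so run this from a second terminal
docker exec -it ollama ollama pull gemma:2b
```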

We will now use the `du` (disk usage) command to check the size of the model we just downloaded. The `--si` option displays sizes in powers of 1000 (rather than 1024), and the `-s` option summarizes the size of the folder without listing individual files.

```bash
du --si -s ollama_files/models
```

In [3]:
result = subprocess.run("du --si -s ollama_files/models".split(), capture_output=True, text=True)
print(result.stdout)

1.7G	ollama_files/models
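
The figure reported by `du` can be cross-checked against the manifest printed earlier, since each entry there records its size in bytes. The sketch below sums those sizes; the helper names `total_size_bytes` and `to_si` are illustrative, and the sample dictionary contains only the entries visible in the truncated output, so the real total is slightly larger.

```python
def total_size_bytes(manifest: dict) -> int:
    """Sum the sizes of the config blob and every layer listed in a manifest."""
    layers = manifest.get("layers", [])
    return manifest["config"]["size"] + sum(layer["size"] for layer in layers)


def to_si(num_bytes: float) -> str:
    """Format a byte count in powers of 1000, roughly like `du --si`."""
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if num_bytes < 1000 or unit == "TB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1000


# Only the sizes visible in the truncated manifest output; further layers exist.
partial_manifest = {
    "config": {"size": 483},
    "layers": [{"size": 1678447520}, {"size": 8433}, {"size": 136}],
}
print(to_si(total_size_bytes(partial_manifest)))  # prints "1.7 GB"
```

In the notebook, passing the full `gemma_metadata` dictionary to `total_size_bytes` would include the remaining layers as well. The model weights layer dominates the total, which is why the partial sum already agrees with `du`.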



Now that we have downloaded the files for this model, we can create our own Ollama-based container and copy the model directly into it. To do this, we will create a Dockerfile that copies these files:

In [4]:
%%writefile Dockerfile
FROM ollama/ollama

COPY ./ollama_files /root/.ollama

EXPOSE 11434

CMD ["serve"]

Writing Dockerfile


The `CMD ["serve"]` line is included only for readability, since `serve` is already the default command of the base image. This can be verified by running `docker inspect ollama/ollama`. Let's now build the custom image:

```bash
docker build -t ollama-gemma2b .
```

And run it:

```bash
docker run -it --rm -p 11434:11434 ollama-gemma2b
```
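
The statement that `serve` is the base image's default command can also be checked programmatically. This is a sketch, assuming `docker inspect` prints a JSON array whose first element has a `Config` object with `Entrypoint` and `Cmd` keys; the function names are illustrative.

```python
import json
import subprocess


def entrypoint_and_cmd(inspect_json: str):
    """Return (Entrypoint, Cmd) from the JSON text printed by `docker inspect`."""
    config = json.loads(inspect_json)[0]["Config"]
    return config.get("Entrypoint"), config.get("Cmd")


def inspect_image(image: str = "ollama/ollama"):
    """Run `docker inspect` on a local image and extract its entrypoint and command."""
    result = subprocess.run(
        ["docker", "inspect", image], capture_output=True, text=True, check=True
    )
    return entrypoint_and_cmd(result.stdout)
```

Running `inspect_image()` on a machine where the image has been pulled returns the entrypoint and command that Docker combines when the container starts. Running it with `"ollama-gemma2b"` should give the same pair if the custom image only restates the default.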

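Ollama exposes an OpenAI-compatible API under the `/v1` path, so we can interact with the model through the OpenAI Python library. The client requires an `api_key` value, but Ollama ignores it, so any placeholder string works. Here is an example of how to set up the connection: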

In [5]:
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',
)

Now let's use the model:

In [6]:
prompt = "What's the formula for energy?"
response = client.chat.completions.create(
    model='gemma:2b',
    temperature=0,
    messages=[{"role": "user", "content": prompt}]
)

In [7]:
print(response.choices[0].message.content)

Sure, here's the formula for energy:

**E = K + U**

Where:

* **E** is the energy in joules (J)
* **K** is the kinetic energy in joules (J)
* **U** is the potential energy in joules (J)

**Kinetic energy (K)** is the energy an object possesses when it moves or is in motion. It is calculated as half the product of an object's mass (m) and its velocity (v) squared:

**K = 1/2mv^2**

**Potential energy (U)** is the energy an object possesses due to its position or configuration. It is calculated as the product of an object's mass, gravitational constant (g), and height or position above a reference point.

**U = mgh**

**Where:**

* **m** is the mass in kilograms (kg)
* **g** is the acceleration due to gravity in meters per second squared (m/s²)
* **h** is the height or position in meters (m)

The formula shows that energy can be expressed as the sum of kinetic and potential energy. The kinetic energy is a measure of the object's ability to do work, while the potential energy is a measur

In [8]:
response.usage

CompletionUsage(completion_tokens=283, prompt_tokens=34, total_tokens=317)
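
To make explicit what the OpenAI client does here, the same request can be sketched with only the standard library. This assumes the endpoint is `POST /v1/chat/completions` with the usual OpenAI-style JSON body and response shape; `build_chat_request`, `extract_reply`, and `ask` are hypothetical helper names, not part of either library.

```python
import json
import urllib.request


def build_chat_request(prompt, model="gemma:2b", temperature=0.0,
                       base_url="http://localhost:11434/v1"):
    """Build the HTTP request for a single-turn chat completion."""
    payload = {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant's text out of a chat-completion response."""
    return response_body["choices"][0]["message"]["content"]


def ask(prompt, **kwargs):
    """Send the prompt to the local server and return the model's reply."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return extract_reply(json.load(resp))
```

With the container running, `ask("What's the formula for energy?")` should return text comparable to the response above. The response body also carries a `usage` object with the token counts that `response.usage` reports.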