In this homework, we'll experiment more with Ollama.

Q1. Running Ollama with Docker

Let's run ollama with Docker. We will need to execute the same command as in the lectures:

docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama

What's the version of the Ollama client?

To find out, enter the container and execute ollama with the -v flag.

answer: \
@liveisliiife ➜ /workspaces/llm-zoomcamp (main) $ docker exec -it ollama bash \
root@95a901b51738:/# ollama -v  \
ollama version is 0.3.5  
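The same version number can also be read over HTTP without entering the container. This is a minimal sketch, assuming the server is reachable at http://localhost:11434 (the port published above) and that it exposes the /api/version endpoint described in the Ollama API docs. The helper names `parse_version` and `fetch_version` are made up for this example.

```python
import json
from urllib.request import urlopen


def parse_version(body):
    """Extract the version string from the JSON body of /api/version."""
    return json.loads(body)["version"]


def fetch_version(base_url="http://localhost:11434"):
    """Ask a running Ollama server for its version (assumed endpoint: /api/version)."""
    with urlopen(f"{base_url}/api/version", timeout=5) as resp:
        return parse_version(resp.read().decode("utf-8"))


# Offline check with a body shaped like the server's reply.
sample_version = parse_version('{"version": "0.3.5"}')
print(sample_version)
```

With the container from Q1 running, `fetch_version()` should return the same string that `ollama -v` prints.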

Q2. Downloading an LLM

We will download a smaller LLM, gemma:2b.

Again let's enter the container and pull the model:

ollama pull gemma:2b 

In Docker, the results are saved to /root/.ollama

We're interested in the metadata about this model. You can find it in models/manifests/registry.ollama.ai/library

What's the content of the file related to gemma?

answer: \
{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json","config":  {"mediaType":"application/vnd.docker.container.image.v1+json","digest":"sha256:887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290","size":483},"layers":   [{"mediaType":"application/vnd.ollama.image.model","digest":"sha256:c1864a5eb19305c40519da12cc543519e48a0697ecd30e15d5ac228644957d12","size":1678447520},{"mediaType":"application/vnd.ollama.image.license","digest":"sha256:097a36493f718248845233af1d3fefe7a303f864fae13bc31a3a9704229378ca","size":8433},{"mediaType":"application/vnd.ollama.image.template","digest":"sha256:109037bec39c0becc8221222ae23557559bc594290945a2c4221ab4f303b8871","size":136},{"mediaType":"application/vnd.ollama.image.params","digest":"sha256:22a838ceb7fb22755a3b0ae9b4eadde629d19be1f651f73efb8c6b4e2cd0eea0","size":84}]}
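The manifest is plain JSON, so it's easy to see how large each blob is. Here is a small sketch that parses a trimmed copy of the manifest (digests left out, sizes copied verbatim) and adds up the sizes. `blob_sizes` is a helper name invented for this example.

```python
import json

# Trimmed copy of the manifest above: digests are omitted, sizes are verbatim.
MANIFEST = """
{"schemaVersion": 2,
 "config": {"mediaType": "application/vnd.docker.container.image.v1+json", "size": 483},
 "layers": [
   {"mediaType": "application/vnd.ollama.image.model", "size": 1678447520},
   {"mediaType": "application/vnd.ollama.image.license", "size": 8433},
   {"mediaType": "application/vnd.ollama.image.template", "size": 136},
   {"mediaType": "application/vnd.ollama.image.params", "size": 84}
 ]}
"""


def blob_sizes(manifest_text):
    """Map each blob's mediaType to its size in bytes (config plus layers)."""
    manifest = json.loads(manifest_text)
    entries = [manifest["config"], *manifest["layers"]]
    return {entry["mediaType"]: entry["size"] for entry in entries}


sizes = blob_sizes(MANIFEST)
total_bytes = sum(sizes.values())
for media_type, size in sizes.items():
    print(f"{size:>12}  {media_type}")
print(f"{total_bytes:>12}  total ({total_bytes / 2**30:.2f} GiB)")
```

The model layer accounts for almost all of the roughly 1.56 GiB total; the license, template, and params blobs are only a few kilobytes.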

Q3. Running the LLM

Test the following prompt: "10 * 10". What's the answer?

answer: \
/.ollama/models/manifests/registry.ollama.ai/library/gemma# ollama run gemma:2b \
\>>> 10 * 10 , what is the answer ? \
Sure, the answer is 100.

Q4. Downloading the weights

We don't want to pull the weights every time we run a docker container. Let's do it once and have them available every time we start a container.

First, we will need to change how we run the container.

Instead of mapping the /root/.ollama folder to a named volume, let's map it to a local directory:


mkdir ollama_files

docker run -it \
    --rm \
    -v ./ollama_files:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama


Now pull the model:

docker exec -it ollama ollama pull gemma:2b 

What's the size of the ollama_files/models folder?

0.6G

1.2G

1.7G

2.2G

Hint: on Linux, you can use du -h for that.

answer:

@liveisliiife ➜ /workspaces/llm-zoomcamp/02-Open Source LLMs (main) $ du -h ollama_files

1.6G    ollama_files/models/blobs

8.0K    ollama_files/models/manifests/registry.ollama.ai/library/gemma

12K     ollama_files/models/manifests/registry.ollama.ai/library

16K     ollama_files/models/manifests/registry.ollama.ai

20K     ollama_files/models/manifests

1.6G    ollama_files/models

1.6G    ollama_files

@liveisliiife ➜ /workspaces/llm-zoomcamp/02-Open Source LLMs (main) $ du -sh ollama_files

1.6G    ollama_files

du reports 1.6G, so the closest answer is 1.7G.
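If du is not available (for example on Windows), the folder size can be approximated in Python. This is a rough sketch: it sums apparent file sizes, whereas du counts disk blocks, so the two numbers can differ slightly. `dir_size_bytes` and `human_gib` are names made up for this example.

```python
import os


def dir_size_bytes(root):
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks, count only real files
                total += os.path.getsize(path)
    return total


def human_gib(num_bytes):
    """Format a byte count in GiB with one decimal, similar to du -h."""
    return f"{num_bytes / 2**30:.1f}G"


if __name__ == "__main__":
    target = os.path.join("ollama_files", "models")
    if os.path.isdir(target):
        print(human_gib(dir_size_bytes(target)), target)
```

Run from the directory that contains ollama_files, this should print a value close to the 1.6G shown by du.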



Q5. Adding the weights

Let's now stop the container and add the weights to a new image.

For that, let's create a Dockerfile:

FROM ollama/ollama

COPY ...

What do you put after COPY?

answer: ollama_files /root/.ollama (so the full line is COPY ollama_files /root/.ollama)

Q6. Serving it

Let's build it:

docker build -t ollama-gemma2b .

And run it:

docker run -it --rm -p 11434:11434 ollama-gemma2b

We can connect to it using the OpenAI client.

Let's test it with the following prompt:

prompt = "What's the formula for energy?"

Also, to make results reproducible, set the temperature parameter to 0:

response = client.chat.completions.create(
    #...
    temperature=0.0
)

How many completion tokens did you get in response?

304

604

904

1204

answer:

@liveisliiife ➜ /workspaces/llm-zoomcamp/02-Open Source LLMs (main) $ ipython \
Python 3.12.1 (main, Aug  8 2024, 18:45:38) [GCC 9.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: ls \
Dockerfile                    hw2.ipynb
__pycache__/                  ollama.ipynb
docker                        ollama_elasticsearch_docker.ipynb
docker-compose.yaml           ollama_files/
huggingface_flan_T5.ipynb     prompt.md
huggingface_mistral_7b.ipynb  qa_faq.py
huggingface_phi3.ipynb        starter.ipynb

In [2]: prompt = "What's the formula for energy?"

In [3]: from openai import OpenAI \
   ...:  \
   ...: client = OpenAI(  \
   ...:        base_url='http://localhost:11434/v1/', \
   ...:        api_key='ollama', \
   ...: ) 


In [4]: response = client.chat.completions.create( \
   ...:     model="gemma:2b", \
   ...:     messages=[{"role":"user","content":prompt}], \
   ...:     temperature=0.0 \
   ...: ) 


In [5]: response \
Out[5]: ChatCompletion(id='chatcmpl-985', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Sure, here's the formula for energy:\n\n**E = K + U**\n\nWhere:\n\n* **E** is the energy in joules (J)\n* **K** is the kinetic energy in joules (J)\n* **U** is the potential energy in joules (J)\n\n**Kinetic energy (K)** is the energy an object possesses when it moves or is in motion. It is calculated as half the product of an object's mass (m) and its velocity (v) squared:\n\n**K = 1/2 * m * v^2**\n\n**Potential energy (U)** is the energy an object possesses when it is in a position or has a specific configuration. It is calculated as the product of an object's mass and the gravitational constant (g) multiplied by the height or distance of the object from a reference point.\n\n**Gravitational potential energy (U)** is given by the formula:\n\n**U = mgh**\n\nWhere:\n\n* **m** is the mass of the object in kilograms (kg)\n* **g** is the acceleration due to gravity in meters per second squared (m/s^2)\n* **h** is the height or distance of the object in meters (m)\n\nThe formula for energy can be used to calculate the total energy of an object, the energy of a specific part of an object, or the change in energy of an object over time.", refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1724319021, model='gemma:2b', object='chat.completion', service_tier=None, system_fingerprint='fp_ollama', usage=CompletionUsage(completion_tokens=304, prompt_tokens=34, total_tokens=338))


In [6]: total_tokens = response.usage.total_tokens 

In [7]: total_tokens \
Out[7]: 338 

The response reports completion_tokens=304 (total_tokens=338 also counts the 34 prompt tokens), so the answer is 304.
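To avoid mixing up the three counters, it helps to read them all from `response.usage` at once. This is a small sketch: `token_counts` and `ask` are helper names invented here, and `ask` assumes the openai package is installed and the container from this question is running. The stand-in object reuses the usage numbers from the session above.

```python
from types import SimpleNamespace


def token_counts(response):
    """Return prompt, completion, and total token counts from a chat completion."""
    usage = response.usage
    return {
        "prompt": usage.prompt_tokens,
        "completion": usage.completion_tokens,
        "total": usage.total_tokens,
    }


def ask(prompt, model="gemma:2b", base_url="http://localhost:11434/v1/"):
    """Send one prompt to the local server (needs openai and a running container)."""
    from openai import OpenAI  # imported lazily so token_counts works without it

    client = OpenAI(base_url=base_url, api_key="ollama")
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )


# Stand-in for the response shown above, using the usage numbers it reported.
recorded = SimpleNamespace(
    usage=SimpleNamespace(completion_tokens=304, prompt_tokens=34, total_tokens=338)
)
counts = token_counts(recorded)
print(counts)
```

Against the live server, `token_counts(ask("What's the formula for energy?"))` gives the same kind of dictionary; the question asks for the completion entry.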
