# Homework 2: Open-Source LLMs for [LLM Zoomcamp 2024](https://courses.datatalks.club/llm-zoomcamp-2024/)
- [homework](https://github.com/DataTalksClub/llm-zoomcamp/blob/main/cohorts/2024/02-open-source/homework.md)
- Due date: 9 July 2024 01:00 (local time)
- [Submit here](https://courses.datatalks.club/llm-zoomcamp-2024/homework/hw2)
    - log in with your private GitHub account

## Extra questions

- Homework URL: https://github.com/alexkolo/llm-zoomcamp-2024/blob/main/cohorts/2024/02-open-source/module02.ipynb
- Time spent on lectures (hours): ~2h
- Time spent on homework (hours): ~2h

## Q1. Running Ollama with Docker

Let's run ollama with Docker. We will need to execute the 
same command as in the lectures:

```bash
docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
```

What's the version of ollama client? 

To find out, enter the container and execute `ollama` with the `-v` flag.

### Docker Command explained

1. **`docker run`**: This is the basic Docker command to run a container.

2. **`-it`**: These are two flags combined:
    - `-i` (interactive): Keeps the STDIN open even if not attached.
    - `-t` (tty): Allocates a pseudo-TTY, which provides an interactive terminal session.

3. **`--rm`**: Automatically removes the container when it exits. This ensures that you don't have leftover stopped containers.

4. **`-v ollama:/root/.ollama`**: This mounts a volume. The `-v` flag specifies a volume mount:
    - `ollama` is the name of the Docker volume on the host.
    - `/root/.ollama` is the path inside the container where the volume will be mounted. This allows for persistent storage of data that is kept even when the container is removed.

5. **`-p 11434:11434`**: This publishes a container's port(s) to the host. The `-p` flag specifies port mapping:
    - `11434:11434` means that port 11434 on the host is mapped to port 11434 on the container. This makes the container's service accessible via port 11434 on the host machine.

6. **`--name ollama`**: Assigns a name to the container. In this case, the container will be named "ollama".

7. **`ollama/ollama`**: This is the image to run. `ollama/ollama` refers to a Docker image, which includes the application and its dependencies.
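
To check that the port mapping from item 5 is actually in effect, a quick TCP probe from the host can help. This is a minimal sketch using only the standard library; the helper name `is_port_open` is made up for illustration, and the host `127.0.0.1` is an assumption about where Docker publishes the port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 11434 is the host port published by `-p 11434:11434` in the command above.
print("Ollama port reachable:", is_port_open("127.0.0.1", 11434))
```

A `False` result only says that nothing accepted the connection; it does not distinguish a stopped container from a missing `-p` flag.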

### Code

```bash
# Open a shell inside the running container
docker exec -it ollama /bin/bash
# Inside the container: print the Ollama version
ollama -v
```

### Answer

`0.1.48`
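
The version can probably also be read from the host without entering the container, assuming the server exposes the `/api/version` endpoint described in the Ollama REST API docs and that port 11434 is published as in the command above. The sketch below reports the server version, which should match the client here because both come from the same image; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def parse_version(body: str) -> str:
    """Extract the version from a JSON body such as {"version": "0.1.48"}."""
    return json.loads(body)["version"]


def fetch_version(base_url: str = "http://localhost:11434") -> Optional[str]:
    """Ask the server for its version; return None if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=5) as resp:
            return parse_version(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, KeyError, ValueError):
        # Server not running, port not published, or unexpected response body.
        return None


print(fetch_version())
```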

## Q2. Downloading an LLM 

We will download a smaller LLM - gemma:2b.

Again let's enter the container and pull the model:

```bash
ollama pull gemma:2b
```

Inside the container, the results are saved to `/root/.ollama`.

We're interested in the metadata about this model. You can find
it in `models/manifests/registry.ollama.ai/library`

What's the content of the file related to gemma?


### Code

```bash
docker exec -it ollama /bin/bash
ollama pull gemma:2b
cat /root/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b
```

### Answer

```json
{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json","config":{"mediaType":"application/vnd.docker.container.image.v1+json","digest":"sha256:887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290","size":483},"layers":[{"mediaType":"application/vnd.ollama.image.model","digest":"sha256:c1864a5eb19305c40519da12cc543519e48a0697ecd30e15d5ac228644957d12","size":1678447520},{"mediaType":"application/vnd.ollama.image.license","digest":"sha256:097a36493f718248845233af1d3fefe7a303f864fae13bc31a3a9704229378ca","size":8433},{"mediaType":"application/vnd.ollama.image.template","digest":"sha256:109037bec39c0becc8221222ae23557559bc594290945a2c4221ab4f303b8871","size":136},{"mediaType":"application/vnd.ollama.image.params","digest":"sha256:22a838ceb7fb22755a3b0ae9b4eadde629d19be1f651f73efb8c6b4e2cd0eea0","size":84}]}
```
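
The manifest is easier to read once the blob sizes are pulled out. The sketch below uses a trimmed copy of the JSON above (digests omitted, media types and sizes unchanged) and sums the sizes; the helper name `blob_sizes` is made up for illustration.

```python
import json

# Trimmed copy of the manifest above: digests omitted, media types and sizes unchanged.
MANIFEST = json.loads("""
{
  "config": {"mediaType": "application/vnd.docker.container.image.v1+json", "size": 483},
  "layers": [
    {"mediaType": "application/vnd.ollama.image.model", "size": 1678447520},
    {"mediaType": "application/vnd.ollama.image.license", "size": 8433},
    {"mediaType": "application/vnd.ollama.image.template", "size": 136},
    {"mediaType": "application/vnd.ollama.image.params", "size": 84}
  ]
}
""")


def blob_sizes(manifest: dict) -> dict:
    """Map each blob kind (config, model, license, ...) to its size in bytes."""
    sizes = {"config": manifest["config"]["size"]}
    for layer in manifest["layers"]:
        # The kind is the last dotted component of the media type, e.g. "model".
        kind = layer["mediaType"].rsplit(".", 1)[-1]
        sizes[kind] = layer["size"]
    return sizes


sizes = blob_sizes(MANIFEST)
total_bytes = sum(sizes.values())
for kind, size in sizes.items():
    print(f"{kind:>8}: {size:>13,} bytes")
print(f"   total: {total_bytes:>13,} bytes = {total_bytes / 1024**3:.2f} GiB")
```

Almost all of the total is the `model` layer. At about 1.56 GiB, it is roughly in line with the `1.6G` that `du -h` reports in Q4.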

## Q3. Running the LLM

Test the following prompt: "10 * 10". What's the answer?

### Code

```bash
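# Open a shell inside the running container
docker exec -it ollama /bin/bash
# Inside the container: make sure the model is present, then send the prompt
ollama pull gemma:2b
ollama run gemma:2b "10 * 10"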
```

### Answer

```html
Sure, here is the answer:

10 * 10<sup>end_of_turn</sup>

This expression evaluates to 100, which is 10 multiplied by 10<sup>2</sup>.
```
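
The same prompt can likely be sent from the host over HTTP instead of through `ollama run`. This is a rough sketch that assumes the `/api/generate` endpoint from the Ollama REST API docs, with `"stream": false` so that a single JSON object comes back; the function names are hypothetical and the output is not guaranteed to match the CLI run above.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "gemma:2b",
    base_url: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build (but do not send) a non-streaming request to /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the prompt and return the generated text from the "response" field."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Requires the running container with gemma:2b already pulled:
# print(generate("10 * 10"))
```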


## Q4. Downloading the weights

We don't want to pull the weights every time we run
a docker container. Let's do it once and have them available
every time we start a container.

First, we will need to change how we run the container.

Instead of mapping the `/root/.ollama` folder to a named volume,
let's map it to a local directory:

```bash
mkdir ollama_files

docker run -it \
    --rm \
    -v ./ollama_files:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
```

Now pull the model:

```bash
docker exec -it ollama ollama pull gemma:2b 
```

What's the size of the `ollama_files/models` folder? 

* 0.6G
* 1.2G
* 1.7G
* 2.2G

Hint: on linux, you can use `du -h` for that.

### Code

1. First terminal (start the container):
  
    ```bash
    mkdir ollama_files

    docker run -it \
        --rm \
        -v ./ollama_files:/root/.ollama \
        -p 11434:11434 \
        --name ollama \
        ollama/ollama
    ```

2. Second terminal (pull the model and measure the folder):

    ```bash
    docker exec -it ollama ollama pull gemma:2b 
    du -h ollama_files
    ```

### Answer

`1.6G    ollama_files/models`

-> closest: `1.7G`
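
On systems without `du`, the folder size can be approximated in Python. This is a minimal sketch: it sums apparent file sizes, whereas `du` reports disk usage in blocks and rounds differently, so the two numbers may differ slightly. The helper names are made up for illustration.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human(n_bytes: int) -> str:
    """Format a byte count in GiB with one decimal place."""
    return f"{n_bytes / 1024**3:.1f}G"


# Only prints when run from the directory that contains ollama_files.
if os.path.isdir("ollama_files/models"):
    print(human(dir_size_bytes("ollama_files/models")), "ollama_files/models")
```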

### Clean

```bash
docker exec -it ollama /bin/bash
ollama list
ollama rm gemma:2b
```

## Q5. Adding the weights 

Let's now stop the container and add the weights
to a new image.

For that, let's create a `Dockerfile`:

```dockerfile
FROM ollama/ollama

COPY ...
```

What do you put after `COPY`?

### Code

```bash
docker build -f ./cohorts/2024/02-open-source/Dockerfile -t new_ollama_image .
docker run -it --rm -p 11434:11434 --name new_ollama_container new_ollama_image
# another terminal
docker exec -it new_ollama_container /bin/bash
```

```bash
docker stop new_ollama_container
docker rmi new_ollama_image
docker images # check if removed
```

### Answer

`ollama_files /root/.ollama`
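
Put together with the template from the question, the completed Dockerfile would read as in the `DOCKERFILE` string below. The sketch also checks that the `COPY` source exists in the build context, since `COPY` paths are resolved relative to the context directory passed to `docker build`, not relative to the Dockerfile. Both helper names are hypothetical.

```python
from pathlib import Path

# Completed Dockerfile, using the answer above as the COPY arguments.
DOCKERFILE = "FROM ollama/ollama\n\nCOPY ollama_files /root/.ollama\n"


def missing_copy_source(context_dir: str, source: str = "ollama_files") -> bool:
    """Return True if the COPY source directory is absent from the build context."""
    return not (Path(context_dir) / source).is_dir()


def write_dockerfile(target: str) -> Path:
    """Write the completed Dockerfile to `target` and return its path."""
    path = Path(target)
    path.write_text(DOCKERFILE, encoding="utf-8")
    return path
```

With the build command above, the context is `.`, so `ollama_files` would need to sit in the directory the command is run from.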

## Q6. Serving it 

Let's build it:

```bash
docker build -t ollama-gemma2b .
```

And run it:

```bash
docker run -it --rm -p 11434:11434 ollama-gemma2b
```

We can connect to it using the OpenAI client.

Let's test it with the following prompt:

```python
prompt = "What's the formula for energy?"
```

Also, to make results reproducible, set the `temperature` parameter to 0:

```python
response = client.chat.completions.create(
    #...
    temperature=0.0
)
```

How many completion tokens did you get in response?

* 304
* 604
* 904
* 1204

### Answer

Token counts obtained by re-tokenizing the response text:

- `tiktoken`, "gpt-4o" encoding: 256
- `tiktoken`, "gpt-4" encoding: 261
- https://tiktokenizer.vercel.app/, gemma-7b tokenizer: 292


-> closest: `304`
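
The counts above come from re-tokenizing the response text with other tokenizers, which may explain why none of them matches an option exactly. OpenAI-compatible chat completion responses normally also carry a `usage` object with the server's own count. The sketch below reads that field from a response in JSON form; the sample payload is a made-up placeholder, not real output from this homework, and `completion_tokens` as a function name is hypothetical.

```python
def completion_tokens(response_json: dict) -> int:
    """Return the server-reported number of completion tokens."""
    return response_json["usage"]["completion_tokens"]


# Placeholder payload in the OpenAI chat-completions shape (values are NOT real results).
sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "placeholder"}}],
    "usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
}

print(completion_tokens(sample_response))
```

With the `openai` client object used in this homework, the same value should be available as `response.usage.completion_tokens`.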

### Code

```bash
docker build -f ./cohorts/2024/02-open-source/Dockerfile -t ollama-gemma2b .
docker run -it --rm -p 11434:11434 ollama-gemma2b
```

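In another terminal (optional, to inspect the container; the name only exists if the container was started with `--name new_ollama_container`, as in Q5):

`docker exec -it new_ollama_container /bin/bash`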

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API; the client requires an API key, but Ollama does not check it
client = OpenAI(
    base_url="http://localhost:11434/v1/",
    api_key="ollama",
)
prompt = "What's the formula for energy?"

response = client.chat.completions.create(
    model="gemma:2b",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
)
response_text = response.choices[0].message.content
```

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

# Encode the text to get the tokens
tokens = encoding.encode(response_text)
print(f"Question 6. Number of tokens {len(tokens)}")
```

Output:

`Question 6. Number of tokens 261`

```python
print(response_text)
```

Output:

Sure, here's the formula for energy:

**E = K + U**

Where:

* **E** is the energy in joules (J)
* **K** is the kinetic energy in joules (J)
* **U** is the potential energy in joules (J)

**Kinetic energy (K)** is the energy an object possesses when it moves or is in motion. It is calculated as half the product of an object's mass (m) and its velocity (v) squared:

**K = 1/2mv^2**

**Potential energy (U)** is the energy an object possesses due to its position or configuration. It is calculated as the product of an object's mass, gravitational constant (g), and height or position above a reference point.

**U = mgh**

Where:

* **m** is the mass in kilograms (kg)
* **g** is the gravitational constant (9.8 m/s^2)
* **h** is the height or position in meters (m)

The formula shows that energy can be expressed as the sum of kinetic and potential energy. The kinetic energy is a measure of the object's ability to do work, while the potential energy is a measure of the object's ability to do w

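## Clean Docker

`docker system prune -a`

Note: this removes all stopped containers, unused networks, build cache, and every image not used by a container, not just the ones created for this homework.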