
# Homework: Open-Source LLMs

In this homework, we'll experiment more with Ollama.

## Q1. Running Ollama with Docker

Let's run ollama with Docker. We will need to execute the same command as in the lectures:

```bash
docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
```

What's the version of the ollama client?

To find out, enter the container and execute `ollama` with the `-v` flag.

Answer: Enter the container with `docker exec -it ollama bash`, then run `ollama -v`. The version is 0.1.44.
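
As an alternative to entering the container, the version can also be read over HTTP; a minimal sketch, assuming the Ollama server's `/api/version` endpoint is reachable on the mapped port 11434:

```python
import json
import urllib.request

# Ask the running Ollama server for its version.
with urllib.request.urlopen("http://localhost:11434/api/version") as resp:
    info = json.load(resp)

print(info["version"])  # e.g. 0.1.44
```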

## Q2. Downloading an LLM

We will download a smaller LLM, gemma:2b.

Again, let's enter the container and pull the model:

```bash
ollama pull gemma:2b
```

Inside the container, Ollama saves everything to `/root/.ollama`.

We're interested in the metadata about this model. You can find it in `models/manifests/registry.ollama.ai/library`.

What's the content of the file related to gemma?

Answer:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290",
    "size": 483
  },
  "layers": [
    {
      "mediaType": "application/vnd.ollama.image.model",
      "digest": "sha256:c1864a5eb19305c40519da12cc543519e48a0697ecd30e15d5ac228644957d12",
      "size": 1678447520
    },
    {
      "mediaType": "application/vnd.ollama.image.license",
      "digest": "sha256:097a36493f718248845233af1d3fefe7a303f864fae13bc31a3a9704229378ca",
      "size": 8433
    },
    {
      "mediaType": "application/vnd.ollama.image.template",
      "digest": "sha256:109037bec39c0becc8221222ae23557559bc594290945a2c4221ab4f303b8871",
      "size": 136
    },
    {
      "mediaType": "application/vnd.ollama.image.params",
      "digest": "sha256:22a838ceb7fb22755a3b0ae9b4eadde629d19be1f651f73efb8c6b4e2cd0eea0",
      "size": 84
    }
  ]
}
```

## Q3. Running the LLM

Test the following prompt: "10 * 10". What's the answer?

Answer:

Run the model:

```bash
ollama run gemma:2b
```

Prompt input: `10 * 10`

Output:

```
The model follows the natural language instructions and will execute the following code:

10 * 10

The output of this code will be:

100
```
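
The same prompt can also be sent programmatically; a minimal sketch against Ollama's `/api/generate` endpoint, assuming the container from Q1 is still running:

```python
import json
import urllib.request

# Send a single non-streaming generation request to the Ollama server.
payload = json.dumps({"model": "gemma:2b", "prompt": "10 * 10", "stream": False}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```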

## Q4. Downloading the weights

We don't want to pull the weights every time we run a Docker container. Let's do it once and have them available every time we start a container.

First, we will need to change how we run the container.

Instead of mapping the `/root/.ollama` folder to a named volume, let's map it to a local directory:

```bash
mkdir ollama_files

docker run -it \
    --rm \
    -v ./ollama_files:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
```

Now pull the model:

```bash
docker exec -it ollama ollama pull gemma:2b
```
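
To confirm the model landed in the freshly mounted directory, you can also list the models the server sees; a minimal sketch, assuming Ollama's `/api/tags` endpoint on the mapped port:

```python
import json
import urllib.request

# /api/tags lists the models available to the running Ollama server.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    tags = json.load(resp)

print([m["name"] for m in tags["models"]])  # expect ['gemma:2b']
```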

What's the size of the `ollama_files/models` folder?

- 0.6G
- 1.2G
- 1.7G
- 2.2G

Hint: on Linux, you can use `du -h` for that.

Answer:

~1.6G, so the closest option is 1.7G.
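
To double-check that number without `du`, a small Python equivalent that walks the folder:

```python
import os

# Sum the sizes of all files under ollama_files/models, like `du -h` would.
total = 0
for root, _, files in os.walk("ollama_files/models"):
    for name in files:
        total += os.path.getsize(os.path.join(root, name))

print(f"{total / 2**30:.1f}G")  # ≈ 1.6G
```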

## Q5. Adding the weights

Let's now stop the container and add the weights to a new image.

For that, let's create a Dockerfile:

```dockerfile
FROM ollama/ollama

COPY ...
```

What do you put after COPY?

Answer:

`COPY ./ollama_files /root/.ollama` (the weights must land in `/root/.ollama`, where Ollama looks for them).

## Q6. Serving it

Let's build it:

```bash
docker build -t ollama-gemma2b .
```

And run it:

```bash
docker run -it --rm -p 11434:11434 ollama-gemma2b
```

We can connect to it using the OpenAI client.

Let's test it with the following prompt:

```python
prompt = "What's the formula for energy?"
```

Also, to make results reproducible, set the temperature parameter to 0:

```python
response = client.chat.completions.create(
    #...
    temperature=0.0
)
```

How many completion tokens did you get in response?

- 304
- 604
- 904
- 1204

Answer:

Build the image: `docker build -t ollama-gemma2b .`

Run the image: `docker run -it --rm -p 11434:11434 --name ollama-gemma2b ollama-gemma2b`

Enter the container: `docker exec -it ollama-gemma2b bash`

Result: 304
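
For completeness, a sketch of the client code behind that result, assuming Ollama's OpenAI-compatible endpoint at `http://localhost:11434/v1` (the API key is required by the client but ignored by Ollama):

```python
from openai import OpenAI

# Point the OpenAI client at the local Ollama server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

prompt = "What's the formula for energy?"

response = client.chat.completions.create(
    model="gemma:2b",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
)

print(response.usage.completion_tokens)  # 304
```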