# Week 8a: Multimodal LLMs with Ollama

Today you will use Ollama to download and run state-of-the-art multimodal LLMs.

Ollama hosts an online library of recent open-weight and open-source LLMs, such as Meta's Llama family, DeepSeek, and Mistral. You can browse the available models here: https://ollama.com/search

You can interact with Ollama models through its standalone CLI, or connect Ollama to Python code using [the Ollama Python library](https://pypi.org/project/ollama/) to build interactive interfaces and other LLM-powered applications.

### Setting up your Python environment

Before you work through this notebook, please follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb)

Once you have done that, make sure that the environment selected to run this notebook, and all the other notebooks used in this unit, is called `aim`.

To do this, click the **Select kernel** button in the top right corner of this notebook, and then select `aim`.

To check that the environment is configured correctly, click the run cell button (▶) on the cell below:

In [None]:
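import os
# Print the active conda environment; 'not set' means no conda environment is active
print(os.environ.get('CONDA_DEFAULT_ENV', 'not set'))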

Does it output the text `aim`?

If it does not output the text `aim`, please revisit and follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb).

If you still cannot get it working, please raise this with the course instructor. 

## Step 1: Install Ollama

Download the standalone Ollama application for your operating system from the following URL: https://ollama.com/download

Then follow the installer's instructions to install Ollama on your computer.
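
If you want to confirm the installation from Python, the sketch below looks for the `ollama` command on your PATH and, if it is found, asks it for its version. The helper name `ollama_version` is made up for this example, and the check assumes the installer added the CLI to your PATH.

```python
import shutil
import subprocess

def ollama_version():
    """Return the output of `ollama --version`, or None if the CLI is not on PATH."""
    exe = shutil.which("ollama")
    if exe is None:
        return None
    # Run the CLI and capture whatever it prints
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    return (result.stdout or result.stderr).strip()

print(ollama_version() or "Ollama CLI not found on PATH")
```

If the message says the CLI was not found, try restarting your terminal (or your computer) before reinstalling.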

## Step 2: Download LLaVA

Now you can use Ollama to download the weights of an LLM from its model library. Today you will be using [LLaVA](https://llava-vl.github.io/), a multimodal LLM that can reason about both text and images.

To download LLaVA, open a terminal and run the following commands:

```
conda activate aim
```
```
ollama pull llava:7b
```

Once the weights have finished downloading, test that the model works by running:

```
ollama run llava:7b
```

Type a question into the terminal session to try LLaVA out. When you are done, type `/bye` to exit the session.

## Step 3: Run LLaVA via Python

Now try running LLaVA through the Python API. Run the following code to test it with a text-only prompt:

In [None]:
from ollama import chat, ChatResponse

# Send a single user message to the model and wait for the full reply
response: ChatResponse = chat(model='llava:7b', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
# Access the reply text with dictionary-style indexing
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
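
The `messages` argument is a list, so it can hold a whole conversation rather than a single question. The sketch below shows one way to keep a running history so that follow-up questions have context. The helper `ask` and its `chat_fn` parameter are made up for this example and are not part of the Ollama library; `chat_fn` exists only so the helper can be tried out with a stand-in function.

```python
def ask(history, prompt, chat_fn=None, model='llava:7b'):
    """Append a user prompt to `history`, query the model, and store the reply."""
    if chat_fn is None:
        # Imported lazily so the helper can be tested without Ollama running
        from ollama import chat as chat_fn
    history.append({'role': 'user', 'content': prompt})
    response = chat_fn(model=model, messages=history)
    reply = response['message']['content']
    # Store the reply so the next question is answered in context
    history.append({'role': 'assistant', 'content': reply})
    return reply

# Example usage (requires the Ollama application to be running):
# history = []
# print(ask(history, 'Why is the sky blue?'))
# print(ask(history, 'Explain that to a five-year-old.'))
```

Because each reply is appended with the `assistant` role, the second question is sent together with the first question and its answer.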

## Step 4: Reason about an image with LLaVA

Now you will see how to load an image and reason about it using LLaVA.

First, load an image. Let's use one from earlier in the term:

In [None]:
from PIL import Image

# Path to the image, reused below when querying the model
image_path = 'media/simple-mlp.png'
im = Image.open(image_path)
im  # display the image in the notebook

The following code passes this image to a multimodal LLM and asks it a question about the image.

The response is streamed, so the text arrives in chunks as the model generates it. The code collects the chunks and prints the full response once the stream has finished.

In [None]:
from util.image_helper import get_image_bytes
from util.llm_helper import analyze_image_file, stream_parser

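# Ask the model about the image; the response comes back as a stream
stream = analyze_image_file(image_path, model='llava:7b', user_prompt='What is in this image?')

# Turn the raw stream into chunks of text
parsed = stream_parser(stream)

# Collect the chunks as they arrive
out_text = ''
for chunk in parsed:
    out_text += chunk

print(out_text)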
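
The helper functions in `util` are not shown in this notebook. As a rough idea of what could be happening underneath, here is a minimal sketch that calls the Ollama Python library directly. It assumes that a chat message may carry an `images` list (file paths or raw bytes) and that `stream=True` yields chunks shaped like the non-streamed response. The names `build_image_message`, `iter_text` and `describe_image` are made up for this example and may differ from the course helpers.

```python
def build_image_message(prompt, image):
    """Build one user message carrying a prompt and a single image (path or raw bytes)."""
    return {'role': 'user', 'content': prompt, 'images': [image]}

def iter_text(stream):
    """Yield only the text fragment from each streamed chunk."""
    for chunk in stream:
        yield chunk['message']['content']

def describe_image(image, prompt, model='llava:7b'):
    """Stream the model's answer about an image, one text fragment at a time."""
    from ollama import chat  # imported lazily; needs the Ollama application running
    stream = chat(model=model, messages=[build_image_message(prompt, image)], stream=True)
    return iter_text(stream)

# Example usage:
# print(''.join(describe_image('media/simple-mlp.png', 'What is in this image?')))
```

Returning a generator instead of a finished string lets the caller decide whether to print fragments as they arrive or join them at the end.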

## Step 5: Interact with the model in Streamlit

Your task is now to implement an interactive Streamlit application in [week-8b-llava-streamlit-app.py](week-8b-llava-streamlit-app.py).

Most of the code for building the user interface and loading an image is provided. Your task is to use the example above to analyse the image file that the user has uploaded and display a response:

**Task 1:** Call the function `analyze_image_file`. You will need to pass in the uploaded image, specify the model that you used in this notebook, and pass in the variable `chat_input` as the user prompt.

**Task 2:** Use the Streamlit function [st.write_stream](https://docs.streamlit.io/develop/api-reference/write-magic/st.write_stream) to display the streamed text. You will need to call `stream_parser` on the stream first and pass its result to `st.write_stream`.
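
While you build the interface, it can be handy to test the streaming display without waiting for the model. The sketch below is a stand-in generator that yields a fixed sentence word by word. The name `fake_stream` is made up for this example, and it relies on `st.write_stream` accepting a generator of strings.

```python
import time

def fake_stream(text, delay=0.0):
    """Yield `text` word by word, imitating a streamed model response."""
    for word in text.split():
        yield word + ' '
        time.sleep(delay)

# In the Streamlit app you could temporarily write:
# st.write_stream(fake_stream('This is a placeholder response.', delay=0.05))
```

Once the display works with the stand-in, swap it for the parsed model stream.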