# Ollama + OpenAI + Python + Jupyter
**Credits:** Sections 1-6 are the work of Pamela Fox, Python Cloud Advocate, Microsoft. You can read the original notebook here: https://github.com/pamelafox/ollama-python-playground/blob/main/ollama.ipynb

Section 0 was written by Balaji Alwar, Service Lead, Datahub.
The tinyllama model was tested on July 8th on https://data100.datahub.berkeley.edu/ and https://datahub.berkeley.edu/. We recommend 2 GB to 4 GB of RAM to run the cells in this notebook.

### Welcome to the Ollama Model Tutorial

**Overview**
In this tutorial, you will learn how to install Ollama on a Jupyter server and interact with a language model that it serves. This hands-on guide takes you through the steps needed to set up the environment, download a model, and perform basic operations with it.

**What You Will Learn**
- **Introduction to Ollama:** Understand what Ollama is and its applications.
- **Setting Up the Environment:** Learn how to set up your Jupyter server environment to work with Ollama.
- **Downloading and Installing Ollama:** Follow step-by-step instructions to download the Ollama executable file and make it executable.
- **Running Ollama Commands:** Execute basic Ollama commands from within the Jupyter notebook. Interact with the Ollama model to perform specific tasks.
- **Using Ollama for Machine Learning:** Load a pre-trained model using Ollama. Run the model to make predictions or analyze data.
- **Practical Examples:** Walkthrough of practical examples to solidify your understanding. Apply Ollama to real-world datasets.

**Prerequisites**
- **Basic Knowledge of Command Line:** Familiarity with basic command-line operations will be helpful.
- **Python Basics:** Understanding basic Python programming concepts. 
- **Jupyter Notebook Usage:** Basic knowledge of how to navigate and use Jupyter notebooks.

Let's Get Started!

By the end of this tutorial, you'll have a solid understanding of how to set up and use Ollama within a Jupyter environment.

## 0. Install Ollama models in JupyterHub

The OpenAI Python package lets you interact with OpenAI's API to access machine learning models, including language models such as GPT-3. It provides a convenient way to add capabilities such as natural language processing, text generation, and translation to your projects. The cell below installs the OpenAI package if it is not already installed.

In [None]:
try:
    import openai
except ImportError:
    !pip install openai
    import openai
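
As a variation on the cell above, here is a minimal sketch that checks for a package without importing it and, if it is missing, installs it with the same interpreter the notebook kernel uses. The helper names `is_installed` and `ensure_package` are hypothetical, and the sketch assumes the pip package name matches the import name (true for `openai`, but not for every package).

```python
import importlib.util
import subprocess
import sys


def is_installed(package_name):
    """Return True if this interpreter can import the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


def ensure_package(package_name):
    """Install the package with pip only when it is missing (hypothetical helper)."""
    if not is_installed(package_name):
        # sys.executable is the kernel's own Python, so pip installs into its environment
        subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])


# "json" is part of the standard library, so this check should succeed anywhere
print(is_installed("json"))
```

Calling `ensure_package("openai")` would then have the same effect as the try/except cell.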

In Jupyter notebooks, a leading exclamation mark (`!`) runs the rest of the line as a shell command, which lets you run command-line instructions directly from a notebook cell. Each `!` command runs in its own temporary subshell, however, so `!cd` has no lasting effect on the notebook's working directory. The `%cd` magic does change it, and the command below uses it to move to your home directory on the Jupyter server.

In [None]:
%cd ~
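
If you would rather change directories from Python code, the sketch below uses `os.chdir`, which changes the working directory of the notebook kernel itself, so later cells see the new location. The function name `change_directory` is a hypothetical helper, not part of the original notebook.

```python
import os


def change_directory(path):
    """Change the kernel's working directory and return the new location."""
    # expanduser turns a leading "~" into the current user's home directory
    os.chdir(os.path.expanduser(path))
    return os.getcwd()
```

Calling `change_directory("~")` should move the kernel to your home directory and return its path.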

`wget` is a command-line utility for downloading files from the web. It supports the HTTP, HTTPS, and FTP protocols. The command below downloads the Ollama binary to the current directory on your Jupyter server. You will use the downloaded file to launch the Ollama server.

In [None]:
!wget https://github.com/ollama/ollama/releases/download/v0.1.48/ollama-linux-amd64

The `ls` command lists the files in your current directory. Use it to check that the `ollama-linux-amd64` binary is present.

In [None]:
!ls

The cell below imports Python's `os` module, which provides a way to interact with the operating system. With it you can execute system commands, manipulate the file system, and perform other OS-level operations.

In [None]:
import os

The command below makes the file `ollama-linux-amd64` executable. After running it, you can run this file as a program.

In [None]:
os.system("chmod +x ollama-linux-amd64")
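
The same permission change can be made without a shell. The sketch below uses `os.chmod` with the `stat` constants to add the execute bits for user, group, and others, which is roughly what `chmod +x` does (the shell command also takes the umask into account). The function name `make_executable` is a hypothetical helper.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group, and others; return the new mode bits."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    # S_IMODE keeps only the permission bits, e.g. 0o755
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling `make_executable("ollama-linux-amd64")` would apply this to the downloaded binary.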

The command below tells the operating system to run the file `ollama-linux-amd64`. The `./` at the beginning specifies that the file is in the current directory. `serve` is an argument passed to the program that tells it to start the Ollama server. The trailing `&` tells the shell to run the program in the background, so the server keeps running while you execute other cells in the notebook.

In [None]:
os.system("./ollama-linux-amd64 serve&")
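
Because the server starts in the background, it may not be ready to accept requests the moment the cell returns. Here is a minimal sketch of a readiness check that polls the server's address until it answers. It assumes the server listens on `http://localhost:11434` (the port used later in this notebook) and that its root URL returns a successful response once it is up. The function name `wait_for_server` and its parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:11434", timeout=30.0, interval=0.5):
    """Poll the URL until it responds successfully or the timeout expires."""
    # An empty ProxyHandler ignores proxy settings, which should not apply to localhost
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet: wait a little and try again
            time.sleep(interval)
    return False
```

Calling `wait_for_server()` right after starting the server should return `True` once it is reachable, or `False` if it does not come up within the timeout.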

The command below pulls the TinyLlama model from the Ollama library and runs it on your Jupyter server. The models that Ollama supports are listed at ollama.com/library. TinyLlama is a compact model with only 1.1B parameters, which makes it a good fit for applications with limited compute and memory.

In [None]:
os.system("./ollama-linux-amd64 run tinyllama")

The command below lists the models that are currently installed on your Jupyter server.

In [None]:
os.system("./ollama-linux-amd64 list")

## 1. Specify the model name

If you pulled in a different model than "tinyllama", change the value in the cell below.
That variable will be used in code throughout the notebook.

In [None]:
MODEL_NAME = "tinyllama"

## 2. Set up the OpenAI client

Typically the OpenAI client is used with OpenAI.com or Azure OpenAI to interact with large language models.
However, it can also be used with Ollama, since Ollama provides an OpenAI-compatible endpoint at "http://localhost:11434/v1".

In [None]:
import openai

client = openai.OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="nokeyneeded",
)

## 3. Generate a chat completion

Now we can use the OpenAI SDK to generate a response for a conversation. This request should generate a haiku about cats:

In [None]:
response = client.chat.completions.create(
    model=MODEL_NAME,
    temperature=0.7,
    n=1,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about a hungry cat"},
    ],
)

print("Response:")
print(response.choices[0].message.content)


## 4. Another example

In [None]:
SYSTEM_MESSAGE = """
I want you to act like Elmo from Sesame Street.
I want you to respond and answer like Elmo using the tone, manner and vocabulary that Elmo would use.
Do not write any explanations. Only answer like Elmo.
You must know all of the knowledge of Elmo, and nothing more.
"""

USER_MESSAGE = """
Hi Elmo, how are you doing today?
"""

response = client.chat.completions.create(
    model=MODEL_NAME,
    temperature=0.7,
    n=1,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_MESSAGE},
    ],
)

print("Response:")
print(response.choices[0].message.content)


## 5. Few shot examples

Another way to guide a language model is to provide "few shots", a sequence of example question/answers that demonstrate how it should respond.

The example below tries to get a language model to act like a teaching assistant by providing a few examples of questions and answers that a TA might give, and then prompts the model with a question that a student might ask.

Try it first, and then modify the `SYSTEM_MESSAGE`, `EXAMPLES`, and `USER_MESSAGE` for a new scenario.

In [None]:
SYSTEM_MESSAGE = """
You are a helpful assistant that helps students with their homework.
Instead of providing the full answer, you respond with a hint or a clue.
"""

EXAMPLES = [
    (
        "What is the capital of France?",
        "Can you remember the name of the city that is known for the Eiffel Tower?"
    ),
    (
        "What is the square root of 144?",
        "What number multiplied by itself equals 144?"
    ),
    (
        "What is the atomic number of oxygen?",
        "How many protons does an oxygen atom have?"
    ),
]

USER_MESSAGE = "What is the largest planet in our solar system?"


response = client.chat.completions.create(
    model=MODEL_NAME,
    temperature=0.7,
    n=1,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": EXAMPLES[0][0]},
        {"role": "assistant", "content": EXAMPLES[0][1]},
        {"role": "user", "content": EXAMPLES[1][0]},
        {"role": "assistant", "content": EXAMPLES[1][1]},
        {"role": "user", "content": EXAMPLES[2][0]},
        {"role": "assistant", "content": EXAMPLES[2][1]},
        {"role": "user", "content": USER_MESSAGE},
    ],
)


print("Response:")
print(response.choices[0].message.content)
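
The message list in the cell above spells out each example by index. When you modify the examples for a new scenario, a loop keeps the list in step with them. This is a minimal sketch, with the hypothetical helper name `build_few_shot_messages`, that builds the same structure from any list of (question, answer) pairs.

```python
def build_few_shot_messages(system_message, examples, user_message):
    """Build a chat message list: system prompt, example Q/A pairs, then the real question."""
    messages = [{"role": "system", "content": system_message}]
    for question, answer in examples:
        # Each example becomes a user turn followed by the assistant's ideal reply
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": user_message})
    return messages


demo_messages = build_few_shot_messages(
    "You respond with a hint or a clue.",
    [("What is the square root of 144?", "What number multiplied by itself equals 144?")],
    "What is the largest planet in our solar system?",
)
print([message["role"] for message in demo_messages])
```

The result can be passed as `messages=build_few_shot_messages(SYSTEM_MESSAGE, EXAMPLES, USER_MESSAGE)` in the call to `client.chat.completions.create`.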

## 6. Retrieval Augmented Generation

RAG (Retrieval Augmented Generation) is a technique to get a language model to answer questions accurately for a particular domain, by first retrieving relevant information from a knowledge source and then generating a response based on that information.

We have provided a local CSV file with data about hybrid cars. The code below reads the CSV file, searches for rows that match the user question, and then generates a response based on the information found. Note that this will take longer than any of the previous examples, as it sends more data to the model. If you notice that the answer is still not grounded in the data, you can try prompt engineering (for example, rewording the system message) or try other models. Generally, RAG is more effective with either larger models or fine-tuned versions of small language models (SLMs).

In [None]:
import csv

SYSTEM_MESSAGE = """
You are a helpful assistant that answers questions about cars based off a hybrid car data set.
You must use the data set to answer the questions, you should not provide any information that is not in the provided sources.
"""

USER_MESSAGE = "how fast is a prius?"

# Open the CSV and store in a list (newline="" is what the csv module recommends)
with open("hybrid.csv", "r", newline="") as file:
    reader = csv.reader(file)
    rows = list(reader)

# Normalize the user question to replace punctuation and make lowercase
normalized_message = USER_MESSAGE.lower().replace("?", "").replace("(", " ").replace(")", " ")

# Search the CSV for user question using very naive search
words = normalized_message.split()
matches = []
for row in rows[1:]:
    # if the word matches any word in row, add the row to the matches
    if any(word in row[0].lower().split() for word in words) or any(word in row[5].lower().split() for word in words):
        matches.append(row)

# Format as a markdown table, since language models understand markdown
matches_table = " | ".join(rows[0]) + "\n" + " | ".join(" --- " for _ in range(len(rows[0]))) + "\n"
matches_table += "\n".join(" | ".join(row) for row in matches)
print(f"Found {len(matches)} matches:")
print(matches_table)

# Now we can use the matches to generate a response
response = client.chat.completions.create(
    model=MODEL_NAME,
    temperature=0.7,
    n=1,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_MESSAGE + "\nSources: " + matches_table},
    ],
)

print("Response:")
print(response.choices[0].message.content)
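
To experiment with the retrieval step on its own, without waiting for the model, the sketch below repackages the naive search and the markdown formatting as small functions. It runs on a tiny in-memory CSV whose column names and rows are invented purely for illustration and are not taken from `hybrid.csv`. The function names are hypothetical, and unlike the cell above, `normalize` replaces every punctuation character with a space.

```python
import csv
import io
import string

# Invented sample data for illustration only (not the contents of hybrid.csv)
SAMPLE_CSV = """name,year,mpg
Alpha Hybrid,2001,40
Beta Hybrid,2002,45
"""


def normalize(text):
    """Lowercase the text and replace every punctuation character with a space."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.lower().translate(table)


def search_rows(rows, question, columns):
    """Return rows where any question word equals a word in one of the given columns."""
    question_words = set(normalize(question).split())
    matches = []
    for row in rows:
        cell_words = set()
        for index in columns:
            cell_words.update(normalize(row[index]).split())
        if question_words & cell_words:
            matches.append(row)
    return matches


def to_markdown_table(header, rows):
    """Format a header and rows as a markdown table."""
    lines = [" | ".join(header), " | ".join("---" for _ in header)]
    lines.extend(" | ".join(row) for row in rows)
    return "\n".join(lines)


sample_rows = list(csv.reader(io.StringIO(SAMPLE_CSV)))
sample_header, sample_data = sample_rows[0], sample_rows[1:]
sample_matches = search_rows(sample_data, "how fast is the alpha?", columns=[0])
print(to_markdown_table(sample_header, sample_matches))
```

Word-level matching like this misses synonyms and misspellings, which is one reason the search above is described as very naive.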