## Introduction

This notebook demonstrates how the Llama 3.1:8b model, run locally through Ollama, can answer questions about a given dataset.


## Installation

Run the ollama_install.sh script to install Ollama on your machine. Make sure curl is installed in your environment first.

In [1]:
!chmod +x /content/ollama_install.sh
!/content/ollama_install.sh

Installing Ollama on your environment
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


## Run Ollama Service
The script below starts the Ollama service in the background, so the service stays active while the main program continues executing other tasks.

In [2]:
%run '/content/ollama_thread.py'
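
The contents of ollama_thread.py are not shown in this notebook. As a rough sketch of the idea, and assuming the script simply launches `ollama serve` from a background thread, it might look something like the following. The helper name `start_background_process` is hypothetical, and the real script may work differently.

```python
import subprocess
import threading


def start_background_process(command):
    """Run `command` in a daemon thread so the caller is not blocked.

    Hypothetical helper: the real ollama_thread.py may work differently.
    """
    def _run():
        # This call blocks inside the thread until the process exits
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread


# Assumed usage in this notebook (requires Ollama to be installed):
# start_background_process(["ollama", "serve"])
```

Because the thread is a daemon, it does not keep the Python process alive once the main program exits.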

## Downloading Ollama's Llama 3.1:8b Model

The command below downloads the Llama 3.1:8b model from Ollama's repository into the notebook's environment.


In [3]:
!ollama pull llama3.1:8b

pulling manifest
pulling 667b0c1932bc...   0% ▕▏    0 B/4.9 GB
pulling 667b0c1932bc...   0% ▕▏  19 MB/4.9 GB
pulling 667b0c1932bc...   1% ▕▏  39 MB/4.9 GB

## Downloading LangChain Ollama

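Install the langchain-ollama integration package, which allows us to use Ollama models within the LangChain framework. As the output below shows, it also installs the ollama Python package that the rest of this notebook imports.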


In [4]:
!pip install langchain-ollama

Collecting langchain-ollama
  Downloading langchain_ollama-0.2.1-py3-none-any.whl.metadata (1.9 kB)
Collecting ollama<1,>=0.3.0 (from langchain-ollama)
  Downloading ollama-0.4.4-py3-none-any.whl.metadata (4.7 kB)
Collecting httpx<0.28.0,>=0.27.0 (from ollama<1,>=0.3.0->langchain-ollama)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Downloading langchain_ollama-0.2.1-py3-none-any.whl (15 kB)
Downloading ollama-0.4.4-py3-none-any.whl (13 kB)
Downloading httpx-0.27.2-py3-none-any.whl (76 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 76.4/76.4 kB 5.4 MB/s eta 0:00:00
Installing collected packages: httpx, ollama, langchain-ollama
  Attempting uninstall: httpx
    Found existing installation: httpx 0.28.0
    Uninstalling httpx-0.28.0:
      Successfully uninstalled httpx-0.28.0
Successfully installed httpx-0.27.2 langchain-ollama-0.2.1 ollama-0.4.4


## Imports
Import the libraries used in the rest of the notebook.

In [8]:
import ollama
from IPython.display import display, Markdown
import pandas as pd
from io import StringIO

## Defining our Dataset

We will define our dataset below, which represents the medal counts for each country in the Tokyo 2020 Olympics.

In [14]:
data = """
Country,Gold Medal,Silver Medal,Bronze Medal,Total
United States of America,39,41,33,113
People's Republic of China,38,32,18,88
Japan,27,14,17,58
Great Britain,22,21,22,65
ROC,20,28,23,71
Australia,17,7,22,46
Netherlands,10,12,14,36
France,10,12,11,33
Germany,10,11,16,37
Italy,10,10,20,40
Canada,7,6,11,24
Brazil,7,6,8,21
New Zealand,7,6,7,20
Cuba,7,3,5,15
Hungary,6,7,7,20
Republic of Korea,6,4,10,20
Poland,4,5,5,14
Czech Republic,4,4,3,11
Kenya,4,4,2,10
Norway,4,2,2,8
Jamaica,4,1,4,9
Spain,3,8,6,17
Sweden,3,6,0,9
Switzerland,3,4,6,13
Denmark,3,4,4,11
Croatia,3,3,2,8
Islamic Republic of Iran,3,2,2,7
Serbia,3,1,5,9
Belgium,3,1,3,7
Bulgaria,3,1,2,6
Slovenia,3,1,1,5
Uzbekistan,3,0,2,5
Georgia,2,5,1,8
Chinese Taipei,2,4,6,12
Turkey,2,2,9,13
Greece,2,1,1,4
Uganda,2,1,1,4
Ecuador,2,1,0,3
Ireland,2,0,2,4
Israel,2,0,2,4
Qatar,2,0,1,3
Bahamas,2,0,0,2
Kosovo,2,0,0,2
Ukraine,1,6,12,19
Belarus,1,3,3,7
Romania,1,3,0,4
Venezuela,1,3,0,4
India,1,2,4,7
"Hong Kong, China",1,2,3,6
Philippines,1,2,1,4
Slovakia,1,2,1,4
South Africa,1,2,0,3
Austria,1,1,5,7
Egypt,1,1,4,6
Indonesia,1,1,3,5
Ethiopia,1,1,2,4
Portugal,1,1,2,4
Tunisia,1,1,0,2
Estonia,1,0,1,2
Fiji,1,0,1,2
Latvia,1,0,1,2
Thailand,1,0,1,2
Bermuda,1,0,0,1
Morocco,1,0,0,1
Puerto Rico,1,0,0,1
Colombia,0,4,1,5
Azerbaijan,0,3,4,7
Dominican Republic,0,3,2,5
Armenia,0,2,2,4
Kyrgyzstan,0,2,1,3
Mongolia,0,1,3,4
Argentina,0,1,2,3
San Marino,0,1,2,3
Jordan,0,1,1,2
Malaysia,0,1,1,2
Nigeria,0,1,1,2
Bahrain,0,1,0,1
Saudi Arabia,0,1,0,1
Lithuania,0,1,0,1
North Macedonia,0,1,0,1
Namibia,0,1,0,1
Turkmenistan,0,1,0,1
Kazakhstan,0,0,8,8
Mexico,0,0,4,4
Finland,0,0,2,2
Botswana,0,0,1,1
Burkina Faso,0,0,1,1
Côte d'Ivoire,0,0,1,1
Ghana,0,0,1,1
Grenada,0,0,1,1
Kuwait,0,0,1,1
Republic of Moldova,0,0,1,1
Syrian Arab Republic,0,0,1,1
"""

# Use StringIO to simulate reading from a file
df = pd.read_csv(StringIO(data))

# Shuffle the rows so that they are no longer in ranked order
df = df.sample(frac=1).reset_index(drop=True)
df

Unnamed: 0,Country,Gold Medal,Silver Medal,Bronze Medal,Total
0,Morocco,1,0,0,1
1,Serbia,3,1,5,9
2,Latvia,1,0,1,2
3,Belarus,1,3,3,7
4,Kuwait,0,0,1,1
...,...,...,...,...,...
88,Philippines,1,2,1,4
89,Republic of Moldova,0,0,1,1
90,Ireland,2,0,2,4
91,Jordan,0,1,1,2
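
As a quick sanity check on how the CSV text is parsed, the sketch below reads three rows copied from the dataset above, confirms that the quoted name "Hong Kong, China" stays a single field, and checks that `Total` equals the sum of the three medal columns. The variable names are illustrative, and the `random_state` argument is an optional addition that makes the shuffle repeatable; the notebook itself shuffles without it.

```python
from io import StringIO

import pandas as pd

# Three rows copied from the dataset above, including the quoted country name
excerpt = """
Country,Gold Medal,Silver Medal,Bronze Medal,Total
United States of America,39,41,33,113
Japan,27,14,17,58
"Hong Kong, China",1,2,3,6
"""

sample = pd.read_csv(StringIO(excerpt))

# The quoted field stays one value even though it contains a comma
countries = sample["Country"].tolist()

# Check that Total matches the sum of the three medal columns
medal_columns = ["Gold Medal", "Silver Medal", "Bronze Medal"]
totals_match = (sample[medal_columns].sum(axis=1) == sample["Total"]).all()

# A fixed random_state makes the shuffle repeatable between runs
shuffled = sample.sample(frac=1, random_state=0).reset_index(drop=True)

print(countries)
print(totals_match)
```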


## Processing Questions About Our Dataset

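The function below lets the Llama 3.1:8b model answer a user's question about the dataset above. It converts a DataFrame column to a list, appends that list to the question, and sends the combined prompt to the model with `ollama.chat`.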

In [15]:
def process_question_with_data(df_column, question, model='llama3.1:8b'):
    """Ask the model a question about the values in a pandas Series."""
    data = df_column.tolist()

    # Append the data to the question so the model receives both in one prompt
    question_with_data = f"{question} Data: {data}"

    # Prepare the message for Ollama
    messages = [
        {
            'role': 'user',
            'content': question_with_data
        }
    ]

    # Request response from Ollama
    response = ollama.chat(model=model, messages=messages)

    # Return the response message content
    return response['message']['content']
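
To see what the model actually receives, the sketch below repeats only the prompt-building step of the function above, without calling Ollama. The helper name `build_messages` is hypothetical and used here purely for illustration.

```python
def build_messages(values, question):
    """Hypothetical helper mirroring the prompt construction above."""
    # The Python list is embedded in the prompt using its string form
    question_with_data = f"{question} Data: {values}"
    return [{'role': 'user', 'content': question_with_data}]


messages = build_messages(
    ['Japan 58', 'ROC 71'],
    "What team/country got the most medals?",
)
print(messages[0]['content'])
```

The whole column is sent as plain text in a single user message, so the model has to read and compare the numbers itself rather than run a computation.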


## Dataset Question Examples

In [17]:
question = "What team/country got the most medals?"
combined_data = df['Country'] + " " + df['Total'].astype(str)
response = process_question_with_data(combined_data, question)
display(Markdown(response))

Let's count the medals!

Here are the results:

**Top 10 countries/teams with the most medals:**

1. **United States of America** (113 medals)
2. People's Republic of China (88 medals)
3. Japan (58 medals)
4. Great Britain (65 medals) 
5. ROC (71 medals)
6. Australia (46 medals)
7. Italy (40 medals)
8. Netherlands (36 medals)
9. Germany (37 medals)
10. Canada (24 medals)

**Note:** I corrected the ranking of Great Britain to 5th and ROC to 2nd

In [18]:
question = "What team/country got the most gold medals?"
combined_data = df['Country'] + " " + df['Gold Medal'].astype(str)
response = process_question_with_data(combined_data, question)
display(Markdown(response))

A fun data question!

According to the data, the teams that got the most gold medals are:

1. **People's Republic of China**: 38 gold medals
2. **United States of America**: 39 gold medals ( wait, no!)
3. **ROC** (Russian Olympic Committee): 20 gold medals
4. **Japan**: 27 gold medals
5. **Australia**: 17 gold medals

But the team that got the most gold medals is actually... **The United States of America**, with a whopping **39** gold medals!

(Note: I'm assuming that "ROC" refers to the Russian Olympic Committee, which was formed after Russia's suspension from international sports competitions due to doping allegations.)
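
The model names the correct winner in both answers, but the rankings it lists are not sorted correctly (for example, Japan with 58 medals appears above ROC with 71). One way to cross-check answers like these is to compute them directly with pandas. The sketch below uses only the five top rows, copied from the dataset above, so that it runs on its own; the same calls should apply to the notebook's full `df`.

```python
import pandas as pd

# Five rows copied from the dataset above (Gold Medal and Total columns only)
top_rows = pd.DataFrame({
    "Country": [
        "United States of America",
        "People's Republic of China",
        "Japan",
        "Great Britain",
        "ROC",
    ],
    "Gold Medal": [39, 38, 27, 22, 20],
    "Total": [113, 88, 58, 65, 71],
})

# idxmax returns the index label of the row holding the largest value
most_medals = top_rows.loc[top_rows["Total"].idxmax(), "Country"]
most_gold = top_rows.loc[top_rows["Gold Medal"].idxmax(), "Country"]

# Sorting gives a ranking that can be compared with the model's list
ranking_by_total = (
    top_rows.sort_values("Total", ascending=False)["Country"].tolist()
)

print(most_medals)
print(most_gold)
print(ranking_by_total)
```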