## Introduction

This notebook demonstrates how the Llama 3.2 model, served locally through Ollama, can answer natural-language questions about a given dataset.


## Installation

The cell below makes the ollama_install.sh script executable and then runs it to install Ollama on your machine.

In [1]:
!chmod +x /content/ollama_install.sh
!/content/ollama_install.sh

Installing Ollama on your environment
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


## Run Ollama Service
The script below starts the Ollama service in the background, so it stays active while the notebook continues executing other cells.

In [4]:
%run '/content/ollama_thread.py'
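
The contents of ollama_thread.py are not shown in this notebook. As a rough sketch of one way such a script could work, the code below assumes it launches `ollama serve` on a daemon thread; the helper names `run_server`, `start_in_background`, and `is_server_up` are hypothetical, and the address is the one printed by the installer above.

```python
import subprocess
import threading
import urllib.request


def run_server(command):
    """Run the server command and block until the process exits."""
    subprocess.run(command, check=False)


def start_in_background(command=("ollama", "serve")):
    """Start the command on a daemon thread so the caller is not blocked."""
    thread = threading.Thread(target=run_server, args=(list(command),), daemon=True)
    thread.start()
    return thread


def is_server_up(url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if an HTTP GET to the server succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as reply:
            return reply.status == 200
    except OSError:
        # Connection refused, timeouts, and HTTP errors all end up here
        return False
```

In a notebook, calling `start_in_background()` once and then polling `is_server_up()` before pulling a model would avoid sending requests to a server that has not finished starting.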

## Downloading Ollama's Llama 3.2 Model

The command below downloads the Llama 3.2 model from Ollama's model library into the notebook's environment.


In [5]:
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff...   0% ▕▏    0 B/2.0 GB
pulling dde5aa3fc5ff...   0% ▕▏ 855 KB/2.0 GB
pulling dde5aa3fc5ff...   1% ▕▏  18 MB/2.0 GB
pulling dde5aa3fc5ff...   2% ▕▏

## Downloading LangChain Ollama

The cell below installs langchain-ollama, the LangChain integration package that allows Ollama models to be used within the LangChain framework.


In [6]:
!pip install langchain-ollama



## Imports
The cell below imports the Ollama Python client, IPython's display helpers, pandas, and StringIO.

In [7]:
import ollama
from IPython.display import display, Markdown
import pandas as pd
from io import StringIO

## Defining our Dataset

We will define our dataset below, which represents the medal counts for each country in the Tokyo 2020 Olympics.

In [8]:
data = """
Country,Gold Medal,Silver Medal,Bronze Medal,Total
United States of America,39,41,33,113
People's Republic of China,38,32,18,88
Japan,27,14,17,58
Great Britain,22,21,22,65
ROC,20,28,23,71
Australia,17,7,22,46
Netherlands,10,12,14,36
France,10,12,11,33
Germany,10,11,16,37
Italy,10,10,20,40
Canada,7,6,11,24
Brazil,7,6,8,21
New Zealand,7,6,7,20
Cuba,7,3,5,15
Hungary,6,7,7,20
Republic of Korea,6,4,10,20
Poland,4,5,5,14
Czech Republic,4,4,3,11
Kenya,4,4,2,10
Norway,4,2,2,8
Jamaica,4,1,4,9
Spain,3,8,6,17
Sweden,3,6,0,9
Switzerland,3,4,6,13
Denmark,3,4,4,11
Croatia,3,3,2,8
Islamic Republic of Iran,3,2,2,7
Serbia,3,1,5,9
Belgium,3,1,3,7
Bulgaria,3,1,2,6
Slovenia,3,1,1,5
Uzbekistan,3,0,2,5
Georgia,2,5,1,8
Chinese Taipei,2,4,6,12
Turkey,2,2,9,13
Greece,2,1,1,4
Uganda,2,1,1,4
Ecuador,2,1,0,3
Ireland,2,0,2,4
Israel,2,0,2,4
Qatar,2,0,1,3
Bahamas,2,0,0,2
Kosovo,2,0,0,2
Ukraine,1,6,12,19
Belarus,1,3,3,7
Romania,1,3,0,4
Venezuela,1,3,0,4
India,1,2,4,7
"Hong Kong, China",1,2,3,6
Philippines,1,2,1,4
Slovakia,1,2,1,4
South Africa,1,2,0,3
Austria,1,1,5,7
Egypt,1,1,4,6
Indonesia,1,1,3,5
Ethiopia,1,1,2,4
Portugal,1,1,2,4
Tunisia,1,1,0,2
Estonia,1,0,1,2
Fiji,1,0,1,2
Latvia,1,0,1,2
Thailand,1,0,1,2
Bermuda,1,0,0,1
Morocco,1,0,0,1
Puerto Rico,1,0,0,1
Colombia,0,4,1,5
Azerbaijan,0,3,4,7
Dominican Republic,0,3,2,5
Armenia,0,2,2,4
Kyrgyzstan,0,2,1,3
Mongolia,0,1,3,4
Argentina,0,1,2,3
San Marino,0,1,2,3
Jordan,0,1,1,2
Malaysia,0,1,1,2
Nigeria,0,1,1,2
Bahrain,0,1,0,1
Saudi Arabia,0,1,0,1
Lithuania,0,1,0,1
North Macedonia,0,1,0,1
Namibia,0,1,0,1
Turkmenistan,0,1,0,1
Kazakhstan,0,0,8,8
Mexico,0,0,4,4
Finland,0,0,2,2
Botswana,0,0,1,1
Burkina Faso,0,0,1,1
Côte d'Ivoire,0,0,1,1
Ghana,0,0,1,1
Grenada,0,0,1,1
Kuwait,0,0,1,1
Republic of Moldova,0,0,1,1
Syrian Arab Republic,0,0,1,1
"""

# Use StringIO to simulate reading from a file
df = pd.read_csv(StringIO(data))

# Shuffle the rows so they are no longer sorted by medal rank
df = df.sample(frac=1).reset_index(drop=True)
df

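| | Country | Gold Medal | Silver Medal | Bronze Medal | Total |
|---|---|---|---|---|---|
| 0 | Mongolia | 0 | 1 | 3 | 4 |
| 1 | Great Britain | 22 | 21 | 22 | 65 |
| 2 | Morocco | 1 | 0 | 0 | 1 |
| 3 | Ethiopia | 1 | 1 | 2 | 4 |
| 4 | Ecuador | 2 | 1 | 0 | 3 |
| ... | ... | ... | ... | ... | ... |
| 88 | Romania | 1 | 3 | 0 | 4 |
| 89 | Belarus | 1 | 3 | 3 | 7 |
| 90 | Argentina | 0 | 1 | 2 | 3 |
| 91 | Norway | 4 | 2 | 2 | 8 |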
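
Before asking a model about the data, it can help to confirm that each Total equals the sum of the three medal columns. The sketch below uses only the standard library and a three-row excerpt of the CSV above; `inconsistent_rows` is a hypothetical helper, not part of the notebook. The excerpt also shows why "Hong Kong, China" is quoted in the source: without the quotes, its comma would be read as a field separator.

```python
import csv
from io import StringIO

# Excerpt of the notebook's CSV, including the quoted "Hong Kong, China" row
excerpt = '''Country,Gold Medal,Silver Medal,Bronze Medal,Total
United States of America,39,41,33,113
"Hong Kong, China",1,2,3,6
Kazakhstan,0,0,8,8
'''


def inconsistent_rows(csv_text):
    """Return the countries whose Total differs from Gold + Silver + Bronze."""
    bad = []
    for row in csv.DictReader(StringIO(csv_text)):
        medal_sum = (
            int(row["Gold Medal"]) + int(row["Silver Medal"]) + int(row["Bronze Medal"])
        )
        if medal_sum != int(row["Total"]):
            bad.append(row["Country"])
    return bad


print(inconsistent_rows(excerpt))  # []
```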


## Processing Questions About Our Dataset

The function below lets the Llama 3.2 model answer a user's question about the dataset above by appending the values of the selected column to the question.

In [9]:
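def process_question_with_data(df_column, question, model='llama3.2'):
    """Send a question plus the values of one pandas Series to an Ollama model."""
    # Convert the Series to a plain Python list so it can be embedded in the prompt
    data = df_column.tolist()

    # Append the data to the question to form a single prompt
    question_with_data = f"{question} Data: {data}"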

    # Prepare the message for Ollama
    messages = [
        {
            'role': 'user',
            'content': question_with_data
        }
    ]

    # Request response from Ollama
    response = ollama.chat(model=model, messages=messages)

    # Return the response message content
    return response['message']['content']
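
To make the prompt format concrete, the sketch below reproduces only the string-building step of the function above, without contacting Ollama. `build_prompt` is a hypothetical helper introduced for illustration, and the three rows are taken from the dataset.

```python
def build_prompt(question, rows):
    """Mirror the f-string used in process_question_with_data."""
    return f"{question} Data: {rows}"


rows = ["Japan 58", "ROC 71", "Australia 46"]
prompt = build_prompt("What team/country got the most medals?", rows)
print(prompt)
# What team/country got the most medals? Data: ['Japan 58', 'ROC 71', 'Australia 46']
```

The list is embedded using Python's own list formatting, so the model receives the brackets and quotes as part of the prompt text.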


## Dataset Question Examples

In [10]:
question = "What team/country got the most medals?"
combined_data = df['Country'] + " " + df['Total'].astype(str)
response = process_question_with_data(combined_data, question)
display(Markdown(response))

To find the team/country with the most medals, we need to process the data and count the number of medals for each country.

Here is the Python code that achieves this:

```python
# Data
data = [
    ['Mongolia', 4],
    ['Great Britain', 65],
    ['Morocco', 1],
    ['Ethiopia', 4],
    ['Ecuador', 3],
    ['Lithuania', 1],
    ['Japan', 58],
    ['Chinese Taipei', 12],
    ['Azerbaijan', 7],
    ['Botswana', 1],
    ['Indonesia', 5],
    ['United States of America', 113],
    ['Namibia', 1],
    ['India', 7],
    ['Croatia', 8],
    ['Czech Republic', 11],
    ['Armenia', 4],
    ['Serbia', 9],
    ['Uganda', 4],
    ['Nigeria', 2],
    ['North Macedonia', 1],
    ['ROC', 71],
    ['Dominican Republic', 5],
    ['Slovakia', 4],
    ['Philippines', 4],
    ['Uzbekistan', 5],
    "People's Republic of China", 88,
    'Australia', 46,
    'Spain', 17,
    'Burkina Faso', 1,
    'Germany', 37,
    'Poland', 14,
    'Bermuda', 1,
    'Switzerland', 13,
    'Greece', 4,
    'Ghana', 1,
    'Bulgaria', 6,
    'Mexico', 4,
    'Jamaica', 9,
    'Qatar', 3,
    'Ireland', 4,
    'Republic of Korea', 20,
    'Grenada', 1,
    'Puerto Rico', 1,
    'Cuba', 15,
    'Hong Kong, China', 6,
    'Georgia', 8,
    'Sweden', 9,
    'Israel', 4,
    'Brazil', 21,
    'Republic of Moldova', 1,
    'Bahrain', 1,
    'San Marino', 3,
    'Turkey', 13,
    'Saudi Arabia', 1,
    'Estonia', 2,
    'Hungary', 20,
    'Islamic Republic of Iran', 7,
    'Slovenia', 5,
    'Kenya', 10,
    'Kosovo', 2,
    'Ukraine', 19,
    'Finland', 2,
    'Thailand', 2,
    'Jordan', 2,
    'Kuwait', 1,
    'Bahamas', 2,
    'South Africa', 3,
    'Syrian Arab Republic', 1,
    'Malaysia', 2,
    "Côte d'Ivoire", 1,
    'Italy', 40,
    'Venezuela', 4,
    'Colombia', 5,
    'Netherlands', 36,
    'Latvia', 2,
    'Fiji', 2,
    'France', 33,
    'Austria', 7,
    'Kyrgyzstan', 3,
    'Canada', 24,
    'Belgium', 7,
    'Denmark', 11,
    'Portugal', 4,
    'Kazakhstan', 8,
    'New Zealand', 20,
    'Egypt', 6,
    'Tunisia', 2,
    'Romania', 4,
    'Belarus', 7,
    'Argentina', 3,
    'Norway', 8,
    'Turkmenistan', 1
]

# Create a dictionary to store the count of medals for each country
medal_count = {}

for country, medals in data:
    # Skip teams that are not recognized by the IAAF (International Association of Athletics Federations)
    if not country.replace(',', '').isalpha():
        continue
    
    # Increment the medal count for the current country
    if country in medal_count:
        medal_count[country] += medals
    else:
        medal_count[country] = medals

# Find the team/country with the most medals
max_medals_country = max(medal_count, key=medal_count.get)

print(f"The team/country with the most medals is {max_medals_country} with a total of {medal_count[max_medals_country]} medals.")
```

The output will be:
```
The team/country with the most medals is United States of America with a total of 113 medals.
```

In [11]:
question = "What team/country got the most gold medals?"
combined_data = df['Country'] + " " + df['Gold Medal'].astype(str)
response = process_question_with_data(combined_data, question)
display(Markdown(response))

The United States of America has the most gold medals with 39.
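
The replies above are generated text: the notebook only displays them and never runs the Python that the first reply contains. It is therefore worth cross-checking the answers deterministically. The sketch below works on strings in the same "Country value" shape as `combined_data`; `top_entry` is a hypothetical helper, and the sample values are a small excerpt of the dataset.

```python
def top_entry(entries):
    """Return (name, value) for the largest value among 'name value' strings."""
    parsed = []
    for entry in entries:
        # Split on the last space only, so multi-word names stay intact
        name, value = entry.rsplit(" ", 1)
        parsed.append((name, int(value)))
    return max(parsed, key=lambda pair: pair[1])


totals = [
    "Mongolia 4",
    "Great Britain 65",
    "United States of America 113",
    "ROC 71",
    "Hong Kong, China 6",
]
golds = [
    "Mongolia 0",
    "Great Britain 22",
    "United States of America 39",
    "ROC 20",
    "Hong Kong, China 1",
]

print(top_entry(totals))  # ('United States of America', 113)
print(top_entry(golds))   # ('United States of America', 39)
```

In the notebook itself, `top_entry(combined_data.tolist())` would apply the same check to the full column.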