<div class="alert alert-block alert-info">
Author:<br>Felix Gonzalez, P.E. <br> Adjunct Instructor, <br> Division of Professional Studies <br> Computer Science and Electrical Engineering <br> University of Maryland Baltimore County <br> fgonzale@umbc.edu
</div>

# Table of Contents

# Ollama API and Python  

This notebook will provide an example of running LLMs via Ollama API and Python as well as an example of using a custom datafram as input into an LLM model. 

### References and Inspirations:

Throughout these LLM notebooks, besides the official documentation in the previous notebook, there are various online resources and tutorials that were used as inspiration to create the final example here presented. These resources are specified in its related respective section.

References and Inspirations:
- RAG Agent: https://dev.to/dmuraco3/how-to-create-a-local-rag-agent-with-ollama-and-langchain-1m9a
- Local RAG Tutorial (Ollama): https://www.youtube.com/watch?v=Oe-7dGDyzPM&t=372s
- Readability of Output: https://www.youtube.com/watch?v=T8emnz9uaf0&t=184s
- RAG from the Ground Up with Python and Ollama: https://www.youtube.com/watch?v=V1Mz8gMBDMo&t

### Ollama Documentation:
- Ollama Documentation (https://github.com/ollama/ollama/blob/main/README.md): Note the Model Library table and each model computational requirements which may vary significantly. Approximately 1 GB of RAM and 0.8 GB in size for every billion parameters in the model.
- Ollama Github Windows Documentation at: https://github.com/ollama/ollama/blob/main/docs/windows.md


### Ollama API Parameters
List of API Parameters can be found in the documentation at: https://github.com/ollama/ollama/blob/main/docs/api.md. There are two main APIs, the Generate and the Chat. The examples in this notebook utilize the Chat API and include the following parameters: (default: 5m)

#### Parameters for Chat API

- model: (required) the model name
- messages: the messages of the chat, this can be used to keep a chat memory
- tools: list of tools in JSON for the model to use if supported

The message object has the following fields:
- role: the role of the message, either system, user, assistant, or tool
- content: the content of the message
- images (optional): a list of images to include in the message (for multimodal models such as llava)
- tool_calls (optional): a list of tools in JSON that the model wants to use

Advanced parameters (optional):
- format: the format to return a response in. Format can be json or a JSON schema.
- options: additional model parameters listed in the documentation for the Modelfile such as temperature
- stream: if false the response will be returned as a single response object, rather than a stream of objects
- keep_alive: controls how long the model will stay loaded into memory following the request (default: 5m)

# Python Data Library Loading

In [1]:
import requests
import json

import pandas as pd

from IPython.display import clear_output, Markdown, display

# LLM Locally Via Ollama API Requests

Note that before running this notebook you will need to load the Ollama application. 

In [2]:
# Model to use:
model_selection = "mistral" # "mistral" and "llama3.2" are the two options

In [3]:
# Setup the base URL for Locall Ollama API
url = "http://127.0.0.1:11434/api/chat"

In [4]:
def ollama_api_request(prompt):
    # Send the HTTP POST request with streaming enabled.
    payload = {"model" : model_selection, # Model we will be using.
               "messages" : [{"role":"user", "content":prompt}]}
    response = requests.post(url, json = payload, stream = True)

    if response.status_code == 200:
        print("Streaming response from Ollama:")
        for line in response .iter_lines(decode_unicode = True):
            if line: # Ignore empty lines
                try:
                    # Parse each line as a JSON Object
                    json_data = json.loads(line)
                    # Extract and print the assistant's message content
                    if "message" in json_data and "content" in json_data["message"]:
                        print(json_data["message"]["content"], end = "")
                except json.JSONDecodeError:
                    print(f"\nFailed to parse line: {line}")
        print() # Ensure the final output ends with a new line.
    else:
        print(f"Error querying Ollama API: {response.status_code}")

In [5]:
while True: # Loop to continue chatting with Ollama Model.
    user_input = input("How can I assist you? (Type 'Goodbye', 'Bye', or 'Exit' to quit)")
    ollama_api_request(user_input)
    if user_input.lower() == "exit" or user_input.lower() == 'goodbye' or user_input.lower() == 'bye': # Exits program.
        print("Goodbye!")
        break
    else:
        print("")

# Sample Prompts:
# Describe what is Python in three sentences.
# What is a Python snake?
# Tell me about the Python snake animal.
# What date is it today?
# How can I setup a LLM so that it can provide me the date and time?

How can I assist you? (Type 'Goodbye' or 'Exit' to quit) Describe what is Python in three sentences.


Streaming response from Ollama:
1. Python is a high-level, interpreted programming language known for its readability and simplicity, making it an excellent choice for beginners as well as seasoned developers.

2. It offers dynamic typing and garbage collection, which means you don't have to declare the data type of a variable before using it, and Python automatically manages memory that is no longer in use.

3. Python has extensive libraries and frameworks for various applications such as web development (Django, Flask), data analysis (Pandas, NumPy), machine learning (Scikit-learn, TensorFlow) and artificial intelligence, making it a versatile tool in the field of programming.



How can I assist you? (Type 'Goodbye' or 'Exit' to quit) Exit


Streaming response from Ollama:
 Goodbye! If you have any more questions or need assistance in the future, feel free to ask. Have a great day!
Goodbye!


Ollama should have provided a response above.

# Using LLM with Custom Data

One of the main allures of LLMs is using the generative AI capabilities and the ability to ask questions related to datasets that were not using as part of the training. This potentially opens the door for significant efficiency improvements in analytical processes that were previously unseen. This section provides some example datasets and example prompts to evaluate how the model responds.

Note there are various methods to integrate a LLM with a custom dataset.

### Ollama API: Modified Query Function with Data Input

In the code script below, we will add the dataframe as an input parameter into the API request function.

In [6]:
def ollama_api_request_wdata(prompt, input_data_df):
    global response 
    
    df_json = input_data_df.to_json() # Convert DataFrame to JSON
    prompt_content = f'Given this data {df_json}' + ' ' + prompt # Combines data with prompt
    
    # Send the HTTP POST request with streaming enabled.  
    payload = {"model" : model_selection, # Model we will be using.
               "messages" : [{"role":"user", 
                              "content":prompt_content}]}
    response = requests.post(url, json = payload, stream = True)

    if response.status_code == 200:
        # Checks that response is in x-ndjson (Newline-Delimited JSON)
        if response.headers.get("Content-Type") == "application/x-ndjson":
            print("Streaming response from Ollama:")
            for line in response.iter_lines(decode_unicode = True):
                if line: # Ignore empty lines
                    try:
                        # Parse each line as a JSON Object
                        json_data = json.loads(line)
                        # Extract and print the assistant's message content
                        if "message" in json_data and "content" in json_data["message"]:
                            print(json_data["message"]["content"], end = "")
                    except json.JSONDecodeError:
                        print(f"\nFailed to parse line: {line}")
            print() # Ensure the final output ends with a new line.
    else:
        print(f"Error querying Ollama API: {response.status_code}")

### Chat Function

The chat functions starts an instance for making calls into the Ollama API and clears the output of the cell when a new question is asked. This provides a more dynamic interaction between the model and the user in the Jupyter Notebook environment and could easily be moved to a dashboard.

In [7]:
def chat_function(input_data_df):
    while True: # Loop to continue chatting with Ollama Model.
        user_input = input("How can I assist you? (Type 'Goodbye', 'Bye' or 'Exit' to quit)")
        if user_input.lower() == "exit" or user_input.lower() == 'goodbye 'or user_input.lower() == 'bye': # Exits program.
            print("Goodbye!")
            break
        else:
            ollama_api_request_wdata(prompt = user_input, input_data_df = input_data_df)
            print("")
            user_input = input("Do you have another question? (Type Yes, or 'Goodbye', 'Bye' or 'Exit' to quit)")
            if user_input.lower() == "exit" or user_input.lower() == 'goodbye' or user_input.lower() == 'bye':
                clear_output(wait=True)
                print("Goodbye!")
                break
            else:
                clear_output(wait=True)
                continue

# Example 1: Names Data Dictionary

This dataset includes names, age, occupation and hobbies of the different people. The dataset has various features integrated in order to test  questions with focus on analysis of the data. From various occupations at very different levels which can be used to have the model evaluate which occupation is more important, patterns, and misspellings to test how the model reacts.  

In [8]:
# Example DataFrame
data = {'Name': ['Alice', 'Bob', 'John', 'Mary', 'Jackie', 'Camila'], 
        'Age': [25, 30, 45, 60, 45, 21], 
        'Occupation': ['Engineer', 'Doctor', 'Engineer', 'Mechanic', 'Doctor', 'CEO'],
        'Hobby': ['Sailing', 'Basketball', 'Sailing', 'Cycling', 'Biking', 'Sailin']} 
df_names = pd.DataFrame(data)

# Convert DataFrame to JSON
df_json = df_names.to_json()

#### Data Processing and Ollama Query Functions

The Ollama model itself doesn’t natively accept pandas DataFrames as direct input for processing or questioning. One option is to convert the dataframe into text-based structure (like a JSON string, plain text table, or other formats) that Ollama can interpret. Other options traditionally used with LLMs include Retrieval Augmented Generation (RAG).  

For this example, we will explore model performance and accuracy by converting submitting a dataset in JSON format and use it as input into the prompt to ask our questions. Using the previous function, the request to the Ollama API might look like this:

In [9]:
# Prompt with processed data
ollama_api_request(f"Given this data {df_json}. Tell me something interesting about this data?")

Streaming response from Ollama:
 This data contains 6 unique individuals with diverse ages, occupations, and hobbies. Interestingly, both Alice and John have the same occupation as Engineers, and they also share a common hobby - Sailing. Additionally, two people, Alice (age 25) and Camila (age 21), are relatively young compared to Bob (age 30), John (age 45), Mary (age 60), and Jackie (age 45).


In [10]:
chat_function(input_data_df = df_names)

# Sample interesting prompts:
# Tell me 3 interesting facts about this data.
# What can you infer from this data?
# Who has the most important occupation?
# Which person has the most beneficial occupation to humanity?
# Which employees are doctors?
# Who is the youngest employee?
# Calculate the average age of the employees.
# Does it make sense that the youngest employee is a CEO?

How can I assist you? (Type 'Goodbye' or 'Exit' to quit) Exit


Goodbye!


# Example 2: Movie Data

In [40]:
# LOADING CSV FILE
# Na_values may need to be reviewed as some datasets may include an accronym.
# For example, 'NA' may be an abbreviation for 'North America'.
# Release_cy will not be parsed as a date.
df_movies = pd.read_csv('./input_data/df_data_clean_movies_db.csv', 
                        encoding = "utf-8-sig",
                        parse_dates=['release_date', 'release_cy_quarter', 'release_cy_month'],
                        date_format = '%Y-%m-%d') 

In [41]:
df_movies.sample(3) # Check sampling of the data records.

Unnamed: 0,budget,genres,id,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,...,release_cy_month,box_office_earning,revenue_budget_ratio,title_char_count,overview_char_count,title_word_count,overview_word_count,original_title_overview,norm_text_lemma,norm_text_stem
2117,22000000,"[{""id"": 35, ""name"": ""Comedy""}, {""id"": 80, ""nam...",10878,en,Saving Silverman,A pair of buddies conspire to save their best ...,12.887673,"[{""name"": ""Village Roadshow Pictures"", ""id"": 7...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2001-02-09,...,2001-02,41351569,0.879617,16,177,2,31,Saving Silverman: A pair of buddies conspire t...,save silverman pair buddy conspire save best f...,save silverman pair buddi conspir save best fr...
742,90000000,"[{""id"": 16, ""name"": ""Animation""}, {""id"": 35, ""...",863,en,Toy Story 2,"Andy heads off to Cowboy Camp, leaving his toy...",73.575118,"[{""name"": ""Pixar Animation Studios"", ""id"": 3}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",1999-10-30,...,1999-10,587366869,5.526299,11,322,3,57,"Toy Story 2: Andy heads off to Cowboy Camp, le...",toy story andy head cowboy camp leave toy devi...,toy stori andi head cowboy camp leav toy devic...
2780,5000000,"[{""id"": 27, ""name"": ""Horror""}, {""id"": 53, ""nam...",170,en,28 Days Later,Twenty-eight days after a killer virus was acc...,45.490374,"[{""name"": ""DNA Films"", ""id"": 284}, {""name"": ""B...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""}]",2002-10-31,...,2002-10,87719885,16.543977,13,342,3,53,28 Days Later: Twenty-eight days after a kille...,day later twenty eight day killer virus accide...,day later twenti eight day killer viru acciden...


In [42]:
#df_movies.info() # Check how the data loaded.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3225 entries, 0 to 3224
Data columns (total 46 columns):
 #   Column                   Non-Null Count  Dtype         
---  ------                   --------------  -----         
 0   budget                   3225 non-null   int64         
 1   genres                   3225 non-null   object        
 2   id                       3225 non-null   int64         
 3   original_language        3225 non-null   object        
 4   original_title           3225 non-null   object        
 5   overview                 3225 non-null   object        
 6   popularity               3225 non-null   float64       
 7   production_companies     3225 non-null   object        
 8   production_countries     3225 non-null   object        
 9   release_date             3225 non-null   datetime64[ns]
 10  revenue                  3225 non-null   int64         
 11  runtime                  3225 non-null   int64         
 12  spoken_languages         3225 non-

In [43]:
chat_function(input_data_df = df_movies)

# Sample interesting prompts:
# Tell me 3 interesting facts about this data.
# What can you infer from this data?
# Calculate trends on the data.
# What is the genre most represented?
# How many movies are under the documentary genre column?
# How many movies are under the documentary genre column? Provide method for calculation.
# How many movies are under the documentary genre column? Provide method for calculation in Python.
# Calculate how many movies are under the documentary genre column using the Python value_counts function.
# How many Harry Potter movies are in the dataset?

Goodbye!


In [33]:
# Check
df_movies['Documentary'].value_counts()

Documentary
0    3187
1      38
Name: count, dtype: int64

In [38]:
df_movies[df_movies['original_title_overview'].str.contains('Harry Potter')]

Unnamed: 0,budget,genres,id,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,...,release_cy_month,box_office_earning,revenue_budget_ratio,title_char_count,overview_char_count,title_word_count,overview_word_count,original_title_overview,norm_text_lemma,norm_text_stem
415,250000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",767,en,Harry Potter and the Half-Blood Prince,"As Harry begins his sixth year at Hogwarts, he...",98.885637,"[{""name"": ""Warner Bros."", ""id"": 6194}, {""name""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2009-07-07,...,2009-07,1183959197,3.735837,38,174,6,30,Harry Potter and the Half-Blood Prince: As Har...,harry potter half blood prince harry begin six...,harri potter half blood princ harri begin sixt...
517,150000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",675,en,Harry Potter and the Order of the Phoenix,Returning for his fifth year of study at Hogwa...,78.144395,"[{""name"": ""Warner Bros."", ""id"": 6194}, {""name""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2007-06-28,...,2007-06,1088212738,6.254752,41,318,8,55,Harry Potter and the Order of the Phoenix: Ret...,harry potter order phoenix return fifth year s...,harri potter order phoenix return fifth year s...
518,150000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",674,en,Harry Potter and the Goblet of Fire,"Harry starts his fourth year at Hogwarts, comp...",101.250416,"[{""name"": ""Patalex IV Productions Limited"", ""i...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2005-11-05,...,2005-11,1045921036,5.972807,35,261,7,43,Harry Potter and the Goblet of Fire: Harry sta...,harry potter goblet fire harry start fourth ye...,harri potter goblet fire harri start fourth ye...
593,130000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",673,en,Harry Potter and the Prisoner of Azkaban,"Harry, Ron and Hermione return to Hogwarts for...",79.679601,"[{""name"": ""1492 Pictures"", ""id"": 436}, {""name""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2004-05-31,...,2004-05,919804554,6.07542,40,229,7,39,Harry Potter and the Prisoner of Azkaban: Harr...,harry potter prisoner azkaban harry ron hermio...,harri potter prison azkaban harri ron hermion ...
599,125000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",671,en,Harry Potter and the Philosopher's Stone,Harry Potter has lived under the stairs at his...,109.984351,"[{""name"": ""1492 Pictures"", ""id"": 436}, {""name""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2001-11-16,...,2001-11,1101475550,7.811804,40,412,6,74,Harry Potter and the Philosopher's Stone: Harr...,harry potter philosopher stone harry potter li...,harri potter philosoph stone harri potter live...
677,100000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",672,en,Harry Potter and the Chamber of Secrets,"Ignoring threats to his life, Harry returns to...",132.397737,"[{""name"": ""1492 Pictures"", ""id"": 436}, {""name""...","[{""iso_3166_1"": ""DE"", ""name"": ""Germany""}, {""is...",2002-11-13,...,2002-11,976688482,8.766885,39,132,7,23,Harry Potter and the Chamber of Secrets: Ignor...,harry potter chamber secret ignore threat life...,harri potter chamber secret ignor threat life ...
2177,20000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",9760,en,Epic Movie,"When Edward, Peter, Lucy and Susan each follow...",6.064638,"[{""name"": ""Twentieth Century Fox Film Corporat...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-01-25,...,2007-01,106865564,4.343278,10,400,2,67,"Epic Movie: When Edward, Peter, Lucy and Susan...",epic movie edward peter lucy susan follow path...,epic movi edward peter luci susan follow path ...


### Filtered Movie Dataset: Adventure Genre

In [81]:
# Say we want to filter 
df_movies_filtered = df_movies[df_movies['Adventure'] == 1].reset_index().copy()
df_movies_filtered.sample(3)

Unnamed: 0,index,budget,genres,id,original_language,original_title,overview,popularity,production_companies,production_countries,...,release_cy_month,box_office_earning,revenue_budget_ratio,title_char_count,overview_char_count,title_word_count,overview_word_count,original_title_overview,norm_text_lemma,norm_text_stem
550,1796,30000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 16, ""...",533,en,The Curse of the Were-Rabbit,Cheese-loving eccentric Wallace and his cunnin...,30.380747,"[{""name"": ""Aardman Animations"", ""id"": 297}, {""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",...,2005-09,222452832,6.415094,28,443,5,69,The Curse of the Were-Rabbit: Cheese-loving ec...,curse rabbit cheese love eccentric wallace cun...,curs rabbit chees love eccentr wallac cun cani...
38,131,24000000,"[{""id"": 878, ""name"": ""Science Fiction""}, {""id""...",168,en,Star Trek IV: The Voyage Home,Fugitives of the Federation for their daring r...,22.258428,"[{""name"": ""Paramount Pictures"", ""id"": 4}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",...,1986-11,157000000,5.541667,29,573,6,101,Star Trek IV: The Voyage Home: Fugitives of th...,star trek iv voyage home fugitive federation d...,star trek iv voyag home fugit feder dare rescu...
411,1066,58000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 28, ""...",710,en,GoldenEye,James Bond must unmask the mysterious head of ...,59.824565,"[{""name"": ""United Artists"", ""id"": 60}, {""name""...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",...,1995-11,410194034,6.072311,9,175,1,27,GoldenEye: James Bond must unmask the mysterio...,goldeneye james bond must unmask mysterious he...,goldeney jame bond must unmask mysteri head ja...


In [82]:
#df_movies_filtered.info()
# 659 records under adventure genre.

In [11]:
chat_function(input_data_df = df_movies_filtered)

# Sample interesting prompts:
# Tell me 3 interesting facts about this data.
# What can you infer from this data?
# Calculate trends on the data.
# What is the genre most represented?
# How many Harry Potter movies are in the dataset?
# What is the most important film?

NameError: name 'df_movies_filtered' is not defined

### Filtered Movie Dataset: Contain Word Harry

In [53]:
df_movies_filtered = df_movies[df_movies['original_title_overview'].str.contains('Harry Potter')].reset_index().copy()
df_movies_filtered.shape

(7, 47)

In [54]:
chat_function(input_data_df = df_movies_filtered)

Goodbye!


# LLM Locally with Ollama Library and Client

Note that before running this notebook you will need to load the Ollama application. Using the Ollama library allows calling various functiosn that simplify the code. Another benefit of the Ollama library is that the output can be extracted and shown in Markdown which improves readability.

In [55]:
# Import Ollama Library
import ollama

In [56]:
# Initialize the Ollama model (Need to previously have model installed locally. See Notebook 0.)
response = ollama.chat(model = model_selection, 
                       messages=[{'role': 'user',
                                  'content': 'Describe what is Python in three sentences.'
                                  }])

print(response['message']['content'])

# Sample Prompts:
# Describe what is Python in three sentences.
# What is a Python snake?
# Tell me about the Python snake animal.
# What date is it today?
# How can I setup a LLM so that it can provide me the date and time?

Python is a high-level, interpreted programming language that is widely used for various applications such as web development, data analysis, artificial intelligence, and more. Created in the late 1980s by Guido van Rossum, Python is known for its simplicity, readability, and flexibility, making it an ideal language for beginners and experienced programmers alike. Its vast number of libraries and frameworks, including NumPy, pandas, and Django, have made Python a popular choice among developers across industries.


In [58]:
prompt = 'Describe what is Python in three sentences.'

response_stream = ollama.chat(model = model_selection,
                              messages = [{'role': 'user',
                                           'content': prompt}],
                              stream = True
                             )

# Initialize an empty response string
streamed_response = ""

for token in response_stream:
    # Append the current tokens to the response
    streamed_response += token['message']['content']

# Clear previous output and display the updated response
    clear_output(wait = True)
    display(Markdown(f"**LLM Response (Streaming):**\n\n{streamed_response}"))

**LLM Response (Streaming):**

Python is a high-level, interpreted programming language that is widely used for various purposes such as web development, data analysis, artificial intelligence, and more. It is known for its simplicity, readability, and ease of use, making it an ideal choice for beginners and experienced programmers alike. Python's vast libraries and frameworks, such as NumPy and Django, provide a rich foundation for building efficient and scalable applications.

In [64]:
def ollama_lib_request(prompt):
    response_stream = ollama.chat(model = model_selection,
                                  messages = [{'role': 'user',
                                               'content': prompt}],
                                  stream = True)

    # Initialize an empty response string
    streamed_response = ""

    for token in response_stream:
        # Append the current tokens to the response
        streamed_response += token['message']['content']

    # Clear previous output and display the updated response
        clear_output(wait = True)
        display(Markdown(f"**LLM Response (Streaming):**\n\n{streamed_response}"))

In [66]:
def ollama_lib_request_wdata(prompt, input_data_df):
    df_json = input_data_df.to_json() # Convert DataFrame to JSON
    prompt_content = f'Given this data {df_json}' + ' ' + prompt # Combines data with prompt

    response_stream = ollama.chat(model = model_selection,
                                  messages = [{'role': 'user',
                                               'content': prompt_content}],
                                  stream = True)

    # Initialize an empty response string
    streamed_response = ""

    for token in response_stream:
        # Append the current tokens to the response
        streamed_response += token['message']['content']

    # Clear previous output and display the updated response
        clear_output(wait = True)
        display(Markdown(f"**LLM Response (Streaming):**\n\n{streamed_response}"))

In [79]:
ollama_lib_request_wdata(input('How can I assist you?'), 
                         input_data_df = df_names)

# Sample interesting prompts:
# Tell me 3 interesting facts about this data.
# What can you infer from this data?
# Who has the most important occupation?
# Which person has the most beneficial occupation to humanity?
# Whcih employees are doctors?
# Who is the youngest employee?
# Calculate the average age of the employees.
# Does it make sense that the youngest employee is a CEO?

**LLM Response (Streaming):**

A subjective question!

To determine the most important film, I'll analyze the text data provided. After processing the data, I can tell you that:

1. **The films have a common theme**: All six films are part of the "Harry Potter" franchise.
2. **Each film has a distinct plot**: While there is some overlap, each film has its own unique storyline, characters, and setting.
3. **The plots are interconnected**: The films build upon each other's storylines, with character arcs that span multiple movies.

Given these points, I'll use a simple approach to determine which film might be considered the most important:

1. **Which film is listed first?** (i.e., has the earliest publication date)
2. **Which film has the longest narrative thread?**

Based on this analysis:

1. The first film in the series, "Harry Potter and the Philosopher's Stone" (published as "Harry Potter and the Sorcerer's Stone" in the United States), is listed first.
2. The most complex and overarching plotline spans across multiple films, particularly "Harry Potter and the Half-Blood Prince".

Considering these factors, I'd argue that **the most important film** might be:

* **"Harry Potter and the Goblet of Fire"**, which marks a significant turning point in the series, introducing the Triwizard Tournament and cementing Voldemort's return to power.

However, this is a subjective interpretation. Other fans might argue for "Harry Potter and the Philosopher's Stone", "Harry Potter and the Half-Blood Prince", or one of the other films, depending on their individual perspectives and priorities.

What do you think? Do any of these conclusions resonate with you, or do you have another film in mind?

In [86]:
ollama_lib_request_wdata(input('How can I assist you?'), 
                         input_data_df = df_movies_filtered)
# Sample interesting prompts:
# Tell me 3 interesting facts about this data.
# Calculate trends on the data.
# What can you infer from this data?
# What is the most important film?

**LLM Response (Streaming):**

After analyzing the provided dataset, I've identified some trends:

1. **Topic distribution**: The dataset is extremely diverse, with topics ranging from action movies (e.g., "Harold & Kumar Go to White Castle") to sci-fi films (e.g., "War Inc.") to superhero movies (e.g., "Beastmaster: Portal of Power"). This suggests that the data may be representative of a wide range of interests or genres.
2. **Genre representation**: Within the diverse topic distribution, some genres are overrepresented, such as:
	* Action/Adventure (21%): Movies like "War Inc.", "Harold & Kumar Go to White Castle", and "Beastmaster: Portal of Power".
	* Comedy (18%): Films like "Super High Me" and "Six String Samurai" (although not all comedy movies are included).
3. **Franchise presence**: Several franchises are represented, including:
	* Godzilla/Kaiju (4%): "Gojira ni sen mireniamu godzilla save tokyo fli saucer transform beast orga".
	* Superman/Martian Manhunter (3%): Not explicitly stated but some movies like "Super High Me" contain characters from the DC Comics universe.
4. **Country of origin**: The majority of movies are from the United States (80%), followed by Japan (8%) and other countries (12%).
5. **Release year distribution**:
	* Most recent release years: 2000-2019 (40%)
	* Oldest release years: 1994-1999 (20%)
6. **Movie length distribution**: Movies range from approximately 80 minutes (e.g., "Super High Me") to over 2 hours (e.g., "War Inc.").
7. **Tone distribution**:
	* Most positive tone: Action/Adventure and Comedy genres tend to have more positive reviews and ratings.
	* Least positive tone: Sci-Fi/Fantasy and Horror movies often receive mixed or negative reviews.

These trends are not exhaustive, but they provide a general idea of the diversity and characteristics within the dataset.

# Concluding Remarks

- The general questions related to providing interesting facts and infering from the data seem to perform very well and do provide pattern analysis results that seem to be accurate.
- When questions start to be specific the model benefits from initial filtering before submitting the data and query.
- The API vs. Library. Although it is not clear that there should be any differences between accessing the Ollama model via the Library vs. the API, the Library client seems to respond with more detail. 

# Other Tips:

Ollama can be used with Pandas and the .apply function to answer questions within a dataset. For exampe, if a dataframe had a question, the following code would create an "Answer" column from ollama_api_request function:

- df['Answer'] = df['Question'].apply(ollama_api_request).110, Day 4: 50, Day 5: 115.'"

Other potential prompts to extract insights when submitting the data in as shown in the example above:
- "Summarize the key points from the data."
- "Summarize the following text using the same format as the submitted data: [insert text]."
- "Identify the main trends in the data."
- "What can you tell me about the sentiment of data."
- "Classify the following text/data into categories."
- "Extract the relevant information from the data."
- "Explain the significance of the data."
- "Find the correlation between the following topics: [insert topcs]."
- "Identify any anomalies in the data."

# NOTEBOOK END