# Llama Locally

## Example 1

This example includes:

- LLM categorization of transactions

- A dashboard built with Plotly and Panel

https://www.youtube.com/watch?v=h_GTxRFYETY

https://ollama.com/

https://github.com/thu-vu92/local-llms-analyse-finance


### Getting Ollama

1. Go to https://ollama.com/

2. Click download

### Installing Ollama

1. Double-click the downloaded Ollama file and follow the instructions

### Get models

1. In a terminal, pull the base model:

    ollama pull llama3

2. Generate your own model (optional). Create a Modelfile using nano:

```nano expense_analyzer```

Enter this information in the file:

    FROM llama3
    PARAMETER temperature 0.8
    SYSTEM You are a financial assistant, you help classify expenses and income from bank transactions

3. Create the model from the file:

```ollama create expense_analyzer_llama3 -f ./expense_analyzer```
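
The Modelfile above has three instructions: `FROM` (base model), `PARAMETER` (sampling settings) and `SYSTEM` (system prompt). As a minimal sketch, the same text can be generated from Python instead of typed into nano. The helper name `build_modelfile` is hypothetical and not part of Ollama.

```python
def build_modelfile(base_model: str, temperature: float, system_prompt: str) -> str:
    """Return Modelfile text with FROM, PARAMETER and SYSTEM instructions."""
    lines = [
        f"FROM {base_model}",
        f"PARAMETER temperature {temperature}",
        f"SYSTEM {system_prompt}",
    ]
    return "\n".join(lines) + "\n"


# Same values as in the Modelfile shown above.
modelfile_text = build_modelfile(
    base_model="llama3",
    temperature=0.8,
    system_prompt=(
        "You are a financial assistant, you help classify expenses "
        "and income from bank transactions"
    ),
)
print(modelfile_text)
```

Saving the printed text as `expense_analyzer` gives the same file that the `ollama create` command expects.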


In [1]:
from langchain_community.llms import Ollama

# Requires a running Ollama server with the llama3 model already pulled
llm = Ollama(model='llama3')

In [2]:
llm.invoke("Can you add an appropriate category next to each of the following expenses. Respond with a list of categories separated by commas. For example, Spotify AB by Adyen - \
Entertainment, Beta Boulders Ams Amsterdam Nld - Sports, etc.: \
ISS Catering Services De Meern, Vishandel Sier AMSTELVEEN, Ministerie van Justitie en Veiligheid, Etos AMSTERDAM NLD, Bistro Bar Amsterdam")

"Here is the list of categories with each expense:\n\nFood (ISS Catering Services De Meern), Food (Vishandel Sier AMSTELVEEN), Government/Politics (Ministerie van Justitie en Veiligheid), Retail (Etos AMSTERDAM NLD), Nightlife (Bistro Bar Amsterdam)\n\nLet me know if you'd like me to add any other categories!"

In [3]:
import pandas as pd

df = pd.read_csv('transactions_2022_2023.csv', low_memory = False )
df.head()

Unnamed: 0,Date,Name / Description,Expense/Income,Amount (EUR)
0,2023-12-30,Belastingdienst,Expense,9.96
1,2023-12-30,Tesco Breda,Expense,17.53
2,2023-12-30,Monthly Appartment Rent,Expense,451.0
3,2023-12-30,Vishandel Sier Amsterdam,Expense,12.46
4,2023-12-29,Selling Paintings,Income,13.63


In [4]:
# Get unique transactions in the Name / Description column
unique_transactions = df["Name / Description"].unique()
len(unique_transactions)

23

In [5]:
unique_transactions[1:10]


array(['Tesco Breda', 'Monthly Appartment Rent',
       'Vishandel Sier Amsterdam', 'Selling Paintings',
       'Spotify Ab By Adyen', 'Tk Maxx Amsterdam Da', 'Consulting',
       'Aidsfonds', 'Tls Bv Inz Ov-Chipkaart'], dtype=object)

In [6]:
# Build the list of batch boundaries, including the final stop value
# https://stackoverflow.com/questions/47518609/for-loop-range-and-interval-how-to-include-last-step
def hop(start, stop, step):
    for i in range(start, stop, step):
        yield i
    yield stop

# Split the index into groups of 30
index_list = list(hop(0, len(unique_transactions), 30))
index_list

[0, 23]
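
With only 23 unique transactions and a batch size of 30, the index list holds a single batch, `[0, 23]`. The sketch below shows the same batching idea on a longer, made-up list. It returns `(start, stop)` pairs directly instead of a flat list of boundaries; `batch_bounds` is a hypothetical helper, not part of the notebook.

```python
def batch_bounds(n_items: int, batch_size: int) -> list[tuple[int, int]]:
    """Return (start, stop) slice bounds that cover n_items in order."""
    return [
        (start, min(start + batch_size, n_items))
        for start in range(0, n_items, batch_size)
    ]


names = [f"transaction {i}" for i in range(65)]  # made-up data
bounds = batch_bounds(len(names), 30)
print(bounds)  # [(0, 30), (30, 60), (60, 65)]

# Each pair can be used directly as a slice.
batches = [names[start:stop] for start, stop in bounds]
print([len(batch) for batch in batches])  # [30, 30, 5]
```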

In [7]:
# Output validation
from pydantic import BaseModel, field_validator
from typing import List

# Validate response format - check that each non-empty line contains a hyphen ("-")
class ResponseChecks(BaseModel):
    data: List[str]

    @field_validator("data")
    @classmethod
    def check(cls, value):
        for item in value:
            if len(item) > 0:
                assert "-" in item, "String does not contain hyphen."
        # A validator must return the value; otherwise the field is set to None
        return value

# Test validation
ResponseChecks(data=['Hello - World', 'Hello - there!'])

ResponseChecks(data=['Hello - World', 'Hello - there!'])
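
The check above accepts any line that contains a hyphen, so a name such as `Tls Bv Inz Ov-Chipkaart` would pass even if the model returned no category for it. A stricter variant, sketched here in plain Python and independent of pydantic, looks for the full `" - "` separator instead. The function name `find_malformed` is hypothetical.

```python
def find_malformed(lines: list[str], separator: str = " - ") -> list[str]:
    """Return the non-empty lines that lack the separator."""
    return [line for line in lines if line.strip() and separator not in line]


good = ["Tesco Breda - Grocery", "Taxi Utrecht - Transportation"]
bad = ["Here are the categories:", "Tls Bv Inz Ov-Chipkaart", ""]

print(find_malformed(good))  # []
print(find_malformed(bad))   # ['Here are the categories:', 'Tls Bv Inz Ov-Chipkaart']
```

An empty result means every line can be split into a name and a category.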

In [8]:
def categorize_transactions(transaction_names, llm):
    response = llm.invoke("Can you add an appropriate category to the following expenses. For example: Spotify AB by Adyen - Entertainment, Beta Boulders Ams Amsterdam Nld - Sport, etc.. Categories should be less than 4 words. " + transaction_names)
    response = response.split('\n')

    # Keep only the lines between blank lines (removing the explanation lines at the beginning and end of the response)
    blank_indexes = [index for index in range(len(response)) if response[index] == '']
    if len(blank_indexes) == 1:
        response = response[(blank_indexes[0] + 1):]
    elif len(blank_indexes) > 1:
        response = response[(blank_indexes[0] + 1) : blank_indexes[1]]
    # With no blank lines, the whole response is kept and validated below

    # Print the response and validate that it is in the correct format
    print(response)
    # Note: this removes '* ' anywhere in a line, not only a leading bullet
    response = [s.replace('* ', '') for s in response]
    ResponseChecks(data = response)

    # Put in dataframe; the column is split into 'Transaction' and 'Category' later
    categories_df = pd.DataFrame({'Transaction vs category': response})
    return categories_df

In [9]:
# Test out the function
r = categorize_transactions('ISS Catering Services De Meern, Vishandel Sier AMSTELVEEN, Etos AMSTERDAM NLD, Bistro Bar Amsterdam',
                        llm)
r

['* ISS Catering Services De Meern - Food', '* Vishandel Sier AMSTELVEEN - Shopping', '* Etos AMSTERDAM NLD - Health', '* Bistro Bar Amsterdam - Entertainment']


Unnamed: 0,Transaction vs category
0,ISS Catering Services De Meern - Food
1,Vishandel Sier AMSTELVEEN - Shopping
2,Etos AMSTERDAM NLD - Health
3,Bistro Bar Amsterdam - Entertainment


In [10]:
r[['Transaction', 'Category']] = r['Transaction vs category'].str.split(' - ', expand=True)
r

Unnamed: 0,Transaction vs category,Transaction,Category
0,ISS Catering Services De Meern - Food,ISS Catering Services De Meern,Food
1,Vishandel Sier AMSTELVEEN - Shopping,Vishandel Sier AMSTELVEEN,Shopping
2,Etos AMSTERDAM NLD - Health,Etos AMSTERDAM NLD,Health
3,Bistro Bar Amsterdam - Entertainment,Bistro Bar Amsterdam,Entertainment
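
The model sometimes prefixes lines with bullets (`* `) and sometimes with numbering (`1. `), and a name can itself contain `*` or a hyphen. The following sketch parses a single line in plain Python: it strips only a leading list marker and splits on the first `" - "`. The helper `parse_line` and its regular expression are illustrative assumptions, not the notebook's method.

```python
import re

# Leading "* ", "- " or "12. " markers that the model may put before a line.
_PREFIX = re.compile(r"^\s*(?:[*-]\s+|\d+\.\s+)")


def parse_line(line: str) -> tuple[str, str]:
    """Split 'Name - Category' into (name, category), dropping list markers."""
    cleaned = _PREFIX.sub("", line)
    # partition splits on the first separator only, so the category may contain " - "
    name, sep, category = cleaned.partition(" - ")
    if not sep:
        raise ValueError(f"No ' - ' separator in: {line!r}")
    return name.strip(), category.strip()


print(parse_line("* Etos AMSTERDAM NLD - Health"))
# ('Etos AMSTERDAM NLD', 'Health')
print(parse_line("22. Classpass* Monthly - Fitness/Sport"))
# ('Classpass* Monthly', 'Fitness/Sport')
```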


In [11]:
# Initialise the categories_df_all dataframe
categories_df_all = pd.DataFrame()
max_tries = 7

# Loop through the index_list
for i in range(0, len(index_list)-1):
    transaction_names = unique_transactions[index_list[i]:index_list[i+1]]
    transaction_names = ','.join(transaction_names)

    # Try and validate output; if it fails, try again up to max_tries=7 times
    for j in range(1, max_tries + 1):
        try:
            categories_df = categorize_transactions(transaction_names, llm)
            categories_df_all = pd.concat([categories_df_all, categories_df], ignore_index=True)

        except Exception:
            if j < max_tries:
                continue
            else:
                raise Exception(f"Cannot categorise transactions indexes {index_list[i]} to {index_list[i+1]}.")
        break

['1. Belastingdienst - Taxes', '2. Tesco Breda - Grocery', '3. Monthly Apartment Rent - Housing', '4. Vishandel Sier Amsterdam - Fish/Meat (assuming a fish market)', '5. Selling Paintings - Business/Income', '6. Spotify Ab By Adyen - Entertainment', '7. Tk Maxx Amsterdam Da - Retail', '8. Consulting - Work', '9. Aidsfonds - Charity', '10. Tls Bv Inz Ov-Chipkaart - Transportation (assuming public transportation)', '11. Etos Amsterdam - Health/Wellness', '12. Beta Boulders Ams Amsterdam - Sport', '13. Salary - Income', '14. Bouldermuur Bv Amsterdam - Business/Investment', '15. Birat Restaurant Amsterdam - Food/Dining', '16. Freelancing - Work', '17. Tikkie - Personal/Bills (assuming a personal bill management service)', '18. Blogging - Business/Income', '19. Taxi Utrecht - Transportation', '20. Apple Services - Technology', '21. Amazon Lux - Retail/E-commerce', '22. Classpass* Monthly - Fitness/Sport', '23. Audible Uk AdblCo/Pymt Gbr - Entertainment (assuming audiobooks)']
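
Because LLM output varies between calls, each batch is retried when validation fails. The retry pattern can be sketched on its own, with a made-up function standing in for the LLM call. `call_with_retries` and `flaky_categorize` are hypothetical names used only for this illustration.

```python
def call_with_retries(func, max_tries: int):
    """Call func() until it succeeds, at most max_tries times."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except Exception as error:  # in real use, catch narrower exceptions
            last_error = error
            print(f"Attempt {attempt} failed: {error}")
    raise RuntimeError(f"Gave up after {max_tries} attempts") from last_error


# Hypothetical stand-in for an LLM call whose output fails validation twice.
calls = {"count": 0}


def flaky_categorize():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("String does not contain hyphen.")
    return ["Tesco Breda - Grocery"]


result = call_with_retries(flaky_categorize, max_tries=7)
print(result, "after", calls["count"], "calls")
```

Returning from inside the `try` avoids the separate `break`, and the final `raise` makes sure a batch that never validates is not skipped silently.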


In [12]:
categories_df.head()


Unnamed: 0,Transaction vs category
0,1. Belastingdienst - Taxes
1,2. Tesco Breda - Grocery
2,3. Monthly Apartment Rent - Housing
3,4. Vishandel Sier Amsterdam - Fish/Meat (assum...
4,5. Selling Paintings - Business/Income


In [13]:
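# n=1 splits on the first ' - ' only, so a category that itself contains ' - ' still yields two columns
categories_df_all[['Transaction', 'Category']] = categories_df_all['Transaction vs category'].str.split(' - ', n=1, expand=True)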
categories_df_all.head()

Unnamed: 0,Transaction vs category,Transaction,Category
0,1. Belastingdienst - Taxes,1. Belastingdienst,Taxes
1,2. Tesco Breda - Grocery,2. Tesco Breda,Grocery
2,3. Monthly Apartment Rent - Housing,3. Monthly Apartment Rent,Housing
3,4. Vishandel Sier Amsterdam - Fish/Meat (assum...,4. Vishandel Sier Amsterdam,Fish/Meat (assuming a fish market)
4,5. Selling Paintings - Business/Income,5. Selling Paintings,Business/Income


In [14]:
# Get unique categories in categories_df_all
unique_categories = categories_df_all["Category"].unique()
unique_categories

array(['Taxes', 'Grocery', 'Housing',
       'Fish/Meat (assuming a fish market)', 'Business/Income',
       'Entertainment', 'Retail', 'Work', 'Charity',
       'Transportation (assuming public transportation)',
       'Health/Wellness', 'Sport', 'Income', 'Business/Investment',
       'Food/Dining',
       'Personal/Bills (assuming a personal bill management service)',
       'Transportation', 'Technology', 'Retail/E-commerce',
       'Fitness/Sport', 'Entertainment (assuming audiobooks)'],
      dtype=object)

In [15]:
# Drop NA values
categories_df_all = categories_df_all.dropna()

# If category contains "Food", then categorise as "Food and Drinks"
categories_df_all.loc[categories_df_all['Category'].str.contains("Food"), 'Category'] = "Food and Drinks"
# If category contains "Clothing", then categorise as "Clothing"
categories_df_all.loc[categories_df_all['Category'].str.contains("Clothing"), 'Category'] = "Clothing"
# If category contains "Services", then categorise as "Services"
categories_df_all.loc[categories_df_all['Category'].str.contains("Services"), 'Category'] = "Services"
# If category contains "Health" or "Wellness", then categorise as "Health and Wellness"
categories_df_all.loc[categories_df_all['Category'].str.contains("Health|Wellness"), 'Category'] = "Health and Wellness"
# If category contains "Sport", then categorise as "Sport and Fitness"
categories_df_all.loc[categories_df_all['Category'].str.contains("Sport"), 'Category'] = "Sport and Fitness"
# If category contains "Travel", then categorise as "Travel"
categories_df_all.loc[categories_df_all['Category'].str.contains("Travel"), 'Category'] = "Travel"
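
The keyword rules above can also be written as a small lookup table, which makes them easier to extend. The sketch below uses plain Python rather than pandas. Stripping trailing notes such as `(assuming a fish market)` is an added assumption that the notebook does not do, and `normalize_category` is a hypothetical helper.

```python
import re

# (keyword pattern, normalized label); the first matching rule wins.
RULES = [
    (r"Food", "Food and Drinks"),
    (r"Clothing", "Clothing"),
    (r"Services", "Services"),
    (r"Health|Wellness", "Health and Wellness"),
    (r"Sport", "Sport and Fitness"),
    (r"Travel", "Travel"),
]


def normalize_category(category: str) -> str:
    """Map a raw LLM category to a normalized label."""
    # Assumption: drop trailing notes such as "(assuming a fish market)".
    cleaned = re.sub(r"\s*\(.*\)\s*$", "", category).strip()
    for pattern, label in RULES:
        if re.search(pattern, cleaned):
            return label
    # No rule matched: keep the cleaned category as it is.
    return cleaned


for raw in ["Food/Dining", "Fitness/Sport", "Transportation (assuming public transportation)"]:
    print(raw, "->", normalize_category(raw))
```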

In [16]:
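# Remove the numbering, e.g. "1. ", from the Transaction column.
# Note: pandas 2.x defaults to regex=False in str.replace, so without regex=True this
# pattern is treated as a literal string and nothing is removed (see the output below).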
categories_df_all['Transaction'] = categories_df_all['Transaction'].str.replace(r'\d+\.\s+', '')
categories_df_all

Unnamed: 0,Transaction vs category,Transaction,Category
0,1. Belastingdienst - Taxes,1. Belastingdienst,Taxes
1,2. Tesco Breda - Grocery,2. Tesco Breda,Grocery
2,3. Monthly Apartment Rent - Housing,3. Monthly Apartment Rent,Housing
3,4. Vishandel Sier Amsterdam - Fish/Meat (assum...,4. Vishandel Sier Amsterdam,Fish/Meat (assuming a fish market)
4,5. Selling Paintings - Business/Income,5. Selling Paintings,Business/Income
5,6. Spotify Ab By Adyen - Entertainment,6. Spotify Ab By Adyen,Entertainment
6,7. Tk Maxx Amsterdam Da - Retail,7. Tk Maxx Amsterdam Da,Retail
7,8. Consulting - Work,8. Consulting,Work
8,9. Aidsfonds - Charity,9. Aidsfonds,Charity
9,10. Tls Bv Inz Ov-Chipkaart - Transportation (...,10. Tls Bv Inz Ov-Chipkaart,Transportation (assuming public transportation)


In [17]:
c = categories_df_all.copy()
# Strip the leading numbering ("1. ", "10. ", ...) and surrounding whitespace
c['Transaction'] = c['Transaction'].str.replace(r'^\d+\.\s*', '', regex=True).str.strip()
# Keep only the columns needed for the merge
c = c[['Category', 'Transaction']]
c.Transaction.unique()


array(['Belastingdienst', 'Tesco Breda', 'Monthly Apartment Rent',
       'Vishandel Sier Amsterdam', 'Selling Paintings',
       'Spotify Ab By Adyen', 'Tk Maxx Amsterdam Da', 'Consulting',
       'Aidsfonds', 'Tls Bv Inz Ov-Chipkaart', 'Etos Amsterdam',
       'Beta Boulders Ams Amsterdam', 'Salary',
       'Bouldermuur Bv Amsterdam', 'Birat Restaurant Amsterdam',
       'Freelancing', 'Tikkie', 'Blogging', 'Taxi Utrecht',
       'Apple Services', 'Amazon Lux', 'ClasspassMonthly',
       'Audible Uk AdblCo/Pymt Gbr'], dtype=object)

In [18]:
# Merge the categories_df_all with the transactions_2022_2023.csv dataframe (df)
df = pd.read_csv("transactions_2022_2023.csv")
df.loc[df['Name / Description'].str.contains("Spotify"), 'Name / Description'] = "Spotify Ab By Adyen"
df = df.merge(c, left_on='Name / Description', right_on='Transaction', how='left')
df

Unnamed: 0,Date,Name / Description,Expense/Income,Amount (EUR),Category,Transaction
0,2023-12-30,Belastingdienst,Expense,9.96,Taxes,Belastingdienst
1,2023-12-30,Tesco Breda,Expense,17.53,Grocery,Tesco Breda
2,2023-12-30,Monthly Appartment Rent,Expense,451.0,,
3,2023-12-30,Vishandel Sier Amsterdam,Expense,12.46,Fish/Meat (assuming a fish market),Vishandel Sier Amsterdam
4,2023-12-29,Selling Paintings,Income,13.63,Business/Income,Selling Paintings
5,2023-12-29,Spotify Ab By Adyen,Expense,12.19,Entertainment,Spotify Ab By Adyen
6,2023-12-23,Tk Maxx Amsterdam Da,Expense,27.08,Retail,Tk Maxx Amsterdam Da
7,2023-12-22,Consulting,Income,541.57,Work,Consulting
8,2023-12-22,Aidsfonds,Expense,10.7,Charity,Aidsfonds
9,2023-12-20,Consulting,Income,2641.93,Work,Consulting
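
Row 2 in the output above has no category: the CSV contains `Monthly Appartment Rent`, but the model returned the corrected spelling `Monthly Apartment Rent`, so the exact-match merge finds nothing. One possible workaround, sketched here with the standard library's `difflib`, is to look up the closest LLM name before merging. The `closest_name` helper and the 0.9 cutoff are assumptions and would need tuning on real data.

```python
from difflib import get_close_matches


def closest_name(name, candidates, cutoff=0.9):
    """Return the most similar candidate, or None if nothing is close enough."""
    matches = get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Names as returned by the LLM (a subset of the list above).
llm_names = ["Belastingdienst", "Tesco Breda", "Monthly Apartment Rent"]

print(closest_name("Monthly Appartment Rent", llm_names))  # Monthly Apartment Rent
print(closest_name("Taxi Utrecht", llm_names))             # None
```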


In [19]:
df.to_csv("transactions_2022_2023_categorized.csv", index=False)

In [20]:
import pandas as pd
import numpy as np
import plotly.express as px
import panel as pn

In [21]:
# Read transactions_2022_2023_categorized.csv
df = pd.read_csv('transactions_2022_2023_categorized.csv')
# Add year and month columns
df['Year'] = pd.to_datetime(df['Date']).dt.year
df['Month'] = pd.to_datetime(df['Date']).dt.month
df['Month Name'] = pd.to_datetime(df['Date']).dt.strftime("%b")
# Remove the "Transaction" column
df = df.drop(columns=['Transaction'])
df

Unnamed: 0,Date,Name / Description,Expense/Income,Amount (EUR),Category,Year,Month,Month Name
0,2023-12-30,Belastingdienst,Expense,9.96,Taxes,2023,12,Dec
1,2023-12-30,Tesco Breda,Expense,17.53,Grocery,2023,12,Dec
2,2023-12-30,Monthly Appartment Rent,Expense,451.0,,2023,12,Dec
3,2023-12-30,Vishandel Sier Amsterdam,Expense,12.46,Fish/Meat (assuming a fish market),2023,12,Dec
4,2023-12-29,Selling Paintings,Income,13.63,Business/Income,2023,12,Dec
5,2023-12-29,Spotify Ab By Adyen,Expense,12.19,Entertainment,2023,12,Dec
6,2023-12-23,Tk Maxx Amsterdam Da,Expense,27.08,Retail,2023,12,Dec
7,2023-12-22,Consulting,Income,541.57,Work,2023,12,Dec
8,2023-12-22,Aidsfonds,Expense,10.7,Charity,2023,12,Dec
9,2023-12-20,Consulting,Income,2641.93,Work,2023,12,Dec


In [22]:

# For Income rows, assign Name / Description to Category
df['Category'] = np.where(df['Expense/Income'] == 'Income', df['Name / Description'], df['Category'])

In [23]:
def make_pie_chart(df, year, label):
    # Filter the dataset for the given label ('Expense' or 'Income') and year
    sub_df = df[(df['Expense/Income'] == label) & (df['Year'] == year)]

    color_scale = px.colors.qualitative.Set2
    
    pie_fig = px.pie(sub_df, values='Amount (EUR)', names='Category', color_discrete_sequence = color_scale)
    pie_fig.update_traces(textposition='inside', direction ='clockwise', hole=0.3, textinfo="label+percent")

    total_expense = df[(df['Expense/Income'] == 'Expense') & (df['Year'] == year)]['Amount (EUR)'].sum() 
    total_income = df[(df['Expense/Income'] == 'Income') & (df['Year'] == year)]['Amount (EUR)'].sum()
    
    if label == 'Expense':
        total_text = "€ " + str(round(total_expense))

        # Saving rate:
        saving_rate = round((total_income - total_expense)/total_income*100)
        saving_rate_text = ": Saving rate " + str(saving_rate) + "%"
    else:
        saving_rate_text = ""
        total_text = "€ " + str(round(total_income))

    pie_fig.update_layout(uniformtext_minsize=10, 
                        uniformtext_mode='hide',
                        title=dict(text=label+" Breakdown " + str(year) + saving_rate_text),
                        # Add annotations in the center of the donut.
                        annotations=[
                            dict(
                                text=total_text, 
                                # Square unit grid starting at bottom left of page
                                x=0.5, y=0.5, font_size=12,
                                # Hide the arrow that points to the [x,y] coordinate
                                showarrow=False
                            )
                        ]
                    )
    return pie_fig
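
The saving rate shown in the chart title is (income - expense) / income × 100, rounded to a whole percent. A standalone sketch of that calculation follows; the guard for zero income is an addition that the function above does not have, and the numbers are made up.

```python
def saving_rate(total_income: float, total_expense: float):
    """Return the saving rate in whole percent, or None without income."""
    if total_income == 0:
        return None  # avoid dividing by zero for a year without income
    return round((total_income - total_expense) / total_income * 100)


print(saving_rate(3000.0, 2250.0))  # 25
print(saving_rate(0.0, 120.0))      # None
```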

In [24]:
income_pie_fig_2022 = make_pie_chart(df, 2022, 'Income')
income_pie_fig_2022

In [25]:
def make_monthly_bar_chart(df, year, label):
    df = df[(df['Expense/Income'] == label) & (df['Year'] == year)]
    total_by_month = (df.groupby(['Month', 'Month Name'])['Amount (EUR)'].sum()
                        .to_frame()
                        .reset_index()
                        .sort_values(by='Month')  
                        .reset_index(drop=True))
    if label == "Income":
        color_scale = px.colors.sequential.YlGn
    else:
        # Any other label (i.e. "Expense") uses the orange-red scale
        color_scale = px.colors.sequential.OrRd
    
    bar_fig = px.bar(total_by_month, x='Month Name', y='Amount (EUR)', text_auto='.2s', title=label+" per month", color='Amount (EUR)', color_continuous_scale=color_scale)
    # bar_fig.update_traces(marker_color='lightslategrey')
    
    return bar_fig
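
Grouping by both `Month` and `Month Name` keeps the bars in calendar order. Sorting by the month name alone would put `Apr` before `Jan`. The plain-Python sketch below illustrates the same idea on made-up rows; `MONTH_ABBR` is a hard-coded list so that the result does not depend on the locale.

```python
from collections import defaultdict
from datetime import date

MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Made-up rows: (ISO date, amount in EUR).
rows = [("2022-03-14", 20.0), ("2022-01-05", 10.5), ("2022-03-02", 4.5)]

totals = defaultdict(float)
for iso_date, amount in rows:
    month = date.fromisoformat(iso_date).month
    # Key on (month number, month abbreviation), like the groupby above.
    totals[(month, MONTH_ABBR[month - 1])] += amount

# Sorting the keys orders by month number, not alphabetically by name.
monthly = [(name, total) for (_, name), total in sorted(totals.items())]
print(monthly)  # [('Jan', 10.5), ('Mar', 24.5)]
```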

In [26]:
income_monthly_2022 = make_monthly_bar_chart(df, 2022, 'Income')
income_monthly_2022

In [27]:
# Pie charts
income_pie_fig_2022 = make_pie_chart(df, 2022, 'Income')
expense_pie_fig_2022 = make_pie_chart(df, 2022, 'Expense')  
income_pie_fig_2023 = make_pie_chart(df, 2023, 'Income')
expense_pie_fig_2023 = make_pie_chart(df, 2023, 'Expense')

# Bar charts
income_monthly_2022 = make_monthly_bar_chart(df, 2022, 'Income')
expense_monthly_2022 = make_monthly_bar_chart(df, 2022, 'Expense')
income_monthly_2023 = make_monthly_bar_chart(df, 2023, 'Income')
expense_monthly_2023 = make_monthly_bar_chart(df, 2023, 'Expense')

# Create tabs
tabs = pn.Tabs(
                        ('2022', pn.Column(pn.Row(income_pie_fig_2022, expense_pie_fig_2022),
                                                pn.Row(income_monthly_2022, expense_monthly_2022))),
                        ('2023', pn.Column(pn.Row(income_pie_fig_2023, expense_pie_fig_2023),
                                                pn.Row(income_monthly_2023, expense_monthly_2023))
                        )
                )
tabs.show()

Launching server at http://localhost:51578


<panel.io.server.Server at 0x12ddafa70>

In [28]:
# Dashboard template
template = pn.template.FastListTemplate(
    title='Personal Finance Dashboard',
    sidebar=[pn.pane.Markdown("# Income Expense analysis"), 
             pn.pane.Markdown("Overview of income and expense based on my bank transactions. Categories are obtained using local LLMs."),
             pn.pane.PNG("picture.png", sizing_mode="scale_both")
             ],
    main=[pn.Row(pn.Column(pn.Row(tabs)
                           )
                ),
                ],
    # accent_base_color="#88d8b0",
    header_background="#c0b9dd",
)

template.show()

Launching server at http://localhost:51581


<panel.io.server.Server at 0x12df733b0>

## Example 2

https://www.youtube.com/watch?v=-ROS6gfYIts

https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb

