# Personal Finance Analysis & Prediction Project

### Author: Cooper Braun

### Class: CPSC-222 Spring 2025

## Introduction

This project analyzes my personal financial data in relation to my academic schedule to identify spending patterns and make predictions about my spending behavior.

## Data Preparation

First, let's import the libraries and helper functions we need:

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
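# Import only the helpers used below instead of a wildcard import
from utils import load_data, clean_bank_data, clean_academic_data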

### Gathering My Data

I collected 7 months of banking transactions (September 2024 - April 2025) and my academic calendar for the same period. Let's load and examine this data:

In [2]:
# Load raw data
bank_data, academic_data = load_data("data/raw/bank_data.csv", "data/raw/academic_calendar.csv")

print(f"I have {bank_data.shape[0]} banking transactions to analyze")
print(f"My academic calendar spans {academic_data.shape[0]} days")

# Preview a few transactions to show what the raw data looks like
print("Here is a peek at a couple of my recent transactions")
display(bank_data.head(4))

I have 205 banking transactions to analyze
My academic calendar spans 152 days
Here is a peek at a couple of my recent transactions


| Unnamed: 0 | Date | Description | Type | Amount | Current balance | Status |
|---|---|---|---|---|---|---|
| 0 | 2025-04-01 | Mobile Check Deposit | Check Deposit | 2000.0 | 2617.77 | Posted |
| 1 | 2025-03-31 | Interest earned | Interest Earned | 0.25 | 617.77 | Posted |
| 2 | 2025-03-28 | Zelle® Payment to Eme | Direct Payment | -27.0 | 617.52 | Posted |
| 3 | 2025-03-25 | DAVES HOT CHICKEN 1307 | Debit Card | -29.41 | 644.52 | Posted |
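
`load_data` comes from `utils` and is not shown in this notebook. Here is a minimal sketch of what such a helper could look like, assuming it simply reads both CSV files with pandas. The name `load_data_sketch`, the in-memory sample files, and the `date` column of the calendar are hypothetical, not the project's actual code or data layout.

```python
import io

import pandas as pd


def load_data_sketch(bank_source, academic_source):
    """Hypothetical stand-in for utils.load_data: read both CSVs into DataFrames."""
    bank = pd.read_csv(bank_source)
    academic = pd.read_csv(academic_source)
    return bank, academic


# Tiny in-memory files so the sketch runs without the real CSVs.
bank_csv = io.StringIO(
    "Date,Description,Type,Amount\n"
    "2025-04-01,Mobile Check Deposit,Check Deposit,2000.0\n"
    "2025-03-31,Interest earned,Interest Earned,0.25\n"
)
# Assumed calendar layout: one row per day with a period label.
academic_csv = io.StringIO("date,period_type\n2025-04-01,Class Period\n")

sample_bank, sample_academic = load_data_sketch(bank_csv, academic_csv)
print(sample_bank.shape, sample_academic.shape)
```

An `Unnamed: 0` column like the one in the preview typically appears when a CSV was saved together with its index. Passing `index_col=0` to `pd.read_csv` would read that column as the index instead.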


### Making Sense of My Data

The raw banking data is hard to read, and some of the transactions are difficult to interpret on their own. To add some context, I need to:

1. Convert dates to a format I can work with
2. Categorize my transactions (Food, Transportation, etc.)
3. Group my spending into size bins (Small, Medium, Large, etc.)
4. Add academic context to each transaction
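
The first two steps can be sketched roughly as follows. This is only an illustration: the real rules live in `clean_bank_data` in `utils`, and the keyword map, the `categorize` helper, and the `"Credit"` label are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical keyword -> category map; the real rules are in utils.clean_bank_data.
CATEGORY_KEYWORDS = {
    "Dining": ["HOT CHICKEN"],
    "Banking & Investments": ["INTEREST", "CHECK DEPOSIT"],
}


def categorize(description):
    """Return the first category whose keyword appears in the description."""
    text = description.upper()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


sample = pd.DataFrame({
    "Date": ["2025-04-01", "2025-03-31", "2025-03-25"],
    "Description": ["Mobile Check Deposit", "Interest earned", "DAVES HOT CHICKEN 1307"],
    "Amount": [2000.0, 0.25, -29.41],
})

# Step 1: parse the date strings into real datetime values.
sample["Date"] = pd.to_datetime(sample["Date"])

# Step 2: assign a category from the description text.
sample["Category"] = sample["Description"].apply(categorize)

# Negative amounts are money leaving the account ("Credit" is an assumed label).
sample["Transaction_Type"] = sample["Amount"].apply(
    lambda amount: "Debit" if amount < 0 else "Credit"
)
print(sample)
```

Keyword matching is simple to read and extend, but a description that matches no keyword falls through to `"Other"`, so the map needs to grow as new merchants show up.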

**Cleaning Banking Data**

In [3]:
# Clean the banking data
bank_data = clean_bank_data(bank_data)

# See how my transactions are categorized
category_counts = bank_data["Category"].value_counts()
print("How my transactions are distributed across categories:")
display(category_counts)

# Check out my spending bins
bin_counts = bank_data.loc[bank_data["Transaction_Type"] == "Debit", "Spending_Bin"].value_counts()
print("My spending patterns by size:")
display(bin_counts)

How my transactions are distributed across categories:


Category
Dining                   83
Banking & Investments    44
Groceries                31
Other                    23
Transportation            9
Rent                      5
Utilities                 4
Subscriptions             3
Parking                   3
Name: count, dtype: int64

My spending patterns by size:


Spending_Bin
Medium ($10-$50)               95
Small (Less than $10)          50
Large ($50-$100)               20
Very Large (More than $100)    13
Name: count, dtype: int64
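
Size bins like these can be produced with `pd.cut`. The sketch below reuses the bin labels from the output above, but the sample amounts are made up, and the edge handling (for example, which bin exactly $10 falls into) is an assumption rather than what `clean_bank_data` necessarily does.

```python
import pandas as pd

# Made-up debit amounts; debits are negative in the bank export.
amounts = pd.Series([-4.50, -29.41, -75.00, -250.00])

# Work with the size of each purchase, not its sign.
spend = amounts.abs()

bins = [0, 10, 50, 100, float("inf")]
labels = [
    "Small (Less than $10)",
    "Medium ($10-$50)",
    "Large ($50-$100)",
    "Very Large (More than $100)",
]

# right=False makes each bin include its lower edge: [0, 10), [10, 50), ...
spending_bin = pd.cut(spend, bins=bins, labels=labels, right=False)
print(spending_bin)
```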

**Cleaning Academic Data**

In [4]:
# Clean academic data
academic_data = clean_academic_data(academic_data)

# See how most of my sophomore year breaks down
period_counts = academic_data["period_type"].value_counts()
print("How my 2024-2025 school year was structured:")
display(period_counts)

How my 2024-2025 school year was structured:


period_type
Class Period         52
Break                39
Weekend              34
Assessment Period    27
Name: count, dtype: int64
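
With both tables cleaned, adding academic context to each transaction amounts to a join on the date. Here is a minimal sketch, assuming the calendar has one row per day with a `date` column next to `period_type`. The `date` column name, the period labels assigned to these dates, and the `"Unknown"` fallback are illustrative assumptions.

```python
import pandas as pd

transactions = pd.DataFrame({
    "Date": pd.to_datetime(["2025-03-25", "2025-03-28", "2025-04-01"]),
    "Amount": [-29.41, -27.0, 2000.0],
})

# Made-up calendar rows for illustration; note that 2025-04-01 is missing.
calendar = pd.DataFrame({
    "date": pd.to_datetime(["2025-03-25", "2025-03-28"]),
    "period_type": ["Class Period", "Assessment Period"],
})

# A left join keeps every transaction, even when the calendar has no matching day.
merged = transactions.merge(
    calendar, how="left", left_on="Date", right_on="date"
).drop(columns="date")

# Label transactions that fall outside the calendar instead of leaving NaN.
merged["period_type"] = merged["period_type"].fillna("Unknown")
print(merged)
```

The calendar above covers 152 days, which is shorter than the September to April span of the bank data, so some transactions may end up without a matching period. A left join keeps those rows visible rather than silently dropping them.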

## Exploratory Data Analysis

## Classification Results

## Conclusion

Sources:

https://www.programiz.com/python-programming/datetime/strftime