<a href="https://colab.research.google.com/github/callaghan210-coder/FinaceAI/blob/main/FinHealthRecom.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Key Components:


1. **User Profiles**: Store savings, debts, and demographic info.
2. **Transactions**: Store income and expense data.
3. **Health Score Calculation**: A function that uses income, expenses, savings, and debt to determine financial health.
4. **Real-Time Updates**: Recalculate and update the health score with every new transaction.

# 1. Loading the Dataset
We will start by loading the CSV files containing user profiles, transactions, and goals.

In [17]:
import pandas as pd

# Load the datasets and skip bad lines
df_users = pd.read_csv('user_profiles.csv', on_bad_lines='skip')
df_transactions = pd.read_csv('transactions.csv', on_bad_lines='skip')
df_goals = pd.read_csv('user_goals.csv', on_bad_lines='skip')

# Check if the datasets are loaded correctly
print("User Profiles Shape:", df_users.shape)
print("Transactions Shape:", df_transactions.shape)
print("Goals Shape:", df_goals.shape)

# Optionally preview the data
print(df_users.head())
print(df_transactions.head())
print(df_goals.head())


User Profiles Shape: (159010, 6)
Transactions Shape: (27355, 8)
Goals Shape: (100000, 5)
                                User ID  Occupation Age Saving Amount  \
0  0318c2de-7376-4cd9-a6f6-5c199948682d     Teacher  60       4498.78   
1  8edd7118-1865-4aaf-8254-22dd0b32fd46  Freelancer  65       2353.73   
2  8c44ee71-1e23-4785-ada7-edd158560935  Freelancer  38         813.4   
3  d1ba6a8b-d21a-4e67-8761-31a1b07d163b     Teacher  40       3717.56   
4  dd67882c-2625-4298-abec-e1f04300af25  Freelancer  60       2372.86   

  Total Debts     Debt Type  
0     17838.3   Credit Card  
1     8954.34  Student Loan  
2     5687.75  Student Loan  
3    16495.68      Mortgage  
4     9260.36  Student Loan  
                                User ID                        Date  \
0  0318c2de-7376-4cd9-a6f6-5c199948682d  2024-04-29 14:14:15.976035   
1  8edd7118-1865-4aaf-8254-22dd0b32fd46  2024-04-20 03:13:23.

1. This section loads three datasets: user profiles, transactions, and user goals from CSV files.
2. Malformed lines are skipped during loading (`on_bad_lines='skip'`) so that parsing does not fail; the skipped rows are dropped silently.
3. The shapes of each DataFrame are printed to confirm that the datasets have been loaded correctly.
4. The first few rows of each dataset are printed for a quick overview of the data.
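
Because skipped rows disappear without a message, it can help to know how many were lost. The following is a minimal sketch of one way to count them, assuming pandas 1.4 or later, where `on_bad_lines` also accepts a callable when `engine='python'`. The helper `read_csv_counting_bad_lines` and the in-memory sample are hypothetical and not part of the notebook.

```python
import io
import pandas as pd

def read_csv_counting_bad_lines(source):
    """Read a CSV, skipping rows with too many fields and counting them."""
    bad_lines = []

    def remember_and_skip(fields):
        # pandas passes the split fields of a row that has too many columns;
        # returning None tells the parser to drop that row.
        bad_lines.append(fields)
        return None

    # A callable for on_bad_lines is only supported by the Python engine.
    df = pd.read_csv(source, on_bad_lines=remember_and_skip, engine='python')
    return df, len(bad_lines)

# Hypothetical sample: the third data row has one field too many.
sample = io.StringIO(
    "User ID,Occupation,Age\n"
    "u1,Teacher,60\n"
    "u2,Freelancer,65\n"
    "u3,Teacher,40,unexpected\n"
)
df_sample, skipped = read_csv_counting_bad_lines(sample)
print(df_sample.shape, skipped)
```

The Python engine is slower than the default C engine, so on files of this size the count is probably best done once as a diagnostic rather than on every load.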

## Check the data types

In [33]:
# Check data types of each column in the DataFrames
print("Data Types in User Profiles:")
print(df_users.dtypes)

print("\nData Types in Transactions:")
print(df_transactions.dtypes)

print("\nData Types in Goals:")
print(df_goals.dtypes)

Data Types in User Profiles:
User ID                   object
Occupation                object
Age                       object
Saving Amount             object
Total Debts               object
Debt Type                 object
Financial Health Score    object
Reasons                   object
Recommendations           object
dtype: object

Data Types in Transactions:
User ID             object
Date                object
Mode of Payment     object
Category            object
Subcategory         object
Amount             float64
Income/Expense      object
Account Balance    float64
dtype: object

Data Types in Goals:
User ID             object
Goal                object
Goal Amount        float64
Period (Months)      int64
Importance          object
dtype: object


## Convert Columns to Numeric Types

In [34]:
# Convert relevant columns in df_users to numeric
df_users['Saving Amount'] = pd.to_numeric(df_users['Saving Amount'], errors='coerce')
df_users['Total Debts'] = pd.to_numeric(df_users['Total Debts'], errors='coerce')
df_users['Financial Health Score'] = pd.to_numeric(df_users['Financial Health Score'], errors='coerce')

# Check the updated data types
print("Updated Data Types in User Profiles:")
print(df_users.dtypes)


Updated Data Types in User Profiles:
User ID                    object
Occupation                 object
Age                        object
Saving Amount             float64
Total Debts               float64
Debt Type                  object
Financial Health Score    float64
Reasons                    object
Recommendations            object
dtype: object
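
With `errors='coerce'`, any entry that cannot be parsed as a number becomes `NaN`, so it is worth checking how many values were lost in the conversion. The sketch below shows one way to do that on a small hypothetical DataFrame (`df_demo` and its values are invented for illustration; only the column names come from the notebook).

```python
import pandas as pd

# Hypothetical sample mimicking string-typed columns after a messy CSV load.
df_demo = pd.DataFrame({
    'Saving Amount': ['4498.78', '813.4', 'Teacher'],
    'Total Debts': ['17838.3', 'Mortgage', '5687.75'],
})

numeric_cols = ['Saving Amount', 'Total Debts']
missing_before = df_demo[numeric_cols].isna().sum()

# errors='coerce' turns unparseable entries into NaN instead of raising.
df_demo[numeric_cols] = df_demo[numeric_cols].apply(pd.to_numeric, errors='coerce')

# Values that were present before but are NaN now were not valid numbers.
coerced = df_demo[numeric_cols].isna().sum() - missing_before
print(coerced)
print(df_demo.dtypes)
```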


# 2. Define Financial Health Score Calculation
Now, we define a function to calculate the financial health score for each user based on their transactions, profile data, and goals.

In [35]:
# Function to calculate financial health score and provide reasons and recommendations
def calculate_financial_health(user_id, df_users, df_transactions, user_goal):
    # Get transactions for the user
    user_transactions = df_transactions[df_transactions['User ID'] == user_id]

    # Calculate total income and expenses ('Amount' is already float64, so no conversion is needed)
    is_income = user_transactions['Income/Expense'] == 'Income'
    is_expense = user_transactions['Income/Expense'] == 'Expense'
    total_income = user_transactions.loc[is_income, 'Amount'].sum()
    total_expenses = user_transactions.loc[is_expense, 'Amount'].sum()

    # Get user's savings and debts from the profile
    # (raises IndexError if the user ID is not present in df_users)
    saving_amount = df_users[df_users['User ID'] == user_id]['Saving Amount'].values[0]
    total_debts = df_users[df_users['User ID'] == user_id]['Total Debts'].values[0]

    # Get goal information and determine the weight based on goal importance
    goal_importance_weight = 0.2 if user_goal['Importance'] == 'Luxury' else 0.1

    # Calculate the financial health score
    # (the +1 in each denominator avoids division by zero)
    score = (saving_amount / (total_expenses + 1)) * 0.4 + (total_income / (total_debts + 1)) * 0.6 - goal_importance_weight
    score = round(score, 2)

    # Provide reasons for the financial health score
    reasons = []
    if total_income == 0:
        reasons.append("No recorded income.")
    else:
        reasons.append(f"Income is {total_income}.")

    if total_expenses > total_income:
        reasons.append(f"Expenses exceed income by {total_expenses - total_income}.")
    else:
        reasons.append(f"Expenses are {total_expenses}, within the income.")

    if total_debts > 0:
        reasons.append(f"Total debts are {total_debts}.")
    else:
        reasons.append("No debts recorded.")

    if saving_amount == 0:
        reasons.append("No savings available.")
    else:
        reasons.append(f"Savings amount is {saving_amount}.")

    # Recommendations based on the score and financial data.
    # Comparing against a multiple of income (instead of dividing by it)
    # avoids division by zero for users with no recorded income.
    recommendations = []
    if saving_amount < 0.1 * total_income:  # Savings less than 10% of income
        recommendations.append("Increase savings to at least 10% of your income.")

    if total_debts > 0.3 * total_income:  # Debt-to-income ratio above 30%
        recommendations.append("Focus on reducing debt to improve financial health.")

    if total_expenses > total_income:
        recommendations.append("Reduce your expenses to avoid financial strain.")

    if goal_importance_weight == 0.2:
        recommendations.append("Consider adjusting your luxury goals to match your income and savings.")

    return score, reasons, recommendations
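
To get a feel for how the score behaves, the expression can be evaluated by hand on a few numbers. The sketch below copies only the score expression into a hypothetical standalone helper, `score_formula`, and uses invented figures; it is an illustration, not part of the notebook's pipeline.

```python
def score_formula(saving_amount, total_expenses, total_income, total_debts, importance):
    """Standalone copy of the score expression, for trying out numbers by hand."""
    goal_importance_weight = 0.2 if importance == 'Luxury' else 0.1
    savings_term = (saving_amount / (total_expenses + 1)) * 0.4
    income_term = (total_income / (total_debts + 1)) * 0.6
    return round(savings_term + income_term - goal_importance_weight, 2)

# Hypothetical figures chosen so that both ratios equal 1.
balanced = score_formula(1000, 999, 2000, 1999, 'Luxury')

# Same user, but with no debts: the income term is divided by 1.
debt_free = score_formula(1000, 999, 2000, 0, 'Luxury')

print(balanced, debt_free)
```

Under these assumed inputs the first score is 0.8, while the debt-free case comes out around 1200, which suggests the score is unbounded and may need normalizing before scores are compared across users.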


# 4. Process New Transactions

In [36]:
# Function to process a new transaction and update the financial health score for an existing user
def process_transaction(new_transaction, df_users, df_transactions, df_goals):
    user_id = new_transaction['User ID']

    if user_id in df_users['User ID'].values:
        print(f"User ID {user_id} found, processing transaction.")

        # Add new transaction to the dataset
        new_transaction_df = pd.DataFrame([new_transaction])
        df_transactions = pd.concat([df_transactions, new_transaction_df], ignore_index=True)

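        # Fetch the user's first goal; without a goal the score cannot be recalculated
        user_goals = df_goals[df_goals['User ID'] == user_id]
        if user_goals.empty:
            print(f"No goals found for User ID {user_id}. Transaction stored, score unchanged.")
            return df_users, df_transactions, df_goals
        user_goal = user_goals.to_dict('records')[0]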

        # Recalculate the financial health score and get reasons and recommendations
        new_score, reasons, recommendations = calculate_financial_health(user_id, df_users, df_transactions, user_goal)

        # Update the user's financial health score
        df_users.loc[df_users['User ID'] == user_id, 'Financial Health Score'] = new_score

        # Output the results
        print(f"Updated Financial Health Score for {user_id}: {new_score}")
        print(f"Reasons for score: {', '.join(reasons)}")
        print(f"Recommendations: {', '.join(recommendations)}")

    else:
        print(f"User ID {user_id} not found. This function is only for existing users.")

    return df_users, df_transactions, df_goals


# Example of Processing a New Transaction

In [37]:
# Example of how to process a transaction for an existing user
# New transaction data
new_transaction = {
    'User ID': 'existing_user_id',  # Example user ID
    'Date': '2024-09-29',           # Example date
    'Mode of Payment': 'Credit Card',
    'Category': 'Shopping',
    'Subcategory': 'Clothes',
    'Amount': 250.75,
    'Income/Expense': 'Expense',
    'Account Balance': 5000.00
}

# Assuming df_users, df_transactions, and df_goals have been defined and contain data for existing users
df_users, df_transactions, df_goals = process_transaction(new_transaction, df_users, df_transactions, df_goals)


User ID existing_user_id not found. This function is only for existing users.
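
The placeholder `'existing_user_id'` does not occur in the data, which is why the call above reports that the user was not found. The sketch below shows, on hypothetical miniature tables (`users`, `transactions`, and the IDs `u1`/`u2` are invented), how an ID can be taken from the user table itself and how the same append pattern changes that user's totals.

```python
import pandas as pd

# Hypothetical miniature tables using the notebook's column names.
users = pd.DataFrame({'User ID': ['u1', 'u2'], 'Saving Amount': [500.0, 0.0]})
transactions = pd.DataFrame({
    'User ID': ['u1', 'u1'],
    'Amount': [2000.0, 300.0],
    'Income/Expense': ['Income', 'Expense'],
})

# Take an ID that is guaranteed to exist instead of a placeholder string.
user_id = users['User ID'].iloc[0]
new_transaction = {'User ID': user_id, 'Amount': 250.75, 'Income/Expense': 'Expense'}

# Same append pattern as process_transaction: wrap the dict in a one-row DataFrame.
transactions = pd.concat([transactions, pd.DataFrame([new_transaction])], ignore_index=True)

# Recompute the user's totals after the append.
mine = transactions[transactions['User ID'] == user_id]
totals = mine.groupby('Income/Expense')['Amount'].sum()
print(totals)
```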


# 3. Apply the Financial Health Score to All Users
Now, we use the function defined above to calculate the financial health score for each user.

In [None]:
# Initialize new columns for financial health scores, reasons, and recommendations
df_users['Financial Health Score'] = None
df_users['Reasons'] = None
df_users['Recommendations'] = None

# Iterate through each user and calculate their financial health score
for index, user in df_users.iterrows():
    user_id = user['User ID']

    # Get user goal information, assuming it exists for every user
    try:
        user_goal = df_goals[df_goals['User ID'] == user_id].to_dict('records')[0]
        # Calculate the financial health score
        score, reasons, recommendations = calculate_financial_health(user_id, df_users, df_transactions, user_goal)

        # Update the user's financial health score, reasons, and recommendations in df_users
        df_users.at[index, 'Financial Health Score'] = score
        df_users.at[index, 'Reasons'] = ', '.join(reasons)
        df_users.at[index, 'Recommendations'] = ', '.join(recommendations)

    except IndexError:
        print(f"No goals found for User ID {user_id}. Skipping.")

# Display updated DataFrame with financial health scores
print(df_users[['User ID', 'Financial Health Score', 'Reasons', 'Recommendations']].head())


Streaming output truncated to the last 5000 lines.

No goals found for User ID caca4d2c-b2ncer. Skipping.



No goals found for User ID f2a54332-e559-4f2-443a-a726-b3687820a9ba. Skipping.


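
The loop above filters the full transaction table once per user, which gets slow with roughly 159,000 profiles. A possible alternative, sketched here on hypothetical miniature tables (the names `users`, `transactions`, `totals`, and `per_user` are invented), is to compute every user's income and expense totals in a single grouped pass and join them onto the profiles.

```python
import pandas as pd

# Hypothetical miniature tables using the notebook's column names.
users = pd.DataFrame({'User ID': ['u1', 'u2', 'u3']})
transactions = pd.DataFrame({
    'User ID': ['u1', 'u1', 'u2'],
    'Amount': [2000.0, 300.0, 150.0],
    'Income/Expense': ['Income', 'Expense', 'Expense'],
})

# One pass over the transactions: sum Amount per user and per Income/Expense label,
# then spread the two labels into separate columns.
totals = (
    transactions
    .groupby(['User ID', 'Income/Expense'])['Amount']
    .sum()
    .unstack(fill_value=0.0)
    .reindex(columns=['Income', 'Expense'], fill_value=0.0)
    .reset_index()
)

# A left join keeps users without any transactions; their totals become 0.
per_user = (
    users.merge(totals, on='User ID', how='left')
         .fillna({'Income': 0.0, 'Expense': 0.0})
)
print(per_user)
```

From a table like `per_user`, the score expression could then be applied column-wise instead of row by row; the reasons and recommendations text would still need per-row handling.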