# Test Cases

## Overview
This notebook verifies that recurring events are generated correctly. It checks that events occur on the expected dates, counts the events per week, and compares actual results against expected results. Four DataFrames are used:
- **`existing_events_df`** – Current event definitions.
- **`existing_occurrences_df`** – Dates on which each existing event occurs, according to `occurs_on`.
- **`test_cases_data_df`** – Expected parameters for test cases.
- **`test_weekly_count_df`** – Actual results from `count_weekly_events`.


# A. DataFrames

## Imports

In [78]:
import pandas as pd
import os
from collections import Counter
from datetime import datetime, timedelta
from event_recurrance import *
from sample_event_list import existing_events
from test_cases import test_cases

## 1. Existing events data
**`existing_events_df`**

In [79]:
# Convert the list of events into a DataFrame
existing_events_df = pd.DataFrame([
    {
        "name": event.name,
        "start_date": event.start_date.strftime("%Y-%m-%d"),
        "end_date": event.end_date.strftime("%Y-%m-%d") if event.end_date else None,
        "recurrent_type": event.recurrent_type,
        "days": event.days,
        "interval": event.interval,
    }
    for event in existing_events
])

existing_events_df.head()

Unnamed: 0,name,start_date,end_date,recurrent_type,days,interval
0,Rent Payment,2024-03-01,,monthly,[1],1
1,Gym Membership,2024-03-05,,monthly,[5],1
2,Salary,2024-03-01,,monthly,"[1, 15]",1
3,Electric Bill,2024-03-10,,monthly,[10],1
4,Internet Bill,2024-03-15,,monthly,[15],1


## 2. Existing events occurrence
**`existing_occurrences_df`**

In [80]:
# Date range over which occurs_on is evaluated (end_date itself is not checked)
start_date = datetime(2024, 3, 1)  # Start date
end_date = datetime(2025, 3, 1)   # End date
days_between = (end_date - start_date).days

# Collect occurrences in a list first; appending to a list is faster than growing a DataFrame
existing_occurrences_list = []

print(f'Creating list for {days_between} days. Starting {start_date.strftime("%Y-%m-%d")}.')

events_counter = 0
# Loop through existing events and date range.
for i, event in enumerate(existing_events):
    current_date = start_date  # Reset current_date for each event

    for day in range(days_between):
        # Check if event occurs
        if event.occurs_on(current_date):
            # print(f'Creating event {event.name} for day {current_date.strftime("%Y-%m-%d")}.')
            existing_occurrences_list.append({"event_id": i, "event_name": event.name, "occurring_date": current_date})

        # Move to next day
        current_date += timedelta(days=1)  # Increment by 1 day
        events_counter += 1  # Counts every (event, day) check, not only matches


print(events_counter)

# Convert the list to a DataFrame
existing_occurrences_df = pd.DataFrame(existing_occurrences_list)
existing_occurrences_df



Creating list for 365 days. Starting 2024-03-01.
17520


Unnamed: 0,event_id,event_name,occurring_date
0,0,Rent Payment,2024-03-01
1,0,Rent Payment,2024-04-01
2,0,Rent Payment,2024-05-01
3,0,Rent Payment,2024-06-01
4,0,Rent Payment,2024-07-01
...,...,...,...
1203,47,Expense 49,2024-12-30
1204,47,Expense 49,2025-01-13
1205,47,Expense 49,2025-01-27
1206,47,Expense 49,2025-02-10


## 3. Test Cases Data
**`test_cases_data_df`**

In [81]:
# Convert to DataFrame
test_cases_data_df = pd.DataFrame.from_dict(test_cases, orient='index')

# Reset index to include case names and add case numbers
test_cases_data_df.reset_index(inplace=True)
test_cases_data_df.rename(columns={'index': 'case_description'}, inplace=True)
test_cases_data_df.insert(0, 'case_number', range(1, len(test_cases_data_df) + 1))

test_cases_data_df.head()

Unnamed: 0,case_number,case_description,name,start_date,recurrent_type,interval,days_of_week,days_of_month,end_date
0,1,Adding a new weekly event on Mondays,Monday Sync,2024-03-04,n-weekly,1.0,[0],,NaT
1,2,Adding a new bi-weekly event on Wednesdays,Bi-Weekly Stand-up,2024-03-06,n-weekly,2.0,[2],,NaT
2,3,Adding a new monthly event on the 15th,Mid-Month Review,2024-03-15,monthly,,,[15],NaT
3,4,Adding a weekly event overlapping multiple exi...,Busy Monday,2024-03-04,n-weekly,1.0,"[0, 3]",,NaT
4,5,Adding an event far into the future,Yearly Review,2025-03-01,monthly,,,[1],NaT


## 4. Run Test
**`test_weekly_count_df`**


In [82]:
test_frames = []  # One DataFrame per test case, concatenated once after the loop

# Run count_weekly_events for every test case and collect the weekly counts
for n, (title, new_event_data) in enumerate(test_cases.items(), start=1):

    # Summary
    print(f"Test Case {n}: {title}")
    scenario = count_weekly_events(new_event_data, existing_events)
    print(summary_weekly_events(scenario), "\n")

    # Add this test case's weekly counts to the results
    temp_df = pd.DataFrame(list(scenario.items()), columns=["week_start_date", "event_count"])
    temp_df["test_number"] = n
    test_frames.append(temp_df)

test_weekly_count_df = pd.concat(test_frames, ignore_index=True)

test_weekly_count_df.head(5)

Test Case 1: Adding a new weekly event on Mondays
Total events: 977 | Total weeks: 42 | Average events per week:  23 

Test Case 2: Adding a new bi-weekly event on Wednesdays
Total events: 977 | Total weeks: 42 | Average events per week:  23 

Test Case 3: Adding a new monthly event on the 15th
Total events: 987 | Total weeks: 42 | Average events per week:  24 

Test Case 4: Adding a weekly event overlapping multiple existing events
Total events: 977 | Total weeks: 42 | Average events per week:  23 

Test Case 5: Adding an event far into the future
Total events: 994 | Total weeks: 42 | Average events per week:  24 

Test Case 6: Weekly Salary Deposit
Total events: 957 | Total weeks: 42 | Average events per week:  23 

Test Case 7: Bi-weekly Grocery Shopping
Total events: 957 | Total weeks: 42 | Average events per week:  23 

Test Case 8: Monthly Rent Payment
Total events: 957 | Total weeks: 42 | Average events per week:  23 

Test Case 9: Electric Bill Payment
Total events: 987 | Total

Unnamed: 0,week_start_date,event_count,test_number
0,2024-03-04,15,1
1,2024-03-11,15,1
2,2024-03-18,24,1
3,2024-03-25,23,1
4,2024-04-01,23,1


## 5. Export DataFrames

In [83]:
# Take a single timestamp so the folder and file names cannot drift apart
now = datetime.now()
folder_time = now.strftime("%y-%b-%d")
subfolder_time = now.strftime("%H-%M-%S")
file_time = now.strftime("%y%m%d_%H%M%S")

folder_name = f"data/{folder_time}/{subfolder_time}"
os.makedirs(folder_name, exist_ok=True)

# Export data
existing_events_df.to_csv(f"{folder_name}/existing_events_{file_time}.csv")
existing_occurrences_df.to_csv(f"{folder_name}/existing_occurrences_{file_time}.csv")
test_cases_data_df.to_csv(f"{folder_name}/test_cases_{file_time}.csv")
test_weekly_count_df.to_csv(f"{folder_name}/test_results_{file_time}.csv")

print(f"Saved to: {folder_name}")

Saved to: data/25-Mar-14/19-14-55


# B. Verify Event Occurrences
**Function:** `occurs_on`

### Steps
1. **Check event dates (using `existing_events_df` and `existing_occurrences_df`)**:
   - Run `occurs_on` for each event in `existing_events` from March 1, 2024 up to (but not including) March 1, 2025 to build `existing_occurrences_df`.
   - Ensure each event appears on the correct dates.
   - Example: A weekly event on Mondays should happen every 7 days, always on a Monday.

2. **Validate test case parameters (using `test_cases_data_df` and `existing_events_df`)**, for both the existing events and the test cases:
   - **For n-weekly events:**
     - Ensure they repeat every `n * 7` days.
     - If an event occurs multiple times a week, treat each occurrence separately.

   - **For monthly events:**
     - Check that they occur on the correct day(s) of the month.
     - If a day doesn’t exist in a given month (e.g., 31st in February), confirm behavior:
       - **If `use_last_day` = True**, the event moves to the last valid day.
       - **If `use_last_day` = False**, the event is ignored that month.

✅ **Pass if:** Events occur on expected dates.
❌ **Fail if:** Missing or extra occurrences are found.
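
The `use_last_day` rule for monthly events can be illustrated with a small standard-library sketch. The helper name `resolve_monthly_date` and its signature are hypothetical and are not taken from `event_recurrance`; the sketch only shows the behavior described in the steps above, under the assumption that a missing day is either clamped to the last day of the month or skipped.

```python
import calendar
from datetime import date
from typing import Optional


def resolve_monthly_date(year: int, month: int, day: int, use_last_day: bool) -> Optional[date]:
    """Hypothetical helper: date of a monthly event in a given month, or None if skipped."""
    last_day = calendar.monthrange(year, month)[1]  # Number of days in this month
    if day <= last_day:
        return date(year, month, day)
    # The requested day does not exist in this month (e.g. the 31st in February)
    return date(year, month, last_day) if use_last_day else None


print(resolve_monthly_date(2024, 2, 31, use_last_day=True))   # 2024-02-29 (2024 is a leap year)
print(resolve_monthly_date(2024, 2, 31, use_last_day=False))  # None
print(resolve_monthly_date(2024, 3, 31, use_last_day=True))   # 2024-03-31
```

A check built on this idea would compare the resolved date for each month against the dates actually returned by `occurs_on`.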

## 1. Existing events

### Step 1
Check that each occurrence falls on one of the event's configured weekdays (n-weekly events) or days of the month (monthly events).
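
As a minimal sketch of the comparison this step relies on: Python's `date.weekday()` numbers Monday as 0 and Sunday as 6, while `date.day` is the day of the month. The helper name `actual_day` below is only illustrative; it assumes the `days` list of an n-weekly event holds weekday numbers and that of a monthly event holds days of the month.

```python
from datetime import date


def actual_day(occurring_date: date, recurrent_type: str) -> int:
    """Value to compare against an event's `days` list (illustrative helper)."""
    if recurrent_type == "monthly":
        return occurring_date.day      # Day of the month, 1-31
    return occurring_date.weekday()    # Monday = 0 ... Sunday = 6


# 2024-03-06 is a Wednesday, so an n-weekly event with days [2] matches
print(actual_day(date(2024, 3, 6), "n-weekly") in [2])      # True
# A monthly event compares the day of the month instead
print(actual_day(date(2024, 3, 15), "monthly") in [1, 15])  # True
```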

In [84]:
from collections import defaultdict

mismatches = defaultdict(list)


def check_days(events_df, events_occurrences):
    log_entries = []

    for event in events_df.itertuples():
        event_occurrences_df = events_occurrences.loc[events_occurrences["event_id"] == event.Index] # List of occurring_dates for this event.

        # Events Validation
        for occurrence in event_occurrences_df.itertuples():
            occurring_date = occurrence.occurring_date
            actual_day = occurring_date.day if event.recurrent_type == "monthly" else occurring_date.weekday()
            is_valid = actual_day in event.days

            log_entries.append([
                event.name, event.recurrent_type, occurring_date, actual_day, event.days, is_valid
            ]) # Append log entry

    # Convert logs into a DataFrame
    log_df = pd.DataFrame(
        log_entries,
        columns=["event", "recurrent_type", "occurring_date", "actual_day", "expected_days", "is_valid"]
    )


    # Print results
    match_df = log_df[log_df["is_valid"]]
    mismatch_df = log_df[~log_df["is_valid"]]
    rows_n = 5

    print("Monthly events:")
    display(match_df[match_df["recurrent_type"] == "monthly"].head(rows_n) )
    print("\nn-weekly events:")
    display(match_df[match_df["recurrent_type"] == "n-weekly"].head(rows_n) )
    print("\nMultiple days events:")
    display(match_df[match_df["expected_days"].apply(lambda x: len(x)>1)].head(rows_n) )

    print(f"\nCorrect matches {len(match_df)}")
    display(mismatch_df.head(10))
    print(f"\nIncorrect matches {len(mismatch_df)}")

    return log_df

In [85]:
days_log_df = check_days(existing_events_df, existing_occurrences_df)

Monthly events:


Unnamed: 0,event,recurrent_type,occurring_date,actual_day,expected_days,is_valid
0,Rent Payment,monthly,2024-03-01,1,[1],True
1,Rent Payment,monthly,2024-04-01,1,[1],True
2,Rent Payment,monthly,2024-05-01,1,[1],True
3,Rent Payment,monthly,2024-06-01,1,[1],True
4,Rent Payment,monthly,2024-07-01,1,[1],True



n-weekly events:


Unnamed: 0,event,recurrent_type,occurring_date,actual_day,expected_days,is_valid
96,Freelance Work,n-weekly,2024-03-06,2,[2],True
97,Freelance Work,n-weekly,2024-03-20,2,[2],True
98,Freelance Work,n-weekly,2024-04-03,2,[2],True
99,Freelance Work,n-weekly,2024-04-17,2,[2],True
100,Freelance Work,n-weekly,2024-05-01,2,[2],True



Multiple days events:


Unnamed: 0,event,recurrent_type,occurring_date,actual_day,expected_days,is_valid
24,Salary,monthly,2024-03-01,1,"[1, 15]",True
25,Salary,monthly,2024-03-15,15,"[1, 15]",True
26,Salary,monthly,2024-04-01,1,"[1, 15]",True
27,Salary,monthly,2024-04-15,15,"[1, 15]",True
28,Salary,monthly,2024-05-01,1,"[1, 15]",True



Correct matches 1208


Unnamed: 0,event,recurrent_type,occurring_date,actual_day,expected_days,is_valid



Incorrect matches 0


### Export Step 1 results data

In [86]:
# Take a single timestamp so the folder and file names cannot drift apart
now = datetime.now()
folder_time = now.strftime("%y-%b-%d")
subfolder_time = now.strftime("%H-%M-%S")
file_time = now.strftime("%y%m%d_%H%M%S")

folder_name = f"data/{folder_time}/{subfolder_time}/logs"
os.makedirs(folder_name, exist_ok=True)

# Export data
days_log_df.to_csv(f"{folder_name}/check_days_{file_time}.csv")

print(f"Saved to: {folder_name}/")

Saved to: data/25-Mar-14/19-14-55/logs/


### Step 2
Check that the occurrences of existing events happen at the expected frequency.
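
The frequency rule for n-weekly events (consecutive occurrences on the same weekday are `interval * 7` days apart) can be sketched with the standard library alone. The function name `gaps_match_interval` and the sample dates are illustrative assumptions, not part of the notebook's modules.

```python
from datetime import date, timedelta


def gaps_match_interval(dates, interval):
    """True if consecutive dates are always exactly interval * 7 days apart."""
    ordered = sorted(dates)
    # Distinct gaps, in days, between each pair of consecutive dates
    gaps = {(later - earlier).days for earlier, later in zip(ordered, ordered[1:])}
    # Passes only if there is a single distinct gap and it equals the expected one
    return gaps == {interval * 7}


biweekly = [date(2024, 3, 6) + timedelta(days=14 * k) for k in range(5)]
print(gaps_match_interval(biweekly, interval=2))                     # True
print(gaps_match_interval(biweekly[:2] + biweekly[3:], interval=2))  # False: one occurrence is missing
```

With fewer than two dates there are no gaps to compare, so this sketch returns `False` rather than treating the event as validated.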

In [87]:
display(existing_events_df.head(10))
display(existing_occurrences_df.head(10))

Unnamed: 0,name,start_date,end_date,recurrent_type,days,interval
0,Rent Payment,2024-03-01,,monthly,[1],1
1,Gym Membership,2024-03-05,,monthly,[5],1
2,Salary,2024-03-01,,monthly,"[1, 15]",1
3,Electric Bill,2024-03-10,,monthly,[10],1
4,Internet Bill,2024-03-15,,monthly,[15],1
5,Netflix Subscription,2024-03-20,,monthly,[20],1
6,Coffee Subscription,2024-03-07,,monthly,[7],1
7,Freelance Work,2024-03-03,,n-weekly,[2],2
8,Spotify Subscription,2024-03-25,,monthly,[25],1
9,Groceries,2024-03-03,,n-weekly,[6],1


Unnamed: 0,event_id,event_name,occurring_date
0,0,Rent Payment,2024-03-01
1,0,Rent Payment,2024-04-01
2,0,Rent Payment,2024-05-01
3,0,Rent Payment,2024-06-01
4,0,Rent Payment,2024-07-01
5,0,Rent Payment,2024-08-01
6,0,Rent Payment,2024-09-01
7,0,Rent Payment,2024-10-01
8,0,Rent Payment,2024-11-01
9,0,Rent Payment,2024-12-01


In [88]:
# Get the frequency of occurrences and compare it to the n_weekly parameters
def check_freq(events_df, events_occurrences_df):
    weekly_results = []
    monthly_results = []

    for event in events_df.itertuples():
        # Get list of occurring_dates for this event.
        event_occurrences = events_occurrences_df.loc[events_occurrences_df["event_id"] == event.Index].copy()

        for current_day in event.days:

            # Get list of occurring_dates for this day.
            if event.recurrent_type == "n-weekly":
                # Filter occurrences
                event_occurrences['weekday'] = event_occurrences['occurring_date'].dt.weekday
                event_occurrences.sort_values(by=["occurring_date"], ascending=True, inplace=True)

                day_occurrences = event_occurrences.loc[event_occurrences["weekday"] == current_day].copy()
                expected_delta = event.interval * 7

                # Gap in days between consecutive occurrences on this weekday
                day_occurrences["delta"] = day_occurrences["occurring_date"].diff().dt.days
                delta_counts = day_occurrences["delta"].value_counts()  # value_counts drops the leading NaN
                most_frequent_delta = int(delta_counts.idxmax())
                most_frequent_count = int(delta_counts.max())
                # Valid only if every gap is identical and equals the expected interval
                all_same_frequency = len(delta_counts) == 1 and most_frequent_delta == expected_delta

                # Append results to the list
                weekly_results.append({
                    "event_id": event.Index,
                    "event_name": event.name,
                    "validated": all_same_frequency,
                    "most_frequent_delta": most_frequent_delta,
                    "most_frequent_count": most_frequent_count,
                    "expected_delta": expected_delta,
                    "recurrent_type": event.recurrent_type,
                    "interval": event.interval,
                    "day": current_day
                })

            else:
                # Filter occurrences
                day_occurrences = event_occurrences.loc[event_occurrences["occurring_date"].dt.day == current_day].copy()

                # Append results to the list
                monthly_results.append({
                    "event_id": event.Index,
                    "event_name": event.name,
                    "validated": day_occurrences["occurring_date"].dt.day.eq(current_day).all(),
                    "actual_day": current_day,
                    "occurrences_days": day_occurrences["occurring_date"].dt.day.tolist(),

                })


    weekly_results_df = pd.DataFrame(weekly_results)
    monthly_results_df = pd.DataFrame(monthly_results)

    # Test Stats
    def get_stats(df):
        df_total = df.shape[0]
        df_valid = df["validated"].sum()
        df_percent = df_valid / df_total
        return df_total, df_valid, df_percent

    weekly_total, weekly_valid, weekly_percent = get_stats(weekly_results_df)
    monthly_total, monthly_valid, monthly_percent = get_stats(monthly_results_df)

    overall_total = weekly_total + monthly_total
    overall_valid = weekly_valid + monthly_valid
    overall_percent = overall_valid / overall_total


    print(f"Total Events by day validated {overall_valid}/{overall_total} {overall_percent*100: 0.0f}%")

    print(f"\n Weekly Events by day | Validated: {weekly_valid}/{weekly_total} {weekly_percent*100: 0.0f}%")
    display(weekly_results_df)
    print(f"\n Monthly Events by day | Validated: {monthly_valid}/{monthly_total} {monthly_percent*100: 0.0f}%")
    display(monthly_results_df)

    return weekly_results_df, monthly_results_df


In [89]:
weekly_validation_df, monthly_validation_df = check_freq(existing_events_df, existing_occurrences_df)

Total Events by day validated 49/49  100%

 Weekly Events by day | Validated: 38/38  100%


Unnamed: 0,event_id,event_name,validated,most_frequent_delta,most_frequent_count,expected_delta,recurrent_type,interval,day
0,7,Freelance Work,True,14,25,14,n-weekly,2,2
1,9,Groceries,True,7,51,7,n-weekly,1,6
2,12,Savings Deposit,True,14,25,14,n-weekly,2,4
3,13,Dining Out,True,7,51,7,n-weekly,1,1
4,13,Dining Out,True,7,50,7,n-weekly,1,5
5,15,Expense 17,True,14,24,14,n-weekly,2,3
6,16,Expense 18,True,21,16,21,n-weekly,3,4
7,17,Expense 19,True,28,12,28,n-weekly,4,5
8,18,Expense 20,True,7,48,7,n-weekly,1,6
9,19,Expense 21,True,14,24,14,n-weekly,2,0



 Monthly Events by day | Validated: 11/11  100%


Unnamed: 0,event_id,event_name,validated,actual_day,occurrences_days
0,0,Rent Payment,True,1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]"
1,1,Gym Membership,True,5,"[5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]"
2,2,Salary,True,1,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]"
3,2,Salary,True,15,"[15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]"
4,3,Electric Bill,True,10,"[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]"
5,4,Internet Bill,True,15,"[15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]"
6,5,Netflix Subscription,True,20,"[20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20]"
7,6,Coffee Subscription,True,7,"[7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]"
8,8,Spotify Subscription,True,25,"[25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25]"
9,10,Insurance Payment,True,12,"[12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12]"


### Export Step 2 results

In [90]:
# Take a single timestamp so the folder and file names cannot drift apart
now = datetime.now()
folder_time = now.strftime("%y-%b-%d")
subfolder_time = now.strftime("%H-%M-%S")
file_time = now.strftime("%y%m%d_%H%M%S")

folder_name = f"data/{folder_time}/{subfolder_time}/logs"
os.makedirs(folder_name, exist_ok=True)

# Export data
weekly_validation_df.to_csv(f"{folder_name}/weekly_validation_{file_time}.csv")
monthly_validation_df.to_csv(f"{folder_name}/monthly_validation_{file_time}.csv")

print(f"Saved to: {folder_name}/")

Saved to: data/25-Mar-14/19-14-56/logs/


## 2. Test Cases

- For each test case:
    - Verify that each `week_start_date` matches the case parameters and that no week is missing.
    - Check that the weekly count matches the actual weekly count (`events_weekly_count_df`).
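
One way to look for missing weeks is to walk the Mondays of the evaluated range and report those without an entry. This is a sketch using only the standard library; the names `week_start` and `missing_weeks` and the sample data are hypothetical, and it assumes weeks start on Monday, as the `week_start_date` values in this notebook do.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def missing_weeks(week_starts, first: date, last: date):
    """Mondays between first and last (inclusive) that are absent from week_starts."""
    present = set(week_starts)
    current, missing = week_start(first), []
    while current <= week_start(last):
        if current not in present:
            missing.append(current)
        current += timedelta(weeks=1)
    return missing


reported = [date(2024, 3, 4), date(2024, 3, 11), date(2024, 3, 25)]
print(missing_weeks(reported, date(2024, 3, 4), date(2024, 3, 31)))
# [datetime.date(2024, 3, 18)]
```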

### Actual weekly count

In [103]:
events_weekly_count_df = existing_occurrences_df.copy()
events_weekly_count_df['week_start_date'] = events_weekly_count_df['occurring_date'] - pd.to_timedelta(events_weekly_count_df['occurring_date'].dt.weekday, unit='D')
display(events_weekly_count_df)

# Count occurrences of each Monday date
events_weekly_count_df = events_weekly_count_df['week_start_date'].value_counts().reset_index()
events_weekly_count_df.columns = ['week_start_date', 'count']
print("Actual events weekly count")
display(events_weekly_count_df.describe())

# Weekly rows reported for test case 2, restricted to the evaluated date range
in_range = (test_weekly_count_df['week_start_date'] >= start_date) & (test_weekly_count_df['week_start_date'] <= end_date)
test_weekly_count_agg_df = test_weekly_count_df[in_range & (test_weekly_count_df["test_number"] == 2)]
test_weekly_count_agg_df = test_weekly_count_agg_df['week_start_date'].value_counts().reset_index()
print("Test cases weekly count")
test_weekly_count_agg_df

Unnamed: 0,event_id,event_name,occurring_date,week_start_date
0,0,Rent Payment,2024-03-01,2024-02-26
1,0,Rent Payment,2024-04-01,2024-04-01
2,0,Rent Payment,2024-05-01,2024-04-29
3,0,Rent Payment,2024-06-01,2024-05-27
4,0,Rent Payment,2024-07-01,2024-07-01
...,...,...,...,...
1203,47,Expense 49,2024-12-30,2024-12-30
1204,47,Expense 49,2025-01-13,2025-01-13
1205,47,Expense 49,2025-01-27,2025-01-27
1206,47,Expense 49,2025-02-10,2025-02-10


Actual events weekly count


Unnamed: 0,week_start_date,count
count,53,53.0
mean,2024-08-26 00:00:00,22.792453
min,2024-02-26 00:00:00,3.0
25%,2024-05-27 00:00:00,22.0
50%,2024-08-26 00:00:00,24.0
75%,2024-11-25 00:00:00,24.0
max,2025-02-24 00:00:00,27.0
std,,3.819779


Test cases weekly count


Unnamed: 0,week_start_date,count
0,2024-03-04,1
1,2024-03-11,1
2,2024-03-18,1
3,2024-03-25,1
4,2024-04-01,1
5,2024-04-08,1
6,2024-04-15,1
7,2024-04-22,1
8,2024-04-29,1
9,2024-05-06,1


In [92]:
display(test_cases_data_df.head(10))
display(test_weekly_count_df)


Unnamed: 0,case_number,case_description,name,start_date,recurrent_type,interval,days_of_week,days_of_month,end_date
0,1,Adding a new weekly event on Mondays,Monday Sync,2024-03-04,n-weekly,1.0,[0],,NaT
1,2,Adding a new bi-weekly event on Wednesdays,Bi-Weekly Stand-up,2024-03-06,n-weekly,2.0,[2],,NaT
2,3,Adding a new monthly event on the 15th,Mid-Month Review,2024-03-15,monthly,,,[15],NaT
3,4,Adding a weekly event overlapping multiple exi...,Busy Monday,2024-03-04,n-weekly,1.0,"[0, 3]",,NaT
4,5,Adding an event far into the future,Yearly Review,2025-03-01,monthly,,,[1],NaT
5,6,Weekly Salary Deposit,Weekly Salary,2024-03-01,n-weekly,1.0,[4],,NaT
6,7,Bi-weekly Grocery Shopping,Grocery Shopping,2024-03-03,n-weekly,2.0,[6],,NaT
7,8,Monthly Rent Payment,Rent,2024-03-01,monthly,,,[1],NaT
8,9,Electric Bill Payment,Electric Bill,2024-03-15,monthly,,,[15],NaT
9,10,One-time Medical Expense,Medical Check-up,2024-04-05,,,,,2024-04-05


Unnamed: 0,week_start_date,event_count,test_number
0,2024-03-04,15,1
1,2024-03-11,15,1
2,2024-03-18,24,1
3,2024-03-25,23,1
4,2024-04-01,23,1
...,...,...,...
500,2024-12-02,24,13
501,2024-12-09,23,13
502,2024-12-16,23,13
503,2024-12-23,25,13
