# Friday Purchases II using Pandas

Fridays have always been a popular day for shopping, whether it's payday splurges, weekend preparations, or early holiday shopping in November. But how much do people actually spend on Fridays, and how does it vary throughout the month?

In this analysis, we take a closer look at consumer spending trends on Fridays in November 2023. By filtering transaction data to isolate purchases made on Fridays, we aim to uncover patterns in weekly spending and determine whether certain Fridays see more significant activity than others.

Using Python and Pandas, we list every Friday in the month, number each one by its week, and sum the spending for each Friday, reporting 0 for any Friday with no purchases. The result is a four-row table with one row per Friday.

Join us as we walk through the process, from data filtering to aggregation, and discover how purchase patterns shift across different weeks. Whether you're a data enthusiast, a business analyst, or just curious about consumer behavior, this deep dive will provide valuable takeaways!

**Table: Purchases**

| Column Name   | Type |
|---------------|------|
| user_id       | int  |
| purchase_date | date |
| amount_spend  | int  |

(user_id, purchase_date, amount_spend) is the primary key (combination of columns with unique values) for this table.
purchase_date ranges from November 1, 2023, to November 30, 2023, inclusive.
Each row contains a user id, a purchase date, and the amount spent.

Write a solution to calculate the total spending by users on each Friday of every week in November 2023. If there are no purchases on a particular Friday, the total for that Friday is 0.

Return the result table ordered by week of month in ascending order.

The result format is in the following example.


Example 1:

Input:
**Purchases table:**

| user_id | purchase_date | amount_spend |
|---------|---------------|--------------|
| 11      | 2023-11-07    | 1126         |
| 15      | 2023-11-30    | 7473         |
| 17      | 2023-11-14    | 2414         |
| 12      | 2023-11-24    | 9692         |
| 8       | 2023-11-03    | 5117         |
| 1       | 2023-11-16    | 5241         |
| 10      | 2023-11-12    | 8266         |
| 13      | 2023-11-24    | 12000        |

Output:

| week_of_month | purchase_date | total_amount |
|---------------|---------------|--------------|
| 1             | 2023-11-03    | 5117         |
| 2             | 2023-11-10    | 0            |
| 3             | 2023-11-17    | 0            |
| 4             | 2023-11-24    | 21692        |

Explanation:
- During the first week of November 2023, transactions amounting to \$5,117 occurred on Friday, 2023-11-03.
- For the second week of November 2023, there were no transactions on Friday, 2023-11-10, resulting in a value of 0 in the output table for that day.
- Similarly, during the third week of November 2023, there were no transactions on Friday, 2023-11-17, reflected as 0 in the output table for that specific day.
- In the fourth week of November 2023, two transactions took place on Friday, 2023-11-24, amounting to \$12,000 and \$9,692 respectively, summing up to a total of \$21,692.
Output table is ordered by week_of_month in ascending order.
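
Before turning to Pandas, the expected totals can be checked by hand. The following is a minimal sketch using only the standard library; the names `rows` and `friday_totals` are ours and are not part of the problem statement.

```python
from datetime import date

# Sample rows from the Purchases table above: (user_id, purchase_date, amount_spend)
rows = [
    (11, date(2023, 11, 7), 1126),
    (15, date(2023, 11, 30), 7473),
    (17, date(2023, 11, 14), 2414),
    (12, date(2023, 11, 24), 9692),
    (8, date(2023, 11, 3), 5117),
    (1, date(2023, 11, 16), 5241),
    (10, date(2023, 11, 12), 8266),
    (13, date(2023, 11, 24), 12000),
]

# date.weekday() returns 0 for Monday through 6 for Sunday, so Friday is 4
friday_totals = {}
for _, day, amount in rows:
    if day.weekday() == 4:
        key = day.isoformat()
        friday_totals[key] = friday_totals.get(key, 0) + amount

print(friday_totals)
```

Only 2023-11-03 and 2023-11-24 appear in the dictionary. The two Fridays without purchases are absent, which is why the Pandas steps below need an explicit list of all Fridays.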

In [28]:
import pandas as pd

data = [[11, '2023-11-07', 1126],
        [15, '2023-11-30', 7473],
        [17, '2023-11-14', 2414],
        [12, '2023-11-24', 9692],
        [8, '2023-11-03', 5117],
        [1, '2023-11-16', 5241],
        [10, '2023-11-12', 8266],
        [13, '2023-11-24', 12000]]
purchases = pd.DataFrame(data,
                         columns=['user_id',
                                  'purchase_date',
                                  'amount_spend']).astype({
                                  'user_id':'Int64',
                                  'purchase_date':'datetime64[ns]',
                                  'amount_spend':'Int64'})
display(purchases)

Unnamed: 0,user_id,purchase_date,amount_spend
0,11,2023-11-07,1126
1,15,2023-11-30,7473
2,17,2023-11-14,2414
3,12,2023-11-24,9692
4,8,2023-11-03,5117
5,1,2023-11-16,5241
6,10,2023-11-12,8266
7,13,2023-11-24,12000


**Step 1: Generate a Date Range for November 2023**
- This creates a range of dates from November 1, 2023, to November 30, 2023, with a daily frequency.
- A DataFrame df is created with a column purchase_date containing all these dates.

In [29]:
date_range = pd.date_range(start='2023-11-01', end='2023-11-30', freq='D')
df = pd.DataFrame({'purchase_date': date_range})
display(df.head())
display(df.tail())

Unnamed: 0,purchase_date
0,2023-11-01
1,2023-11-02
2,2023-11-03
3,2023-11-04
4,2023-11-05


Unnamed: 0,purchase_date
25,2023-11-26
26,2023-11-27
27,2023-11-28
28,2023-11-29
29,2023-11-30


**Step 2: Extract the Day of the Week**
- A new column day is added to df, containing the name of the day (e.g., "Monday", "Tuesday", etc.) for each date.

In [30]:
df["day"] = df["purchase_date"].dt.day_name()
display(df.head(7))

Unnamed: 0,purchase_date,day
0,2023-11-01,Wednesday
1,2023-11-02,Thursday
2,2023-11-03,Friday
3,2023-11-04,Saturday
4,2023-11-05,Sunday
5,2023-11-06,Monday
6,2023-11-07,Tuesday
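
As a possible alternative to matching on day names, the weekday can be read as a number with `dt.dayofweek`, where Monday is 0 and Friday is 4. This is a sketch rather than part of the original solution; `dates`, `day_number`, and `is_friday` are illustrative names.

```python
import pandas as pd

# Rebuild the daily range for November 2023
dates = pd.DataFrame(
    {"purchase_date": pd.date_range("2023-11-01", "2023-11-30", freq="D")}
)

# dt.dayofweek numbers the days Monday=0 ... Sunday=6, so Friday is 4
dates["day_number"] = dates["purchase_date"].dt.dayofweek
is_friday = dates["day_number"] == 4

print(dates.loc[is_friday, "purchase_date"].dt.day.tolist())
```

Comparing integers avoids any dependence on how day names are spelled.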


**Step 3: Filter for Fridays Only**
- The DataFrame df is filtered to retain only the rows where the day is "Friday".

In [31]:
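# .copy() makes df an independent DataFrame, so adding columns later does not raise SettingWithCopyWarning
df = df[df["day"] == "Friday"].copy()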
display(df)

Unnamed: 0,purchase_date,day
2,2023-11-03,Friday
9,2023-11-10,Friday
16,2023-11-17,Friday
23,2023-11-24,Friday
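
Steps 1 to 3 can also be collapsed into a single call, assuming we only need the Fridays and not the other days. The sketch below uses the `W-FRI` frequency alias of `pd.date_range`, which yields one date per week anchored on Friday; the variable name `fridays` is ours.

```python
import pandas as pd

# "W-FRI" is a weekly frequency anchored on Friday, so only Fridays are generated
fridays = pd.date_range(start="2023-11-01", end="2023-11-30", freq="W-FRI")

print(fridays.strftime("%Y-%m-%d").tolist())
```

This produces the same four dates as the filtered DataFrame above.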


**Step 4: Assign a Week Number to Each Friday**
- A new column week_of_month is created, numbering each Friday sequentially (1 for the first Friday, 2 for the second, etc.).

In [32]:
df["week_of_month"] = range(1, len(df) + 1)
display(df)

Unnamed: 0,purchase_date,day,week_of_month
2,2023-11-03,Friday,1
9,2023-11-10,Friday,2
16,2023-11-17,Friday,3
23,2023-11-24,Friday,4
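
The sequential numbering relies on the rows already being in date order. A sketch of an order-independent alternative derives the number from the day of the month: the first occurrence of any weekday falls on days 1 to 7, the second on days 8 to 14, and so on. The frame `fridays` here is a hypothetical stand-in for df.

```python
import pandas as pd

# Stand-in for the filtered df from Step 3
fridays = pd.DataFrame(
    {"purchase_date": pd.to_datetime(
        ["2023-11-03", "2023-11-10", "2023-11-17", "2023-11-24"]
    )}
)

# Days 1-7 map to 1, days 8-14 map to 2, and so on
fridays["week_of_month"] = (fridays["purchase_date"].dt.day - 1) // 7 + 1

print(fridays)
```

Note that this counts the n-th Friday of the month, which matches the numbering used in this problem but is not the same as a calendar week number.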


**Step 5: Convert Dates to String Format**
- The purchase_date column is converted from datetime format to string format.

In [33]:
df["purchase_date"] = df["purchase_date"].astype(str)
display(df)

Unnamed: 0,purchase_date,day,week_of_month
2,2023-11-03,Friday,1
9,2023-11-10,Friday,2
16,2023-11-17,Friday,3
23,2023-11-24,Friday,4


**Step 6: Create a List of Fridays**
- A list friday_list is created, containing only the string representations of the Fridays in November.

In [34]:
friday_list = df["purchase_date"].tolist()
print(friday_list)

['2023-11-03', '2023-11-10', '2023-11-17', '2023-11-24']


**Step 7: Filter Purchases for Fridays Only**
- The purchase_date column in purchases is converted to a string format.
- The purchases DataFrame is filtered to keep only those rows where purchase_date is in friday_list, meaning only purchases made on Fridays are retained.

In [35]:
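# Convert to strings so the values can be compared with friday_list
purchases["purchase_date"] = purchases["purchase_date"].astype(str)
# The column is already a string; .copy() avoids SettingWithCopyWarning in the next step
purchases = purchases[purchases["purchase_date"].isin(friday_list)].copy()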
display(purchases)

Unnamed: 0,user_id,purchase_date,amount_spend
3,12,2023-11-24,9692
4,8,2023-11-03,5117
7,13,2023-11-24,12000
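
If the string list is not needed, the same filter can be written directly on the datetime column. This is a sketch under the assumption that purchase_date is still a datetime64 column; `sample` and `friday_purchases` are illustrative names.

```python
import pandas as pd

# Sample data from the example, with purchase_date parsed as datetime64
sample = pd.DataFrame({
    "user_id": [11, 15, 17, 12, 8, 1, 10, 13],
    "purchase_date": pd.to_datetime([
        "2023-11-07", "2023-11-30", "2023-11-14", "2023-11-24",
        "2023-11-03", "2023-11-16", "2023-11-12", "2023-11-24",
    ]),
    "amount_spend": [1126, 7473, 2414, 9692, 5117, 5241, 8266, 12000],
})

# Keep only rows whose weekday number is 4 (Friday)
friday_purchases = sample[sample["purchase_date"].dt.dayofweek == 4]

print(friday_purchases)
```

The same three rows survive as in the string-based filter above.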


**Step 8: Maintain Friday Order in Purchases**
- The purchase_date column is converted into a categorical variable, with categories explicitly set to friday_list.
- This keeps the Fridays in calendar order when sorting or grouping, and it keeps Fridays with no purchases (2023-11-10 and 2023-11-17) as categories even though no row carries them.

In [36]:
purchases["purchase_date"] = purchases["purchase_date"].astype("category").cat.set_categories(friday_list)
display(purchases)

Unnamed: 0,user_id,purchase_date,amount_spend
3,12,2023-11-24,9692
4,8,2023-11-03,5117
7,13,2023-11-24,12000
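
The displayed table looks unchanged, so it may help to inspect the categories themselves. The sketch below rebuilds the column on a small stand-in Series (`dates`, `as_category`, and `counts` are our names) and shows that all four Fridays are registered, including the two that never occur in the data.

```python
import pandas as pd

friday_list = ["2023-11-03", "2023-11-10", "2023-11-17", "2023-11-24"]

# Stand-in for the filtered purchase_date column
dates = pd.Series(["2023-11-24", "2023-11-03", "2023-11-24"])

# Same conversion as in the cell above
as_category = dates.astype("category").cat.set_categories(friday_list)

# All four Fridays are categories, even those with no rows
print(as_category.cat.categories.tolist())

# value_counts on a categorical Series also reports unused categories
counts = as_category.value_counts().to_dict()
print(counts)
```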


**Step 9: Group Purchases by Date and Sum the Amounts**
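- The purchases DataFrame is grouped by purchase_date, and amount_spend is summed for each Friday.
- Because purchase_date is categorical and observed=False is passed, Fridays with no purchases are kept as groups, and their sum is 0.
- The result is a DataFrame with purchase_date and the total amount spent per Friday.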

In [37]:
purchases = purchases.groupby(["purchase_date"], observed=False)["amount_spend"].sum().reset_index()
display(purchases)

Unnamed: 0,purchase_date,amount_spend
0,2023-11-03,5117
1,2023-11-10,0
2,2023-11-17,0
3,2023-11-24,21692
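
The zero rows can also be produced without a categorical column. One possible sketch groups the plain string dates and then calls `reindex` with `fill_value=0` to add the missing Fridays; `friday_purchases` and `totals` are illustrative names.

```python
import pandas as pd

friday_list = ["2023-11-03", "2023-11-10", "2023-11-17", "2023-11-24"]

# Stand-in for the purchases that fall on a Friday
friday_purchases = pd.DataFrame({
    "purchase_date": ["2023-11-24", "2023-11-03", "2023-11-24"],
    "amount_spend": [9692, 5117, 12000],
})

totals = (
    friday_purchases.groupby("purchase_date")["amount_spend"]
    .sum()
    # Add the Fridays that have no rows, with a total of 0
    .reindex(friday_list, fill_value=0)
    .rename_axis("purchase_date")
    .reset_index()
)

print(totals)
```

Whether this or the categorical approach reads better is a matter of taste; both give one row per Friday.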


**Step 10: Rename Column for Clarity**
- The column amount_spend is renamed to total_amount for better readability.

In [38]:
purchases = purchases.rename(columns={"amount_spend": "total_amount"})
display(purchases)

Unnamed: 0,purchase_date,total_amount
0,2023-11-03,5117
1,2023-11-10,0
2,2023-11-17,0
3,2023-11-24,21692


**Step 11: Merge with the Original Friday DataFrame**
- The DataFrame purchases is merged with df (which contains week_of_month) on their shared purchase_date column, so that each row also has the corresponding week number.

In [39]:
purchases = purchases.merge(df, on="purchase_date", how="left")
display(purchases)

Unnamed: 0,purchase_date,total_amount,day,week_of_month
0,2023-11-03,5117,Friday,1
1,2023-11-10,0,Friday,2
2,2023-11-17,0,Friday,3
3,2023-11-24,21692,Friday,4
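
When merging on a date key, it can be worth asserting that each date appears exactly once on both sides. A sketch using the `validate` parameter of `DataFrame.merge` follows; the frames `totals` and `fridays` are hypothetical stand-ins for purchases and df.

```python
import pandas as pd

# Stand-ins for the aggregated purchases and the Friday lookup table
totals = pd.DataFrame({
    "purchase_date": ["2023-11-03", "2023-11-10", "2023-11-17", "2023-11-24"],
    "total_amount": [5117, 0, 0, 21692],
})
fridays = pd.DataFrame({
    "purchase_date": ["2023-11-03", "2023-11-10", "2023-11-17", "2023-11-24"],
    "week_of_month": [1, 2, 3, 4],
})

# validate="one_to_one" raises MergeError if a date is duplicated on either side
merged = totals.merge(
    fridays, on="purchase_date", how="left", validate="one_to_one"
)

print(merged)
```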


**Step 12: Select Final Columns**
- The final DataFrame retains only three columns:
  - week_of_month (week number of the month for the Friday)
  - purchase_date (date of the Friday)
  - total_amount (total spend on that Friday)

In [40]:
purchases = purchases[['week_of_month', 'purchase_date', 'total_amount']]
display(purchases)

Unnamed: 0,week_of_month,purchase_date,total_amount
0,1,2023-11-03,5117
1,2,2023-11-10,0
2,3,2023-11-17,0
3,4,2023-11-24,21692
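
For reference, the whole walkthrough can be condensed into one function. This is a compact sketch, not the step-by-step method above: it assumes purchase_date is a datetime64 column, hard-codes November 2023 as the problem does, and the function name `friday_purchases` is our own.

```python
import pandas as pd


def friday_purchases(purchases: pd.DataFrame) -> pd.DataFrame:
    """Total spending on each Friday of November 2023, with 0 for empty Fridays."""
    # One date per week, anchored on Friday
    fridays = pd.date_range("2023-11-01", "2023-11-30", freq="W-FRI")

    totals = (
        purchases.groupby("purchase_date")["amount_spend"]
        .sum()
        # Keep only Fridays and add the ones with no purchases as 0
        .reindex(fridays, fill_value=0)
    )

    result = totals.rename_axis("purchase_date").reset_index(name="total_amount")
    # Fridays are in date order, so sequential numbering gives the week of month
    result.insert(0, "week_of_month", list(range(1, len(result) + 1)))
    return result


sample = pd.DataFrame({
    "user_id": [11, 15, 17, 12, 8, 1, 10, 13],
    "purchase_date": pd.to_datetime([
        "2023-11-07", "2023-11-30", "2023-11-14", "2023-11-24",
        "2023-11-03", "2023-11-16", "2023-11-12", "2023-11-24",
    ]),
    "amount_spend": [1126, 7473, 2414, 9692, 5117, 5241, 8266, 12000],
})

result = friday_purchases(sample)
print(result)
```

Unlike the walkthrough, this version leaves purchase_date as datetime values rather than strings.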


Reference: [1] https://leetcode.com/problems/friday-purchases-ii/description/