
# Day 8 - Lululemon
You are a Product Analyst for the **Lululemon** Online Store team investigating how alternative
payment methods might influence sales performance. The team wants to understand the
potential impact of introducing a new installment payment option. Your analysis will predict
sales lift and customer conversion for the proposed payment method.

In [2]:
# Import required libraries
import pandas as pd
import numpy as np

# Import data file
url = "https://raw.githubusercontent.com/AnamHJ24/datascience-python-challenges/refs/heads/main/Data/Day8.txt"
fct_transactions = pd.read_csv(url)
fct_transactions.head()

Unnamed: 0,customer_id,order_value,payment_method,transaction_id,transaction_date
0,201,250.0,credit_card,1,2025-03-15
1,202,95.0,debit_card,2,2025-03-20
2,203,75.0,paypal,3,2025-03-25
3,204,310.0,credit_card,4,2024-11-10
4,205,65.0,paypal,5,2024-12-05


## Question 1
Between April 1st and June 30th, 2025, what is the count of transactions for each payment method?
This analysis will establish the baseline distribution of how customers currently pay.

## Solution

In [3]:
# Convert transaction_date to datetime so it can be compared against date bounds
fct_transactions['transaction_date'] = pd.to_datetime(fct_transactions['transaction_date'])

# Keep transactions from April 1 to June 30, 2025 (between() includes both endpoints)
filtered_df = fct_transactions[fct_transactions['transaction_date'].between('2025-04-01', '2025-06-30')]
transaction_count = filtered_df.groupby('payment_method')['transaction_id'].count()
print("Number of transactions for each payment method:")
print(transaction_count)


Number of transactions for each payment method:
payment_method
credit_card    25
debit_card      8
paypal          7
Name: transaction_id, dtype: int64
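
The date filter above relies on `Series.between` including both endpoints, so transactions dated April 1st and June 30th are counted. A minimal sketch of that behavior follows; the `toy` frame and its rows are hypothetical and are not taken from `Day8.txt`.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from Day8.txt
toy = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4, 5],
    "payment_method": ["credit_card", "paypal", "credit_card", "debit_card", "paypal"],
    "transaction_date": pd.to_datetime(
        ["2025-03-31", "2025-04-01", "2025-05-15", "2025-06-30", "2025-07-01"]
    ),
})

# between() keeps both boundary dates by default
in_q2 = toy["transaction_date"].between("2025-04-01", "2025-06-30")

# Count the remaining transactions per payment method
q2_counts = toy.loc[in_q2].groupby("payment_method")["transaction_id"].count()
print(q2_counts)
```

Only the March 31st and July 1st rows are dropped here. If `transaction_date` carried time-of-day values, an upper bound of `'2025-06-30'` would exclude anything after midnight on that day, so a half-open comparison against `'2025-07-01'` might be safer in that case.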


## Question 2
Between April 1st and June 30th, 2025, what is the average order value for each payment method? This
metric will help us assess which payment methods are tied to higher spending levels.

## Solution

In [6]:
# Calculate average order value
avg_order = filtered_df.groupby('payment_method')['order_value'].mean()
print("Average order value for each payment method:")
print(avg_order)

Average order value for each payment method:
payment_method
credit_card    281.6
debit_card      90.0
paypal          70.0
Name: order_value, dtype: float64
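
The transaction count from Question 1 and the average from Question 2 could also be produced in a single pass with named aggregation. This is a sketch of one possible alternative, not the notebook's method; the `toy` frame, its rows, and the output column names `transactions` and `avg_order_value` are hypothetical.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from Day8.txt
toy = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4],
    "payment_method": ["credit_card", "credit_card", "paypal", "paypal"],
    "order_value": [200.0, 300.0, 60.0, 80.0],
})

# One groupby producing both metrics, one row per payment method
summary = toy.groupby("payment_method").agg(
    transactions=("transaction_id", "count"),
    avg_order_value=("order_value", "mean"),
)
print(summary)
```

Keeping both metrics in one table makes it easier to see whether a high average rests on only a handful of transactions.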


## Question 3
Between April 1st and June 30th, 2025, what would be the predicted sales lift if a 'pay over time' option
were introduced? Assume that 20% of credit card transactions during this period would switch to the
'pay over time' option, and that each switched transaction's order value would increase by 15% of the
average order value of all credit card transactions in that same period.


## Solution

In [9]:
# Calculate total sum of order value
total_sales = filtered_df['order_value'].sum()

# Filter credit card transactions
credit_card_users = filtered_df[filtered_df['payment_method'] == 'credit_card']

# Number of credit card transactions expected to switch (20% of transactions, not of unique customers)
cc_transaction_count = credit_card_users['transaction_id'].count()
switched_count = int(round(cc_transaction_count * 0.20))
avg_order_credit = credit_card_users['order_value'].mean()

# Sales lift: each switched transaction adds 15% of the average credit card order value
sales_lift = switched_count * (avg_order_credit * 0.15)
print("Predicted sales lift if a 'pay over time' option is introduced:", round(sales_lift, 2))


Predicted sales lift if a 'pay over time' option is introduced: 211.2
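
The printed figure can be checked by hand from the earlier outputs. The sketch below redoes the arithmetic in plain Python, taking the 20% share over transactions as the question words it. The two inputs are copied from the Question 1 and Question 2 outputs, and the variable names are illustrative.

```python
# Inputs copied from the outputs printed earlier in this notebook
credit_card_transactions = 25   # Question 1 count for credit_card
avg_credit_card_order = 281.6   # Question 2 average for credit_card

# Assumptions stated in Question 3
switch_rate = 0.20              # share of credit card transactions that switch
order_value_increase = 0.15     # uplift per switched transaction

# 20% of 25 transactions switch to 'pay over time'
switched = round(credit_card_transactions * switch_rate)

# Each switched transaction adds 15% of the average credit card order value
lift_per_transaction = avg_credit_card_order * order_value_increase

sales_lift = switched * lift_per_transaction
print(switched, round(lift_per_transaction, 2), round(sales_lift, 2))
```

Five switched transactions at roughly 42.24 of extra value each give 211.2, matching the result printed above.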
