## Credit Card Engagement and Share of Wallet Analysis

### Project Description

This project involved a comprehensive analysis of customer engagement with internal cards (issued by an imaginary company I work for) versus external cards, using dummy data created specifically for this purpose. The primary focus was on tracking and diagnosing Share of Wallet (SoW) trends, identifying drivers of customer disengagement, and delivering actionable insights to improve customer engagement.

**Key Objectives:**
- Analyze customer behavior and spending patterns across different card types using generated dummy data.
- Track Share of Wallet trends to understand market penetration and customer loyalty.
- Identify pockets and drivers of customer disengagement to mitigate churn.
- Provide strategic recommendations to optimize customer engagement and increase market share.

**Approach:**
The project combined transactional data analysis, statistical modeling, and machine learning techniques, all applied to the dummy data. Both internal card data (American Express cards) and external market data were simulated to provide a comprehensive view of customer preferences and behaviors.

**Outcome:**
The analysis enabled a deeper understanding of customer satisfaction, spending habits, and engagement levels across card types. Insights derived from the project informed strategic decisions aimed at improving customer retention and maximizing profitability.

This project demonstrates proficiency in data analytics, strategic thinking, and the ability to leverage insights from simulated data to address real-world challenges in the financial services industry.
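
To make the Share of Wallet metric concrete: one common definition is the fraction of a customer's total card spend that goes through the internal card. The sketch below assumes that definition and uses a tiny hand-made table. The helper name `share_of_wallet` and the column names `customer_id`, `card_type`, and `amount_spent` are illustrative assumptions, and this is not necessarily the exact calculation used in the project.

```python
import pandas as pd


def share_of_wallet(transactions: pd.DataFrame) -> pd.Series:
    """Per customer: internal-card spend divided by total card spend."""
    total = transactions.groupby("customer_id")["amount_spent"].sum()
    internal = (
        transactions[transactions["card_type"] == "Internal"]
        .groupby("customer_id")["amount_spent"]
        .sum()
        # Customers with no internal spend get 0 instead of being dropped
        .reindex(total.index, fill_value=0.0)
    )
    return (internal / total).rename("share_of_wallet")


# Hand-made illustrative data (not the project's dummy data)
example = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "card_type": ["Internal", "External", "Internal", "Internal", "External"],
    "amount_spent": [30.0, 70.0, 50.0, 50.0, 20.0],
})

sow = share_of_wallet(example)
print(sow)
```

Under this definition, a customer who spends only on the internal card has a SoW of 1, and a customer who spends only on external cards has a SoW of 0.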


In [2]:
# Approach to creating dummy data for analysis purposes

# 1) Create dummy data for customer demographics

# What is customer demographics data?

# Customer demographics data is basic information about customers' characteristics and attributes. It includes the following details:

# Customer ID: A unique identifier for each customer.
# Age: Age of the customer.
# Gender: Gender identity of the customer (e.g., male, female, non-binary).
# Income Level: Income bracket or range of the customer.
# Location: Geographical location where the customer resides (often includes city, state, or region).
# Education Level: Highest level of education completed by the customer.
# Other Demographic Variables: These could include marital status, household size, occupation, and more, depending on the specific needs of the analysis or project (not generated in the code below).


# Python code to create the dummy data described above

import pandas as pd
import numpy as np
import random

# Generate Customer Demographics Data
# Every attribute is drawn uniformly at random and independently of the others
num_customers = 5000
customer_ids = range(1, num_customers + 1)
ages = np.random.randint(18, 80, size=num_customers)  # Ages 18-79 (upper bound is exclusive)
genders = np.random.choice(['Male', 'Female', 'Other'], size=num_customers)
income_levels = np.random.choice(['High', 'Medium', 'Low'], size=num_customers)
locations = np.random.choice(
    ['New York, NY', 'Los Angeles, CA', 'Chicago, IL', 'Houston, TX', 'Miami, FL'],
    size=num_customers)
education_levels = np.random.choice(['Graduate', 'Undergraduate', 'High School'], size=num_customers)

# Create DataFrame
customer_demographics = pd.DataFrame({
    'Customer ID': customer_ids,
    'Age': ages,
    'Gender': genders,
    'Income Level': income_levels,
    'Location': locations,
    'Education Level': education_levels
})

# Display the DataFrame (pandas truncates the printed output to the first and last rows)
print(customer_demographics)

# Save to CSV
customer_demographics.to_csv('customer_demographics.csv', index=False)


      Customer ID  Age  Gender Income Level         Location Education Level
0               1   61  Female         High  Los Angeles, CA   Undergraduate
1               2   44    Male         High      Houston, TX        Graduate
2               3   43    Male          Low        Miami, FL     High School
3               4   26   Other       Medium  Los Angeles, CA        Graduate
4               5   47    Male         High  Los Angeles, CA   Undergraduate
...           ...  ...     ...          ...              ...             ...
4995         4996   71  Female       Medium  Los Angeles, CA        Graduate
4996         4997   30  Female       Medium      Chicago, IL        Graduate
4997         4998   34  Female         High     New York, NY     High School
4998         4999   58    Male         High  Los Angeles, CA     High School
4999         5000   65  Female       Medium        Miami, FL     High School

[5000 rows x 6 columns]
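
Because the generation code above does not set a random seed, the table will differ on every run. If reproducible dummy data is wanted, one option is NumPy's `Generator` API with a fixed seed. The sketch below is a variant under that assumption; the function name `make_demographics` and the seed value `42` are hypothetical choices, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def make_demographics(num_customers: int, seed: int = 42) -> pd.DataFrame:
    """Build a reproducible dummy demographics table (illustrative helper)."""
    rng = np.random.default_rng(seed)  # same seed -> same table
    return pd.DataFrame({
        "Customer ID": np.arange(1, num_customers + 1),
        # rng.integers excludes the upper bound by default, so ages are 18-79
        "Age": rng.integers(18, 80, size=num_customers),
        "Gender": rng.choice(["Male", "Female", "Other"], size=num_customers),
        "Income Level": rng.choice(["High", "Medium", "Low"], size=num_customers),
        "Location": rng.choice(
            ["New York, NY", "Los Angeles, CA", "Chicago, IL", "Houston, TX", "Miami, FL"],
            size=num_customers,
        ),
        "Education Level": rng.choice(
            ["Graduate", "Undergraduate", "High School"], size=num_customers
        ),
    })


first = make_demographics(5000)
second = make_demographics(5000)
print(first.equals(second))  # identical tables when the seed is the same
```

Passing a different `seed` gives a different but equally repeatable table, which can help when comparing analyses across runs.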


In [6]:
# 2) Python code to create transactional data

# What is transactional data?

# Transactional data records customers' spending and transactions across different merchant types. It contains the following details:

# Transaction ID: A unique identifier for each transaction.
# Customer ID: Identifier for the customer associated with the transaction.
# Transaction Date: Date and time when the transaction occurred.
# Amount Spent: The monetary value of the transaction.
# Merchant Category: Category of the merchant where the transaction took place (e.g., dining, retail, groceries).
# Reward Points Earned: Number of loyalty or reward points accrued for the transaction.
# Card Type: Whether the transaction was made on an internal or an external card.


# Python code to create the dummy data described above

# Generate Transactional Data
import pandas as pd
import numpy as np
import random
from datetime import datetime, timedelta

# Function to generate dummy transactional data
def generate_dummy_data(num_rows):
    transaction_ids = range(1, num_rows + 1)
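    # Note: only customer IDs 1-1000 receive transactions, although the
    # demographics table contains 5000 customers
    customer_ids = np.random.choice(range(1, 1001), size=num_rows)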
    
    # Generate random transaction dates within the past year
    # (the day offset is capped at 364 so that no timestamp falls after end_date)
    end_date = datetime.now()
    start_date = end_date - timedelta(days=365)
    transaction_dates = [
        start_date + timedelta(
            days=random.randint(0, 364),
            hours=random.randint(0, 23),
            minutes=random.randint(0, 59),
        )
        for _ in range(num_rows)
    ]
    
    amount_spent = np.round(np.random.uniform(5, 500, size=num_rows), 2)
    merchant_categories = np.random.choice(['Dining', 'Retail', 'Groceries', 'Entertainment', 'Travel'], size=num_rows)
    # Reward points: 10% of the amount spent, rounded to the nearest integer
    reward_points = np.round(amount_spent * 0.1).astype(int)
    
    # Generating card types randomly
    card_types = np.random.choice(['Internal', 'External'], size=num_rows)

    transactional_data = pd.DataFrame({
        'transaction_id': transaction_ids,
        'customer_id': customer_ids,
        'transaction_date': transaction_dates,
        'amount_spent': amount_spent,
        'merchant_category': merchant_categories,
        'reward_points_earned': reward_points,
        'card_type': card_types
    })
    return transactional_data

# Generating 50000 rows of dummy data
num_rows = 50000
transactional_data = generate_dummy_data(num_rows)

# Displaying the generated DataFrame
print(transactional_data.head())

# Enforce column dtypes (the columns are already datetime and float here; kept as a safeguard)
transactional_data['transaction_date'] = pd.to_datetime(transactional_data['transaction_date'])
transactional_data['amount_spent'] = transactional_data['amount_spent'].astype(float)

# Saving to CSV
transactional_data.to_csv('transactional_data.csv', index=False)

print("Data generation, cleaning, and saving completed.")




   transaction_id  customer_id           transaction_date  amount_spent  \
0               1           76 2023-10-01 06:29:38.789946        305.03   
1               2          884 2023-08-11 18:10:38.789946        486.60   
2               3          698 2024-06-13 02:56:38.789946        106.86   
3               4          161 2023-09-05 01:26:38.789946        490.99   
4               5          678 2024-04-25 10:01:38.789946        163.20   

  merchant_category  reward_points_earned card_type  
0         Groceries                    31  External  
1            Retail                    49  External  
2         Groceries                    11  External  
3            Dining                    49  External  
4            Dining                    16  External  
Data generation, cleaning, and saving completed.
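
With `transaction_date`, `amount_spent`, and `card_type` available, a Share of Wallet trend can be tracked over time. The sketch below assumes a spend-based SoW (internal spend divided by total spend) aggregated per calendar month across all customers. The helper name `monthly_share_of_wallet` is hypothetical and the sample rows are hand-made for illustration, not taken from the generated data.

```python
import pandas as pd


def monthly_share_of_wallet(transactions: pd.DataFrame) -> pd.Series:
    """Internal-card share of total spend for each calendar month."""
    month = transactions["transaction_date"].dt.to_period("M")
    spend = (
        transactions.groupby([month, "card_type"])["amount_spent"]
        .sum()
        .unstack(fill_value=0.0)  # one column per card type
    )
    # Guarantee both columns exist even if one card type is absent
    spend = spend.reindex(columns=["Internal", "External"], fill_value=0.0)
    return (spend["Internal"] / spend.sum(axis=1)).rename("share_of_wallet")


# Hand-made illustrative rows using the same column names as the table above
sample = pd.DataFrame({
    "transaction_date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-18"]
    ),
    "amount_spent": [60.0, 40.0, 25.0, 75.0],
    "card_type": ["Internal", "External", "Internal", "External"],
})

trend = monthly_share_of_wallet(sample)
print(trend)
```

A value that falls from one month to the next is one possible signal of disengagement. Because the dummy data assigns `card_type` uniformly at random, the generated table would be expected to hover near 0.5 with no real trend.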


In [4]:
# Competitor data refers to information about other companies or entities
# operating in the same industry or market as a particular organization.
# It typically includes details such as:
#
# - Competitor Name: Name of the competing company or entity.
# - Market Share (%): The percentage of the market controlled or captured by the competitor.
# - Key Features: Unique selling points or attributes of the competitor's products or services
#   (e.g., cashback offers, low interest rates, rewards programs).
# - Customer Satisfaction Rating: Ratings or feedback from customers regarding their
#   satisfaction with the competitor's products or services.
#
# Analyzing competitor data provides businesses with valuable insights into market dynamics,
# competitive landscape, customer preferences, and areas for improvement.
# This information helps organizations understand their position
# relative to competitors, identify opportunities for differentiation, and develop strategies
# to enhance their market presence and customer engagement.


# Python code to generate data for competitors
import random
import pandas as pd

# Generate Competitor Data
competitors = ['Competitor A', 'Competitor B', 'Competitor C', 'Competitor D', 'Competitor E']
# Each share is an independent random draw, so the percentages are not constrained to sum to 100
market_share = [random.randint(20, 40) for _ in range(len(competitors))]
possible_key_features = [
    'Cashback, Lower Interest Rates',
    'Better Rewards, Lower Annual Fees',
    'Points for Every Purchase',
    'Exclusive Benefits',
    'No Annual Fees'
]

key_features = [random.choice(possible_key_features) for _ in range(len(competitors))]
customer_satisfaction = [round(random.uniform(3.5, 5.0), 1) for _ in range(len(competitors))]  # Random ratings

# Create DataFrame
competitor_data = pd.DataFrame({
    'Competitor Name': competitors,
    'Market Share (%)': market_share,
    'Key Features': key_features,
    'Customer Satisfaction Rating': customer_satisfaction
})

# Display DataFrame
print(competitor_data)

# Save to CSV
competitor_data.to_csv('competitor_data.csv', index=False)



  Competitor Name  Market Share (%)                       Key Features  \
0    Competitor A                24                     No Annual Fees   
1    Competitor B                36                     No Annual Fees   
2    Competitor C                28  Better Rewards, Lower Annual Fees   
3    Competitor D                28          Points for Every Purchase   
4    Competitor E                37  Better Rewards, Lower Annual Fees   

   Customer Satisfaction Rating  
0                           4.8  
1                           4.2  
2                           4.3  
3                           4.5  
4                           4.4  
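
The market shares printed above add up to 153%, because each value is drawn independently. If shares that sum to 100% are preferred, one option is to draw random weights and normalize them. The sketch below assumes the five competitors together cover the whole market, which may not hold if the internal card also takes a share. The helper name `random_market_shares` and the seed are hypothetical.

```python
import random


def random_market_shares(n: int, seed: int = 0) -> list:
    """Draw n random market shares (in %) that sum to 100."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    weights = [rng.uniform(20, 40) for _ in range(n)]
    total = sum(weights)
    shares = [round(100 * w / total, 1) for w in weights]
    # Put any rounding drift into the last share so the total is 100
    shares[-1] = round(100 - sum(shares[:-1]), 1)
    return shares


shares = random_market_shares(5)
print(shares, sum(shares))
```

Normalizing keeps the relative sizes of the random draws while making the column interpretable as a percentage of one market.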
