# CODTECH INTERNSHIP TASKS: DATA ANALYSIS



## INTERNSHIP TASK 3: DASHBOARD DEVELOPMENT



## Customer Segmentation Analysis 


### Dataset Link: https://www.kaggle.com/code/analystoleksandra/marketing-analytics-customer-segmentation?scriptVersionId=148503162&cellId=6



### Step 1: Import Required Libraries And Load The Dataset


In [1]:
pip install dash dash-bootstrap-components plotly pandas

Note: you may need to restart the kernel to use updated packages.


In [3]:
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd
import numpy as np

# Load the dataset
df = pd.read_csv('ifood_df.csv')

In [4]:
df

Unnamed: 0,Income,Kidhome,Teenhome,Recency,MntWines,MntFruits,MntMeatProducts,MntFishProducts,MntSweetProducts,MntGoldProds,...,marital_Together,marital_Widow,education_2n Cycle,education_Basic,education_Graduation,education_Master,education_PhD,MntTotal,MntRegularProds,AcceptedCmpOverall
0,58138.0,0,0,58,635,88,546,172,88,88,...,0,0,0,0,1,0,0,1529,1441,0
1,46344.0,1,1,38,11,1,6,2,1,6,...,0,0,0,0,1,0,0,21,15,0
2,71613.0,0,0,26,426,49,127,111,21,42,...,1,0,0,0,1,0,0,734,692,0
3,26646.0,1,0,26,11,4,20,10,3,5,...,1,0,0,0,1,0,0,48,43,0
4,58293.0,1,0,94,173,43,118,46,27,15,...,0,0,0,0,0,0,1,407,392,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2200,61223.0,0,1,46,709,43,182,42,118,247,...,0,0,0,0,1,0,0,1094,847,0
2201,64014.0,2,1,56,406,0,30,0,0,8,...,1,0,0,0,0,0,1,436,428,1
2202,56981.0,0,0,91,908,48,217,32,12,24,...,0,0,0,0,1,0,0,1217,1193,1
2203,69245.0,0,1,8,428,30,214,80,30,61,...,1,0,0,0,0,1,0,782,721,0
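
The callback defined later in this notebook reads a fixed set of columns, so a quick check right after loading can catch a mismatched CSV early. The following is a minimal sketch, not part of the original notebook: `missing_columns` is a hypothetical helper, the column list is copied from the callback code, and the toy DataFrame (with made-up values) only stands in for `ifood_df.csv`.

```python
import pandas as pd

# Columns the dashboard callback in this notebook reads from the DataFrame
REQUIRED_COLUMNS = [
    'Income', 'Age',
    'MntWines', 'MntFruits', 'MntMeatProducts',
    'MntFishProducts', 'MntSweetProducts', 'MntGoldProds',
    'NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases',
    'marital_Divorced', 'marital_Married', 'marital_Single',
    'marital_Together', 'marital_Widow',
    'education_2n Cycle', 'education_Basic', 'education_Graduation',
    'education_Master', 'education_PhD',
]

def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `frame` (hypothetical helper)."""
    return [col for col in required if col not in frame.columns]

# Toy frame standing in for ifood_df.csv; it deliberately lacks most columns
toy = pd.DataFrame({'Income': [58138.0, 46344.0], 'Age': [63, 66]})
missing = missing_columns(toy)
print(f"{len(missing)} required columns missing, first: {missing[0]}")
```

In the notebook itself, calling `missing_columns(df)` after `pd.read_csv` and seeing an empty list would suggest the file matches what the dashboard expects.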


### Step 2: Initialize the Dash App

#### Initialize the Dash app and set up the layout.

In [5]:
# Initialize the Dash app
app = dash.Dash(__name__)

In [6]:
# Define the layout
app.layout = html.Div([
    html.H1("Customer Segmentation Analysis Dashboard", style={'textAlign': 'center'}),
    
    # Dropdown for selecting the type of visualization
    dcc.Dropdown(
        id='visualization-dropdown',
        options=[
            {'label': 'Income Distribution', 'value': 'income'},
            {'label': 'Total Spending by Category', 'value': 'spending'},
            {'label': 'Purchase Channels', 'value': 'purchase_channels'},
            {'label': 'Customer Age Distribution', 'value': 'age'},
            {'label': 'Marital Status Distribution', 'value': 'marital_status'},
            {'label': 'Education Level Distribution', 'value': 'education'}
        ],
        value='income',  # Default value
        style={'width': '50%', 'margin': 'auto'}
    ),
    
    # Graph component to display the visualizations
    dcc.Graph(id='visualization-graph')
])

### Step 3: Define Callbacks

#### Define the callback function to update the graph based on the selected visualization.

In [7]:
# Define the callback
@app.callback(
    Output('visualization-graph', 'figure'),
    [Input('visualization-dropdown', 'value')]
)
def update_graph(selected_visualization):
    if selected_visualization == 'income':
        # Histogram for Income Distribution
        fig = px.histogram(df, x='Income', nbins=30, title='Income Distribution')
        fig.update_layout(xaxis_title='Income', yaxis_title='Count')
    
    elif selected_visualization == 'spending':
        # Bar chart for Total Spending by Category
        spending_columns = ['MntWines', 'MntFruits', 'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts', 'MntGoldProds']
        total_spending = df[spending_columns].sum()
        fig = px.bar(x=spending_columns, y=total_spending, title='Total Spending by Category')
        fig.update_layout(xaxis_title='Category', yaxis_title='Total Spending')
    
    elif selected_visualization == 'purchase_channels':
        # Bar chart for Purchase Channels
        purchase_channels = ['NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases']
        total_purchases = df[purchase_channels].sum()
        fig = px.bar(x=purchase_channels, y=total_purchases, title='Purchase Channels')
        fig.update_layout(xaxis_title='Channel', yaxis_title='Total Purchases')
    
    elif selected_visualization == 'age':
        # Histogram for Customer Age Distribution
        fig = px.histogram(df, x='Age', nbins=30, title='Customer Age Distribution')
        fig.update_layout(xaxis_title='Age', yaxis_title='Count')
    
    elif selected_visualization == 'marital_status':
        # Bar chart for Marital Status Distribution
        marital_status = df[['marital_Divorced', 'marital_Married', 'marital_Single', 'marital_Together', 'marital_Widow']].sum()
        fig = px.bar(x=marital_status.index, y=marital_status.values, title='Marital Status Distribution')
        fig.update_layout(xaxis_title='Marital Status', yaxis_title='Count')
    
    elif selected_visualization == 'education':
        # Bar chart for Education Level Distribution
        education_levels = df[['education_2n Cycle', 'education_Basic', 'education_Graduation', 'education_Master', 'education_PhD']].sum()
        fig = px.bar(x=education_levels.index, y=education_levels.values, title='Education Level Distribution')
        fig.update_layout(xaxis_title='Education Level', yaxis_title='Count')
    
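    else:
        # Dropdown was cleared (value is None) or the option is unknown: keep the current figure
        return dash.no_update

    return fig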
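
The marital status and education charts use the raw one-hot column names (for example `marital_Married`) as axis labels. One possible refinement, sketched below under assumptions, sums the one-hot columns and strips the shared prefix so the labels read as plain categories. `one_hot_counts` is a hypothetical helper that is not in the original notebook, and the toy DataFrame is illustrative only.

```python
import pandas as pd

def one_hot_counts(frame, prefix):
    """Sum the one-hot columns starting with `prefix` and strip the prefix from the labels."""
    cols = [c for c in frame.columns if c.startswith(prefix)]
    counts = frame[cols].sum()
    # Replace 'marital_Married' with 'Married', and so on
    counts.index = [c[len(prefix):] for c in cols]
    return counts

# Toy data standing in for the real dataset
toy = pd.DataFrame({
    'marital_Married': [1, 0, 1],
    'marital_Single': [0, 1, 0],
    'Income': [100, 200, 300],
})
counts = one_hot_counts(toy, 'marital_')
print(counts)
```

The resulting Series can be passed to `px.bar` in the same way as in the callback, using `counts.index` for `x` and `counts.values` for `y`.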

### Step 4: Run the App

#### Finally, run the app.

In [8]:
# Run the app
if __name__ == '__main__':
    # app.run replaces the older app.run_server, which recent Dash releases no longer support
    app.run(debug=True, port=8051)


### Step 5: Insights and Conclusions


#### 1. Income Distribution: Most customers fall in the 30k-45k income range, with a significant drop in higher income brackets (e.g., 100k-105k).
#### Conclusion: Focus on middle-income customers (30k-45k) for targeted marketing, while offering premium products to high-income groups (70k+).


#### 2. Total Spending by Category: Wines dominate spending (675k), followed by Meat Products (364k). Fruits and Sweets have the lowest spending.
#### Conclusion: Prioritize promoting wines and meat products. Introduce discounts or bundles for fruits and sweets to boost sales.


#### 3. Purchase Channels: Store Purchases (12.8k) are the most popular, followed by Web Purchases (9k) and Catalog Purchases (5.8k).
#### Conclusion: Enhance in-store experiences and promotions. Improve the online shopping platform to increase web purchases.


#### 4. Customer Age Distribution: The largest age bin is 50-51 (145 customers), followed by 60-61 (99) and 40-41 (92).
#### Conclusion: Tailor marketing campaigns to middle-aged and senior customers (40-60 years), as they form the largest customer base.


#### 5. Marital Status: Married customers (854) dominate, followed by those Together (568) and Single (477).
#### Conclusion: Create family-oriented campaigns for married customers and personalized offers for singles.


#### 6. Education Level: Most customers are Graduates (1113), followed by PhD holders (476) and Master’s degree holders (364).
#### Conclusion: Offer specialized or premium products to highly educated customers (Graduates, PhDs) and value-for-money deals to others.
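
The raw totals in the insights are easier to compare as shares. As a rough illustration, the sketch below converts the rounded purchase-channel totals quoted in insight 3 into percentages. Because it starts from rounded figures, the shares are approximate; the variable names are illustrative and not from the original notebook.

```python
# Rounded totals quoted in insight 3 (in thousands of purchases)
channel_totals = {'Store': 12.8, 'Web': 9.0, 'Catalog': 5.8}

grand_total = sum(channel_totals.values())

# Share of each channel as a percentage of all purchases, rounded to one decimal
shares = {
    name: round(100 * value / grand_total, 1)
    for name, value in channel_totals.items()
}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

On these figures, store purchases account for a little under half of all purchases (about 46%), with web at roughly a third and catalog at about a fifth.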

### Step 6: Conclusion


#### The e-commerce company should:
- Target middle-income, middle-aged, married, and educated customers, as they form the largest customer base.
- Focus on high-spending categories such as wines and meat products.
- Optimize purchase channels, especially in-store and web platforms.
- Personalize marketing campaigns based on income, age, marital status, and education level to improve customer engagement and satisfaction.