# **Clustering for effective marketing strategy**

# <div class="header1">1. | Introduction 👋</div>
<center>
    <img src="https://images.unsplash.com/photo-1612351978641-ecdafe9caaa5?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1169&q=80" alt="Credit Cards" width="80%">
</center>

## <div class="header2">🤔 Dataset Problems</div>
<div class="explain-box">
    The dataset contains the <b>usage behavior of around 9000 credit card users</b> over the last six months. The task is to <mark><b>group credit card customers into several groups according to their behavior</b></mark> in order to support an <b>effective and efficient credit card marketing strategy</b>.
</div>

## <div class="header2">📌 Notebook Objectives</div>
<div class="explain-box">
    This notebook <b>aims</b> to:
    <ul>
        <li><mark><b>Perform dataset exploration</b></mark> using various types of data visualization.</li>
        <li><mark><b>Perform data preprocessing</b></mark> before using models.</li>
        <li><mark><b>Group customers into clusters</b></mark> using various clustering models.</li>
        <li><mark><b>Perform interpretation and analysis of the groups (profiling)</b></mark> that have been created.</li>
        <li><mark><b>Provide marketing suggestions</b></mark> based on profiling results and analysis conducted.</li>
    </ul>
</div>
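
The objectives above can be sketched end to end on a small synthetic table. This is only an illustration under stated assumptions: the two columns and their values are made up (they are not the real dataset), and the choice of `KNNImputer`, `StandardScaler`, and `KMeans` with two clusters is a guess based on the libraries imported later in this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in for the credit card data (hypothetical values)
rng = np.random.default_rng(0)
low = rng.normal(loc=[500, 200], scale=[100, 50], size=(50, 2))
high = rng.normal(loc=[5000, 3000], scale=[500, 300], size=(50, 2))
df = pd.DataFrame(np.vstack([low, high]), columns=["BALANCE", "PURCHASES"])
df.loc[[3, 70], "PURCHASES"] = np.nan  # simulate two missing values

# 1. Preprocessing: fill missing values, then standardize each column
imputed = KNNImputer(n_neighbors=5).fit_transform(df)
scaled = StandardScaler().fit_transform(imputed)

# 2. Clustering: assign every customer to one of two groups
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)

# 3. Profiling: per-cluster means on the original (unscaled) values
profile = (
    pd.DataFrame(imputed, columns=df.columns)
    .assign(cluster=labels)
    .groupby("cluster")
    .mean()
)
print(profile)
```

The profiling table is computed on the imputed but unscaled values so that cluster averages stay in interpretable units, which is what a marketing suggestion would be based on.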

## <div class="header2">👨‍💻 Clustering Models</div>
<div class="explain-box">
    The <b>clustering models</b> used in this notebook are:
    <ol>
        <li><b>Partition based (K-Means)</b>,</li>
        <li><b>Density based (DBSCAN)</b>, and</li>
        <li><b>Hierarchical Clustering (Agglomerative)</b>.</li>
    </ol>
</div>
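
As a rough sketch of how the three model families listed above differ in use, the snippet below fits each one to the same synthetic blobs. The data from `make_blobs`, the parameter values (`n_clusters=3`, `eps=0.3`, `min_samples=5`), and the use of the silhouette score for comparison are all assumptions made for illustration, not the settings used later in this notebook.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.metrics import silhouette_score

# Synthetic, well-separated data (hypothetical stand-in for customer features)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "Agglomerative": AgglomerativeClustering(n_clusters=3, linkage="ward"),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # DBSCAN labels noise points as -1, so exclude that label from the count
    n_clusters = len(set(labels) - {-1})
    # The silhouette score needs at least two distinct labels
    score = silhouette_score(X, labels) if len(set(labels)) > 1 else float("nan")
    results[name] = (n_clusters, score)
    print(f"{name:<14} clusters={n_clusters} silhouette={score:.3f}")
```

K-Means and agglomerative clustering take the number of clusters as an input, while DBSCAN infers it from the density parameters and may leave some points unassigned as noise.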

## <div class="header2">🧾 Dataset Description</div>
<div class="explain-box">
    The following is the <b>structure of the dataset</b>.<br>
    
<table style="font-family: Merriweather; font-weight: 300; font-size: 12px; text-align: left; padding: 8px; border-collapse: collapse; width: 100%;">
  <thead>
    <tr>
      <th style="font-family: Merriweather; font-weight: 900; text-align: center; font-size: 14px; background-color: #FFCC00">Variable Name</th>
      <th style="font-family: Merriweather; font-weight: 900; text-align: center; font-size: 14px; background-color: #FFCC00">Description</th>
      <th style="font-family: Merriweather; font-weight: 900; text-align: center; font-size: 14px; background-color: #FFCC00">Sample Data</th>
    </tr>
  </thead>
  <tbody>
      <tr>
          <td>CUST_ID</td>
          <td>Credit card holder ID</td>
          <td>C10001; C10002; ...</td>
      </tr>
      <tr>
          <td>BALANCE</td>
          <td>Remaining account balance available for purchases</td>
          <td>40.900749; 3202.467416; ...</td>
      </tr>
      <tr>
          <td>BALANCE_FREQUENCY</td>
          <td>Balance update frequency (between 0 and 1)<br><br>1 = frequently updated<br>0 = not frequently updated</td>
          <td>0.818182; 0.909091; ...</td>
      </tr>
      <tr>
          <td>PURCHASES</td>
          <td>Total amount of purchases made from the account</td>
          <td>95.4; 773.17; ...</td>
      </tr>
      <tr>
          <td>ONEOFF_PURCHASES</td>
          <td>Maximum purchase amount in a single transaction</td>
          <td>1499; 16; ...</td>
      </tr>
      <tr>
          <td>INSTALLMENTS_PURCHASES</td>
          <td>Amount of purchases made in installments</td>
          <td>95.4; 1333.28; ...</td>
      </tr>
      <tr>
          <td>CASH_ADVANCE</td>
          <td>Cash advance amount taken by the user</td>
          <td>6442.945483; 205.788017; ...</td>
      </tr>
      <tr>
          <td>PURCHASES_FREQUENCY</td>
          <td>Frequency of purchases made on a regular basis (between 0 and 1)<br><br>1 = frequently purchased<br>0 = not frequently purchased</td>
          <td>0.166667; 0.083333; ...</td>
      </tr>
      <tr>
          <td>ONEOFF_PURCHASES_FREQUENCY</td>
          <td>Frequency of purchases made in single transaction (between 0 and 1)<br><br>1 = frequently purchased<br>0 = not frequently purchased</td>
          <td>0.083333; 0.083333; ...</td>
      </tr>
      <tr>
          <td>PURCHASES_INSTALLMENTS_FREQUENCY</td>
          <td>Frequency of purchases made in installments (between 0 and 1)<br><br>1 = frequently done<br>0 = not frequently done</td>
          <td>0.083333; 0.583333; ...</td>
      </tr>
      <tr>
          <td>CASH_ADVANCE_FREQUENCY</td>
          <td>Frequency of cash advances</td>
          <td>0.25; 0.083333; ...</td>
      </tr>
      <tr>
          <td>CASH_ADVANCE_TRX</td>
          <td>Total number of cash advance transactions</td>
          <td>0; 4; ...</td>
      </tr>
      <tr>
          <td>PURCHASES_TRX</td>
          <td>Total number of purchase transactions</td>
          <td>2; 12; ...</td>
      </tr>
      <tr>
          <td>CREDIT_LIMIT</td>
          <td>Credit card limit of a user</td>
          <td>1000; 7000; ...</td>
      </tr>
      <tr>
          <td>PAYMENTS</td>
          <td>Total amount paid by the user</td>
          <td>201.802084; 4103.032597; ...</td>
      </tr>
      <tr>
          <td>MINIMUM_PAYMENTS</td>
          <td>Minimum payment amount made by the user</td>
          <td>139.509787; 1072.340217; ...</td>
      </tr>
      <tr>
          <td>PRC_FULL_PAYMENT</td>
          <td>Percentage of the total charge paid by the user</td>
          <td>0; 0.222222; ...</td>
      </tr>
      <tr>
          <td>TENURE</td>
          <td>Credit card tenure of a user</td>
          <td>12; 8; ...</td>
      </tr>
  </tbody>
</table>
</div>
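
A quick sanity check of the structure described in the table can be sketched as follows. The rows are built from the sample values listed above, paired per column purely for illustration (they are not guaranteed to belong to the same customers), and the variable names `sample`, `ratio_cols`, and `features` are hypothetical.

```python
import pandas as pd

# Illustrative rows assembled from the sample values in the table above
sample = pd.DataFrame({
    "CUST_ID": ["C10001", "C10002"],
    "BALANCE": [40.900749, 3202.467416],
    "BALANCE_FREQUENCY": [0.818182, 0.909091],
    "PURCHASES_FREQUENCY": [0.166667, 0.083333],
    "ONEOFF_PURCHASES_FREQUENCY": [0.083333, 0.083333],
    "PURCHASES_INSTALLMENTS_FREQUENCY": [0.083333, 0.583333],
    "TENURE": [12, 8],
})

# Columns that the table documents as lying between 0 and 1
ratio_cols = [
    "BALANCE_FREQUENCY",
    "PURCHASES_FREQUENCY",
    "ONEOFF_PURCHASES_FREQUENCY",
    "PURCHASES_INSTALLMENTS_FREQUENCY",
]

# Check the documented range column by column
in_range = sample[ratio_cols].apply(lambda s: s.between(0, 1).all())
print(in_range)

# CUST_ID is an identifier, not a behavior feature, so drop it before modeling
features = sample.drop(columns="CUST_ID")
print(features.dtypes)
```

Running the same range check on the full dataset would flag any frequency value that falls outside its documented bounds before clustering.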

# <div class="header1">2. | Preparing Notebooks 📚</div>
<div class="explain-box">
    This section <b>installs and imports the libraries and prepares the notebook settings and color palettes</b> that will be used in this notebook.
</div>

In [None]:
!pip install --upgrade pandas
!pip uninstall -y pandas-profiling
!pip install pandas-profiling

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pandas
  Downloading pandas-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.5.3
    Uninstalling pandas-1.5.3:
      Successfully uninstalled pandas-1.5.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas~=1.5.3, but you have pandas 2.0.1 which is incompatible.
Successfully installed pandas-2.0.1
Found existing installation: pandas-profiling 3.2.0
Uninstalling pandas-profiling-3.2.0:
  Successfully uninstalled pandas-profiling-3.2.0
Looking in index

In [None]:
# --- Installing Libraries ---
!pip install pandas-profiling==3.2.0  # pinned version (pip treats pandas_profiling as the same package)
!pip install pywaffle

# --- Importing Libraries ---
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import os
import yellowbrick
import scipy.cluster.hierarchy as shc
import matplotlib.patches as patches

from matplotlib.patches import Rectangle
from pywaffle import Waffle
from math import isnan
from random import sample
from numpy.random import uniform
from sklearn.neighbors import NearestNeighbors
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.metrics import davies_bouldin_score, silhouette_score, calinski_harabasz_score
from yellowbrick.cluster import KElbowVisualizer, SilhouetteVisualizer
from yellowbrick.style import set_palette
from yellowbrick.contrib.wrapper import wrap

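# --- Libraries Settings ---
warnings.filterwarnings('ignore')  # silence library warnings in cell outputs
plt.rcParams['figure.dpi'] = 600   # render figures at high resolution
# Set the whitegrid style and background colors in a single call, since
# sns.set() without a style argument falls back to the default darkgrid style
sns.set(style='whitegrid', rc={'axes.facecolor': '#FBFBFB', 'figure.facecolor': '#FBFBFB'})

# --- ANSI Escape Codes for Colored Console Output ---
class clr:
    start = '\033[93m' + '\033[1m'  # bright yellow + bold
    color = '\033[93m'              # bright yellow
    end = '\033[0m'                 # reset formatting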

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
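
The `clr` class defined in the settings cell only stores ANSI escape sequences. A minimal sketch of how it might be used to highlight printed results is shown below; the `highlight` helper and the printed label and value are hypothetical and do not come from this notebook.

```python
class clr:
    start = '\033[93m' + '\033[1m'  # bright yellow + bold
    color = '\033[93m'              # bright yellow
    end = '\033[0m'                 # reset formatting


def highlight(label, value):
    """Return a bold yellow label followed by a yellow value (hypothetical helper)."""
    return f"{clr.start}{label}:{clr.end} {clr.color}{value}{clr.end}"


# Prints a colored line in terminals and notebook outputs that support ANSI codes
print(highlight("Clusters", 3))
```

Ending every colored segment with `clr.end` keeps the styling from leaking into later output.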
