# Credit Card Clustering
Credit card clustering means grouping credit card holders based on their buying habits, credit limits, and other financial factors. It is also known as credit card segmentation. Such clustering analysis helps businesses identify potential customers and shape their marketing strategies.

In [None]:
#import libraries
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from sklearn import cluster
import warnings
warnings.filterwarnings('ignore')

In [None]:
#load the dataset
cc_data = pd.read_csv('/kaggle/input/credit-card-clustering/CC GENERAL.csv')

In [None]:
print(cc_data.head(10))

In [None]:
cc_data.info()

In [None]:
#checking for null values
cc_data.isnull().sum()

In [None]:
#Dropping the rows with null values: there are only a few of them, so removing them will not affect the model
cc_data = cc_data.dropna()

In [None]:
cc_data.isnull().sum()
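
As a small illustration of what the null check and `dropna()` cells above do, the sketch below runs the same two calls on a made-up frame. The column names mirror the dataset, but the values (and where the missing entries sit) are purely illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame for illustration only; values are not taken from the real dataset.
toy = pd.DataFrame({
    'BALANCE': [40.9, 3202.5, np.nan, 1666.7],
    'CREDIT_LIMIT': [1000.0, np.nan, 7500.0, 7500.0],
})

# Count missing entries per column, as in the cell above.
null_counts = toy.isnull().sum()
print(null_counts)

# dropna() removes every row that has at least one missing value.
cleaned = toy.dropna()
print(len(toy), 'rows before,', len(cleaned), 'rows after')
```

Comparing the row counts before and after is a quick way to confirm that only a small share of the data was removed.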

There are three features in the dataset which are very valuable for the task of credit card segmentation:

BALANCE: The balance left in the accounts of credit card customers.
PURCHASES: Amount of purchases made from the accounts of credit card customers.
CREDIT_LIMIT: The limit of the credit card.

In [None]:
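clustering_data = cc_data[['BALANCE','PURCHASES','CREDIT_LIMIT']]
from sklearn.preprocessing import MinMaxScaler
#scale each feature to the [0, 1] range so that no single column dominates the distance computation
scaler = MinMaxScaler()
clustering_data = pd.DataFrame(scaler.fit_transform(clustering_data),
                               columns=clustering_data.columns,
                               index=clustering_data.index)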
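
Min-max scaling maps each column to the range [0, 1] using (x - min) / (max - min), so a feature with large values such as CREDIT_LIMIT does not dominate the distances that K-Means relies on. The sketch below applies that formula by hand with pandas to a small made-up frame (the values are illustrative, not taken from the dataset); `MinMaxScaler` with its default settings computes the same column-wise transformation.

```python
import pandas as pd

# Toy values for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    'BALANCE': [0.0, 500.0, 2000.0],
    'PURCHASES': [100.0, 300.0, 900.0],
    'CREDIT_LIMIT': [1000.0, 5000.0, 9000.0],
})

# Column-wise min-max scaling: (x - min) / (max - min).
scaled = (toy - toy.min()) / (toy.max() - toy.min())
print(scaled)
```

After scaling, every column has a minimum of 0 and a maximum of 1, so the three features contribute on a comparable scale.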

In [None]:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters= 5)
clusters = kmeans.fit_predict(clustering_data)
cc_data['Credit_Card_Segment'] = clusters

We have now created a new column, Credit_Card_Segment, with values 0 to 4. For easier readability, these numeric values will be mapped to more descriptive labels.
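
To make the label range concrete, the sketch below fits K-Means on a small synthetic array (three well-separated groups of points, made up for illustration) and inspects what `fit_predict` returns: one integer label per row, ranging from 0 to `n_clusters - 1`. The `random_state` and `n_init` arguments are set here only to make the run repeatable; the label numbers themselves are arbitrary and carry no ordering.

```python
import numpy as np
from sklearn.cluster import KMeans

# Three tight, well-separated groups of 2-D points (synthetic, illustrative).
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [10.0, 0.0], [10.1, 0.0], [10.0, 0.1],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

# One label per input row; the distinct label ids run from 0 to n_clusters - 1.
print(labels)
print(sorted(set(labels.tolist())))
```

With five clusters, as in the notebook, the same call yields labels 0 through 4.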

In [None]:
cc_data['Credit_Card_Segment'] = cc_data['Credit_Card_Segment'].map({0:'Cluster 1',
                                                                     1:'Cluster 2',
                                                                     2:'Cluster 3',
                                                                     3:'Cluster 4',
                                                                     4:'Cluster 5'})
print(cc_data['Credit_Card_Segment'].head(10))

In [None]:
cc_data['Credit_Card_Segment'].value_counts()

In [None]:
list(cc_data['Credit_Card_Segment'].unique())
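
One possible way to make the segments more relatable than "Cluster 1" to "Cluster 5" is to look at the average of each feature per segment. The sketch below does this with `groupby` on a small made-up frame that mimics the columns used above; the values and the segment assignments are hypothetical.

```python
import pandas as pd

# Hypothetical rows that mimic the notebook's columns; not real results.
toy = pd.DataFrame({
    'BALANCE': [100.0, 300.0, 4000.0, 6000.0],
    'PURCHASES': [50.0, 150.0, 2000.0, 1000.0],
    'CREDIT_LIMIT': [1000.0, 1000.0, 9000.0, 11000.0],
    'Credit_Card_Segment': ['Cluster 1', 'Cluster 1', 'Cluster 2', 'Cluster 2'],
})

# Mean of each feature within each segment.
features = ['BALANCE', 'PURCHASES', 'CREDIT_LIMIT']
profile = toy.groupby('Credit_Card_Segment')[features].mean()
print(profile)
```

Applied to `cc_data`, the same `groupby` call would give one row per segment, which can help when deciding what each cluster represents.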

Now let’s visualize the credit card clusters we found from our cluster analysis:

In [None]:
plot= go.Figure()
for i in list(cc_data['Credit_Card_Segment'].unique()):
    plot.add_trace(go.Scatter3d(x= cc_data[cc_data['Credit_Card_Segment']==i]['BALANCE'],
                                y= cc_data[cc_data['Credit_Card_Segment']==i]['PURCHASES'],
                                z= cc_data[cc_data['Credit_Card_Segment']==i]['CREDIT_LIMIT'],
                                mode = 'markers',marker_size=6,marker_line_width = 1, name= str(i)))

plot.update_traces(hovertemplate = 'BALANCE: %{x} <br>PURCHASES: %{y} <br>CREDIT_LIMIT: %{z}')

plot.update_layout(width = 800,height=800,autosize=True,showlegend = True,
                   scene = dict(xaxis=dict(title=dict(text='BALANCE', font=dict(color='black'))),
                                yaxis=dict(title=dict(text='PURCHASES', font=dict(color='black'))),
                                zaxis=dict(title=dict(text='CREDIT_LIMIT', font=dict(color='black')))),
                   font = dict(family = 'Gilroy',color= 'black', size=12))

In summary, credit card cluster analysis groups credit card holders based on their buying habits, credit limits, and other financial factors. Such analysis helps businesses identify potential customers and shape their marketing strategies.