# Clustering Analysis - Visualization of Clustering Results

This notebook generates visualizations to help interpret the clustering results from the customer segmentation analysis. They show how the clusters are distributed and how their characteristics differ.


## Step 1: Importing Libraries and Loading Configuration

In this step, we import the required libraries and load the configuration file. The configuration provides the paths to the clustered datasets that we'll use for visualization.


In [3]:
import os
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans

# Resolve the project root, assumed to be two levels above the notebook's working directory
project_root = os.path.dirname(os.path.dirname(os.getcwd()))

# Load the configuration from the project root
config_path = os.path.join(project_root, 'config.json')
with open(config_path, 'r') as f:
    config = json.load(f)

# Utility function to convert a path relative to start_path into an absolute path
def to_absolute_path(relative_path, start_path):
    return os.path.abspath(os.path.join(start_path, relative_path))

# Resolve the clustered dataset paths listed in the config
min_max_clusters_path = to_absolute_path(config['min-max_scaled_4_clusters_path'], project_root)
standard_clusters_path = to_absolute_path(config['standard_scaled_4_clusters_path'], project_root)
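The path handling in the cell above could also be written with `pathlib`, checking the config keys up front so that a typo in a key name fails with a clear message. This is a minimal sketch, not the notebook's actual method: the helper name `resolve_config_paths` is hypothetical, and the relative paths in `example_config` are placeholders rather than the project's real values.

```python
from pathlib import Path


def resolve_config_paths(config, keys, project_root):
    """Return {key: absolute Path} for each key, failing early on missing keys."""
    missing = [key for key in keys if key not in config]
    if missing:
        raise KeyError(f"Missing config keys: {missing}")
    root = Path(project_root)
    # Join each relative path onto the project root and normalize it
    return {key: (root / config[key]).resolve() for key in keys}


# Illustrative config only; these relative paths are placeholders
example_config = {
    "min-max_scaled_4_clusters_path": "data/min_max.csv",
    "standard_scaled_4_clusters_path": "data/standard.csv",
}

paths = resolve_config_paths(example_config, list(example_config), Path.cwd())
for key, path in paths.items():
    print(f"{key} -> {path}")
```

In the notebook itself, the same call would take the loaded `config` and `project_root` in place of the example values.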


## Step 2: Loading the Clustered Datasets

Next, we'll load the clustered datasets for both the Min-Max scaled and the Standard scaled data. Each dataset contains the cluster assignments from our previous K-means clustering analysis and serves as the input for the visualizations that follow.


In [5]:
# Load the datasets
df_min_max_clusters = pd.read_csv(min_max_clusters_path)
print(f"Min-Max scaled clusters data loaded successfully from {min_max_clusters_path}")

df_standard_clusters = pd.read_csv(standard_clusters_path)
print(f"Standard scaled clusters data loaded successfully from {standard_clusters_path}")


FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\kusha\\OneDrive\\Documents\\Customer-Churn-Analysis-main\\Clustering_Analysis\\kmeans_model\\min-max_scaled_4_clusters.csv'
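The `FileNotFoundError` above indicates that the CSV was not present at the resolved path, for example because the earlier K-means step had not yet written its output or the notebook was run from a different working directory. One way to surface this sooner is a pre-flight check that names every missing input before any loading starts. This is a minimal sketch under that assumption; the helper names `find_missing_files` and `require_files` are hypothetical, and the demo uses a throwaway temporary directory rather than the project's files.

```python
import tempfile
from pathlib import Path


def find_missing_files(named_paths):
    """Return the labels whose paths do not point to an existing file."""
    return [label for label, path in named_paths.items() if not Path(path).is_file()]


def require_files(named_paths):
    """Raise one FileNotFoundError that lists every missing input."""
    missing = find_missing_files(named_paths)
    if missing:
        details = ", ".join(f"{label} ({named_paths[label]})" for label in missing)
        raise FileNotFoundError(f"Expected input files not found: {details}")


# Demo in a temporary directory: one file exists, the other does not
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "min_max.csv"
    present.write_text("cluster\n0\n")
    absent = Path(tmp) / "standard.csv"
    missing = find_missing_files({"min-max": present, "standard": absent})

print(missing)  # ['standard']
```

In the notebook, calling `require_files({"Min-Max": min_max_clusters_path, "Standard": standard_clusters_path})` before the `pd.read_csv` calls would report both paths at once if neither file exists.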