# Retail Customer Categorization - Data Exploration

This notebook explores the raw customer data and prepares it for customer categorization with fuzzy clustering.

## Objectives
1. Load and inspect raw customer data
2. Perform exploratory data analysis (EDA)
3. Identify features relevant for customer categorization
4. Visualize data distributions and relationships

## Setup

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

# Set visualization style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

# Configure pandas display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)

print("Libraries imported successfully!")

## 1. Load Data

Load your customer data from the `data/raw/` directory.
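
As a minimal sketch of what the loading cell below might contain, the helper here wraps `pd.read_csv` and fails early with a readable message when the file is missing. The function name `load_customers` is hypothetical, and `customers.csv` is the placeholder file name from the template, not a file this notebook is known to ship with.

```python
from pathlib import Path

import pandas as pd


def load_customers(csv_path):
    """Read a customer CSV, raising a clear error if the file is missing."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        # resolve() shows the full path, which helps when the notebook's
        # working directory is not what you expected.
        raise FileNotFoundError(f"Expected customer data at {csv_path.resolve()}")
    return pd.read_csv(csv_path)


# Hypothetical usage; adjust the file name to match your data:
# df = load_customers(Path('../data/raw') / 'customers.csv')
# df.head()
```

Checking for the file before reading it makes a wrong relative path obvious, which is a common stumbling block when a notebook is launched from a different directory.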

In [None]:
# Define paths
DATA_DIR = Path('../data/raw')

# TODO: Load your customer data
# Example:
# df = pd.read_csv(DATA_DIR / 'customers.csv')
# df.head()

print("Data directory:", DATA_DIR.resolve())

## 2. Initial Data Inspection

Examine the structure and basic statistics of the dataset.
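
One possible way to condense the inspection into a single table is sketched here: one row per column with its dtype, missing-value count and percentage, and number of distinct values. The helper name `summarize_columns` and the columns of the demo frame are assumptions made for illustration; they are not taken from the real dataset.

```python
import numpy as np
import pandas as pd


def summarize_columns(df):
    """Return one row per column: dtype, missing values, and cardinality."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(1),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Tiny synthetic frame with hypothetical columns, for illustration only.
demo = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "annual_spend": [120.0, np.nan, 560.5, 89.9],
    "segment_hint": ["a", "b", None, "a"],
})
summary = summarize_columns(demo)
print(summary)
```

Columns with a high `pct_missing` or an `n_unique` of 1 are usually the first candidates to drop or impute before clustering.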

In [None]:
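# TODO: Inspect your data
# Note: a notebook cell only auto-displays its last expression,
# so run these one per cell or wrap them in display(...).
# df.info()          # column dtypes and non-null counts
# df.describe()      # summary statistics for numeric columns
# df.isnull().sum()  # missing values per column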

## 3. Exploratory Data Analysis

Visualize distributions and relationships in the data.
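
The sketch below shows one way to plot a histogram for every numeric column in a grid, rather than naming each feature by hand. The function name `plot_numeric_distributions` and the synthetic columns (`age`, `annual_spend`, `visits_per_month`) are assumptions for illustration; substitute your own DataFrame.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_numeric_distributions(df, ncols=2, bins=20):
    """Draw one histogram per numeric column and return the figure."""
    numeric_cols = list(df.select_dtypes(include="number").columns)
    if not numeric_cols:
        raise ValueError("DataFrame has no numeric columns to plot")
    nrows = math.ceil(len(numeric_cols) / ncols)
    # squeeze=False keeps `axes` two-dimensional even for a single row.
    fig, axes = plt.subplots(nrows, ncols,
                             figsize=(6 * ncols, 4 * nrows), squeeze=False)
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, numeric_cols):
        df[col].plot.hist(bins=bins, ax=ax)
        ax.set_title(col)
    # Hide panels left over when the grid has more slots than columns.
    for ax in flat_axes[len(numeric_cols):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig


# Synthetic data with hypothetical columns, for illustration only.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(18, 80, size=200),
    "annual_spend": rng.gamma(2.0, 500.0, size=200),
    "visits_per_month": rng.poisson(3, size=200),
})
fig = plot_numeric_distributions(demo)
```

Strongly right-skewed histograms (spend data often looks like this) are a hint that a log transform may help before clustering.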

In [None]:
# TODO: Create visualizations
# Example: Distribution plots
# fig, axes = plt.subplots(2, 2, figsize=(15, 10))
# df['feature1'].hist(ax=axes[0, 0])
# plt.tight_layout()
# plt.show()

## 4. Feature Engineering

Identify and create features for customer categorization.
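
As a hedged illustration of what this step could look like, the sketch below derives two spend features and then z-scores selected columns. Standardizing is a common choice because distance-based clustering, including fuzzy c-means with Euclidean distance, is sensitive to feature scale. The column names `total_spend` and `n_orders` and both helper functions are hypothetical; the right features depend on your data and domain knowledge.

```python
import numpy as np
import pandas as pd


def add_spend_features(df):
    """Add average order value and log-transformed spend (hypothetical columns)."""
    out = df.copy()
    # Customers with zero orders get NaN instead of a division-by-zero inf.
    orders = out["n_orders"].where(out["n_orders"] > 0)
    out["avg_order_value"] = out["total_spend"] / orders
    # log1p compresses the long right tail and is defined at zero spend.
    out["log_total_spend"] = np.log1p(out["total_spend"])
    return out


def standardize(df, columns):
    """Z-score each listed column: mean 0 and sample standard deviation 1."""
    out = df.copy()
    for col in columns:
        out[col] = (out[col] - out[col].mean()) / out[col].std()
    return out


# Tiny synthetic frame, for illustration only.
demo = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "total_spend": [100.0, 250.0, 0.0, 900.0],
    "n_orders": [2, 5, 0, 9],
})
features = add_spend_features(demo)
scaled = standardize(features, ["log_total_spend", "n_orders"])
print(scaled)
```

Identifier columns such as `customer_id` are deliberately left unscaled, since they should not be fed to the clustering step as features.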

In [None]:
# TODO: Feature engineering
# Create new features based on domain knowledge

## 5. Save Processed Data

Save cleaned and processed data for modeling.
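
A small sketch of the save step follows, assuming the output folder may not exist yet. The helper name `save_processed` is hypothetical; the default file name mirrors the `customers_cleaned.csv` placeholder used in the template cell.

```python
from pathlib import Path

import pandas as pd


def save_processed(df, out_dir, filename="customers_cleaned.csv"):
    """Write df to out_dir/filename as CSV, creating out_dir if needed."""
    out_dir = Path(out_dir)
    # Without this, to_csv raises an error when the folder does not exist.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    df.to_csv(out_path, index=False)
    return out_path


# Hypothetical usage:
# saved_to = save_processed(df, '../data/processed')
# print("Processed data saved to:", saved_to)
```

Returning the path makes it easy to log exactly where the file landed, or to pass it on to the next notebook.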

In [None]:
# TODO: Save processed data
# PROCESSED_DIR = Path('../data/processed')
# PROCESSED_DIR.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
# df.to_csv(PROCESSED_DIR / 'customers_cleaned.csv', index=False)
# print("Processed data saved successfully!")

## Next Steps

Continue with one of the follow-up notebooks:

1. `02_fuzzy_clustering_traditional_ml.ipynb` for traditional ML approaches
2. `03_fuzzy_clustering_neural_network.ipynb` for neural network approaches