# Load the dataset from a CSV file

1. Data Extraction (SQL)
Here's a simple SQL query to extract the needed customer data from an e-commerce database:

In [None]:
SELECT customer_id, age, gender, total_spent, number_of_orders, last_purchase_date
FROM customers
WHERE active_status = 'Active';
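
The cleaning step further down reads a file named customers.csv, so the query result has to be exported somewhere in between. The sketch below shows one way to do that with Python's standard library. It assumes a SQLite connection purely for illustration (the actual database engine is not stated here); the helper name `export_active_customers` and the demo rows are hypothetical.

```python
import csv
import io
import sqlite3

# Same query as above, kept as a constant so it can be reused.
QUERY = """
SELECT customer_id, age, gender, total_spent, number_of_orders, last_purchase_date
FROM customers
WHERE active_status = 'Active'
"""


def export_active_customers(conn, out):
    """Run the extraction query and write the result to `out` as CSV.

    `out` is any writable text file object. Returns the number of data rows.
    """
    cursor = conn.execute(QUERY)
    writer = csv.writer(out)
    # Column names come from the cursor, so the header matches the SELECT list.
    writer.writerow([col[0] for col in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Demo with an in-memory database and made-up rows (assumption: SQLite).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE customers (
           customer_id INTEGER, age INTEGER, gender TEXT, total_spent REAL,
           number_of_orders INTEGER, last_purchase_date TEXT, active_status TEXT)"""
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 34, "F", 250.0, 5, "2024-01-10", "Active"),
        (2, None, "M", 80.5, 2, "2023-11-02", "Active"),
        (3, 51, "F", 0.0, 0, "2022-06-30", "Inactive"),
    ],
)

buffer = io.StringIO()
n_rows = export_active_customers(conn, buffer)
print(buffer.getvalue())
```

To write an actual file instead of an in-memory buffer, pass `open('customers.csv', 'w', newline='')` as `out`. Missing values such as the NULL age come out as empty fields, which pandas reads back as NaN.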


2. Data Cleaning and Preparation (Python with Pandas)
Here's how you might clean and prepare your data using Python:

In [None]:
import pandas as pd

# Load data
df = pd.read_csv('customers.csv')

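# Fill missing ages with the column mean
# (assign the result back; chained fillna(..., inplace=True) is unreliable in recent pandas)
df['age'] = df['age'].fillna(df['age'].mean())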

# Convert last_purchase_date to datetime
df['last_purchase_date'] = pd.to_datetime(df['last_purchase_date'])

# Create a new feature: Days since last purchase
df['days_since_last_purchase'] = (pd.to_datetime('today') - df['last_purchase_date']).dt.days
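
Because the feature above is computed against today's date, its values change every time the notebook is rerun. The following is a minimal sketch of the same cleaning steps wrapped in a function that takes an explicit reference date, which makes results reproducible. The function name `prepare_customers`, the `as_of` parameter, and the sample rows are assumptions made for illustration.

```python
import pandas as pd


def prepare_customers(df, as_of):
    """Return a cleaned copy of `df` with a days_since_last_purchase column.

    `as_of` is the reference date (string or Timestamp) used in place of "today".
    """
    out = df.copy()
    # Fill missing ages with the column mean.
    out["age"] = out["age"].fillna(out["age"].mean())
    # Parse the purchase dates.
    out["last_purchase_date"] = pd.to_datetime(out["last_purchase_date"])
    # Whole days between the reference date and the last purchase.
    out["days_since_last_purchase"] = (
        pd.Timestamp(as_of) - out["last_purchase_date"]
    ).dt.days
    return out


# Made-up sample data.
sample = pd.DataFrame(
    {
        "customer_id": [1, 2, 3],
        "age": [30.0, None, 50.0],
        "last_purchase_date": ["2024-01-01", "2024-01-21", "2024-01-31"],
    }
)

prepared = prepare_customers(sample, as_of="2024-01-31")
print(prepared)
```

Passing the date the data was extracted as `as_of` keeps the feature tied to the snapshot rather than to the day the code happens to run.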


3. Customer Segmentation using K-means Clustering (Python with Scikit-learn)
Here's a basic example of applying K-means clustering:

In [None]:
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Feature selection and scaling
features = df[['age', 'total_spent', 'number_of_orders', 'days_since_last_purchase']]
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)

# Clustering (n_init is set explicitly because its default differs across scikit-learn versions)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
df['segment'] = kmeans.fit_predict(features_scaled)

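# Save the model and its scaler for later use
# (new data must be scaled with the same fitted scaler before predicting)
import joblib
joblib.dump(kmeans, 'customer_segmentation_model.pkl')
joblib.dump(scaler, 'customer_segmentation_scaler.pkl')  # file name is illustrative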
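
The K-means model above is fit on standardized features, so it can only assign new customers correctly if they are scaled with the same fitted scaler. One possible way to keep the two together is a scikit-learn `Pipeline`, sketched below. The synthetic arrays stand in for the four feature columns, and the file name `segmentation_pipeline.pkl` is hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the four feature columns (assumption, not real data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))

# Scaling and clustering bundled into a single estimator.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=4, n_init=10, random_state=42),
)
pipeline.fit(X_train)

# Persist and reload the whole pipeline (a temporary directory is used for the demo).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "segmentation_pipeline.pkl")
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

# New, unscaled rows can be passed directly; the pipeline scales them first.
X_new = rng.normal(size=(5, 4))
new_segments = restored.predict(X_new)
print(new_segments)
```

With the DataFrame from the earlier steps, the same pipeline would be fit on the unscaled `features` frame rather than on `features_scaled`.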


4. Visualization (Python with Matplotlib/Seaborn)
Visualize the segments to understand their characteristics:

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

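# Pairplot of the clustering features, colored by segment
plot_cols = ['age', 'total_spent', 'number_of_orders', 'days_since_last_purchase']
g = sns.pairplot(df, vars=plot_cols, hue='segment')
# pairplot returns a grid of axes, so set the title on the figure rather than with plt.title
g.figure.suptitle('Customer Segments Visualization', y=1.02)
plt.show()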
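
A numeric summary can complement the plot when trying to understand what characterizes each segment. The sketch below computes the size and per-feature mean of each segment. The helper name `profile_segments` and the toy data are illustrative assumptions.

```python
import pandas as pd


def profile_segments(df, feature_cols, segment_col="segment"):
    """Return one row per segment: customer count plus the mean of each feature."""
    grouped = df.groupby(segment_col)
    summary = grouped[feature_cols].mean()
    # Put the segment size in the first column.
    summary.insert(0, "n_customers", grouped.size())
    return summary


# Made-up data with two segments.
toy = pd.DataFrame(
    {
        "segment": [0, 0, 1, 1, 1],
        "total_spent": [100.0, 300.0, 20.0, 40.0, 60.0],
        "number_of_orders": [2, 4, 1, 1, 1],
    }
)

summary = profile_segments(toy, ["total_spent", "number_of_orders"])
print(summary)
```

On the real data, `feature_cols` would be the same four columns used for clustering. The means are in original units, which tends to be easier to interpret than the scaled values.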
