# Exploratory Data Analysis for Smartphone Purchase Behavior

In this notebook, we perform exploratory data analysis (EDA) on the smartphone purchase behavior dataset. The goal is to understand the dataset's structure, check its quality, and identify patterns that can inform later predictive modeling.

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (sns.set is an older alias of sns.set_theme)
sns.set_theme(style='whitegrid')

In [None]:
# Load the dataset
data = pd.read_csv('../data/processed/smartphone_purchase_data.csv')

# Display the first few rows of the dataset
data.head()

In [None]:
# Summary statistics (describe() covers numeric columns by default)
data.describe()
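
Because `describe()` summarizes only numeric columns by default, a per-column overview can complement it. The following is a minimal sketch, not part of the original analysis: the helper `column_overview` and the toy column names (`age`, `brand`, `purchase`) are hypothetical and used only for illustration.

```python
import pandas as pd


def column_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, non-null count, and distinct-value count for each column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notnull().sum(),
        "n_unique": df.nunique(),  # missing values are not counted as a distinct value
    })


# Toy frame with hypothetical column names, for illustration only
toy = pd.DataFrame({
    "age": [25, 31, 40, 31],
    "brand": ["A", "B", "A", None],
    "purchase": [0, 1, 1, 0],
})
overview = column_overview(toy)
print(overview)
```

On the real dataset, `column_overview(data)` would give a quick view of which columns are categorical and how many distinct values each one has.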

In [None]:
# Check for missing values
missing_values = data.isnull().sum()
missing_values[missing_values > 0]
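
Raw counts of missing values are easier to judge next to the share of rows they represent. The sketch below extends the check above with a percentage column; the helper name `missing_summary` and the toy columns are assumptions for illustration, not part of the dataset.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return count and percentage of missing values per column, worst first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing_count": counts,
        "missing_pct": 100 * counts / len(df),
    })
    # Keep only columns that actually have missing values
    summary = summary[summary["missing_count"] > 0]
    return summary.sort_values("missing_count", ascending=False)


# Toy frame with hypothetical column names, for illustration only
toy = pd.DataFrame({
    "age": [25, np.nan, 40, 31],
    "income": [50_000, 62_000, np.nan, np.nan],
    "purchase": [0, 1, 1, 0],
})
summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(data)` on the loaded dataset would return an empty frame if no column has missing values.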

In [None]:
# Visualize the distribution of purchase behavior
plt.figure(figsize=(10, 6))
sns.countplot(x='purchase', data=data)
plt.title('Distribution of Smartphone Purchase Behavior')
plt.xlabel('Purchase')
plt.ylabel('Count')
plt.show()
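
A count plot shows class imbalance visually; the exact proportions are useful when choosing evaluation metrics later. This is a minimal sketch assuming the target column is named `purchase`, as in the plot above. The helper `class_balance` and the toy values are hypothetical.

```python
import pandas as pd


def class_balance(target: pd.Series) -> pd.DataFrame:
    """Return the count and share of each class in a target column."""
    counts = target.value_counts(dropna=False)
    share = target.value_counts(normalize=True, dropna=False)
    # Both series are indexed by class label, so they align row by row
    return pd.DataFrame({"count": counts, "share": share})


# Toy target with made-up values, for illustration only
toy_target = pd.Series([0, 0, 0, 1], name="purchase")
balance = class_balance(toy_target)
print(balance)
```

On the real data this would be called as `class_balance(data['purchase'])`.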

In [None]:
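# Correlation heatmap (numeric columns only)
plt.figure(figsize=(12, 8))
# numeric_only=True (pandas >= 1.5) skips non-numeric columns, which raise an error in pandas >= 2.0
correlation_matrix = data.corr(numeric_only=True)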
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
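
A full heatmap can be hard to read when there are many features. One option is to rank features by the strength of their correlation with the target. The sketch below assumes the target (`purchase` here) is stored as a numeric column such as 0/1; the helper `correlations_with_target` and the toy columns are hypothetical.

```python
import pandas as pd


def correlations_with_target(df: pd.DataFrame, target: str) -> pd.Series:
    """Return Pearson correlations of numeric features with the target,
    sorted by absolute value (strongest first)."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)


# Toy frame with hypothetical column names, for illustration only
toy = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "screen_time": [5, 1, 4, 3],
    "brand": ["A", "B", "A", "B"],  # non-numeric, excluded from the result
    "purchase": [0, 0, 1, 1],
})
ranked = correlations_with_target(toy, "purchase")
print(ranked)
```

Pearson correlation captures only linear relationships, so a low value here does not rule out a nonlinear dependence on the target.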

## Conclusion

In this notebook, we explored the smartphone purchase behavior dataset: we reviewed summary statistics, checked for missing values, visualized the distribution of the purchase variable, and examined correlations between features. These findings will guide the next steps in data preprocessing and model training.