# Exploratory Data Analysis for Tide Dynamic Pricing

This notebook contains exploratory data analysis (EDA) for the dynamic pricing model of Tide at GlobalMart. The goal is to understand the data, identify patterns, and derive insights that can inform the pricing strategy.

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (set_theme is the current name; sns.set is an alias for it)
sns.set_theme(style='whitegrid')

In [None]:
# Load the dataset
data_path = '../data/processed/tide_pricing_data.csv'
tide_data = pd.read_csv(data_path)

# Display the first few rows of the dataset
tide_data.head()

In [None]:
# Summary statistics
tide_data.describe()

In [None]:
# Check for missing values
missing_values = tide_data.isnull().sum()
missing_values[missing_values > 0]
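
Raw missing-value counts are easier to judge next to the share of rows they affect. The helper below is a minimal sketch of that idea. The function name `missing_value_report` and the toy column `units_sold` are illustrative assumptions, not part of the Tide dataset schema.

```python
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return count and percentage of missing values per column, worst first."""
    counts = df.isnull().sum()
    report = pd.DataFrame({
        'missing_count': counts,
        'missing_pct': (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have gaps
    report = report[report['missing_count'] > 0]
    return report.sort_values('missing_count', ascending=False)


# Toy frame for illustration only; column names other than 'price' are hypothetical
example = pd.DataFrame({
    'price': [9.99, None, 11.49, 10.99],
    'units_sold': [120, 95, None, None],
})
print(missing_value_report(example))
```

On the real data this would be called as `missing_value_report(tide_data)`.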

In [None]:
# Visualize the distribution of prices
plt.figure(figsize=(10, 6))
sns.histplot(tide_data['price'], bins=30, kde=True)
plt.title('Price Distribution of Tide Products')
plt.xlabel('Price')
plt.ylabel('Count')  # histplot's default statistic is the per-bin count
plt.show()
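
A histogram shows the shape of the distribution, but a couple of numbers can make skew and extreme prices easier to compare across runs. The sketch below uses the common 1.5 × IQR rule to flag unusually low or high prices. That rule and the helper name `iqr_outliers` are assumptions chosen for illustration, not a method prescribed by this analysis.

```python
import pandas as pd


def iqr_outliers(prices: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, q3 = prices.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (prices < q1 - k * iqr) | (prices > q3 + k * iqr)
    return prices[mask]


# Toy prices for illustration only
example_prices = pd.Series([10.0, 11.0, 12.0, 11.0, 10.0, 12.0, 11.0, 40.0])
print('Skewness:', example_prices.skew())
print(iqr_outliers(example_prices))
```

On the real data the equivalent call would be `iqr_outliers(tide_data['price'].dropna())`.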

In [None]:
# Correlation heatmap
plt.figure(figsize=(12, 8))
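# Restrict to numeric columns; in pandas >= 2.0, corr() raises on non-numeric data
correlation_matrix = tide_data.corr(numeric_only=True)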
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
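
A full heatmap can be hard to read when the question is simply which variables move with price. The sketch below ranks numeric columns by the absolute value of their correlation with a target column. It assumes the `price` column used earlier; the helper name `correlations_with_target` and the toy columns `competitor_price`, `units_sold`, and `store` are hypothetical.

```python
import pandas as pd


def correlations_with_target(df: pd.DataFrame, target: str = 'price') -> pd.Series:
    """Rank numeric columns by absolute Pearson correlation with the target."""
    numeric = df.select_dtypes(include='number')
    corr = numeric.corr()[target].drop(target)
    # Sort by magnitude so strong negative relationships are not buried
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)


# Toy frame for illustration only; these columns are not the real schema
example = pd.DataFrame({
    'price': [1.0, 2.0, 3.0, 4.0],
    'competitor_price': [2.0, 4.0, 6.0, 8.0],
    'units_sold': [4, 2, 3, 1],
    'store': ['A', 'B', 'A', 'B'],
})
print(correlations_with_target(example))
```

Correlation only captures linear association, so a low value here does not rule out a nonlinear relationship with price.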

## Insights and Next Steps

The summary statistics, missing-value check, price distribution, and correlation heatmap above give a first picture of data quality and of which variables move with price. These findings will guide the next steps for the dynamic pricing model: feature engineering and model training.