In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

**Summary** - **This EDA of a women's clothing e-commerce business looks at the highest-selling size, the trending colors, the relationship between sales, price, and quantity, and monthly revenue patterns. The findings are intended to inform inventory management, marketing strategy, and overall performance optimization.**

In [None]:
df = pd.read_csv("/kaggle/input/-women-clothing-ecommerce-sales-data/women_clothing_ecommerce_sales.csv")
df

In [None]:
df.info()

In [None]:
df.describe(include="all")

In [None]:
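# Fill missing sizes with the most frequent size (the mode).
# Assigning the result back avoids the chained in-place fill that newer pandas versions warn about.
df['size'] = df['size'].fillna(df['size'].mode()[0])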

In [None]:
df['size'].isnull().sum()

**Which is the highest-selling size?**

In [None]:
# Count plot of rows per size (each row counts once, regardless of its quantity)
plt.figure(figsize=(10, 6))
sns.countplot(x='size', data=df, palette="RdPu")
plt.title('Distribution of Size')
plt.xlabel('Size')
plt.ylabel('Count')
plt.show()
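
The count plot above counts rows, so a row with a quantity of 3 weighs the same as a row with a quantity of 1. If "highest selling" is meant in units, summing the quantity per size may give a different ranking. The sketch below is a minimal illustration that assumes the data has a `quantity` column; the helper name `units_by_size` and the sample rows are made up for the example and are not taken from the dataset.

```python
import pandas as pd


def units_by_size(frame: pd.DataFrame) -> pd.Series:
    """Total units sold per size, largest first (assumes a 'quantity' column)."""
    return frame.groupby("size")["quantity"].sum().sort_values(ascending=False)


# Tiny made-up sample, for illustration only (not the Kaggle data)
sample = pd.DataFrame({
    "size": ["M", "XL", "XL", "S", "M"],
    "quantity": [1, 2, 1, 1, 3],
})

totals = units_by_size(sample)
print(totals)
```

In this sample XL and M each appear on two rows, yet M leads once units are summed, which is the kind of difference worth checking on the real data with `units_by_size(df)`.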

**Which clothing color is trending?**

In [None]:
# Calculate percentage
counts = df['color'].value_counts()
percentages = counts / counts.sum() * 100

# Custom pastel display colors for selected categories
color_mapping = {
    'Dark Blue': '#96b1d8',
    'Light Blue': '#b7dde3',
    'Black': '#d3d3d3',
}

# Create a new column "colors_grouped" that lumps categories with a share below 2% into "Others"
df['colors_grouped'] = df['color'].apply(lambda x: x if percentages[x] >= 2 else 'Others')

# Calculate the percentage of each group
grouped_counts = df['colors_grouped'].value_counts()
grouped_percentages = grouped_counts / grouped_counts.sum() * 100

# Pick the custom pastel color for each group, with a default for unmapped groups
colors = [color_mapping.get(color, '#e3c8d6') for color in grouped_percentages.index]

plt.figure(figsize=(10, 6))
plt.pie(grouped_percentages, labels=None, autopct='%1.1f%%', startangle=140, colors=colors)


plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

plt.legend(grouped_percentages.index, loc='center left', bbox_to_anchor=(1, 0.5))
plt.title('Distribution of Colors')
plt.show()
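
The grouping step above can also be written as a small reusable function. This is only a sketch of one possible alternative, not the method used in the cell: it relies on `value_counts(normalize=True)` and `Series.where`, and the name `group_rare`, its parameters, and the sample values are hypothetical.

```python
import pandas as pd


def group_rare(values: pd.Series, min_pct: float = 2.0, other: str = "Others") -> pd.Series:
    """Replace categories whose share is below min_pct percent with a single label."""
    shares = values.value_counts(normalize=True) * 100
    keep = shares[shares >= min_pct].index
    # Keep frequent categories as they are; everything else becomes `other`
    return values.where(values.isin(keep), other)


# Made-up sample: "Mint" accounts for 1% of the rows, so it falls below the 2% cut-off
sample = pd.Series(["Dark Blue"] * 60 + ["Black"] * 39 + ["Mint"] * 1, name="color")

grouped = group_rare(sample)
print(grouped.value_counts())
```

One design difference from the `apply` version: missing values do not raise a `KeyError` here, they simply end up in the `other` group.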

**Relationship between sales, price & quantity**

In [None]:

correlation_matrix_numeric = df.drop(['order_id'], axis=1).select_dtypes(include=['float64', 'int64']).corr()

# Create a heatmap with seaborn
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix_numeric, annot=True, cmap='RdPu', fmt=".2f", linewidths=.5)
plt.title("Correlation Matrix of Sales, Price and Quantity")
plt.show()
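
Selecting only `float64` and `int64` can silently skip numeric columns stored with another width (for example `int32`). A slightly more general variant, sketched here under assumptions, selects every numeric dtype with `select_dtypes(include="number")`. The helper name `numeric_correlations`, the column name `unit_price`, and the sample values are hypothetical and only serve the illustration.

```python
import pandas as pd


def numeric_correlations(frame: pd.DataFrame, id_columns=("order_id",)) -> pd.DataFrame:
    """Pearson correlations between numeric columns, ignoring identifier columns."""
    measures = frame.drop(columns=list(id_columns), errors="ignore")
    return measures.select_dtypes(include="number").corr()


# Made-up sample, for illustration only; 'unit_price' is an assumed column name
sample = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "color": ["Black", "Black", "Dark Blue", "Light Blue"],
    "unit_price": [100.0, 200.0, 300.0, 400.0],
    "quantity": [1, 1, 2, 1],
    "revenue": [100.0, 200.0, 600.0, 400.0],
})

corr = numeric_correlations(sample)
print(corr.round(2))
```

The resulting frame can be passed to `sns.heatmap` in the same way as `correlation_matrix_numeric`.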

**What is the revenue per month?**

In [None]:
# Parse the order timestamps and derive an abbreviated month name (e.g. 'Jun')
df['order_date'] = pd.to_datetime(df['order_date'], format='%Y/%m/%d %H:%M:%S')
df['month'] = df['order_date'].dt.strftime('%b')
df

In [None]:
df['month'] = pd.Categorical(df['month'], ordered=True)
monthly_revenue = df.groupby('month')['revenue'].sum().reset_index()
sns.set_palette("RdPu")
month_order = ['Jun', 'Jul', 'Aug', 'Sep']
# Plot the monthly revenue
plt.figure(figsize=(10, 6))
sns.barplot(x='month', y='revenue', data=monthly_revenue, order=month_order)
plt.title('Monthly Revenue')
plt.xlabel('Month')
plt.ylabel('Total Revenue')
plt.show()
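
Month abbreviations sort alphabetically and merge the same month from different years, which is why the plot above needs a hand-written `month_order`. As a possible alternative (a sketch, not the approach used in the cell), grouping by a monthly period keeps the result in chronological order without a hard-coded list. The helper name `monthly_revenue_by_period` and the sample rows are made up for the example.

```python
import pandas as pd


def monthly_revenue_by_period(frame: pd.DataFrame) -> pd.Series:
    """Sum revenue per calendar month, indexed by monthly period."""
    dates = pd.to_datetime(frame["order_date"])
    # to_period("M") maps each timestamp to its year-month, e.g. 2022-06
    return frame.groupby(dates.dt.to_period("M"))["revenue"].sum()


# Made-up sample, for illustration only (not the Kaggle data)
sample = pd.DataFrame({
    "order_date": [
        "2022-06-01 10:00:00",
        "2022-06-15 09:30:00",
        "2022-07-02 12:00:00",
        "2022-09-20 18:45:00",
    ],
    "revenue": [100, 50, 200, 75],
})

monthly = monthly_revenue_by_period(sample)
print(monthly)
```

Months with no orders (August in this sample) do not appear in the result, so a gap in the index is a prompt to check whether sales were really zero or data is missing.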

**Conclusion** - The Women's Clothing Ecommerce Exploratory Data Analysis (EDA) reveals valuable insights into various aspects of the business.

1. Among the available sizes, **XL** stands out as the highest-selling, which is useful information for inventory management and marketing strategies.

2. The examination of color trends indicates that **dark blue** is the most popular color among customers. This information can guide product design and stocking decisions to meet customer preferences.

3. The relationship between sales, prices, and quantity was explored, shedding light on the dynamics of these variables. Understanding how changes in one factor impact the others is crucial for pricing strategies and optimizing inventory levels.

4. The analysis of monthly revenue gives a view of the business's financial performance over the months covered. This is useful for assessing the seasonality of sales, identifying peak months, and making informed decisions about resource allocation and marketing efforts.

In conclusion, the Women's Clothing Ecommerce EDA not only provides a snapshot of current business performance but also offers actionable insights for strategic decision-making, ensuring that the company can adapt and thrive in a competitive market.