## Project: Revenue Improvement Analysis for Trainers Store

### Introduction
The sports clothing and athleisure attire industry is experiencing significant growth, with a market value of approximately $193 billion in 2021. As a product analyst for a brick-and-mortar trainers store, I aimed to enhance revenue by analyzing various aspects of product data, including pricing, reviews, descriptions, and ratings. This analysis will inform recommendations for the marketing and sales teams to optimize strategies and drive revenue growth.

### Data Sources
Four datasets were utilized for analysis:
- **brands.csv**: Contains information about the brand of each product, including unique product identifiers.
- **finance.csv**: Provides financial data such as listing price, sale price, discounts, and revenue, linked by unique product identifiers.
- **info.csv**: Includes product names, unique identifiers, and descriptions.
- **reviews.csv**: Contains product ratings and the number of reviews, linked by unique product identifiers.

### Methodology
1. **Data Preprocessing**:
    - Merged the datasets based on the unique product identifier.
    - Dropped rows containing null values so that every remaining product has complete data.

2. **Exploratory Data Analysis**:
    - Explored the distribution and characteristics of the data.
    - Identified trends and patterns in pricing, reviews, and revenue.

3. **Brand Analysis - Adidas and Nike**:
    - Filtered products belonging to the Adidas and Nike brands for in-depth analysis.

4. **Price Label Categorization**:
    - Calculated listing price quartiles across all Adidas and Nike products combined.
    - Categorized products into "Budget", "Average", "Expensive", and "Elite" based on quartiles.

5. **Analysis and Insights**:
    - Grouped data by brand and price label to analyze product volume and average revenue.
    - Derived insights to inform marketing and sales strategies for revenue improvement.
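
The preprocessing step above can be sanity-checked before any analysis is run. The following is a minimal sketch, assuming each file holds exactly one row per `product_id`; the two small DataFrames are hypothetical stand-ins for `info.csv` and `finance.csv`, and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of two of the four files (values are made up).
info = pd.DataFrame({
    "product_id": ["A1", "A2", "A3"],
    "product_name": ["Shoe one", "Shoe two", "Shoe three"],
    "description": ["Lightweight runner", None, "Court classic"],
})
finance = pd.DataFrame({
    "product_id": ["A1", "A2", "A3"],
    "listing_price": [59.99, 89.99, 120.00],
    "revenue": [1200.50, 830.00, 4100.25],
})

# validate="one_to_one" raises a MergeError if product_id repeats on either side,
# which would otherwise silently duplicate rows in the merged table.
merged = info.merge(finance, on="product_id", how="inner", validate="one_to_one")

# Record how many rows the null-dropping step removes.
rows_before = len(merged)
cleaned = merged.dropna()
rows_dropped = rows_before - len(cleaned)
print(f"Dropped {rows_dropped} of {rows_before} rows containing nulls")
```

Reporting the number of dropped rows makes it easier to judge whether removing nulls discards a meaningful share of the catalogue.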


### brands.csv

| Column | Description |
|--------|-------------|
| `product_id` | Unique product identifier |
| `brand` | Brand of the product |

### finance.csv

| Column | Description |
|--------|-------------|
| `product_id` | Unique product identifier |
| `listing_price` | Original price of the product |
| `sale_price` | Discounted price of the product |
| `discount` | Discount off the listing price, as a decimal |
| `revenue` | Revenue generated by the product |

### info.csv

| Column | Description |
|--------|-------------|
| `product_name` | Name of the product |
| `product_id` | Unique product identifier |
| `description` | Description of the product |

### reviews.csv

| Column | Description |
|--------|-------------|
| `product_id` | Unique product identifier |
| `rating` | Average product rating |
| `reviews` | Number of reviews for the product |
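
The column descriptions for `finance.csv` imply a relationship between the three price fields. The sketch below checks it on made-up rows, under the assumption (not confirmed by the source) that `sale_price` equals `listing_price * (1 - discount)`; the 0.01 tolerance is also an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like finance.csv (values are made up).
finance = pd.DataFrame({
    "product_id": ["A1", "A2", "A3"],
    "listing_price": [100.0, 80.0, 50.0],
    "sale_price": [50.0, 80.0, 45.0],
    "discount": [0.5, 0.0, 0.2],
})

# Assumed relationship: sale price is the listing price reduced by the discount.
expected_sale = finance["listing_price"] * (1 - finance["discount"])

# Flag rows where the recorded sale price deviates by more than one cent.
finance["consistent"] = (finance["sale_price"] - expected_sale).abs() < 0.01
inconsistent_ids = finance.loc[~finance["consistent"], "product_id"].tolist()
print(inconsistent_ids)
```

Rows flagged this way would be worth inspecting before relying on `listing_price` to build price categories.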

```python
import pandas as pd

# Read in the data
info = pd.read_csv("info.csv")
finance = pd.read_csv("finance.csv")
reviews = pd.read_csv("reviews.csv")
brands = pd.read_csv("brands.csv")

# Merge the four tables on product_id and drop rows with null values
merged_df = info.merge(finance, on="product_id")
merged_df = merged_df.merge(reviews, on="product_id")
merged_df = merged_df.merge(brands, on="product_id")
merged_df = merged_df.dropna()

# Add price labels based on listing_price quartiles (computed across both brands)
merged_df["price_label"] = pd.qcut(
    merged_df["listing_price"], q=4, labels=["Budget", "Average", "Expensive", "Elite"]
)

# Group by brand and price_label to get product volume and mean revenue
adidas_vs_nike = merged_df.groupby(["brand", "price_label"], as_index=False, observed=False).agg(
    num_products=("price_label", "count"),
    mean_revenue=("revenue", "mean")
).round(2)

print(adidas_vs_nike)

# Store the length of each description
merged_df["description_length"] = merged_df["description"].str.len()

# Upper description length limits (bin edges)
length_bins = [0, 100, 200, 300, 400, 500, 600, 700]

# Description length labels (upper edge of each bin)
labels = ["100", "200", "300", "400", "500", "600", "700"]

# Cut into bins, replacing the raw lengths with their bin labels
merged_df["description_length"] = pd.cut(merged_df["description_length"], bins=length_bins, labels=labels)

# Group by the bins.
# Note: "count" tallies the products in each bin, not the total number of reviews.
description_lengths = merged_df.groupby("description_length", as_index=False, observed=False).agg(
    mean_rating=("rating", "mean"),
    num_reviews=("reviews", "count")
).round(2)

print(description_lengths)
```


```
    brand price_label  num_products  mean_revenue
0  Adidas      Budget           574       2015.68
1  Adidas     Average           655       3035.30
2  Adidas   Expensive           759       4621.56
3  Adidas       Elite           587       8302.78
4    Nike      Budget           357       1596.33
5    Nike     Average             8        675.59
6    Nike   Expensive            47        500.56
7    Nike       Elite           130       1367.45

  description_length  mean_rating  num_reviews
0                100         2.26            7
1                200         3.19          526
2                300         3.28         1785
3                400         3.29          651
4                500         3.35          118
5                600         3.12           15
6                700         3.65           15
```
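
When reading the description-length table, it helps to know how `pd.cut` treats values on the bin edges. This small sketch reuses the same edges and labels as the analysis on a hypothetical series of lengths chosen to sit on and around the boundaries.

```python
import pandas as pd

# Same bin edges and labels as the description-length analysis.
edges = [0, 100, 200, 300, 400, 500, 600, 700]
labels = ["100", "200", "300", "400", "500", "600", "700"]

# Hypothetical description lengths placed on and around the bin boundaries.
sample_lengths = pd.Series([0, 1, 100, 101, 700, 701])

# By default each bin is right-inclusive and left-exclusive: (0, 100], (100, 200], ...
binned = pd.cut(sample_lengths, bins=edges, labels=labels)
print(binned.astype(object).tolist())
```

With these defaults, a length of exactly 0 or anything above 700 falls outside every bin and becomes `NaN`, so such products disappear from the grouped result. Passing `include_lowest=True` to `pd.cut` would keep zero-length descriptions in the first bin if that is preferred.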


### Results

#### Adidas Products Analysis:
- **Budget**: There are 574 Adidas products categorized as "Budget", with an average revenue of $2015.68.
- **Average**: The "Average" category includes 655 Adidas products, generating an average revenue of $3035.30.
- **Expensive**: Adidas offers 759 products in the "Expensive" category, with an average revenue of $4621.56.
- **Elite**: Adidas has 587 products classified as "Elite", with an impressive average revenue of $8302.78.

#### Nike Products Analysis:
- **Budget**: Nike offers 357 products in the "Budget" category, with an average revenue of $1596.33.
- **Average**: Only 8 Nike products fall into the "Average" category, generating an average revenue of $675.59.
- **Expensive**: Nike has 47 products categorized as "Expensive", with an average revenue of $500.56.
- **Elite**: Nike's "Elite" category includes 130 products, with an average revenue of $1367.45.
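
The figures above can be combined to compare groups by total rather than average revenue, and to check whether revenue rises with price for each brand. This is a rough sketch that rebuilds the summary from the reported numbers; the `approx_total_revenue` column is an illustrative addition, and it is only approximate because the means are rounded to two decimals.

```python
import pandas as pd

# Figures copied from the grouped output reported in this analysis,
# listed in price order (Budget to Elite) within each brand.
summary = pd.DataFrame({
    "brand": ["Adidas"] * 4 + ["Nike"] * 4,
    "price_label": ["Budget", "Average", "Expensive", "Elite"] * 2,
    "num_products": [574, 655, 759, 587, 357, 8, 47, 130],
    "mean_revenue": [2015.68, 3035.30, 4621.56, 8302.78,
                     1596.33, 675.59, 500.56, 1367.45],
})

# Product count times mean revenue approximates each group's total revenue.
summary["approx_total_revenue"] = (
    summary["num_products"] * summary["mean_revenue"]
).round(2)

# Per brand: does mean revenue increase at every step up in price label?
rises_with_price = summary.groupby("brand")["mean_revenue"].apply(
    lambda s: s.is_monotonic_increasing
)

print(summary)
print(rises_with_price)
```

The check comes out true for Adidas and false for Nike, since Nike's "Budget" products have a higher mean revenue than its "Average" and "Expensive" products.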

### Conclusion

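The analysis shows how Adidas and Nike products perform across price categories, and the two brands follow different patterns. For Adidas, average revenue rises with every step up in price, from $2015.68 for "Budget" to $8302.78 for "Elite". Nike shows no such trend: its "Budget" products earn the highest average revenue ($1596.33), followed by "Elite" ($1367.45), while "Average" and "Expensive" products trail at $675.59 and $500.56. The number of products in each category also differs sharply between the brands.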

For Adidas, the "Elite" category stands out, with average revenue roughly 1.8 times that of the "Expensive" category. This suggests strong demand for premium Adidas products. Adidas products are spread fairly evenly across the four categories (574 to 759 each), whereas Nike's 542 products are concentrated in "Budget" (357) and "Elite" (130), with only 8 in "Average" and 47 in "Expensive".

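To capitalize on these findings, the trainers store can tailor its marketing strategies and product offerings accordingly. Promoting Adidas "Elite" products is directly supported by the revenue figures. Expanding the Nike "Average" range could fill a gap in the assortment and reach a wider customer base, although the 8 products currently in that category average only $675.59 in revenue, so the evidence for this step is thin. The description-length breakdown adds one further signal: the 7 products with descriptions of 100 characters or fewer have the lowest mean rating (2.26), while the longer bins average between 3.12 and 3.65. Very short descriptions may therefore be worth expanding, keeping in mind that the smallest bins contain few products. Together, these insights on pricing, descriptions, and ratings can improve customer engagement and satisfaction and support long-term revenue growth in the competitive sports clothing market.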
