## About Dataset
This dataset was collected with a Python scraper in March 2023 and contains:

- Information about all beauty products (over 8,000) from the Sephora online store, including product and brand names, prices, ingredients, ratings, and all features.
- User reviews (about 1 million, covering over 2,000 products) of all products from the Skincare category, including user appearances and the ratings other users gave each review.

# 💼 Business Use Case: Product Pricing & Rating Optimization
🧠 Goal:
Help Sephora (or a competitor) identify which product features, brands, or categories are associated with higher customer ratings and higher prices, in order to:

- Optimize product pricing strategies
- Curate high-performing product selections
- Understand what customers value most (e.g., ingredients, brand prestige, packaging features)

In [99]:
# pandas is required by every cell below
import pandas as pd

product_info=pd.read_csv(r"C:\Users\soumy\Downloads\archive (29)\product_info.csv", delimiter=',')
product_info.head()

product_info.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8494 entries, 0 to 8493
Data columns (total 27 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   product_id          8494 non-null   object 
 1   product_name        8494 non-null   object 
 2   brand_id            8494 non-null   int64  
 3   brand_name          8494 non-null   object 
 4   loves_count         8494 non-null   int64  
 5   rating              8216 non-null   float64
 6   reviews             8216 non-null   float64
 7   size                6863 non-null   object 
 8   variation_type      7050 non-null   object 
 9   variation_value     6896 non-null   object 
 10  variation_desc      1250 non-null   object 
 11  ingredients         7549 non-null   object 
 12  price_usd           8494 non-null   float64
 13  value_price_usd     451 non-null    float64
 14  sale_price_usd      270 non-null    float64
 15  limited_edition     8494 non-null   int64  
 16  new   
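
The non-null counts above show that some columns are sparsely populated (for example, `sale_price_usd` has 270 non-null values out of 8494 rows, and `value_price_usd` has 451). A minimal sketch of turning such counts into a missing-value share per column, assuming a small made-up frame (`toy`) in place of `product_info`:

```python
import pandas as pd

# Made-up frame for illustration only; column names follow product_info.
toy = pd.DataFrame({
    "price_usd": [10.0, 20.0, 30.0, 40.0],
    "rating": [4.5, None, 4.0, 3.9],
    "sale_price_usd": [None, None, None, 25.0],
})

# Fraction of missing values per column, largest first.
missing_share = toy.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Running the same two calls on `product_info` should rank its columns by how incomplete they are, which may help decide which features are usable for a pricing or rating analysis.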

Metric details:
The 'Child count' (column `child_count`) measures how many variations (options) a given product has. For example, a product can have:
- 6 different colors (child count = 6)
- 3 different sizes, small/medium/large (child count = 3)
- 3 different sizes with 6 color options each (child count = 3 × 6 = 18)
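
The child-count arithmetic in these examples can be sketched as a product of the number of choices per option type. The helper `child_count` below is hypothetical and assumes every combination of options is actually sold, which matches the examples but may not hold for every real product:

```python
from math import prod

def child_count(option_counts):
    """Number of variants given the number of choices per option type.

    Hypothetical helper: assumes all combinations of options exist.
    """
    return prod(option_counts)

print(child_count([6]))     # 6 colors
print(child_count([3]))     # 3 sizes
print(child_count([3, 6]))  # 3 sizes, each in 6 colors
```

The three calls print 6, 3, and 18, matching the examples in the text.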


In [61]:
# Total number of unique products in the dataset (over 8,000 from the Sephora online store)
product_info['product_id'].nunique()

8494

In [63]:
#total brands
product_info['brand_id'].nunique()

304

In [64]:
#total primary categories
product_info['primary_category'].nunique()

9

In [68]:
# Number of distinct secondary categories within each primary category
product_info.groupby('primary_category')['secondary_category'].nunique().sort_values()

primary_category
Gifts               0
Tools & Brushes     4
Fragrance           5
Men                 5
Hair                6
Mini Size           6
Bath & Body         9
Makeup             10
Skincare           13
Name: secondary_category, dtype: int64
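
The `0` for Gifts comes from how `nunique()` works: it ignores missing values by default, so a group whose `secondary_category` is entirely NaN reports zero distinct values. A minimal sketch of this behaviour, using made-up rows (`toy`) rather than the real data:

```python
import pandas as pd

# Made-up rows for illustration only; not taken from product_info.csv.
toy = pd.DataFrame({
    "primary_category": ["Gifts", "Gifts", "Hair", "Hair"],
    "secondary_category": [None, None, "Shampoo & Conditioner", "Tools"],
})

by_primary = toy.groupby("primary_category")["secondary_category"]

# Default behaviour: missing values are not counted as a category.
default_counts = by_primary.nunique()

# dropna=False counts NaN as one distinct value.
counts_with_nan = by_primary.nunique(dropna=False)

print(default_counts)
print(counts_with_nan)
```

With the default, the toy Gifts group reports 0. With `dropna=False` it reports 1, which makes the missing secondary category visible instead of silently dropped.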

In [100]:
#Overview of categories
product_info.groupby(['primary_category', 'secondary_category','tertiary_category']).size()
#.reset_index(name='count').head(134)


# Products rated 4.5 or higher, sorted by price (most expensive first)
product_info[product_info['rating'] >= 4.5].sort_values(by='price_usd', ascending=False)



Unnamed: 0,product_id,product_name,brand_id,brand_name,loves_count,rating,reviews,size,variation_type,variation_value,...,online_only,out_of_stock,sephora_exclusive,highlights,primary_category,secondary_category,tertiary_category,child_count,child_max_price,child_min_price
4313,P461949,The Concentrate Serum,6201,La Mer,7514,4.5187,563.0,1 oz/ 30 mL,Size,1 oz/ 30 mL,...,0,0,0,"['Without Phthalates', 'Good for: Dryness', 'W...",Skincare,Treatments,Face Serums,1,220.0,220.0
6799,P471097,Facial Sculpting Wand,6314,Shani Darden Skin Care,7753,4.5462,130.0,,,,...,1,0,1,"['Vegan', 'Hyaluronic Acid', 'Good for: Loss o...",Skincare,High Tech Tools,Anti-Aging,0,,
5555,P504221,Trinity+ Starter Kit,6001,NuFACE,1085,5.0000,2.0,,,,...,0,0,1,"['allure 2022 Best of Beauty Award Winner', 'H...",Skincare,High Tech Tools,Anti-Aging,0,,
7821,P501479,Bitter Peach Perfume Set,5869,TOM FORD,3644,4.6667,3.0,,,,...,0,0,0,"['Unisex/ Genderless Scent', 'Layerable Scent'...",Fragrance,Value & Gift Sets,Perfume Gift Sets,0,,
7803,P501561,Lost Cherry Perfume Set,5869,TOM FORD,6342,5.0000,3.0,,,,...,0,0,0,"['Unisex/ Genderless Scent', 'Layerable Scent'...",Fragrance,Value & Gift Sets,Perfume Gift Sets,0,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6685,P473724,Vitamin Tonic,3902,SEPHORA COLLECTION,12475,4.6875,48.0,3.38 oz/ 100 mL,Type,Cucumber,...,0,0,1,"['Best for Dry, Combo, Normal Skin', 'Good for...",Skincare,Cleansers,Toners,0,,
6719,P502481,Lip Gloss Ornament,3902,SEPHORA COLLECTION,3899,5.0000,2.0,0.16 oz / 4.8 g,,,...,0,1,1,"['Cream Formula', 'Shimmer Finish']",Makeup,Value & Gift Sets,,0,,
6720,P461200,Clean Watermelon After Sun Mask,3902,SEPHORA COLLECTION,9000,4.6667,27.0,1 Mask,Size,1 Mask,...,0,0,1,"['Vegan', 'Clean at Sephora']",Skincare,Masks,Sheet Masks,0,,
6579,P467138,Beauty on the Fly Reusable Bag,3902,SEPHORA COLLECTION,16473,4.7000,60.0,,,,...,0,0,1,,Tools & Brushes,Beauty Accessories,Makeup & Travel Cases,0,,
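
Several rows in the output above have a rating of 5.0 based on only 2 or 3 reviews, so a rating filter alone can surface products with very little evidence behind them. One possible refinement, sketched on made-up rows, is to also require a minimum number of reviews. The threshold `MIN_REVIEWS` is a hypothetical value, not something the original analysis uses:

```python
import pandas as pd

# Made-up rows for illustration only; column names follow product_info.
toy = pd.DataFrame({
    "product_name": ["A", "B", "C", "D"],
    "rating": [5.0, 4.6, 4.7, 4.2],
    "reviews": [2.0, 563.0, 48.0, 900.0],
    "price_usd": [300.0, 220.0, 12.0, 50.0],
})

MIN_REVIEWS = 30  # hypothetical threshold; tune it to the analysis

# Keep products that are both highly rated and sufficiently reviewed,
# then sort by price, most expensive first.
top_rated = (
    toy[(toy["rating"] >= 4.5) & (toy["reviews"] >= MIN_REVIEWS)]
    .sort_values(by="price_usd", ascending=False)
)
print(top_rated["product_name"].tolist())
```

Here product A is dropped despite its 5.0 rating because it has only 2 reviews, leaving B and C.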
