# Smart Gift Planner

### Objective

Design a high-fidelity mobile app prototype for a Smart Gift Planner. The prototype must demonstrate the user flow from defining a gift recipient to viewing personalized, algorithmic recommendations.

## Import Libraries and Data

### Import Libraries

In [10]:
# Dataframes
import pandas as pd
pd.set_option('display.float_format', '{:.2f}'.format)

# AI
from google import genai
from dotenv import load_dotenv
import time
import os
import math
import ast

# Visualizations
import plotly.express as px

### Import Data

In [2]:
products = pd.read_csv("amazon_products.csv")
categories = pd.read_csv("amazon_categories.csv")

In [3]:
products.head()

Unnamed: 0,asin,title,imgUrl,productURL,stars,reviews,price,listPrice,category_id,isBestSeller,boughtInLastMonth
0,B014TMV5YE,"Sion Softside Expandable Roller Luggage, Black...",https://m.media-amazon.com/images/I/815dLQKYIY...,https://www.amazon.com/dp/B014TMV5YE,4.5,0,139.99,0.0,104,False,2000
1,B07GDLCQXV,Luggage Sets Expandable PC+ABS Durable Suitcas...,https://m.media-amazon.com/images/I/81bQlm7vf6...,https://www.amazon.com/dp/B07GDLCQXV,4.5,0,169.99,209.99,104,False,1000
2,B07XSCCZYG,Platinum Elite Softside Expandable Checked Lug...,https://m.media-amazon.com/images/I/71EA35zvJB...,https://www.amazon.com/dp/B07XSCCZYG,4.6,0,365.49,429.99,104,False,300
3,B08MVFKGJM,Freeform Hardside Expandable with Double Spinn...,https://m.media-amazon.com/images/I/91k6NYLQyI...,https://www.amazon.com/dp/B08MVFKGJM,4.6,0,291.59,354.37,104,False,400
4,B01DJLKZBA,Winfield 2 Hardside Expandable Luggage with Sp...,https://m.media-amazon.com/images/I/61NJoaZcP9...,https://www.amazon.com/dp/B01DJLKZBA,4.5,0,174.99,309.99,104,False,400


In [4]:
categories.head()

Unnamed: 0,id,category_name
0,1,Beading & Jewelry Making
1,2,Fabric Decorating
2,3,Knitting & Crochet Supplies
3,4,Printmaking Supplies
4,5,Scrapbooking & Stamping Supplies


In [5]:
# Merge the 2 dfs on category ID
merged = products.merge(
    categories,
    left_on="category_id",
    right_on="id",
    how="left"
)

# Remove numerical category identifiers
merged = merged.drop(columns=["id", "category_id"])

# Save to new .csv
merged.to_csv("merged_products.csv", index=False, encoding="utf-8")

# Save as JSON Lines (one record per line) for SE/UI/UX
merged.to_json("merged_products.json", orient="records", lines=True)

merged.head()

Unnamed: 0,asin,title,imgUrl,productURL,stars,reviews,price,listPrice,isBestSeller,boughtInLastMonth,category_name
0,B014TMV5YE,"Sion Softside Expandable Roller Luggage, Black...",https://m.media-amazon.com/images/I/815dLQKYIY...,https://www.amazon.com/dp/B014TMV5YE,4.5,0,139.99,0.0,False,2000,Suitcases
1,B07GDLCQXV,Luggage Sets Expandable PC+ABS Durable Suitcas...,https://m.media-amazon.com/images/I/81bQlm7vf6...,https://www.amazon.com/dp/B07GDLCQXV,4.5,0,169.99,209.99,False,1000,Suitcases
2,B07XSCCZYG,Platinum Elite Softside Expandable Checked Lug...,https://m.media-amazon.com/images/I/71EA35zvJB...,https://www.amazon.com/dp/B07XSCCZYG,4.6,0,365.49,429.99,False,300,Suitcases
3,B08MVFKGJM,Freeform Hardside Expandable with Double Spinn...,https://m.media-amazon.com/images/I/91k6NYLQyI...,https://www.amazon.com/dp/B08MVFKGJM,4.6,0,291.59,354.37,False,400,Suitcases
4,B01DJLKZBA,Winfield 2 Hardside Expandable Luggage with Sp...,https://m.media-amazon.com/images/I/61NJoaZcP9...,https://www.amazon.com/dp/B01DJLKZBA,4.5,0,174.99,309.99,False,400,Suitcases
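
A left merge silently keeps products whose `category_id` has no match, and it would duplicate product rows if `categories` contained repeated ids. A minimal sketch of a guard, using pandas' `validate` and `indicator` options on toy frames (the data below is made up for illustration):

```python
import pandas as pd

# Toy stand-ins for the real CSVs (illustrative data only)
products_demo = pd.DataFrame({
    "asin": ["A1", "A2", "A3"],
    "category_id": [104, 104, 999],   # 999 has no matching category
})
categories_demo = pd.DataFrame({
    "id": [104, 1],
    "category_name": ["Suitcases", "Beading & Jewelry Making"],
})

checked = products_demo.merge(
    categories_demo,
    left_on="category_id",
    right_on="id",
    how="left",
    validate="many_to_one",  # raises MergeError if category ids repeat
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Products whose category_id found no match
unmatched = checked[checked["_merge"] == "left_only"]
print(len(unmatched), "product(s) without a category")
```

In this notebook `category_name` has no nulls after the merge, so every product found a category; the guard mainly protects against future changes to the input files.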


## Exploratory Data Analysis

In [6]:
merged.info()
print(merged.isna().sum())
merged.describe()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1426337 entries, 0 to 1426336
Data columns (total 11 columns):
 #   Column             Non-Null Count    Dtype  
---  ------             --------------    -----  
 0   asin               1426337 non-null  object 
 1   title              1426336 non-null  object 
 2   imgUrl             1426337 non-null  object 
 3   productURL         1426337 non-null  object 
 4   stars              1426337 non-null  float64
 5   reviews            1426337 non-null  int64  
 6   price              1426337 non-null  float64
 7   listPrice          1426337 non-null  float64
 8   isBestSeller       1426337 non-null  bool   
 9   boughtInLastMonth  1426337 non-null  int64  
 10  category_name      1426337 non-null  object 
dtypes: bool(1), float64(3), int64(2), object(5)
memory usage: 110.2+ MB
asin                 0
title                1
imgUrl               0
productURL           0
stars                0
reviews              0
price                0
lis

Unnamed: 0,stars,reviews,price,listPrice,boughtInLastMonth
count,1426337.0,1426337.0,1426337.0,1426337.0,1426337.0
mean,4.0,180.75,43.38,12.45,141.98
std,1.34,1761.45,130.29,46.11,836.27
min,0.0,0.0,0.0,0.0,0.0
25%,4.1,0.0,11.99,0.0,0.0
50%,4.4,0.0,19.95,0.0,0.0
75%,4.6,0.0,35.99,0.0,50.0
max,5.0,346563.0,19731.81,999.99,100000.0


In [7]:
# How many products have reviews?
print((merged['reviews'] != 0).sum())

# How many products are best sellers?
print(merged['isBestSeller'].sum())

295834
8520


In [8]:
# Drop row with missing product title
merged = merged.dropna(subset=['title'])

In [20]:
# Check for unique product categories before using AI to categorize them
print(merged['category_name'].unique())
print(merged['category_name'].nunique())

['Suitcases' "Men's Clothing" 'Xbox 360 Games, Consoles & Accessories'
 "Men's Shoes" "Men's Accessories" 'Vacuum Cleaners & Floor Care'
 'Televisions & Video Products' 'Additive Manufacturing Products'
 'Headphones & Earbuds' 'PlayStation Vita Games, Consoles & Accessories'
 'Wii U Games, Consoles & Accessories'
 'PlayStation 4 Games, Consoles & Accessories' "Boys' Watches"
 "Girls' Clothing" "Boys' Clothing" 'Pregnancy & Maternity Products'
 'Shaving & Hair Removal Products' 'Fabric Decorating'
 'Industrial Materials' 'Smart Home: Security Cameras and Systems'
 'Office Electronics' 'Sports & Outdoor Play Toys' "Kids' Play Tractors"
 'Slot Cars, Race Tracks & Accessories' 'Video Games'
 'Smart Home: Voice Assistants and Hubs' 'Light Bulbs' 'Toys & Games'
 "Kids' Furniture" 'Automotive Tires & Wheels'
 'Wellness & Relaxation Products' 'Automotive Tools & Equipment'
 'Baby & Toddler Toys' "Kids' Play Boats" 'Computer Monitors'
 "Girls' Jewelry" 'Luggage' 'Printmaking Supplies' "Women's 

The merged DataFrame contains products from only 248 of the 270 possible product categories. Regardless, I will map all 248 to broader categories.
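
A count like this can be reproduced by comparing the two sets of category names. A small sketch on toy data (the frames here are hypothetical stand-ins for `categories` and `merged`):

```python
import pandas as pd

# Illustrative stand-ins, not the real data
categories_demo = pd.DataFrame(
    {"category_name": ["Suitcases", "Luggage", "Fabric Decorating"]}
)
merged_demo = pd.DataFrame(
    {"category_name": ["Suitcases", "Suitcases", "Luggage"]}
)

all_names = set(categories_demo["category_name"])
used_names = set(merged_demo["category_name"])

# Categories that exist in the lookup table but have no products
unused = sorted(all_names - used_names)
print(f"{len(used_names)} of {len(all_names)} categories have products")
print("No products in:", unused)
```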

## Data Preprocessing

### Broad Product Categorization (Gemini)

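This step is not really needed: I used ChatGPT for the mapping instead (next section). I originally planned to classify every product, but classifying the category names is enough. The cell is kept for reference and is disabled by wrapping it in a string; the output under it shows a run that failed with a 503 error.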

In [None]:
"""
# -----------------------------
# Setup client
# -----------------------------
load_dotenv()
api_key_env = os.getenv("API_KEY")  # from local environment variable
client = genai.Client(api_key=api_key_env)

MODEL_NAME = "gemini-2.5-flash"

# -----------------------------
# Free tier batching parameters
# -----------------------------
batch_size = 5                # number of categories per batch (≤ free tier limit)
sleep_time = 35               # seconds to sleep between batches

# Unique Amazon categories present in merged (248)
amazon_categories = merged['category_name'].unique()
total_categories = len(amazon_categories)
total_batches = (total_categories + batch_size - 1) // batch_size

# Function to categorize into broader categories
def categorize_batch(categories):
    prompt = f"""
#You are an AI assistant. I have a list of Amazon product categories. 
#Please categorize each of the following categories into a broader, high-level category such as 'Electronics', 'Clothing', 'Home', 'Beauty', 'Food', 'Sports', 'Toys', etc. 

#Return the result as a Python dictionary where keys are the original categories and values are the broad category.
#Categories:
#{categories}
"""
    response = client.models.generate_content(
        model=MODEL_NAME,
        contents=prompt
    )
    
    # Access the text
    text_response = response.text.strip()

    # Strip Markdown code fences the model may wrap around the dict
    cleaned_response = (
        text_response
        .replace("```python", "")
        .replace("```", "")
        .strip()
    )

    print(cleaned_response)
    try:
        return ast.literal_eval(cleaned_response)  # convert string dict to Python dict
    except (ValueError, SyntaxError):
        print("Error parsing batch, returning raw text")
        return text_response

# Loop over batches
all_results = {}
for i in range(total_batches):
    start_idx = i * batch_size
    end_idx = min(start_idx + batch_size, total_categories)
    batch = amazon_categories[start_idx:end_idx].tolist()
    
    print(f"Processing batch {i+1}/{total_batches}...")
    batch_result = categorize_batch(batch)
    
    if isinstance(batch_result, dict):
        all_results.update(batch_result)
    else:
        print(f"Batch {i+1} returned invalid format, saving as text")
        all_results[f"batch_{i+1}"] = batch_result
    
    if i < total_batches - 1:
        time.sleep(sleep_time)  # avoid exceeding free-tier rate limit

# Final results
print("Categorization complete!")
print(all_results)

# Use all_results to create a new df column
merged['broad_category'] = merged['category_name'].map(all_results)
"""

Processing batch 1/50...


ServerError: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}
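
A 503 like the one above is usually transient, so one option is to retry the request with exponential backoff instead of aborting the run. A minimal sketch, assuming the API call is wrapped in a zero-argument function; `call_with_retries` and the `flaky` stand-in are hypothetical names, and in real use the `except` clause should name the client's specific server-error exception rather than `Exception`:

```python
import time

def call_with_retries(func, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:  # assumption: narrow this to the API's server error
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)

# Stand-in for an API call that fails twice, then succeeds
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503 UNAVAILABLE")
    return {"Suitcases": "Travel"}

waits = []
# Record the delays instead of actually sleeping, to keep the demo fast
result = call_with_retries(flaky, sleep=waits.append)
print(result)
```

In the batch loop this would wrap the existing call, for example `call_with_retries(lambda: categorize_batch(batch))`.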

### Broad Product Categorization (ChatGPT)

In [None]:
categories_dict = {
    'Suitcases': 'Travel',
    "Men's Clothing": 'Clothing',
    'Xbox 360 Games, Consoles & Accessories': 'Electronics',
    "Men's Shoes": 'Clothing',
    "Men's Accessories": 'Clothing',
    'Vacuum Cleaners & Floor Care': 'Home',
    'Televisions & Video Products': 'Electronics',
    'Additive Manufacturing Products': 'Industrial',
    'Headphones & Earbuds': 'Electronics',
    'PlayStation Vita Games, Consoles & Accessories': 'Electronics',
    'Wii U Games, Consoles & Accessories': 'Electronics',
    'PlayStation 4 Games, Consoles & Accessories': 'Electronics',
    "Boys' Watches": 'Clothing',
    "Girls' Clothing": 'Clothing',
    "Boys' Clothing": 'Clothing',
    'Pregnancy & Maternity Products': 'Baby',
    'Shaving & Hair Removal Products': 'Beauty',
    'Fabric Decorating': 'Arts & Crafts',
    'Industrial Materials': 'Industrial',
    'Smart Home: Security Cameras and Systems': 'Smart Home',
    'Office Electronics': 'Electronics',
    'Sports & Outdoor Play Toys': 'Toys',
    "Kids' Play Tractors": 'Toys',
    'Slot Cars, Race Tracks & Accessories': 'Toys',
    'Video Games': 'Electronics',
    'Smart Home: Voice Assistants and Hubs': 'Smart Home',
    'Light Bulbs': 'Home',
    'Toys & Games': 'Toys',
    "Kids' Furniture": 'Home',
    'Automotive Tires & Wheels': 'Automotive',
    'Wellness & Relaxation Products': 'Health',
    'Automotive Tools & Equipment': 'Automotive',
    'Baby & Toddler Toys': 'Baby',
    "Kids' Play Boats": 'Toys',
    'Computer Monitors': 'Electronics',
    "Girls' Jewelry": 'Clothing',
    'Luggage': 'Travel',
    'Printmaking Supplies': 'Arts & Crafts',
    "Women's Handbags": 'Clothing',
    'Foot, Hand & Nail Care Products': 'Beauty',
    'Baby & Toddler Feeding Supplies': 'Baby',
    'Computers': 'Electronics',
    'Home Décor Products': 'Home',
    'Industrial Hardware': 'Industrial',
    'Automotive Exterior Accessories': 'Automotive',
    'Skin Care Products': 'Beauty',
    'Wearable Technology': 'Electronics',
    'Reptiles & Amphibian Supplies': 'Pet Supplies',
    'Smart Home: Lawn and Garden': 'Smart Home',
    'Horse Supplies': 'Pet Supplies',
    "Kids' Party Supplies": 'Party Supplies',
    'Tablet Replacement Parts': 'Electronics',
    'Baby Care Products': 'Baby',
    "Kids' Electronics": 'Electronics',
    'Beading & Jewelry Making': 'Arts & Crafts',
    'Computer External Components': 'Electronics',
    'Furniture': 'Home',
    'Nintendo Switch Consoles, Games & Accessories': 'Electronics',
    'Party Decorations': 'Party Supplies',
    'Accessories & Supplies': 'Electronics',
    'Industrial Power & Hand Tools': 'Industrial',
    'Sexual Wellness Products': 'Health',
    'Perfumes & Fragrances': 'Beauty',
    'Household Cleaning Supplies': 'Home',
    "Kids' Play Cars & Race Cars": 'Toys',
    'Pet Bird Supplies': 'Pet Supplies',
    'Janitorial & Sanitation Supplies': 'Industrial',
    'Electrical Equipment': 'Industrial',
    'Dolls & Accessories': 'Toys',
    'RV Parts & Accessories': 'Automotive',
    "Kids' Play Buses": 'Toys',
    'Health & Household': 'Health',
    'Baby Stationery': 'Baby',
    "Girls' School Uniforms": 'Clothing',
    'Travel Duffel Bags': 'Travel',
    'Electronic Components': 'Electronics',
    'Mac Games & Accessories': 'Electronics',
    'Bath Products': 'Beauty',
    'Security & Surveillance Equipment': 'Electronics',
    'Bedding': 'Home',
    'Baby & Child Care Products': 'Baby',
    'Computers & Tablets': 'Electronics',
    'Smart Home: Smart Locks and Entry': 'Smart Home',
    'Smart Home: WiFi and Networking': 'Smart Home',
    'Smart Home: Lighting': 'Smart Home',
    'Portable Audio & Video': 'Electronics',
    'Lighting & Ceiling Fans': 'Home',
    'PC Games & Accessories': 'Electronics',
    'Wall Art': 'Home',
    "Baby Girls' Clothing & Shoes": 'Baby',
    'Vehicle Electronics': 'Automotive',
    'Oral Care Products': 'Health',
    'Needlework Supplies': 'Arts & Crafts',
    'Games & Accessories': 'Electronics',
    "Girls' Watches": 'Clothing',
    "Girls' Shoes": 'Clothing',
    'Hair Care Products': 'Beauty',
    'Packaging & Shipping Supplies': 'Industrial',
    "Kids' Dress Up & Pretend Play": 'Toys',
    'Gift Wrapping Supplies': 'Party Supplies',
    'Virtual Reality Hardware & Accessories': 'Electronics',
    'Health Care Products': 'Health',
    'Finger Toys': 'Toys',
    "Kids' Play Trains & Trams": 'Toys',
    "Baby Boys' Clothing & Shoes": 'Baby',
    'Home Appliances': 'Home Appliances',
    'Tricycles, Scooters & Wagons': 'Toys',
    'Fasteners': 'Industrial',
    'Video Projectors': 'Electronics',
    'Vision Products': 'Health',
    'Arts, Crafts & Sewing Storage': 'Arts & Crafts',
    'Science Education Supplies': 'Education',
    'Camera & Photo': 'Electronics',
    'Home Storage & Organization': 'Home',
    'Safety & Security': 'Industrial',
    'Data Storage': 'Electronics',
    'Material Handling Products': 'Industrial',
    'eBook Readers & Accessories': 'Electronics',
    'Baby Strollers & Accessories': 'Baby',
    'Seasonal Décor': 'Home',
    'Heating, Cooling & Air Quality': 'Home Appliances',
    'Lab & Scientific Products': 'Industrial',
    'Smart Home: Other Solutions': 'Smart Home',
    'Smart Home - Heating & Cooling': 'Smart Home',
    'Sports & Fitness': 'Sports & Outdoors',
    'Building Supplies': 'Tools & Home Improvement',
    'Building Toys': 'Toys',
    'Xbox One Games, Consoles & Accessories': 'Electronics',
    "Men's Watches": 'Clothing',
    'Novelty Toys & Amusements': 'Toys',
    'Cutting Tools': 'Tools & Home Improvement',
    'Laptop Accessories': 'Electronics',
    'Industrial & Scientific': 'Industrial',
    'Household Supplies': 'Home',
    "Boys' Jewelry": 'Clothing',
    'Filtration': 'Industrial',
    'Small Animal Supplies': 'Pet Supplies',
    'Toy Figures & Playsets': 'Toys',
    'PlayStation 3 Games, Consoles & Accessories': 'Electronics',
    'Xbox Series X & S Consoles, Games & Accessories': 'Electronics',
    'Smart Home: Home Entertainment': 'Smart Home',
    'Professional Dental Supplies': 'Industrial',
    "Women's Accessories": 'Clothing',
    'Heavy Duty & Commercial Vehicle Equipment': 'Automotive',
    'Computer Components': 'Electronics',
    'Baby Activity & Entertainment Products': 'Baby',
    'Painting, Drawing & Art Supplies': 'Arts & Crafts',
    'Lights, Bulbs & Indicators': 'Automotive',
    'Welding & Soldering': 'Industrial',
    'Home Use Medical Supplies & Equipment': 'Health',
    "Women's Clothing": 'Clothing',
    'Knitting & Crochet Supplies': 'Arts & Crafts',
    'Commercial Door Products': 'Industrial',
    'Automotive Enthusiast Merchandise': 'Automotive',
    'Sports Nutrition Products': 'Health',
    'Beauty & Personal Care': 'Beauty',
    'Makeup': 'Beauty',
    'Baby': 'Baby',
    'Learning & Education Toys': 'Toys',
    'GPS & Navigation': 'Electronics',
    'Motorcycle & Powersports': 'Automotive',
    'Video Game Consoles & Accessories': 'Electronics',
    "Boys' School Uniforms": 'Clothing',
    'Online Video Game Services': 'Electronics',
    'PlayStation 5 Consoles, Games & Accessories': 'Electronics',
    'Smart Home: Plugs and Outlets': 'Smart Home',
    'Smart Home: Vacuums and Mops': 'Smart Home',
    'Outdoor Recreation': 'Sports & Outdoors',
    'Sony PSP Games, Consoles & Accessories': 'Electronics',
    'Sports & Outdoors': 'Sports & Outdoors',
    'Kitchen & Bath Fixtures': 'Home',
    'Gift Cards': 'Gift Cards',   # manually set to match the category name instead of 'Other'
    "Women's Jewelry": 'Clothing',
    'Oils & Fluids': 'Automotive',
    'Toilet Training Products': 'Baby',
    'Baby Safety Products': 'Baby',
    'Messenger Bags': 'Travel',
    "Boys' Accessories": 'Clothing',
    'Garment Bags': 'Travel',
    'Nursery Furniture, Bedding & Décor': 'Baby',
    'Power Transmission Products': 'Industrial',
    'Kitchen & Dining': 'Home',
    'Beauty Tools & Accessories': 'Beauty',
    'Computer Servers': 'Electronics',
    'Hydraulics, Pneumatics & Plumbing': 'Industrial',
    'Party Supplies': 'Party Supplies',
    'Dog Supplies': 'Pet Supplies',
    'Occupational Health & Safety Products': 'Industrial',
    'Cell Phones & Accessories': 'Electronics',
    'Craft & Hobby Fabric': 'Arts & Crafts',
    'Child Safety Car Seats & Accessories': 'Baby',
    'Diet & Sports Nutrition': 'Health',
    'Industrial Adhesives, Sealants & Lubricants': 'Industrial',
    "Kids' Home Store": 'Home',
    'Cat Supplies': 'Pet Supplies',
    'Hardware': 'Tools & Home Improvement',
    'Arts & Crafts Supplies': 'Arts & Crafts',
    "Women's Watches": 'Clothing',
    'Professional Medical Supplies': 'Industrial',
    'Toy Vehicle Playsets': 'Toys',
    'Sewing Products': 'Arts & Crafts',
    'Abrasive & Finishing Products': 'Industrial',
    'Computer Networking': 'Electronics',
    'Home Lighting & Ceiling Fans': 'Home',
    'Ironing Products': 'Home',
    'Retail Store Fixtures & Equipment': 'Industrial',
    'Home Audio & Theater Products': 'Electronics',
    "Boys' Shoes": 'Clothing',
    'Tools & Home Improvement': 'Tools & Home Improvement',
    'Fish & Aquatic Pets': 'Pet Supplies',
    'Pumps & Plumbing Equipment': 'Industrial',
    'Nintendo DS Games, Consoles & Accessories': 'Electronics',
    'Wii Games, Consoles & Accessories': 'Electronics',
    'Power Tools & Hand Tools': 'Tools & Home Improvement',
    'Smart Home Thermostats - Compatibility Checker': 'Smart Home',
    "Women's Shoes": 'Clothing',
    'Car Care': 'Automotive',
    'Baby Travel Gear': 'Baby',
    'Baby Gifts': 'Baby',
    'Luggage Sets': 'Travel',
    'Automotive Paint & Paint Supplies': 'Automotive',
    'Personal Care Products': 'Beauty',
    'Baby Diapering Products': 'Baby',
    'Puzzles': 'Toys',
    'Nintendo 3DS & 2DS Consoles, Games & Accessories': 'Electronics',
    'Legacy Systems': 'Electronics',
    'Measuring & Layout': 'Tools & Home Improvement',
    'Smart Home: New Smart Devices': 'Smart Home',
    'Stuffed Animals & Plush Toys': 'Toys',
    "Kids' Play Trucks": 'Toys',
    'Craft Supplies & Materials': 'Arts & Crafts',
    'Automotive Performance Parts & Accessories': 'Automotive',
    'Automotive Replacement Parts': 'Automotive',
    'Puppets & Puppet Theaters': 'Toys',
    'Tablet Accessories': 'Electronics',
    "Girls' Accessories": 'Clothing',
    'Laptop Bags': 'Travel',
    'Backpacks': 'Travel',
    'Scrapbooking & Stamping Supplies': 'Arts & Crafts',
    'Food Service Equipment & Supplies': 'Industrial',
    'Test, Measure & Inspect': 'Industrial',
    'Travel Tote Bags': 'Travel',
    'Automotive Interior Accessories': 'Automotive',
    'Paint, Wall Treatments & Supplies': 'Tools & Home Improvement',
    'Rain Umbrellas': 'Travel',
    'Travel Accessories': 'Travel',
    'Stationery & Gift Wrapping Supplies': 'Party Supplies',
    'Car Electronics & Accessories': 'Automotive',
}

# Use categories_dict results to create a new df column
merged['broad_category'] = merged['category_name'].map(categories_dict)

# Check
merged.head()

Unnamed: 0,asin,title,imgUrl,productURL,stars,reviews,price,listPrice,isBestSeller,boughtInLastMonth,category_name,broad_category
0,B014TMV5YE,"Sion Softside Expandable Roller Luggage, Black...",https://m.media-amazon.com/images/I/815dLQKYIY...,https://www.amazon.com/dp/B014TMV5YE,4.5,0,139.99,0.0,False,2000,Suitcases,Travel
1,B07GDLCQXV,Luggage Sets Expandable PC+ABS Durable Suitcas...,https://m.media-amazon.com/images/I/81bQlm7vf6...,https://www.amazon.com/dp/B07GDLCQXV,4.5,0,169.99,209.99,False,1000,Suitcases,Travel
2,B07XSCCZYG,Platinum Elite Softside Expandable Checked Lug...,https://m.media-amazon.com/images/I/71EA35zvJB...,https://www.amazon.com/dp/B07XSCCZYG,4.6,0,365.49,429.99,False,300,Suitcases,Travel
3,B08MVFKGJM,Freeform Hardside Expandable with Double Spinn...,https://m.media-amazon.com/images/I/91k6NYLQyI...,https://www.amazon.com/dp/B08MVFKGJM,4.6,0,291.59,354.37,False,400,Suitcases,Travel
4,B01DJLKZBA,Winfield 2 Hardside Expandable Luggage with Sp...,https://m.media-amazon.com/images/I/61NJoaZcP9...,https://www.amazon.com/dp/B01DJLKZBA,4.5,0,174.99,309.99,False,400,Suitcases,Travel
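
`Series.map` with a dict returns `NaN` for any key the dict lacks, so a mistyped or missing category name fails silently. A quick check, sketched on toy data (the frame and dict below are illustrative, not the real mapping):

```python
import pandas as pd

demo = pd.DataFrame(
    {"category_name": ["Suitcases", "Luggage", "Puzzles", "Luggage"]}
)
demo_map = {"Suitcases": "Travel", "Puzzles": "Toys"}  # 'Luggage' deliberately missing

demo["broad_category"] = demo["category_name"].map(demo_map)

# Category names that did not receive a broad category
unmapped = (
    demo.loc[demo["broad_category"].isna(), "category_name"].unique().tolist()
)
print("Unmapped categories:", unmapped)
```

Running the same two lines against `merged` and `categories_dict` should print an empty list if every category is covered.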


## Create Final Dataset .csv and .json

In [37]:
# Save to final .csv
merged.to_csv("final_products.csv", index=False, encoding="utf-8")

# Save final JSON Lines file (one record per line) for SE/UI/UX
merged.to_json("final_products.json", orient="records", lines=True)
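
With `lines=True`, `to_json` writes JSON Lines (one JSON object per line) rather than a single JSON array, so consumers need to parse the file line by line. A small round-trip sketch using an in-memory buffer and made-up rows:

```python
import io
import json

import pandas as pd

demo = pd.DataFrame({
    "asin": ["A1", "A2"],
    "price": [139.99, 169.99],
    "broad_category": ["Travel", "Travel"],
})

# Without a path, to_json returns the text that would be written to disk
text = demo.to_json(orient="records", lines=True)

# Each non-empty line is an independent JSON object
records = [json.loads(line) for line in text.splitlines() if line]
print(records[0])

# pandas reads the same format back with lines=True
restored = pd.read_json(io.StringIO(text), lines=True)
```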

## Remove products no longer sold, rows with missing image links, etc.?

Not sure if needed.
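
One way to decide is to count how many rows would be affected first. A sketch on toy rows, assuming (hypothetically) that a `price` of 0 or a blank `imgUrl` marks a listing that cannot be shown; `describe()` does report a minimum `price` of 0, but whether that means "no longer sold" in this dataset is untested:

```python
import pandas as pd

# Made-up rows for illustration
demo = pd.DataFrame({
    "asin": ["A1", "A2", "A3", "A4"],
    "price": [139.99, 0.0, 19.95, 0.0],
    "imgUrl": [
        "https://example.com/a.jpg",
        "https://example.com/b.jpg",
        "",
        None,
    ],
})

no_price = demo["price"] == 0
no_image = demo["imgUrl"].isna() | (demo["imgUrl"].str.strip() == "")

print("price == 0:", int(no_price.sum()))
print("missing image:", int(no_image.sum()))

# Rows that pass both checks
displayable = demo[~no_price & ~no_image]
```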

## Embeddings

## Visualizations for Dashboard

### Number of Items per Category

In [29]:
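# Count products per broad category; naming the columns explicitly keeps
# this working across pandas versions
category_counts = (
    merged['broad_category']
    .value_counts()
    .rename_axis('broad_category')
    .reset_index(name='count')
)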

fig = px.bar(category_counts, 
             x='broad_category',
             y='count',
             color='broad_category',
             title="Products per Category")
#fig.show()
fig.write_html("items_per_broad_category.html")

### Price Distribution

In [None]:
max_price = 300
filtered = merged[merged['price'] <= max_price]

fig = px.histogram(filtered, 
                   x='price', 
                   nbins=100, 
                   title=f"Price Distribution (<= ${max_price})")
#fig.show()
fig.write_html("price_distribution.html")
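
`px.histogram` embeds every row it is given in the HTML file, which can get large with more than a million products. One alternative is to bin the prices first and plot only the counts. A sketch with NumPy on made-up prices (`bin_prices` is a hypothetical helper, not part of the notebook):

```python
import numpy as np
import pandas as pd

def bin_prices(prices, max_price=300, nbins=100):
    """Count prices in nbins equal-width bins over [0, max_price]."""
    kept = prices[prices <= max_price]
    counts, edges = np.histogram(kept, bins=nbins, range=(0, max_price))
    return pd.DataFrame({"bin_start": edges[:-1], "count": counts})

# Illustrative prices; the last one is above max_price and is dropped
demo_prices = pd.Series([11.99, 19.95, 35.99, 139.99, 365.49])
binned = bin_prices(demo_prices, max_price=300, nbins=3)
print(binned)
# The small frame can then be drawn as bars,
# e.g. px.bar(binned, x="bin_start", y="count")
```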

### Rating vs. Reviews

In [30]:
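# A log axis cannot show 0, so keep only products with at least one review
reviewed = merged[merged['reviews'] > 0]

fig = px.scatter(
    reviewed,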
    x='stars',
    y='reviews',
    size='reviews',
    color='broad_category',
    hover_data=['title', 'price', 'stars', 'reviews'],
    title="Product Ratings vs Review Count",
    size_max=30
)

fig.update_layout(yaxis_type='log')  # log scale: review counts span several orders of magnitude
#fig.show()
fig.write_html("rating_vs_reviews_broad_category.html")