This repository contains a data analysis project performed using a Jupyter Notebook to extract meaningful insights from Amazon sales data.
The project focuses on analyzing Amazon product data, including pricing, discounts, ratings, and categories. It involves data cleaning, exploratory data analysis (EDA), and visualization to understand trends and patterns.
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- WordCloud
- Contains 1465 records with 16 features
- Includes:
  - Product Name
  - Category
  - Price (Original & Discounted)
  - Rating & Rating Count
  - Discount Percentage
- Data Cleaning (handling missing values, duplicates)
- Data Type Conversion (price & percentage columns)
- Exploratory Data Analysis (EDA)
- Data Visualization (graphs & word clouds)
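The cleaning and type-conversion steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`discounted_price`, `actual_price`, `discount_percentage`, `rating`, `rating_count`), the raw string formats, and the helper `clean_sales` are all assumptions, and the small in-memory frame stands in for `amazon.csv`.

```python
import pandas as pd

# Tiny stand-in for amazon.csv; column names and raw formats are assumptions.
raw = pd.DataFrame({
    "product_name": ["Cable A", "Cable A", "Kettle B", "Mouse C"],
    "discounted_price": ["₹399", "₹399", "₹1,299", "₹549"],
    "actual_price": ["₹1,099", "₹1,099", "₹2,000", "₹999"],
    "discount_percentage": ["64%", "64%", "35%", "45%"],
    "rating": ["4.2", "4.2", "4.0", None],
    "rating_count": ["24,269", "24,269", "1,050", "310"],
})


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, cast text columns to numbers,
    and drop rows that have no usable rating."""
    out = df.drop_duplicates().copy()
    for col in ["discounted_price", "actual_price", "rating_count"]:
        # Strip the currency symbol and thousands separators before casting.
        out[col] = out[col].str.replace("[₹,]", "", regex=True).astype(float)
    out["discount_percentage"] = (
        out["discount_percentage"].str.rstrip("%").astype(float)
    )
    # errors="coerce" turns unparseable ratings into NaN instead of raising.
    out["rating"] = pd.to_numeric(out["rating"], errors="coerce")
    return out.dropna(subset=["rating"]).reset_index(drop=True)


clean = clean_sales(raw)
print(clean.dtypes)
```

With the real file, `raw` would instead come from `pd.read_csv("amazon.csv")`. Dropping rows with a missing rating is one possible policy; imputing a value is an equally valid choice depending on the analysis.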
- The largest share of products (~35%) falls in the 41–60% discount range
- Average discounted price: ₹3129.98
- Average discount: 47.67%
- Weak correlation (0.12) between price and rating
- Top categories: Computers & Accessories, Home & Kitchen
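Figures like the ones above could be derived along these lines. This is a sketch under assumptions: the column names are hypothetical, the 20-point discount ranges are one plausible binning, and the sample values are illustrative only, so the numbers it prints will not match the reported results.

```python
import pandas as pd

# Illustrative, already-cleaned values; not taken from the real dataset.
df = pd.DataFrame({
    "discounted_price": [399.0, 1299.0, 549.0, 8999.0, 249.0, 15999.0],
    "discount_percentage": [64.0, 35.0, 45.0, 50.0, 80.0, 10.0],
    "rating": [4.2, 4.0, 3.9, 4.4, 3.8, 4.3],
})

# Bucket discounts into 20-point ranges: [0, 20], (20, 40], (40, 60], ...
bins = [0, 20, 40, 60, 80, 100]
labels = ["0-20%", "21-40%", "41-60%", "61-80%", "81-100%"]
df["discount_range"] = pd.cut(
    df["discount_percentage"], bins=bins, labels=labels, include_lowest=True
)

# Share of products in each range, as a percentage of all products.
share = df["discount_range"].value_counts(normalize=True).sort_index() * 100

summary = {
    "avg_discounted_price": df["discounted_price"].mean(),
    "avg_discount": df["discount_percentage"].mean(),
    # Pearson correlation between price and rating.
    "price_rating_corr": df["discounted_price"].corr(df["rating"]),
}

print(share.round(1))
print(summary)
```

A correlation near zero, such as the 0.12 reported above, means that price explains very little of the variation in rating.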
- Category-wise rating analysis
- Discount distribution plots
- Word clouds for product keywords
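The first two plot types can be sketched with pandas and Matplotlib as shown here. The column names, sample rows, and output filename are assumptions made for illustration, and the notebook itself may build these charts differently (for example with Seaborn).

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only; category labels follow the README's wording.
df = pd.DataFrame({
    "category": [
        "Computers & Accessories", "Home & Kitchen",
        "Computers & Accessories", "Electronics", "Home & Kitchen",
    ],
    "rating": [4.2, 4.0, 4.4, 3.9, 4.1],
    "discount_percentage": [64.0, 35.0, 50.0, 45.0, 20.0],
})

# Mean rating per category, highest first.
category_rating = (
    df.groupby("category")["rating"].mean().sort_values(ascending=False)
)

fig, (ax_rating, ax_discount) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: category-wise rating as a horizontal bar chart.
category_rating.plot.barh(ax=ax_rating)
ax_rating.set_xlabel("Mean rating")
ax_rating.set_title("Category-wise rating")

# Right panel: discount distribution in 20-point buckets.
ax_discount.hist(
    df["discount_percentage"], bins=[0, 20, 40, 60, 80, 100], edgecolor="black"
)
ax_discount.set_xlabel("Discount (%)")
ax_discount.set_ylabel("Number of products")
ax_discount.set_title("Discount distribution")

fig.tight_layout()
fig.savefig("category_and_discount.png")  # hypothetical output filename
```

Inside a notebook, `plt.show()` would typically replace the `savefig` call and the `Agg` backend line can be dropped.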
- Clone the repository
- Place `amazon.csv` in the root directory
- Open and run `Amazon-Sales-Analysis.ipynb` in Jupyter Notebook
- Build a recommendation system
- Apply machine learning for price prediction
- Deploy as a dashboard (Streamlit/Power BI)
Kunal BCA Final Year Student | Aspiring Data Scientist