# **Introduction**

The e-commerce industry thrives on customer feedback, and Amazon product reviews play a crucial role in influencing purchasing decisions. However, raw review data is often unstructured, containing inconsistencies, missing values, and duplicate entries. Analyzing this data can provide insights into customer preferences, pricing effectiveness, and product ratings.
This project focuses on cleaning, analyzing, and visualizing Amazon product reviews using SQL and Power BI. It highlights the importance of data preprocessing for accurate business insights and presents a structured workflow that can be applied to other industries.

# **Objective**

The aim of this project is to transform raw Amazon product review data into meaningful insights through data cleaning, analysis, and visualization. By leveraging SQL for data preprocessing and Power BI for interactive dashboards, this project demonstrates the ability to:

✅ Clean and preprocess raw review data to remove inconsistencies and improve data quality.

✅ Analyze customer ratings and reviews to identify patterns and trends.

✅ Examine the relationship between pricing, discounts, and product ratings to understand purchasing behavior.

✅ Develop an interactive Power BI dashboard to present insights in a visually appealing and business-friendly manner.

#### **1. Import Packages**


In [22]:
import sqlite3  # For SQLite database interaction
import pandas as pd  # For handling data


#### **2. Load Data**

In [23]:
df = pd.read_csv('C:\\Users\\vero9\\Desktop\\Amazon-reviews-analysis\\amazon.csv')


In [24]:
# Connect to SQLite database (it will create a new database file if it doesn't exist)
conn = sqlite3.connect('amazon_reviews.db')

In [25]:
# Write the DataFrame to an SQLite table
df.to_sql('amazon_reviews', conn, if_exists='replace', index=False)
conn.close()
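
A quick sanity check after the write step is to list the tables in the database and count the rows that landed in `amazon_reviews`. The sketch below is only an illustration: it uses an in-memory database and two stand-in rows (IDs and ratings taken from the sample output further down) instead of the real `amazon_reviews.db`, so it runs without the CSV file.

```python
import sqlite3

# Stand-in for amazon_reviews.db: an in-memory database with a tiny table,
# so the check runs without the real CSV (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amazon_reviews (product_id TEXT, rating TEXT)")
conn.executemany(
    "INSERT INTO amazon_reviews VALUES (?, ?)",
    [("B07JW9H4J1", "4.2"), ("B098NS6PVG", "4.0")],
)
conn.commit()

# List the tables SQLite knows about, then count the rows in ours.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
row_count = conn.execute("SELECT COUNT(*) FROM amazon_reviews").fetchone()[0]

print(tables, row_count)
conn.close()
```

Against the real database file, the same two queries should report the `amazon_reviews` table and a row count matching the length of `df`.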

In [None]:
%load_ext sql



The sql extension is already loaded. To reload it, use:
  %reload_ext sql
MetaData.__init__() got an unexpected keyword argument 'bind'
Connection info needed in SQLAlchemy format, example:
               postgresql://username:password@hostname/dbname
               or an existing connection: dict_keys([])


#### **3. Data Querying**

In [29]:
# View the first few rows of the dataset
df.head()

,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JW9H4J1,Wayona Nylon Braided USB to Lightning Fast Cha...,Computers&Accessories|Accessories&Peripherals|...,₹399,"₹1,099",64%,4.2,24269,High Compatibility : Compatible With iPhone 12...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/W/WEBP_40237...,https://www.amazon.in/Wayona-Braided-WN3LG1-Sy...
1,B098NS6PVG,Ambrane Unbreakable 60W / 3A Fast Charging 1.5...,Computers&Accessories|Accessories&Peripherals|...,₹199,₹349,43%,4.0,43994,"Compatible with all Type C enabled devices, be...","AECPFYFQVRUWC3KGNLJIOREFP5LQ,AGYYVPDD7YG7FYNBX...","ArdKn,Nirbhay kumar,Sagar Viswanathan,Asp,Plac...","RGIQEG07R9HS2,R1SMWZQ86XIN8U,R2J3Y1WL29GWDE,RY...","A Good Braided Cable for Your Type C Device,Go...",I ordered this cable to connect my phone to An...,https://m.media-amazon.com/images/W/WEBP_40237...,https://www.amazon.in/Ambrane-Unbreakable-Char...
2,B096MSW6CT,Sounce Fast Phone Charging Cable & Data Sync U...,Computers&Accessories|Accessories&Peripherals|...,₹199,"₹1,899",90%,3.9,7928,【 Fast Charger& Data Sync】-With built-in safet...,"AGU3BBQ2V2DDAMOAKGFAWDDQ6QHA,AESFLDV2PT363T2AQ...","Kunal,Himanshu,viswanath,sai niharka,saqib mal...","R3J3EQQ9TZI5ZJ,R3E7WBGK7ID0KV,RWU79XKQ6I1QF,R2...","Good speed for earlier versions,Good Product,W...","Not quite durable and sturdy,https://m.media-a...",https://m.media-amazon.com/images/W/WEBP_40237...,https://www.amazon.in/Sounce-iPhone-Charging-C...
3,B08HDJ86NZ,boAt Deuce USB 300 2 in 1 Type-C & Micro USB S...,Computers&Accessories|Accessories&Peripherals|...,₹329,₹699,53%,4.2,94363,The boAt Deuce USB 300 2 in 1 cable is compati...,"AEWAZDZZJLQUYVOVGBEUKSLXHQ5A,AG5HTSFRRE6NL3M5S...","Omkar dhale,JD,HEMALATHA,Ajwadh a.,amar singh ...","R3EEUZKKK9J36I,R3HJVYCLYOY554,REDECAZ7AMPQC,R1...","Good product,Good one,Nice,Really nice product...","Good product,long wire,Charges good,Nice,I bou...",https://m.media-amazon.com/images/I/41V5FtEWPk...,https://www.amazon.in/Deuce-300-Resistant-Tang...
4,B08CF3B7N1,Portronics Konnect L 1.2M Fast Charging 3A 8 P...,Computers&Accessories|Accessories&Peripherals|...,₹154,₹399,61%,4.2,16905,[CHARGE & SYNC FUNCTION]- This cable comes wit...,"AE3Q6KSUK5P75D5HFYHCRAOLODSA,AFUGIFH5ZAFXRDSZH...","rahuls6099,Swasat Borah,Ajay Wadke,Pranali,RVK...","R1BP4L2HH9TFUP,R16PVJEXKV6QZS,R2UPDB81N66T4P,R...","As good as original,Decent,Good one for second...","Bought this instead of original apple, does th...",https://m.media-amazon.com/images/W/WEBP_40237...,https://www.amazon.in/Portronics-Konnect-POR-1...


In [30]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1465 entries, 0 to 1464
Data columns (total 16 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   product_id           1465 non-null   object
 1   product_name         1465 non-null   object
 2   category             1465 non-null   object
 3   discounted_price     1465 non-null   object
 4   actual_price         1465 non-null   object
 5   discount_percentage  1465 non-null   object
 6   rating               1465 non-null   object
 7   rating_count         1463 non-null   object
 8   about_product        1465 non-null   object
 9   user_id              1465 non-null   object
 10  user_name            1465 non-null   object
 11  review_id            1465 non-null   object
 12  review_title         1465 non-null   object
 13  review_content       1465 non-null   object
 14  img_link             1465 non-null   object
 15  product_link         1465 non-null   object
dtypes: object(16)
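
Every column is stored as `object` (text), including prices such as `₹1,099` and percentages such as `64%`, so numeric analysis needs a conversion step first. One possible way to do this in SQL is sketched below, assuming the only non-numeric characters are the currency symbol, thousands separators, and the percent sign. The output column names (`discounted_price_num` and so on) are hypothetical, and the two rows are copied from the sample output above into an in-memory table.

```python
import sqlite3

# In-memory stand-in holding two sample rows from the df.head() output.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amazon_reviews ("
    "discounted_price TEXT, actual_price TEXT, "
    "discount_percentage TEXT, rating TEXT)"
)
conn.executemany(
    "INSERT INTO amazon_reviews VALUES (?, ?, ?, ?)",
    [("₹399", "₹1,099", "64%", "4.2"), ("₹199", "₹349", "43%", "4.0")],
)

# Strip the currency symbol, commas, and percent sign, then cast to REAL.
# The *_num aliases are illustrative names, not part of the original dataset.
query = """
SELECT
    CAST(REPLACE(REPLACE(discounted_price, '₹', ''), ',', '') AS REAL) AS discounted_price_num,
    CAST(REPLACE(REPLACE(actual_price, '₹', ''), ',', '') AS REAL) AS actual_price_num,
    CAST(REPLACE(discount_percentage, '%', '') AS REAL) AS discount_pct_num,
    CAST(rating AS REAL) AS rating_num
FROM amazon_reviews
ORDER BY rowid;
"""
cleaned = conn.execute(query).fetchall()

for row in cleaned:
    print(row)
conn.close()
```

Note that SQLite's `CAST(... AS REAL)` returns `0.0` for text that does not start with a number, so any unexpected values should be checked before relying on the converted columns.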

In [9]:
# Connect to the SQLite database (no need to mention the database name in queries)
conn = sqlite3.connect('amazon_reviews.db')

# Get the column names from the amazon_reviews table
query_columns = "PRAGMA table_info(amazon_reviews);"
columns_info = pd.read_sql(query_columns, conn)

# Extract column names from the result
columns = columns_info['name']

# Loop through each column and count its missing values (NULL).
# Column names are double-quoted so names with spaces or reserved words still work.
for column in columns:
    query = f"""
    SELECT COUNT(*) AS missing_count
    FROM amazon_reviews
    WHERE "{column}" IS NULL;
    """
    missing_values = pd.read_sql(query, conn)
    print(f"Missing values in {column}: {missing_values.iloc[0, 0]}")



Missing values in product_id: 0
Missing values in product_name: 0
Missing values in category: 0
Missing values in discounted_price: 0
Missing values in actual_price: 0
Missing values in discount_percentage: 0
Missing values in rating: 0
Missing values in rating_count: 2
Missing values in about_product: 0
Missing values in user_id: 0
Missing values in user_name: 0
Missing values in review_id: 0
Missing values in review_title: 0
Missing values in review_content: 0
Missing values in img_link: 0
Missing values in product_link: 0
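
The per-column loop above issues one query per column. As an alternative sketch, the same counts can be gathered in a single pass by building one `SELECT` with a `SUM(... IS NULL)` term per column. The table below is a made-up three-row stand-in (the `P1` to `P3` IDs and the placement of the NULLs are invented for illustration), not the real data.

```python
import sqlite3

# Made-up stand-in table: two of the three rows lack a rating_count.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amazon_reviews (product_id TEXT, rating TEXT, rating_count TEXT)"
)
conn.executemany(
    "INSERT INTO amazon_reviews VALUES (?, ?, ?)",
    [("P1", "4.2", "24269"), ("P2", "4.0", None), ("P3", "3.9", None)],
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(amazon_reviews)")]

# "col IS NULL" evaluates to 0 or 1 in SQLite, so SUM counts the NULLs.
terms = ", ".join(f'SUM("{col}" IS NULL)' for col in columns)
counts = conn.execute(f"SELECT {terms} FROM amazon_reviews").fetchone()

missing = dict(zip(columns, counts))
print(missing)
conn.close()
```

This scans the table once instead of once per column, which matters little at 1,465 rows but scales better for larger review tables.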
