What "Inspecting Data" Means:
Inspecting data means checking and understanding the contents and structure of your dataset (DataFrame or Series) before you start analyzing or cleaning it.

It helps you answer questions like:

What does the data look like?

How many rows and columns are there?

What are the column names?

What type of data is in each column?

Are there any missing values?

What are the minimum, maximum, and average values?
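
As a rough sketch, each of the questions above maps onto a single pandas call. The small DataFrame below is hypothetical toy data (its column names and values are assumptions made for illustration), not the dataset loaded later in this notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset.
df = pd.DataFrame({
    "product": ["A", "B", "C", "D"],
    "rating": [5, 4, None, 3],
})

first_rows = df.head()                   # What does the data look like?
n_rows, n_cols = df.shape                # How many rows and columns are there?
column_names = list(df.columns)          # What are the column names?
column_types = df.dtypes                 # What type of data is in each column?
missing_per_column = df.isnull().sum()   # Are there any missing values?
summary = df.describe()                  # Min, max, and average of numeric columns

print(n_rows, n_cols)
print(column_names)
print(missing_per_column)
print(summary.loc[["min", "max", "mean"]])
```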

In [None]:
# Complete Data Inspection in Pandas

# Function                           | Description
# ----------------------------------- | ----------------------------------------------
# df.head()                           | Shows the first 5 rows of the DataFrame
# df.tail()                           | Shows the last 5 rows
# df.shape                            | Returns the number of rows and columns in a tuple (rows, columns)
# df.columns                          | Lists all column names
# df.index                            | Lists all row indexes (labels)
# df.dtypes                           | Shows the data types of each column
# df.info()                           | Provides a summary of the DataFrame, including index, columns, non-null counts, and data types
# df.describe()                       | Gives statistical summary of numeric columns (mean, std, min, max, etc.)
# df.isnull()                         | Detects missing values in each cell (returns `True` for nulls)
# df.isnull().sum()                   | Counts the number of missing (null) values per column
# df.duplicated()                     | Flags duplicate rows in the DataFrame
# df.duplicated().sum()               | Counts the number of duplicate rows
# df.nunique()                        | Counts the number of unique values in each column
# df['col'].value_counts()            | Returns the frequency of each unique value in a Series (column)
# df.sample(n)                        | Randomly selects `n` rows from the DataFrame
# df.sort_values(by='col')            | Sorts the DataFrame by a specific column
# df.corr(numeric_only=True)          | Computes pairwise correlation between numeric columns
# df.cov(numeric_only=True)           | Computes pairwise covariance between numeric columns
# df.memory_usage()                   | Returns memory usage of each column
# df.groupby('col')                   | Groups data by a column (use with aggregation functions like `.sum()`, `.mean()`)
# df.describe(include='all')          | Includes both numeric and non-numeric columns in the summary statistics
# df.mode()                           | Returns the most frequent value(s) per column
# df.quantile([0.25, 0.5, 0.75])      | Returns the 25th, 50th (median), and 75th percentiles (pass numeric_only=True if there are non-numeric columns)
# df.skew(numeric_only=True)          | Measures the skewness (asymmetry) of each numeric column's distribution
# df.kurt(numeric_only=True)          | Measures the kurtosis (tail heaviness) of each numeric column's distribution
# df.select_dtypes(include='number')  | Selects only numeric columns from the DataFrame
# df.cumsum()                         | Calculates the cumulative sum down each column (useful in time-series data)
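
Several of the statistical methods in the table above only make sense for numeric columns. The following is a minimal sketch, on a hypothetical toy DataFrame (the column names `name`, `x`, and `y` are assumptions), of narrowing to numeric columns with `select_dtypes` before computing a correlation.

```python
import pandas as pd

# Hypothetical toy data: one text column and two numeric columns.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "x": [1, 2, 3, 4],
    "y": [2, 4, 6, 8],
})

# Keep only the numeric columns, then compute pairwise correlation.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()

print(list(numeric.columns))
print(corr)
```

Selecting the numeric columns first keeps the text column out of the calculation, which recent pandas versions would otherwise reject.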


In [1]:
import pandas as pd

In [2]:
datacsv = pd.read_csv("laptops_dataset.csv")

In [None]:
# df.head()
# Shows the first 5 rows.
head = datacsv.head()
head

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
0,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Perfect product!,"Loved it, it's my first MacBook that I earned ..."
1,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Fabulous!,Battery lasted longer than my first relationsh...
2,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Fabulous!,Such a great deal.. very happy with the perfor...
3,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,4,Delightful,"Awesome build quality and very good display, b..."
4,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Awesome,When i ordered and came to know about seller r...


In [6]:
# df.head(n)
# Shows the first 10 rows.
head2 = datacsv.head(10)
head2

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
0,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Perfect product!,"Loved it, it's my first MacBook that I earned ..."
1,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Fabulous!,Battery lasted longer than my first relationsh...
2,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Fabulous!,Such a great deal.. very happy with the perfor...
3,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,4,Delightful,"Awesome build quality and very good display, b..."
4,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Awesome,When i ordered and came to know about seller r...
5,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Super!,Super product
6,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Super!,Go for it..its awesome
7,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Mind-blowing purchase,"Best , best and best 🫶🏻👑🍎"
8,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Just wow!,Its really very good and compact device.
9,Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/...,4.7,15210,900,5,Brilliant,"Superb built quality, Amazing performance and ..."


In [None]:
# df.tail()
# Shows the last 5 rows.
tail = datacsv.tail()
tail

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
24108,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Perfect product!,MSI Laptop is high performance and the best. c...
24109,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Perfect product!,Excellent performance best laptop.
24110,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,4,"Good product, Lacks features.",Decent battery life. Exceptional build quality...
24111,Lenovo IdeaPad 5 2-in-1 WUXGA IPS AMD Ryzen 7 ...,4.4,7,2,3,Nice,The product does not support facial recognitio...
24112,Lenovo IdeaPad 5 2-in-1 WUXGA IPS AMD Ryzen 7 ...,4.4,7,2,4,Very Good,DISPLAY IS A LET DOWN. But the lenovo Pen work...


In [8]:
# Shows the last 20 rows.
tail2 = datacsv.tail(20)
tail2

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
24093,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Best in the market!,Value For Money Product.Superb Performance.
24094,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,4,Simply awesome,Got this for 36990 during sale 😁\n* great valu...
24095,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,superb product and quick delivery by Flipkart,superb product and quick delivery by Flipkart
24096,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,2,Bad quality,I didn't like the heating noise in the laptop....
24097,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Wonderful,Value for money laptop best laptop.
24098,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Fabulous!,Best performance\nNice product\nNice built qua...
24099,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Excellent,awesome laptop..
24100,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,4,Very Good,Performance is very good\nBattery backup is no...
24101,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,4,Worth the money,Product is good.\nDelivery got 10 days late du...
24102,MSI Modern 14 Intel Core i5 13th Gen 1335U - (...,4.3,156,24,5,Fabulous!,nice product


In [9]:
# df.shape
# Returns the number of rows and columns.
datacsv.shape

(24113, 7)

In [10]:
# df.columns
# Lists all column names.
datacsv.columns

Index(['product_name', 'overall_rating', 'no_ratings', 'no_reviews', 'rating',
       'title', 'review'],
      dtype='object')

In [11]:
# df.index
# Lists all row indexes (labels).
datacsv.index

RangeIndex(start=0, stop=24113, step=1)

In [12]:
# df.dtypes
# Shows the data types of each column.
datacsv.dtypes

product_name       object
overall_rating    float64
no_ratings         object
no_reviews         object
rating              int64
title              object
review             object
dtype: object
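
The output above lists `no_ratings` and `no_reviews` as `object`, meaning pandas is storing them as text rather than numbers. The sketch below shows one possible way to convert such a column. It assumes the values are digit strings that may contain thousands separators; the source does not show why these columns are `object`, so treat the cleaning step as a guess. The sample values and the `no_ratings_num` column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample; the real column contents may differ.
df = pd.DataFrame({"no_ratings": ["15210", "1,006", "99", "1"]})

# Remove thousands separators (an assumption about the raw data),
# then convert. Values that cannot be parsed become NaN.
cleaned = df["no_ratings"].str.replace(",", "", regex=False)
df["no_ratings_num"] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
print(df)
```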

In [None]:
# df.info()
# Provides a summary of the DataFrame, including index, columns, non-null counts, and data types.
datacsv.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24113 entries, 0 to 24112
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   product_name    24113 non-null  object 
 1   overall_rating  24113 non-null  float64
 2   no_ratings      24113 non-null  object 
 3   no_reviews      24113 non-null  object 
 4   rating          24113 non-null  int64  
 5   title           24113 non-null  object 
 6   review          24113 non-null  object 
dtypes: float64(1), int64(1), object(5)
memory usage: 1.3+ MB
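
The `+` in `1.3+ MB` signals that the figure is a lower bound: by default pandas does not measure the contents of `object` columns. A minimal sketch of getting a fuller estimate with `deep=True`, using hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data with one text column and one numeric column.
df = pd.DataFrame({
    "review": ["Good laptop", "Battery could be better", "Nice"] * 1000,
    "rating": [5, 3, 4] * 1000,
})

shallow = df.memory_usage().sum()          # text columns may be undercounted
deep = df.memory_usage(deep=True).sum()    # inspects the actual string contents

print(f"shallow: {shallow} bytes, deep: {deep} bytes")
df.info(memory_usage="deep")               # same summary, with the deep figure
```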


In [14]:
# df.describe()
# Gives statistical summary of numeric columns (mean, std, min, max, etc.).
datacsv.describe()

Unnamed: 0,overall_rating,rating
count,24113.0,24113.0
mean,4.186273,4.214573
std,0.228392,1.184845
min,3.3,1.0
25%,4.1,4.0
50%,4.2,5.0
75%,4.3,5.0
max,5.0,5.0
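
By default `describe()` skips text columns, which is why only `overall_rating` and `rating` appear above. As a sketch, `include='all'` (listed in the table at the top) adds `unique`, `top`, and `freq` rows for non-numeric columns. The toy data here is hypothetical.

```python
import pandas as pd

# Hypothetical toy data with one text column and one numeric column.
df = pd.DataFrame({
    "title": ["Nice", "Nice", "Awesome", "Bad"],
    "rating": [3, 4, 5, 1],
})

full_summary = df.describe(include="all")

# Text columns get count/unique/top/freq; numeric statistics are NaN for them.
print(full_summary)
```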


In [15]:
# df.isnull()
# Detects missing values in each cell (returns True for nulls).
datacsv.isnull()


Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
0,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...
24108,False,False,False,False,False,False,False
24109,False,False,False,False,False,False,False
24110,False,False,False,False,False,False,False
24111,False,False,False,False,False,False,False


In [16]:
# df.isnull().sum()
# Counts the number of missing (null) values per column.
datacsv.isnull().sum()


product_name      0
overall_rating    0
no_ratings        0
no_reviews        0
rating            0
title             0
review            0
dtype: int64
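
This dataset has no missing values, so the counts are all zero. For data that does have gaps, a percentage is often easier to read than a raw count. A small sketch on hypothetical toy data, relying on the fact that the mean of a boolean column is the fraction of `True` values:

```python
import pandas as pd

# Hypothetical toy data with some missing entries.
df = pd.DataFrame({
    "a": [1, None, 3, None],
    "b": ["x", "y", None, "z"],
})

missing_count = df.isnull().sum()          # number of nulls per column
missing_pct = df.isnull().mean() * 100     # share of nulls per column, in percent

print(missing_count)
print(missing_pct)
```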

In [17]:
# df.duplicated()
# Flags duplicate rows in the DataFrame.
datacsv.duplicated()

0        False
1        False
2        False
3        False
4        False
         ...  
24108    False
24109    False
24110    False
24111    False
24112    False
Length: 24113, dtype: bool

In [18]:
# df.duplicated().sum()
# Counts the number of duplicate rows.
datacsv.duplicated().sum()

np.int64(7122)
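
A count this large is worth a closer look before deciding what to do with the duplicates. The sketch below, on hypothetical toy data, shows two related calls: `duplicated(keep=False)` to view every copy of a repeated row, and `drop_duplicates()` to keep only the first occurrence.

```python
import pandas as pd

# Hypothetical toy data: rows 0, 1, and 3 are identical.
df = pd.DataFrame({
    "title": ["Nice", "Nice", "Good", "Nice"],
    "rating": [5, 5, 4, 5],
})

n_dupes = df.duplicated().sum()              # repeats after the first occurrence
all_copies = df[df.duplicated(keep=False)]   # every row that has a twin
deduped = df.drop_duplicates()               # first occurrence of each row only

print(n_dupes)
print(all_copies)
print(deduped)
```

Note that `drop_duplicates()` returns a new DataFrame and leaves `df` unchanged.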

In [19]:
# df.nunique()
# Counts the number of unique values in each column.
datacsv.nunique()

product_name        365
overall_rating       16
no_ratings          288
no_reviews          147
rating                5
title               213
review            11223
dtype: int64

In [21]:
# df.value_counts()
# Returns the frequency of unique values in a Series (column).
datacsv['product_name'].value_counts()

product_name
Apple MacBook AIR Apple M2 - (8 GB/256 GB SSD/Mac OS Monterey)...    400
ASUS Vivobook 15 Intel Core i5 12th Gen 1235U - (8 GB/512 GB S...    400
CHUWI Intel Celeron Dual Core 11th Gen N4020 - (8 GB/256 GB SS...    400
HP Backlit Intel Core i5 12th Gen 1235U - (8 GB/512 GB SSD/Win...    330
ASUS TUF Gaming F15 - AI Powered Gaming Intel Core i5 11th Gen...    314
                                                                    ... 
HP Intel Core Ultra 7 155H - (16 GB/512 GB SSD/Windows 11 Home...      1
Lenovo IdeaPad Slim 3 Intel Intel Core i7 13th Gen Core™ i7-13...      1
HP ENVY AI PC Intel Core Ultra 7 155U - (16 GB/512 GB SSD/Wind...      1
Lenovo Legion Go AMD Ryzen Z1 Extreme - (16 GB/512 GB SSD/Wind...      1
HP Victus Intel Core i7 12th Gen 12650H - (16 GB/1 TB SSD/Wind...      1
Name: count, Length: 365, dtype: int64
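
Raw counts can be turned into proportions with `normalize=True`. A brief sketch on a hypothetical Series of ratings:

```python
import pandas as pd

# Hypothetical ratings.
ratings = pd.Series([5, 5, 4, 5, 1], name="rating")

counts = ratings.value_counts()                 # absolute frequencies
shares = ratings.value_counts(normalize=True)   # fractions that sum to 1

print(counts)
print(shares)
```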

In [27]:
# df.sample(n)
# Randomly selects n rows from the DataFrame; with no argument, n defaults to 1.
datacsv.sample()

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
2635,HP Intel Core i5 12th Gen 1235U - (16 GB/512 G...,4.2,5248,272,4,Pretty good,Performance and design are good but battery sh...


In [28]:
# Randomly selects 10 rows.
datacsv.sample(10)

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
10025,Acer Predator Neo Intel Core i7 13th Gen 13700...,4.4,1195,105,5,Wonderful,This gotta be the best laptop to buy in the se...
19165,Acer One14 Backlit Intel Core i5 11th Gen 1155...,3.7,2518,307,5,Highly recommended,Been using this for the past few weeks... Abso...
1703,DELL 15 AMD Ryzen 3 Quad Core 7320U - (8 GB/51...,4.3,1362,203,5,Awesome,Good loptop
6010,ASUS Vivobook 15 Intel Core i3 12th Gen 1215U ...,4.3,9156,698,5,Best in the market!,Overall Good Laptop for daily and office uses ...
22177,ASUS Vivobook 15 Intel Core i3 11th Gen 1115G4...,4.3,8968,758,5,Terrific purchase,This product is very good. Value for money. Th...
22516,HP Pavilion Aero AMD Ryzen 7 Octa Core 7735U -...,4.3,147,14,5,Must buy!,Excellent performance.Thank you.
3341,Acer Aspire 7 Intel Core i5 13th Gen 13420H - ...,4.3,1298,98,5,Perfect product!,Good
19371,HP 15s Intel Core i3 11th Gen 1115G4 - (8 GB/5...,4.2,1163,97,5,Excellent,Very nice product 😃
22495,MSI Modern 15 Intel Core i7 12th Gen 1255U - (...,4.2,133,18,1,Horrible,Very bad battery backup around 2 hours only.
18093,Acer Aspire Lite AMD Ryzen 5 Hexa Core 5625U -...,4.0,607,54,1,Bad quality,Bettry is not good
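
Because `sample()` is random, the rows above will differ on every run. If a repeatable sample is needed, `random_state` fixes the seed, and `frac` selects a proportion instead of a fixed count. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"x": range(100)})

# The same seed gives the same rows each time.
first_draw = df.sample(5, random_state=42)
second_draw = df.sample(5, random_state=42)

# Sample 10% of the rows instead of a fixed number.
tenth = df.sample(frac=0.1, random_state=0)

print(first_draw.equals(second_draw))
print(len(tenth))
```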


In [None]:
# df.sort_values(by='col')
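# Sorts the DataFrame by a specific column (ascending order by default).
# Note: no_ratings has dtype object (see df.dtypes above), so its values are compared as text, not as numbers.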
datacsv.sort_values(by='no_ratings')

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
23238,ASUS ProArt PX13 OLED (2024) - AI PC for Creat...,5.0,1,1,5,Great laptop. High price,Perfect laptop for work and gaming on the side...
9378,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,5,Just wow!,"Awesome Product, Thank You Flipkart"
9365,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,5,Must buy!,Very good product
9366,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,1,Worthless,This review is only for flipkart bad service a...
9367,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,3,Nice,Best price under 50k ..and just build quality ...
...,...,...,...,...,...,...,...
21767,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,3,Nice,Good product. But have little heating issue.
21771,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,4,Really Nice,Professional
21770,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,1,Worthless,Very poor products don't buy it\nAfter 10 moos...
21769,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,5,Excellent,Nice 🙂


In [30]:
# To sort in descending order
datacsv.sort_values(by='no_ratings', ascending=False)

Unnamed: 0,product_name,overall_rating,no_ratings,no_reviews,rating,title,review
21771,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,4,Really Nice,Professional
21770,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,1,Worthless,Very poor products don't buy it\nAfter 10 moos...
21769,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,5,Excellent,Nice 🙂
21768,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,3,Just okay,Performace is little bit slow. Laptop opening ...
21767,Lenovo IdeaPad Flex 5 Intel Core i3 13th Gen 1...,4.3,99,9,3,Nice,Good product. But have little heating issue.
...,...,...,...,...,...,...,...
9338,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,5,Wonderful,Value for money laptop.performance is good 👍🏿😊...
9337,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,5,Perfect product!,Very nice laptop all things are good nice disp...
9336,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,5,Nice product,Love the way she dances in pavilion lapii
9354,HP Pavilion 15 (2024) AMD Ryzen 5 Hexa Core 75...,4.3,1006,102,4,Value-for-money,good laptop for student
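
In both sorted outputs, `99` sits at the opposite end from `1` while `1006` falls next to `1`. That is what text ordering looks like: `no_ratings` is an `object` column, so values are compared character by character. The sketch below, on hypothetical sample values, contrasts the text sort with a numeric sort that uses the `key` parameter of `sort_values`. It assumes every value parses cleanly as a number.

```python
import pandas as pd

# Hypothetical sample of string-typed counts.
df = pd.DataFrame({"no_ratings": ["99", "1006", "1", "15210"]})

# Default: strings are compared character by character.
as_text = df.sort_values(by="no_ratings")["no_ratings"].tolist()

# key= converts the column to numbers for comparison only;
# the stored values remain strings.
as_number = df.sort_values(by="no_ratings", key=pd.to_numeric)["no_ratings"].tolist()

print(as_text)
print(as_number)
```

An alternative is to convert the column once with `pd.to_numeric` and sort the converted column, which also makes it usable in `describe()`.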
