In [1]:
# Importing libraries

import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)

import warnings
warnings.filterwarnings("ignore")

In [2]:
# Importing the dataset

df = pd.read_csv("datasets/Data Analysis/smartphones - smartphones.csv")

##### Viewing the dataset

In [3]:
df.head()

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,"₹54,999",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,"₹19,989",81.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,"₹16,499",75.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,"₹14,999",81.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,"₹24,999",82.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13


In [4]:
print("The shape of the smartphones dataset is: ", df.shape)

The shape of the smartphones dataset is:  (1020, 11)


#### Summary of the data:

- This is a dataset of 1,020 mobile phones along with their specifications.


#### Column Descriptions:

- `model` : The model of the mobile phone.
- `price` : The price of the mobile phone.
- `rating` : The rating for the phone out of 100.
- `sim` : The number of sim slots, bands and connection features available for the sim card (4G, 5G, NFC, Wi-Fi) in the mobile phone.
- `processor` : The processor brand along with its specifications (generation of the processor, number of cores, frequency).
- `ram` : RAM and storage space of the mobile.
- `battery` : Battery capacity along with charging speed of the mobile.
- `display` : Display size, screen resolution, refresh rate of the screen and front camera design of the mobile.
- `camera` : Number of cameras along with their specifications.
- `card` : Whether the phone supports an external memory card and, if so, the maximum supported card size.
- `os` : Operating system and its version for the mobile phone.

### `Data Assessing`:

- ***Dirty Data (Quality related problems)***
  
  - **Problems found using Manual Assessment**
    - Some brands are written inconsistently in the `model` column, e.g. `OPPO` and `Oppo`. Issue is `consistency` related.
    - There is an `ipod` as `model` on row `756`. Issue is `validity` related.
    - The `price` column has an unnecessary `₹` sign. Issue is `validity` related.
    - The `price` column also has `,` between digits. Issue is `validity` related.
    - In the `camera` column, words like `Dual`, `Triple` and `Quad` are used to represent the number of cameras, and front and rear cameras are separated by `&`. Issue is `validity` related.
    - The `card` column sometimes contains information about `os` and `camera`. Issue is `validity` related.
    - The `os` column sometimes contains information about `bluetooth` and `fm`. Issue is `validity` related.
    - The `os` column also contains some OS version names like `lollipop`, `oreo`, `pie`. Issue is `consistency` related.
  
  - **Problems found using Automatic Assessment**
    - The `rating`, `card`, `os` and `camera` columns have some missing values. Issue is `completeness` related.
    - Incorrect data type assigned to `price` and `rating` columns. Issue is `validity` related.
    - In the `price` column, one phone (index: 608) has a value of `99`. Issue is `accuracy` related.
    - The `processor` column has incorrect values for some phones. Issue is `validity` related.
    - The `processor` column has stray spaces in some names. Issue is `consistency` related.
    - The `ram`, `battery` and `display` columns have incorrect values in some rows. Issue is `validity` related.
    - In the `display` column the refresh rate is sometimes not available. Issue is `completeness` related.
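
The `price` problems listed above (currency sign, thousands separators, text dtype) could all be handled by one small parsing helper. The block below is only a minimal sketch, not the notebook's own cleaning code: `parse_price` is a hypothetical name, and the sample strings are copied from rows shown earlier. It assumes whole-rupee prices.

```python
import re

def parse_price(text: str) -> int:
    """Convert a price string such as '₹54,999' to the integer 54999."""
    # Keep digits only: this drops the rupee sign, commas and spaces.
    # Assumption: prices are whole rupees (a decimal point would be dropped too).
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError(f"no digits found in price: {text!r}")
    return int(digits)

samples = ["₹54,999", "₹19,989", "₹99"]
prices = [parse_price(s) for s in samples]
print(prices)
```

On the DataFrame, such a helper could be applied with `df["price"].map(parse_price)`. The suspicious `99` value survives the conversion, so the accuracy issue still has to be reviewed separately.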
  
<br><br>

- ***Messy Data (Structure related problems)***
  - **Problems found using Manual Assessment**
    - The `sim` column can be split into 3 cols `has_5g`, `has_NFC` and `has_IR_Blaster`.
    - The `ram` column can be split into 2 cols `RAM` and `ROM`.
    - The `processor` column can be split into `processor name`, `cores` and `cpu speed`.
    - The `battery` column can be split into `battery capacity` and `fast_charging_available`.
    - The `display` column can be split into `size`, `resolution_width`, `resolution_height` and `frequency`.
    - The `camera` column can be split into `front` and `rear` camera.
    - The `card` column can be split into `supported` and `extended_upto`.
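
As an illustration of the splits proposed above, the `ram` column could be parsed into two numeric values with a regular expression. This is a hedged sketch: `split_ram`, `UNIT_TO_GB` and `SIZE` are hypothetical names, and converting everything to GB with 1 GB = 1024 MB is an assumption, not something the notebook states.

```python
import re
from typing import Optional, Tuple

# Assumption: sizes are normalised to GB with 1 GB = 1024 MB.
UNIT_TO_GB = {"MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}
SIZE = r"(\d+(?:\.\d+)?)\s*(MB|GB|TB)"

def _to_gb(match: Optional[re.Match]) -> Optional[float]:
    """Turn a (number, unit) regex match into a size in GB, or None."""
    if match is None:
        return None
    return float(match.group(1)) * UNIT_TO_GB[match.group(2)]

def split_ram(text: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (RAM in GB, internal storage in GB) parsed from a `ram` cell."""
    ram = re.search(SIZE + r"\s*RAM", text)
    rom = re.search(SIZE + r"\s*inbuilt", text)
    return _to_gb(ram), _to_gb(rom)

print(split_ram("12 GB RAM, 256 GB inbuilt"))
print(split_ram("Single Core, 208 MHz Processor"))  # a shifted value seen in `ram`
```

Returning `None` for text that does not look like a memory specification keeps the shifted rows (where `ram` holds processor or battery text) visible instead of silently producing wrong numbers.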

#### `Automatic Assessment`

In [5]:
# Taking 10 samples of the dataset

df.sample(10)

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
570,LG Wing 5G,"₹54,999",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 765G , Octa Core, 2.4 GHz Processor","8 GB RAM, 128 GB inbuilt",4000 mAh Battery with Fast Charging,"6.8 inches, 1080 x 2460 px Display",Dual Display,64 MP + 13 MP + 12 MP Triple Rear & 32 MP Fron...,"Memory Card (Hybrid), upto 2 TB"
183,Motorola Moto X40,"₹39,999",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","8 GB RAM, 128 GB inbuilt",4600 mAh Battery with 125W Fast Charging,"6.7 inches, 1080 x 2400 px, 165 Hz Display",50 MP + 50 MP + 12 MP Triple Rear & 60 MP Fron...,Android v13,No FM Radio
15,Apple iPhone 13,"₹62,999",79.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Bionic A15, Hexa Core, 3.22 GHz Processor","4 GB RAM, 128 GB inbuilt",3240 mAh Battery with Fast Charging,"6.1 inches, 1170 x 2532 px Display with Small ...",12 MP + 12 MP Dual Rear & 12 MP Front Camera,Memory Card Not Supported,iOS v15
670,Tecno Spark 9T,"₹8,968",72.0,"Dual Sim, 3G, 4G, VoLTE, Wi-Fi","Helio G35, Octa Core, 2.3 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 18W Fast Charging,"6.6 inches, 1080 x 2408 px Display with Water ...",50 MP + 2 MP Triple Rear & 8 MP Front Camera,Memory Card Supported,Android v12
163,Oppo Reno 10 Pro Plus,"₹58,990",88.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",4700 mAh Battery with 80W Fast Charging,"6.73 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP Quad Rear & 32 MP Front Camera,Android v13,No FM Radio
148,Xiaomi Redmi 10,"₹9,589",71.0,"Dual Sim, 3G, 4G, VoLTE, Wi-Fi","Snapdragon 680, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",6000 mAh Battery with 18W Fast Charging,"6.7 inches, 720 x 1600 px Display with Water D...",50 MP + 2 MP Dual Rear & 5 MP Front Camera,"Memory Card Supported, upto 512 GB",Android v11
852,Nokia 215 4G,"₹3,099",,"Dual Sim, 3G, 4G, VoLTE",Unisoc T117,"64 MB RAM, 128 MB inbuilt",1150 mAh Battery,"2.4 inches, 240 x 320 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",RTOS (Series 30+)
53,Motorola Moto G82 5G,"₹18,999",83.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 30W Fast Charging,"6.6 inches, 1080 x 2400 px, 120 Hz Display wit...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card Supported, upto 1 TB",Android v12
359,Poco F4 (12GB RAM + 256GB),"₹29,999",86.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 870, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",4500 mAh Battery with 67W Fast Charging,"6.67 inches, 1080 x 2400 px, 120 Hz Display wi...",64 MP + 8 MP + 2 MP Triple Rear & 20 MP Front ...,Android v12,No FM Radio
371,Vivo V21 Pro,"₹32,999",85.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 765G , Octa Core, 2.4 GHz Processor","8 GB RAM, 128 GB inbuilt",4300 mAh Battery with 33W Fast Charging,"6.44 inches, 1080 x 2400 px Display with Small...",64 MP Quad Rear & 44 MP + 8 MP Dual Front Camera,Android v11,No FM Radio


In [6]:
# info

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1020 entries, 0 to 1019
Data columns (total 11 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   model      1020 non-null   object 
 1   price      1020 non-null   object 
 2   rating     879 non-null    float64
 3   sim        1020 non-null   object 
 4   processor  1020 non-null   object 
 5   ram        1020 non-null   object 
 6   battery    1020 non-null   object 
 7   display    1020 non-null   object 
 8   camera     1019 non-null   object 
 9   card       1013 non-null   object 
 10  os         1003 non-null   object 
dtypes: float64(1), object(10)
memory usage: 87.8+ KB


In [7]:
# There are some missing values in camera, card and os
# Checking if in each of these rows the "rating" data is also missing or not

df[df["os"].isnull()]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
473,Nokia 110 4G,"₹1,762",,"Dual Sim, 3G, 4G, VoLTE",No Wifi,"128 MB RAM, 48 MB inbuilt",1020 mAh Battery,"1.8 inches, 120 x 160 px Display",0.3 MP Rear Camera,"Memory Card Supported, upto 32 GB",
532,Samsung Guru Music 2 Dual Sim,"₹1,949",,Dual Sim,No Wifi,"Single Core, 208 MHz Processor",800 mAh Battery,"2 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 16 GB",
573,Nokia 105 (2019),"₹1,299",,Single Sim,No Wifi,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 120 x 160 px Display",No Rear Camera,,
608,Namotel Achhe Din,₹99,,"Dual Sim, 3G, Wi-Fi","1 GB RAM, 4 GB inbuilt",1325 mAh Battery,"4 inches, 720 x 1280 px Display",2 MP Rear & 0.3 MP Front Camera,Android v5.0 (Lollipop),Bluetooth,
640,Nokia 105 Plus,"₹1,299",,Dual Sim,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,
645,Nokia 2760 Flip,"₹5,490",,"Dual Sim, 3G, 4G, Wi-Fi",1450 mAh Battery,"3.6 inches, 240 x 320 px Display",5 MP Rear & 5 MP Front Camera,"Memory Card Supported, upto 32 GB",Kaios v3.0,Bluetooth,
647,Motorola Moto A10,"₹1,339",,Dual Sim,"4 MB RAM, 4 MB inbuilt",1750 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",,
657,Zanco Tiny T1,"₹2,799",,Single Sim,"32 MB RAM, 32 MB inbuilt",200 mAh Battery,"0.49 inches, 64 x 32 px Display",No Rear Camera,No FM Radio,Bluetooth,
665,itel it2163S,₹958,,Dual Sim,"4 MB RAM, 4 MB inbuilt",1200 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,
699,Samsung Guru GT-E1215,"₹1,850",,Single Sim,800 mAh Battery,"1.5 inches, 120 x 120 px Display",No Rear Camera,No FM Radio,,,
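
The question raised in the cell comment (is `rating` also missing wherever `os` is missing?) can be answered with a single boolean check rather than by reading the table. The sketch below uses a small toy frame as a stand-in for `df`; the name `toy` and its three rows are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `df`: two feature phones without `os`/`rating`, one smartphone.
toy = pd.DataFrame({
    "model": ["Nokia 110 4G", "Nokia 105 (2019)", "OnePlus 11 5G"],
    "rating": [np.nan, np.nan, 89.0],
    "os": [None, None, "Android v13"],
})

missing_os = toy["os"].isnull()
# True only if every row without an `os` value also lacks a `rating`.
all_unrated = toy.loc[missing_os, "rating"].isnull().all()
print(missing_os.sum(), all_unrated)
```

The same two lines should work on the full dataset by replacing `toy` with `df`.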


In [8]:
# Checking for duplicated data

df.duplicated().sum()

0

In [11]:
# describe

df.describe(include='all')

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
count,1020,1020,879.0,1020,1020,1020,1020,1020,1019,1013,1003
unique,1020,412,,28,298,58,256,369,285,63,48
top,OnePlus 11 5G,"₹14,999",,"Dual Sim, 3G, 4G, VoLTE, Wi-Fi","Dimensity 700 5G, Octa Core, 2.2 GHz Processor","8 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.67 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card Supported, upto 1 TB",Android v12
freq,1,21,,324,29,267,103,54,40,171,287
mean,,,78.258248,,,,,,,,
std,,,7.402854,,,,,,,,
min,,,60.0,,,,,,,,
25%,,,74.0,,,,,,,,
50%,,,80.0,,,,,,,,
75%,,,84.0,,,,,,,,


In [12]:
# Checking for unique values in the following columns
# "model", "sim", "processor", "ram", "battery", "display", "camera", "card", "os"

df1 = df[["model", "sim", "processor", "ram", "battery", "display", "camera", "card", "os"]]
df1.head()

Unnamed: 0,model,sim,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13


In [16]:
for column in df1.columns:
    print(df1[column].value_counts())
    print("*" * 125)

OnePlus 11 5G                           1
Xiaomi Qin 1                            1
Realme C25Y (4GB RAM + 64GB)            1
Samsung Galaxy A12 (6GB RAM + 128GB)    1
Tecno Camon 19                          1
                                       ..
POCO M4 Pro 4G (6GB RAM + 128GB)        1
OnePlus Nord N20 5G                     1
Apple iPhone 13 (256GB)                 1
Oppo Find X6 Pro                        1
Samsung Galaxy M52s 5G                  1
Name: model, Length: 1020, dtype: int64
*****************************************************************************************************************************
Dual Sim, 3G, 4G, VoLTE, Wi-Fi                               324
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC                      268
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi                           155
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, IR Blaster                54
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC, IR Blaster           52
Dual Sim, 3G, 4G, VoLTE, Wi-Fi, NFC            

In [17]:
# Checking for phones where "os" column has some other information than os

df[~df["os"].str.contains("Android|OS", na=False)]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
8,Nothing Phone 1,"₹26,749",85.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 778G Plus, Octa Core, 2.5 GHz Proce...","8 GB RAM, 128 GB inbuilt",4500 mAh Battery with 33W Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 50 MP Dual Rear & 16 MP Front Camera,Android v12,No FM Radio
9,OnePlus Nord 2T 5G,"₹28,999",84.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Dimensity 1300, Octa Core, 3 GHz Processor","8 GB RAM, 128 GB inbuilt",4500 mAh Battery with 80W Fast Charging,"6.43 inches, 1080 x 2400 px, 90 Hz Display wit...",50 MP + 8 MP + 2 MP Triple Rear & 32 MP Front ...,Android v12,No FM Radio
12,Xiaomi Redmi Note 12 Pro 5G,"₹24,762",79.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, IR Blaster","Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.67 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Android v12,No FM Radio
17,OPPO Reno 9 Pro Plus,"₹45,999",86.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8+ Gen1, Octa Core, 3.2 GHz Processor","16 GB RAM, 256 GB inbuilt",4700 mAh Battery with 80W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",50 MP + 8 MP + 2 MP Triple Rear & 32 MP Front ...,Android v13,No FM Radio
18,OnePlus 10R 5G,"₹32,999",86.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Dimensity 8100 Max, Octa Core, 2.85 GHz Processor","8 GB RAM, 128 GB inbuilt",5000 mAh Battery with 80W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Android v12,Bluetooth
...,...,...,...,...,...,...,...,...,...,...,...
1009,Xiaomi Civi 3,"₹32,990",86.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC, IR Bl...","Dimensity 8200, Octa Core, 3.1 GHz Processor","8 GB RAM, 256 GB inbuilt",5000 mAh Battery with 80W Fast Charging,"6.7 inches, 1080 x 2400 px, 120 Hz Display wit...",64 MP + 20 MP + 2 MP Triple Rear & 32 MP + 32 ...,Android v13,No FM Radio
1011,Oppo Find X6,"₹69,990",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","8 GB RAM, 256 GB inbuilt",4700 mAh Battery with 120W Fast Charging,"6.73 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 48 MP + 32 MP Triple Rear & 32 MP Fron...,Android v12,No FM Radio
1012,itel A23s,"₹4,787",,"Dual Sim, 3G, 4G, Wi-Fi","Spreadtrum SC9832E, Quad Core, 1.4 GHz Processor","2 GB RAM, 32 GB inbuilt",3020 mAh Battery,"5 inches, 854 x 480 px Display",2 MP Rear Camera,Android v11,No FM Radio
1013,Google Pixel 8 Pro,"₹70,990",80.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Google Tensor 3, Octa Core Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.73 inches, 1440 x 3120 px, 120 Hz Display wi...",50 MP + 50 MP + 50 MP Triple Rear & 12 MP Fron...,Android v13,No FM Radio


#### `Data Cleaning`

In [46]:
# Making a copy of the original dataset

phones = df.copy()
phones

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,"₹54,999",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,"₹19,989",81.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,"₹16,499",75.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,"₹14,999",81.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,"₹24,999",82.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13
...,...,...,...,...,...,...,...,...,...,...,...
1015,Motorola Moto Edge S30 Pro,"₹34,990",83.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 8 Gen1, Octa Core, 3 GHz Processor","8 GB RAM, 128 GB inbuilt",5000 mAh Battery with 68.2W Fast Charging,"6.67 inches, 1080 x 2460 px, 120 Hz Display wi...",64 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Android v12,No FM Radio
1016,Honor X8 5G,"₹14,990",75.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Snapdragon 480+, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 22.5W Fast Charging,"6.5 inches, 720 x 1600 px Display with Water D...",48 MP + 2 MP + Depth Sensor Triple Rear & 8 MP...,"Memory Card Supported, upto 1 TB",Android v11
1017,POCO X4 GT 5G (8GB RAM + 256GB),"₹28,990",85.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC, IR Bl...","Dimensity 8100, Octa Core, 2.85 GHz Processor","8 GB RAM, 256 GB inbuilt",5080 mAh Battery with 67W Fast Charging,"6.6 inches, 1080 x 2460 px, 144 Hz Display wit...",64 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Memory Card Not Supported,Android v12
1018,Motorola Moto G91 5G,"₹19,990",80.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.8 inches, 1080 x 2400 px Display with Punch ...",108 MP + 8 MP + 2 MP Triple Rear & 32 MP Front...,"Memory Card Supported, upto 1 TB",Android v12


### Step 1: Solving the `Completeness` issues

- Replacing missing values in `rating`, `card`, `os` and `camera`.
- Finding rows where the `display` column has no refresh rate and setting `60 Hz` as the default value. This can be done when creating new columns for the features of `display` while handling tidiness.
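
The second point could look roughly like the following. This is a sketch under assumptions, not the notebook's implementation: the three `display` strings are copied from rows shown earlier, and `refresh_rate` is a hypothetical column name.

```python
import pandas as pd

display = pd.Series([
    "6.7 inches, 1080 x 2400 px, 165 Hz Display",
    "6.8 inches, 1080 x 2460 px Display",
    "2.4 inches, 240 x 320 px Display",
])

# Capture the number in front of "Hz"; rows without a match become NaN.
refresh = display.str.extract(r"(\d+)\s*Hz", expand=False)
# Fall back to 60 Hz where no refresh rate is listed (the default named above).
refresh_rate = pd.to_numeric(refresh).fillna(60).astype(int)
print(refresh_rate.tolist())
```

Filling with 60 treats "not listed" as "standard refresh rate", which is a modelling choice; keeping the NaN values is the alternative if that assumption is too strong.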

In [60]:
# Issue

phones[phones["os"].isnull()]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
473,Nokia 110 4G,1762,,"Dual Sim, 3G, 4G, VoLTE",No Wifi,"128 MB RAM, 48 MB inbuilt",1020 mAh Battery,"1.8 inches, 120 x 160 px Display",0.3 MP Rear Camera,"Memory Card Supported, upto 32 GB",
532,Samsung Guru Music 2 Dual Sim,1949,,Dual Sim,No Wifi,"Single Core, 208 MHz Processor",800 mAh Battery,"2 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 16 GB",
573,Nokia 105 (2019),1299,,Single Sim,No Wifi,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 120 x 160 px Display",No Rear Camera,,
640,Nokia 105 Plus,1299,,Dual Sim,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,
645,Nokia 2760 Flip,5490,,"Dual Sim, 3G, 4G, Wi-Fi",1450 mAh Battery,"3.6 inches, 240 x 320 px Display",5 MP Rear & 5 MP Front Camera,"Memory Card Supported, upto 32 GB",Kaios v3.0,Bluetooth,
647,Motorola Moto A10,1339,,Dual Sim,"4 MB RAM, 4 MB inbuilt",1750 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",,
657,Zanco Tiny T1,2799,,Single Sim,"32 MB RAM, 32 MB inbuilt",200 mAh Battery,"0.49 inches, 64 x 32 px Display",No Rear Camera,No FM Radio,Bluetooth,
665,itel it2163S,958,,Dual Sim,"4 MB RAM, 4 MB inbuilt",1200 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,
699,Samsung Guru GT-E1215,1850,,Single Sim,800 mAh Battery,"1.5 inches, 120 x 120 px Display",No Rear Camera,No FM Radio,,,
748,Nokia 400 4G,3290,,"Dual Sim, 4G, VoLTE, Wi-Fi",2000 mAh Battery,"2.4 inches, 240 x 320 px Display",0.3 MP Rear & 0.3 MP Front Camera,"Memory Card Supported, upto 64 GB",Bluetooth,Browser,


In [68]:
# Solution
# Replacing the missing values of "camera", "card", "os" columns with "No data" as they are of object type.
# Replacing the missing values of the "rating" column with 0, as it is a numeric column
# (0 lies below the observed minimum rating of 60, so it marks "not rated").

values = {"camera" : "No data", 
          "card" : "No data", 
          "os" : "No data"}

phones.fillna(value=values, inplace=True)
phones["rating"] = phones["rating"].fillna(0)

In [69]:
# test
# Checking the result

phones[phones["os"] == "No data"]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
473,Nokia 110 4G,1762,0,"Dual Sim, 3G, 4G, VoLTE",No Wifi,"128 MB RAM, 48 MB inbuilt",1020 mAh Battery,"1.8 inches, 120 x 160 px Display",0.3 MP Rear Camera,"Memory Card Supported, upto 32 GB",No data
532,Samsung Guru Music 2 Dual Sim,1949,0,Dual Sim,No Wifi,"Single Core, 208 MHz Processor",800 mAh Battery,"2 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 16 GB",No data
573,Nokia 105 (2019),1299,0,Single Sim,No Wifi,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 120 x 160 px Display",No Rear Camera,No data,No data
640,Nokia 105 Plus,1299,0,Dual Sim,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 128 x 160 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,No data
645,Nokia 2760 Flip,5490,0,"Dual Sim, 3G, 4G, Wi-Fi",1450 mAh Battery,"3.6 inches, 240 x 320 px Display",5 MP Rear & 5 MP Front Camera,"Memory Card Supported, upto 32 GB",Kaios v3.0,Bluetooth,No data
647,Motorola Moto A10,1339,0,Dual Sim,"4 MB RAM, 4 MB inbuilt",1750 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",No data,No data
657,Zanco Tiny T1,2799,0,Single Sim,"32 MB RAM, 32 MB inbuilt",200 mAh Battery,"0.49 inches, 64 x 32 px Display",No Rear Camera,No FM Radio,Bluetooth,No data
665,itel it2163S,958,0,Dual Sim,"4 MB RAM, 4 MB inbuilt",1200 mAh Battery,"1.8 inches, 160 x 128 px Display",No Rear Camera,"Memory Card Supported, upto 32 GB",Bluetooth,No data
699,Samsung Guru GT-E1215,1850,0,Single Sim,800 mAh Battery,"1.5 inches, 120 x 120 px Display",No Rear Camera,No FM Radio,No data,No data,No data
748,Nokia 400 4G,3290,0,"Dual Sim, 4G, VoLTE, Wi-Fi",2000 mAh Battery,"2.4 inches, 240 x 320 px Display",0.3 MP Rear & 0.3 MP Front Camera,"Memory Card Supported, upto 64 GB",Bluetooth,Browser,No data


### Step 2: Solving the `Tidiness` issues

- Splitting the `sim` column into 3 cols `has_5g`, `has_NFC` and `has_IR_Blaster`.
- Splitting the `ram` column into 2 cols `RAM` and `ROM`.
- Splitting the `processor` column into `processor name`, `cores` and `cpu speed`.
- Splitting the `battery` column into `battery capacity` and `fast_charging_available`.
- Splitting the `display` column into `size`, `resolution_width`, `resolution_height` and `frequency`.
- Splitting the `camera` column into `front` and `rear camera`.
- Splitting the `card` column into `supported` and `extended_upto`.
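
Two of the splits above, `battery` and `card`, can be sketched with plain string checks and one regular expression each. The helper names `split_battery` and `split_card` are hypothetical, the output shapes are an assumption about what the new columns should hold, and the sample strings come from rows shown earlier.

```python
import re
from typing import Optional, Tuple

def split_battery(text: str) -> Tuple[Optional[int], bool]:
    """Return (capacity in mAh, fast charging available?)."""
    capacity = re.search(r"(\d+)\s*mAh", text)
    fast = "Fast Charging" in text
    return (int(capacity.group(1)) if capacity else None, fast)

def split_card(text: str) -> Tuple[bool, Optional[str]]:
    """Return (memory card supported?, maximum size such as '1 TB')."""
    supported = text.startswith("Memory Card") and "Not Supported" not in text
    limit = re.search(r"upto\s+(\d+\s*(?:GB|TB))", text)
    return (supported, limit.group(1) if limit else None)

print(split_battery("5000 mAh Battery with 100W Fast Charging"))
print(split_card("Memory Card (Hybrid), upto 1 TB"))
print(split_card("Android v13"))  # an `os` value that leaked into `card`
```

Values that belong to another column, such as `Android v13` in `card`, come back as not supported with no size, so they still need the validity fixes described in the assessment.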

In [70]:
# Issue

phones["sim"].value_counts()

Dual Sim, 3G, 4G, VoLTE, Wi-Fi                               324
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC                      268
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi                           155
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, IR Blaster                54
Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC, IR Blaster           52
Dual Sim, 3G, 4G, VoLTE, Wi-Fi, NFC                           46
Dual Sim, 3G, 4G, VoLTE, Wi-Fi, IR Blaster                    46
Dual Sim                                                      13
Dual Sim, 3G, 4G, Wi-Fi                                        9
Dual Sim, 3G, 4G, 5G, VoLTE, Vo5G, Wi-Fi, NFC                  7
Single Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC                      7
Dual Sim, 3G, 4G, VoLTE, Wi-Fi, NFC, IR Blaster                5
Dual Sim, 3G, 4G, VoLTE                                        5
Single Sim, 3G, 4G, VoLTE, Wi-Fi, NFC                          4
Single Sim                                                     4
Single Sim, 3G, 4G, VoLTE

In [46]:
# Solution
# Creating different columns for each feature

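# First comma-separated token of "sim", e.g. "Dual Sim" or "Single Sim"
sim_cards = df["sim"].str.strip().str.split(',').str.get(0)

# Boolean flags for the connectivity features listed in "sim"
has_5g = df['sim'].str.contains('5G')
has_nfc = df['sim'].str.contains('NFC')
has_ir_blaster = df['sim'].str.contains('IR Blaster')

# Inserting the new columns right after "sim" (which sits at position 3)
df.insert(4, 'number_of_sim_cards', sim_cards)
df.insert(5, 'has_5g', has_5g)
df.insert(6, 'has_nfc', has_nfc)
df.insert(7, 'has_ir_blaster', has_ir_blaster)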
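
The `number_of_sim_cards` column created above still holds text labels. If a numeric count is preferred, a mapping like the one below could be used. This is a sketch only: `sim_count` and `SIM_COUNTS` are hypothetical names, and returning `None` for unknown labels is an assumption about how non-phone rows should be treated.

```python
from typing import Optional

# Known labels that can appear as the first token of the `sim` column.
SIM_COUNTS = {"Single Sim": 1, "Dual Sim": 2}

def sim_count(sim: str) -> Optional[int]:
    """Map a `sim` cell to its number of SIM slots, or None if it lists none."""
    first = sim.split(",")[0].strip()
    return SIM_COUNTS.get(first)

print(sim_count("Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC"))
print(sim_count("Single Sim"))
print(sim_count("Wi-Fi"))  # no SIM label at all
```

A `None` result is a convenient flag for rows that are not phones, such as an entry whose `sim` value is just `Wi-Fi`.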

In [47]:
# test
# Checking for result

df.head()

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,54999,89,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",Dual Sim,True,True,False,"Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,19989,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",Dual Sim,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,16499,75,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",Dual Sim,True,False,False,"Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,14999,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",Dual Sim,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,24999,82,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",Dual Sim,True,False,False,"Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13


In [48]:
# Checking for phones where there is no "Dual Sim"

df[df["number_of_sim_cards"] != "Dual Sim"]

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
174,Apple iPhone 9,29990,61,"Single Sim, 3G, 4G, VoLTE, Wi-Fi, NFC",Single Sim,False,True,False,"A13 Bionic, Hexa Core, 2.65 GHz Processor","3 GB RAM, 64 GB inbuilt",2050 mAh Battery with Fast Charging,"4.7 inches, 750 x 1334 px Display with Large N...",12 MP Rear & 7 MP Front Camera,Memory Card Not Supported,iOS v13.0
306,Samsung Galaxy Z Flip 3,69999,84,"Single Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",Single Sim,True,True,False,"Snapdragon 888, Octa Core, 2.84 GHz Processor","8 GB RAM, 128 GB inbuilt",3300 mAh Battery with 15W Fast Charging,"6.7 inches, 1080 x 2640 px, 120 Hz Display wit...","Foldable Display, Dual Display",12 MP + 12 MP Dual Rear & 10 MP Front Camera,Memory Card Not Supported
365,OPPO Find N Flip,89990,88,"Single Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",Single Sim,True,True,False,"Dimensity 9000, Octa Core, 3.05 GHz Processor","8 GB RAM, 128 GB inbuilt",4300 mAh Battery with 44W Fast Charging,"6.8 inches, 1200 x 2400 px, 120 Hz Display wit...","Foldable Display, Dual Display",50 MP + 8 MP Dual Rear & 32 MP Front Camera,Memory Card Not Supported
392,OPPO Find N2 Flip,70990,88,"Single Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",Single Sim,True,True,False,"Dimensity 9000 Plus, Octa Core, 3.2 GHz Processor","8 GB RAM, 256 GB inbuilt",4300 mAh Battery with 44W Fast Charging,"6.8 inches, 1080 x 2520 px, 120 Hz Display wit...","Foldable Display, Dual Display",50 MP + 8 MP Dual Rear & 32 MP Front Camera,Memory Card Not Supported
429,Nokia X50 5G,34999,76,"Single Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",Single Sim,True,False,False,Snapdragon 775,"6 GB RAM, 64 GB inbuilt",6000 mAh Battery with 33W Fast Charging,"6.81 inches, 1080 x 2400 px Display with Punch...",108 MP Quad Rear & 32 MP Front Camera,Memory Card Supported,Android v11
431,Vertu Signature Touch,650000,62,"Single Sim, 3G, 4G, Wi-Fi, NFC",Single Sim,False,True,False,"Snapdragon 801, Octa Core, 1.5 GHz Processor","2 GB RAM, 64 GB inbuilt",2275 mAh Battery,"4.7 inches, 1080 x 1920 px Display",13 MP Rear & 2.1 MP Front Camera,Memory Card Not Supported,Android v4.4.2 (KitKat)
457,Sony Xperia L5 5G,15990,73,"Single Sim, 3G, 4G, VoLTE, Wi-Fi, NFC",Single Sim,False,True,False,"Helio P35, Octa Core, 2.3 GHz Processor","4 GB RAM, 64 GB inbuilt",4000 mAh Battery with Fast Charging,"6.2 inches, 720 x 1680 px Display",13 MP + 5 MP + 2 MP Triple Rear & 8 MP Front C...,"Memory Card (Hybrid), upto 512 GB",Android v12
506,Google Pixel 2 XL,15990,69,"Single Sim, 3G, 4G, VoLTE, Wi-Fi, NFC",Single Sim,False,True,False,"Snapdragon 835, Octa Core, 2.35 GHz Processor","4 GB RAM, 128 GB inbuilt",3520 mAh Battery with Fast Charging,"6 inches, 1440 x 2880 px Display",12.2 MP Rear & 8 MP Front Camera,Memory Card Not Supported,Android v8.0 (Oreo)
573,Nokia 105 (2019),1299,0,Single Sim,Single Sim,False,False,False,No Wifi,"4 MB RAM, 4 MB inbuilt",800 mAh Battery,"1.77 inches, 120 x 160 px Display",No Rear Camera,No data,No data
607,Apple iPhone 7s,52990,0,"Single Sim, 3G, 4G, VoLTE, Wi-Fi",Single Sim,False,False,False,"Fusion APL1024, Quad Core, 2.37 GHz Processor","3 GB RAM, 32 GB inbuilt",2230 mAh Battery,"4.7 inches, 750 x 1334 px Display",13 MP Rear & 7 MP Front Camera,iOS v10,No FM Radio


In [49]:
# There is also an iPod in the dataset

df.iloc[753]

model                       Apple iPod Touch (7th Gen)
price                                            18900
rating                                               0
sim                                              Wi-Fi
number_of_sim_cards                              Wi-Fi
has_5g                                           False
has_nfc                                          False
has_ir_blaster                                   False
processor                                32 GB inbuilt
ram                    4 inches, 640 x 1136 px Display
battery                8 MP Rear & 1.2 MP Front Camera
display                                        iOS v12
camera                                     No FM Radio
card                                         Bluetooth
os                                             Browser
Name: 753, dtype: object

In [50]:
# So the unique values in this column are

df["number_of_sim_cards"].unique()

array(['Dual Sim', 'Single Sim', 'Wi-Fi'], dtype=object)

In [51]:
# Mapping the SIM labels to counts; the counts are stored as strings here
# and are parsed as numbers when the saved CSV is reloaded

def condition(x):
    if x == "Dual Sim":
        return "2"
    elif x == "Wi-Fi":
        return np.nan
    else:
        return "1"

df["number_of_sim_cards"] = df["number_of_sim_cards"].apply(condition)
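
The same mapping can be sketched without a Python-level function by using `Series.map` with a dictionary. This is an illustrative alternative, not the approach taken in this notebook: the `demo` frame and the `sim_counts` dictionary are made-up names, and the nullable `Int64` dtype is an assumption chosen so that the Wi-Fi-only row stays missing while the counts remain integers. Unlike `condition`, any label that is not in the dictionary becomes missing instead of falling back to 1.

```python
import pandas as pd

# Hypothetical miniature of the column, using the three labels seen above
demo = pd.DataFrame({"number_of_sim_cards": ["Dual Sim", "Single Sim", "Wi-Fi"]})

# Labels absent from the dictionary (here "Wi-Fi") become missing values
sim_counts = {"Dual Sim": 2, "Single Sim": 1}
demo["number_of_sim_cards"] = (
    demo["number_of_sim_cards"].map(sim_counts).astype("Int64")
)

print(demo)
```

Whether unknown labels should become missing or default to a single SIM is a judgment call that depends on how much the raw `sim` column can be trusted.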

In [52]:
# test
# Checking the result

df.head()

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,54999,89,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",2,True,True,False,"Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,19989,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,16499,75,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2,True,False,False,"Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,14999,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,24999,82,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2,True,False,False,"Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13


In [53]:
# Checking the row containing the iPod

df[df["number_of_sim_cards"].isnull()]

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
753,Apple iPod Touch (7th Gen),18900,0,Wi-Fi,,False,False,False,32 GB inbuilt,"4 inches, 640 x 1136 px Display",8 MP Rear & 1.2 MP Front Camera,iOS v12,No FM Radio,Bluetooth,Browser


In [55]:
# Saving the progress up to this point

try:
    df.to_csv("datasets/Data Analysis/phones.csv", index=False)
except Exception as err:
    print(err)
else:
    print("File created successfully")

File created successfully
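
Column datatypes are not stored in a CSV file, so they are inferred again on reload. The sketch below illustrates this with an in-memory buffer instead of a file on disk; the `before` frame and its two rows of made-up models are hypothetical stand-ins for the real data.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in: SIM counts stored as strings, plus one missing value
before = pd.DataFrame({
    "model": ["Phone A", "Phone B", "Player C"],
    "number_of_sim_cards": ["2", "1", np.nan],
})

# Round-trip through an in-memory CSV buffer
buffer = io.StringIO()
before.to_csv(buffer, index=False)
buffer.seek(0)
after = pd.read_csv(buffer)

# The reloaded column is numeric; the missing value forces a float dtype
print(before["number_of_sim_cards"].dtype)
print(after["number_of_sim_cards"].dtype)
```

If the integer type matters downstream, the column can be converted after loading, for example with `astype("Int64")`.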


In [16]:
# Checking the dataset

df = pd.read_csv("datasets/Data Analysis/phones.csv")
df

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,54999,89,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",2.0,True,True,False,"Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,19989,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,16499,75,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,14999,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,24999,82,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1014,Motorola Moto Edge S30 Pro,34990,83,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Snapdragon 8 Gen1, Octa Core, 3 GHz Processor","8 GB RAM, 128 GB inbuilt",5000 mAh Battery with 68.2W Fast Charging,"6.67 inches, 1080 x 2460 px, 120 Hz Display wi...",64 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Android v12,No FM Radio
1015,Honor X8 5G,14990,75,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,True,False,False,"Snapdragon 480+, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 22.5W Fast Charging,"6.5 inches, 720 x 1600 px Display with Water D...",48 MP + 2 MP + Depth Sensor Triple Rear & 8 MP...,"Memory Card Supported, upto 1 TB",Android v11
1016,POCO X4 GT 5G (8GB RAM + 256GB),28990,85,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC, IR Bl...",2.0,True,True,True,"Dimensity 8100, Octa Core, 2.85 GHz Processor","8 GB RAM, 256 GB inbuilt",5080 mAh Battery with 67W Fast Charging,"6.6 inches, 1080 x 2460 px, 144 Hz Display wit...",64 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,Memory Card Not Supported,Android v12
1017,Motorola Moto G91 5G,19990,80,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",2.0,True,True,False,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.8 inches, 1080 x 2400 px Display with Punch ...",108 MP + 8 MP + 2 MP Triple Rear & 32 MP Front...,"Memory Card Supported, upto 1 TB",Android v12


In [21]:
# Changing the values of columns "has_5g", "has_nfc" and "has_ir_blaster" to "Yes" and "No"

for col in ["has_5g", "has_nfc", "has_ir_blaster"]:
    df[col] = df[col].map({True: "Yes", False: "No"})

In [22]:
# Checking the result

df.head(5)

Unnamed: 0,model,price,rating,sim,number_of_sim_cards,has_5g,has_nfc,has_ir_blaster,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,54999,89,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC",2.0,Yes,Yes,No,"Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13
1,OnePlus Nord CE 2 Lite 5G,19989,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,Yes,No,No,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 33W Fast Charging,"6.59 inches, 1080 x 2412 px, 120 Hz Display wi...",64 MP + 2 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
2,Samsung Galaxy A14 5G,16499,75,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,Yes,No,No,"Exynos 1330, Octa Core, 2.4 GHz Processor","4 GB RAM, 64 GB inbuilt",5000 mAh Battery with 15W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 13 MP Front ...,"Memory Card Supported, upto 1 TB",Android v13
3,Motorola Moto G62 5G,14999,81,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,Yes,No,No,"Snapdragon 695, Octa Core, 2.2 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with Fast Charging,"6.55 inches, 1080 x 2400 px, 120 Hz Display wi...",50 MP + 8 MP + 2 MP Triple Rear & 16 MP Front ...,"Memory Card (Hybrid), upto 1 TB",Android v12
4,Realme 10 Pro Plus,24999,82,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi",2.0,Yes,No,No,"Dimensity 1080, Octa Core, 2.6 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 67W Fast Charging,"6.7 inches, 1080 x 2412 px, 120 Hz Display wit...",108 MP + 8 MP + 2 MP Triple Rear & 16 MP Front...,Memory Card Not Supported,Android v13


### Step 3: Solving the `Validity` issues

- Changing the datatypes of the `price` and `rating` columns.
- Also removing the `₹` and `,` from the `price` column.

In [47]:
# Issue
# The datatype of the column is object

phones["price"].dtype

dtype('O')

In [48]:
# Issue
# The datatype of this column is float, although the ratings are whole numbers

phones["rating"].dtype

dtype('float64')

In [49]:
# Issue
# The "price" column also has the "₹" sign and "," in it.

phones.head(1)

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
0,OnePlus 11 5G,"₹54,999",89.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen2, Octa Core, 3.2 GHz Processor","12 GB RAM, 256 GB inbuilt",5000 mAh Battery with 100W Fast Charging,"6.7 inches, 1440 x 3216 px, 120 Hz Display wit...",50 MP + 48 MP + 32 MP Triple Rear & 16 MP Fron...,Memory Card Not Supported,Android v13


In [51]:
# Checking if there are any missing values in the rating column

phones["rating"].isnull().sum()

141

In [52]:
# Solution
# Removing the "₹" sign and "," from the "price" column
# and changing its datatype to integer
# Changing the datatype of the "rating" column to integer as well
# Here we need the nullable "Int64" dtype because the column has some "NaN" values

phones["price"] = phones["price"].str.replace('₹', '').str.replace(',', '').astype('int')
phones["rating"] = phones["rating"].astype('Int64')
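
A slightly more defensive variant of the same cleaning step is sketched below. It is not the code used in this notebook: the `sample` frame is a hypothetical miniature, and the choice of `pd.to_numeric` with `errors="coerce"` is an assumption made so that an unexpected value becomes missing instead of raising an error.

```python
import pandas as pd

# Hypothetical sample in the same shape as the raw columns
sample = pd.DataFrame({
    "price": ["₹54,999", "₹19,989", "₹99"],
    "rating": [89.0, 81.0, None],
})

# Drop every non-digit character (currency sign, thousands separators)
digits = sample["price"].str.replace(r"[^\d]", "", regex=True)

# errors="coerce" turns anything unparseable into NaN instead of raising;
# the nullable Int64 dtype can then hold both integers and missing values
sample["price"] = pd.to_numeric(digits, errors="coerce").astype("Int64")
sample["rating"] = sample["rating"].astype("Int64")

print(sample.dtypes)
```

Using `Int64` for both columns also avoids the platform-dependent width of plain `int`, which shows up as `int32` in the output of the next cell.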

In [53]:
# test
# Checking the result

print("Now datatype of price column is: ", phones["price"].dtype)
print("Now datatype of rating column is: ", phones["rating"].dtype)
phones.sample(5)

Now datatype of price column is:  int32
Now datatype of rating column is:  Int64


Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
636,Apple iPhone 14 (512GB),95999,82.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Bionic A15, Hexa Core, 3.22 GHz Processor","6 GB RAM, 512 GB inbuilt",3279 mAh Battery with Fast Charging,"6.1 inches, 1170 x 2532 px Display with Small ...",12 MP + 12 MP Dual Rear & 12 MP Front Camera,Memory Card Not Supported,iOS v16
782,Vivo Y12G (3GB RAM + 64GB),11990,68.0,"Dual Sim, 3G, 4G, VoLTE, Wi-Fi","Snapdragon 439 , Octa Core, 2 GHz Processor","3 GB RAM, 64 GB inbuilt",5000 mAh Battery,"6.51 inches, 720 x 1600 px Display with Water ...",13 MP + 2 MP + 2 MP Triple Rear & 8 MP Front C...,Memory Card Supported,Android v11
545,Lenovo Legion Y90,46990,87.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi, NFC","Snapdragon 8 Gen1, Octa Core, 3 GHz Processor","12 GB RAM, 256 GB inbuilt",5600 mAh Battery with 68W Fast Charging,"6.92 inches, 1080 x 2460 px, 144 Hz Display",64 MP + 13 MP Dual Rear & 16 MP Front Camera,Android v12,No FM Radio
222,Realme 9i 5G (6GB RAM + 128GB),16999,78.0,"Dual Sim, 3G, 4G, 5G, VoLTE, Wi-Fi","Dimensity 810 5G, Octa Core, 2.4 GHz Processor","6 GB RAM, 128 GB inbuilt",5000 mAh Battery with 18W Fast Charging,"6.6 inches, 1080 x 2408 px, 90 Hz Display with...",50 MP + 2 MP + 2 MP Triple Rear & 8 MP Front C...,"Memory Card Supported, upto 1 TB",Android v12
447,Infinix Smart 6 HD,6999,,"Dual Sim, 3G, 4G, VoLTE, Wi-Fi","Helio A22, Quad Core, 2 GHz Processor","2 GB RAM, 32 GB inbuilt",5000 mAh Battery,"6.6 inches, 720 x 1600 px Display with Water D...",8 MP Dual Rear & 5 MP Front Camera,"Memory Card Supported, upto 512 GB",Android v11


In [54]:
# describe
# To check the distribution of values in "price" column

phones.describe()

Unnamed: 0,price,rating
count,1020.0,879.0
mean,31371.767647,78.258248
std,39168.94259,7.402854
min,99.0,60.0
25%,12464.25,74.0
50%,19815.0,80.0
75%,34999.0,84.0
max,650000.0,89.0


In [55]:
# As we can see there is a price of "99" which seems unusual
# Let's see that row

phones[phones["price"] == 99]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
608,Namotel Achhe Din,99,,"Dual Sim, 3G, Wi-Fi","1 GB RAM, 4 GB inbuilt",1325 mAh Battery,"4 inches, 720 x 1280 px Display",2 MP Rear & 0.3 MP Front Camera,Android v5.0 (Lollipop),Bluetooth,


### Step 4: Solving the `Accuracy` issues

- The phone with a `price` of 99 needs to be dropped because its actual price is unknown.

In [56]:
# Issue

phones[phones["price"] == 99]

Unnamed: 0,model,price,rating,sim,processor,ram,battery,display,camera,card,os
608,Namotel Achhe Din,99,,"Dual Sim, 3G, Wi-Fi","1 GB RAM, 4 GB inbuilt",1325 mAh Battery,"4 inches, 720 x 1280 px Display",2 MP Rear & 0.3 MP Front Camera,Android v5.0 (Lollipop),Bluetooth,


In [57]:
# Solution
# Dropping the row

phones = phones[phones["price"] != 99]

In [58]:
# test
# Checking the result

phones.shape

(1019, 11)
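
The drop-and-verify pattern above can be sketched on a small hypothetical frame. The names `sample`, `suspicious` and `cleaned` are made up for illustration, and resetting the index is an optional extra that the notebook itself does not do.

```python
import pandas as pd

# Hypothetical miniature with one suspicious price
sample = pd.DataFrame({
    "model": ["Phone A", "Phone B", "Phone C"],
    "price": [54999, 99, 19989],
})

rows_before = len(sample)
suspicious = sample["price"] == 99

# Keep every row that is not flagged, then renumber the index
cleaned = sample[~suspicious].reset_index(drop=True)

# Exactly as many rows should disappear as were flagged
assert len(cleaned) == rows_before - suspicious.sum()
print(cleaned)
```

Comparing row counts before and after mirrors the `shape` check above, where the frame goes from 1020 rows to 1019.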

### Step 5: Solving the `Consistency` issues