## 1. Understanding the business

Bob has started his own mobile phone company and wants to compete with established players such as Apple and Samsung.

He does not know how to price the phones his company makes, and in a competitive mobile phone market he cannot simply guess. To address this, he has collected sales data on mobile phones from various companies.

Bob wants to find the relationship between a phone's features (e.g., RAM, internal memory) and its selling price. He is not experienced in machine learning, so he needs your help to solve this problem.

## 2. Data Working

### 2.1. Imports

In [4]:
# basic imports
import pandas as pd
import numpy as np

# visualization imports
import matplotlib.pyplot as plt
import seaborn as sns

# warnings
import warnings
warnings.filterwarnings('ignore')

### 2.2. Notebook Options

In [5]:
# Set table to show all columns
pd.set_option('display.max_columns', None)

# Seaborn Settings
sns.set_theme(style="whitegrid", palette="Dark2")

# Global Values
SEED = 99

### 2.3. Dataset Imports

In [6]:
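# Read the training dataset; index_col=0 uses the first CSV column as the index.
# A forward slash avoids the invalid '\m' escape sequence and works on every OS.
df = pd.read_csv('CSV/mobile.csv', index_col=0)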

## 3. Data Check & Inspection

In [7]:
df.head()

battery_power,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,pc,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi,price_range
842,0,2.2,0.0,1,0,7,0.6,188.0,2,2,20,756,2549,9,7,19,0,0,1,1
1021,1,0.5,1.0,0,1,53,0.7,136.0,3,6,905,1988,2631,17,3,7,1,1,0,2
563,1,0.5,1.0,2,1,41,0.9,145.0,5,6,1263,1716,2603,11,2,9,1,1,0,2
615,1,2.5,0.0,0,0,10,0.8,131.0,6,9,1216,1786,2769,16,8,11,1,0,0,2
1821,1,1.2,0.0,13,1,44,0.6,141.0,2,14,1208,1212,1411,8,2,15,1,1,0,1
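
In the table above, `battery_power` appears as the index and not as a regular column, which is what `index_col=0` does to the first column of the file. As a result, calls such as `df.info()`, `df.describe()` and `df.isna()` do not report on it. The sketch below reproduces the effect on a made-up two-row sample (the CSV text is an assumption for illustration, not the real file) and shows one possible way to get the column back with `reset_index()`.

```python
import io

import pandas as pd

# Hypothetical two-row sample with the same leading columns as the table above
csv_text = "battery_power,blue,clock_speed\n842,0,2.2\n1021,1,0.5\n"

# index_col=0 turns the first column (battery_power) into the index
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

# reset_index() moves battery_power back into the regular columns
restored = indexed.reset_index()

print(indexed.columns.tolist())
print(restored.columns.tolist())
```

Whether `battery_power` should stay in the index depends on the later analysis; if it is meant to be a feature, reading the file without `index_col=0` would be an alternative.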


In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 3000 entries, 842 to 1000
Data columns (total 20 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   blue          3000 non-null   int64  
 1   clock_speed   3000 non-null   float64
 2   dual_sim      3000 non-null   float64
 3   fc            3000 non-null   int64  
 4   four_g        3000 non-null   int64  
 5   int_memory    3000 non-null   int64  
 6   m_dep         3000 non-null   float64
 7   mobile_wt     3000 non-null   float64
 8   n_cores       3000 non-null   int64  
 9   pc            3000 non-null   int64  
 10  px_height     3000 non-null   int64  
 11  px_width      3000 non-null   int64  
 12  ram           3000 non-null   int64  
 13  sc_h          3000 non-null   int64  
 14  sc_w          3000 non-null   int64  
 15  talk_time     3000 non-null   int64  
 16  three_g       3000 non-null   int64  
 17  touch_screen  3000 non-null   int64  
 18  wifi          3000 non-null   i

In [9]:
df.describe()

,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,pc,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi,price_range
count,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0,3000.0
mean,416.5,1.186833,0.8533,3.045333,1.878667,21.526667,11.551833,93.671833,49.517333,8.053667,433.423333,1043.384,1829.4,721.203667,7.843,9.112667,4.202667,0.587333,0.504667,1.169
std,639.16693,0.867291,0.795118,3.980439,3.238432,20.999181,18.810166,71.941799,66.774816,5.761931,470.037698,523.078249,1011.226753,1183.132467,5.243172,5.753116,5.820672,0.492396,0.500062,1.065904
min,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.1,1.0,0.0,0.0,0.0,256.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.5,0.0,0.0,0.0,1.0,0.4,0.8,4.0,3.0,15.0,646.75,1015.0,11.0,4.0,4.0,1.0,0.0,0.0,0.0
50%,1.0,1.0,1.0,1.0,1.0,16.0,0.8,109.0,7.0,7.0,282.5,1019.5,1647.0,16.0,7.0,8.0,1.0,1.0,1.0,1.0
75%,895.0,1.9,1.0,5.0,1.0,40.0,18.0,156.0,109.25,12.0,727.25,1461.25,2573.25,1235.75,12.0,14.0,6.25,1.0,1.0,2.0
max,1999.0,3.0,3.0,19.0,19.0,64.0,64.0,200.0,200.0,20.0,1960.0,1998.0,3998.0,3989.0,19.0,20.0,20.0,1.0,1.0,3.0
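
The summary above reports maxima such as 1999.0 for `blue` and 3.0 for `dual_sim`. If these columns are meant to be 0/1 flags, those values would point to a data quality problem worth investigating before modeling. The following is a minimal sketch of a range check; the list of flag columns is an assumption based on the column names, and the toy frame is made up for illustration.

```python
import pandas as pd

# Assumption: these columns are 0/1 flags; adjust the list to the real schema
FLAG_COLUMNS = ["blue", "dual_sim", "four_g", "three_g", "touch_screen", "wifi"]


def out_of_range_flags(frame, columns=FLAG_COLUMNS):
    """Count, per flag column, the rows whose value is not 0 or 1."""
    # Only check the columns that actually exist in the frame
    present = [c for c in columns if c in frame.columns]
    return (~frame[present].isin([0, 1])).sum()


# Toy data (made up for illustration): dual_sim contains an invalid 3.0
toy = pd.DataFrame(
    {"blue": [0, 1, 1], "dual_sim": [0.0, 3.0, 1.0], "wifi": [1, 0, 0]}
)
bad_counts = out_of_range_flags(toy)
print(bad_counts)
```

On the real data this could be called as `out_of_range_flags(df)`; any non-zero count marks a column to inspect more closely.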


In [10]:
df.isna().sum()

blue            0
clock_speed     0
dual_sim        0
fc              0
four_g          0
int_memory      0
m_dep           0
mobile_wt       0
n_cores         0
pc              0
px_height       0
px_width        0
ram             0
sc_h            0
sc_w            0
talk_time       0
three_g         0
touch_screen    0
wifi            0
price_range     0
dtype: int64
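
`isna()` only counts `NaN` entries, so a placeholder such as 0 in a measurement column would go unnoticed here. As a possible complement, the sketch below reports `NaN` counts and zero counts side by side. The helper name `missing_report` and the toy frame are hypothetical, and whether a zero is actually implausible in a given column is a judgment the reader has to make (for flag columns, zeros are expected).

```python
import pandas as pd


def missing_report(frame):
    """Return NaN counts and zero counts side by side, one row per column."""
    return pd.DataFrame(
        {
            "n_nan": frame.isna().sum(),
            "n_zero": (frame == 0).sum(),  # NaN compares as False, so it is not counted
        }
    )


# Toy data (made up for illustration)
toy = pd.DataFrame({"px_height": [20, 0, 1263], "sc_w": [7.0, None, 0.0]})
report = missing_report(toy)
print(report)
```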