## 1. Introduction
Smartphones are an integral part of daily life. When buying one, people weigh many factors, such as the display, processor, memory, camera, thickness, battery, and connectivity. One factor they often overlook is whether the product is worth its cost. Because there is no easy way to cross-validate the price, buyers can end up making the wrong decision.

This project aims to solve this issue by using historical data on the key features of smartphones, along with their cost, to develop a model that predicts the approximate price of a new smartphone with reasonable accuracy.

### Approach
A reminder of the general approach to working on a machine learning project:

1. Start by loading and viewing the dataset. Make sure you understand what the data looks like, including the data types and value ranges.
2. Prepare the data: make sure no values are missing and that the data is ready for your ML model to make predictions.
3. Build some intuition about your data by exploring the features.
4. Finally, build a machine learning model that can predict the price of a smartphone from its features.


### Notes
- The data is stored in a CSV file named `train_mobil_data.csv`
- The features present in this data are:

    * `id`: ID
    * `battery_power`: Total energy a battery can store at one time, measured in mAh
    * `blue`: Has Bluetooth or not
    * `clock_speed`: Speed at which the microprocessor executes instructions
    * `dual_sim`: Has dual SIM support or not
    * `fc`: Front camera megapixels
    * `four_g`: Has 4G or not
    * `int_memory`: Internal memory in gigabytes
    * `m_dep`: Mobile depth in cm
    * `mobile_wt`: Weight of the mobile phone
    * `n_cores`: Number of processor cores
    * `pc`: Primary camera megapixels
    * `px_height`: Pixel resolution height
    * `px_width`: Pixel resolution width
    * `ram`: Random access memory in megabytes
    * `sc_h`: Screen height of the mobile in cm
    * `sc_w`: Screen width of the mobile in cm
    * `talk_time`: Longest time that a single battery charge will last when you are
    * `three_g`: Has 3G or not
    * `touch_screen`: Has touch screen or not
    * `wifi`: Has Wi-Fi or not



## Data Load
Load the data from the file provided and inspect it

In [None]:
# Import pandas
import pandas as pd

# Load dataset
dataset = pd.read_csv('../input/train.csv')

# Inspect data
dataset.head()

## Data Exploration
In this step you look at and understand your data using statistical and visualization methods. It helps you identify patterns and problems in the dataset, and decide which model or algorithm to use in subsequent steps.

The steps you should consider in this stage include:

- Identify the input (feature) and output (target) variables in your data
- Identify the data types
- Identify categorical vs. continuous variables
- Understand the statistical properties of your variables

In [None]:
# Code for data exploration
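
The exploration steps listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the DataFrame is synthetic stand-in data built from a few of the listed feature names, the target column name `price_range` is hypothetical, and the "10 distinct values" rule for spotting categorical columns is an arbitrary heuristic.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption): a few columns from the
# feature list plus a hypothetical target column named `price_range`.
rng = np.random.default_rng(0)
n = 200
dataset = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
    "clock_speed": rng.uniform(0.5, 3.0, n).round(1),
    "four_g": rng.integers(0, 2, n),
})
dataset["price_range"] = pd.qcut(dataset["ram"], 4, labels=False)

# Separate the input features from the output (target) variable
target = "price_range"
features = [col for col in dataset.columns if col != target]

# Data type of each column
print(dataset.dtypes)

# Treat columns with only a few distinct values as categorical
# (the threshold of 10 is an arbitrary choice for this sketch)
n_unique = dataset[features].nunique()
categorical = n_unique[n_unique <= 10].index.tolist()
continuous = n_unique[n_unique > 10].index.tolist()
print("categorical:", categorical)
print("continuous:", continuous)

# Summary statistics of the continuous features
print(dataset[continuous].describe())
```

On the real data you would skip the synthetic block and run the same calls on the `dataset` loaded earlier.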

## Data Visualization
We now have a basic idea about the data. We need to extend that with some visualizations.

We are going to look at two types of plots:

- Histograms, to get an idea of each variable's distribution
- Scatter plots, to find correlations between variables

In [None]:
# Code for data visualization
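
A minimal sketch of the two plot types, assuming matplotlib is available. The data is synthetic stand-in data, and `price_range` is a hypothetical target column name used only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption) with a hypothetical `price_range` target
rng = np.random.default_rng(0)
n = 200
dataset = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
})
dataset["price_range"] = pd.qcut(dataset["ram"], 4, labels=False)

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: distribution of a single variable
ax_hist.hist(dataset["battery_power"], bins=20)
ax_hist.set_xlabel("battery_power")
ax_hist.set_ylabel("count")

# Scatter plot: relationship between two variables
ax_scatter.scatter(dataset["ram"], dataset["price_range"], alpha=0.5)
ax_scatter.set_xlabel("ram")
ax_scatter.set_ylabel("price_range")

fig.tight_layout()
```

In a notebook the figure is displayed when the cell finishes. Repeating the histogram for each continuous feature gives a quick overview of skew and outliers.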

## Data Preparation
You must now transform the raw data so that it can be run through your ML model:

- Modify the data type of each feature (if needed)
- Look for missing values and replace or remove them
- Transform skewed variables
- Remove outliers

In [None]:
# Code for data preparation
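
One possible way to carry out the preparation steps above is sketched below. Everything here is an assumption for illustration: the data is synthetic, the missing values and the outlier are injected on purpose, and median filling, `log1p`, and the 1.5 × IQR rule are common choices rather than the required ones.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption) with deliberately injected problems
rng = np.random.default_rng(0)
n = 200
dataset = pd.DataFrame({
    "ram": rng.integers(256, 4000, n).astype(float),
    "px_height": rng.lognormal(mean=6.0, sigma=1.0, size=n),  # right-skewed
    "four_g": rng.integers(0, 2, n).astype(float),
})
dataset.loc[[3, 17], "ram"] = np.nan        # two missing values
dataset.loc[5, "px_height"] = 1_000_000.0   # one extreme outlier

# Missing values: count them, then fill with the column median
missing_before = dataset.isna().sum()
print(missing_before)
dataset["ram"] = dataset["ram"].fillna(dataset["ram"].median())

# Data types: store the binary flag as an integer
dataset["four_g"] = dataset["four_g"].astype(int)

# Skewed variables: log1p compresses the long right tail
dataset["px_height_log"] = np.log1p(dataset["px_height"])

# Outliers: keep rows within 1.5 * IQR of the quartiles
q1, q3 = dataset["px_height_log"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = dataset["px_height_log"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
prepared = dataset[in_range].copy()
print(len(dataset), "rows before,", len(prepared), "rows after")
```

Whether to fill or drop missing values, and whether an extreme value is an error or a genuine observation, depends on what the exploration step showed.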

## Feature Engineering
You must now begin the process of extracting more information from existing data. You are not adding any new data here, but you are actually making the data you already have more useful.


The steps you should consider in this stage include:

- Developing new features in addition to the existing ones

- Selecting a set of features to remove

- Creating features using existing data through mathematical operations 

- Applying feature scaling

- Applying label encoding

- Understanding correlation between features and target




In [None]:
# Code for feature engineering

In [None]:
# Code for feature engineering
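
The sketch below illustrates a few of the steps listed above: creating features through mathematical operations, checking correlation with the target, removing features, and scaling. It assumes scikit-learn is installed. The data is synthetic, and the derived columns `px_area` and `sc_area` and the target name `price_range` are hypothetical names chosen for this example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption) with a hypothetical `price_range` target
rng = np.random.default_rng(0)
n = 200
dataset = pd.DataFrame({
    "px_height": rng.integers(100, 1960, n),
    "px_width": rng.integers(500, 2000, n),
    "sc_h": rng.integers(5, 20, n),
    "sc_w": rng.integers(1, 19, n),
    "ram": rng.integers(256, 4000, n),
})
dataset["price_range"] = pd.qcut(dataset["ram"], 4, labels=False)

# New features from existing columns through mathematical operations
dataset["px_area"] = dataset["px_height"] * dataset["px_width"]
dataset["sc_area"] = dataset["sc_h"] * dataset["sc_w"]

# Correlation between each feature and the target
correlations = dataset.corr()["price_range"].drop("price_range")
print(correlations.sort_values(ascending=False))

# Feature removal: drop the raw screen sizes now summarised by `sc_area`
features = dataset.drop(columns=["price_range", "sc_h", "sc_w"])

# Feature scaling: zero mean and unit variance for every column
scaler = StandardScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(features),
    columns=features.columns,
    index=features.index,
)
print(scaled.describe().loc[["mean", "std"]])
```

In a full workflow the scaler would be fitted on the training split only and then applied to the validation and test splits, so that no information from held-out data leaks into training.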

## Build Train and Test
To train a model, we first split the data into three sections: training data, validation data, and testing data.
You train the classifier on the training set, tune the parameters using the validation set, and then test the performance of your classifier on the unseen test set.

- Note: while training the classifier, only the training and/or validation sets are available. The test set must not be used during training; it is only used when testing the classifier.

In [None]:
# Import train_test_split
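
A three-way split can be obtained by calling scikit-learn's `train_test_split` twice, as sketched here. The 60/20/20 proportions, the synthetic data, and the hypothetical `price_range` target are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption)
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
})
y = pd.qcut(X["ram"], 4, labels=False)  # hypothetical target `price_range`

# First hold out 20% of the rows as the test set
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Then split the remainder: 25% of the remaining 80% = 20% of all rows
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42, stratify=y_rest
)

print(len(X_train), len(X_val), len(X_test))
```

`stratify` keeps the class proportions similar across the three sets, and `random_state` makes the split reproducible.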


## Building the Model

Now that the data has been processed, it is time to determine which model will be used to make our predictions.

Consider the following points before choosing a model:

- The type of prediction this project requires (classification/regression)
- How well you understand the model you want to use
- The previous performance of the model you choose on similar data



In [None]:
# Code to build the model
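
As one illustration, the sketch below fits a logistic regression classifier inside a scikit-learn pipeline. This assumes the task is framed as classification of a hypothetical `price_range` column; if the target were a continuous price, a regressor would take its place. The data is synthetic, and logistic regression is just one reasonable starting point, not the prescribed model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption)
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
})
y = pd.qcut(X["ram"], 4, labels=False)  # hypothetical target `price_range`

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling and the classifier are chained so that the scaler is fitted
# on the training data only
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Predict the class of each held-out row
predictions = model.predict(X_test)
print(predictions[:10])
```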

## Evaluating and Accuracy Metrics
<p>But how well does our model perform?</p>
<p>We will now evaluate our model on the test set with respect to <a href="https://developers.google.com/machine-learning/crash-course/classification/accuracy">classification accuracy</a>. We will also take a look at the model's <a href="http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/">confusion matrix</a>. In the case of predicting credit card applications, for example, it is just as important to check that the model predicts as denied the applications that were originally denied. If the model does not perform well in this respect, it might end up approving an application that should have been denied. The confusion matrix lets us view the model's performance from these angles.</p>

In [None]:
# Data Accuracy metrics code
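
Accuracy and the confusion matrix can be computed with `sklearn.metrics`, as in this sketch. The synthetic data, the hypothetical four-class `price_range` target, and the logistic regression model are assumptions carried over for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data and model (assumptions)
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
})
y = pd.qcut(X["ram"], 4, labels=False)  # hypothetical target `price_range`
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Fraction of test rows whose class was predicted correctly
accuracy = accuracy_score(y_test, predictions)
print("accuracy:", accuracy)

# Rows are the true classes, columns are the predicted classes;
# the diagonal holds the correctly classified rows
matrix = confusion_matrix(y_test, predictions)
print(matrix)
```

Off-diagonal entries show which classes the model confuses with each other, which a single accuracy number hides.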

## Improving Model Performance

<p>Our model was pretty good! It was able to yield an accuracy score of almost 84%.</p>
<p>In the confusion matrix, the first element of the first row denotes the true negatives, meaning the number of negative instances (denied applications) that the model predicted correctly. The last element of the second row denotes the true positives, meaning the number of positive instances (approved applications) that the model predicted correctly.</p>

<p>Hyperparameters to tune with a grid search:</p>
<ul>
<li>tol</li>
<li>max_iter</li>
</ul>

In [None]:
# Import GridSearchCV


In [None]:
# Instantiate GridSearchCV with the required parameters
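
A grid search over `tol` and `max_iter` might look like the sketch below. It assumes the model is scikit-learn's `LogisticRegression`, which accepts both parameters. The candidate values, the synthetic data, and the hypothetical `price_range` target are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption)
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "battery_power": rng.integers(500, 2000, n),
    "ram": rng.integers(256, 4000, n),
})
y = pd.qcut(X["ram"], 4, labels=False)  # hypothetical target `price_range`
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipeline = make_pipeline(StandardScaler(), LogisticRegression())

# make_pipeline names each step after its lowercased class name, so the
# classifier's parameters are addressed as `logisticregression__<name>`
param_grid = {
    "logisticregression__tol": [1e-2, 1e-3, 1e-4],
    "logisticregression__max_iter": [100, 150, 200],
}

# 5-fold cross-validation on the training data only
grid = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

print("best score:", grid.best_score_)
print("best params:", grid.best_params_)
```

Because the search cross-validates within the training data, the test set stays untouched until the final evaluation of `grid.best_estimator_`.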
