<img src="the_leagueAI_Logo.png" alt="the_leagueAI_Logo" width="200"/>

## 1. Introduction

The Banana Company Inc. is a new mobile phone company that wants to disrupt the mobile phone business. Banana wants to compete with Apple and break into this competitive market. However, since they are brand new to the game, they are seeking a Machine Learning Engineer (that’s YOU!) to help them estimate the optimal price at which to sell their mobile phones.

Today Banana Inc. will provide you with a dataset containing the design features and cost of smartphones from different phone manufacturers. The goal is to build a model that can estimate the price range of any given phone based on its features. Having a machine learning model that can estimate the price range of any phone will give Banana Inc. a strategic advantage when deciding how to price their own phones.



<img src="phones_image.jpeg" alt="phones_image" width="1000"/>

## The Data
Banana Inc. has gathered information on the technical features and cost of several competitors' smartphones. The features gathered for each of these phones are described below:

Features:
   - ***id***: Identifies every single phone record available in our dataset
   - ***battery_power***: Indicates the total energy a battery can store at once (measured in mAh)
   - ***blue***: Indicates whether or not the phone has Bluetooth capabilities
   - ***clock_speed***: Indicates the speed at which the microprocessor executes instructions
   - ***dual_sim***: Indicates whether or not the phone has dual-SIM support
   - ***fc***: Indicates the number of megapixels available in the front camera
   - ***four_g***: Indicates whether or not the phone has 4G capabilities
   - ***int_memory***: Indicates the phone’s internal memory in gigabytes
   - ***m_dep***: Indicates the phone’s depth in cm
   - ***mobile_wt***: Indicates the phone’s weight 
   - ***n_cores***: Indicates the number of cores the processor contains
   - ***pc***: Indicates the number of megapixels available in the phone’s primary camera
   - ***px_height***: Indicates the screen’s vertical pixel resolution
   - ***px_width***: Indicates the screen’s horizontal pixel resolution
   - ***ram***: Indicates the phone’s available RAM
   - ***sc_h***: Indicates the phone’s height in centimeters
   - ***sc_w***: Indicates the phone’s width in centimeters
   - ***talk_time***: Indicates the maximum call time that the phone’s battery will last on a single charge
   - ***three_g***: Indicates whether or not the phone has 3G capabilities
   - ***touch_screen***: Indicates whether or not the phone has a touch screen
   - ***wifi***: Indicates whether or not the phone has wi-fi capabilities

Target:
   - ***price_range***: The price category for each phone. The different categories are explained using the table below:
   
| price_range Value | Range Description | Dollar Value Range |
| --- | --- | --- |
| 0 | Budget - Midrange | 0-699 |
| 1 | Premium - Flagship | 700-1300 |

   
   
   
Notes to Consider:
- The data provided to you is stored in a CSV file named mobile_data_raw.csv
- This data was gathered manually and therefore might contain many issues


## The Objective:

***To build a machine learning model that is capable of predicting the price range of a smartphone when provided the technical specifications (features) of that phone.***

## The Approach:
Remember the general approach to working on a Machine Learning Project:

 
    1. Start off by loading and viewing the dataset. Make sure to get a general understanding of how the data looks (data types, numerical ranges, etc.)
 
    2. Prepare the data to make sure that you are not missing any values and that your data will be digested by your machine learning model as expected.
    
    3. Build some intuition about your data by exploring the features. Use the context of the problem to understand how your features will ultimately help your ML model make a prediction.
 
    4. Finally, build the machine learning model and test its accuracy.


---
## Data Load 
Load the data from the file provided and inspect it.

---

Steps to consider:
- Start by importing the modules and packages that you might use in this project
- When importing the data, consider using a DataFrame to store it

In [38]:
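# If an import fails with ModuleNotFoundError, install the packages first
# (for example: pip install pandas numpy) and re-run this cell.
import pandas as pd
import numpy as np

# Variable names cannot contain spaces, and the file is named mobile_data_raw.csv
phone_info_df = pd.read_csv('mobile_data_raw.csv')

phone_info_df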

ModuleNotFoundError: No module named 'pandas'

---
## Data Exploration & Data Visualization
Attempt to understand the data with statistical and visualization methods. This step will help you identify patterns and problems in the dataset.

---

The steps you should consider in this stage include:

- Identify the input (features) and output (target) variables in your data
- Determine the size of the data (data shape)
- Identify the data type of each feature
- Identify the number of missing values in each feature
- Identify categorical vs. continuous (numerical) variables
- Understand the statistical properties of each feature
- Create histograms to get an idea of each feature's distribution
- Create scatter plots to find correlations between variables
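
The checks listed above could be sketched roughly as follows. The rows in `sample_df` are made-up stand-ins (an assumption for illustration, not values from mobile_data_raw.csv), and the "two distinct values means categorical" rule is only a rough heuristic.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows; the real values live in mobile_data_raw.csv.
sample_df = pd.DataFrame({
    "battery_power": [842, 1021, 563, 615, 1821, 1859],
    "blue": [0, 1, 1, 1, 1, 0],
    "clock_speed": [2.2, 0.5, 0.5, 2.5, 1.2, 0.5],
    "ram": [2549, 2631, np.nan, 2769, 1411, 1067],
    "price_range": [0, 1, 1, 1, 0, 0],
})

# Input (features) and output (target) variables.
target = "price_range"
features = [col for col in sample_df.columns if col != target]

# Size of the data and the data type of each column.
print("Shape (rows, columns):", sample_df.shape)
print(sample_df.dtypes)

# Number of missing values in each column.
missing_counts = sample_df.isna().sum()
print(missing_counts)

# Statistical properties (count, mean, std, quartiles) of each numeric column.
print(sample_df.describe())

# Rough heuristic (an assumption): a feature with at most two distinct
# values is treated as a categorical flag, everything else as continuous.
categorical = [col for col in features if sample_df[col].nunique() <= 2]
continuous = [col for col in features if col not in categorical]
print("Categorical:", categorical)
print("Continuous:", continuous)
```

For the plotting steps, pandas offers `DataFrame.hist()` and `DataFrame.plot.scatter(x=..., y=...)`; both need matplotlib installed, so they are left out of the sketch.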


In [41]:
# Preview the first five rows
phone_info_df.head()

# List the column names
phone_info_df.columns

----

## Data Preparation
You must now begin the process of transforming the raw data so that it can be run through your ML model.


---

- Modify the data types of each feature (if needed)
- Look for missing values, replace or remove
- Modify skewed variables
- Remove outliers
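
One possible way to carry out the four steps above is sketched below. The rows are hypothetical, and the specific choices (median imputation, the 1.5 × IQR outlier rule, a `log1p` transform for skew) are common defaults assumed for illustration; other strategies may suit the real data better.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration only; not taken from mobile_data_raw.csv.
raw_df = pd.DataFrame({
    "ram": ["2549", "2631", None, "2769", "1411", "1067"],     # numbers stored as text
    "mobile_wt": [188.0, 136.0, 145.0, 131.0, 141.0, 900.0],   # 900 is a planted outlier
    "fc": [1, 0, 2, 0, 13, 3],                                 # right-skewed counts
})

clean_df = raw_df.copy()

# 1. Data types: convert the text column to numbers (bad entries become NaN).
clean_df["ram"] = pd.to_numeric(clean_df["ram"], errors="coerce")

# 2. Missing values: replace them with the column median.
clean_df["ram"] = clean_df["ram"].fillna(clean_df["ram"].median())

# 3. Outliers: keep only rows within 1.5 * IQR of the quartiles.
q1, q3 = clean_df["mobile_wt"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = clean_df["mobile_wt"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean_df = clean_df[in_range].reset_index(drop=True)

# 4. Skewed variables: log1p compresses large values and handles zeros.
clean_df["fc_log"] = np.log1p(clean_df["fc"])

print(clean_df)
```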

In [42]:
# Group the records by each feature. Variable names cannot contain spaces.
# groupby() is lazy: call an aggregation such as .size() or .mean() to see results.
battery_data = phone_info_df.groupby('battery_power')
bluetooth_data = phone_info_df.groupby('blue')
speed_data = phone_info_df.groupby('clock_speed')
sim_number_data = phone_info_df.groupby('dual_sim')
fc_capabilities_data = phone_info_df.groupby('fc')
bandwidth_data = phone_info_df.groupby('four_g')
memory_size_data = phone_info_df.groupby('int_memory')
wt_data = phone_info_df.groupby('mobile_wt')
cores_data = phone_info_df.groupby('n_cores')
height_data = phone_info_df.groupby('px_height')
width_data = phone_info_df.groupby('px_width')
ram_size_data = phone_info_df.groupby('ram')
sc_data = phone_info_df.groupby('sc_h')
time_data = phone_info_df.groupby('talk_time')
three_data = phone_info_df.groupby('three_g')

---
## Feature Engineering
You must now begin the process of extracting more information from existing data. You are not adding any new data here, but you are actually making the data you already have more useful.

---

The steps you should consider in this stage include:

- Developing new features apart from those already generated

- Selecting a set of features to remove

- Creating features using existing data through mathematical operations 

- Applying feature scaling

- Applying label encoding

- Understanding correlation between features and target
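
A few of these steps might look like the sketch below. The rows, the derived `px_area` feature, and the text-valued `wifi` column are assumptions made for illustration; the real dataset may already store such flags as 0/1.

```python
import pandas as pd

# Hypothetical rows for illustration only.
phones = pd.DataFrame({
    "px_height": [20, 905, 1263, 1216, 1208, 1004],
    "px_width": [756, 1988, 1716, 1786, 1212, 1654],
    "ram": [2549, 2631, 2603, 2769, 1411, 1067],
    "wifi": ["yes", "no", "no", "no", "no", "no"],
    "price_range": [0, 1, 1, 1, 0, 0],
})

# New feature from existing columns: total number of screen pixels.
phones["px_area"] = phones["px_height"] * phones["px_width"]

# Label encoding: map the text labels to integers.
phones["wifi"] = phones["wifi"].map({"no": 0, "yes": 1})

# Feature scaling (min-max): rescale ram to the range [0, 1].
ram_min, ram_max = phones["ram"].min(), phones["ram"].max()
phones["ram_scaled"] = (phones["ram"] - ram_min) / (ram_max - ram_min)

# Correlation between each feature and the target.
target_corr = (
    phones.corr()["price_range"]
    .drop("price_range")
    .sort_values(ascending=False)
)
print(target_corr)

# Remove the columns that the new features replace.
reduced = phones.drop(columns=["px_height", "px_width", "ram"])
print(reduced.head())
```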




In [None]:
# Code for feature engineering

---
## Building the Model

Now that the data has been processed it is time to determine and build the model that will be used to find our predictions. 

---

Consider the following points before choosing a model:

- Create a train and test sets of data from the provided data
- The type of prediction this project requires (classification/regression)
- Determine the best features to use based on feature importance
- Define your model and modify its parameters
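
Because `price_range` takes the values 0 and 1, this is a binary classification problem. The sketch below walks through the points above on synthetic data, using a small logistic regression written in plain NumPy. Everything in it is an assumption for illustration: the data is randomly generated, the labeling rule is invented, and in practice a library such as scikit-learn (`train_test_split`, `LogisticRegression`) would be the more usual choice.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical synthetic data: 200 phones with two features.
n_phones = 200
ram = rng.uniform(256, 4000, size=n_phones)
battery_power = rng.uniform(500, 2000, size=n_phones)
X = np.column_stack([ram, battery_power])
# Invented labeling rule for the toy data: more than 2000 units of RAM means premium (1).
y = (ram > 2000).astype(int)

# 1. Train/test split: shuffle the row indices, then keep 80% for training.
indices = rng.permutation(n_phones)
split = int(0.8 * n_phones)
train_idx, test_idx = indices[:split], indices[split:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# 2. Standardize with training statistics only, to avoid leaking test data.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train_s = (X_train - mean) / std
X_test_s = (X_test - mean) / std

# 3. Define the model: logistic regression fitted by gradient descent.
weights = np.zeros(X_train_s.shape[1])
bias = 0.0
learning_rate = 0.5  # model parameter that can be tuned
for _ in range(2000):
    probs = 1.0 / (1.0 + np.exp(-(X_train_s @ weights + bias)))
    error = probs - y_train
    weights -= learning_rate * (X_train_s.T @ error) / len(y_train)
    bias -= learning_rate * error.mean()

# 4. Predict on the held-out test set.
test_probs = 1.0 / (1.0 + np.exp(-(X_test_s @ weights + bias)))
y_pred = (test_probs >= 0.5).astype(int)
accuracy = (y_pred == y_test).mean()
print("Test accuracy:", accuracy)

# On standardized features, a larger absolute weight suggests more influence.
print("Weights (ram, battery_power):", weights)
```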


In [None]:
# Code to build Model

---

## Accuracy Metrics
With the model finally completed, it is time to understand the model's performance.

---

Consider the following points when evaluating your model:

- Import the modules that will allow you to estimate different accuracy metrics
- Determine the number of positive and negative predictions.
- Make an assessment of what our results tell us and draw conclusions based on your findings 
- Provide and display your results using appropriate variables
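
As a minimal sketch of these points, the counts and metrics for a binary classifier can be computed directly with NumPy. The labels and predictions below are made up for illustration; scikit-learn's `sklearn.metrics` module (`accuracy_score`, `confusion_matrix`, `classification_report`) provides the same quantities.

```python
import numpy as np

# Hypothetical true labels and model predictions (1 = premium, 0 = budget).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 1])

# Number of positive and negative predictions.
n_positive = int(np.sum(y_pred == 1))
n_negative = int(np.sum(y_pred == 0))

# Confusion-matrix counts.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the predicted premium phones, how many were premium
recall = tp / (tp + fn)     # of the premium phones, how many were found

print(f"Predicted positive: {n_positive}, predicted negative: {n_negative}")
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"Accuracy={accuracy:.2f} Precision={precision:.2f} Recall={recall:.2f}")
```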

In [None]:
# Code to find accuracy metrics