# Cleaning and Analysis of Common Laptops

## Overview

This dataset contains an assortment of laptops along with their prices, screen sizes, manufacturers, CPUs, and more. It was previously cleaned and improved, but requires additional cleaning.

## Tasks
1. Convert the price_euros column to a numeric dtype.
2. Extract the screen resolution from the screen column.
3. Extract the processor speed from the cpu column.

## Questions
1. Are laptops made by Apple more expensive than those made by other manufacturers?
2. What is the best value laptop with a screen size of 15" or more?
3. Which laptop has the most storage space?

In [8]:
import pandas as pd

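data = pd.read_csv("laptops_cv1.csv", index_col=0)
# data.drop(0) returns a new DataFrame and leaves `data` unchanged unless the
# result is assigned, so the call is omitted here. Row 0 is kept, as the output shows.
data.head()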

Unnamed: 0,manufacturer,model_name,category,screen_size,screen,cpu,ram_gb,storage,gpu,operating_system,operating_system_version,weight_kg,price_euros,cpu_manufacturer,gpu_manufacturer,os_version
0,Apple,MacBook Pro,Ultrabook,"13.3""",IPS Panel Retina Display 2560x1600,Intel Core i5 2.3GHz,8,128GB SSD,Intel Iris Plus Graphics 640,macOS,,1.37,"1339,69",Intel,Intel,X
1,Apple,Macbook Air,Ultrabook,"13.3""",1440x900,Intel Core i5 1.8GHz,8,128GB Flash Storage,Intel HD Graphics 6000,macOS,,1.34,"898,94",Intel,Intel,X
2,HP,250 G6,Notebook,"15.6""",Full HD 1920x1080,Intel Core i5 7200U 2.5GHz,8,256GB SSD,Intel HD Graphics 620,No OS,,1.86,"575,00",Intel,Intel,Version Unknown
3,Apple,MacBook Pro,Ultrabook,"15.4""",IPS Panel Retina Display 2880x1800,Intel Core i7 2.7GHz,16,512GB SSD,AMD Radeon Pro 455,macOS,,1.83,"2537,45",Intel,AMD,X
4,Apple,MacBook Pro,Ultrabook,"13.3""",IPS Panel Retina Display 2560x1600,Intel Core i5 3.1GHz,8,256GB SSD,Intel Iris Plus Graphics 650,macOS,,1.37,"1803,60",Intel,Intel,X


## Price conversion

Our price_euros column is currently stored with the object dtype, because each price is a string that uses "," as the decimal separator.

While this is acceptable, a numeric format such as a float would be more appropriate.

First, we'll replace the "," with a "." so the decimals are separated in a way pandas can parse. Then we'll convert the column to a float.

In [21]:
print(data["price_euros"].describe())

count        1303
unique        791
top       1499,00
freq           14
Name: price_euros, dtype: object


In [23]:
print(data["price_euros"].head())

0    1339,69
1     898,94
2     575,00
3    2537,45
4    1803,60
Name: price_euros, dtype: object


In [27]:
data["price_euros"] = data["price_euros"].str.replace(",",".")

#### We now have a nice and clean price_euros column

In [29]:
data["price_euros"] = data["price_euros"].astype("float")
print(data["price_euros"].head())

0    1339.69
1     898.94
2     575.00
3    2537.45
4    1803.60
Name: price_euros, dtype: float64
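
The two steps above can also be written as one guarded pass. The following is a minimal sketch, assuming a small hand-made Series (`raw_prices`, a hypothetical stand-in for the real column) that includes one deliberately malformed entry. It uses `pd.to_numeric` with `errors="coerce"`, so unparseable values become NaN rather than raising an error.

```python
import pandas as pd

# Hypothetical sample mimicking the raw price_euros strings shown above,
# plus one malformed entry to show how bad values are handled.
raw_prices = pd.Series(["1339,69", "898,94", "575,00", "n/a"])

# Swap the decimal comma for a point; regex=False treats "," literally.
normalized = raw_prices.str.replace(",", ".", regex=False)

# errors="coerce" turns unparseable strings into NaN instead of raising.
prices = pd.to_numeric(normalized, errors="coerce")

print(prices)
print("unparseable rows:", prices.isna().sum())
```

`pd.read_csv` also accepts a `decimal=","` argument, which may make the replacement step unnecessary if the file parses cleanly. That option was not tried against this dataset.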


## Screen Resolution

Our screen column is very descriptive, but it contains information that is not necessarily the screen resolution. We can simply remove this extra information or extract it into its own column. Screen types and their names typically aren't very standardized. Here we keep only the last whitespace-separated token of each entry, which is the resolution.

In [30]:
data["screen"].head()

0    IPS Panel Retina Display 2560x1600
1                              1440x900
2                     Full HD 1920x1080
3    IPS Panel Retina Display 2880x1800
4    IPS Panel Retina Display 2560x1600
Name: screen, dtype: object

In [44]:
data["screen_resolution"] = data["screen"].str.rsplit(n=1).str[-1]
data["screen_resolution"].head()

0    2560x1600
1     1440x900
2    1920x1080
3    2880x1800
4    2560x1600
Name: screen_resolution, dtype: object
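
If the resolution is needed for numeric comparisons, it can be split further into width and height. The following is a sketch under assumptions: the sample values are copied from the output above, and the column names `screen_width` and `screen_height` are hypothetical. It uses a regular expression with `str.extract` rather than the `rsplit` approach, so the match does not depend on the resolution being the last token.

```python
import pandas as pd

# Sample values copied from the screen column shown above.
screen = pd.Series([
    "IPS Panel Retina Display 2560x1600",
    "1440x900",
    "Full HD 1920x1080",
])

# Capture the "<width>x<height>" pattern wherever it appears in the string.
# The named groups become the column names of the resulting DataFrame.
dims = screen.str.extract(r"(?P<screen_width>\d+)x(?P<screen_height>\d+)")

# str.extract returns strings, so convert each column to a numeric dtype.
dims = dims.apply(pd.to_numeric)

print(dims)
```

Rows with no match would come back as NaN, which makes entries without a recognizable resolution easy to spot.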

## Processor speed

In [46]:
data["Processor_Speed"] = data["cpu"].str.rsplit(n=1).str[-1]
data["Processor_Speed"].head()

0    2.3GHz
1    1.8GHz
2    2.5GHz
3    2.7GHz
4    3.1GHz
Name: Processor_Speed, dtype: object
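
The extracted speeds are still strings such as "2.3GHz", so they cannot be sorted or averaged numerically yet. One possible follow-up, sketched here with sample values copied from the cpu column above and a hypothetical variable name `processor_speed_ghz`, is to capture only the number in front of "GHz" and convert it to a float.

```python
import pandas as pd

# Sample values copied from the cpu column shown above.
cpu = pd.Series([
    "Intel Core i5 2.3GHz",
    "Intel Core i5 1.8GHz",
    "Intel Core i5 7200U 2.5GHz",
])

# Capture the number directly before "GHz"; expand=False returns a Series.
speed_text = cpu.str.extract(r"(\d+(?:\.\d+)?)GHz", expand=False)

# Convert to float; entries without a match become NaN.
processor_speed_ghz = pd.to_numeric(speed_text, errors="coerce")

print(processor_speed_ghz)
```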

In [48]:
data.head()

Unnamed: 0,manufacturer,model_name,category,screen_size,screen,cpu,ram_gb,storage,gpu,operating_system,operating_system_version,weight_kg,price_euros,cpu_manufacturer,gpu_manufacturer,os_version,screen_resolution,Processor_Speed
0,Apple,MacBook Pro,Ultrabook,"13.3""",IPS Panel Retina Display 2560x1600,Intel Core i5 2.3GHz,8,128GB SSD,Intel Iris Plus Graphics 640,macOS,,1.37,1339.69,Intel,Intel,X,2560x1600,2.3GHz
1,Apple,Macbook Air,Ultrabook,"13.3""",1440x900,Intel Core i5 1.8GHz,8,128GB Flash Storage,Intel HD Graphics 6000,macOS,,1.34,898.94,Intel,Intel,X,1440x900,1.8GHz
2,HP,250 G6,Notebook,"15.6""",Full HD 1920x1080,Intel Core i5 7200U 2.5GHz,8,256GB SSD,Intel HD Graphics 620,No OS,,1.86,575.0,Intel,Intel,Version Unknown,1920x1080,2.5GHz
3,Apple,MacBook Pro,Ultrabook,"15.4""",IPS Panel Retina Display 2880x1800,Intel Core i7 2.7GHz,16,512GB SSD,AMD Radeon Pro 455,macOS,,1.83,2537.45,Intel,AMD,X,2880x1800,2.7GHz
4,Apple,MacBook Pro,Ultrabook,"13.3""",IPS Panel Retina Display 2560x1600,Intel Core i5 3.1GHz,8,256GB SSD,Intel Iris Plus Graphics 650,macOS,,1.37,1803.6,Intel,Intel,X,2560x1600,3.1GHz
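
With price_euros now numeric, the first question from the list at the top (whether Apple laptops are more expensive than others) can be approached with a grouped summary. This is only a sketch: the `sample` frame below holds just the five rows displayed above, so its numbers are not representative of the full dataset, and the "Apple"/"Other" labels are an illustrative choice.

```python
import pandas as pd

# The five rows shown above, reduced to the two columns needed here.
sample = pd.DataFrame({
    "manufacturer": ["Apple", "Apple", "HP", "Apple", "Apple"],
    "price_euros": [1339.69, 898.94, 575.00, 2537.45, 1803.60],
})

# Label each row as Apple or Other, then compare summary statistics.
group = sample["manufacturer"].eq("Apple").map({True: "Apple", False: "Other"})
summary = sample.groupby(group)["price_euros"].agg(["count", "mean", "median"])

print(summary)
```

On the full data, the same pattern would apply to `data` directly. Comparing medians alongside means helps when a few very expensive models skew the average.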
