# MPG Cars

### Introduction:

This exercise uses the Auto MPG dataset from the [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Auto+MPG).

### Step 1. Import the necessary libraries

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Step 2. Import the datasets [cars1](https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/05_Merge/Auto_MPG/cars1.csv) and [cars2](https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/05_Merge/Auto_MPG/cars2.csv)

### Step 3. Assign each to a variable called cars1 and cars2

In [2]:
# Read both files (assumes local copies of the CSVs linked in Step 2)
cars1 = pd.read_csv('cars1.csv')
cars2 = pd.read_csv('cars2.csv')

### Step 4. Oops, it seems our first dataset has some unnamed blank columns. Fix cars1

In [3]:
# Drop every column whose name starts with "Unnamed"
cars1.drop(columns = cars1.columns[cars1.columns.str.contains('^Unnamed')], inplace = True)
cars1

,mpg,cylinders,displacement,horsepower,weight,acceleration,model,origin,car
0,18.0,8,307,130,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350,165,3693,11.5,70,1,buick skylark 320
2,18.0,8,318,150,3436,11.0,70,1,plymouth satellite
3,16.0,8,304,150,3433,12.0,70,1,amc rebel sst
4,17.0,8,302,140,3449,10.5,70,1,ford torino
...,...,...,...,...,...,...,...,...,...
193,24.0,6,200,81,3012,17.6,76,1,ford maverick
194,22.5,6,232,90,3085,17.6,76,1,amc hornet
195,29.0,4,85,52,2035,22.2,76,1,chevrolet chevette
196,24.5,4,98,60,2164,22.1,76,1,chevrolet woody
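As a small illustration of where such columns can come from, the sketch below parses a made-up CSV string whose rows end in trailing commas. The names `csv_text`, `demo`, and `cleaned` are hypothetical and not part of the exercise data; the idea is only to show the column filter on a frame small enough to inspect by eye.

```python
import io

import pandas as pd

# Hypothetical CSV text: the trailing commas create two empty, unnamed columns
csv_text = "mpg,cylinders,,\n18.0,8,,\n15.0,8,,\n"
demo = pd.read_csv(io.StringIO(csv_text))
print(list(demo.columns))

# Keep only the columns whose name does not start with "Unnamed"
cleaned = demo.loc[:, ~demo.columns.str.startswith("Unnamed")]
print(list(cleaned.columns))
```

Selecting with a boolean mask returns a new frame instead of modifying `demo` in place, which can be easier to reason about than `inplace = True`.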


### Step 5. What is the number of observations in each dataset?

In [4]:
cars1.describe()

,mpg,cylinders,displacement,weight,acceleration,model,origin
count,198.0,198.0,198.0,198.0,198.0,198.0,198.0
mean,19.719697,5.89899,223.469697,3177.888889,15.005556,72.818182,1.439394
std,5.814254,1.785417,115.181017,934.783733,2.872382,1.865332,0.708085
min,9.0,3.0,68.0,1613.0,8.0,70.0,1.0
25%,15.0,4.0,113.25,2302.5,13.0,71.0,1.0
50%,19.0,6.0,228.0,3030.0,15.0,73.0,1.0
75%,24.375,8.0,318.0,4080.75,16.8,74.0,2.0
max,35.0,8.0,455.0,5140.0,23.5,76.0,3.0


In [5]:
cars1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 198 entries, 0 to 197
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   mpg           198 non-null    float64
 1   cylinders     198 non-null    int64  
 2   displacement  198 non-null    int64  
 3   horsepower    198 non-null    object 
 4   weight        198 non-null    int64  
 5   acceleration  198 non-null    float64
 6   model         198 non-null    int64  
 7   origin        198 non-null    int64  
 8   car           198 non-null    object 
dtypes: float64(2), int64(5), object(2)
memory usage: 14.0+ KB
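The cells above report the row count only for cars1 (198 entries). One direct way to get the number of observations for both frames is the `shape` attribute. The sketch below uses two tiny stand-in frames, `cars1_demo` and `cars2_demo`, which are hypothetical and exist only so the snippet runs on its own; in the notebook the same loop would be applied to `cars1` and `cars2`.

```python
import pandas as pd

# Hypothetical stand-ins for cars1 and cars2 so the snippet is self-contained
cars1_demo = pd.DataFrame({"mpg": [18.0, 15.0, 18.0], "cylinders": [8, 8, 8]})
cars2_demo = pd.DataFrame({"mpg": [33.0, 20.0], "cylinders": [4, 6]})

# shape is a (rows, columns) tuple; the first element is the number of observations
for name, frame in [("cars1", cars1_demo), ("cars2", cars2_demo)]:
    n_rows, n_cols = frame.shape
    print(f"{name}: {n_rows} observations, {n_cols} columns")
```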


### Step 6. Join cars1 and cars2 into a single DataFrame called cars

In [6]:
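# Stack cars2 below cars1; each frame keeps its own index labels
cars = pd.concat([cars1, cars2])
# Display cars1 for reference; the combined frame is stored in cars
cars1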

,mpg,cylinders,displacement,horsepower,weight,acceleration,model,origin,car
0,18.0,8,307,130,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350,165,3693,11.5,70,1,buick skylark 320
2,18.0,8,318,150,3436,11.0,70,1,plymouth satellite
3,16.0,8,304,150,3433,12.0,70,1,amc rebel sst
4,17.0,8,302,140,3449,10.5,70,1,ford torino
...,...,...,...,...,...,...,...,...,...
193,24.0,6,200,81,3012,17.6,76,1,ford maverick
194,22.5,6,232,90,3085,17.6,76,1,amc hornet
195,29.0,4,85,52,2035,22.2,76,1,chevrolet chevette
196,24.5,4,98,60,2164,22.1,76,1,chevrolet woody
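By default `pd.concat` keeps the index labels of each input, so the combined frame can contain repeated labels. The sketch below, which uses two hypothetical frames `top` and `bottom`, shows the difference that `ignore_index=True` makes; whether a fresh index is wanted in this exercise is a judgment call.

```python
import pandas as pd

# Hypothetical frames standing in for cars1 and cars2
top = pd.DataFrame({"mpg": [18.0, 15.0]})
bottom = pd.DataFrame({"mpg": [33.0, 20.0, 27.0]})

# Default behaviour: the original labels are carried over, so 0 and 1 repeat
kept = pd.concat([top, bottom])
print(kept.index.tolist())

# ignore_index=True discards the old labels and numbers the rows 0..n-1
renumbered = pd.concat([top, bottom], ignore_index=True)
print(renumbered.index.tolist())
```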


### Step 7. Oops, there is a column missing, called owners. Create a Series of random integers from 15,000 to 73,000.

In [52]:
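import random
# Draw random integers from 15,000 to 73,000 (both ends inclusive).
# range(1, n) yields n - 1 values, so the Series is one entry shorter than cars.
datar = pd.Series([random.randint(15000, 73000) for _ in range(1, len(cars))])
datar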

0      65050
1      26723
2      25101
3      49649
4      34825
       ...  
392    49723
393    18272
394    55645
395    30574
396    21472
Length: 397, dtype: int64
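The Series above has 397 entries. An alternative that produces exactly one value per row is NumPy's random generator, sketched below. Here `n_rows` is a hypothetical placeholder (in the notebook it would be `len(cars)`), and the seed is arbitrary, chosen only to make the draw repeatable.

```python
import numpy as np
import pandas as pd

n_rows = 5  # hypothetical; in the notebook this would be len(cars)

# Seeded generator so repeated runs give the same numbers (seed is arbitrary)
rng = np.random.default_rng(seed=0)

# endpoint=True makes the upper bound inclusive, matching random.randint
owners = pd.Series(
    rng.integers(15000, 73000, size=n_rows, endpoint=True),
    name="owners",
)
print(owners)
```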

### Step 8. Add the column owners to cars

In [54]:
# Column assignment aligns the Series with cars on index labels
cars['owners'] = datar
cars

,mpg,cylinders,displacement,horsepower,weight,acceleration,model,origin,car,owners
0,18.0,8,307,130,3504,12.0,70,1,chevrolet chevelle malibu,65050
1,15.0,8,350,165,3693,11.5,70,1,buick skylark 320,26723
2,18.0,8,318,150,3436,11.0,70,1,plymouth satellite,25101
3,16.0,8,304,150,3433,12.0,70,1,amc rebel sst,49649
4,17.0,8,302,140,3449,10.5,70,1,ford torino,34825
...,...,...,...,...,...,...,...,...,...,...
195,27.0,4,140,86,2790,15.6,82,1,ford mustang gl,52951
196,44.0,4,97,52,2130,24.6,82,2,vw pickup,26892
197,32.0,4,135,84,2295,11.6,82,1,dodge rampage,17042
198,28.0,4,120,79,2625,18.6,82,1,ford ranger,54984
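Assigning a Series to a DataFrame column matches values by index label, not by position. When the frame has repeated labels, as a concatenated frame can, rows that share a label receive the same value. The sketch below contrasts label-based and positional assignment on hypothetical data; `frame`, `values`, `aligned`, and `positional` are illustrative names only.

```python
import pandas as pd

# Hypothetical frame with repeated index labels (0, 1, 0, 1)
frame = pd.concat([
    pd.DataFrame({"car": ["a", "b"]}),
    pd.DataFrame({"car": ["c", "d"]}),
])
values = pd.Series([10, 20, 30, 40])

# Label-based: each row looks up its own label in values, so 0 and 1 are reused
aligned = frame.copy()
aligned["owners"] = values
print(aligned["owners"].tolist())

# Positional: a plain array carries no labels, so values go in row order
positional = frame.copy()
positional["owners"] = values.to_numpy()
print(positional["owners"].tolist())
```

Positional assignment requires the array length to equal the number of rows, which also guards against an off-by-one in the generated values.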
