### Build a neural network model with Keras for a real-world regression task

You are provided with the classic Auto MPG dataset and asked to build a model that predicts the fuel efficiency (mpg: miles per gallon) of late-1970s and early-1980s automobiles.

Attribute information for this dataset:
1. mpg: continuous (your target variable)
2. cylinders: multi-valued discrete
3. displacement: continuous
4. horsepower: continuous
5. weight: continuous
6. acceleration: continuous
7. model year: multi-valued discrete
8. origin: multi-valued discrete

Full dataset information is available here: http://archive.ics.uci.edu/ml/datasets/Auto+MPG

### Load the dataset

In [2]:
import pandas as pd

dataset = pd.read_csv("auto_mpg.csv")
print("dataset length:", len(dataset))
dataset.tail()

dataset length: 398


Unnamed: 0,MPG,Cylinders,Displacement,Horsepower,Weight,Acceleration,Model Year,Origin
393,27.0,4,140.0,86.0,2790.0,15.6,82,1
394,44.0,4,97.0,52.0,2130.0,24.6,82,2
395,32.0,4,135.0,84.0,2295.0,11.6,82,1
396,28.0,4,120.0,79.0,2625.0,18.6,82,1
397,31.0,4,119.0,82.0,2720.0,19.4,82,1


### Clean the dataset

In [3]:
# check whether there are any missing values in the dataset
dataset.isna().sum()

MPG             0
Cylinders       0
Displacement    0
Horsepower      6
Weight          0
Acceleration    0
Model Year      0
Origin          0
dtype: int64

In [4]:
# deal with missing values by dropping the rows that contain them
dataset = dataset.dropna()
print("dataset length:", len(dataset))

dataset length: 392


In [5]:
# transform the categorical (nominal) Origin variable into dummy variables
dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})
dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='')
dataset.tail()

Unnamed: 0,MPG,Cylinders,Displacement,Horsepower,Weight,Acceleration,Model Year,Europe,Japan,USA
393,27.0,4,140.0,86.0,2790.0,15.6,82,False,False,True
394,44.0,4,97.0,52.0,2130.0,24.6,82,True,False,False
395,32.0,4,135.0,84.0,2295.0,11.6,82,False,False,True
396,28.0,4,120.0,79.0,2625.0,18.6,82,False,False,True
397,31.0,4,119.0,82.0,2720.0,19.4,82,False,False,True


## Now it is your turn to practice what we have learned before
Follow the instructions below and fill in each cell with your own code.

### Split the data into train and test
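Randomly sample the dataset, using 80% for training and 20% for testing. Separate the target (MPG) from the features so that X_train, X_test, y_train and y_test are available to the cells below.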
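
One possible way to do the split is sketched below. It uses a small made-up DataFrame (`toy`) in place of the real `dataset`, and the `random_state` value is an arbitrary choice. `DataFrame.sample` draws the training rows, and `drop` keeps the remaining rows for testing.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned dataset (10 rows, 2 features + target).
toy = pd.DataFrame({
    "MPG": [18.0, 15.0, 36.0, 26.0, 31.0, 24.0, 22.0, 44.0, 32.0, 28.0],
    "Weight": [3504.0, 3693.0, 1800.0, 2265.0, 1773.0,
               2372.0, 2833.0, 2130.0, 2295.0, 2625.0],
    "Horsepower": [130.0, 165.0, 67.0, 72.0, 70.0, 95.0, 88.0, 52.0, 84.0, 79.0],
})

# Randomly draw 80% of the rows for training; the seed is an arbitrary choice.
train = toy.sample(frac=0.8, random_state=0)
# The rows that were not sampled form the test set.
test = toy.drop(train.index)

# Separate the target column from the features.
X_train, y_train = train.drop(columns="MPG"), train["MPG"]
X_test, y_test = test.drop(columns="MPG"), test["MPG"]

print(X_train.shape, X_test.shape)
```

`sklearn.model_selection.train_test_split` with `test_size=0.2` is a common alternative that returns the four pieces in one call.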

### Normalize the variables that use different scales and ranges (optional)
One reason normalization matters is that the inputs are multiplied by the model weights, so the scale of the inputs affects both the scale of the outputs and the scale of the gradients.

Although a model might converge without it, normalization makes training much more stable.

In [6]:
from sklearn.preprocessing import MinMaxScaler

# fit scaler on training data
norm = MinMaxScaler().fit(X_train)

# transform training data
X_train_norm = norm.transform(X_train)

# transform testing data
X_test_norm = norm.transform(X_test)
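
To make the scaling step concrete, the sketch below reproduces with NumPy what min-max scaling computes under its default settings: each column is mapped with (x - min) / (max - min), where min and max come from the training data only. The array values and variable names are made up for illustration.

```python
import numpy as np

# Made-up feature matrices; the columns could be, e.g., weight and acceleration.
X_train_toy = np.array([[2000.0, 10.0],
                        [3000.0, 15.0],
                        [4000.0, 20.0]])
X_test_toy = np.array([[2500.0, 22.0]])

# Column-wise statistics are computed on the training data only.
col_min = X_train_toy.min(axis=0)
col_max = X_train_toy.max(axis=0)

# The same training statistics are applied to both splits.
X_train_scaled = (X_train_toy - col_min) / (col_max - col_min)
X_test_scaled = (X_test_toy - col_min) / (col_max - col_min)

print(X_train_scaled)
print(X_test_scaled)  # a test value can fall outside [0, 1]
```

Because the statistics come from the training data alone, a test value beyond the training range maps outside [0, 1], as the second column of the test row shows.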

### Build your own neural networks model
Hints for the coding steps:
1. Set a random seed (optional).
2. Build a sequential model (start with two hidden dense layers of 64 neurons each).
3. Compile the model with a suitable loss function and optimizer.
4. Fit the model on the normalized training data (try 200 epochs first, hold out 10% for validation, and keep the returned history so you can investigate the training process and improve it).
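
A minimal sketch of these steps follows, as one possible answer rather than the intended one. It assumes TensorFlow's bundled Keras and trains on synthetic data (`X_demo`, `y_demo`) standing in for the normalized training set. The seed, the ReLU activations, the Adam optimizer and the mean-squared-error loss are illustrative choices, not requirements of the exercise.

```python
import numpy as np
from tensorflow import keras

# Fix the random seed so that runs are repeatable (the value is arbitrary).
keras.utils.set_random_seed(42)

# Synthetic stand-in for the normalized training data: 100 rows, 9 features.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 9)).astype("float32")
y_demo = (X_demo.sum(axis=1) + rng.normal(0.0, 0.1, size=100)).astype("float32")

# Sequential model: two hidden dense layers of 64 units and one linear output.
model = keras.Sequential([
    keras.Input(shape=(X_demo.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])

# Mean squared error is a common regression loss; Adam is a common default.
model.compile(loss="mean_squared_error", optimizer="adam")

# Hold out 10% of the rows for validation and keep the returned history.
EPOCHS = 20  # kept small for this demo; the hint suggests trying 200
history = model.fit(X_demo, y_demo, epochs=EPOCHS,
                    validation_split=0.1, verbose=0)

print(sorted(history.history.keys()))
```

The `history.history` dictionary holds one list per tracked quantity, with one value per epoch, which is what the loss plot in the next step needs.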

### Plot the loss 
Because no other evaluation metrics were specified, the history contains only the loss values. To add other evaluation metrics in *model.compile()*, see: https://keras.io/api/metrics/
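
The plot can be sketched with matplotlib as shown below. The `history_dict` values are invented for illustration; in the exercise they would come from the `history.history` attribute of the object returned by `model.fit()`.

```python
import matplotlib.pyplot as plt

# Hypothetical loss values; in the exercise these come from history.history.
history_dict = {
    "loss": [120.0, 60.0, 30.0, 18.0, 12.0],
    "val_loss": [130.0, 70.0, 38.0, 25.0, 20.0],
}

fig, ax = plt.subplots()
# One curve per recorded series, plotted against the epoch number.
for name, values in history_dict.items():
    ax.plot(range(1, len(values) + 1), values, label=name)
ax.set_xlabel("Epoch")
ax.set_ylabel("Loss")
ax.legend()
```

A validation curve that flattens or rises while the training curve keeps falling is a typical sign of overfitting. Outside a notebook, a call to `plt.show()` may be needed to display the figure.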

### Evaluate model with testing data
Hint: feed the model the normalized testing data

### Make predictions with testing data
Hint: use model.predict() to get test predictions
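
The evaluation and prediction steps can be sketched together as follows. The example is self-contained, so it first trains a deliberately small model on synthetic data. All `*_demo` names and the model settings are placeholders, not the exercise's actual data or architecture.

```python
import numpy as np
from tensorflow import keras

keras.utils.set_random_seed(0)  # arbitrary seed for repeatability

# Synthetic stand-ins for the normalized features and the targets.
rng = np.random.default_rng(0)
X_train_demo, X_test_demo = rng.random((80, 9)), rng.random((20, 9))
y_train_demo, y_test_demo = X_train_demo.sum(axis=1), X_test_demo.sum(axis=1)

# A small model, trained briefly just so there is something to evaluate.
model = keras.Sequential([
    keras.Input(shape=(9,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="adam")
model.fit(X_train_demo, y_train_demo, epochs=10, verbose=0)

# With no extra metrics compiled, evaluate() returns the loss as one number.
test_loss = model.evaluate(X_test_demo, y_test_demo, verbose=0)

# predict() returns an array of shape (n_samples, 1); flatten() makes it 1-D.
test_predictions = model.predict(X_test_demo, verbose=0).flatten()

print("test loss:", test_loss)
print("predictions shape:", test_predictions.shape)
```

Flattening the predictions gives them the same one-dimensional shape as the targets, which simplifies comparing the two.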

If you are interested in measuring the model with R squared (a goodness-of-fit statistic for regression models), you can try the following code. The best possible R squared score is 1.0, so the closer the score is to 1, the better the model fits the data.

In [None]:
from sklearn.metrics import r2_score

r_squared = r2_score(y_test, test_predictions.flatten())
print("r_squared: ", r_squared)
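
For intuition about what this score measures, the sketch below computes R squared by hand from its standard definition, 1 - SS_res / SS_tot, using made-up targets and predictions.

```python
import numpy as np

# Made-up targets and predictions for illustration.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Residual sum of squares: the error left over after the model's predictions.
ss_res = np.sum((y_true - y_pred) ** 2)
# Total sum of squares: the error of always predicting the mean of the targets.
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r_squared_manual = 1.0 - ss_res / ss_tot
print("r_squared:", round(r_squared_manual, 3))
```

A model that always predicts the mean of the targets scores 0, and a model that does worse than that scores below 0.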

You can try changing the neural network structure to see whether the model performance improves.