# Deep Learning and Image Recognition

## ANN Regression - to predict wine quality

Use an ANN to make predictions on structured data, then increase the complexity of the model.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (7,7)
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical

Using TensorFlow backend.


### Read Dataset, Describe

In [2]:
!curl https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv -o ../winequality-red.csv
!curl https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv -o ../winequality-white.csv





In [3]:
white_wine = pd.read_csv('../winequality-white.csv', sep=';')
red_wine = pd.read_csv('../winequality-red.csv', sep=';')

In [4]:
# store wine type as an attribute
red_wine['wine_type'] = 0   
white_wine['wine_type'] = 1

# merge red and white wine datasets
wines = pd.concat([red_wine, white_wine])

# add a column that bins wine quality into three classes:
# 0 = low (quality <= 5), 1 = medium (quality 6-7), 2 = high (quality >= 8)
wines['quality_label'] = wines['quality'].apply(
    lambda value: 0 if value <= 5 else 1 if value <= 7 else 2)

# re-shuffle records just to randomize data points
wines = wines.sample(frac=1, random_state=42).reset_index(drop=True)

#inspect data
wines.head()

| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | wine_type | quality_label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.17 | 0.74 | 12.8 | 0.045 | 24.0 | 126.0 | 0.9942 | 3.26 | 0.38 | 12.2 | 8 | 1 | 2 |
| 1 | 7.7 | 0.64 | 0.21 | 2.2 | 0.077 | 32.0 | 133.0 | 0.9956 | 3.27 | 0.45 | 9.9 | 5 | 0 | 0 |
| 2 | 6.8 | 0.39 | 0.34 | 7.4 | 0.02 | 38.0 | 133.0 | 0.99212 | 3.18 | 0.44 | 12.0 | 7 | 1 | 1 |
| 3 | 6.3 | 0.28 | 0.47 | 11.2 | 0.04 | 61.0 | 183.0 | 0.99592 | 3.12 | 0.51 | 9.5 | 6 | 1 | 1 |
| 4 | 7.4 | 0.35 | 0.2 | 13.9 | 0.054 | 63.0 | 229.0 | 0.99888 | 3.11 | 0.5 | 8.9 | 6 | 1 | 1 |


In [28]:
#select features and label
X = wines.drop(columns=['quality','quality_label'])
y = wines['quality_label']
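
The target `quality_label` comes from the two thresholds applied when the column was created. As a minimal sketch, the same mapping can be written as a named function, which is easier to read and to check on a few scores; `quality_to_label` is a hypothetical name introduced here for illustration, not part of the original notebook.

```python
def quality_to_label(quality):
    """Map a raw quality score to a class id: 0 = low, 1 = medium, 2 = high."""
    if quality <= 5:
        return 0  # low
    if quality <= 7:
        return 1  # medium
    return 2      # high


# Check the mapping on a handful of scores on either side of each threshold.
scores = [3, 5, 6, 7, 8, 9]
labels = [quality_to_label(q) for q in scores]
print(labels)  # [0, 0, 1, 1, 2, 2]
```

With the merged frame it could be applied as `wines['quality'].apply(quality_to_label)`, which should give the same column as the lambda.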

In [29]:
X.head()

| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | wine_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.17 | 0.74 | 12.8 | 0.045 | 24.0 | 126.0 | 0.9942 | 3.26 | 0.38 | 12.2 | 1 |
| 1 | 7.7 | 0.64 | 0.21 | 2.2 | 0.077 | 32.0 | 133.0 | 0.9956 | 3.27 | 0.45 | 9.9 | 0 |
| 2 | 6.8 | 0.39 | 0.34 | 7.4 | 0.02 | 38.0 | 133.0 | 0.99212 | 3.18 | 0.44 | 12.0 | 1 |
| 3 | 6.3 | 0.28 | 0.47 | 11.2 | 0.04 | 61.0 | 183.0 | 0.99592 | 3.12 | 0.51 | 9.5 | 1 |
| 4 | 7.4 | 0.35 | 0.2 | 13.9 | 0.054 | 63.0 | 229.0 | 0.99888 | 3.11 | 0.5 | 8.9 | 1 |


In [22]:
#get number of columns in training data
n_cols = X.shape[1]
n_cols

12

## Build Models

In [30]:
#create model
model = Sequential()

#get number of columns in training data
n_cols = X.shape[1]

#add layers to model
model.add(Dense(13, activation='relu', input_shape=(n_cols,)))
model.add(Dense(13, activation='relu'))
model.add(Dense(13, activation='relu'))
model.add(Dense(3, activation='softmax'))

#compile model using accuracy to measure model performance
model.compile(optimizer='adam', 
              loss='categorical_crossentropy',
              metrics=['accuracy'])
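
A softmax output layer trained with `categorical_crossentropy` expects one-hot targets, that is, one column per class, while `quality_label` stores a single integer per row. The sketch below shows the conversion in plain NumPy; `one_hot` is a hypothetical helper that mirrors what `keras.utils.to_categorical` produces for integer labels.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows (illustrative helper)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


encoded = one_hot([2, 0, 1, 1], num_classes=3)
print(encoded.shape)  # (4, 3)
print(encoded[0])     # [0. 0. 1.]
```

An alternative that avoids the conversion is to compile with `loss='sparse_categorical_crossentropy'`, which accepts integer class ids directly.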

In [31]:
#set early stopping monitor so the model stops training when it won't improve anymore
EPOCHS = 20
early_stopping_monitor = EarlyStopping(patience=3)

# check summary
model.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_25 (Dense)             (None, 13)                169       
_________________________________________________________________
dense_26 (Dense)             (None, 13)                182       
_________________________________________________________________
dense_27 (Dense)             (None, 13)                182       
_________________________________________________________________
dense_28 (Dense)             (None, 3)                 42        
Total params: 575
Trainable params: 575
Non-trainable params: 0
_________________________________________________________________
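
The `Param #` column in the summary can be checked by hand: a Dense layer holds one weight per input-output pair plus one bias per output unit. The sketch below reproduces the counts, assuming the 12 feature columns left in `X` after dropping `quality` and `quality_label`; `dense_param_counts` is a hypothetical helper, not a Keras function.

```python
def dense_param_counts(n_inputs, layer_units):
    """Return the parameter count of each Dense layer in a stack."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        # weights (fan_in * units) plus biases (units)
        counts.append(fan_in * units + units)
        fan_in = units
    return counts


counts = dense_param_counts(12, [13, 13, 13, 3])
print(counts)       # [169, 182, 182, 42]
print(sum(counts))  # 575
```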


In [32]:
#train model; categorical_crossentropy expects one-hot targets, so encode the integer labels
history = model.fit(X, to_categorical(y, num_classes=3), epochs=EPOCHS, 
                    validation_split=0.2, 
                    callbacks=[early_stopping_monitor])

In [25]:
#plot loss curve
def plot_loss(history):
    plt.figure(figsize=[8,6])
    plt.plot(history.history['loss'],'r',linewidth=3.0)
    plt.plot(history.history['val_loss'],'b',linewidth=3.0)
    plt.legend(['Training Loss', 'Validation Loss'],fontsize=18)
    plt.xlabel('Epochs',fontsize=16)
    plt.ylabel('Loss',fontsize=16)
    plt.title('Loss Curves',fontsize=16)

In [27]:
#reuse the helper instead of repeating its body
plot_loss(history)

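The `EarlyStopping(patience=3)` callback used during training halts the run once the monitored validation loss has gone three consecutive epochs without improving on its best value. The pure-Python sketch below illustrates that rule in simplified form; `early_stopping_epoch` is a hypothetical name, and the real callback has further options (such as `min_delta` and `restore_best_weights`) that are not modelled here.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Validation loss improves until epoch 1, then fails to beat 0.8 three times.
losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]
stop = early_stopping_epoch(losses, patience=3)
print(stop)  # 4
```

Note that the final value of 0.7 is never reached: a small patience can stop training before a late improvement, which is worth keeping in mind when reading the loss curves.
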
### Regression 


## <span style="color:cornflowerblue">Exercise 1:</span>

Build a feed-forward network (FFN) to predict the quality of wine.

Increase the complexity of the model to see if you get better results with more layers.

Plot the loss curve over the epochs.
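
One possible starting point for the exercise is sketched below. It is not a reference solution: `build_regressor` is a hypothetical helper, the single linear output unit with mean squared error is an assumed way to treat quality as a regression target, and the random arrays stand in for the wine features so that the block runs on its own.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping


def build_regressor(n_cols, hidden_units):
    """Stack one ReLU Dense layer per entry in hidden_units (illustrative)."""
    model = Sequential()
    for i, units in enumerate(hidden_units):
        if i == 0:
            model.add(Dense(units, activation='relu', input_shape=(n_cols,)))
        else:
            model.add(Dense(units, activation='relu'))
    model.add(Dense(1))  # single linear output for a numeric target
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model


# Synthetic stand-in data (assumption): 200 rows, 12 features, scores in [3, 9).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 12)).astype('float32')
y_demo = rng.uniform(3, 9, size=200).astype('float32')

# Train a shallow and a deeper network and keep their histories for comparison.
histories = {}
for name, hidden_units in {'shallow': [13], 'deeper': [13, 13, 13]}.items():
    model = build_regressor(X_demo.shape[1], hidden_units)
    histories[name] = model.fit(X_demo, y_demo, epochs=5,
                                validation_split=0.2,
                                callbacks=[EarlyStopping(patience=3)],
                                verbose=0)
```

With the real data, `X` and `wines['quality']` would replace the synthetic arrays, and each entry of `histories` can be passed to `plot_loss` to compare the loss curves of the two architectures.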