## Exercise W6.4 (programming)

We consider the wine data set again. For this exercise we have provided a training set and a test set with filenames:

wine_X_train.txt, wine_t_train.txt and wine_X_test.txt, wine_t_test.txt.

We consider all three classes, where Barolo is class $1$ or $(1, 0, 0)$, Grignolino is class $2$ or $(0, 1, 0)$, and Barbera is class $3$ or $(0, 0, 1)$. Start by loading in the dataset (using np.loadtxt).
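
The correspondence between an integer class label and its one-hot target vector can be sketched in NumPy. The `labels` array below is made up for illustration and is not taken from the provided files.

```python
import numpy as np

# Hypothetical integer labels: 1 = Barolo, 2 = Grignolino, 3 = Barbera
labels = np.array([1, 3, 2, 1])

# Row k-1 of the 3x3 identity matrix is the one-hot vector for class k
one_hot = np.eye(3)[labels - 1]
print(one_hot)

# argmax along each row recovers the integer labels again
recovered = one_hot.argmax(axis=1) + 1
print(recovered)
```

This is the encoding that the target files are assumed to use, with one row per sample and one column per class.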

        
First, we will implement multiclass logistic regression from section 4.3.4 in Bishop using TensorFlow. We will use the identity basis function

$\phi(x) = x$

and explicitly add a bias term. This means that we can write the activations from equation (4.105) as 

$a_k = w_k^T x + b_k$.

It is useful to implement this equation as
$a = xW + b$ (W4.5), where $a = (a_1, \ldots, a_K)$, $W = (w_1, \ldots, w_K)$ and $b = (b_1, \ldots, b_K)^T$.
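
As a rough illustration of the matrix form, the NumPy sketch below stacks several inputs as rows of a matrix and checks one entry against the per-class formula. The sizes and the random values are assumptions chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 4, 2, 3            # assumed sizes: 4 samples, 2 features, 3 classes

X = rng.normal(size=(N, D))  # one input x per row
W = rng.normal(size=(D, K))  # column k holds the weight vector w_k
b = rng.normal(size=K)       # one bias per class, broadcast over the rows

A = X @ W + b                # activations for the whole batch, shape (N, K)

# Entry (n, k) equals w_k^T x_n + b_k, as in the per-class formula
n, k = 1, 2
assert np.isclose(A[n, k], W[:, k] @ X[n] + b[k])
```

Writing the model this way means one matrix product handles all samples and all classes at once.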


In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

**(a)** 
Start by implementing the TensorFlow graph for the model, i.e. placeholders for the input x and the target t, the weights W and biases b, the activations and the class probabilities. Initialize the weights from a Gaussian.

In [4]:
X_train = pd.read_csv("wine/wine_X_train.txt", names=["alc","acid"],delimiter=" ")
X_test = pd.read_csv("wine/wine_X_test.txt", names=["alc","acid"], delimiter=" ")
t_train = pd.read_csv("wine/wine_t_train.txt",names=["Barolo","Grignolino","Barbera"] ,delimiter=" ")
t_test = pd.read_csv("wine/wine_t_test.txt",names=["Barolo","Grignolino","Barbera"], delimiter=" ")

In [5]:
# Placeholders for the inputs and targets (training and test)
x = tf.placeholder(tf.float32, [None, 2])
t = tf.placeholder(tf.float32, [None, 3])
xt = tf.placeholder(tf.float32,[None, 2])
tt = tf.placeholder(tf.float32,[None, 3])

In [6]:
# Define the model parameters
W = tf.get_variable("W", [X_train.shape[1],t_train.shape[1]], initializer=tf.random_normal_initializer)
b = tf.get_variable("b", [t_train.shape[1]], initializer=tf.random_normal_initializer)

**(b)**
Implement the loss function using tf.nn.softmax_cross_entropy_with_logits_v2.

In [7]:
# Model
y = tf.matmul(x,W) + b
yt = tf.nn.softmax(tf.matmul(xt,W) + b)

# Define the loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=t, logits=y))
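
To make explicit what this loss is assumed to compute, here is a minimal NumPy sketch of the mean softmax cross-entropy. The helper name `softmax_cross_entropy` and the example values are hypothetical. This is not the TensorFlow implementation, only the same formula written out.

```python
import numpy as np

def softmax_cross_entropy(logits, targets):
    """Mean cross-entropy between one-hot targets and softmax(logits)."""
    # Subtract the row-wise maximum so that exp() cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Only the log-probability of the true class contributes in each row
    return -(targets * log_softmax).sum(axis=1).mean()

# Made-up logits and one-hot targets for two samples
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
loss_value = softmax_cross_entropy(logits, targets)
print(loss_value)
```

Working on the logits directly, rather than taking the logarithm of already computed probabilities, avoids overflow in the exponential and taking the logarithm of zero.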


In [8]:
prediction = tf.argmax(y, 1)
prediction_test = tf.argmax(yt, 1)

accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, tf.argmax(t, 1)), tf.float32))
accuracy_test = tf.reduce_mean(tf.cast(tf.equal(prediction_test, tf.argmax(tt, 1)), tf.float32))
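
The accuracy computation can be illustrated with a small NumPy sketch on made-up logits and targets. It also checks that applying the softmax does not change the predicted class, which is why taking the argmax of the logits (training graph) and of the probabilities (test graph) should give equivalent predictions.

```python
import numpy as np

# Made-up logits and one-hot targets for three samples
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 0.3],
                   [1.0, 3.0, 2.0]])
targets = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 1, 0]])

# Softmax is monotonic within each row, so it preserves the argmax
e = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = e / e.sum(axis=1, keepdims=True)
assert (probs.argmax(axis=1) == logits.argmax(axis=1)).all()

predicted = logits.argmax(axis=1)              # predicted class index per row
actual = targets.argmax(axis=1)                # true class index per row
accuracy_value = (predicted == actual).mean()  # fraction of correct rows
print(accuracy_value)
```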

**(c)** 
Train the model using batch gradient descent (GradientDescentOptimizer) with a
learning rate of 0.0001 for 50000 epochs.
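
For intuition, the update that batch gradient descent performs on this model can be sketched in plain NumPy. Everything below is illustrative: the tiny data set is made up, and the learning rate and epoch count are chosen so the toy problem converges quickly, not to match the exercise settings.

```python
import numpy as np

def softmax(a):
    a = a - a.max(axis=1, keepdims=True)   # stabilise exp()
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def mean_cross_entropy(X, T, W, b):
    return -(T * np.log(softmax(X @ W + b))).sum(axis=1).mean()

# Tiny made-up data set: 2 features, 3 one-hot classes
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
T = np.eye(3)

W = np.zeros((2, 3))
b = np.zeros(3)
eta = 0.1          # illustrative learning rate (the exercise uses 0.0001)

initial_loss = mean_cross_entropy(X, T, W, b)
for epoch in range(200):
    Y = softmax(X @ W + b)
    # Gradient of the mean cross-entropy; every step uses the full training set
    grad_W = X.T @ (Y - T) / len(X)
    grad_b = (Y - T).mean(axis=0)
    W -= eta * grad_W
    b -= eta * grad_b
final_loss = mean_cross_entropy(X, T, W, b)
print(initial_loss, final_loss)
```

"Batch" here means that each update is computed from all training samples, which matches feeding the whole training set in every call to the optimizer.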

In [9]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.0001).minimize(loss)

In [10]:
init = tf.global_variables_initializer()

# Start a new session
with tf.Session() as session:
    # Initialize the values
    session.run(init)
    
    # Training cycle
    for epoch in range(50000):
        _, loss_value = session.run([optimizer, loss], feed_dict={x: X_train, t: t_train.values})
        
        if epoch % 10000 == 0 or epoch == 49999:
            if epoch > 1:
                print("Epoch: {}  loss = {:.6f}  diff = {:.9f}".format(epoch,loss_value,prev-loss_value))
            else:
                print("Epoch: {}  loss = {:.6f}".format(epoch,loss_value))
        prev = loss_value            
            
    print("Optimization done")

    accuracy_value = session.run(accuracy, feed_dict={x: X_train, t: t_train})
    print("Accuracy on train set:", accuracy_value)
    accuracy_value = session.run(accuracy_test, feed_dict={xt: X_test, tt: t_test})
    print("Accuracy on test set:", accuracy_value)



Epoch: 0  loss = 123.320671
Epoch: 10000  loss = 1.012405  diff = 0.000032306
Epoch: 20000  loss = 0.899174  diff = 0.000003338
Epoch: 30000  loss = 0.883531  diff = 0.000001192
Epoch: 40000  loss = 0.879841  diff = 0.000000060
Epoch: 49999  loss = 0.878471  diff = 0.000000000
Optimization done
Accuracy on train set: 0.5422535
Accuracy on test set: 0.3611111


Now we will implement a two-layer neural network with 5 hidden nodes and the rectifier (ReLU) activation function for the hidden layer, following the same steps as above.
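
Before building the graph, it may help to see the forward pass of such a network written out in NumPy. This is a sketch under assumed random weights and made-up inputs. The function name `forward` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes from the exercise: 2 inputs, 5 hidden units, 3 classes
W1, b1 = rng.normal(size=(2, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 3)), rng.normal(size=3)

def forward(X):
    """Return class probabilities for a batch of inputs X of shape (N, 2)."""
    z1 = np.maximum(0.0, X @ W1 + b1)        # hidden layer with rectifier (ReLU)
    a = z1 @ W2 + b2                         # output activations (logits)
    a = a - a.max(axis=1, keepdims=True)     # stabilise exp()
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)  # softmax over the 3 classes

X = rng.normal(size=(4, 2))                  # made-up inputs
probs = forward(X)
print(probs)
```

The only change relative to the logistic regression model is the extra hidden layer between the input and the output activations.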

In [11]:
# Two-layer neural network
# Input and output
x = tf.placeholder(tf.float32, [None, 2])
t = tf.placeholder(tf.float32, [None, 3])
xt = tf.placeholder(tf.float32,[None, 2])
tt = tf.placeholder(tf.float32,[None, 3])

# Define the model parameters
W1 = tf.get_variable("W1", [2, 5], initializer=tf.random_normal_initializer)
b1 = tf.get_variable("b1", [5], initializer=tf.random_normal_initializer)
W2 = tf.get_variable("W2", [5, 3], initializer=tf.random_normal_initializer)
b2 = tf.get_variable("b2", [3], initializer=tf.random_normal_initializer)


# Construct model
z1 = tf.nn.relu(tf.matmul(x, W1) + b1)
y =  tf.matmul(z1, W2) + b2

z1t = tf.nn.relu(tf.matmul(xt, W1) + b1)
yt =  tf.nn.softmax(tf.matmul(z1t, W2) + b2)


# Variables for prediction and accuracy
prediction = tf.argmax(y, 1)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, tf.argmax(t, 1)), tf.float32))

predictiont = tf.argmax(yt, 1)
accuracyt = tf.reduce_mean(tf.cast(tf.equal(predictiont, tf.argmax(tt, 1)), tf.float32))


# Define the loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=t, logits=y))

In [12]:
# Define the optimizer operation
optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.0001).minimize(loss)

In [13]:
# Make an operation that initializes the variables
init = tf.global_variables_initializer()

y_value_list = []

# Start a new session
with tf.Session() as session:
    # Initialize the values
    session.run(init)
    
    # Training cycle
    for epoch in range(50000):
        _, loss_value = session.run([optimizer, loss], feed_dict={x: X_train, t: t_train})
        
        if epoch % 10000 == 0 or epoch == 49999:
            if epoch > 1:
                print("Epoch: {}  loss = {:.6f}  diff = {:.9f}".format(epoch,loss_value,prev-loss_value))
            else:
                print("Epoch: {}  loss = {:.6f}".format(epoch,loss_value))
        prev = loss_value            
            
    print("Optimization done")

    # Evaluate the accuracy on the training set
    accuracy_value = session.run(accuracy, feed_dict={x: X_train, t: t_train})
    print("Accuracy on train set:", accuracy_value)
    # Evaluate the accuracy on the test set
    accuracy_value = session.run(accuracyt, feed_dict={xt: X_test, tt: t_test})
    print("Accuracy on test set:", accuracy_value)

Epoch: 0  loss = 126.112068
Epoch: 10000  loss = 1.018538  diff = 0.000004292
Epoch: 20000  loss = 0.961200  diff = 0.000004828
Epoch: 30000  loss = 0.923840  diff = 0.000002563
Epoch: 40000  loss = 0.900589  diff = 0.000001848
Epoch: 49999  loss = 0.885475  diff = 0.000001490
Optimization done
Accuracy on train set: 0.59859157
Accuracy on test set: 0.44444445
