# TensorFlow practical, part 1: logistic regression

This code sample gives an example implementation of logistic regression for the MNIST dataset (see the comments in the code for details).
Your task is to play with the code and understand it well enough to implement and debug the next part yourself. Work through the following small tasks step by step:
* Compute and print or plot the logits for a batch of examples. To do that, add ```logits``` to the list of tensors passed to ```sess.run```.
* Play with the weight initializer. Learn how to initialize the weights with a constant or with random values of different standard deviations.
* Print the weights ```W``` during training (hint: calling ```W.eval()``` inside the session is useful).
* Play with different optimizers. You can now use momentum, Adagrad, Adam, or any other method without having to implement it yourself.
* Try the built-in functions for the loss value. Check out ```tf.nn.softmax_cross_entropy_with_logits``` and ```tf.nn.sigmoid_cross_entropy_with_logits```.
* Produce one-hot vectors for the labels using ```tf.one_hot```.
* Implement the multi-class case of logistic regression.
* Print or plot the gradient of the loss function using ```tf.gradients```.
* Print any intermediate gradient e.g. $\frac{\partial p}{\partial W}$. Does the dimensionality match your expectations? Now you can do it easily for any pair of variables in the graph.
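
For the last two tasks it helps to know which shapes to expect. ```tf.gradients(ys, xs)``` returns the gradient of the sum of ```ys``` with respect to each tensor in ```xs```, so the result has the shape of ```xs``` rather than a full Jacobian. The NumPy sketch below is only an illustration on made-up random data (the arrays stand in for one minibatch and are not part of the practical's code). It computes the analytic gradients for the model used in this notebook and checks a few entries against finite differences.

```python
import numpy as np

rng = np.random.RandomState(0)
n_batch, n_feat = 4, 784

# Hypothetical stand-ins for one minibatch and the model parameters
X = rng.rand(n_batch, n_feat)
target = rng.randint(0, 2, size=(n_batch, 1)).astype(np.float64)
W = 0.01 * rng.randn(n_feat, 1)
b = np.zeros(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_fn(weights):
    """Binary cross-entropy summed over the batch."""
    p = sigmoid(X @ weights + b)
    return -np.sum(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))


p = sigmoid(X @ W + b)

# Gradient of the summed loss w.r.t. W: X^T (p - target)
grad_loss_W = X.T @ (p - target)

# Gradient of sum(p) w.r.t. W: X^T (p * (1 - p))
grad_p_W = X.T @ (p * (1.0 - p))

# Both gradients have the shape of W
print(grad_loss_W.shape, grad_p_W.shape)

# Central finite differences on a few entries of W
eps = 1e-6
for j in (0, 100, 783):
    dW = np.zeros_like(W)
    dW[j, 0] = eps
    numeric = (loss_fn(W + dW) - loss_fn(W - dW)) / (2.0 * eps)
    assert abs(numeric - grad_loss_W[j, 0]) < 1e-4
```

If the shape you get from TensorFlow is ```[n_feat, 1]``` for both gradients, it matches this sketch: the batch dimension has been summed out.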

In [1]:
%matplotlib inline

import numpy as np
import tensorflow as tf

n_batch = 4
n_classes = 2
n_feat = 784

# this is needed to avoid clashes of names if there was a previous version of the same model
tf.reset_default_graph()

# set random seed to reproduce the results if necessary
# (the TensorFlow seed is stored in the default graph, so set it after the reset)
seed = 5
np.random.seed(seed)
tf.set_random_seed(seed)

# two placeholders for the input and the target
X = tf.placeholder(tf.float32, shape=[n_batch, n_feat])
target = tf.placeholder(tf.float32, shape=[n_batch, 1])

# declare the variables for weights and biases
W = tf.Variable(tf.random_normal([n_feat, 1]), name="W")
b = tf.Variable(tf.zeros([1]), name="b")

# matrix vector multiplication
logits = tf.matmul(X,W) + b

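# this would be useful for multi-class classification
# (tf.one_hot needs integer indices, and target has shape [n_batch, 1])
# target_one_hot = tf.one_hot(tf.cast(tf.squeeze(target, axis=1), tf.int32), n_classes)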

# non-linearities are declared in tf.nn
p = tf.nn.sigmoid(logits)

# TensorFlow does not convert types implicitly, so an explicit cast is needed each time
correct_pred = tf.equal(tf.cast(tf.round(p), "int64"), tf.cast(target, "int64"))

# reduction operations like mean and sum are done in numpy style
acc = tf.reduce_mean(tf.cast(correct_pred, "float"))

# binary cross-entropy summed over the batch; the small constant guards against log(0)
loss = -1.0 * tf.reduce_sum(target * tf.log(p + 1.0E-8) + (1.0 - target) * tf.log(1.0 - p + 1.0E-8))

# this line defines the operation for a single optimization step
# you only need to choose the method and specify the set of parameters
train_step = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)
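
The hand-written loss above adds ```1.0E-8``` inside the logarithms to avoid ```log(0)```. Functions that take logits directly avoid the problem in a different way: the TensorFlow documentation for ```sigmoid_cross_entropy_with_logits``` gives the formula ```max(x, 0) - x * z + log(1 + exp(-abs(x)))``` for logits ```x``` and labels ```z```. The NumPy sketch below compares the two on a few made-up values. The function names are hypothetical, and the code is an illustration of the formula rather than the library implementation.

```python
import numpy as np


def sigmoid_xent_from_logits(logits, labels):
    """Per-example cross-entropy computed from logits in a stable form."""
    return (np.maximum(logits, 0.0)
            - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))


def naive_xent(logits, labels, eps=1.0e-8):
    """Same quantity via probabilities, with the epsilon used in this notebook."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p + eps) + (1.0 - labels) * np.log(1.0 - p + eps))


# Made-up logits and labels; the last pair is a confident wrong prediction
logits = np.array([-2.0, 0.0, 3.0, 50.0])
labels = np.array([0.0, 1.0, 1.0, 0.0])

stable = sigmoid_xent_from_logits(logits, labels)
naive = naive_xent(logits, labels)

print(stable)
print(naive)
```

For moderate logits the two versions agree closely. For the last example the stable form grows with the logit, while the epsilon version saturates near ```-log(1e-8)```, so very wrong predictions stop contributing a larger loss.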

In [2]:
# this code reads the data from tensorflow examples datasets
from tensorflow.examples.tutorials.mnist import input_data

def load_mnist_dataset(binary, label1, label2):
    mnist = input_data.read_data_sets("Data/MNIST_data/", reshape=True, one_hot=False)
 
    X_train = mnist.train.images
    target_train = mnist.train.labels
    X_test = mnist.test.images
    target_test = mnist.test.labels

    if binary:
        # Collect the indices of the training samples labelled label1 or label2.
        index_list_train = []
        for sample_index in range(target_train.shape[0]):
            label = target_train[sample_index]
            if label == label1 or label == label2:
                index_list_train.append(sample_index)

        # Rebuild the training arrays from the selected samples.
        X_train = mnist.train.images[index_list_train]
        target_train = mnist.train.labels[index_list_train]

        # Collect the indices of the test samples labelled label1 or label2.
        index_list_test = []
        for sample_index in range(target_test.shape[0]):
            label = target_test[sample_index]
            if label == label1 or label == label2:
                index_list_test.append(sample_index)

        # Rebuild the test arrays from the selected samples.
        X_test = mnist.test.images[index_list_test]
        target_test = mnist.test.labels[index_list_test]

        # Map label1 -> 0 and label2 -> 1 in one step, so the result is also
        # correct when label1 or label2 is itself 0 or 1.
        target_train = (target_train == label2).astype(target_train.dtype)
        target_test = (target_test == label2).astype(target_test.dtype)
        
    return (X_train, X_test, target_train, target_test)

X_train, X_test, target_train, target_test = load_mnist_dataset(binary=True, label1=4, label2=9)


Extracting Data/MNIST_data/train-images-idx3-ubyte.gz
Extracting Data/MNIST_data/train-labels-idx1-ubyte.gz
Extracting Data/MNIST_data/t10k-images-idx3-ubyte.gz
Extracting Data/MNIST_data/t10k-labels-idx1-ubyte.gz

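The two index-collecting loops in ```load_mnist_dataset``` can also be written with a boolean mask. This is a minimal sketch, assuming the labels are stored in a 1-D integer array (as they are with ```one_hot=False```). The helper name ```select_two_classes``` and the toy arrays are made up for illustration.

```python
import numpy as np


def select_two_classes(images, labels, label1, label2):
    """Keep samples labelled label1 or label2 and map them to 0 and 1."""
    mask = np.isin(labels, [label1, label2])
    images_sel = images[mask]
    # Compare against the original labels, so the mapping stays correct
    # even if label1 or label2 is itself 0 or 1
    targets = (labels[mask] == label2).astype(labels.dtype)
    return images_sel, targets


# Toy data: six "images" with two features each
images = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.array([4, 1, 9, 9, 0, 4], dtype=np.uint8)

X_sel, t_sel = select_two_classes(images, labels, label1=4, label2=9)
print(X_sel.shape)  # (4, 2)
print(t_sel)        # [0 1 1 0]
```

Indexing with a boolean mask returns a copy, so the original arrays are left unchanged.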

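The training loop below walks through the data in a fixed order: the shuffled ```indices``` array is created but never applied. One possible shape for the ```iterate_minibatches``` helper mentioned in the comments is sketched here. The signature is an assumption, since the original helper is not shown in this notebook. The sketch drops the final incomplete batch because the placeholders have a fixed batch dimension.

```python
import numpy as np


def iterate_minibatches(inputs, targets, batch_size, shuffle=True, rng=None):
    """Yield (inputs, targets) minibatches of exactly batch_size samples."""
    assert inputs.shape[0] == targets.shape[0]
    n_samples = inputs.shape[0]
    indices = np.arange(n_samples)
    if shuffle:
        if rng is None:
            rng = np.random
        rng.shuffle(indices)
    # Stop early so that every batch is full
    for start in range(0, n_samples - batch_size + 1, batch_size):
        batch_idx = indices[start:start + batch_size]
        yield inputs[batch_idx], targets[batch_idx]


# Toy data: row i of X_demo is [2*i, 2*i + 1], and its target is i
X_demo = np.arange(20, dtype=np.float32).reshape(10, 2)
t_demo = np.arange(10).reshape(10, 1)

batches = list(iterate_minibatches(X_demo, t_demo, batch_size=4,
                                   rng=np.random.RandomState(5)))
print(len(batches))  # 2 full batches; the last 2 samples are dropped
```

Calling the generator once per epoch gives a fresh shuffle each time, which the fixed ```range``` over start indices does not.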
In [3]:
n_train = X_train.shape[0]
n_test = X_test.shape[0]

target_train = target_train.reshape((n_train,1))
target_test = target_test.reshape((n_test,1))

n_epoch = 20

# define a TF session
with tf.Session() as sess:
    
    # initialize the variables
    sess.run(tf.global_variables_initializer())   
    
    indices = np.arange(n_train)
    np.random.shuffle(indices)
    
    # use your implementation of the iterate_minibatches function here

    loss_train_epoch = {}
    
    for i_epoch in range(0,n_epoch):
        loss_train = 0
        acc_train = 0

        rng_minibatches = range(0, n_train - n_batch + 1, n_batch)
        
        for start_idx in rng_minibatches:
            X_train_batch = X_train[start_idx:(start_idx+n_batch),:]
            target_batch = target_train[start_idx:(start_idx+n_batch),:]    
            
            # run the session
            # [loss, acc, p] is the list of tensors you would like to compute;
            # the computed values are returned in the same order as [loss_val, acc_val, p_val]
            # feed_dict supplies the input data for the placeholders in the model
            [loss_val, acc_val, p_val] = sess.run([loss, acc, p], feed_dict={X: X_train_batch, target: target_batch})
            
            # run a single step of the optimizer you specified in the model given the input
            train_step.run(feed_dict={X: X_train_batch, target: target_batch})
            
            acc_train += acc_val
            loss_train += loss_val
            
        loss_train_epoch[i_epoch] = loss_train / len(rng_minibatches)
        
        # implement computing the loss function and accuracy for the test set

        print("epoch %d, train loss: %g, train acc: %g" % (i_epoch, loss_train / len(rng_minibatches), acc_train / len(rng_minibatches)))

        # use your favourite tool to make plots

epoch 0, train loss: 1.49999, train acc: 0.887268
epoch 1, train loss: 0.686929, train acc: 0.944888
epoch 2, train loss: 0.589441, train acc: 0.952974
epoch 3, train loss: 0.537098, train acc: 0.957249
epoch 4, train loss: 0.501773, train acc: 0.961338
epoch 5, train loss: 0.475544, train acc: 0.962825
epoch 6, train loss: 0.454903, train acc: 0.96487
epoch 7, train loss: 0.438087, train acc: 0.965985
epoch 8, train loss: 0.424018, train acc: 0.966822
epoch 9, train loss: 0.412001, train acc: 0.967844
epoch 10, train loss: 0.401599, train acc: 0.96868
epoch 11, train loss: 0.392468, train acc: 0.968959
epoch 12, train loss: 0.384368, train acc: 0.969238
epoch 13, train loss: 0.377122, train acc: 0.969888
epoch 14, train loss: 0.370593, train acc: 0.970725
epoch 15, train loss: 0.364673, train acc: 0.97119
epoch 16, train loss: 0.359276, train acc: 0.972026
epoch 17, train loss: 0.35433, train acc: 0.972026
epoch 18, train loss: 0.349783, train acc: 0.972584
epoch 19, train loss: 0.345