# Part 1 - TensorFlow Softmax

#### Question a
Implement the softmax function using TensorFlow in q1_softmax.py. Remember that

$$
\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}
$$

Note that you may not use tf.nn.softmax or related built-in functions. You can run basic (non-exhaustive) tests by running q1_softmax.py.
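Softmax is unchanged when the same constant is subtracted from every entry, which is what makes a numerically stable implementation possible. The following is a minimal plain-Python sketch of that idea, not the TensorFlow solution; `softmax_naive` and `softmax_stable` are illustrative names that do not appear in the assignment.

```python
import math

def softmax_naive(row):
    """Direct transcription of the formula; overflows for large inputs."""
    exps = [math.exp(v) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(row):
    """Subtract the row max first; the result is mathematically identical."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax_stable([1001, 1002]))  # close to [0.26894142, 0.73105858]
print(softmax_stable([3, 4]))        # same values: only differences matter

try:
    softmax_naive([1001, 1002])
except OverflowError as err:
    print("naive version fails:", err)
```

The largest exponent after the shift is always 0, so `math.exp` never sees a value that can overflow.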

In [56]:
import numpy as np
import tensorflow as tf

def softmax(x):
    """
    Compute the softmax function in tensorflow.

    You might find the tensorflow functions tf.exp, tf.reduce_max,
    tf.reduce_sum, tf.expand_dims useful. (Many solutions are possible, so you may
    not need to use all of these functions). Recall also that many common
    tensorflow operations are sugared (e.g. x * y does a tensor multiplication
    if x and y are both tensors). Make sure to implement the numerical stability
    fixes as in the previous homework!

    Args:
    x:   tf.Tensor with shape (n_samples, n_features). Note feature vectors are
         represented by row-vectors. (For simplicity, no need to handle 1-d
         input as in the previous homework)
    Returns:
    out: tf.Tensor with shape (n_sample, n_features). You need to construct this
         tensor in this problem.
    """

    ### YOUR CODE HERE
    # Row-wise max, kept as a column so that it broadcasts against x.
    # The graph is built symbolically: no x.eval() or session is needed here.
    m = tf.expand_dims(tf.reduce_max(x, reduction_indices=[1]), 1)
    x -= m  # subtract each row's max for numerical stability

    numerator = tf.exp(x)  # element-wise exponential
    # Row-wise sum, also kept as a column so the division is per row
    denominator = tf.expand_dims(tf.reduce_sum(numerator, 1), 1)
    out = numerator / denominator  # softmax
    
    ### END YOUR CODE
  
    return out 

In [57]:
sess=tf.InteractiveSession()
sess.close()

Let's check our function:

In [58]:
def test_softmax_basic():
    """
    Some simple tests to get you started. 
    Warning: these are not exhaustive.
    """
    print("Running basic tests...")
    test1 = softmax(tf.convert_to_tensor(
      np.array([[1001,1002],[3,4]]), dtype=tf.float32))
    with tf.Session():
        test1 = test1.eval()
    assert np.amax(np.fabs(test1 - np.array(
      [0.26894142,  0.73105858]))) <= 1e-6

    test2 = softmax(tf.convert_to_tensor(
      np.array([[-1001,-1002]]), dtype=tf.float32))
    with tf.Session():
        test2 = test2.eval()
    assert np.amax(np.fabs(test2 - np.array(
      [0.73105858, 0.26894142]))) <= 1e-6

    print("Basic (non-exhaustive) softmax tests pass\n")

In [59]:
test_softmax_basic()

Running basic tests...
Basic (non-exhaustive) softmax tests pass
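
Both rows in the first basic test have the same softmax, so that test cannot tell row-wise normalization apart from a normalization that mixes up the axes. The plain-Python sketch below is one possible stricter check; `softmax_rows` and `softmax_wrong_axis` are illustrative helpers, not part of the assignment. The second one imitates what happens when a length-n vector of row sums is broadcast against an (n, n) matrix without first being reshaped into a column.

```python
import math

def softmax_rows(matrix):
    """Row-wise softmax with the max-subtraction stability fix."""
    out = []
    for row in matrix:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def softmax_wrong_axis(matrix):
    """Deliberately wrong: entry (i, j) is divided by the sum of row j."""
    shifted = [[math.exp(v - max(row)) for v in row] for row in matrix]
    sums = [sum(row) for row in shifted]
    return [[row[j] / sums[j] for j in range(len(row))] for row in shifted]

same = [[1001, 1002], [3, 4]]  # both rows have spread 1
diff = [[1, 2], [3, 6]]        # spreads 1 and 3

# On the original test data the two versions agree...
for a, b in zip(softmax_rows(same), softmax_wrong_axis(same)):
    assert all(abs(p - q) < 1e-12 for p, q in zip(a, b))

# ...but only the correct version keeps every row summing to 1 here.
print([sum(r) for r in softmax_rows(diff)])
print([sum(r) for r in softmax_wrong_axis(diff)])
```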



#### Question b
Implement the cross-entropy loss using TensorFlow in q1_softmax.py. Remember that: 

$$
CE(y, \hat{y}) = - \sum_{i = 1}^{N_c} y_i \log(\hat{y}_i)
$$

where $y \in \mathbb{R}^5$ is a one-hot vector and $N_c$ is the number of classes. Note that you may not use TensorFlow's built-in cross entropy functions for this question. You can run basic (non-exhaustive) tests by running python q1_softmax.py.
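
As a quick arithmetic check of the formula, here is a plain-Python sketch for three samples with uniform predictions over two classes. The helper name `cross_entropy` is illustrative, and summing over all samples (not averaging) is an assumption about how a batch is reduced.

```python
import math

def cross_entropy(y, yhat):
    """Sum of -y_i * log(yhat_i) over every sample and class."""
    return -sum(
        t * math.log(p)
        for y_row, p_row in zip(y, yhat)
        for t, p in zip(y_row, p_row)
    )

y = [[0, 1], [1, 0], [1, 0]]
yhat = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

loss = cross_entropy(y, yhat)
print(loss)  # 3 * log(2), about 2.0794
```

Because each label is one-hot, only the predicted probability of the true class contributes to each sample's term.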

In [60]:
def cross_entropy_loss(y, yhat):
    """
    Compute the cross entropy loss in tensorflow.

    y is a one-hot tensor of shape (n_samples, n_classes) and yhat is a tensor
    of shape (n_samples, n_classes). y should be of dtype tf.int32, and yhat should
    be of dtype tf.float32.

    The functions tf.to_float, tf.reduce_sum, and tf.log might prove useful. (Many
    solutions are possible, so you may not need to use all of these functions).

    Note: You are NOT allowed to use the tensorflow built-in cross-entropy
        functions.

    Args:
    y:    tf.Tensor with shape (n_samples, n_classes). One-hot encoded.
    yhat: tf.Tensor with shape (n_samples, n_classes). Each row encodes a
          probability distribution and should sum to 1.
    Returns:
    out:  tf.Tensor with shape (1,) (Scalar output). You need to construct this
          tensor in the problem.
    """
    ### YOUR CODE HERE
    y = tf.to_float(y)
    out = -tf.reduce_sum(y * tf.log(yhat))
    return out

Let's test our cross entropy function:

In [61]:
def test_cross_entropy_loss_basic():
    """
    Some simple tests to get you started.
    Warning: these are not exhaustive.
    """
    y = np.array([[0, 1], [1, 0], [1, 0]])
    yhat = np.array([[.5, .5], [.5, .5], [.5, .5]])

    test1 = cross_entropy_loss(
      tf.convert_to_tensor(y, dtype=tf.int32),
      tf.convert_to_tensor(yhat, dtype=tf.float32))
    with tf.Session():
        test1 = test1.eval()
    result = -3 * np.log(.5)
    assert np.amax(np.fabs(test1 - result)) <= 1e-6
    print("Basic (non-exhaustive) cross-entropy tests pass\n")

In [62]:
test_cross_entropy_loss_basic()

Basic (non-exhaustive) cross-entropy tests pass



#### Question c
Carefully study the Model class in model.py. Briefly explain the purpose of placeholder variables and feed dictionaries in TensorFlow computations. Fill in the implementations for the add_placeholders, create_feed_dict in q1_classifier.py.

Placeholders are the input nodes of the computation graph: they declare a dtype and a shape but hold no data. At run time, a feed dictionary maps each placeholder to the actual values (for example a NumPy array) that sess.run should use for that step.
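
To make the idea concrete without depending on a particular TensorFlow version, here is a toy stand-in in plain Python. The `Placeholder` class and the `run` function are hypothetical and are not TensorFlow's API; they only mimic the pattern of describing a computation first and supplying the data later through a dictionary keyed by placeholder.

```python
class Placeholder(object):
    """Toy input node: it has a name but holds no data."""
    def __init__(self, name):
        self.name = name

def run(fetch, feed_dict):
    """Evaluate `fetch`, looking up each placeholder's value in feed_dict."""
    if isinstance(fetch, Placeholder):
        if fetch not in feed_dict:
            raise KeyError("You must feed a value for %s" % fetch.name)
        return feed_dict[fetch]
    op, left, right = fetch  # a tuple describes a deferred operation
    a, b = run(left, feed_dict), run(right, feed_dict)
    return a + b if op == "add" else a * b

x = Placeholder("x")
w = Placeholder("w")
graph = ("add", ("mul", x, w), x)  # describes x * w + x; nothing is computed yet

print(run(graph, {x: 3.0, w: 2.0}))  # 9.0
print(run(graph, {x: 1.0, w: 5.0}))  # 6.0: same graph, different data
```

Evaluating the graph without a value for one of its placeholders raises an error in this toy as well, which mirrors the "You must feed a value for placeholder tensor" message TensorFlow reports in that situation.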

In [63]:
import time
import math
import numpy as np
import tensorflow as tf
from q1_softmax import softmax
from q1_softmax import cross_entropy_loss
from model import Model
from utils import data_iterator

class Config(object):
    """Holds model hyperparams and data information.

    The config class is used to store various hyperparameters and dataset
    information parameters. Model objects are passed a Config() object at
    instantiation.
    """
    batch_size = 64
    n_samples = 1024
    n_features = 100
    n_classes = 5
    # You may adjust the max_epochs to ensure convergence.
    max_epochs = 50
    # You may adjust this learning rate to ensure convergence.
    lr = 1e-4 

class SoftmaxModel(Model):
    """Implements a Softmax classifier with cross-entropy loss."""

    def load_data(self):
        """Creates a synthetic dataset and stores it in memory."""
        np.random.seed(1234)
        self.input_data = np.random.rand(
            self.config.n_samples, self.config.n_features)
        self.input_labels = np.ones((self.config.n_samples,), dtype=np.int32)

    def add_placeholders(self):
        """Generate placeholder variables to represent the input tensors.

        These placeholders are used as inputs by the rest of the model building
        code and will be fed data during training.

        Adds following nodes to the computational graph

        input_placeholder: Input placeholder tensor of shape
                           (batch_size, n_features), type tf.float32
        labels_placeholder: Labels placeholder tensor of shape
                           (batch_size, n_classes), type tf.int32

        Add these placeholders to self as the instance variables

          self.input_placeholder
          self.labels_placeholder

        (Don't change the variable names)
        """
        ### YOUR CODE HERE
        self.input_placeholder = tf.placeholder(
            tf.float32,
            shape=(self.config.batch_size, self.config.n_features))
        self.labels_placeholder = tf.placeholder(
            tf.int32,
            shape=(self.config.batch_size, self.config.n_classes))
        ### END YOUR CODE

    def create_feed_dict(self, input_batch, label_batch):
        """Creates the feed_dict for softmax classifier.

        A feed_dict takes the form of:

        feed_dict = {
            <placeholder>: <tensor of values to be passed for placeholder>,
            ....
        }

        If label_batch is None, then no labels are added to feed_dict.

        Hint: The keys for the feed_dict should match the placeholder tensors
              created in add_placeholders.

        Args:
          input_batch: A batch of input data.
          label_batch: A batch of label data.
        Returns:
          feed_dict: The feed dictionary mapping from placeholders to values.
        """
        ### YOUR CODE HERE
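        feed_dict = {self.input_placeholder: input_batch}
        # Per the docstring, labels are only fed when they are provided.
        if label_batch is not None:
            feed_dict[self.labels_placeholder] = label_batch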
        ### END YOUR CODE
        return feed_dict

    def add_training_op(self, loss):
        """Sets up the training Ops.

        Creates an optimizer and applies the gradients to all trainable variables.
        The Op returned by this function is what must be passed to the
        `sess.run()` call to cause the model to train. See 

        https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer

        for more information.

        Hint: Use tf.train.GradientDescentOptimizer to get an optimizer object.
              Calling optimizer.minimize() will return a train_op object.

        Args:
          loss: Loss tensor, from cross_entropy_loss.
        Returns:
          train_op: The Op for training.
        """
        ### YOUR CODE HERE
        raise NotImplementedError
        ### END YOUR CODE
        return train_op

    def add_model(self, input_data):
        """Adds a linear-layer plus a softmax transformation

        The core transformation for this model which transforms a batch of input
        data into a batch of predictions. In this case, the mathematical
        transformation effected is

        y = softmax(xW + b)

        Hint: Make sure to create tf.Variables as needed. Also, make sure to use
              tf.name_scope to ensure that your name spaces are clean.
        Hint: For this simple use-case, it's sufficient to initialize both weights W
              and biases b with zeros.

        Args:
          input_data: A tensor of shape (batch_size, n_features).
        Returns:
          out: A tensor of shape (batch_size, n_classes)
        """
        ### YOUR CODE HERE
        raise NotImplementedError
        ### END YOUR CODE
        return out

    def add_loss_op(self, pred):
        """Adds cross_entropy_loss ops to the computational graph.

        Hint: Use the cross_entropy_loss function we defined. This should be a very
              short function.
        Args:
          pred: A tensor of shape (batch_size, n_classes)
        Returns:
          loss: A 0-d tensor (scalar)
        """
        ### YOUR CODE HERE
        raise NotImplementedError
        ### END YOUR CODE
        return loss

    def run_epoch(self, sess, input_data, input_labels):
        """Runs an epoch of training.

        Trains the model for one-epoch.

        Args:
          sess: tf.Session() object
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          average_loss: scalar. Average minibatch loss of model on epoch.
        """
        # And then after everything is built, start the training loop.
        average_loss = 0
        for step, (input_batch, label_batch) in enumerate(
            data_iterator(input_data, input_labels,
                          batch_size=self.config.batch_size,
                          label_size=self.config.n_classes)):

          # Fill a feed dictionary with the actual set of images and labels
          # for this particular training step.
          feed_dict = self.create_feed_dict(input_batch, label_batch)

          # Run one step of the model.  The return values are the activations
          # from the `self.train_op` (which is discarded) and the `loss` Op.  To
          # inspect the values of your Ops or variables, you may include them
          # in the list passed to sess.run() and the value tensors will be
          # returned in the tuple from the call.
          _, loss_value = sess.run([self.train_op, self.loss], feed_dict=feed_dict)
          average_loss += loss_value

        # enumerate() starts at 0, so the number of batches is step + 1.
        average_loss = average_loss / (step + 1)
        return average_loss 

    def fit(self, sess, input_data, input_labels):
        """Fit model on provided data.

        Args:
          sess: tf.Session()
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          losses: list of loss per epoch
        """
        losses = []
        for epoch in range(self.config.max_epochs):
          start_time = time.time()
          average_loss = self.run_epoch(sess, input_data, input_labels)
          duration = time.time() - start_time
          # Print status to stdout.
          print('Epoch %d: loss = %.2f (%.3f sec)'
                 % (epoch, average_loss, duration))
          losses.append(average_loss)
        return losses

    def __init__(self, config):
        """Initializes the model.

        Args:
          config: A model configuration object of type Config
        """
        self.config = config
        # Generate placeholders for the images and labels.
        self.load_data()
        self.add_placeholders()
        self.pred = self.add_model(self.input_placeholder)
        self.loss = self.add_loss_op(self.pred)
        self.train_op = self.add_training_op(self.loss)

#### Question d
Implement the transformation for a softmax classifier in function add_model in q1_classifier.py. Add cross-entropy loss in function add_loss_op in the same file. Use the implementations from the earlier parts of the problem, not TensorFlow built-ins.
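
A useful sanity check follows from the zero initialization suggested in the add_model hints: with W and b at zero, every logit is 0, softmax returns the uniform distribution, and a summed cross-entropy over one batch equals batch_size * log(n_classes). The plain-Python sketch below works this out for the values in Config. It assumes the loss is summed, not averaged, over the batch, as in the cross_entropy_loss implementation from Question b.

```python
import math

batch_size = 64  # Config.batch_size
n_classes = 5    # Config.n_classes

# With zero weights and biases every logit is 0, so softmax is uniform.
logits = [0.0] * n_classes
exps = [math.exp(v) for v in logits]
probs = [e / sum(exps) for e in exps]

# Each one-hot label picks out exactly one probability of 1 / n_classes.
per_sample_loss = -math.log(probs[0])
initial_batch_loss = batch_size * per_sample_loss

print(probs)               # [0.2, 0.2, 0.2, 0.2, 0.2]
print(initial_batch_loss)  # 64 * log(5), about 103.0
```

If the loss on the very first batch is far from this value, the wiring between add_model and add_loss_op is a reasonable place to start looking.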

In [64]:
import time
import math
import numpy as np
import tensorflow as tf
from q1_softmax import softmax
from q1_softmax import cross_entropy_loss
from model import Model
from utils import data_iterator

class Config(object):
    """Holds model hyperparams and data information.

    The config class is used to store various hyperparameters and dataset
    information parameters. Model objects are passed a Config() object at
    instantiation.
    """
    batch_size = 64
    n_samples = 1024
    n_features = 100
    n_classes = 5
    # You may adjust the max_epochs to ensure convergence.
    max_epochs = 50
    # You may adjust this learning rate to ensure convergence.
    lr = 1e-4 

class SoftmaxModel(Model):
    """Implements a Softmax classifier with cross-entropy loss."""

    def load_data(self):
        """Creates a synthetic dataset and stores it in memory."""
        np.random.seed(1234)
        self.input_data = np.random.rand(
            self.config.n_samples, self.config.n_features)
        self.input_labels = np.ones((self.config.n_samples,), dtype=np.int32)

    def add_placeholders(self):
        """Generate placeholder variables to represent the input tensors.

        These placeholders are used as inputs by the rest of the model building
        code and will be fed data during training.

        Adds following nodes to the computational graph

        input_placeholder: Input placeholder tensor of shape
                           (batch_size, n_features), type tf.float32
        labels_placeholder: Labels placeholder tensor of shape
                           (batch_size, n_classes), type tf.int32

        Add these placeholders to self as the instance variables

          self.input_placeholder
          self.labels_placeholder

        (Don't change the variable names)
        """
        ### YOUR CODE HERE
        self.input_placeholder = tf.placeholder(
            tf.float32,
            shape=(self.config.batch_size, self.config.n_features))
        self.labels_placeholder = tf.placeholder(
            tf.int32,
            shape=(self.config.batch_size, self.config.n_classes))
        ### END YOUR CODE

    def create_feed_dict(self, input_batch, label_batch):
        """Creates the feed_dict for softmax classifier.

        A feed_dict takes the form of:

        feed_dict = {
            <placeholder>: <tensor of values to be passed for placeholder>,
            ....
        }

        If label_batch is None, then no labels are added to feed_dict.

        Hint: The keys for the feed_dict should match the placeholder tensors
              created in add_placeholders.

        Args:
          input_batch: A batch of input data.
          label_batch: A batch of label data.
        Returns:
          feed_dict: The feed dictionary mapping from placeholders to values.
        """
        ### YOUR CODE HERE
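        feed_dict = {self.input_placeholder: input_batch}
        # Per the docstring, labels are only fed when they are provided.
        if label_batch is not None:
            feed_dict[self.labels_placeholder] = label_batch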
        ### END YOUR CODE
        return feed_dict

    def add_training_op(self, loss):
        """Sets up the training Ops.

        Creates an optimizer and applies the gradients to all trainable variables.
        The Op returned by this function is what must be passed to the
        `sess.run()` call to cause the model to train. See 

        https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer

        for more information.

        Hint: Use tf.train.GradientDescentOptimizer to get an optimizer object.
              Calling optimizer.minimize() will return a train_op object.

        Args:
          loss: Loss tensor, from cross_entropy_loss.
        Returns:
          train_op: The Op for training.
        """
        ### YOUR CODE HERE
        raise NotImplementedError
        ### END YOUR CODE
        return train_op

    def add_model(self, input_data):
        """Adds a linear-layer plus a softmax transformation

        The core transformation for this model which transforms a batch of input
        data into a batch of predictions. In this case, the mathematical
        transformation effected is

        y = softmax(xW + b)

        Hint: Make sure to create tf.Variables as needed. Also, make sure to use
              tf.name_scope to ensure that your name spaces are clean.
        Hint: For this simple use-case, it's sufficient to initialize both weights W
              and biases b with zeros.

        Args:
          input_data: A tensor of shape (batch_size, n_features).
        Returns:
          out: A tensor of shape (batch_size, n_classes)
        """
        ### YOUR CODE HERE
        n_features = self.config.n_features
        n_classes = self.config.n_classes
        
        with tf.name_scope('softmax'):
            weights = tf.Variable(tf.zeros([n_features, n_classes]), name = "weights")
            biases = tf.Variable(tf.zeros([n_classes]), name = 'biases')
            
            linear = tf.matmul(input_data, weights) + biases
            out = softmax(linear)
        
        ### END YOUR CODE
        return out

    def add_loss_op(self, pred):
        """Adds cross_entropy_loss ops to the computational graph.

        Hint: Use the cross_entropy_loss function we defined. This should be a very
              short function.
        Args:
          pred: A tensor of shape (batch_size, n_classes)
        Returns:
          loss: A 0-d tensor (scalar)
        """
        ### YOUR CODE HERE
        loss = cross_entropy_loss(self.labels_placeholder, pred)
        ### END YOUR CODE
        return loss

    def run_epoch(self, sess, input_data, input_labels):
        """Runs an epoch of training.

        Trains the model for one-epoch.

        Args:
          sess: tf.Session() object
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          average_loss: scalar. Average minibatch loss of model on epoch.
        """
        # And then after everything is built, start the training loop.
        average_loss = 0
        for step, (input_batch, label_batch) in enumerate(
            data_iterator(input_data, input_labels,
                          batch_size=self.config.batch_size,
                          label_size=self.config.n_classes)):

          # Fill a feed dictionary with the actual set of images and labels
          # for this particular training step.
          feed_dict = self.create_feed_dict(input_batch, label_batch)

          # Run one step of the model.  The return values are the activations
          # from the `self.train_op` (which is discarded) and the `loss` Op.  To
          # inspect the values of your Ops or variables, you may include them
          # in the list passed to sess.run() and the value tensors will be
          # returned in the tuple from the call.
          _, loss_value = sess.run([self.train_op, self.loss], feed_dict=feed_dict)
          average_loss += loss_value

        # enumerate() starts at 0, so the number of batches is step + 1.
        average_loss = average_loss / (step + 1)
        return average_loss 

    def fit(self, sess, input_data, input_labels):
        """Fit model on provided data.

        Args:
          sess: tf.Session()
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          losses: list of loss per epoch
        """
        losses = []
        for epoch in range(self.config.max_epochs):
          start_time = time.time()
          average_loss = self.run_epoch(sess, input_data, input_labels)
          duration = time.time() - start_time
          # Print status to stdout.
          print('Epoch %d: loss = %.2f (%.3f sec)'
                 % (epoch, average_loss, duration))
          losses.append(average_loss)
        return losses

    def __init__(self, config):
        """Initializes the model.

        Args:
          config: A model configuration object of type Config
        """
        self.config = config
        # Generate placeholders for the images and labels.
        self.load_data()
        self.add_placeholders()
        self.pred = self.add_model(self.input_placeholder)
        self.loss = self.add_loss_op(self.pred)
        self.train_op = self.add_training_op(self.loss)

#### Question e
Fill in the implementation for add_training_op in q1_classifier.py. Explain how TensorFlow's automatic differentiation removes the need to define gradients explicitly. Verify that your model is able to fit synthetic data by running python q1_classifier.py and making sure that the tests pass.

Hint: Make sure to use the learning rate specified in Config.

In [70]:
import time
import math
import numpy as np
import tensorflow as tf
from q1_softmax import softmax
from q1_softmax import cross_entropy_loss
from model import Model
from utils import data_iterator

class Config(object):
    """Holds model hyperparams and data information.

    The config class is used to store various hyperparameters and dataset
    information parameters. Model objects are passed a Config() object at
    instantiation.
    """
    batch_size = 64
    n_samples = 1024
    n_features = 100
    n_classes = 5
    # You may adjust the max_epochs to ensure convergence.
    max_epochs = 50
    # You may adjust this learning rate to ensure convergence.
    lr = 1e-4 

class SoftmaxModel(Model):
    """Implements a Softmax classifier with cross-entropy loss."""

    def load_data(self):
        """Creates a synthetic dataset and stores it in memory."""
        np.random.seed(1234)
        self.input_data = np.random.rand(
            self.config.n_samples, self.config.n_features)
        self.input_labels = np.ones((self.config.n_samples,), dtype=np.int32)

    def add_placeholders(self):
        """Generate placeholder variables to represent the input tensors.

        These placeholders are used as inputs by the rest of the model building
        code and will be fed data during training.

        Adds following nodes to the computational graph

        input_placeholder: Input placeholder tensor of shape
                           (batch_size, n_features), type tf.float32
        labels_placeholder: Labels placeholder tensor of shape
                           (batch_size, n_classes), type tf.int32

        Add these placeholders to self as the instance variables

          self.input_placeholder
          self.labels_placeholder

        (Don't change the variable names)
        """
        ### YOUR CODE HERE
        self.input_placeholder = tf.placeholder(
            tf.float32,
            shape=(self.config.batch_size, self.config.n_features))
        self.labels_placeholder = tf.placeholder(
            tf.int32,
            shape=(self.config.batch_size, self.config.n_classes))
        ### END YOUR CODE

    def create_feed_dict(self, input_batch, label_batch):
        """Creates the feed_dict for softmax classifier.

        A feed_dict takes the form of:

        feed_dict = {
            <placeholder>: <tensor of values to be passed for placeholder>,
            ....
        }

        If label_batch is None, then no labels are added to feed_dict.

        Hint: The keys for the feed_dict should match the placeholder tensors
              created in add_placeholders.

        Args:
          input_batch: A batch of input data.
          label_batch: A batch of label data.
        Returns:
          feed_dict: The feed dictionary mapping from placeholders to values.
        """
        ### YOUR CODE HERE
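        feed_dict = {self.input_placeholder: input_batch}
        # Per the docstring, labels are only fed when they are provided.
        if label_batch is not None:
            feed_dict[self.labels_placeholder] = label_batch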
        ### END YOUR CODE
        return feed_dict

    def add_training_op(self, loss):
        """Sets up the training Ops.

        Creates an optimizer and applies the gradients to all trainable variables.
        The Op returned by this function is what must be passed to the
        `sess.run()` call to cause the model to train. See 

        https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer

        for more information.

        Hint: Use tf.train.GradientDescentOptimizer to get an optimizer object.
              Calling optimizer.minimize() will return a train_op object.

        Args:
          loss: Loss tensor, from cross_entropy_loss.
        Returns:
          train_op: The Op for training.
        """
        ### YOUR CODE HERE
        optimizer = tf.train.GradientDescentOptimizer(self.config.lr)
        step = tf.Variable(0, name = 'step', trainable = False)
        train_op = optimizer.minimize(loss, global_step = step)
        ### END YOUR CODE
        return train_op

    def add_model(self, input_data):
        """Adds a linear-layer plus a softmax transformation

        The core transformation for this model which transforms a batch of input
        data into a batch of predictions. In this case, the mathematical
        transformation effected is

        y = softmax(xW + b)

        Hint: Make sure to create tf.Variables as needed. Also, make sure to use
              tf.name_scope to ensure that your name spaces are clean.
        Hint: For this simple use-case, it's sufficient to initialize both weights W
              and biases b with zeros.

        Args:
          input_data: A tensor of shape (batch_size, n_features).
        Returns:
          out: A tensor of shape (batch_size, n_classes)
        """
        ### YOUR CODE HERE
        n_features = self.config.n_features
        n_classes = self.config.n_classes
        
        with tf.name_scope('softmax'):
            weights = tf.Variable(tf.zeros([n_features, n_classes]), name = "weights")
            biases = tf.Variable(tf.zeros([n_classes]), name = 'biases')
            
            linear = tf.matmul(input_data, weights) + biases
            out = softmax(linear)
        
        ### END YOUR CODE
        return out

    def add_loss_op(self, pred):
        """Adds cross_entropy_loss ops to the computational graph.

        Hint: Use the cross_entropy_loss function we defined. This should be a very
              short function.
        Args:
          pred: A tensor of shape (batch_size, n_classes)
        Returns:
          loss: A 0-d tensor (scalar)
        """
        ### YOUR CODE HERE
        loss = cross_entropy_loss(self.labels_placeholder, pred)
        ### END YOUR CODE
        return loss

    def run_epoch(self, sess, input_data, input_labels):
        """Runs an epoch of training.

        Trains the model for one-epoch.

        Args:
          sess: tf.Session() object
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          average_loss: scalar. Average minibatch loss of model on epoch.
        """
        # And then after everything is built, start the training loop.
        average_loss = 0
        for step, (input_batch, label_batch) in enumerate(
            data_iterator(input_data, input_labels,
                          batch_size=self.config.batch_size,
                          label_size=self.config.n_classes)):

            # Fill a feed dictionary with the actual set of images and labels
            # for this particular training step.
            feed_dict = self.create_feed_dict(input_batch, label_batch)

            # Run one step of the model.  The return values are the activations
            # from the `self.train_op` (which is discarded) and the `loss` Op.  To
            # inspect the values of your Ops or variables, you may include them
            # in the list passed to sess.run() and the value tensors will be
            # returned in the tuple from the call.
            _, loss_value = sess.run([self.train_op, self.loss], feed_dict=feed_dict)
            average_loss += loss_value

        # enumerate() starts at 0, so the number of batches is step + 1.
        average_loss = average_loss / (step + 1)
        return average_loss 

    def fit(self, sess, input_data, input_labels):
        """Fit model on provided data.

        Args:
          sess: tf.Session()
          input_data: np.ndarray of shape (n_samples, n_features)
          input_labels: np.ndarray of shape (n_samples, n_classes)
        Returns:
          losses: list of loss per epoch
        """
        losses = []
        for epoch in range(self.config.max_epochs):
            start_time = time.time()
            average_loss = self.run_epoch(sess, input_data, input_labels)
            duration = time.time() - start_time
            # Print status to stdout.
            print('Epoch %d: loss = %.2f (%.3f sec)'
                 % (epoch, average_loss, duration))
            losses.append(average_loss)
        return losses

    def __init__(self, config):
        """Initializes the model.

        Args:
          config: A model configuration object of type Config
        """
        self.config = config
        # Generate placeholders for the images and labels.
        self.load_data()
        self.add_placeholders()
        self.pred = self.add_model(self.input_placeholder)
        self.loss = self.add_loss_op(self.pred)
        self.train_op = self.add_training_op(self.loss)
        
def test_SoftmaxModel():
    """Train softmax model for a number of steps."""
    config = Config()
    with tf.Graph().as_default():
        model = SoftmaxModel(config)

        # Create a session for running Ops on the Graph.
        sess = tf.Session()

        # Run the Op to initialize the variables.
        init = tf.initialize_all_variables()
        sess.run(init)

        losses = model.fit(sess, model.input_data, model.input_labels)

    # If ops are implemented correctly, the average loss should fall close to zero
    # rapidly.
    assert losses[-1] < .5
    print("Basic (non-exhaustive) classifier tests pass\n")

In [71]:
test_SoftmaxModel()

ValueError: Cannot use the default session to evaluate tensor: the tensor's graph is different from the session's graph. Pass an explicit session to `eval(session=sess)`.

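TensorFlow computes the gradients automatically: every op in the graph has a registered gradient, so once the graph from the inputs to the loss is fully defined, optimizer.minimize(loss) can apply the chain rule backward through it and we never have to write the derivatives by hand.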
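
For this model the gradient can also be written by hand: for a one-hot label y, the derivative of CE(y, softmax(z)) with respect to the logits z is softmax(z) - y. The plain-Python sketch below (illustrative helper names, no TensorFlow involved) compares that closed form with a finite-difference estimate, which is the kind of manual work automatic differentiation makes unnecessary.

```python
import math

def softmax(z):
    """Numerically stable softmax of a single list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y, z):
    """Cross-entropy between one-hot y and softmax(z)."""
    return -sum(t * math.log(p) for t, p in zip(y, softmax(z)))

z = [0.5, -1.0, 2.0]
y = [0, 1, 0]

# Closed-form gradient with respect to the logits.
analytic = [p - t for p, t in zip(softmax(z), y)]

# Central finite differences as an independent check.
eps = 1e-6
numeric = []
for i in range(len(z)):
    z_plus = list(z)
    z_plus[i] += eps
    z_minus = list(z)
    z_minus[i] -= eps
    numeric.append(
        (cross_entropy(y, z_plus) - cross_entropy(y, z_minus)) / (2 * eps))

for a, n in zip(analytic, numeric):
    assert abs(a - n) < 1e-6
print(analytic)
```

The entries of the gradient sum to zero, since both softmax(z) and y sum to one.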

Now, let's test the functions:

In [161]:
test_SoftmaxModel()

ValueError: Cannot use the default session to evaluate tensor: the tensor's graph is different from the session's graph. Pass an explicit session to `eval(session=sess)`.

In [163]:
config = Config()
with tf.Session() as sess:
    model = SoftmaxModel(config)

InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_20' with dtype float and shape [64,100]
	 [[Node: Placeholder_20 = Placeholder[dtype=DT_FLOAT, shape=[64,100], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op 'Placeholder_20', defined at:
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\__main__.py", line 3, in <module>
    app.launch_new_instance()
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
    ioloop.IOLoop.instance().start()
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tornado\ioloop.py", line 887, in start
    handler_func(fd_obj, events)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\kernelbase.py", line 276, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\kernelbase.py", line 228, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\kernelbase.py", line 390, in execute_request
    user_expressions, allow_stdin)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\ipykernel\zmqshell.py", line 501, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-163-bbd1341491b1>", line 3, in <module>
    model = SoftmaxModel(config)
  File "<ipython-input-151-20557fad3fc1>", line 229, in __init__
    self.add_placeholders()
  File "<ipython-input-151-20557fad3fc1>", line 59, in add_placeholders
    self.config.n_features))
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1512, in placeholder
    name=name)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 2043, in _placeholder
    name=name)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "C:\Users\Peter martigny\Anaconda3\envs\py35\lib\site-packages\tensorflow\python\framework\ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_20' with dtype float and shape [64,100]
	 [[Node: Placeholder_20 = Placeholder[dtype=DT_FLOAT, shape=[64,100], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
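
Both errors above are consistent with a tensor being evaluated while the graph is still being built, for example through the `x.eval()` call that the `softmax` from Part 1 uses to read the number of rows. This is an assumption, since the traceback only shows where the placeholder was defined. Under that assumption, `eval()` either runs against a session bound to a different graph (the `ValueError`) or needs a value for the placeholder that nothing has fed yet (the `InvalidArgumentError`). The row count is not actually needed: a reduction that keeps the reduced axis (as far as we know, `keep_dims=True` on `tf.reduce_max` and `tf.reduce_sum` in TensorFlow versions of this era) gives one value per row that broadcasts directly. The sketch below shows the same arithmetic in plain Python on nested lists; `softmax_rows` is a hypothetical name used only for illustration.

```python
import math


def softmax_rows(rows):
    """Row-wise, numerically stable softmax on a list of lists (illustrative only)."""
    out = []
    for row in rows:
        row_max = max(row)                            # one max per row
        exps = [math.exp(v - row_max) for v in row]   # shifted exponentials, no overflow
        total = sum(exps)                             # one normaliser per row
        out.append([e / total for e in exps])
    return out


# Same inputs as the basic test in Part 1.
probs = softmax_rows([[1001.0, 1002.0], [3.0, 4.0]])
# Both rows come out as roughly [0.269, 0.731].
```

Nothing here depends on knowing the number of rows in advance, which is the property the graph version needs in order to work with placeholders.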
