# Resources

https://chromium.googlesource.com/external/github.com/tensorflow/tensorflow/+/r0.7/tensorflow/g3doc/tutorials/mnist/beginners/index.md

Flattening the data throws away information about the 2D structure of the image. Isn't that bad? Well, the best computer vision methods do exploit this structure, and we will in later tutorials. But the simple method we will be using here, a softmax regression, won't.

The result is that mnist.train.images is a tensor (an n-dimensional array) with a shape of [55000, 784]. The first dimension indexes the images and the second dimension indexes the pixels in each image. Each entry in the tensor is the pixel intensity between 0 and 1, for a particular pixel in a particular image.
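As a rough illustration of the flattening step described above, the sketch below reshapes a stack of 28x28 images into rows of 784 values and scales them to the range 0 to 1. The random `images` array is a hypothetical stand-in for real MNIST data, and dividing by 255 is an assumption about how the intensities were scaled.

```python
import numpy as np

# Hypothetical stand-in for two 28x28 grayscale images with values 0-255
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Flatten each image into one row of 784 pixels and scale to [0, 1]
flat = images.reshape(len(images), -1).astype(np.float32) / 255.0

print(flat.shape)  # (2, 784)
```

Pixel `(row, col)` of an image ends up at index `28 * row + col` of its flattened row, so the values are kept but their 2D neighbourhood is no longer explicit.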

The corresponding labels in MNIST are numbers between 0 and 9, describing which digit a given image is of. For the purposes of this tutorial, we're going to want our labels as “one-hot vectors”. A one-hot vector is a vector which is 0 in most dimensions, and 1 in a single dimension. In this case, the \(n\)th digit will be represented as a vector which is 1 in the \(n\)th dimension. For example, 3 would be \([0,0,0,1,0,0,0,0,0,0]\). Consequently, mnist.train.labels is a [55000, 10] array of floats.
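One way to build such one-hot vectors is to index the rows of an identity matrix. This is a minimal sketch; the helper name `one_hot` is made up for illustration.

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """Return one row per label, with a 1 at the label's index (hypothetical helper)."""
    labels = np.asarray(labels)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(n_classes, dtype=np.float32)[labels]

encoded = one_hot([3, 0, 9])
print(encoded[0])  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

Taking `argmax` along axis 1 recovers the original integer labels.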




# Mini-batching
 
In mini-batching, we take a random subset (a batch) of the data and update the network weights on that batch with gradient descent. This makes it possible to train a model even when the computer lacks the memory to hold the entire dataset.
Because each batch is a random sample, every update is a step of stochastic gradient descent (SGD).
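The random selection of batches can be sketched as follows: shuffle the sample indices once per pass, then slice the shuffled order into consecutive batches. The function name `shuffled_batches` and the toy data are assumptions for illustration only.

```python
import numpy as np

def shuffled_batches(features, labels, batch_size, rng):
    """Yield (features, labels) mini-batches in a fresh random order (hypothetical helper)."""
    order = rng.permutation(len(features))
    for start in range(0, len(features), batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], labels[idx]

rng = np.random.default_rng(0)
X = np.arange(10).reshape(10, 1)   # toy features, one column
y = np.arange(10)                  # toy labels matching the features

sizes = [len(batch_x) for batch_x, _ in shuffled_batches(X, y, 4, rng)]
print(sizes)  # [4, 4, 2]
```

Each sample appears in exactly one batch per pass, and features stay paired with their labels because both are indexed with the same shuffled indices.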

In [51]:
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

from tensorflow import keras

# Earlier loader (input_data.read_data_sets), kept for reference.
# With that loader the features are already scaled and the data is shuffled.
#mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# train_features = mnist.train.images
# test_features = mnist.test.images
# train_labels = mnist.train.labels.astype(np.float32)
# test_labels = mnist.test.labels.astype(np.float32)

# Load MNIST through Keras: 28x28 uint8 images and integer labels
(train_features, train_labels), (test_features, test_labels) = keras.datasets.mnist.load_data()

# One-hot encode the integer labels
y_train = np.zeros((train_labels.shape[0], train_labels.max()+1), dtype=np.float32)
y_train[np.arange(train_labels.shape[0]), train_labels] = 1
y_test = np.zeros((test_labels.shape[0], test_labels.max()+1), dtype=np.float32)
y_test[np.arange(test_labels.shape[0]), test_labels] = 1


print(test_labels)

[7 2 1 ... 4 5 6]


In [49]:
import numpy as np
l = np.zeros((test_labels.shape[0],test_labels.max()+1), dtype=np.float32)
l[np.arange(test_labels.shape[0]),test_labels] = 1

In [50]:
l

array([[0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

Say we want batches of 128 samples and have 1000 samples to divide. The batches cannot all be the same size: we get 7 batches of 128 samples and 1 batch of 104 samples (7 * 128 + 1 * 104 = 1000).
Because the batch size varies, we use TensorFlow's tf.placeholder() function to receive batches of varying sizes.
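The batch arithmetic above can be checked with a few lines of plain Python, using the sample count and batch size from the text.

```python
n_samples, batch_size = 1000, 128

# Number of full batches and the size of the leftover batch
full_batches, remainder = divmod(n_samples, batch_size)
print(full_batches, remainder)  # 7 104

# Size of every batch when slicing in steps of batch_size
batch_sizes = [min(batch_size, n_samples - start)
               for start in range(0, n_samples, batch_size)]
print(len(batch_sizes), batch_sizes[-1])  # 8 104
```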

If each sample has n_input = 784 features and n_classes = 10 possible labels, the dimensions for features would be [None, n_input] and for labels [None, n_classes].

The None dimension is a placeholder for the batch size. At runtime, TensorFlow will accept any batch size greater than 0.
Going back to our earlier example, this setup allows you to feed features and labels into the model as either the batches of 128 samples or the single batch of 104 samples.
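To see why leaving the batch dimension open is harmless, here is a NumPy analogue (not TensorFlow code) of the linear layer xW + b. The weight shapes depend only on `n_input` and `n_classes`, so the same computation accepts a batch of 128 rows or of 104 rows. The random weights and zero-filled inputs are placeholders for illustration.

```python
import numpy as np

n_input, n_classes = 784, 10
rng = np.random.default_rng(0)
weights = rng.normal(size=(n_input, n_classes))
bias = rng.normal(size=n_classes)

def linear_logits(batch):
    """Compute xW + b for a batch of any size (hypothetical helper)."""
    return batch @ weights + bias

for size in (128, 104):
    out = linear_logits(np.zeros((size, n_input)))
    print(out.shape)  # (128, 10) then (104, 10)
```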



In [None]:
# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])


In [56]:
def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        # The final slice may be shorter than batch_size
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)

    return output_batches
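A small usage sketch may help here. It repeats a compact version of the batching function so the example runs on its own, and the five toy samples are invented for illustration.

```python
def batches(batch_size, features, labels):
    """Split features and labels into consecutive batches of at most batch_size."""
    assert len(features) == len(labels)
    return [
        [features[start:start + batch_size], labels[start:start + batch_size]]
        for start in range(0, len(features), batch_size)
    ]

# Toy data: five samples with two features each
example_features = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
example_labels = [0, 1, 0, 1, 0]

example_batches = batches(2, example_features, example_labels)
for batch_features, batch_labels in example_batches:
    print(batch_features, batch_labels)
```

With five samples and a batch size of 2, the last batch holds the single leftover sample, which is the varying-size case the placeholders are meant to accept.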


In [57]:
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

from tensorflow import keras

# Earlier loader (input_data.read_data_sets), kept for reference.
# With that loader the features are already scaled and the data is shuffled.
#mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# train_features = mnist.train.images
# test_features = mnist.test.images
# train_labels = mnist.train.labels.astype(np.float32)
# test_labels = mnist.test.labels.astype(np.float32)

# Load MNIST through Keras: 28x28 uint8 images and integer labels
(train_features, train_labels), (test_features, test_labels) = keras.datasets.mnist.load_data()

# One-hot encode the integer labels
y_train = np.zeros((train_labels.shape[0], train_labels.max()+1), dtype=np.float32)
y_train[np.arange(train_labels.shape[0]), train_labels] = 1
y_test = np.zeros((test_labels.shape[0], test_labels.max()+1), dtype=np.float32)
y_test[np.arange(test_labels.shape[0]), test_labels] = 1

train_labels = y_train
test_labels = y_test

# Flatten each 28x28 image into a 784-element vector
train_features = train_features.reshape(60000, 784)
test_features = test_features.reshape(10000, 784)

learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)


# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
print(features)
labels = tf.placeholder(tf.float32, [None, n_classes])
print(labels)

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


# Set the batch size
batch_size = 50
assert batch_size is not None, 'You must set the batch size'

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    
    # Train the optimizer on all batches (one pass over the training data)
    for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
        sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels})

    # Calculate accuracy for test dataset
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))



Tensor("Placeholder_20:0", shape=(?, 784), dtype=float32)
Tensor("Placeholder_21:0", shape=(?, 10), dtype=float32)
Test Accuracy: 0.8513000011444092
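For readers who want to see what the cost and accuracy nodes compute, the following is a NumPy sketch of softmax cross-entropy and argmax accuracy. It is an illustrative re-derivation under the usual definitions, not TensorFlow's implementation, and the function names and toy logits are made up.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Per-sample cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(one_hot_labels * log_probs).sum(axis=1)

def argmax_accuracy(logits, one_hot_labels):
    """Fraction of rows whose largest logit matches the labelled class."""
    return float(np.mean(logits.argmax(axis=1) == one_hot_labels.argmax(axis=1)))

toy_logits = np.array([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 3.0],
                       [1.0, 2.0, 0.0]])
toy_labels = np.array([[1, 0, 0],
                       [0, 0, 1],
                       [1, 0, 0]], dtype=np.float32)

cost = softmax_cross_entropy(toy_logits, toy_labels).mean()
print(cost)
print(argmax_accuracy(toy_logits, toy_labels))  # two of three rows are correct
```

The mean of the per-sample losses plays the role of `cost`, and the accuracy is the mean of the per-row correctness flags, as in the `accuracy` node.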
