# Batches

Modern data sets can contain millions or even billions of data points, so training on one data point at a time is impractical. For this reason, modern applications split their input data into batches.

A __batch__ is a set of data points that are processed in a single training step. Batch size plays an important role in determining an application's performance and accuracy.

### Small vs. Large
If batches are small, an application may take a long time to train. If batches are large, the training time may decrease, but the application's accuracy may also decrease. Unfortunately, there's no clear rule for selecting batch size.

In practice, the training data is split into subsets called batches, each training step processes one batch, and the batch size is determined by trial and error.
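
As a minimal sketch of the splitting step, the snippet below cuts a list of data points into consecutive batches. The helper name `split_into_batches` and the toy data are made up for illustration; a real pipeline would typically work on arrays, but the idea is the same.

```python
def split_into_batches(data, batch_size):
    """Return consecutive slices of data, each holding at most batch_size items."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

points = list(range(10))  # toy stand-in for a real data set
batches = split_into_batches(points, 4)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Note that the last batch is smaller when the batch size does not divide the data set evenly. Trying a different batch size only means changing the second argument.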

### Randomizing / Shuffling Training Batches (To Avoid Getting Stuck at Local Minima)

Training methods frequently get stuck at local minima instead of finding the global point of minimum loss. One way to make this less likely is to randomize or shuffle the training batches.

Batch shuffling increases the likelihood of finding the global minimum.

If you shuffle the batches in the gradient descent method, the resulting algorithm is called __Stochastic Gradient Descent__, or SGD.
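
One way to picture shuffling is to reorder the data before every pass, so that each pass sees different batches. The sketch below uses only Python's standard library. The function name `shuffled_batches` and the fixed seed are assumptions made for the example, not part of any TensorFlow API.

```python
import random

def shuffled_batches(data, batch_size, rng):
    """Yield batches drawn from a freshly shuffled copy of data."""
    order = list(data)   # copy, so the caller's data keeps its order
    rng.shuffle(order)   # new random order on every call
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]

rng = random.Random(0)   # fixed seed only so the sketch is repeatable
points = list(range(10))
for epoch in range(2):
    epoch_batches = list(shuffled_batches(points, 4, rng))
    print(epoch, epoch_batches)
```

Every point still appears exactly once per pass. Only the grouping into batches changes from one pass to the next.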

### Placeholders (Tensors that receive different batches of data)

If a tensor contains an input batch of data, it will need to be updated with each training step. (Check the tensorflow folder for more examples.)

This process of updating the input tensor is called __feeding__, and TensorFlow provides a specific type of tensor for receiving batch data. This type of tensor is called a __placeholder__, and you create one by calling __tf.placeholder__.

    tf.placeholder(dtype, shape=None) # no initial values until fed data

This function requires the data type of the input data, and an optional parameter sets the batch's shape. __tf.placeholder__ returns a tensor without any values; its values are provided as training is performed.

We can feed data into a placeholder using the second parameter of the session's run method. This parameter is called __feed_dict__, and it accepts a dictionary that associates placeholders with data. The example below demonstrates how batches and placeholders work together.

    sess.run(fetches, feed_dict=None) # feed_dict: dictionary that associates placeholders with data

### Placeholder Example

In [None]:
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

num_batches = 10

# Configured to contain 1,000 floats in a 100 x 10 matrix
holder = tf.placeholder(tf.float32, shape=[100, 10])  # returns a placeholder tensor named holder

with tf.Session() as sess:
    for _ in range(num_batches):
        # Left elided as in the original: supply one 100 x 10 batch here
        batch_data = np.array(...)

        # Values passed through feed_dict can't be tensors, but they can be
        # NumPy arrays. Keep this in mind when you use batches and placeholders.

        # op stands for any operation that depends on holder (not defined in this snippet)
        sess.run(op, feed_dict={holder: batch_data})

----

### Linear Regression Example

- Set constants
- Generate input points
- Create variables and placeholders
- Define model and loss
- Create optimizer
- Execute optimizer in a session


In [7]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

# Set constants
N = 1000
learn_rate = 0.1
batch_size = 40
num_batches = 400

# Step 1: Generate input points
x = np.random.normal(size=N) # x values; the points scatter around the line y = 0.5x + 1
m_real = np.random.normal(loc=0.5, scale=0.2, size=N) # per-point slopes: mean 0.5, standard deviation 0.2
b_real = np.random.normal(loc=1.0, scale=0.2, size=N) # per-point intercepts: mean 1.0, standard deviation 0.2
y = m_real * x + b_real

# Step 2: Create variables and placeholders
m = tf.Variable(tf.random_normal([])) # scalar variable (shape []) with a random initial value
b = tf.Variable(tf.random_normal([]))
x_holder = tf.placeholder(tf.float32, shape=[batch_size])
y_holder = tf.placeholder(tf.float32, shape=[batch_size])

# Step 3: Define model and loss
model = m * x_holder + b # general equation of the line

# Mean squared error: tf.pow squares the difference between the model's
# prediction and y_holder for every point in the batch, and tf.reduce_mean
# averages those squared errors into a single loss value.
loss = tf.reduce_mean(tf.pow(model - y_holder, 2))

# Step 4: Create optimizer
optimizer = tf.train.GradientDescentOptimizer(learn_rate).minimize(loss)

# Step 5: Execute optimizer in a session
with tf.Session() as sess:
    # Initialize both variables (m and b) before training
    sess.run(tf.global_variables_initializer())

    # Perform training in a loop that executes num_batches times
    for _ in range(num_batches):

        # Within each step, draw a random batch (with replacement) from the
        # original data to send into the two placeholders
        index = np.random.randint(0, N, size=batch_size)  # random indices into the original data
        x_data = x[index]  # x values for this batch
        y_data = y[index]  # matching y values

        # Start the training process by calling sess.run() with the optimizer
        sess.run(optimizer, feed_dict={x_holder: x_data, y_holder: y_data})

    print('m - ', sess.run(m))
    print('b - ', sess.run(b))
                

m -  0.49117818
b -  0.977285


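This example demonstrates, in practice, how variables, placeholders, training, and optimizers work together. The learned values (m is about 0.49 and b is about 0.98) are close to the slope of 0.5 and the intercept of 1.0 used to generate the data.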
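
To make the optimizer's job more concrete, here is a minimal sketch of the same mini-batch gradient descent in plain Python, with no TensorFlow. The helper names `make_data` and `sgd_linear_fit` are made up for illustration. The gradients are derived by hand from the mean squared error loss rather than computed by TensorFlow, so this approximates the idea and does not describe the library's internals.

```python
import random

def make_data(n, rng):
    """Generate n points scattered around the line y = 0.5x + 1."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        slope = rng.gauss(0.5, 0.2)      # per-point slope, like m_real
        intercept = rng.gauss(1.0, 0.2)  # per-point intercept, like b_real
        xs.append(x)
        ys.append(slope * x + intercept)
    return xs, ys

def sgd_linear_fit(xs, ys, learn_rate=0.1, batch_size=40, num_batches=400, seed=0):
    """Fit y = m*x + b by gradient descent on randomly drawn batches."""
    rng = random.Random(seed)
    m, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # random starting values
    n = len(xs)
    for _ in range(num_batches):
        # Draw one batch of indices at random (with replacement)
        batch = [rng.randrange(n) for _ in range(batch_size)]
        grad_m = 0.0
        grad_b = 0.0
        for i in batch:
            err = m * xs[i] + b - ys[i]  # prediction minus target
            grad_m += 2.0 * err * xs[i]  # d(err^2)/dm
            grad_b += 2.0 * err          # d(err^2)/db
        # Step against the batch-averaged gradient of the squared error
        m -= learn_rate * grad_m / batch_size
        b -= learn_rate * grad_b / batch_size
    return m, b

xs, ys = make_data(1000, random.Random(1))
m, b = sgd_linear_fit(xs, ys)
print('m - ', m)
print('b - ', b)
```

The constants mirror the TensorFlow example (1,000 points, a learning rate of 0.1, batches of 40, 400 steps), so the printed values should land near 0.5 and 1.0, though the exact numbers depend on the random seeds.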