# Stochastic Gradient Descent (SGD)

Stochastic gradient descent is an iterative optimization algorithm that uses the gradient of the model's loss function to move the model's parameters toward a local minimum. A local minimum of the loss function is a point where the prediction error is smaller than at any nearby point.

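With SGD, instead of computing the gradient over the entire dataset for every update, we estimate it from a small group of data points, called a batch. (In its strictest form, SGD uses a single data point per update; in practice, batches are the norm.) The gradient computed on the batch sets the direction of each step, and the learning rate scales its size as we walk down toward a local minimum of the loss function. Batch size is a trade-off: smaller batches give noisier gradient estimates and require more updates per pass over the data, while larger batches give smoother estimates but cost more computation and memory per update.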
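The update loop can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a one-variable linear model trained with mean squared error; the function name `minibatch_sgd`, the toy data, and the hyperparameter values are hypothetical choices made for the sketch.

```python
import numpy as np

def minibatch_sgd(x, y, lr=0.1, batch_size=16, epochs=50, seed=0):
    """Fit y ~ w * x + b by mini-batch SGD on mean squared error."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    n = x.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)               # reshuffle the data each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = x[idx], y[idx]
            err = w * xb + b - yb                # prediction error on this batch
            grad_w = 2.0 * np.mean(err * xb)     # d(MSE)/dw estimated on the batch
            grad_b = 2.0 * np.mean(err)          # d(MSE)/db estimated on the batch
            w -= lr * grad_w                     # step = learning rate * gradient
            b -= lr * grad_b
    return w, b

# Hypothetical toy data: y = 3x + 1 plus a little noise
data_rng = np.random.default_rng(42)
x_toy = data_rng.uniform(-1.0, 1.0, size=200)
y_toy = 3.0 * x_toy + 1.0 + data_rng.normal(0.0, 0.05, size=200)

w_fit, b_fit = minibatch_sgd(x_toy, y_toy)
print(w_fit, b_fit)  # should land close to 3 and 1
```

Changing `batch_size` here changes how many updates happen per epoch and how noisy each one is; the step length itself is controlled by `lr`.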


# Backpropagation 

Backpropagation is an algorithm that efficiently computes the gradient of the loss function with respect to the weights of the network for a single input-output example.

It works by applying the chain rule to compute the gradient of the loss function with respect to each weight, one layer at a time. Iterating backward from the last layer lets each layer reuse the intermediate terms already computed for the layers after it, which avoids redundant calculations.
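To make the backward pass concrete, here is a minimal NumPy sketch for a tiny two-layer network. The architecture (a `tanh` hidden layer, a linear output, and squared-error loss), the function name `forward_backward`, and the random inputs are all assumptions chosen for illustration. The result is compared against a finite-difference estimate as a sanity check.

```python
import numpy as np

def forward_backward(x, y, W1, W2):
    """Loss and weight gradients for a tiny net: x -> tanh(W1 x) -> W2 h."""
    # Forward pass, keeping intermediates for reuse in the backward pass
    z1 = W1 @ x
    h = np.tanh(z1)
    y_hat = W2 @ h
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Backward pass: start at the output and apply the chain rule layer by layer
    d_yhat = y_hat - y               # dL/dy_hat
    dW2 = np.outer(d_yhat, h)        # dL/dW2
    d_h = W2.T @ d_yhat              # dL/dh, reused for the earlier layer
    d_z1 = d_h * (1.0 - h ** 2)      # tanh'(z1) = 1 - tanh(z1)^2
    dW1 = np.outer(d_z1, x)          # dL/dW1
    return loss, dW1, dW2

rng = np.random.default_rng(0)
x_in, y_out = rng.normal(size=3), rng.normal(size=2)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
loss, dW1, dW2 = forward_backward(x_in, y_out, W1, W2)

# Central finite-difference check on a single entry of W1
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (forward_backward(x_in, y_out, W1_plus, W2)[0]
           - forward_backward(x_in, y_out, W1_minus, W2)[0]) / (2 * eps)
print(dW1[0, 0], numeric)  # the two values should agree closely
```

Note how `d_h` is computed once and then reused to get `dW1`; that reuse is the saving that working backward from the last layer provides.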

## Example

In this example, a CSV file of pixel information is used to classify the contents of images. Each row describes one image: the first column contains the label, and each remaining column in that row contains the intensity of one pixel. The images are 28 by 28 pixels, so each row has 784 pixel columns.
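The row layout can be illustrated on a tiny stand-in file. The sketch below assumes a hypothetical two-row CSV with 2 by 2 images (four pixel columns) and made-up header names, read from an in-memory string so it runs without any file on disk.

```python
import io
import numpy as np

# Hypothetical two-row file with 2x2 images, for illustration only
csv_text = (
    "label,pixel1,pixel2,pixel3,pixel4\n"
    "3,0,128,255,64\n"
    "7,255,0,0,32\n"
)
raw_demo = np.loadtxt(io.StringIO(csv_text), skiprows=1, delimiter=",")

labels = raw_demo[:, 0].astype(int)            # first column: class label
pixels = raw_demo[:, 1:].reshape(-1, 2, 2, 1)  # remaining columns: one image per row
pixels = pixels / 255                          # scale intensities to [0, 1]

print(labels, pixels.shape)  # [3 7] (2, 2, 2, 1)
```

The real data follows the same pattern, with 28 by 28 images in place of the 2 by 2 ones used here.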

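```python
import numpy as np
from tensorflow import keras

img_rows, img_cols = 28, 28  # image height and width in pixels
num_classes = 10

def prep_data(raw):
    """Split raw CSV rows into image tensors and one-hot labels."""
    # First column holds the class label; one-hot encode it
    y = raw[:, 0]
    out_y = keras.utils.to_categorical(y, num_classes)

    # Remaining columns hold pixel intensities; reshape to (N, rows, cols, channels)
    x = raw[:, 1:]
    num_images = raw.shape[0]
    out_x = x.reshape(num_images, img_rows, img_cols, 1)
    out_x = out_x / 255  # scale intensities to [0, 1]
    return out_x, out_y

# The data file must be created first; set its path below, then uncomment to load it
# fashion_file = ''
# fashion_data = np.loadtxt(fashion_file, skiprows=1, delimiter=',')
# x, y = prep_data(fashion_data)
```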


```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D

# Three convolutional layers followed by a dense classifier
fashion_model = Sequential()
fashion_model.add(Conv2D(12, kernel_size=3, activation='relu',
                         input_shape=(img_rows, img_cols, 1)))
fashion_model.add(Conv2D(20, kernel_size=3, activation='relu'))
fashion_model.add(Conv2D(20, kernel_size=3, activation='relu'))
fashion_model.add(Flatten())
fashion_model.add(Dense(100, activation='relu'))
fashion_model.add(Dense(num_classes, activation='softmax'))  # one output per class

fashion_model.compile(optimizer='adam',
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])

# x and y come from prep_data in the previous cell (its loading lines must be uncommented)
fashion_model.fit(x, y,
                  batch_size=100,
                  epochs=4,
                  validation_split=0.2)
```
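As a quick check on the architecture, the spatial size of the feature maps can be traced by hand. This sketch assumes the Keras `Conv2D` defaults of `padding='valid'` and `strides=1`, which the model above does not override; the helper `conv_output_size` is a hypothetical name introduced only for this illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 28
for layer in range(3):  # three Conv2D layers, each with kernel_size=3 and no padding
    size = conv_output_size(size, kernel_size=3)
    print(f"after conv {layer + 1}: {size} x {size}")  # 26, then 24, then 22

flattened = size * size * 20  # the last Conv2D layer has 20 filters
print(flattened)  # 9680 values feed the first Dense layer
```

Each unpadded 3 by 3 convolution trims one pixel from every edge, which is why the 28 by 28 input shrinks to 22 by 22 before `Flatten`.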