# Part 3: Training a Few Colors
Let's actually train the basic network with a small list of colors. First, let's copy the network from the previous part, but add in the ability to pass a map from color names to RGB values. Go ahead and click Run to create the train() function.

In [9]:
from keras.layers import Activation, Dense, Dropout
from keras.models import Sequential
import keras.optimizers, keras.utils, numpy
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

def train(colorNameToRGBMap, epochs = 16):
    """
    Trains a neural network to classify RGB triples into color names.
    The provided map is from 'color-name':(r,g,b) where r,g,b are floats in the range [0,1].
    Different names are allowed to map to the same RGB triple.
    Returns a tuple (model, colorLabels): the trained model, which can be used for recognize(),
    and the color name for each position in the model's output.
    """

    # Convert the Python map RGB values into a numpy array needed for training.
    # The builtin float is used because the numpy.float alias was removed in NumPy 1.24.
    rgbNumpyArray = numpy.array(list(colorNameToRGBMap.values()), dtype=float)
    
    # Convert the color labels into a one-hot feature array.
    # Text labels for each array position are in the classes_ list on the binarizer.
    labelBinarizer = LabelBinarizer()
    oneHotLabels = labelBinarizer.fit_transform(list(colorNameToRGBMap.keys()))
    numColors = len(labelBinarizer.classes_)
    colorLabels = labelBinarizer.classes_
    
    # Hyperparameters to define the network shape.
    numFullyConnectedPerceptrons = numColors * 16
    batchSize = 1
    
    model = Sequential([
        # Layer 1: Fully connected layer with ReLU activation.
        Dense(numFullyConnectedPerceptrons, activation='relu', kernel_initializer='TruncatedNormal', input_shape=(3,)),

        # Outputs: SoftMax activation to get probabilities by color.
        Dense(numColors, activation='softmax')
    ])

    print(model.summary())

    # Compile for categorization.
    model.compile(
        optimizer = keras.optimizers.SGD(lr = 0.01, momentum = 0.9, decay = 1e-6, nesterov = True),
        loss = 'categorical_crossentropy',
        metrics = [ 'accuracy' ])

    history = model.fit(rgbNumpyArray, oneHotLabels, epochs=epochs, batch_size=batchSize)

    return (model, colorLabels)
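
To see what the one-hot step produces, here is a small NumPy-only sketch that mimics the result for the three colors used next. It is an illustration, not the `LabelBinarizer` code itself, and `names`, `classes`, and `oneHot` are names made up for this example. Like `LabelBinarizer`, it sorts the labels alphabetically, so the columns come out as black, gray, white regardless of the order in the map.

```python
import numpy

names = ['white', 'black', 'gray']

# numpy.unique returns the sorted distinct labels, plus the position of each
# input label within that sorted list.
classes, indices = numpy.unique(names, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
oneHot = numpy.eye(len(classes), dtype=int)[indices]

print(classes)  # column order of the one-hot array
print(oneHot)   # one row per entry in names
```

Each row has a single 1 in the column for its color, which is why `colorLabels` is needed later to translate an output position back into a name.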

OK, now that we have the train() function, let's give it a color map with our first three colors (black, gray, and white) and train it. Click Run.

In [10]:
colorMap = {
    'white': (1.0, 1.0, 1.0),
    'black': (0.0, 0.0, 0.0),
    'gray': (0.5, 0.5, 0.5)
}

(colorModel, colorLabels) = train(colorMap)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_9 (Dense)              (None, 48)                192       
_________________________________________________________________
dense_10 (Dense)             (None, 3)                 147       
Total params: 339
Trainable params: 339
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/16
Epoch 2/16
Epoch 3/16
Epoch 4/16
Epoch 5/16
Epoch 6/16
Epoch 7/16
Epoch 8/16
Epoch 9/16
Epoch 10/16
Epoch 11/16
Epoch 12/16
Epoch 13/16
Epoch 14/16
Epoch 15/16
Epoch 16/16
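
The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input per unit, plus one bias per unit. A short sketch of that arithmetic (`denseParamCount` is a helper name made up for this example):

```python
def denseParamCount(numInputs, numUnits):
    # One weight per input per unit, plus one bias per unit.
    return numInputs * numUnits + numUnits

numColors = 3
hiddenUnits = numColors * 16                            # 48 units, as in train()
hiddenParams = denseParamCount(3, hiddenUnits)          # r, g, b feed the hidden layer
outputParams = denseParamCount(hiddenUnits, numColors)  # hidden layer feeds 3 outputs

print(hiddenParams, outputParams, hiddenParams + outputParams)  # 192 147 339
```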


Let's reuse the same color slider that runs the neural network and shows percentages.

In [11]:
from IPython.display import display, HTML
from ipywidgets import interact
def displayColor(r, g, b):
    rInt = min(255, max(0, int(r * 255.0)))
    gInt = min(255, max(0, int(g * 255.0)))
    bInt = min(255, max(0, int(b * 255.0)))
    hexColor = "#%02X%02X%02X" % (rInt, gInt, bInt)
    display(HTML('<div style="width: 50%; height: 50px; background: ' + hexColor + ';"></div>'))

@interact(r = (0.0, 1.0, 0.01), g = (0.0, 1.0, 0.01), b = (0.0, 1.0, 0.01))
def getPredictionsFromModel(r, g, b):
    testColor = numpy.array([ (r, g, b) ])
    predictions = colorModel.predict(testColor, verbose=0)  # Predictions shape (1, numColors)
    predictions *= 100.0
    print(colorLabels)
    numpy.set_printoptions(precision=1, suppress=True)
    print(predictions[0])
    displayColor(r, g, b)


interactive(children=(FloatSlider(value=0.5, description='r', max=1.0, step=0.01), FloatSlider(value=0.5, desc…

Play with the colors and watch the percentages. Notice that the sliders start at gray, yet the network gives only about a 32% likelihood that the color is gray. Slide to black and you get about a 28% prediction for the label 'black'; slide to all white and you get about 62%. Not very good! We're going to have to find ways to do better.
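
A figure near 32% is close to what a three-way coin toss looks like. The output layer uses softmax, which turns the network's raw scores into percentages that sum to 100, so when the scores are nearly equal each label gets about a third. A minimal sketch with made-up scores (the numbers are hypothetical, not taken from the trained model, and `softmaxPercent` is a name invented for this example):

```python
import numpy

def softmaxPercent(scores):
    # Subtracting the max keeps exp() from overflowing; it does not change the result.
    exps = numpy.exp(scores - numpy.max(scores))
    return 100.0 * exps / exps.sum()

# Equal raw scores: the network has no preference, so each label gets a third.
undecided = softmaxPercent(numpy.array([0.0, 0.0, 0.0]))

# One score well ahead of the others: most of the percentage goes to that label.
decided = softmaxPercent(numpy.array([0.0, 3.0, 0.0]))

numpy.set_printoptions(precision=1, suppress=True)
print(undecided)
print(decided)
```

Training is what pushes the scores apart, so the percentages should move away from a third as the network learns.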

## Epochs: Like Talking to a Toddler
Let's continue with our baby analogy from before. We started training by providing examples to the network and letting it start to figure things out. Just like babies and toddlers, the network needs a lot of repetition while it is learning, and repeating the examples usually lets it learn a little better each time, at least up to a point. You control how many times all the examples are repeated with a number called the _epoch count_. An epoch runs all the training examples through the network one time, so setting the epoch count to 5 shows the network all the data 5 times. During training, Keras also reports a number called _loss_ which, if things are going well, should get closer and closer to zero.
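
To watch repetition at work without Keras, here is a deliberately simplified stand-in: a plain softmax classifier with no hidden layer, trained on the same three colors with full-batch gradient descent. It is not the network built by train(), and every name and the learning rate are assumptions made for this sketch, but it shows the same pattern of the loss shrinking as the epoch count grows.

```python
import numpy

# The three training colors, with a constant 1.0 appended to each row so the
# bias can be learned as one more weight.
inputs = numpy.array([
    [1.0, 1.0, 1.0, 1.0],   # white
    [0.0, 0.0, 0.0, 1.0],   # black
    [0.5, 0.5, 0.5, 1.0],   # gray
])

# One-hot targets with columns ordered black, gray, white.
targets = numpy.array([
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])

def softmax(scores):
    exps = numpy.exp(scores - scores.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

def lossAfter(epochs, learningRate=0.5):
    weights = numpy.zeros((4, 3))
    for _ in range(epochs):
        # One epoch: show every example once, then nudge the weights.
        probabilities = softmax(inputs @ weights)
        gradient = inputs.T @ (probabilities - targets) / len(inputs)
        weights -= learningRate * gradient
    # Cross-entropy: average of -log(probability given to the correct label).
    probabilities = softmax(inputs @ weights)
    return -numpy.mean(numpy.log(numpy.sum(probabilities * targets, axis=1)))

for epochs in (0, 16, 100, 400):
    print(epochs, round(lossAfter(epochs), 4))
```

With zero epochs the loss is ln(3), the value for a pure three-way guess, and each larger epoch count prints a smaller loss.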

In the training above, we used 16 epochs. Take a look at its output. On my machine the last two training epochs showed:

 `Epoch 15/16
 3/3 [==============================] - 0s - loss: 0.8288 - acc: 0.6667`

and

 `Epoch 16/16
 3/3 [==============================] - 0s - loss: 0.8042 - acc: 0.6667`
 
On epoch 15 the loss was about 0.83, while on epoch 16 it was about 0.80, so we're getting a bit better each time we repeat the examples. Ideally, though, we'd want to get the loss as close to zero as we can.

So let's repeat ourselves a lot more. Let's use 100 epochs to see what happens:

In [12]:
(colorModel, colorLabels) = train(colorMap, epochs=100)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_11 (Dense)             (None, 48)                192       
_________________________________________________________________
dense_12 (Dense)             (None, 3)                 147       
Total params: 339
Trainable params: 339
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
...

100 epochs puts out a lot of output. But by the time we get to the end you should see something like:

 `Epoch 100/100
 3/3 [==============================] - 0s - loss: 0.1667 - acc: 1.0000`
 
Notice our loss is about 0.17 now: better, but still not very close to zero. Maybe your network trained to a lower loss. Go ahead and see how the percentages work out below.
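
One way to get a feel for what a loss value means: with `categorical_crossentropy`, each example's loss is the negative natural log of the probability the network gave the correct label, so `exp(-loss)` turns an average loss back into a typical (geometric-mean) probability for the right answer. The sketch below applies that to the two final losses quoted above. `typicalConfidence` is a name made up for this example, and because Keras averages the loss over an epoch while the weights are still changing, the result only approximates what the sliders will show.

```python
import math

def typicalConfidence(loss):
    # Inverts loss = -ln(p) to recover p, the probability given to the correct label.
    return math.exp(-loss)

for loss in (0.8042, 0.1667):
    print(loss, round(100.0 * typicalConfidence(loss), 1))
```

The 16-epoch loss works out to roughly 45% confidence in the right label, and the 100-epoch loss to roughly 85%.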

In [13]:
interact(getPredictionsFromModel, r = (0.0, 1.0, 0.01), g = (0.0, 1.0, 0.01), b = (0.0, 1.0, 0.01))

interactive(children=(FloatSlider(value=0.5, description='r', max=1.0, step=0.01), FloatSlider(value=0.5, desc…

<function __main__.getPredictionsFromModel>

On my machine, I get a prediction of about 80% for gray, 85% when I set to black, and 92% when I set to white. Much better!

Below there's a slider that lets you rerun the training and test with whatever number of epochs you want, from 1 to 500. Try training to 50 or 200 epochs and see what results you get.

In [14]:
@interact(epochs = (1, 500))
def trainModel(epochs=10):
    global colorModel
    global colorLabels
    (colorModel, colorLabels) = train(colorMap, epochs=epochs)


interactive(children=(IntSlider(value=10, description='epochs', max=500, min=1), Output()), _dom_classes=('wid…

In [15]:
interact(getPredictionsFromModel, r = (0.0, 1.0, 0.01), g = (0.0, 1.0, 0.01), b = (0.0, 1.0, 0.01))

interactive(children=(FloatSlider(value=0.5, description='r', max=1.0, step=0.01), FloatSlider(value=0.5, desc…

<function __main__.getPredictionsFromModel>

If you try out higher numbers like 300 or 400, you should see the loss getting down to 0.01, and when you run the sliders with that network the predictions get much more accurate.

### Coming up...
Our neural network has graduated to pre-kindergarten. Time to teach it more colors!