We consider the Fashion MNIST dataset. It contains 70,000 grayscale images of 28×28 pixels each, divided into 10 classes:

"T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"

In [5]:
from keras.datasets import fashion_mnist

(X_train1, y_train1), (X_test, y_test) = fashion_mnist.load_data()
print(X_train1.shape)
print(X_test.shape)

(60000, 28, 28)
(10000, 28, 28)


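Note that each image is a 28×28 matrix with integer pixel values!

We hold out the first 5,000 training images as a validation set and normalize the features by dividing the pixel values by 255, which scales them to [0, 1].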

In [6]:
X_valid, y_valid = X_train1[:5000]/255.0, y_train1[:5000]
X_train, y_train = X_train1[5000:]/255.0, y_train1[5000:]
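The split and scaling above can be sketched on synthetic data with NumPy alone. The array `images`, the helper `scale_pixels`, and the small sizes are stand-ins chosen for illustration and are not part of the notebook. The sketch only shows that dividing integer pixels by 255.0 yields floats in [0, 1] and that the same scaling is applied to both splits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for X_train1: 100 grayscale images with uint8 pixels in 0..255
images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=100)

def scale_pixels(x):
    """Map integer pixel values 0..255 to floats in [0, 1]."""
    return x / 255.0

n_valid = 10  # the notebook holds out the first 5000 images instead
X_valid_demo, y_valid_demo = scale_pixels(images[:n_valid]), labels[:n_valid]
X_train_demo, y_train_demo = scale_pixels(images[n_valid:]), labels[n_valid:]

print(X_valid_demo.shape, X_train_demo.shape)
print(X_train_demo.dtype, X_train_demo.min(), X_train_demo.max())
```

Any data passed to the model later (for example a test set) should go through the same scaling, since the network only ever sees inputs in [0, 1] during training.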

In [8]:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
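The labels in `y_train` are integers from 0 to 9, and `class_names` translates them into readable names by list indexing. A minimal sketch, where `y_demo` is a made-up label array used only for illustration:

```python
import numpy as np

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical integer labels (not taken from the dataset)
y_demo = np.array([9, 0, 0, 3])

# Look up the class name for each integer label
names = [class_names[i] for i in y_demo]
print(names)
```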

# Modeling

In [12]:
from keras.models import Sequential
from keras.layers import Dense, Flatten

In [14]:
model = Sequential([
    Flatten(input_shape=[28, 28]),    # flatten each 28x28 image into a 784-vector
    Dense(300, activation='relu'),    # first hidden layer: 300 neurons, ReLU activation
    Dense(100, activation='relu'),    # second hidden layer: 100 neurons, ReLU activation
    Dense(10, activation='softmax'),  # output layer: 10 neurons (one per class), softmax activation
])
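To make the computation behind these layers concrete, here is a rough NumPy sketch of the forward pass through the same architecture. The helpers `dense`, `relu`, and `softmax` and the random normal weights are assumptions for illustration; Keras uses its own initializers and the weights are learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def dense(x, W, b, activation):
    """Fully connected layer: affine map followed by an activation."""
    return activation(x @ W + b)

# Layer widths: flattened input, two hidden layers, output
sizes = [28 * 28, 300, 100, 10]
# Illustrative random weights and zero biases (not the Keras initializer)
params = [(rng.normal(0.0, 0.05, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.random((4, 28, 28))          # four synthetic "images" with values in [0, 1)
h = x.reshape(len(x), -1)            # Flatten: (4, 28, 28) -> (4, 784)
h = dense(h, *params[0], relu)       # (4, 300)
h = dense(h, *params[1], relu)       # (4, 100)
probs = dense(h, *params[2], softmax)  # (4, 10), one probability per class

print(probs.shape)
print(probs.sum(axis=1))
```

Each output row is a probability distribution over the 10 classes, so its entries are non-negative and sum to 1.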

We can inspect the model with summary(), which shows the output shape and the number of parameters of each layer. For example, the first hidden layer has 300 units. With a feature vector of length 28\*28 = 784, its number of parameters is 300\*(28\*28+1) = 235,500, where the +1 accounts for the bias of each unit.

In [16]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_2 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 300)               235500    
_________________________________________________________________
dense_5 (Dense)              (None, 100)               30100     
_________________________________________________________________
dense_6 (Dense)              (None, 10)                1010      
Total params: 266,610
Trainable params: 266,610
Non-trainable params: 0
_________________________________________________________________
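The parameter counts in the summary can be checked by hand: a dense layer with `n_inputs` inputs and `n_units` units has `n_units * (n_inputs + 1)` parameters. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_units * (n_inputs + 1)

# Flattened input size followed by the widths of the three Dense layers
layer_sizes = [28 * 28, 300, 100, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)       # [235500, 30100, 1010]
print(sum(counts))  # 266610
```

The Flatten layer only reshapes its input, which is why it contributes 0 parameters.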


We now need to define the loss function, the optimizer, and a metric to evaluate the model. All of this is done by compiling the model:

In [18]:
model.compile(
    loss="sparse_categorical_crossentropy",  # multi-class classification with integer labels
    optimizer='sgd',                         # stochastic gradient descent
    metrics=['accuracy'])
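The "sparse" variant of categorical cross-entropy takes integer class labels rather than one-hot vectors: for each sample, the loss is the negative log of the probability the model assigns to the true class. The NumPy function below is a simplified sketch of that formula under this assumption, not the Keras implementation, and the probabilities are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Per-sample loss: -log of the predicted probability of the true class."""
    probs = np.clip(probs, eps, 1.0)                 # avoid log(0)
    picked = probs[np.arange(len(y_true)), y_true]   # probability of each true class
    return -np.log(picked)

# Made-up predictions for two samples over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
y_true = np.array([0, 1])  # integer labels, as in y_train

losses = sparse_categorical_crossentropy(y_true, probs)
print(losses)         # per-sample losses: -log(0.7) and -log(0.1)
print(losses.mean())  # the quantity averaged over a batch
```

The second sample is penalized far more heavily because the model gave its true class a probability of only 0.1.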

# Training

In [20]:
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

Train on 55000 samples, validate on 5000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


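After 30 epochs, the accuracy is 94.95% on the training set and 89.5% on the validation set, which is good! The gap of about five points suggests that the model is starting to overfit the training data.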

In [21]:
history.history

{'val_loss': [0.28968710760474203,
  0.28947549964785574,
  0.2974525733202696,
  0.30593549649715424,
  0.29235944193005564,
  0.2829130459547043,
  0.283090737336874,
  0.3076669326066971,
  0.2883341287791729,
  0.3172474097073078,
  0.29063725994974376,
  0.2981434750914574,
  0.30333364070802926,
  0.2975735347509384,
  0.2852647826552391,
  0.29724177500009535,
  0.2780705550342798,
  0.30219612251520156,
  0.27746986013799907,
  0.31114451722502706,
  0.31112232922315597,
  0.2948711484134197,
  0.3188945025533438,
  0.29863573335409166,
  0.29537142882049083,
  0.29808478766977786,
  0.30210550267398356,
  0.28555410170704126,
  0.3184716885969043,
  0.3022170447409153],
 'val_accuracy': [0.8981999754905701,
  0.8944000005722046,
  0.8924000263214111,
  0.8880000114440918,
  0.8917999863624573,
  0.8966000080108643,
  0.8988000154495239,
  0.8870000243186951,
  0.897599995136261,
  0.8830000162124634,
  0.8906000256538391,
  0.8925999999046326,
  0.8848000168800354,
  0.8934000
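The recorded history can also be inspected programmatically, for example to find the epoch with the lowest validation loss. The sketch below uses the 30 `val_loss` values from the output above, rounded to four decimals; in a live session one would read `history.history['val_loss']` directly.

```python
import numpy as np

# val_loss values copied from the output above, rounded to 4 decimals
val_loss = [0.2897, 0.2895, 0.2975, 0.3059, 0.2924, 0.2829, 0.2831, 0.3077, 0.2883, 0.3172,
            0.2906, 0.2981, 0.3033, 0.2976, 0.2853, 0.2972, 0.2781, 0.3022, 0.2775, 0.3111,
            0.3111, 0.2949, 0.3189, 0.2986, 0.2954, 0.2981, 0.3021, 0.2856, 0.3185, 0.3022]

best_epoch = int(np.argmin(val_loss)) + 1  # epochs are numbered from 1
print(best_epoch, min(val_loss), max(val_loss))
```

The validation loss fluctuates between roughly 0.28 and 0.32 without a clear downward trend, so additional epochs with these settings are unlikely to help much.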

In [22]:
model.evaluate(X_test, y_test)



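[91.15792224121094, 0.8238999843597412]

Note that X_test is passed here without dividing by 255, unlike X_train and X_valid. The model was trained on features in [0, 1], so this mismatch is the likely cause of the very large test loss (91.16) and of the test accuracy (82.4%) being well below the validation accuracy. Passing X_test/255.0 instead would evaluate the model on inputs in the same range it was trained on.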
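How strongly the input scale affects cross-entropy can be illustrated without Keras. The sketch below feeds the same synthetic inputs to a single softmax layer with fixed random weights, once scaled to [0, 1] and once in the raw 0 to 255 range. The weights, inputs, labels, and the helper `mean_cross_entropy` are all made up for illustration, so only the direction of the effect carries over to the trained model, not the numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_cross_entropy(x, W, y):
    """Mean sparse cross-entropy of a single softmax layer with weights W."""
    logits = x @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

W = rng.normal(0.0, 0.05, size=(784, 10))  # illustrative fixed weights
x_scaled = rng.random((200, 784))          # inputs in [0, 1), like X_train
x_raw = x_scaled * 255.0                   # same inputs in the raw pixel range
y = rng.integers(0, 10, size=200)          # made-up integer labels

loss_scaled = mean_cross_entropy(x_scaled, W, y)
loss_raw = mean_cross_entropy(x_raw, W, y)
print(loss_scaled, loss_raw)
```

Multiplying the inputs by 255 multiplies the logits by 255 as well, so the softmax becomes almost one-hot and every wrong prediction is penalized with a very large loss.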