In [9]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [17]:
import tensorflow as tf

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

In [10]:
folder_path = '/content/drive/MyDrive/Colab Notebooks/fashion_mnist'


In [13]:


import os
import numpy as np

if not os.path.exists(folder_path):
    os.makedirs(folder_path)

In [None]:
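import numpy as np
# np.load on an .npz file returns a dict-like NpzFile, so read each array by key
data = np.load(os.path.join(folder_path,'fashion_mnist.npz'))
X_train, y_train, X_test, y_test = (data[k] for k in ('X_train', 'y_train', 'X_test', 'y_test'))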

In [26]:

data = np.load(os.path.join(folder_path,'fashion_mnist.npz'))
X_train = data['X_train']
y_train = data['y_train']
X_test = data['X_test']
y_test = data['y_test']

In [28]:
X_train.shape

(60000, 28, 28)

In [None]:
np.load('fashion_mnist.npz')

In [6]:
X_train.max()

255

# Fashion MNIST Dataset

## Introduction
The [Fashion MNIST dataset](https://github.com/zalandoresearch/fashion-mnist) consists of 70,000 grayscale images (28x28 pixels each) of Zalando fashion products in 10 categories, with 7,000 images per category. The training set has 60,000 images, and the test set has 10,000 images.

Fashion MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as the dataset shares the same image size, data format and the structure of training and testing splits.

The 10 categories include:

1. T-shirt/top
2. Trouser
3. Pullover
4. Dress
5. Coat
6. Sandal
7. Shirt
8. Sneaker
9. Bag
10. Ankle boot

Each image is 28 pixels high and 28 pixels wide, for a total of 784 pixels. Each pixel has a single value indicating its lightness or darkness, with higher numbers meaning lighter. This pixel value is an integer between 0 and 255.
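The pixel layout described above can be checked on a stand-in array. The following is a minimal sketch that uses a randomly generated 28 x 28 `uint8` image (an assumption for illustration, not a real Fashion MNIST sample) to show the flattened length and the value range.

```python
import numpy as np

# Stand-in image: random values, NOT a real Fashion MNIST sample.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Flattening a 28 x 28 image gives one vector of 28 * 28 = 784 pixel values.
flat = image.reshape(-1)
print(flat.shape)

# uint8 holds integers from 0 to 255, which matches the pixel-value range.
info = np.iinfo(image.dtype)
print(info.min, info.max)
```

The same `reshape(-1)` call would apply to any single image taken from the dataset arrays.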

The dataset is widely used in the field of machine learning for the benchmarking of algorithm performance on tasks of image classification.

## TensorFlow Fashion MNIST Dataset

The Fashion MNIST dataset is easily accessible via TensorFlow, as shown below. The [tf.keras.datasets.fashion_mnist.load_data()](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/fashion_mnist/load_data) function fetches the dataset and returns it as NumPy arrays. You can also inspect the data via [this website](https://knowyourdata-tfds.withgoogle.com/#tab=STATS&dataset=fashion_mnist).







In [37]:
# Import tensorflow
import tensorflow as tf

# Load the Fashion MNIST dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

I also saved the data to Google Drive using NumPy:


In [29]:
# Save the data
import numpy as np
np.save(os.path.join(folder_path, 'X_train.npy'), X_train)
np.save(os.path.join(folder_path, 'y_train.npy'), y_train)
np.save(os.path.join(folder_path, 'X_test.npy'), X_test)
np.save(os.path.join(folder_path, 'y_test.npy'), y_test)
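The earlier cells read a single `fashion_mnist.npz` archive with the keys `X_train`, `y_train`, `X_test`, and `y_test`, but the notebook does not show how that file was written. As a hedged sketch, one way to produce such an archive is `np.savez_compressed`. The small arrays and the temporary directory below are stand-ins for the real data and for `folder_path`.

```python
import os
import tempfile
import numpy as np

# Small stand-in arrays with the same trailing image shape as the dataset.
X_train = np.zeros((6, 28, 28), dtype=np.uint8)
y_train = np.arange(6, dtype=np.uint8)
X_test = np.ones((2, 28, 28), dtype=np.uint8)
y_test = np.array([3, 7], dtype=np.uint8)

# Hypothetical output location; a temporary directory stands in for folder_path.
out_dir = tempfile.mkdtemp()
archive_path = os.path.join(out_dir, "fashion_mnist.npz")

# Each keyword argument name becomes a key in the archive.
np.savez_compressed(archive_path, X_train=X_train, y_train=y_train,
                    X_test=X_test, y_test=y_test)

# np.load returns a dict-like NpzFile; arrays are read back by key.
with np.load(archive_path) as data:
    keys = sorted(data.files)
    X_train_loaded = data["X_train"]

print(keys)
print(np.array_equal(X_train, X_train_loaded))
```

A single archive keeps the four arrays together, while separate `.npy` files (as saved above) let each array be loaded on its own.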

In [32]:
from google.colab import files
files.download(folder_path + '/X_train.npy')
files.download(folder_path + '/y_train.npy')
files.download(folder_path + '/X_test.npy')
files.download(folder_path + '/y_test.npy')

In [34]:
import numpy as np

# Load the data
X_train = np.load(os.path.join(folder_path,'X_train.npy'))
y_train = np.load(os.path.join(folder_path,'y_train.npy'))
X_test = np.load(os.path.join(folder_path,'X_test.npy'))
y_test = np.load(os.path.join(folder_path,'y_test.npy'))

In [35]:
X_train.shape

(60000, 28, 28)

In [36]:
X_train.__class__

numpy.ndarray

When you load the Fashion MNIST dataset via TensorFlow's `fashion_mnist.load_data()` function, you get the dataset as two tuples, one for the training set and one for the test set. Each tuple contains a numpy array with the images and a numpy array with the labels.

Each image in the dataset is stored as a 28 x 28 numpy array. The entire training dataset is a 3D numpy array of shape (60000, 28, 28) and the entire testing dataset is a 3D numpy array of shape (10000, 28, 28).

Labels: The labels are integer arrays with values ranging from 0 to 9 that correspond to the class of clothing the image represents.

| Label | Class       |
|-------|-------------|
| 0     | T-shirt/top |
| 1     | Trouser     |
| 2     | Pullover    |
| 3     | Dress       |
| 4     | Coat        |
| 5     | Sandal      |
| 6     | Shirt       |
| 7     | Sneaker     |
| 8     | Bag         |
| 9     | Ankle boot  |
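The table can be expressed as a list indexed by label. The sketch below uses a small hypothetical label array for illustration. Real labels would come from `y_train` or `y_test`.

```python
import numpy as np

# Class names in label order, taken from the table above.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical labels for illustration only.
labels = np.array([9, 0, 0, 3, 6])

# Index the name list with each integer label.
names = [class_names[label] for label in labels]
print(names)  # ['Ankle boot', 'T-shirt/top', 'T-shirt/top', 'Dress', 'Shirt']
```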


Data Split: TensorFlow provides a predefined split of the data for training and testing. The training set contains 60,000 examples. The test set contains 10,000 examples.
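One way to sanity-check a split is to count how many examples fall in each class. The sketch below defines a hypothetical helper, `class_counts`, and runs it on stand-in labels rather than the real dataset.

```python
import numpy as np

def class_counts(labels, n_classes=10):
    """Return how many examples carry each label 0..n_classes-1."""
    return np.bincount(labels, minlength=n_classes)

# Stand-in labels: 3 examples of each class, NOT the real dataset.
labels = np.repeat(np.arange(10), 3)
counts = class_counts(labels)
print(counts)
```

On the loaded data, the same helper could be called as `class_counts(y_train)` or `class_counts(y_test)` to see whether the classes are balanced within each split.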

In [None]:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.metrics import confusion_matrix

# Load the Fashion MNIST dataset
(X_train_full, y_train_full), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

print("X_train_full.shape", X_train_full.shape)
print("y_train_full.shape", y_train_full.shape)

X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

# Scale the pixel values to be between 0 and 1
X_train, X_valid, X_test = X_train / 255., X_valid / 255., X_test / 255.

# Plot the first 50 training images in a 5 x 10 grid, each titled with its class name
n_rows = 5
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
    for col in range(n_cols):
        index = n_cols * row + col
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
        plt.axis('off')
        plt.title(class_names[y_train[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)

plt.show()