Data is rapidly becoming one of the most valuable resources. Obtaining data, and especially the appropriate data, can be difficult but highly rewarding. For example, imagine capturing multiple types of data on which teaching methods work best for children. However, bad data = bad models, so it is important to understand your data before you get started.

Data-focused considerations:

    Who created the dataset?
    How was the dataset created?
    What transformations were used?
    What is the intended purpose of the dataset?
    Are there possible unintended consequences?
    Is the dataset biased?
    Are there ethical issues with the dataset?


For this tutorial we will be using the Fashion-MNIST dataset. (Fashion-MNIST has MNIST in its name because its creators seek to replace MNIST with Fashion-MNIST.) Look up the MNIST dataset if you are not familiar with it. <br><br>
Fashion-MNIST is based on the assortment on Zalando's website. Zalando is a German-based multinational fashion commerce company that was founded in 2008. Fashion-MNIST has 10 classes and the same specs as MNIST (28×28 grayscale images, 60,000 training examples and 10,000 test examples). This is intentional, so that switching from the MNIST dataset to Fashion-MNIST is easy.<br><br>
We will be accessing Fashion-MNIST through a PyTorch vision library called <code>torchvision</code> and building our first neural network that can accurately predict an output class given an input fashion image.

# Neural Network Project Overview
There are four general steps that we'll be following as we move through this project:

   **Prepare the data** <br>
   Build the model<br>
   Train the model<br>
   Analyze the model's results<br>


## Extract, Transform, and Load (ETL) with PyTorch (preparing the data)
To prepare our data, we'll be following what is loosely known as an ETL process.

    Extract data from a data source.
    Transform data into a desirable format.
    Load data into a suitable structure.
    
Or more specifically:
    
    Extract – Get the Fashion-MNIST image data from the source.
    Transform – Put our data into tensor form.
    Load – Put our data into an object to make it easily accessible.



In [2]:
import torch
import torchvision
import torchvision.transforms as transforms  # common transformations for image processing

    Class                          Description
    torch.utils.data.Dataset       An abstract class for representing a dataset.
    torch.utils.data.DataLoader    Wraps a dataset and provides access to the underlying data.

An abstract class is a Python class that has methods we must implement, so **we can create a custom dataset by creating a subclass that extends the functionality of the Dataset class.** 

To create a custom dataset using PyTorch, you extend the Dataset class by creating a subclass that implements the required methods. Our new subclass can then be passed to a PyTorch DataLoader object. <br><br>
*The fashion-MNIST dataset comes with torchvision so the above is already baked in.*
<br>
### All subclasses of the Dataset class must override `__len__`, which provides the size of the dataset, and `__getitem__`, which supports integer indexing in the range from 0 to `len(self)`, exclusive.
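
To make the two required methods concrete, here is a minimal sketch of a custom dataset. The class `SquaresDataset` and its contents are hypothetical and exist only for illustration; they are unrelated to Fashion-MNIST.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i * i)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # Size of the dataset.
        return self.size

    def __getitem__(self, index):
        # Integer indexing for 0 <= index < len(self).
        if not 0 <= index < self.size:
            raise IndexError(index)
        x = torch.tensor(float(index))
        return x, x * x


toy_set = SquaresDataset(5)

# Because the subclass implements both methods, it can be handed to a DataLoader.
toy_loader = DataLoader(toy_set, batch_size=2)

for inputs, targets in toy_loader:
    print(inputs, targets)
```

With five samples and a batch size of 2, the loader yields three batches, the last of which holds a single sample.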

In [4]:
train_set = torchvision.datasets.FashionMNIST(
    root='./data/FashionMNIST',
    train=True,
    download=True,
    transform=transforms.Compose([transforms.ToTensor()])
)



In [7]:
train_loader = torch.utils.data.DataLoader(train_set)

Extract:
<code>train_set = torchvision.datasets.FashionMNIST(
    root='./data/FashionMNIST',
    train=True,
    download=True,</code>
<br>
<br>
Transform:
   <code>transform=transforms.Compose([transforms.ToTensor()])</code>
<br>
   
Load:
   <code>train_loader = torch.utils.data.DataLoader(train_set)</code>
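
A `DataLoader` created with default arguments, as in the Load step, yields one sample per iteration. The sketch below shows how the standard `batch_size` and `shuffle` arguments change that. To keep it runnable without a download, it uses a synthetic `TensorDataset` with the same per-sample layout as Fashion-MNIST; the stand-in data and the specific argument values are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in (assumption for this sketch): 100 single-channel
# 28x28 images and integer class labels in the range 0..9.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
stand_in_set = TensorDataset(images, labels)

# batch_size and shuffle are standard DataLoader arguments;
# the values used here are illustrative only.
batched_loader = DataLoader(stand_in_set, batch_size=10, shuffle=True)

# Pull a single batch to inspect its layout.
batch_images, batch_labels = next(iter(batched_loader))
print(batch_images.shape)
print(batch_labels.shape)
print(len(batched_loader))
```

The same arguments could be passed when wrapping `train_set`, in which case each iteration would return a batch of images together with their labels.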