## Write and train a dog breed classifier from scratch

---

This notebook walks through implementing, training, and improving a neural network from scratch. We start from a basic AlexNet-style model and iterate step by step, aiming for at least 10% accuracy on the test set.

The notebook is broken into separate steps. Feel free to use the links below to navigate the notebook.

* [Step 0](#step0): Import Datasets
* [Step 1](#step1): Basic model implementation and description
* [Step 2](#step2): Experiments and results 
* [Step 3](#step3): 

---

<a id='step0'></a>
## Step 0: Import Datasets

Make sure that you've downloaded the required dog dataset:
* Download the [dog dataset](https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip). Unzip the folder and place it in this project's home directory, at the location `data/dogImages` (the path the code cells below read from).

*Note: If you are using a Windows machine, you are encouraged to use [7zip](http://www.7-zip.org/) to extract the folder.*

In the code cell below, we save the file paths for the dog dataset in the NumPy array `dog_files`.

In [1]:
import numpy as np
from glob import glob
from datetime import datetime, timedelta
from tqdm import tqdm

# load filenames for dog images
dog_files = np.array(glob("data/dogImages/*/*/*"))

# print the total number of dog images (all splits combined)
print('There are %d total dog images.' % len(dog_files))

There are 8351 total dog images.
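The glob pattern above implies a `<split>/<breed>/<file>` layout under `data/dogImages`. As a minimal sketch under that assumption, the paths can be grouped by split to see how many images fall into each one. The helper name `count_by_split` and the example paths are hypothetical, and the sketch assumes forward-slash separators.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_split(paths):
    """Count image paths per split, assuming .../<split>/<breed>/<file>."""
    # parts[-3] is the split directory: train, valid, or test
    return Counter(PurePosixPath(str(p)).parts[-3] for p in paths)


# Hypothetical paths, made up for illustration only
example_paths = [
    "data/dogImages/train/breed_a/img1.jpg",
    "data/dogImages/train/breed_b/img2.jpg",
    "data/dogImages/valid/breed_a/img3.jpg",
    "data/dogImages/test/breed_b/img4.jpg",
]
split_counts = count_by_split(example_paths)
print(split_counts)
```

In the notebook, `count_by_split(dog_files)` would give the per-split totals for the real dataset.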


<a id='step1'></a>
## Step 1: Basic Model Architecture

### Specify Data Loaders for the Dog Dataset

Use the code cell below to write three separate [data loaders](http://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) for the training, validation, and test datasets of dog images (located at `data/dogImages/train`, `data/dogImages/valid`, and `data/dogImages/test`, respectively). You may find [this documentation on custom datasets](http://pytorch.org/docs/stable/torchvision/datasets.html) to be a useful resource. If you are interested in augmenting your training and/or validation data, check out the wide variety of [transforms](http://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transform)!


In [2]:
import os
import torch
from torchvision import datasets
import numpy as np
import torchvision.transforms as transforms

# check if CUDA is available
use_cuda = torch.cuda.is_available()
print("CUDA available: {}".format(use_cuda))

### Data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes

normalize = transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)
preprocess = transforms.Compose([
   transforms.Resize(256),
   transforms.CenterCrop(224),
   transforms.ToTensor(),
   normalize
])

train_data = datasets.ImageFolder("data/dogImages/train/", transform=preprocess)
valid_data = datasets.ImageFolder("data/dogImages/valid/", transform=preprocess)
test_data = datasets.ImageFolder("data/dogImages/test/", transform=preprocess)

# define dataloader parameters
batch_size = 32
num_workers=0

# prepare data loaders
loaders_scratch = {}
loaders_scratch['train'] = torch.utils.data.DataLoader(train_data, batch_size=batch_size, 
                                           num_workers=num_workers, shuffle=True)
# the validation loader reads the validation split (valid_data)
loaders_scratch['valid'] = torch.utils.data.DataLoader(valid_data, batch_size=batch_size, 
                                          num_workers=num_workers, shuffle=True)
loaders_scratch['test'] = torch.utils.data.DataLoader(test_data, batch_size=batch_size, 
                                          num_workers=num_workers, shuffle=True)


CUDA available: True
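To make the `Normalize` step concrete: after `ToTensor()` each channel value lies in [0, 1], and normalization then computes `(value - mean) / std` per channel. The following is a minimal pure-Python sketch of that arithmetic for a single RGB pixel, using the same mean and std values as the cell above. The function name `normalize_pixel` is hypothetical.

```python
# Per-channel statistics copied from the transforms.Normalize call above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


# A pixel equal to the channel means maps to zero in every channel
mean_pixel = normalize_pixel([0.485, 0.456, 0.406])
# A pure white pixel ends up a little over two standard deviations above zero
white_pixel = normalize_pixel([1.0, 1.0, 1.0])
print(mean_pixel)
print(white_pixel)
```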


In [6]:
# get number of classes
NCLASSES = len(train_data.class_to_idx)
print("Number of classes: {}".format(NCLASSES))

Number of classes: 133


### Optional: calculate mean and std of dataset

In [15]:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

preprocess = transforms.Compose([
   transforms.Resize(256),
   transforms.CenterCrop(224),
   transforms.ToTensor()
])

dataset_train = datasets.ImageFolder("data/dogImages/train/", transform=preprocess)

loader = torch.utils.data.DataLoader(dataset_train,
                         batch_size=10,
                         num_workers=0,
                         shuffle=False)

mean = 0.0
for images, _ in loader:
    batch_samples = images.size(0) 
    images = images.view(batch_samples, images.size(1), -1)
    mean += images.mean(2).sum(0)
mean = mean / len(loader.dataset)
print(mean)

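# Note: these two lines print the dataset and sampler objects themselves (their repr),
# not their lengths; len(loader.dataset) gives the number of images.
print("length of dataset: {}".format(loader.dataset))
print("length of sampler: {}".format(loader.sampler))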

var = 0.0
for images, _ in loader:
    batch_samples = images.size(0)
    images = images.view(batch_samples, images.size(1), -1)
    var += ((images - mean.unsqueeze(1))**2).sum([0,2])
std = torch.sqrt(var / (len(loader.dataset)*224*224))
print(std)

tensor([0.4864, 0.4560, 0.3918])
length of dataset: Dataset ImageFolder
    Number of datapoints: 6680
    Root location: data/dogImages/train/
    StandardTransform
Transform: Compose(
               Resize(size=256, interpolation=PIL.Image.BILINEAR)
               CenterCrop(size=(224, 224))
               ToTensor()
           )
length of sampler: <torch.utils.data.sampler.SequentialSampler object at 0x7fe45ee7eb70>
tensor([0.2602, 0.2536, 0.2562])
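For reference, the two loops above can be read as the following per-channel statistics, where $N$ is the number of training images, $H = W = 224$ is the cropped image size, and $x_{i,c,p}$ is the value of pixel $p$ in channel $c$ of image $i$ (this notation is introduced here for illustration):

$$
\mu_c = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{HW} \sum_{p=1}^{HW} x_{i,c,p},
\qquad
\sigma_c = \sqrt{\frac{1}{N H W} \sum_{i=1}^{N} \sum_{p=1}^{HW} \left( x_{i,c,p} - \mu_c \right)^2}
$$

Because every image is cropped to the same size, the mean of the per-image means equals the mean over all pixels, so the first pass yields the dataset mean and the second pass yields the population standard deviation around it.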


### Basic CNN implementation

The basic CNN model is based on the AlexNet architecture ([paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)) with some simplifications:
- `Local Response Normalization` is not used
- the model runs on a single GPU instead of being split across two GPUs as in the paper

---
Here is the model architecture, as printed by PyTorch:

In [7]:
from models.nn_alex_net import BasicCNN

model_scratch = BasicCNN(n_classes=NCLASSES)
print(model_scratch)

BasicCNN(
  (conv1): Conv2d(3, 48, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
  (conv2): Conv2d(48, 128, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv3): Conv2d(128, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv4): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv5): Conv2d(192, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=4608, out_features=2048, bias=True)
  (fc2): Linear(in_features=2048, out_features=2048, bias=True)
  (fc3): Linear(in_features=2048, out_features=133, bias=True)
  (dropout): Dropout(p=0.5, inplace=False)
)
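The `in_features=4608` of `fc1` can be checked by hand from the layer parameters printed above. The sketch below applies the standard output-size formula `floor((size + 2 * padding - kernel) / stride) + 1` for a 224x224 input. It assumes that the 2x2 max pooling is applied after `conv1`, `conv2`, and `conv5`, as in AlexNet. The forward pass of `BasicCNN` is not shown in this notebook, so that ordering is an assumption, and the helper names are hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size=224):
    """Trace the feature-map size through the layers printed above.

    Assumption: pooling follows conv1, conv2, and conv5 (AlexNet-style).
    """
    s = out_size(input_size, kernel=11, stride=4, padding=2)  # conv1: 224 -> 55
    s = out_size(s, kernel=2, stride=2)                       # pool:  55 -> 27
    s = out_size(s, kernel=5, stride=1, padding=2)            # conv2: 27 -> 27
    s = out_size(s, kernel=2, stride=2)                       # pool:  27 -> 13
    for _ in range(3):                                        # conv3, conv4, conv5
        s = out_size(s, kernel=3, stride=1, padding=1)        # 13 -> 13
    s = out_size(s, kernel=2, stride=2)                       # pool:  13 -> 6
    return 128 * s * s                                        # conv5 has 128 channels


n_features = flattened_features()
print(n_features)  # 4608
```

Under that assumption the result agrees with the printed `fc1` input size: 128 channels times a 6x6 feature map.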


<a id='step2'></a>
## Step 2: Experiments

Description of experiments.

To speed up training, the training and testing code was moved to .py scripts so that experiments can be run from the console and use several GPUs. Only the experiment details and the corresponding results are presented here.

### Experiments description

#### Experiment 0: initial model only, without training
##### **Results**:
* stopped on epoch: --
* train loss: --
* val loss: --
* test loss: TODO
* test accuracy: TODO

In [None]:
# run to get test accuracy
# TODO
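Until the untrained model is measured, a rough reference point is the chance level for 133 classes. The sketch below assumes predictions that are uniform over the classes, which an untrained network only approximates. Under that assumption the expected accuracy is `1 / n_classes` and the cross-entropy loss is `ln(n_classes)`. The function name `chance_baseline` is hypothetical.

```python
import math


def chance_baseline(n_classes):
    """Accuracy and cross-entropy of a uniform guess over n_classes."""
    accuracy = 1.0 / n_classes
    cross_entropy = math.log(n_classes)  # -ln(1 / n_classes)
    return accuracy, cross_entropy


chance_acc, chance_loss = chance_baseline(133)
print("chance accuracy: {:.2%}".format(chance_acc))
print("uniform-guess cross-entropy: {:.3f}".format(chance_loss))
```

This puts the 10% accuracy target in context: it is roughly 13 times the chance level of about 0.75%.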

#### Experiment 1
* checkpoint name: TODO 
* model: BasicCNN
* optimizer: SGD
* momentum: 0.9
* lr: 0.01
* weight decay: no
* augmentations: no
* batch size: 32
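
To illustrate what the `lr` and `momentum` settings above control, here is a minimal pure-Python sketch of the SGD-with-momentum update in the form used by `torch.optim.SGD` without dampening or Nesterov momentum: `velocity = momentum * velocity + grad`, then `w = w - lr * velocity`. The scalar objective `f(w) = w**2` and the function name `sgd_momentum_step` are hypothetical, chosen only to keep the example small. The actual training scripts are not shown in this notebook.

```python
def sgd_momentum_step(w, velocity, grad, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad  # accumulate a running gradient
    w = w - lr * velocity                  # step along the accumulated direction
    return w, velocity


# Toy objective f(w) = w**2 with gradient 2 * w (illustration only)
w, v = 1.0, 0.0
for _ in range(2):
    grad = 2.0 * w
    w, v = sgd_momentum_step(w, v, grad)
print(w)
```

With momentum 0.9, the second step is larger than plain SGD would take because the first gradient is still contributing to the velocity.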

##### **Results**:
* stopped on epoch: TODO
* train loss: TODO
* val loss: TODO
* test loss: TODO
* test accuracy: TODO

In [None]:
# run to get test accuracy
# TODO