## Classifying images using Keras  
## @ AISaturdays Singapore
####  By Nasrudin Salim
https://github.com/Nasdin/AISaturdayTutorials

## Image classifications of Dogs and Cats
#### Creating a state-of-the-art image classification model for any category


#### Downloading the dataset of your choice
For this example, I will use the dogs and cats dataset provided here:
http://files.fast.ai/files/dogscats.zip

Please download the archive and extract it to data/ (the relative path used in the code below).

    Separating the dataset into train, test, validate
    60% into train
    20% into test
    20% into validate
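
If your own dataset comes as a single folder of images per class, a split like the one above could be sketched as follows. The helper name split_dataset, its fractions argument, and the fixed seed are assumptions for illustration and are not part of the original tutorial.

```python
import os
import random
import shutil

def split_dataset(source_dir, dest_dir, fractions=(0.6, 0.2, 0.2), seed=0):
    '''Copy images from source_dir/<class>/ into dest_dir/{train,test,valid}/<class>/.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Returns a dict mapping each split name to the number of files copied.
    '''
    names = ('train', 'test', 'valid')
    counts = dict.fromkeys(names, 0)
    rng = random.Random(seed)

    for category in sorted(os.listdir(source_dir)):
        category_dir = os.path.join(source_dir, category)
        if not os.path.isdir(category_dir):
            continue
        files = sorted(os.listdir(category_dir))
        rng.shuffle(files)

        # Cut points: first share to train, second to test, the remainder to valid
        n_train = int(len(files) * fractions[0])
        n_test = int(len(files) * fractions[1])
        parts = {
            'train': files[:n_train],
            'test': files[n_train:n_train + n_test],
            'valid': files[n_train + n_test:],
        }
        for name, part in parts.items():
            target = os.path.join(dest_dir, name, category)
            os.makedirs(target, exist_ok=True)
            for filename in part:
                shutil.copy(os.path.join(category_dir, filename), target)
            counts[name] += len(part)
    return counts
```

Copying rather than moving keeps the original folder intact, which makes it easy to redo the split with a different seed.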

In [7]:
# Locations of the data files
# Adjust these to wherever you saved your data
# Normally a dataset is split into train and test (training is verified against test)
# This dataset, however, came with train and valid folders, with test1 as the final verification, so we stick to that

testpath = 'data/test1/'
trainpath = 'data/train/'
validpath = 'data/valid/'


Note that the train and valid folders each have one subfolder per class:

    data/train
    --> /cats/
    --> /dogs/
    data/valid
    --> /cats/
    --> /dogs/
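
Keras infers the class labels from these subfolder names, so a quick check of the layout before training can save some debugging. The sketch below counts the files in each class folder; count_images is a hypothetical helper, and the example paths assume the data was extracted to data/.

```python
import os

def count_images(root):
    '''Return {class_name: number_of_files} for each subfolder of root.'''
    counts = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if os.path.isdir(folder):
            counts[name] = len(os.listdir(folder))
    return counts

# Example usage (assumes the data/ layout described above):
# print(count_images('data/train'))
# print(count_images('data/valid'))
```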

### Creating the models with a pretrained Architecture
We will use the ResNet50 architecture with weights pretrained on ImageNet.

In [1]:
import keras

Using TensorFlow backend.


Instantiate the ResNet50 model class as "model", with weights pretrained on ImageNet

In [3]:
model = keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', 
                                     input_tensor=None, input_shape=None, pooling=None, classes=1000)

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5

### Testing the imagenet model

Create a helper function that reads images from a directory in batches, which makes feeding them to the model easier

In [4]:
from keras.preprocessing import image

def get_batches(full_directory, shuffle=True, batch_size=4, class_mode='categorical',
                target_size=(224, 224)):
    '''Return an iterator over batches of images (and labels) read from a directory'''

    # The image generator object, which can be configured further (augmentation, preprocessing)
    image_generator = image.ImageDataGenerator()
    # flow_from_directory resizes images to target_size and takes the labels from the subfolder names
    batches = image_generator.flow_from_directory(
        full_directory,
        shuffle=shuffle,
        batch_size=batch_size,
        class_mode=class_mode,
        target_size=target_size)

    return batches
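
One thing to be aware of: a plain ImageDataGenerator feeds raw 0-255 RGB pixels, while the pretrained ResNet50 weights expect inputs that went through keras.applications.resnet50.preprocess_input. As far as I know, that function converts RGB to BGR and subtracts the per-channel ImageNet means without any scaling. The NumPy sketch below imitates that step for illustration only; caffe_style_preprocess is a made-up name, and in practice the real preprocess_input could be passed to ImageDataGenerator through its preprocessing_function argument.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed to match Keras "caffe"-style preprocessing)
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_rgb):
    '''Imitate ResNet50 preprocessing on an (N, H, W, 3) batch of 0-255 RGB images.'''
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]           # RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR   # zero-center each channel, no scaling

# A fake batch of four grey images stands in for real data
fake_batch = np.full((4, 224, 224, 3), 128.0, dtype=np.float32)
print(caffe_style_preprocess(fake_batch).shape)
```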

In [25]:
# img is one batch of images taken from batches_train with next() (cell In [27] below)
model.predict(img)

array([[  4.11039345e-08,   5.83101460e-08,   2.81110118e-07, ...,
          1.05117941e-07,   2.24899268e-05,   5.93605546e-06],
       [  1.14940057e-09,   1.26407010e-10,   3.02300748e-07, ...,
          3.45268858e-09,   3.12956594e-09,   1.63710795e-10],
       [  1.76446136e-07,   1.83243341e-08,   4.58474206e-05, ...,
          1.74235367e-07,   1.38845010e-07,   9.70105816e-07],
       [  3.46321940e-05,   8.00564885e-05,   3.69185873e-05, ...,
          2.41499401e-06,   3.90111345e-05,   5.56927989e-05]], dtype=float32)

### Get batches from the train and validate directories

In [8]:
batches_train = get_batches('data/train')
batches_validate = get_batches('data/valid')


Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


### Predicting some samples

In [26]:
import numpy as np
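def pred_batch(imgs, model=model):
    '''Predict a batch of images with a model and print the top class index per image'''

    preds = model.predict(imgs)
    # Index of the most probable ImageNet class for each image
    idxs = np.argmax(preds, axis=1)

    print('Shape: {}'.format(preds.shape))
    print('First 5 probabilities: {}\n'.format(preds[0, :5]))
    print('Predictions prob/class: ')

    for idx in idxs:
        print(idx)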
        
        
    

In [27]:
img,labels=next(batches_train)

In [28]:
pred_batch(img)

Shape: (4, 1000)
First 5 probabilities: [  8.81620679e-07   1.72847535e-06   2.78921050e-07   8.84970859e-07
   3.37605115e-06]

Predictions prob/class: 
285
233
250
178
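
These numbers are ImageNet class indices, not cat or dog labels. To my knowledge, in the ImageNet ordering used by Keras the indices 151 to 268 are dog breeds and 281 to 285 are domestic cats, so the raw output can already be mapped to our two categories. Both the ranges and the helper name imagenet_to_pet are assumptions that are worth double-checking against the ImageNet class index.

```python
# Assumed ImageNet index ranges; verify against the class index before relying on them
DOG_RANGE = range(151, 269)   # dog breeds, 151 to 268 inclusive
CAT_RANGE = range(281, 286)   # domestic cats, 281 to 285 inclusive

def imagenet_to_pet(idx):
    '''Map an ImageNet class index to 'dog', 'cat', or 'other'.'''
    if idx in DOG_RANGE:
        return 'dog'
    if idx in CAT_RANGE:
        return 'cat'
    return 'other'

# The four indices from the output above
print([imagenet_to_pet(i) for i in (285, 233, 250, 178)])
```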


### Working with Transfer Learning to add new categories (WIP)

Create a function that retrieves the images from the subdirectory belonging to a category (dog, cat, etc.) and trains a model on them

    Inputs a model to be trained
    Inputs the name of the subdirectory (cats, dogs, etc.)
    
    1. Gets the full train and validate paths of the category
    2. Trains the model
    3. Outputs the trained model
    
    

In [21]:
def TrainModel(Category_subdirectory, model):
    '''Train model on the images of one category subdirectory (work in progress)'''

    # Locations of the category's train and validation files
    train = trainpath + Category_subdirectory
    validate = validpath + Category_subdirectory
    # Batches of train and validation images
    train_image_batches = get_batches(train, batch_size=8)
    validate_image_batches = get_batches(validate, batch_size=8)

    # TODO: change the last layer of the model's CNN, train, and return the trained model
    pass
    
    