# Exercise: CNN Image Recognition

## Why CNN?
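Classic fully connected artificial neural networks (ANNs) have two disadvantages for image data:
- they require a flat (one-dimensional) input vector, which becomes very large even for average-sized images
- each hidden neuron is connected to every neuron in the previous layer, which means too many weights and expensive calculations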
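
To make the second point concrete, the sketch below counts the weights of one fully connected hidden layer and of one convolutional layer for a 48x48 RGB image. The hidden-layer width (1024) and the 32 filters of size 5x5 are illustrative assumptions, not measurements from a trained model.

```python
# Rough weight counts for a single 48x48 RGB image (illustrative sizes only).
height, width, channels = 48, 48, 3

# Fully connected: every hidden neuron sees every input value.
n_inputs = height * width * channels              # length of the flattened input vector
n_hidden = 1024                                   # assumed hidden-layer width
dense_weights = n_inputs * n_hidden + n_hidden    # weights plus one bias per neuron

# Convolutional: each filter owns one small kernel that is reused at every position.
n_filters, kernel = 32, 5                         # assumed filter count and kernel size
conv_weights = kernel * kernel * channels * n_filters + n_filters

print(f"inputs per image:    {n_inputs}")
print(f"dense layer weights: {dense_weights}")
print(f"conv layer weights:  {conv_weights}")
```

Under these assumptions the dense layer needs roughly seven million weights, while the convolutional layer needs a few thousand, because its kernels are shared across all image positions.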

The CNN idea is based on human vision: each neuron in the visual cortex has a small local receptive field and responds preferentially to particular shapes or stimuli. Together, all the neurons cover the whole visual field.

## Convolutional Neural Network

The full picture:
!["Explanation of image"](../images/cnnimages/fullcnn.png "CNN")

## Features

### Input Layer
The input is 2- or 3-dimensional (height x width, optionally x colour channels), which is a more natural representation of an image than a flat vector.

!["Explanation of image"](../images/cnnimages/8-gif.gif "pixels")

### Convolutional Layer
Connects each hidden neuron to only a small part of the previous layer's neurons. <br>
This small receptive field, called the __convolution kernel__, acts as both a set of _weights_ and a _filter_.

!["Explanation of image"](../images/cnnimages/convolutional.png "pixels")

#### Feature Maps
Each filter of a convolutional layer extracts one specific feature and produces a __feature map__. Each pixel of a feature map is one output of the convolutional layer.
Here are some examples.

!["Explanation of image"](../images/cnnimages/filters.png "pixels")

The filter on the left might activate most strongly when it encounters a horizontal line; the one in the middle, when it encounters a vertical line.

In the convolutional layer, the filters slide across the input, row by row, and respond when they detect their shape.<br>
Because the same filter is applied at every position, the CNN can learn meaningful features and later locate them in any part of the image.
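
The sliding behaviour can be sketched in a few lines of NumPy. This is a minimal illustration, not the way a deep learning library implements it: the helper name `slide_filter`, the hand-made kernel and the two 6x6 test images are assumptions made up for the example. As in most CNN libraries, the kernel is applied without flipping (strictly speaking, a cross-correlation).

```python
import numpy as np

def slide_filter(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            feature_map[r, c] = np.sum(patch * kernel)  # weighted sum over the receptive field
    return feature_map

# Hand-made 3x3 kernel that responds to a bright horizontal line on a dark background.
horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]], dtype=float)

# Two 6x6 test images: the same horizontal line, once in row 1 and once in row 4.
img_top = np.zeros((6, 6))
img_top[1, :] = 1.0
img_bottom = np.zeros((6, 6))
img_bottom[4, :] = 1.0

fm_top = slide_filter(img_top, horizontal)
fm_bottom = slide_filter(img_bottom, horizontal)

# The strongest response moves together with the line.
print(np.argmax(fm_top.max(axis=1)), np.argmax(fm_bottom.max(axis=1)))
```

The same kernel finds the line in both images; only the position of the strongest response in the feature map changes.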

!["Explanation of image"](../images/cnnimages/kernels.gif "pixels")

#### Kernels

This is an illustration of a kernel:
!["Explanation of image"](../images/cnnimages/kernel.gif "pixels")

__Stride__: the number of pixels the kernel moves between two consecutive positions - 1 in the image above <br>
__Padding__: extra pixels added around the border of the input so that the edge pixels can be processed too - 1 in the image below <br>


![pad_step.png](attachment:pad_step.png)
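
Stride and padding together determine the size of the feature map. A small helper, sketched here under the usual convention `out = floor((n + 2*padding - kernel) / stride) + 1` for one dimension, makes the effect visible; the function name `conv_output_size` is made up for this illustration.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output width (or height) of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1

print(conv_output_size(5, kernel=3, stride=1, padding=0))  # 3: the map shrinks without padding
print(conv_output_size(5, kernel=3, stride=1, padding=1))  # 5: padding 1 keeps the size
print(conv_output_size(5, kernel=3, stride=2, padding=1))  # 3: stride 2 roughly halves it
```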

### Pooling Layer
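Reduces the size of the layer by summarizing small neighbourhoods of each feature map. Pooling is a downsampling step rather than an activation function, although max pooling is itself a non-linear operation.

This is an illustration of max pooling: each group of four pixels is replaced by the one with the largest value.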
!["Explanation of image"](../images/cnnimages/pooling.gif "Pooling")
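
A minimal NumPy sketch of 2x2 max pooling with stride 2 follows. The helper name `max_pool_2x2` and the example values are made up for the illustration, and the function assumes that height and width are even.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    h, w = feature_map.shape
    # Split the map into non-overlapping 2x2 blocks, then take the maximum of each block.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)
```

The 4x4 map becomes a 2x2 map: a quarter of the values, each one the strongest response of its block.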

### Fully Connected Layers and Dropouts

Overfitting can be reduced by training several different networks and combining them, but that is expensive. Alternatively, a single model can simulate a large number of different network architectures by _randomly dropping out nodes_ during training. This is called __dropout__; it is an effective regularization method that reduces overfitting and improves the generalization error. It can be applied at the __fully connected__ layers.
More about dropouts: https://machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/
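
The mechanism can be sketched in NumPy as follows. This shows the common "inverted dropout" variant as an illustration; it is not taken from tflearn's source, and the function name, the `keep_prob` parameter name and the random seed are assumptions of the example.

```python
import numpy as np

def dropout(activations, keep_prob, rng, training=True):
    """Inverted dropout: zero units at random while training, rescale the survivors."""
    if not training:
        return activations                       # inference uses the full network
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob        # rescaling keeps the expected value unchanged

rng = np.random.default_rng(0)
a = np.ones((4, 1000))
dropped = dropout(a, keep_prob=0.5, rng=rng)

print((dropped == 0).mean())   # fraction of dropped units, close to 0.5
print(dropped.mean())          # close to 1.0 thanks to the rescaling
```

Each training step draws a fresh mask, so the network effectively trains a different thinned sub-network every time.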

### Output Layer
A classic, fully connected layer is used as __output layer__.
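
For classification, the raw scores of the output layer are usually turned into class probabilities with a softmax function, which is also the activation used for the output layer of the network built in the solution. A minimal sketch, with made-up scores and an assumed class order of cat first, dog second:

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable form)."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

classes = ["cat", "dog"]          # assumed class order for this illustration
scores = np.array([2.0, 0.5])     # made-up raw outputs of the last layer
probs = softmax(scores)

print(probs, classes[int(np.argmax(probs))])
```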

## CNN Task: Cat or Dog?

!["Explanation of data"](../images/cnnimages/catordog.gif "CatorDog")

## Solution

In [13]:
!pip install opencv-python
# for resizing images



In [14]:
!pip install tflearn
# for CNN modelling

Collecting tflearn
  Using cached tflearn-0.5.0.tar.gz (107 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: tflearn
  Building wheel for tflearn (setup.py) ... done
  Created wheel for tflearn: filename=tflearn-0.5.0-py3-none-any.whl size=127283 sha256=0d5731c7ad4ec1b7d854826448491a624b932f4197df0d868557a7e9606fc2f4
  Stored in directory: /Users/tdi/Library/Caches/pip/wheels/67/dd/7b/ddc1488c900739f410ea276e0e450d220f69d85c39e421f5f3
Successfully built tflearn
Installing collected packages: tflearn
Successfully installed tflearn-0.5.0


In [15]:
!pip install livelossplot
# for graphics

Collecting livelossplot
  Using cached livelossplot-0.5.5-py3-none-any.whl (22 kB)
Installing collected packages: livelossplot
Successfully installed livelossplot-0.5.5


In [16]:
from __future__ import division, print_function, absolute_import

### Imports

In [17]:
# Imports
import cv2                 # working with and resizing images
import os                  # dealing with directories
import numpy as np         # dealing with arrays and numbers
from random import shuffle # shuffling the currently ordered data
from tqdm import tqdm      # extra UX, shows progress bar

In [18]:
import os
os.getcwd()

'/Users/tdi/Documents/Teaching/BI/2023/Exercises'

### Data

In [19]:
# https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data
# https://www.cs.toronto.edu/~kriz/cifar.html
TRAIN_DIR = '../data/cnn/train'
TEST_DIR = '../data/cnn/test'
IMG_SIZE = 48
LR = 1e-4

In [20]:
# training files
list = os.listdir(TRAIN_DIR)

In [21]:
# number of training files
n_files = len(list)
n_files

5533

In [22]:
# see some
list[3666]

'cat.3951.jpg'

#### Set Labels

In [23]:
# get the label from the image file name (e.g. 'cat.3951.jpg') and one-hot encode it
def set_label(file_name):
    label = file_name.split('.')[-3]
    if label == 'cat':
        return [1, 0]
    elif label == 'dog':
        return [0, 1]
    # any other file name falls through and returns None

In [31]:
# prepare train data
def create_train_data(list):
    # empty array, which will contain the train images and their labels
    train_data = []
    
    # for each image file in the train directory:
    for file_name in tqdm(list):
        # convert the file name into label by use of the function above
        label = set_label(file_name)
        
        # extend the file path for this image with the file name
        file_path = os.path.join(TRAIN_DIR, file_name)
        
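        # read the image as 3-channel BGR, so every sample has the same depth
        # (IMREAD_UNCHANGED can also return grayscale or 4-channel arrays)
        # img = cv2.imread(file_path, cv2.IMREAD_GRAYSCALE)
        img = cv2.imread(file_path, cv2.IMREAD_COLOR)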
        
        # resize the image, so all have same size
        img = cv2.resize(img, (IMG_SIZE,IMG_SIZE))       
        
        # add the image and its label into array 
        # train_data.append([np.array(img), np.array(label)])
        train_data.append([img, label])
        
    shuffle(train_data)
    # np.save('../data/train_data.npy', train_data)
    return train_data   

In [32]:
train_data = create_train_data(list)

100%|██████████████████████████████████████| 5533/5533 [00:07<00:00, 769.51it/s]


In [34]:
train_data[0]

[array([[[159, 173, 192],
         [140, 154, 172],
         [133, 147, 165],
         ...,
         [150, 173, 195],
         [113, 136, 158],
         [116, 139, 161]],
 
        [[132, 146, 164],
         [121, 135, 153],
         [142, 156, 174],
         ...,
         [134, 157, 179],
         [153, 176, 198],
         [161, 184, 206]],
 
        [[120, 134, 152],
         [132, 146, 164],
         [105, 119, 137],
         ...,
         [182, 205, 227],
         [172, 194, 217],
         [179, 202, 224]],
 
        ...,
 
        [[ 60,  47,  39],
         [ 61,  48,  40],
         [ 60,  47,  39],
         ...,
         [ 75,  88, 108],
         [127, 136, 160],
         [137, 145, 171]],
 
        [[ 40,  27,  19],
         [ 38,  25,  17],
         [ 56,  43,  35],
         ...,
         [ 94,  98, 118],
         [128, 144, 163],
         [116, 133, 152]],
 
        [[ 46,  33,  25],
         [ 55,  42,  34],
         [ 51,  38,  30],
         ...,
         [122, 138, 151],
  

#### Prepare X and Y sets From Train Data

In [35]:
# hold out the last 500 shuffled samples as a validation set, used for testing while training
train = train_data[:-500]
train_test = train_data[-500:]

In [36]:
# split both into X and Y
X = np.array([i[0] for i in train]).reshape(-1, IMG_SIZE, IMG_SIZE,3)
print(X.shape)
Y = np.array([i[1] for i in train])
print(Y.shape)

(5033, 48, 48, 3)
(5033, 2)


In [37]:
# do the same for the held-out validation samples
X_test_set = np.array([i[0] for i in train_test]).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
Y_test_set = np.array([i[1] for i in train_test])
print(X_test_set.shape)
print(Y_test_set.shape)

(500, 48, 48, 3)
(500, 2)


#### Build the CNN

In [46]:
!pip uninstall tensorflow -y


Found existing installation: tensorflow 2.14.0
Uninstalling tensorflow-2.14.0:
  Successfully uninstalled tensorflow-2.14.0


In [None]:
!pip install tensorflow==1.15

In [None]:
!conda install -c apple tensorflow-deps=2.9.0  

In [39]:
!python -m pip install tensorflow

Collecting tensorflow
  Obtaining dependency information for tensorflow from https://files.pythonhosted.org/packages/22/50/1e211cbb5e1f52e55eeae1605789c9d24403962d37581cf0deb3e6b33377/tensorflow-2.14.0-cp311-cp311-macosx_10_15_x86_64.whl.metadata
  Downloading tensorflow-2.14.0-cp311-cp311-macosx_10_15_x86_64.whl.metadata (3.9 kB)
Collecting absl-py>=1.0.0 (from tensorflow)
  Obtaining dependency information for absl-py>=1.0.0 from https://files.pythonhosted.org/packages/01/e4/dc0a1dcc4e74e08d7abedab278c795eef54a224363bb18f5692f416d834f/absl_py-2.0.0-py3-none-any.whl.metadata
  Downloading absl_py-2.0.0-py3-none-any.whl.metadata (2.3 kB)
Collecting astunparse>=1.6.0 (from tensorflow)
  Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow)
  Downloading gast-0.5.4-py3-none-any.whl (19 kB)
Collecting google-pasta>=0.1.1 (from tensorflow)
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting libclang>=13

Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<2.15,>=2.14->tensorflow)
  Downloading oauthlib-3.2.2-py3-none-any.whl (151 kB)
Downloading tensorflow-2.14.0-cp311-cp311-macosx_10_15_x86_64.whl (229.7 MB)
Downloading ml_dtypes-0.2.0-cp311-cp311-macosx_10_9_universal2.whl (1.2 MB)
Downloading absl_py-2.0.0-py3-none-any.whl (130 kB)
Downloading grpcio-1.59.0-cp311-cp311-macosx_10_10_univ

In [40]:
# Imports (the TensorRT converter import was unused in this notebook and has been dropped)
import tensorflow as tf
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from tflearn.metrics import Accuracy

2023-10-03 02:19:52.847716: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Instructions for updating:
non-resource variables are not supported in the long term


ImportError: cannot import name 'is_sequence' from 'tensorflow.python.util.nest' (/Users/tdi/anaconda3/lib/python3.11/site-packages/tensorflow/python/util/nest.py)

In [41]:
# Prepare image transformations
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation

# normalisation
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()

# create extra synthetic training data by flipping & rotating images
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)

ImportError: cannot import name 'is_sequence' from 'tensorflow.python.util.nest' (/Users/tdi/anaconda3/lib/python3.11/site-packages/tensorflow/python/util/nest.py)
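
Independently of tflearn, the two ideas in this cell can be sketched in plain NumPy: normalisation (subtract the mean, divide by the standard deviation, both computed over the whole data set) and augmentation by a left-right flip. This is an approximation for illustration, not tflearn's implementation; the random `images` array stands in for `X`, and using a single scalar mean and standard deviation is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for X: 10 random "images" of shape 48x48x3 with pixel values 0..255.
images = rng.integers(0, 256, size=(10, 48, 48, 3)).astype(float)

# Featurewise normalisation: one mean and one standard deviation for the whole set.
mean, std = images.mean(), images.std()
normalised = (images - mean) / std

# Left-right flip: reverse the width axis (axis 2 in N x H x W x C layout).
flipped = images[:, :, ::-1, :]

print(normalised.mean(), normalised.std())
print(flipped.shape)
```

After normalisation the data has a mean close to 0 and a standard deviation close to 1; the flipped copies keep their shape and can be added as extra training samples with the same labels.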

In [42]:
# Build the CNN
# Input: IMG_SIZE x IMG_SIZE x 3 color channels (red, green and blue)
network = input_data(shape=[None, IMG_SIZE, IMG_SIZE, 3], name='input',
          # data_preprocessing=img_prep, 
          data_augmentation=img_aug)

# 1: Convolution layer with 32 filters, each 5x5x3
network = conv_2d(network, 32, 5, activation='relu')

# 2: Max pooling layer
network = max_pool_2d(network, 2)
                    
# 3: Convolution layer with 64 filters
network = conv_2d(network, 64, 5, activation='relu')

# 4: Max pooling layer
network = max_pool_2d(network, 2)

# 5: Convolution layer with 128 filters
network = conv_2d(network, 128, 5, activation='relu')

# 6: Max pooling layer
network = max_pool_2d(network, 2)

# 7: Convolution layer with 64 filters
network = conv_2d(network, 64, 5, activation='relu')

# 8: Max pooling layer
network = max_pool_2d(network, 2)
                     
# 9: Convolution layer with 32 filters
network = conv_2d(network, 32, 5, activation='relu')

# 10: Max pooling layer
network = max_pool_2d(network, 2)

# 11: Fully-connected 1024 node layer
network = fully_connected(network, 1024, activation='relu')

# 12: Dropout layer to combat overfitting (in tflearn, 0.5 is the probability of keeping a unit)
network = dropout(network, 0.5)

# Output: Fully-connected layer with two outputs
network = fully_connected(network, 2, activation='softmax')


NameError: name 'input_data' is not defined
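
The layer stack above can be checked on paper without running tflearn. The sketch below walks through the output shapes and counts the trainable parameters. It assumes `'same'` padding, stride 1 for the convolutions, a pooling stride equal to the pooling size (these are believed to be the tflearn defaults, but are treated as assumptions here) and one bias per filter or neuron.

```python
import math

def same_conv(size):
    # 'same' padding with stride 1 keeps height and width unchanged
    return size

def same_pool(size, stride=2):
    # 'same' padding also pools partial windows, so the size is rounded up
    return math.ceil(size / stride)

size, channels = 48, 3        # IMG_SIZE x IMG_SIZE x 3 input
kernel = 5                    # filter size used by every conv_2d call
total_params = 0
shapes = []

# Five blocks of convolution + max pooling, with 32, 64, 128, 64 and 32 filters.
for n_filters in [32, 64, 128, 64, 32]:
    total_params += kernel * kernel * channels * n_filters + n_filters
    size, channels = same_pool(same_conv(size)), n_filters
    shapes.append((size, size, channels))

flat = size * size * channels              # values entering the fully connected layer
total_params += flat * 1024 + 1024         # fully connected layer with 1024 nodes
total_params += 1024 * 2 + 2               # output layer with two nodes

print(shapes)
print(flat, total_params)
```

Under these assumptions the image shrinks from 48x48 to 2x2 after the fifth pooling layer, so only 128 values reach the fully connected layer.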

In [43]:
# Configure how the network will be trained
acc = Accuracy(name="Accuracy")
network = regression(network, optimizer='adam', learning_rate=LR, 
                     loss='categorical_crossentropy', metric=acc, name='target')

NameError: name 'Accuracy' is not defined
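
The loss and the metric configured here can be written out in NumPy to see what they measure. This is a sketch of the standard definitions, not tflearn's code; the four predictions are made up for the example.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)     # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Share of samples whose most probable class matches the label."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

# One-hot labels (cat: [1,0], dog: [0,1]) and made-up softmax outputs.
y_true = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])

print(categorical_crossentropy(y_true, y_pred))
print(accuracy(y_true, y_pred))
```

The third sample is a cat that the made-up model scores as a dog, so the accuracy is 3 out of 4, and that sample contributes the largest share of the loss.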

In [None]:
# Wrap the network in a model object
import tflearn
model = tflearn.DNN(network, checkpoint_path='tmp/log',
                    tensorboard_verbose = 0, max_checkpoints = 1,
                    tensorboard_dir='tmp/log')
# parameters for the log files

#### Train the CNN

In [None]:
# Train the model for 10 epochs
model.fit({'input': X}, {'target': Y}, n_epoch=10, 
          validation_set=({'input': X_test_set}, {'target': Y_test_set}), 
          snapshot_step=500, show_metric=True, run_id='MyModel12CNN33')

# see http://tflearn.org/models/dnn/ for more

In [None]:
# save the model
model.save('MyModel12CNN33.tflearn')

In [None]:
# Start TensorBoard on the log directory passed to tflearn.DNN above (tensorboard_dir='tmp/log')
# TensorBoard default address: http://localhost:6006
# !tensorboard --logdir='tmp/tflearn_logs/'
!tensorboard --logdir='tmp/log'

### Test the Model with Unlabeled Data

In [None]:
# load the model
model.load('MyModel12CNN33.tflearn')

In [None]:
# number of test files
list = os.listdir(TEST_DIR)
n_files = len(list)
n_files

In [None]:
# prepare test data - files have no word labels
def process_test_data(list):
    test_data = []
    for file in tqdm(list):
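        # create file path
        path = os.path.join(TEST_DIR, file)
        # the image id is the file name without its extension
        img_num = file.split('.')[0]
        # read the image as 3-channel BGR and resize it like the training images
        # img_data = cv2.imread(path,cv2.IMREAD_GRAYSCALE)
        img_data = cv2.imread(path, cv2.IMREAD_COLOR)
        img_data = cv2.resize(img_data, (IMG_SIZE,IMG_SIZE))
        test_data.append([np.array(img_data), img_num])
    shuffle(test_data)
    # dtype=object is needed because each entry mixes an image array with a string id
    np.save('test_data.npy', np.array(test_data, dtype=object))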
    return test_data

In [None]:
test_data = process_test_data(list)

In [None]:
# Test and plot
import matplotlib.pyplot as plt

fig=plt.figure(figsize=(16, 12))

for num, data in enumerate(test_data[500:525]):
    # cat: [1,0]
    # dog: [0,1]
    
    image = data[0]
    id = data[1]   
    
    original = image
    data = image.reshape(IMG_SIZE,IMG_SIZE,3)   
    
    # predict
    predicted = model.predict([data])[0]
    #print(f"cat: {predicted[0]}, dog: {predicted[1]}")
    
    if np.argmax(predicted) == 1: 
        result='Dog'
    else: 
        result='Cat'
            
    y = fig.add_subplot(5, 5, num+1)
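    #y.imshow(original, cmap='gray')
    # OpenCV loads images as BGR; reverse the channel axis so matplotlib shows the true colours
    y.imshow(original[:, :, ::-1])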
    
    plt.title(result)
    y.axes.get_xaxis().set_visible(False)
    y.axes.get_yaxis().set_visible(False)
plt.show()


### Test with Unseen Image

In [None]:
# load your test image
img = cv2.imread('../images/cnnimages/siam.jpg')
print('Original Dimensions : ', img.shape)

In [None]:
# resize
img = cv2.resize(img, (IMG_SIZE,IMG_SIZE)) 
print('Resized Dimensions : ', img.shape)

In [None]:
# predict on the resized image loaded above (not on `data` left over from the plotting loop)
predicted = model.predict([img])[0]
print(f"cat: {predicted[0]}, dog: {predicted[1]}")
if np.argmax(predicted) == 1: 
    result='Dog'
else: 
    result='Cat'
print(result)

In [None]:
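# show the image in an OpenCV window and save a copy; press any key to close the window
cv2.imshow('result', img)
cv2.imwrite('result.jpg',img)
cv2.waitKey(0)
cv2.destroyAllWindows()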


#### How to Get Better Result
Better results require considerably more effort, for example:

- Bigger networks with more layers: ResNet, the winner of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), has 152 layers. (And if that wasn't enough, to win the competition they used an ensemble of ResNets.)
- Pre-training: the winner of the Kaggle Dogs vs. Cats competition wrote, "My system was pre-trained on ImageNet (ILSVRC12 classification dataset) and subsequently refined on the cats and dogs data". The ILSVRC12 classification dataset contains about 1.2 million labelled training images covering 1,000 object classes. Even though the pre-training data is not specifically about telling cats from dogs, it helps the CNN learn useful filters for distinguishing objects in images. This is sometimes called "transfer learning".
- Larger, higher-resolution images: downsizing sacrifices information, whereas larger images have more detail and afford larger filters.
- Bagging, ensembles and using CNNs as inputs to other ML techniques: the 8th-placed contestant in Kaggle's competition described some of their approaches on their blog.

__From__: http://www.subsubroutine.com/sub-subroutine/2016/9/30/cats-and-dogs-and-convolutional-neural-networks
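
The ensemble idea mentioned in the list can be illustrated with a few lines of NumPy: several models each predict `[cat, dog]` probabilities, and the ensemble averages them before choosing a class. The three models and their numbers are hypothetical, invented only for this sketch.

```python
import numpy as np

# Made-up [cat, dog] probabilities from three hypothetical models for two images.
model_predictions = np.array([
    [[0.60, 0.40], [0.30, 0.70]],   # model A
    [[0.45, 0.55], [0.20, 0.80]],   # model B
    [[0.70, 0.30], [0.55, 0.45]],   # model C
])

ensemble = model_predictions.mean(axis=0)    # average over the model axis
labels = ["Cat" if p.argmax() == 0 else "Dog" for p in ensemble]

print(ensemble)
print(labels)
```

Individual models disagree on both images, but the averaged probabilities give one decision per image, which tends to be more stable than any single model's vote.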


## Your Task

1. Locate the data source here: https://www.tensorflow.org/tutorials/images/cnn
   Study the code. What does it do? What does it consist of?

   Explain the model defined in it.
   For each layer, write down the input, what the layer actually does, and the output shape.

   Build and train the model.
   What values are achieved for precision and loss?

2. Now go to https://github.com/keras-team/keras/blob/master/examples/imdb_cnn_lstm.py
   Are there similarities and differences between the two models?

## References
- https://towardsdatascience.com/image-classifier-cats-vs-dogs-with-convolutional-neural-networks-cnns-and-google-colabs-4e9af21ae7a8 
- http://web.pdx.edu/~jduh/courses/Archive/geog481w07/Students/Ludwig_ImageConvolution.pdf
- http://www.subsubroutine.com/sub-subroutine/2016/9/30/cats-and-dogs-and-convolutional-neural-networks
- https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/