---

#University of Stirling - Spring 2023

## CSCU9M6 - Natural Language Processing and Computer Vision (2022/3)

---

# Assignment Summary

In this activity, you are required to apply the knowledge acquired in this module through the design and development of a complete project for image classification in an application to be defined by yourself. For this, you will need to perform the following **mandatory** steps:

1. [Problem definition](#scrollTo=hglJVRRslqMn)
2. [GitHub repository](#scrollTo=ecxDhkV9qmUf)
3. [Dataset](#scrollTo=qEgFzxmWrGA9)
4. [Dataloader](#scrollTo=EDd6lLwlx4un)
5. [Proposed solution](#scrollTo=ScTrpUW8zOp4)
6. [Experimental tests and evaluations](#scrollTo=3RBW58of0ZDo)
7. [Quiz and Report](#scrollTo=ws14iV4Dp_vf)

**Deadlines** and other details can be seen on Canvas [\[link\]](https://canvas.stir.ac.uk/courses/12587/assignments/102373).

---

# 1. **Problem definition** 


In this assignment, you are required to apply the knowledge acquired in the module to solve a classification problem from images collected in the context of two different cities (A and B).
 - If the work is being carried out in pairs, **cities A and B must be the hometowns of each student**. In the case of individual work, city A must be your hometown and city B must be Stirling (or Edinburgh, if needed).
 - The standard recommendation is that the project focuses on classifying cars or trees image scenes, which are easier to identify and annotate. Other objects or phenomena can be adopted, but are subject to prior approval by the module instructor (Jefersson A. dos Santos). **You are not allowed to assemble datasets containing people. Other sensitive patterns, such as license plates, must be properly hidden.**
 - Don't panic! We are aware that acquiring images _in situ_ is an impediment for most students. The dataset can be assembled with images collected remotely or from public repositories. Just be careful with rights and permissions for using images found on the internet. Anyway, these factors must be taken into account for the problem definition.
 - While we encourage you to do interesting and engaging work, it shouldn't be too complex or time-consuming. Try to appropriately scale the time required for this step. Ask the instructors for advice, if necessary. **GA students:** you are encouraged to link the project with your work activities, but keep in mind you still need to construct two datasets (A and B). 

[top](#scrollTo=4i5afvUbhmGo)
 

---
# 2. **GitHub repository**

Give your project a name, create a private [GitHub repository](https://github.com/) with the name [Module Code] + [Project Name] and give access to the module instructors. Create a cover page with a description of your project. This empty notebook must be uploaded in the repository as well as the created dataset. The deadline to perform this task will be 10 days after the publication of this notebook. 
This notebook should be updated and committed to the repository according to the deadlines.
The repository's update history will be used as a criterion for monitoring and evaluating the work.
**Check the videos provided in the extra section on Canvas for more details on how to create your GitHub repository** [\[link\]](https://canvas.stir.ac.uk/courses/12587/pages/extra-session-cnn-hyperparameters-and-github).

[top](#scrollTo=4i5afvUbhmGo)

---
# 3. **Dataset creation**

You must collect a minimum of **200 positive samples** from the study objects for each city (A and B). 
Note that, depending on the task being solved, it will also be necessary to collect more samples - negative ones, for instance.

Your dataset can be assembled from one or more of the following ways:

  - *M1* - Pictures taken by yourself on site (street view from cities A and B), with attention to anonymization issues (if it is the case). It is not allowed to assemble datasets containing people. Other sensitive patterns, such as license plates, must be properly hidden.

  - *M2* - Aerial satellite/drone images obtained from GIS and remote sensing platforms or public repositories. Be careful with unusual file formats that may be challenging to manipulate using basic image processing libraries. We recommend keeping or converting the images to jpg or png.

  - *M3* - Pictures taken from other public available datasets. Remember you are not allowed to use datasets containing people or other sensitive patterns/objects.

  - *M4* - Images crawled from the internet as a whole (social networks, webpages, etc), with special attention to use and copyrights.

  - *M5* - Textual and metadata you may need in your project, with special attention to use and copyrights (as always!).

**Important:** If you collect the images on your own or from aerial imagery repositories, it will be necessary to keep the geographic coordinates. If you collect from specific websites, please retain the source links. This information should be placed in a .csv file and made available along with the final dataset.

[top](#scrollTo=4i5afvUbhmGo)

---

# 4. **Dataloader**

Here you are required to implement all the code related to pre-processing, cleaning, de-noising and preparing the input images and metadata according to the necessary data structures as input to your pattern recognition module. We recommend using [PyTorch](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) or [Tensorflow (with Keras)](https://keras.io/getting_started/intro_to_keras_for_engineers/) as a base, but you are free to use any library or platform as long as it is well justified in the [final report](#scrollTo=ws14iV4Dp_vf).

[top](scrollTo=4i5afvUbhmGo)

In [1]:
# import necessary libraries
import cv2

import os

import numpy as np

from random import shuffle

from tqdm import tqdm

import tensorflow as tf

from tensorflow.keras.layers import Input, Conv2D

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression

import matplotlib.pyplot as plt

Instructions for updating:
non-resource variables are not supported in the long term
curses is not supported on this machine (please install/reinstall curses for an optimal experience)


In [2]:
# based on code from Subhajit Saha on:
# https://www.geeksforgeeks.org/image-classifier-using-cnn/

# environment setup 
# learning rate set to 0.001
TRAIN_DIR = 'C:/Users/maria/Documents/GitHub/CSCU9M6-2929300/Dataset/train'
TEST_DIR = 'C:/Users/maria/Documents/GitHub/CSCU9M6-2929300/Dataset/test'
IMG_SIZE = 50
LR = 1e-3 


# setup model to help tensorflow models
MODEL_ONE = 'treeDetection-{}-{}.model'.format(LR, '6conv-basic')


# label dataset
def label_img(img):
    word_label = img.split('.')[-3]
    #hot encoder
    if word_label == 'ca': return [1,0]
    elif word_label == 'cb': return [0,1]
    
    
# training data creation
def create_train_data():
    #empty list to store training data after processing
    training_data = []
    
    # tqdm used for interactive loading of training data
    for img in tqdm(os.listdir(TRAIN_DIR)):
        
        # label images
        label = label_img(img)
        
        path = os.path.join(TRAIN_DIR, img)
        
        # load image from path
        # convert into grayscale 
        
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        
        # resize image to process
        # into covnet
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        
        # form trainind data list with 
        # image numpy array
        training_data.append([np.array(img), np.array(label)])
        
    # shuffle training data to 
    # preserve random state
    shuffle(training_data)
    
    # save trained data for further use
    np.save('train_data.npy', training_data)
    return training_data


# process test data
def process_test_data():
    testing_data = []
    for img in tqdm(os.listdir(TEST_DIR)):
        path = os.path.join(TEST_DIR, img)
        img_num = img.split('.')[0]
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        testing_data.append([np.array(img), img_num])
        
    # shuffle test data to
    # preserve randomness 
    shuffle(testing_data)
    
    # save test data for further use
    np.save('test_data.npy', testing_data)
    return testing_data
    

---

# 5. **Proposed solution** 

This is where you should implement most of your code for your solution. Write the routines for training and predicting the models and any necessary intermediate steps. Post-processing functions must also be implemented here.

  - Use good programming practices, modularizing and adequately commenting on your code. Code quality will be considered in the final assessment.

  - You can use pre-trained models as backbones or any code available on the web as a basis, but they must be correctly credited and referenced both in this notebook and in the final report. Cite the source link repository and explicitly cite the authors of it.
If you changed existing code, make it clear what the changes were.
Make it clear where your own code starts and where it ends. Note that the originality percentage of the code will be considered in the evaluation, so use external codes wisely and sparingly. **Missconduct alert:** remember that there are many tools that compare existing source code and that it is relatively easy to identify authorship. So, be careful and fair by always properly thanking the authors if you use external code.

[top](#scrollTo=4i5afvUbhmGo)

In [None]:
# based on code from Subhajit Saha on:
# https://www.geeksforgeeks.org/image-classifier-using-cnn/


# run train and test data 
train_data = create_train_data()  # loads from train_data.npy
test_data = process_test_data()   # loads from test_data.npy


# use tensorflow to create neural network

# clear and reset default graph
tf.compat.v1.reset_default_graph()

# define input layer
convnet = tflearn.input_data(shape=[None, IMG_SIZE, IMG_SIZE, 1], 
                         name='input')

# define 1st convolutional layer
convnet = tflearn.conv_2d(convnet, 32, 3, activation='relu')

# define 2nd convolutional layer
convnet = tflearn.conv_2d(convnet, 64, 3, activation='relu')

# define pooling layer 
convnet = tflearn.max_pool_2d(convnet, 2)

# define fully connected layer
convnet = tflearn.fully_connected(convnet, 1024, activation='relu')

# create CNN output layer with 2 neurons
# use softmax activation funtion for multi-class classification
convnet = tflearn.fully_connected(convnet, 2, activation='softmax')

# define loss function (cat. cross entropy),
# the optimizer (Adam),
# learning rate (0.001),
# and output layer name (targets)
convnet = tflearn.regression(convnet, optimizer='adam', 
                             learning_rate=0.001, 
                             loss='categorical_crossentropy',
                             name ='targets')

# create tf DNN (Deep Neural Network) with specified architecture
# write training info to log directory to
# visualise with TensorBoard
model = tflearn.DNN(convnet, tensorboard_verbose=0)


# split test and train data
train = train_data[:-500]
test = test_data[-500:]


# set up X-Features and Y-Labels
# X is the input data used to train the model
# Y is the expected output for each input
# -1 in reshape function means that size of a dimension is
# automatically calculated based on size of other dimensions
# and lenght of input data
X = np.array([i[0] for i in train]). reshape(-1, IMG_SIZE, IMG_SIZE, 1)
Y = [i[1] for i in train]
test_x = np.array([i[0] for i in train]). reshape(-1, IMG_SIZE, IMG_SIZE, 1)
test_y = [i[1] for i in test]


# fit data in model
model.fit({'input':X}, {'targets':Y}, n_epoch = 5,
         validation_set = ({'input': test_x}, {'targets': test_y}),
         snapshot_step = 500, show_metric = True, run_id = MODEL_ONE)
model.save(MODEL_ONE)


# test data,
# use the previously saved test_data.npy,
# make picke True (its default is False) because otherwise 
# it wont allow loading of object arrays
test_data = np.load('test_data.npy', allow_pickle=True)

# create container for graph elements
fig = plt.figure()


# iterate over the first 20 elements of test_data,
# assign a unique number and data element to variables 'num' and 'data'
# on each iteration
for num, data in enumerate(test_data[:20]):
    
    # belongs to City A->[1, 0]
    # belongs to City B->[0, 1]
    
    img_num = data[1]
    img_data = data[0]
    
    # display image gride for visualisation
    # +1 indicates that subplot index starts at 1, not 0
    y = fig.add_subplot(4, 4, num + 1)
    # assign current image data to a new variable
    orig = img_data
    # reshape data into 3D numpy array 
    data = img_data.reshape(IMG_SIZE, IMG_SIZE, 1)
    
    # train model to generate a prediction for a 
    # single input data point
    model_out = model.predict([data])[0]
    
    # use model output to predict the class of input data,
    # if tree belongs to City A or City B
    
    if np.argmax(model_out) == 1 : str_label = 'City B'
    else : str_label = 'City A'
            
    # display original image data in grayscale
    y.imshow(orig, cmap = 'gray')
    # set subplot title to predicted label
    plt.title(str_label)
    # remove x-and-y-axis labels from subplot to improve visuals
    y.axes.get_xaxis().set_visible(False)
    y.axes.get_yaxis().set_visible(False)
#display all graphs
plt.show()

100%|██████████| 32/32 [00:00<00:00, 219.52it/s]
  arr = np.asanyarray(arr)
100%|██████████| 355/355 [00:01<00:00, 200.58it/s]

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor





---------------------------------
Run id: treeDetection-0.001-6conv-basic.model
Log directory: /tmp/tflearn_logs/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.


Exception in thread Thread-6 (fill_batch_ids_queue):
Traceback (most recent call last):
  File "C:\Users\maria\anaconda3\envs\tf\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\maria\anaconda3\envs\tf\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\maria\anaconda3\envs\tf\lib\site-packages\tflearn\data_flow.py", line 201, in fill_batch_ids_queue
    ids = self.next_batch_ids()
  File "C:\Users\maria\anaconda3\envs\tf\lib\site-packages\tflearn\data_flow.py", line 215, in next_batch_ids
    batch_start, batch_end = self.batches[self.batch_index]
IndexError: list index out of range


---------------------------------
Training samples: 0
Validation samples: 0
--
INFO:tensorflow:C:\Users\maria\Documents\GitHub\CSCU9M6-2929300\treeDetection-0.001-6conv-basic.model is not in all_model_checkpoint_paths. Manually adding it.


ValueError: num must be an integer with 1 <= num <= 16, not 17

---

# 6. **Experimental tests and evaluations** 


Here you must implement your code for training, testing and evaluating your solution. For this, the following code blocks (*E1*, *E2*, and *E3*) are mandatory:

  - *E1* - Training the models. Implement code to call the dataloaders implemented for training your models.  Make routines to test different parameters of your models. Plot graphs that illustrate how parameters impact model training. Compare. Train and select a model for each city (A and B) and justify. You should use half (50%) of the samples from each dataset for training and leave the other half for testing (50%). 

[top](#scrollTo=4i5afvUbhmGo)

In [None]:
# Write your codes for E1 here. Create more code cells if needed





  - *E2* - Testing the models in the dataset. You must implement code routines to test the predictive ability of your models using half of each dataset intended for testing. **The model trained in city A must be tested in city A. The model trained in city B must be tested in city B.** Use the evaluation metrics (accuracy, F1-score, AUC, etc) that are most appropriate for your problem. Plot graphs that illustrate the results obtained for each city (A and B). Plot visual examples of correctly (true positive) and incorrectly (false positive) classified samples. 

[top](#scrollTo=4i5afvUbhmGo)


In [None]:
# Write your codes for E2 here. Create more code cells if needed





  - *E3* - Testing the models crossing datasets. Here you must do exactly the same as in *E2*, but now training in one city and testing in the other. **The model trained in city A must be tested in city B. The model trained in city B must be tested in city A.** Use the same metrics and plot the same types of graphs so that results are comparable.

[top](scrollTo=4i5afvUbhmGo)

In [None]:
# Write your codes for E3 here. Create more code cells if needed





---

# 7. **Quiz and Report**

Answer the assessment quiz that will be made available on Canvas one week before the final deadline. Make a 2-page report using the [IEEE template](https://www.overleaf.com/read/rdqwshtvyjdn) with a maximum of 1000 words. Latex is recommended, but you can deliver the report in MS Word if you prefer. Your report should contain five sections: introduction, description of the proposed solution with justifications, results (here you can include the same graphs and pictures generated in this jupyter notebook), discussion of the results, and conclusion. Properly cite references to articles, tutorials, and sources used. A pdf version of your report should be made available in the project's github repository under the name "[project name] + _final_report.pdf".


[top](#scrollTo=4i5afvUbhmGo)