# Big Data Team 8

We are Ehran, Princewell and Lenn. In this report we explain the steps we took to build a solution for the image recognition project of our deep learning course.

Since Ehran had already made a model based on felines that met the requirements for this course, we decided to start from it. Doing so gave us the chance to include more extras than required. We ended up adding four extras: benchmarking, assessment using ROC and AUC metrics, a Keras model and an API endpoint using Flask.

In this document we will go over our general model as well as those four extras.

### Data set - Types of cats

For this project we used a data set that one of our members had scraped previously. It consists of roughly 2000 images of cats, most of them big cats. We focused our model on 5 types of cats: tiger, lion, cheetah, leopard and the ordinary house cat. Several of the images we used were duplicates, to allow for data augmentation within the code. One thing to note is that there is a small bias against the leopard: many of the images collected for it turned out to show other types of cats, such as the cheetah, so we had to remove quite a few of them.

Below is the Python scraper that was used to build the data set. This scraper has not been adjusted since the previous assignment for which it was used.

In [None]:
import selenium
import time
import requests
import threading
import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

DRIVER_PATH = r"11. DL - Introduction to Deep Learning\resources\chromedriver.exe"
search_terms = ['Cat', 'Cheetah', 'Lion', 'Leopard', 'Tiger']
thread_list = []
query_string = "https://www.pexels.com/search/{s}/"
for i in search_terms:
    os.mkdir(r"11. DL - Introduction to Deep Learning\resources\DL_DataFolders\DL_{0}".format(i))
data_set_list = os.listdir(r"11. DL - Introduction to Deep Learning\resources\DL_DataFolders")
service = Service(DRIVER_PATH)

def scrape_images(datafolder, query_word):
    service.start()
    driver = webdriver.Remote(service.service_url)
    driver.get(query_string.format(s=(query_word)))
    time.sleep(2)
    src = []

    for z in range(6):
        # Scroll down the body of the web page and load the images.
        driver.execute_script("window.scrollBy(0,1500);")
        time.sleep(2)
        # Find the images.
        imgResults = driver.find_elements(By.CLASS_NAME,"MediaCard_image__ljFAl")
        
        # Access and store the scr list of image url's.
        for img in imgResults:
            src.append(img.get_attribute('src'))
        

    
    time.sleep(2)
    # Retrieve and download the images.    
    # Download at most 500 images, or fewer if the page yielded fewer URLs.
    for i in range(min(500, len(src))):
        r = requests.get(str(src[i]), stream=True).content
        with open(r"11. DL - Introduction to Deep Learning\resources\DL_DataFolders\{0}\{1}{2}.jpg".format(datafolder, query_word, i), 'wb') as f:
            f.write(r)
        

    driver.quit()

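# Pair each search term with its own folder explicitly, since os.listdir() returns entries in arbitrary order.
for i, term in enumerate(search_terms):
    thread_list.append(threading.Thread(target=scrape_images, args=("DL_{0}".format(term), term)))
    thread_list[i].start()

# Wait for all scraper threads to finish.
for thread in thread_list:
    thread.join()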


These images were then checked and the wrong ones were removed. Afterwards the folder structure was renamed to make future work with it easier. After completing numerous training cycles, we corrected the last wrongly classified images that had previously slipped through the cracks.
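Because removing wrong images affected some classes more than others, it can help to check how many images each class has left. The snippet below is a minimal sketch of such a check, not part of our original pipeline. It assumes one sub-folder per class; the names `count_images_per_class` and `imbalance_ratio` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from pathlib import Path

# Assumption: these are the only image extensions present in the data set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Count the image files in each class sub-folder of `root`."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts

def imbalance_ratio(counts):
    """Largest class size divided by the smallest (1.0 means perfectly balanced)."""
    smallest = min(counts.values())
    if smallest == 0:
        return float("inf")
    return max(counts.values()) / smallest
```

Calling `count_images_per_class` on the cleaned data folder would show directly whether a class such as the leopard ended up noticeably smaller than the rest.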

### Modeling using fastai

fastai is a deep learning library that provides high-level components which can quickly and easily deliver state-of-the-art results in standard deep learning domains, while also providing researchers with low-level components that can be mixed and matched to build new approaches.

Loading the data

The data used to fit our model needs to be wrapped in a class called 'DataLoaders'. This wrapper class stores whatever objects we pass to it, usually a 'train' and a 'valid' object. In other words, we need DataLoaders as a wrapper around our training and validation data, so that other fastai functions can call upon them.

Fastai data block API

Since we're going to use a custom dataset in the DataLoaders object, we need to tell fastai what kind of data we'll be working with, how to get all the data items, how to label these items and how to create the validation set. This was done using the 'data block API'.

In [None]:
from fastai.vision.all import *

cats = DataBlock(
    # A tuple specifying the types of the independent (image) and dependent (category) variables
    blocks=(ImageBlock, CategoryBlock),
    # get_image_files takes a path and returns a list of all the images in that path
    get_items=get_image_files,
    # Randomly hold out 20% of the images for validation; the seed keeps the split reproducible
    splitter=RandomSplitter(valid_pct=0.2, seed=1),
    # The label (y) of each image is the name of its parent folder
    get_y=parent_label,
    # Resize every image to the same size, then apply augmentation per batch
    item_tfms=Resize(128),
    batch_tfms=aug_transforms(size=224, min_scale=0.75))
# Build the DataLoaders from the path to the training images
dls = cats.dataloaders('/content/gdrive/MyDrive/resources/DL_DataFolders_cleaned/train')
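
To make the roles of the splitter and the labelling function more concrete, the sketch below imitates them in plain Python. It is only an illustration of the idea (a seeded random 80/20 split and labels taken from the parent folder), not fastai's actual implementation. The function names and the example paths are made up.

```python
import random
from pathlib import Path

def parent_folder_label(path):
    """Label an image with the name of the folder that contains it."""
    return Path(path).parent.name

def random_split(items, valid_pct=0.2, seed=1):
    """Shuffle the indices with a fixed seed and hold out `valid_pct` of the items."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_valid = int(valid_pct * len(items))
    valid = [items[i] for i in indices[:n_valid]]
    train = [items[i] for i in indices[n_valid:]]
    return train, valid

# Hypothetical file names, only for demonstration.
paths = [f"train/{cls}/{cls}{i}.jpg" for cls in ("Lion", "Tiger") for i in range(5)]
train, valid = random_split(paths)
print(len(train), len(valid))          # 8 2
print(parent_folder_label(paths[0]))   # Lion
```

Fixing the seed means the same images end up in the validation set on every run, which keeps the error rates of different training runs comparable.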

Creating and training the model

With the DataBlock and DataLoaders defined, we can define our model. PyTorch's torchvision library provides several world-class CNNs pretrained on ImageNet. We use the resnet34 model pretrained on ImageNet, at first only as a feature extractor: the pretrained layers stay frozen and we train only the fully connected layers at the end of the network.

In [None]:
# The learning rate finder
resnet34_just_right = vision_learner(dls, resnet34, metrics=error_rate)
resnet34_just_right.lr_find()

The learning rate finder does a quick search with the chosen architecture and data to suggest a suitable learning rate. Because the learning rate has a large effect on training, we need to get it right. With a learning rate that is too low, the model needs many iterations to train, and that long training gives it more opportunity to memorize the training data and overfit. With a learning rate that is too high, the updates can overshoot the minimum so that the loss fails to decrease.
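The effect of the learning rate can be seen on a toy problem. The sketch below is not related to fastai: it runs plain gradient descent on f(x) = x² with three hand-picked learning rates to show a step size that is too small, one that works well and one that is too large. The function name and the chosen values are assumptions made for illustration.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimise f(x) = x**2 (whose gradient is 2x) and return the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

# Too small, reasonable and too large learning rates (illustrative values).
for lr in (0.001, 0.3, 1.1):
    print(f"lr={lr}: |x| after 20 steps = {abs(gradient_descent(lr)):.3g}")
```

With the smallest rate the value barely moves from its starting point, with the middle one it gets very close to the minimum at 0, and with the largest one it moves further away at every step.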

In [None]:
resnet_adv = vision_learner(dls, resnet34, metrics=error_rate)
resnet_adv.fit_one_cycle(3, 3e-3)

vision_learner (named cnn_learner in older fastai versions) is the fastai function that builds a model suitable for transfer learning. It automatically fetches a pretrained model for the given architecture and attaches a custom head that is suited to our data.

Since we're doing transfer learning, the DataLoaders and the pretrained model resnet34 are passed as arguments to vision_learner. We use the fit_one_cycle method because of its better performance in speed and accuracy. It uses large cyclical learning rates to train models significantly quicker and with a higher accuracy: it starts training at a low learning rate, gradually increases it during the first section of training, and then gradually decreases it again during the last section. We pass the number of epochs (3) and the maximum learning rate (0.003, i.e. 3e-3) as arguments to the method.

The learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward the minimum of a loss function, and an epoch is an iteration over the whole dataset. 
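The shape of such a one-cycle schedule can be sketched in a few lines. The code below is a simplified illustration, not fastai's implementation. The parameters `div`, `div_final` and `pct_start` and their values are assumptions chosen to resemble what we understand fastai's defaults to be.

```python
import math

def cosine_interp(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2 * (1 + math.cos(math.pi * pct))

def one_cycle_lr(step, total_steps, lr_max=3e-3, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at `step`: warm up to lr_max, then anneal to a very small value."""
    pct = step / total_steps
    if pct < pct_start:
        # Warm-up phase: rise from lr_max / div to lr_max.
        return cosine_interp(lr_max / div, lr_max, pct / pct_start)
    # Annealing phase: fall from lr_max to lr_max / div_final.
    return cosine_interp(lr_max, lr_max / div_final, (pct - pct_start) / (1 - pct_start))

for step in (0, 25, 50, 100):
    print(step, f"{one_cycle_lr(step, 100):.2e}")
```

The schedule starts well below the requested maximum, reaches it a quarter of the way through training and then decays for the remaining steps.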

In [None]:
# Unfreezing the resnet pretrained weight
resnet_adv.unfreeze()

In [None]:
# Train the entire network for a couple of epochs
resnet_adv.fit_one_cycle(6, 1e-5)

Save the model

In [None]:
# Export model
resnet_adv.export('cats.pkl')

### Deploying on Streamlit

Streamlit is an open source Python library that makes it easy to turn data science, computer vision, NLP and similar projects into apps. With its extensive documentation, Streamlit allows us to deploy projects that were previously difficult to code in a matter of minutes. Streamlit can be run locally on our own machines, but we chose to also deploy our app online using Streamlit cloud. Streamlit cloud is a rather new service, and when we deployed the base versions of our app to it we did not run into any problems. This changed when we tried to upload the model training part, since it requires data augmentation within its dataloaders. This causes the cloud version of Streamlit to run out of memory and reload, which stops any further interaction. We therefore had to remove this functionality from our online app, although it still works correctly when run locally.

We used Streamlit to deploy not only our model and the ability to predict different images, but also our data set in an easy to use format for performing an EDA, as well as the ability to train a new model using different advanced settings. Streamlit apps can be started locally by opening a shell in the folder that contains the Python file and typing `streamlit run app.py`. For Streamlit cloud we had to sign up at share.streamlit.io and create a new project that points to our GitHub repository. Streamlit cloud runs the app from the designated .py file in that repository and uses a requirements.txt file to determine which dependencies to install. Within this requirements.txt we can also specify which versions of the packages to use.

Streamlit allows us to use simple configuration options to set up a sidebar navigation, give the page a title and apply any other necessary small addition.

In [None]:
# Setting page configurations such as title and layout
st.set_page_config(
    page_title="Image Classification",
    page_icon="🧊",
    layout="wide",
    initial_sidebar_state="expanded")
# Creating the navigation method which is always activated upon starting of the Streamlit app
def navigation():
    # 1. As a sidebar menu
    with st.sidebar:
        # We create a new object in which we place different attributes that Streamlit requires to further
        # build the navigation. Upon clicking on the options which we prescribe inside of the option_menu
        # it will change the selected variable
        selected = option_menu(
            menu_title= "Big Data",
            options = ["Classifier", "EDA", "Google Teachable Machine", "New Image Trainer"],
            # We are also capable of giving specific Streamlit icons to the different navigation options
            icons=['upload', 'graph-down'],
            menu_icon="cast", default_index=0
        )
    # It will fire the selected function, which if nothing is selected will 
    # be the main app at the beginning
    if selected == "Classifier":
        main_app()
    if selected == "EDA":
        eda()
    if selected == "Google Teachable Machine":
        googlemachine()
    if selected == "New Image Trainer":
        fastai_training()

Our Streamlit app is built from functions: each page has its own function, which is called when the corresponding navigation option is selected. One notable exception is the first function, in our case main_app(), which is called immediately when the app starts. The main app is our image classifier: it gives the user a button to upload an image, which we then keep in a variable. We create two columns using st.columns(2). Streamlit lets us use these columns as objects with their own methods, which apply our changes to that part of the page. Once an image has been uploaded, the app displays it together with the predicted class and the probability the model assigns to that class.

In [None]:
def main_app():
    st.header('Image Classification')
    st.subheader('Model trained with Fastai')
    # Setting up the path file system to use windowspath type, this also helps us use 
    # githubs folder structure.
    plt = platform.system()
    if plt == 'Windows': pathlib.PosixPath = pathlib.WindowsPath
    # Load the previously uploaded fastai model which was saved in a pickle file
    res_model = load_learner(pathlib.Path()/'cats.pkl')
    # Keep the saved file in a variable while also only allowing image files
    uploaded_file = st.file_uploader("Upload Files",type=['png','jpeg', 'jpg'])
    # Creating two columns
    col1,col2 = st.columns(2)
    # 'Opening' the first column and utilizing this activated column to replace original information
    with col1:
        # Remove the previous image (if there is a previous image)
        display_image = st.empty()
        # If there is no uploaded file it will return to showing the beginning information
        if not uploaded_file:
            return  display_image.info("Choose a file to upload, only type: png, jpg, jpeg ")
        # If there is an uploaded file we will use the PIL library to open the uploaded image
        # and afterwards display it onto the page
        else:
            uploaded_file = PILImage.create((uploaded_file))
            display_image.image(uploaded_file.to_thumb(500,500), caption='Image Upload')
        # Saving the prediction probability of the uploaded file that has 
        # been predicted by our learner model
        pred, pred_idx, probs = res_model.predict(uploaded_file)
    with col2:
        # Upon success of the previous functions it will show the predicted class
        # and prediction probability
        st.success(f'Prediction: {pred} ')
        st.info(f'Probability: {probs[pred_idx]:.04f}')
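
The three values unpacked from the prediction (the label, its index and the list of probabilities) are related in a simple way, which the sketch below tries to show without fastai. It is an illustration under stated assumptions: the class order in `vocab` and the scores in `logits` are made up, and `decode_prediction` is a hypothetical helper rather than a fastai function.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def decode_prediction(logits, vocab):
    """Return (label, index, probabilities) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return vocab[idx], idx, probs

vocab = ["Cat", "Cheetah", "Leopard", "Lion", "Tiger"]  # hypothetical class order
logits = [0.2, 1.1, 0.9, 3.4, 0.5]                      # made-up model scores
pred, pred_idx, probs = decode_prediction(logits, vocab)
print(f"Prediction: {pred}, probability: {probs[pred_idx]:.4f}")
```

The probability shown in the app is therefore the entry of the probability list at the index of the predicted class, not a separate confidence measure.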

The next function we created is the one for our EDA (Exploratory Data Analysis). The idea is the same, except that instead of loading a ResNet model we go through our data set and display it in the Streamlit app. We can load all of the images in a fairly straightforward way using two for loops. We first take the names of our labels/classes by listing the folders in our data directory, which are named after the classes. We then build the path of each image from the data set path, the class name and the file name. Finally, we apply a filter to separate the images of the different classes.

As part of our EDA we also calculate the average resolution by reading the .size attribute of each image and averaging the widths and heights. We then create one tab per class, 5 in total. Inside each tab we display images of that class, with the number of images shown chosen using a Streamlit slider.

In [None]:
def eda():
    st.header('Exploratory data analysis')

    # List the path to our data set in github
    data_path = 'cats/'
    # Instantiating arrays so that we might save our labels and images
    img_list = []
    labels = []
    # Fill in the labels & img_list lists
    for class_name in os.listdir(data_path):
        if class_name not in labels:
            labels.append(class_name)
        img_dir = data_path + class_name + "/"
        for img_filename in os.listdir(img_dir):
            img_path = img_dir + img_filename
            img_list.append([img_path, class_name])
    # We create a filter that returns the image paths of one label from the list of [img_path, label] pairs
    def get_filtered_list(filter: str, list: list = img_list):
        return [x[0] for x in list if x[1] == filter]
    # We create a method to calculate the average resolution for a given list of images
    def get_average_img_resolution(images: list):
        widths = []
        heights = []
        # Use a loop to get the height and width of all of the images
        for img in images:
            im = Image.open(img)
            # Append the width and height from the size attribute to the lists created above
            widths.append(im.size[0])
            heights.append(im.size[1])

        avg_width = round(sum(widths) / len(widths))
        avg_height = round(sum(heights) / len(heights))
        return [avg_width, avg_height]
    # Creating 5 tabs, equal to the amount of classes we have
    tab1, tab2, tab3, tab4, tab5 = st.tabs(labels)
    # A tab for each of the labels
    for index, tab in enumerate([tab1, tab2, tab3, tab4, tab5]):
        # With activating the tab we are able to write the amount of samples within the data set
        # and the chosen amount to be displayed. Afterwards we will display the images of
        # the chose class
        with tab:
            images = get_filtered_list(labels[index])
            total_imgs = len(images)
            # We get the average image resolution of the images that pertain to the chosen class
            avg_w, avg_h = get_average_img_resolution(images)
            st.header(labels[index])
            st.write(f'Total Samples:  {total_imgs} ')
            st.write(f'Image Resolution: {avg_w}x{avg_h}')
            to_show = st.slider('Slide to adjust samples being displayed', 0, total_imgs,
                                30)
            st.image(images[:to_show], width=200)
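
The averaging step can also be written independently of Streamlit and of the image files themselves. The sketch below works on plain (width, height) tuples and additionally guards against an empty class folder, which would otherwise cause a division by zero. The function names and the example sizes are assumptions made for illustration.

```python
from collections import defaultdict

def average_resolution(sizes):
    """Average (width, height) of a list of sizes, or None for an empty list."""
    if not sizes:
        return None
    avg_w = round(sum(w for w, _ in sizes) / len(sizes))
    avg_h = round(sum(h for _, h in sizes) / len(sizes))
    return avg_w, avg_h

def average_resolution_per_class(records):
    """`records` is an iterable of (label, (width, height)) pairs."""
    by_class = defaultdict(list)
    for label, size in records:
        by_class[label].append(size)
    return {label: average_resolution(sizes) for label, sizes in by_class.items()}

# Made-up sizes, only for demonstration.
records = [("Lion", (640, 480)), ("Lion", (800, 600)), ("Tiger", (1024, 768))]
print(average_resolution_per_class(records))
# {'Lion': (720, 540), 'Tiger': (1024, 768)}
```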

We also deployed the Google Teachable Machine model in our Streamlit app, by creating a new function based on the code Google provides for deploying their model. The model itself is saved as a Keras model file that we uploaded to GitHub. Loading the model into a variable is fairly straightforward using the load_model() function. Google Teachable Machine also requires a text file containing the labels of the classes to be predicted. Google normally provides this file together with the model; we adjusted it slightly to give the classes their proper names.

In [None]:
def googlemachine():
    # Disable scientific notation for clarity
    np.set_printoptions(suppress=True)
    # Load the model
    model = load_model(pathlib.Path()/'keras_model.h5', compile=False)
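    # Load the labels, stripping the trailing newline from each line
    with open(pathlib.Path()/'labels.txt', 'r') as f:
        class_names = [line.strip() for line in f]
    # Allocate the input array in the shape the model expects: (batch, height, width, channels)
    data = np.ndarray(shape=(1, 224, 224, 3), dtype=np.float32)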
    # Utilizing the same file upload as with our image classification
    google_file = st.file_uploader("Upload Files",type=['png','jpeg', 'jpg'])
    # Generating two columns again to keep the same layout
    col1,col2 = st.columns(2)
    with col1:
        display_image = st.empty()
        if not google_file:
            return  display_image.info("Choose a file to upload, only type: png, jpg, jpeg ")
        else:
            google_file = PILImage.create((google_file))
            display_image.image(google_file.to_thumb(500,500), caption='Image Upload')
    with col2:
        # We begin with processing the uploaded file as was described with Google's
        # Teachable Machine
        image = google_file.convert('RGB')
        # Resizing the image to a 224x224
        size = (224, 224)
        image = ImageOps.fit(image, size, Image.Resampling.LANCZOS)
        # Turning the image into a numpy array
        image_array = np.asarray(image)
        # Normalizing the image
        normalized_image_array = (image_array.astype(np.float32) / 127.0) - 1
        # Loading the image into the array
        data[0] = normalized_image_array
        # Run the inference and predict the processed image
        prediction = model.predict(data)
        index = np.argmax(prediction)
        # Save the predicted class and the probability/confidence score that Google Teachable
        # Machine gives us
        class_name = class_names[index]
        confidence_score = prediction[0][index]
        # Displaying the predicted class and giving its probability
        st.success(f'Prediction: {class_name} ')
        st.info(f'Probability: {confidence_score}')
    
    google_file.close()
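
Two small details of this function are easier to see in isolation: the pixel normalization and the handling of the label file. The sketch below is a plain Python illustration. `normalize_pixel` and `parse_label` are hypothetical helpers, and the assumption that a label line may start with a numeric index (for example `0 Cat`) is based on how we understand the exported label file to look.

```python
def normalize_pixel(value, scale=127.0):
    """Map an 8-bit channel value (0..255) to roughly the range [-1, 1]."""
    return value / scale - 1

def parse_label(line):
    """Strip the newline and an optional leading index such as '0 ' from a label line."""
    text = line.strip()
    head, _, rest = text.partition(" ")
    return rest if head.isdigit() and rest else text

print(normalize_pixel(0), normalize_pixel(127), normalize_pixel(255))
print(parse_label("0 Cat\n"), parse_label("Lion\n"))
```

With a divisor of 127.0 the brightest pixel maps to a value slightly above 1, whereas a divisor of 127.5 would map 255 to exactly 1. We kept the value from the code we received.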

The final tab of our Streamlit app is the new model trainer built with fastai. The general idea is the same as in our earlier explanation of how we trained our fastai model. The notable difference is the addition of options in the app, which let the user change the learning rate, the number of epochs and which ResNet to use. resnet152 was not included in these options because the model is far too big to be trained on Streamlit cloud. Later on it turned out that the dataloaders in general were too memory intensive for Streamlit cloud, so training the ResNet model has been restricted to local use.

An important change is the slider for the learning rate. Since the Streamlit slider did not let us select values as small as we needed, the slider works with whole numbers and we divide the chosen value by 1000. After choosing the parameters, the user can start the training. After the initial training with the chosen number of epochs and learning rate, we unfreeze the weights and fine tune the newly trained model once more. When training has finished, we export the model as a new pickle file and allow the user to download it to set up their own prediction service.

In [None]:
def fastai_training():
    st.header('New Image Classifier Model')
    st.subheader('Powered by FastAi')
    fns = get_image_files('cats/')
    cats = DataBlock(
        blocks=(ImageBlock, CategoryBlock), 
        get_items=get_image_files, 
        splitter=RandomSplitter(valid_pct=0.2, seed=1),
        get_y=parent_label,
        item_tfms=Resize(128))
    dls = cats.dataloaders('./cats/')
    # Random Resize and Augmentation
    cats = cats.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
    dls = cats.dataloaders('./cats/')
    col1, col2 = st.columns(2)
    col11, col12 = st.columns(2)
    cnn_arch = col1.selectbox('Select CNN Architecture', options=['resnet50','resnet34']
    , index = 0)
    no_epoch = col2.slider('What is your desired number of epochs', min_value=1, max_value=50, 
    value=3, step=1)
    learning_rate = col11.slider('What is your desired learning rate (Value divided by 1000)'
    , min_value= 1, max_value=100, value=30, step=10)/1000
    st.info(f'Calculated learning rate: {learning_rate}')
    if st.button('Train Model'):
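        # Map the selected name to the architecture function exported by fastai.vision.all
        arch = {'resnet50': resnet50, 'resnet34': resnet34}[cnn_arch]
        resnet_adv = vision_learner(dls, arch, metrics=error_rate)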
        resnet_adv.fit_one_cycle(no_epoch, learning_rate)
        resnet_adv.unfreeze()
        resnet_adv.fit_one_cycle(1, 1e-5)
        resnet_adv.export('cats.pkl')
        with open('cats.pkl', 'rb') as f:
            # Defaults to 'application/octet-stream'
            st.download_button('Download Model', f, file_name='cats.pkl')  
    else:
        st.write('Click on button to start training')

### Google Teachable Machine

Google Teachable Machine is a web-based machine learning tool that allows us to easily train machine learning models without needing any previous knowledge of or experience with machine learning. It provides an easy-to-use interface for training models on our own images, audio or video data, and then using those trained models to make predictions on new data. With Teachable Machine, we could choose from a variety of predefined model types, such as image classification, sound classification and pose estimation, and then train our own model by providing sample data. Once the model had been trained, we could make predictions in the web interface or, as we did, download the Keras model and use it in our own app.

"Benchmark picture"

The Teachable Machine has pretty good results, yet slightly worse than our model with the advanced options. Even still, the performance of the Teachable Machine 

### Google Vertex

Google Vertex AI is a machine learning platform from Google Cloud that can, among other things, train image classification models. Such a model is trained on a dataset of labeled images and relies on techniques such as convolutional neural networks to classify images accurately.

"Vertex benchmark"

Vertex used only a handful of images for validation during training, so the result should be interpreted accordingly.

### ROC and AUC metric

### Flask API

Flask is a micro web framework written in Python that is used to create web applications. In this project, we used it to provide an API with a single POST route that predicts which feline is shown in an image included in the body of the API call. This image can be a jpg, jpeg or png file.

In [None]:
# necessary imports
import os
from flask import Flask, request, make_response, jsonify
from werkzeug.utils import secure_filename
from fastai.vision.all import *
from fastai.data.external import *
import pathlib


# A model exported on Linux stores PosixPath objects; patch pathlib only when running on Windows
if os.name == 'nt':
    pathlib.PosixPath = pathlib.WindowsPath

# list of allowed extensions
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}

# initiation
app = Flask(__name__)

# specify the pickle file made in the BigDataAdvance notebook
learner = load_learner('cats.pkl')

# method used to check if the extension of the file matches one of the allowed extensions mentioned earlier
def allowed_file(filename):
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

# route and method for the API call
@app.route('/predict', methods=['POST'])
def predict():
    # exception for when there is no image (error code 400)
    if 'image' not in request.files:
        return {'error': 'no image found, in request.'}, 400

    # save the image in a variable called file
    file = request.files['image']
    # exception for when the name of the file is empty (error code 400)
    if file.filename == '':
        return {'error': 'no image found. Empty'}, 400
 
    # actions for when the file is present and has one of the allowed extensions
    if file and allowed_file(file.filename): 
        # create an image based on the file
        img = PILImage.create(file)
        # let the model make a prediction based on the image
        pred = learner.predict(img)
        print(pred)
        # return "success" as well as the prediction (code 200)
        return {'success': pred[0]}, 200    
    # create an exception for any other error that may occur (error code 500)
    return {'error': 'something went wrong.'}, 500

if __name__ == '__main__':
    # use the PORT environment variable if it is set, otherwise port 5000 (app.run expects an integer)
    port = int(os.getenv('PORT', 5000))
    # run the app
    app.run(debug=True, host='0.0.0.0', port=port) 
    print("success")
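
To try the endpoint, a client has to send the image as a multipart form field named `image`. The sketch below shows one way to do this with only the Python standard library. It is an illustrative client, not part of our project code: the helper names, the default URL `http://localhost:5000/predict` and the file name in the usage note are assumptions.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

def build_multipart(field, filename, payload):
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"

def classify(image_path, url="http://localhost:5000/predict"):
    """POST an image to the /predict route and return the raw JSON response text."""
    path = Path(image_path)
    body, content_type = build_multipart("image", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for the 400 and 500 responses of the API.
    with urllib.request.urlopen(request) as response:
        return response.read().decode()
```

With the Flask app running locally, a call such as `classify("lion.jpg")` would return the JSON text produced by the `/predict` route.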
