<a href="https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/assignments/assignment_yourname_class7.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-558: Applications of Deep Neural Networks
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), School of Engineering and Applied Science, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

**Module 7 Assignment: Computer Vision Neural Network**

**Student Name: Your Name**

# Google CoLab Instructions

This assignment will be most straightforward if you use Google CoLab, because it requires both PyTorch and YOLOv5 to be installed. It will be necessary to mount your GDrive so that you can send your notebook during the submit process. Running the following code will map your GDrive to ```/content/drive```.

In [1]:
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    COLAB = True
    print("Note: using Google CoLab")
except:
    print("Note: not using Google CoLab")
    COLAB = False

Mounted at /content/drive
Note: using Google CoLab


# Assignment Submit Function

You will submit the 10 programming assignments electronically.  The following submit function can be used to do this.  My server will perform a basic check of each assignment and let you know if it sees any basic problems. 

**It is unlikely that should need to modify this function.**

In [2]:
import base64
import os
import numpy as np
import pandas as pd
import requests
import PIL
import PIL.Image
import io

# This function submits an assignment.  You can submit an assignment as much as you like, only the final
# submission counts.  The paramaters are as follows:
# data - List of pandas dataframes or images.
# key - Your student key that was emailed to you.
# no - The assignment class number, should be 1 through 1.
# source_file - The full path to your Python or IPYNB file.  This must have "_class1" as part of its name.  
# .             The number must match your assignment number.  For example "_class2" for class assignment #2.
def submit(data,key,no,source_file=None):
    if source_file is None and '__file__' not in globals(): raise Exception('Must specify a filename when a Jupyter notebook.')
    if source_file is None: source_file = __file__
    suffix = '_class{}'.format(no)
    if suffix not in source_file: raise Exception('{} must be part of the filename.'.format(suffix))
    with open(source_file, "rb") as image_file:
        encoded_python = base64.b64encode(image_file.read()).decode('ascii')
    ext = os.path.splitext(source_file)[-1].lower()
    if ext not in ['.ipynb','.py']: raise Exception("Source file is {} must be .py or .ipynb".format(ext))
    payload = []
    for item in data:
        if type(item) is PIL.Image.Image:
            buffered = BytesIO()
            item.save(buffered, format="PNG")
            payload.append({'PNG':base64.b64encode(buffered.getvalue()).decode('ascii')})
        elif type(item) is pd.core.frame.DataFrame:
            payload.append({'CSV':base64.b64encode(item.to_csv(index=False).encode('ascii')).decode("ascii")})
    r= requests.post("https://api.heatonresearch.com/assignment-submit",
        headers={'x-api-key':key}, json={ 'payload': payload,'assignment': no, 'ext':ext, 'py':encoded_python})
    if r.status_code==200:
        print("Success: {}".format(r.text))
    else: print("Failure: {}".format(r.text))

# Assignment Instructions

For this assignment, you will use YOLO running on Google CoLab.  I suggest that you run this assignment on CoLab because the example code below is already setup to get you started with the correct versions of  YOLO on TensorFlow 2.0.

For this assignment you are provided with 10 image files that contain 10 different webcam pictures taken at the [Venice Sidewalk Cafe](https://www.westland.net/beachcam/) a WebCam that has been in opration since 1996.  You can find the 10 images here:

* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk1.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk2.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk3.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk4.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk5.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk6.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk7.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk8.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk9.jpg
* https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk10.jpg

You can see a sample of the WebCam here:

![alt text](https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk1.jpg)

YOLO does quite well-recognizing objects in this webcam, as the following image illustrates.

![alt text](https://data.heatonresearch.com/data/t81-558/sidewalk/predictions.jpg)

You are to write a script that counts the number of certain objects in each of the images.  Specifically, you are looking for:

* person
* car
* bicycle
* motorbike
* umbrella
* handbag

It is essential that your use YOLO with a threshold of 10% if you want your results to match mine. The sample code below already contains this setting.  Your program can set this threshold with the following command.

* conf_thres=0.1  # confidence threshold (use this value)
* iou_thres=0.25  # NMS IOU threshold (use this value)

Your submitted data frame should also contain a column that identifies which image generated each row.  This column should be named **image** and contain integer numbers between 1 and 10.  There should be 10 rows in total.  The complete data frame should look something like this (not necessarily exactly these numbers).

|image|person|car|bicycle|motorbike|umbrella|handbag|
|-|-|-|-|-|-|-|
|1|23|0|3|4|0|0|
|2|27|1|8|2|0|0|
|3|29|0|0|0|3|0|
|...|...|...|...|...|...|...|


The following code sets up YOLO and then dumps the classification information for the first image.  This notebook only serves to get you started.  Read in all ten images and generate a data frame that looks like the following. Use the **submit** function as you did in previous assignments.

### Installing YOLOv5

YOLO is not available directly through either PIP or CONDA.  Additionally, YOLO is not installed in Google CoLab by default. Therefore, whether you wish to use YOLO through CoLab or run it locally, you need to go through several steps to install it.  This section describes the process of installing YOLO.  The same steps apply to either CoLab or a local install.  For CoLab, you must repeat these steps each time the system restarts your virtual environment.  You must perform these steps only once for your virtual Python environment for a local install.  If you are installing locally, make sure to install to the same virtual environment you created for this course.  The following commands install YOLO directly from its GitHub repository.

In [3]:
!git clone https://github.com/ultralytics/yolov5
%cd /content/yolov5
%pip install -qr requirements.txt

from yolov5 import utils
display = utils.notebook_init()

YOLOv5 🚀 v6.0-187-gf3085ac torch 1.10.0+cu111 CUDA:0 (A100-SXM4-40GB, 40536MiB)


Setup complete ✅ (12 CPUs, 83.5 GB RAM, 42.1/166.8 GB disk)


### Running YOLOv5

In addition to the command line execution we just saw, the YOLO library can easily integrate with Python applications.  The following code adds the downloaded YOLOv5 to Python's environment, allowing **yolov5** to be imported like a regular Python library.

In [4]:
import sys
sys.path.append(str("/content/yolov5"))

import argparse
import os
from pathlib import Path

import cv2
import torch
import torch.backends.cudnn as cudnn

from models.common import DetectMultiBackend
from utils.datasets import IMG_FORMATS, VID_FORMATS, LoadImages, LoadStreams
from utils.general import (LOGGER, check_file, check_img_size, check_imshow, check_requirements, colorstr,
                           increment_path, non_max_suppression, print_args, scale_coords, strip_optimizer, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, time_sync

from PIL import Image
import requests
from io import BytesIO
import torchvision.transforms.functional as TF

We are now ready to load YOLO, with pretrained weights provided by the creators of YOLO.  It is also possible to train YOLO to recognize images of your own.

In [5]:
device = select_device('')
weights = '/content/yolov5/yolov5s.pt'
model = DetectMultiBackend(weights, device=device, dnn=False)
stride, names, pt, jit, onnx, engine = model.stride, model.names, model.pt, model.jit, model.onnx, model.engine

YOLOv5 🚀 v6.0-187-gf3085ac torch 1.10.0+cu111 CUDA:0 (A100-SXM4-40GB, 40536MiB)

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients


I built the following function from the code presented in the course module. The function combines some of the code from the module to accept an image and return what YOLO recognizes. Make sure to use the same two thres_xxx values I provided below to match the results that I got.

In [18]:
import numpy as np

half = False
conf_thres=0.1  # confidence threshold (use this value)
iou_thres=0.25  # NMS IOU threshold (use this value)
classes = None
agnostic_nms=False,  # class-agnostic NMS
max_det=1000

def process_yolo(img):
  # Resize image, if needed
  imgsz = [img.height, img.width]
  imgsz = check_img_size(imgsz, s=stride)  # check image size
  original_size = imgsz[:]

  # Prepare model for this image
  model.warmup(imgsz=(1, 3, *imgsz), half=half)  # warmup
  dt, seen = [0.0, 0.0, 0.0], 0
  img2 = img.resize([imgsz[1],imgsz[0]], Image.ANTIALIAS)
      
  # Preprocess image
  img_raw = torch.from_numpy(np.asarray(img2)).to(device)
  img_raw = img_raw.half() if half else img_raw.float()  # uint8 to fp16/32
  img_raw /= 255  # 0 - 255 to 0.0 - 1.0
  img_raw = img_raw.unsqueeze_(0)
  img_raw = img_raw.permute(0, 3, 1, 2)

  # Query YoLo
  pred = model(img_raw, augment=False, visualize=False)
  pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)

  # convert these raw predictions into the bounding boxes, labels, and 
  # confidences for each of the images that YOLO recognized.
  results = []
  for i, det in enumerate(pred):  # per image
    gn = torch.tensor(img_raw.shape)[[1, 0, 1, 0]]  

    if len(det):
        # Rescale boxes from img_size to im0 size
        det[:, :4] = scale_coords(original_size, det[:, :4], imgsz).round()

        # Write results
        for *xyxy, conf, cls in reversed(det):
          xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()
          # Choose between xyxy and xywh as your desired format.
          results.append([names[int(cls)], float(conf), [*xyxy]]) 
  return results

### Starter Code

In [7]:
url = "https://data.heatonresearch.com/data/t81-558/sidewalk/sidewalk1.jpg"
response = requests.get(url,headers={'User-Agent': 'Mozilla/5.0'})
img = Image.open(BytesIO(response.content))
results = process_yolo(img)

for itm in results:
  print(itm)

['suitcase', 0.26384803652763367, [tensor(1140., device='cuda:0'), tensor(435., device='cuda:0'), tensor(1214., device='cuda:0'), tensor(502., device='cuda:0')]]
['person', 0.2687147557735443, [tensor(1530., device='cuda:0'), tensor(321., device='cuda:0'), tensor(1546., device='cuda:0'), tensor(361., device='cuda:0')]]
['person', 0.2705391049385071, [tensor(858., device='cuda:0'), tensor(522., device='cuda:0'), tensor(917., device='cuda:0'), tensor(626., device='cuda:0')]]
['umbrella', 0.3001454770565033, [tensor(1545., device='cuda:0'), tensor(294., device='cuda:0'), tensor(1599., device='cuda:0'), tensor(319., device='cuda:0')]]
['umbrella', 0.36170196533203125, [tensor(1474., device='cuda:0'), tensor(323., device='cuda:0'), tensor(1528., device='cuda:0'), tensor(353., device='cuda:0')]]
['person', 0.3645065724849701, [tensor(843., device='cuda:0'), tensor(527., device='cuda:0'), tensor(866., device='cuda:0'), tensor(568., device='cuda:0')]]
['person', 0.5860028266906738, [tensor(907



In [None]:
# Add your solution here, put your results into submit_df

# This is your student key that I emailed to you at the beginnning of the semester.
key = "5iuwhudihwiao6dsfw7dE2ml08iNfVOg6l0O3M06"  # This is an example key and will not work.

# You must also identify your source file.  (modify for your local setup)
file='/content/drive/MyDrive/Colab Notebooks/assignment_yourname_class7.ipynb'  # Google CoLab

submit_df.to_csv("/content/drive/MyDrive/7.csv")
submit(source_file=file,data=[submit_df],key=key,no=7)