# Image Detector from a WebCam in Real Time

This script detects objects from the webcam in real time. By default the webcam is set to 640 x 480, 5 fps, no audio.
The ML model is YOLOv4 (the code loads the yolov4-tiny variant). It was trained with 320x320 images to detect the following objects:

 ['person', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

When the webcam starts, the window shows one button for each object pre-selected in the code. While a button is active, bounding boxes are drawn in the video around every detected object of that class; while it is inactive, no box is drawn for that class.

The video stops when the ESC key is pressed. As soon as the video stops, the script builds the output video from the captured images and then asks whether to delete them: type "y" to delete all files in the images folder, or any other key to keep them.

The captured images are saved into the "images_for_video" folder.
The output video is saved into the "video_output" folder as "video_output.mp4".
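
The script does not create these two folders itself, so they need to exist before the first frame is written. The sketch below shows one way to create them up front. The helper name `ensure_output_folders` and its `base_dir` parameter are illustrative assumptions and are not part of the original script.

```python
import os

# Folder names taken from the script; adjust them if yours differ.
IMAGE_FOLDER = "images_for_video"
VIDEO_FOLDER = "video_output"


def ensure_output_folders(base_dir="."):
    """Create the frame and video folders under base_dir if they are missing."""
    paths = []
    for name in (IMAGE_FOLDER, VIDEO_FOLDER):
        path = os.path.join(base_dir, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths.append(path)
    return paths
```

Calling `ensure_output_folders()` once, before the capture loop, would be enough. Calling it again is harmless.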

## Import modules

In [32]:
import cv2
import os
from gui_buttons import Buttons   # local module that draws and tracks the on-screen buttons


## Create a button for each object (class) to be detected

In [33]:
button = Buttons()
button.add_button("person",20,20)        # last parameter (20) is the y coordinate; each following button adds 20 to it
button.add_button("sports ball",20,40)
button.add_button("cell phone",20,60)

colors = button.colors
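
The `gui_buttons` module is not shown in this notebook. To illustrate the toggle behavior the script relies on, here is a minimal sketch of a hypothetical stand-in class. The class name `ToggleButtons`, the button size, and the internal layout are all assumptions. Only the method names `add_button`, `button_click`, and `active_buttons_list` come from the script, and drawing is left out.

```python
class ToggleButtons:
    """Hypothetical stand-in for gui_buttons.Buttons (drawing omitted)."""

    def __init__(self, width=150, height=20):
        self.width = width      # assumed button width in pixels
        self.height = height    # assumed button height in pixels
        self.buttons = {}       # label -> {"x", "y", "active"}

    def add_button(self, label, x, y):
        # New buttons start inactive.
        self.buttons[label] = {"x": x, "y": y, "active": False}

    def button_click(self, mouse_x, mouse_y):
        # Toggle every button whose rectangle contains the click position.
        for state in self.buttons.values():
            inside_x = state["x"] <= mouse_x < state["x"] + self.width
            inside_y = state["y"] <= mouse_y < state["y"] + self.height
            if inside_x and inside_y:
                state["active"] = not state["active"]

    def active_buttons_list(self):
        return [label for label, state in self.buttons.items() if state["active"]]


demo = ToggleButtons()
demo.add_button("person", 20, 20)
demo.add_button("sports ball", 20, 40)
demo.button_click(25, 45)              # lands inside "sports ball"
print(demo.active_buttons_list())      # ['sports ball']
```

With an assumed height of 20 pixels, buttons placed 20 pixels apart do not overlap, which matches the spacing used in the cell above.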

## AI Model Parameters

In [34]:
net = cv2.dnn.readNet('dnn_model/yolov4-tiny.weights', 'dnn_model/yolov4-tiny.cfg')   # forward slashes avoid invalid escapes such as '\y'
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(320,320), scale=1/255)
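
A wrong path to the model files is an easy mistake to make here. As a rough sketch, the files could be checked before the network is loaded so that a missing file is reported by name. The helper `find_missing_files` is an illustrative assumption; the file names are the ones this script uses.

```python
import os


def find_missing_files(folder, file_names):
    """Return the names in file_names that do not exist inside folder."""
    return [name for name in file_names
            if not os.path.isfile(os.path.join(folder, name))]


# File names as used in this script.
MODEL_FILES = ["yolov4-tiny.weights", "yolov4-tiny.cfg", "classes.txt"]

missing = find_missing_files("dnn_model", MODEL_FILES)
if missing:
    print("Missing model files in dnn_model:", ", ".join(missing))
```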

## Function definition for mouse clicks on buttons

In [35]:
def click_button(event, x, y, flags, params):
    # Toggle the button under the cursor on a left mouse click
    if event == cv2.EVENT_LBUTTONDOWN:
        button.button_click(x, y)
        

## Create the window and attach the mouse callback

In [36]:
cv2.namedWindow('Frame')
cv2.setMouseCallback('Frame',click_button)

## Read the classes file and build a list of class names

In [37]:
classes=[]
with open('dnn_model/classes.txt','r') as file_object:
    for class_name in file_object.readlines():
        class_name = class_name.strip()
        classes.append(class_name)

#print("Objects list")
#print(classes)
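
The position of each name in the list must match the class ID returned by the model, since the list is later indexed with `classes[class_id]`. As a small sketch, the same loading step can be wrapped in a function that also drops blank lines at the end of the file, which would otherwise become empty class names. The function name `load_classes` and the UTF-8 encoding are assumptions.

```python
def load_classes(path):
    """Read one class name per line, keeping line order as the class ID."""
    with open(path, "r", encoding="utf-8") as file_object:
        names = [line.strip() for line in file_object]
    # Drop blank lines at the end of the file only, so the index of every
    # real class name still matches the ID the model returns.
    while names and not names[-1]:
        names.pop()
    return names
```

Blank lines in the middle of the file are kept on purpose: removing them would shift every later index.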

## Open Webcam

In [38]:
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
print("Opening webcam")

Opening webcam


## Capture frames, predict, and show the video until the "Esc" key is pressed

In [39]:
count = 100000
while cap.isOpened():
    ret, frame = cap.read()
    #frame = cv2.flip(frame, flipCode=-1)   #Flip the frame in both axes.

    if not ret:
        print("Error Loading Image")
        break
    
    # get activated buttons list
    active_buttons = button.active_buttons_list() 

    (class_ids, scores, bboxes) = model.detect(frame, confThreshold=0.3, nmsThreshold=0.4)
    
    for class_id, score, bbox in zip(class_ids, scores, bboxes):
        class_id = int(class_id)    # cast to a plain int (needed for cv2 4.5.3)
        (x, y, w , h) = bbox
        class_name = classes[class_id]
        color = colors[class_id]
        
        if class_name in active_buttons:
            cv2.putText(frame,str(class_name), (x,y-10), cv2.FONT_HERSHEY_PLAIN,2, color,2 )
            cv2.rectangle(frame,(x,y),(x+w,y+h),color,3)
    
    button.display_buttons(frame)
    
    cv2.imshow("Frame",frame)
    
    cv2.imwrite(os.path.join('images_for_video', 'frame' + str(count) + '.png'), frame)     # save frame as a PNG file
    count += 1
    
    key =  cv2.waitKey(1)
    if key == 27 :
        break

cap.release()
cv2.destroyAllWindows() 
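
The call to `model.detect` takes a confidence threshold (0.3) and a non-maximum suppression (NMS) threshold (0.4). The sketch below illustrates the general idea behind those two parameters in plain Python: discard low-score boxes, then greedily drop boxes that overlap a higher-scoring box too much. It is a simplified, class-agnostic illustration and not OpenCV's actual implementation; the function names are assumptions. Boxes use the same `(x, y, w, h)` layout as the script.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, conf_threshold=0.3, nms_threshold=0.4):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Step 1: keep only boxes whose score reaches the confidence threshold.
    candidates = [i for i, s in enumerate(scores) if s >= conf_threshold]
    # Step 2: visit boxes from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(10, 10, 100, 100), (15, 15, 100, 100), (300, 200, 50, 50), (0, 0, 20, 20)]
scores = [0.9, 0.8, 0.7, 0.1]
print(filter_detections(boxes, scores))  # [0, 2]
```

In this example box 3 is removed by the confidence threshold, and box 1 is removed because it overlaps box 0 with an IoU of about 0.82, well above 0.4.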

## Make a video from captured frames

In [40]:
image_folder = 'images_for_video'
video_file = os.path.join('video_output', 'video_output.mp4')
image_size = (640, 480)   # must match the size of the saved frames
fps = 5

# Frame names all have the same number of digits, so sorting by name keeps capture order
images = [img for img in os.listdir(image_folder) if img.endswith(".png")]
images.sort()

out = cv2.VideoWriter(video_file, cv2.VideoWriter_fourcc(*'mp4v'), fps, image_size)

print("Making output video...")
for filename in images:
    img = cv2.imread(os.path.join(image_folder, filename))
    out.write(img)
out.release()   # finish writing the file before reporting success
print("Output video was saved into the video_output folder as video_output.mp4")

Making output video...
Output video was saved into the video_output folder as video_output.mp4
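
The frame order in the video depends on `images.sort()`, which sorts file names as text. Starting the counter at 100000 presumably serves to give every file name the same number of digits, so text order equals capture order. The sketch below shows what goes wrong without that and a zero-padded alternative; the helper `frame_name` is an illustrative assumption.

```python
def frame_name(index, width=6):
    """Zero-padded frame file name, e.g. frame000007.png."""
    return f"frame{index:0{width}d}.png"


unpadded = [f"frame{i}.png" for i in (9, 10, 11)]
padded = [frame_name(i) for i in (9, 10, 11)]

print(sorted(unpadded))  # ['frame10.png', 'frame11.png', 'frame9.png']
print(sorted(padded))    # ['frame000009.png', 'frame000010.png', 'frame000011.png']
```

Either approach works as long as the number of digits never changes during a recording.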


## Delete or keep frames

In [41]:
answer = input("Do you want to delete the captured images? Type y to delete them or any other key to keep them, then press Enter: ")
if answer == "y":
    path = 'images_for_video'
    print("Deleting captured images...")
    for file_name in os.listdir(path):
        # construct the full file path
        file = os.path.join(path, file_name)
        if os.path.isfile(file):
            os.remove(file)
    print("Captured images have been deleted")
else:
    print("Captured images are saved into images_for_video")

Deleting captured images...
Captured images have been deleted
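
The cell above removes every file in the folder, whatever its type. A more cautious variant, sketched here under the assumption that only the captured frames should go, deletes just the files that match the frame naming pattern. The function name `delete_frames` and the default pattern are assumptions based on the `frame<number>.png` names the script writes.

```python
from pathlib import Path


def delete_frames(folder, pattern="frame*.png"):
    """Delete files in folder that match pattern and return how many were removed."""
    removed = 0
    for path in Path(folder).glob(pattern):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed
```

For example, `delete_frames("images_for_video")` would leave any other file in that folder untouched.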


In [42]:
cv2.destroyAllWindows() 