computer-vision-course

Sources & udemy course

Udemy course: https://www.udemy.com/share/10143y3@GmhgL2nFG3QDxH4DGeblXwsz6uiLPxBsfSLtQVXG8rsIo30x1YTrkmNiyJJGdCgVwA==/

Great visualization page: https://setosa.io/ev/image-kernels/

Environment setup

Python Virtualenv

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt

Anaconda

Python Path

Add conda to your PATH:

export PATH="/home/peter/anaconda3/bin/:$PATH"

Open Anaconda terminal bash

source ~/anaconda3/bin/activate

Create the conda environment for the course

conda env create -f cvcourse_linux.yml

Activate and deactivate the created conda environment

# To activate this environment, use                                             
#     $ conda activate python-cvcourse                                          
# To deactivate an active environment, use                                                   
#     $ conda deactivate

Google Colab

Google Colab already provides an environment that meets the needs of the Udemy course. Packages needed:

matplotlib
cv2
Image
numpy
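
To confirm that an environment (local or Colab) provides these packages, a quick check like the following may help. It is a minimal sketch: the helper name `check_packages` is hypothetical, and `PIL` is assumed to be the importable module behind the `Image` entry, as in `from PIL import Image`.

```python
import importlib.util

# Importable module names for the packages listed above.
# Assumption: "Image" refers to Pillow, which is imported as PIL.
REQUIRED_MODULES = ["matplotlib", "cv2", "PIL", "numpy"]


def check_packages(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_packages().items():
        print(f"{name}: {'ok' if available else 'missing'}")
```

The check only looks the modules up and does not import them, so it is cheap to run at the top of a notebook.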

Create a google colab file

Go to your Google Drive account and click on the three stripes at the top left. Then click "More". Google Colab should now appear as a file type you can create.

This creates a Python notebook that is stored in your Google Drive. You can also sync it with your GitHub account and give it access to your Google Drive data.

Sections

1 - Numpy and image basics

This section covers all the numpy and image basics that you need to get started.

Summary & Notebook

NOTEBOOK: section_1_numpy_and_image.ipynb

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from PIL import Image

plt.imshow(image)  # Most used command to show images throughout the lectures
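
As a small illustration of how an image is represented in NumPy, the sketch below builds an RGB image by hand instead of loading a file. The sizes and colors are arbitrary example values, not taken from the course notebook.

```python
import numpy as np

# An RGB image is an array of shape (height, width, 3) with dtype uint8.
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)  # all black

image[:, :, 0] = 255           # set the red channel everywhere -> pure red image
image[1:3, 2:4] = (0, 0, 255)  # paint a small blue block in the middle

red_channel = image[:, :, 0]   # a single channel is a 2D array
print(image.shape, red_channel.shape)
```

Passing `image` to `plt.imshow` would display the result.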

3 - Image Basics With OpenCV

This section covers opening and storing images, as well as drawing on images and mouse click events to draw on images.

Summary & Notebooks

NOTEBOOK: section_3_opening_files_with_openCv_script.py

import cv2

img = cv2.imread(DATA_DIR)
while True:
  cv2.imshow("image_name", img)
  if cv2.waitKey(1) & 0xFF == 27:   # This stops the while loop when pressing the "esc" key
    break
cv2.destroyAllWindows()    #  This closes the window

NOTEBOOK: section_3_opening_files_with_openCv_script.py

# OpenCV and matplotlib have different order of the color channels
# matplotlib -> RGB: Red, Green, Blue
# OpenCV -> BGR: Blue, Green, Red
rgb_colored_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB) # conversion between color channels

# Read image in grayscale
img_grayscale = cv2.imread(DATA_DIR, cv2.IMREAD_GRAYSCALE)

# Resizing an image
resized_img = cv2.resize(rgb_colored_img, (1000, 400)) # fixed values
resized_img = cv2.resize(rgb_colored_img, (0, 0), fx=width_ratio, fy=height_ratio) # with a ratio

# flipping an image
cv2.flip(img, 0) # Around the x axis
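
The channel reordering and the flip can also be written with plain NumPy slicing, which may make it clearer what happens to the array. This is a sketch on a tiny hand-made array; for real images the `cv2` calls above are the usual choice.

```python
import numpy as np

# A 2x2 "BGR" image: each pixel is (blue, green, red).
bgr_img = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

rgb_img = bgr_img[:, :, ::-1]    # reverse the channel axis: BGR -> RGB
flipped_x = bgr_img[::-1, :, :]  # reverse the rows: flip around the x axis
flipped_y = bgr_img[:, ::-1, :]  # reverse the columns: flip around the y axis

print(rgb_img[1, 1])  # the pixel (10, 20, 30) becomes (30, 20, 10)
```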

Object Tracking

Optical Flow

MeanShift and CamShift

TODO: Add description and links

TODO: Add disadvantages

Tracking API Methods

OpenCV offers API methods that can be used for object tracking. The API makes it easy to swap and replace single methods in order to change the tracking algorithm.

Overview / Examples

Boosting Tracker

Based on the AdaBoost algorithm (as used for Haar cascades). Evaluation runs across multiple frames.

Pros	Cons
Very well known and studied algorithm	Doesn't know when tracking has failed
	Better techniques available

Multiple Instance Learning (MIL) Tracker

Similar to Boosting, but considers a neighborhood of points around the current location to create multiple instances.

Pros	Cons
Good performance and doesn't drift as much as Boosting	Doesn't know when tracking has failed
	Can't recover from full obstruction

Kernelized Correlation Filters (KCF) Tracker

Exploits some properties of the MIL Tracker and considers many overlapping points in the data. This leads to more accurate and faster tracking.

Pros	Cons
Better performance than MIL and Boosting	Can't recover from full obstruction
Good first choice

Tracking, Learning and Detection (TLD) Tracker

Tracker follows the object through the frames (frame by frame). The detector localizes all appearances that have been observed so far and corrects the tracker if necessary.

Pros	Cons
Good performance	Can provide many false positives
Can handle obstruction in frames
Scale invariant (Can deal with large changes in scale)

MedianFlow Tracker

Tracks the object in both directions (forward and backward) in time and measures the discrepancies between these two trajectories.

Pros	Cons
Good at reporting failed tracking	Fails under large motion
Works well with predictable motion (no objects that appear suddenly)
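
The forward-backward idea can be sketched with NumPy. This is only an illustration of the discrepancy measure, not OpenCV's MedianFlow implementation: the function name and the example points are hypothetical, and the tracked positions are made up rather than produced by a real tracker.

```python
import numpy as np


def forward_backward_error(start_points, back_tracked_points):
    """Distance between each starting point and where it lands after being
    tracked forward in time and then backward again."""
    start = np.asarray(start_points, dtype=float)
    back = np.asarray(back_tracked_points, dtype=float)
    return np.linalg.norm(start - back, axis=1)


# Hypothetical example: three points in the first frame ...
start = [(10, 10), (20, 15), (30, 30)]
# ... and their positions after tracking forward and then backward.
back = [(10, 10), (20, 16), (36, 38)]

errors = forward_backward_error(start, back)
reliable = errors <= np.median(errors)  # keep points with a small discrepancy
print(errors, reliable)
```

A point whose backward trajectory ends far from where it started is treated as unreliable, which is how a failed track can be noticed.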

Deep Learning for Computer Vision

This section uses the Python Keras library.

Machine Learning Basics

Machine learning is a method of data analysis that automates analytical model building.

It uses algorithms that iteratively learn from data.

Applications of Machine Learning

Fraud detection
real-time ads on web pages
pricing models
pattern and image recognition
text sentiment analysis
email spam filters
...

Supervised learning

The algorithms are trained with labelled example data: inputs for which the desired output is known.

Example: Pictures of a dog or a certain letter and the respective label.

Supervised learning is commonly used in applications where the historical data predicts likely future events.

Process

Data acquisition: Getting the data that is needed. Open source data can be used here
Cleaning up: Resizing, ...
Splitting: Split the data into test and training data
Training & building: Apply the training and build a model for prediction
Model testing: Apply the built model to the test data and see if the classification worked out well
Model deployment: Use the trained model for data that was not part of the training and test data sets
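
The splitting step can be sketched as follows. The function name, the 30 % test share, and the fixed seed are illustrative assumptions; libraries such as scikit-learn provide ready-made helpers for this.

```python
import numpy as np


def split_train_test(features, labels, test_share=0.3, seed=42):
    """Shuffle the samples and split them into a training and a test part."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))  # random order of sample indices
    n_test = int(round(len(features) * test_share))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]


# Ten dummy samples with two features each and alternating labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(X_train.shape, X_test.shape)
```

Features and labels are indexed with the same shuffled indices, so each sample keeps its label after the split.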

Classification metrics

Accuracy
Recall
Precision
F1-Score
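
For a binary classifier, these metrics follow directly from the counts of true/false positives and negatives. A minimal sketch in plain Python; the function name and the example labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, recall, precision and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of actual positives found
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of predicted positives that are correct
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # harmonic mean of precision and recall
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```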

Convolutional Neural Networks (CNNs)

Tensors

Tensors are needed to feed images into the network. Tensors are N-dimensional arrays that build up as follows:

scalar values (e.g. 3)
Vectors (e.g. [3,4,5])
Matrix (e.g. [[3,4,5], [4,5,6]])
Tensor (e.g. [[[3,4,5], [4,5,6]], [[2,1,5], [9,7,6]]]) -> Matrix of Matrices (Higher dimensional arrays)

Tensors make it convenient to feed sets of images into our models: (Image, Height, Width, Color) -> multiple arrays
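
In NumPy, the examples above differ only in their number of dimensions. The sketch below prints `ndim` and `shape` for each one; the batch size and image size at the end are arbitrary example values.

```python
import numpy as np

scalar = np.array(3)
vector = np.array([3, 4, 5])
matrix = np.array([[3, 4, 5], [4, 5, 6]])
tensor = np.array([[[3, 4, 5], [4, 5, 6]], [[2, 1, 5], [9, 7, 6]]])

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim =", value.ndim, "shape =", value.shape)

# A batch of images as a 4D tensor: (images, height, width, color channels).
# Assumption: 10 RGB images of 28x28 pixels, chosen only as an example.
batch = np.zeros((10, 28, 28, 3), dtype=np.uint8)
```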

Architecture

TODO

Pooling / subsampling layer

A filter (e.g. 2x2) is moved across the image and used to build a new, smaller image. For example, the filter [[2,1],[-1,1]] is applied to each 2x2 block of the image, so that 4 pixels, multiplied by the filter, are reduced to one pixel of the new image.

This removes a lot of information, but it also reduces computation time considerably.
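
The 2x2 reduction can be sketched in NumPy. The helper name is hypothetical and the 4x4 input is a made-up example. Max pooling, which the text does not name explicitly, is added as a commonly used variant.

```python
import numpy as np


def split_into_2x2_blocks(image):
    """Rearrange a 2D image into non-overlapping 2x2 blocks.

    Returns an array of shape (height // 2, width // 2, 2, 2).
    Odd trailing rows or columns are dropped.
    """
    h, w = image.shape
    h, w = h - h % 2, w - w % 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).swapaxes(1, 2)


image = np.arange(16).reshape(4, 4)
blocks = split_into_2x2_blocks(image)

# Weighted 2x2 filter from the text: multiply each block by the filter and sum.
kernel = np.array([[2, 1], [-1, 1]])
filtered = (blocks * kernel).sum(axis=(2, 3))

# Max pooling: keep only the largest value of each block.
max_pooled = blocks.max(axis=(2, 3))

print(filtered.shape, max_pooled.shape)
```

Either way, a 4x4 input becomes a 2x2 output, so three quarters of the values are discarded.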

Dropout

During training, units are randomly dropped along with their connections.

Used as regularization to help prevent overfitting. It prevents units from co-adapting too much.
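
What dropout does to a layer's activations during training can be sketched in NumPy. This is an illustration only: the function name is hypothetical, and the "inverted dropout" scaling is one common way to implement it. In practice Keras provides a `Dropout` layer for this.

```python
import numpy as np


def dropout(activations, rate, rng):
    """Randomly zero a share `rate` of the units (training time only).

    Uses "inverted dropout": kept units are scaled by 1 / (1 - rate) so that
    the expected sum of the activations stays the same. Expects rate < 1.
    """
    if rate <= 0:
        return activations
    keep_mask = rng.random(activations.shape) >= rate  # True = unit is kept
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
activations = np.ones((2, 5))
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)  # every entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```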

Famous CNNs

LeNet-5
AlexNet
GoogLeNet
ResNet

pkhurt/computer-vision-course