Udemy course: https://www.udemy.com/share/10143y3@GmhgL2nFG3QDxH4DGeblXwsz6uiLPxBsfSLtQVXG8rsIo30x1YTrkmNiyJJGdCgVwA==/
Great visualization page for image kernels: https://setosa.io/ev/image-kernels/
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
Add conda to your PATH:
export PATH="/home/peter/anaconda3/bin/:$PATH"
source ~/anaconda3/bin/activate
conda env create -f cvcourse_linux.yml
# To activate this environment, use
# $ conda activate python-cvcourse
# To deactivate an active environment, use
# $ conda deactivate
Google Colab already has the environment set up for everything the Udemy course needs. Packages needed:
- matplotlib
- cv2
- Image
- numpy
Go to your Google Drive account and click on the three stripes on the top left. Then click "More". You should now see Google Colab as a file type that you can create.
This creates a Python notebook that is stored in your Google Drive. You can also sync it with your GitHub account and give it access to your Google Drive data.
This section covers all the numpy and image basics that you need to get started.
NOTEBOOK: section_1_numpy_and_image.ipynb
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from PIL import Image
plt.imshow(image) # Most used command to show images throughout the lectures
This section covers opening and storing images, as well as drawing on images and mouse click events to draw on images.
import cv2
img = cv2.imread(DATA_DIR)
while True:
    cv2.imshow("image_name", img)
    if cv2.waitKey(1) & 0xFF == 27: # This stops the while loop when pressing the "esc" key
        break
cv2.destroyAllWindows() # This closes the window
# OpenCV and matplotlib have different order of the color channels
# matplotlib -> RGB: Red, Green, Blue
# OpenCV -> BGR: Blue, Green, Red
rgb_colored_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB) # conversion between color channels
# Read image in grayscale
img_grayscale = cv2.imread(DATA_DIR, cv2.IMREAD_GRAYSCALE)
# Resizing an image
resized_img = cv2.resize(rgb_colored_img, (1000, 400)) # fixed values
resized_img = cv2.resize(rgb_colored_img, (0, 0), fx=width_ratio, fy=height_ratio) # with a ratio
# flipping an image
cv2.flip(img, 0) # Around the x axis
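The channel order and the flip described in the comments above can be checked without opening a window. The following is a minimal sketch on a tiny synthetic array (the pixel values are made up for illustration). It assumes only NumPy and uses slicing in place of `cv2.cvtColor` and `cv2.flip`, which should give the same pixel values for a plain 3-channel image.

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV's BGR order: each pixel is (blue, green, red)
bgr_img = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# Reversing the last axis swaps the channel order from BGR to RGB
rgb_img = bgr_img[..., ::-1]

# Reversing the first axis (rows) flips around the x axis, like flip code 0
flipped_img = bgr_img[::-1]

print(rgb_img[0, 0])      # the pure-blue pixel, now in RGB order
print(flipped_img[0, 0])  # the former bottom-left pixel
```

If an image looks blue-tinted in matplotlib, a missing BGR to RGB conversion is a likely cause.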
TODO: Add description and links TODO: Add disadvantages
OpenCV offers API methods that can be used for object tracking. The API makes it easy to swap and replace single methods in order to change the tracking algorithm.
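As a rough illustration of how the tracking algorithm can be swapped, the sketch below looks up a tracker by name. The helper names `tracker_factory_name` and `create_tracker` are hypothetical and not part of OpenCV. The sketch assumes an OpenCV build that includes the tracking module (for example `opencv-contrib-python`), where factory functions such as `TrackerKCF_create` live either in `cv2` or in `cv2.legacy`, depending on the version.

```python
def tracker_factory_name(algorithm):
    """Build the name of the OpenCV factory function, e.g. "KCF" -> "TrackerKCF_create"."""
    return f"Tracker{algorithm}_create"


def create_tracker(algorithm):
    """Return a new tracker object for the given algorithm name (hypothetical helper)."""
    import cv2  # imported here so the name helper also works without OpenCV installed

    name = tracker_factory_name(algorithm)
    # Depending on the OpenCV version, a tracker lives in cv2 or in cv2.legacy
    for module in (cv2, getattr(cv2, "legacy", None)):
        factory = getattr(module, name, None)
        if factory is not None:
            return factory()
    raise ValueError(f"{name} is not available in this OpenCV build")
```

Changing the algorithm then only means changing the string, e.g. `create_tracker("MIL")` instead of `create_tracker("KCF")`. The returned object is used with `init(frame, bbox)` on the first frame and `update(frame)` on every following frame.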
The BOOSTING tracker is based on the AdaBoost algorithm (the same algorithm that HAAR cascades use). It is evaluated across multiple frames.
| Pros | Cons |
|---|---|
| Very well known and studied algorithm | Doesn't know when tracking has failed |
| | Better techniques available |
The MIL tracker is similar to BOOSTING, but it considers a neighborhood of points around the current location to create multiple instances.
| Pros | Cons |
|---|---|
| Good performance and doesn't drift as much as Boosting | Doesn't know when tracking has failed |
| | Can't recover from full obstruction |
The KCF tracker exploits some properties of the MIL tracker and considers the many overlapping points in the data. This leads to more accurate and faster tracking.
| Pros | Cons |
|---|---|
| Better performance than MIL and Boosting | Can't recover from full obstruction |
| Good first choice | |
The TLD tracker (Tracking, Learning, Detection) follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary.
| Pros | Cons |
|---|---|
| Good performance | Can provide many false positives |
| Can handle obstruction in frames | |
| Scale invariant (can deal with large changes in scale) | |
The MedianFlow tracker tracks the object both forward and backward in time and measures the discrepancies between these two trajectories.
| Pros | Cons |
|---|---|
| Good at reporting failed tracking | Fails under large motion |
| Works well with predictable motion (no objects that suddenly pop up) | |
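The forward and backward comparison can be sketched with a few made-up points. This is not OpenCV's implementation. It only illustrates the idea: a point that does not return close to its start position after being tracked forward and then backward is treated as unreliable. The function name, the sample coordinates and the median threshold are assumptions for illustration.

```python
import numpy as np


def forward_backward_error(start_points, back_tracked_points):
    """Euclidean distance between each start point and its back-tracked position."""
    start = np.asarray(start_points, dtype=float)
    back = np.asarray(back_tracked_points, dtype=float)
    return np.linalg.norm(start - back, axis=1)


# Hypothetical data: three points tracked forward and then backward again
start_points = [[10, 10], [20, 20], [30, 30]]
back_tracked_points = [[10, 10], [20, 21], [36, 38]]

errors = forward_backward_error(start_points, back_tracked_points)
reliable = errors <= np.median(errors)  # keep the more consistent points

print(errors)
print(reliable)
```

A large error on most points is a hint that tracking has failed, which matches the "good at reporting failed tracking" entry in the table.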
This section uses the Python Keras library.
Machine learning is a method of data analysis that automates analytical model building.
It uses algorithms that iteratively learn from data.
- Fraud detection
- real-time ads on web pages
- pricing models
- pattern and image recognition
- text sentiment analysis
- email spam filters
- ...
Supervised learning algorithms are trained with labelled example data: inputs for which the desired output is known.
Example: Pictures of a dog or a certain letter and the respective label.
Supervised learning is commonly used in applications where the historical data predicts likely future events.
- Data Acquisition: Getting the data that is needed. Here open source data can be used
- Cleaning up: Resizing,...
- Splitting data into test & training data
- Training & Building: Apply the training and build a model for prediction
- Model testing: Apply the built model to the test data and see if the classification worked out well
- Model deployment: Use the trained model for data that was not part of the training & test data set
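The splitting step from the list above can be sketched in plain Python. The function name `split_train_test`, the 30 % test share, the seed and the file names are assumptions for illustration. Libraries such as scikit-learn offer ready-made helpers for the same job.

```python
import random


def split_train_test(samples, test_fraction=0.3, seed=42):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labelled data: (file name, label) pairs
data = [(f"img_{i}.jpg", i % 2) for i in range(10)]
train, test = split_train_test(data)

print(len(train), len(test))
```

Shuffling before the split avoids ending up with only one class in the test set when the data is stored sorted by label.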
- Accuracy
- Recall
- Precision
- F1-Score
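The four metrics listed above can be computed from the counts of true and false positives and negatives. The following is a minimal sketch for binary labels. The function name and the example labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)

print(metrics)
```

Accuracy alone can be misleading on unbalanced data, which is why precision, recall and their harmonic mean (F1) are usually reported alongside it.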
Tensors are needed to feed images into the network. Tensors are N-dimensional arrays that build up from:
- scalar values (e.g. 3)
- Vectors (e.g. [3,4,5])
- Matrix (e.g. [[3,4,5], [4,5,6]])
- Tensor (e.g. [[[3,4,5], [4,5,6]], [[2,1,5], [9,7,6]]]) -> Matrix of Matrices (Higher dimensional arrays)
Tensors make it convenient to feed a set of images into our models as one array with the dimensions (Image, Height, Width, Color), i.e. multiple image arrays stacked together.
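The dimensions described above can be inspected with NumPy. This is a small sketch. The image size of 28x32 pixels and the batch of 4 blank images are arbitrary choices for illustration.

```python
import numpy as np

scalar = np.array(3)
vector = np.array([3, 4, 5])
matrix = np.array([[3, 4, 5], [4, 5, 6]])
tensor = np.array([[[3, 4, 5], [4, 5, 6]], [[2, 1, 5], [9, 7, 6]]])

# ndim is the number of dimensions, shape the size along each dimension
for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, value.ndim, value.shape)

# Hypothetical batch: 4 blank color images, each 28 pixels high and 32 pixels wide
images = [np.zeros((28, 32, 3), dtype=np.uint8) for _ in range(4)]
batch = np.stack(images)  # dimensions: (image, height, width, color)

print(batch.shape)
```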
TODO
A filter (kernel), e.g. of size 2x2, is moved across the image to build up a new image. For example, the filter [[2,1],[-1,1]] is applied at every position: the 4 pixels under the filter are multiplied element-wise by the filter values and summed up into one pixel of the new image.
This removes a lot of information, which reduces computation time considerably.
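The filter step described above can be sketched in NumPy with the same 2x2 filter. The function name `apply_filter` and the 3x3 example image are made up for illustration. Note that, as in most deep learning libraries, the filter is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np


def apply_filter(image, kernel, stride=1):
    """Slide the kernel over the image without padding.

    Each output pixel is the sum of the element-wise products of the
    kernel and the image patch underneath it.
    """
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    output = np.zeros((out_h, out_w), dtype=float)
    for row in range(out_h):
        for col in range(out_w):
            top, left = row * stride, col * stride
            patch = image[top:top + kh, left:left + kw]
            output[row, col] = np.sum(patch * kernel)
    return output


image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
kernel = np.array([[2, 1],
                   [-1, 1]])

result = apply_filter(image, kernel)
print(result)
```

With a stride of 1 the 3x3 image shrinks to 2x2. A larger stride shrinks the output further, which is where most of the saved computation comes from.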
With dropout, units are randomly dropped along with their connections during training.
It is used as regularization to help prevent overfitting, because it prevents units from co-adapting too much.
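The random dropping of units can be sketched in NumPy. This is the commonly used "inverted dropout" variant, where the kept units are scaled up during training so that nothing has to change at prediction time. It is an illustration, not the code of a specific library, and the function name, the rate of 0.5 and the seed are assumptions.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Zero out a random fraction `rate` of the units and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # nothing is dropped outside of training
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True for units that are kept
    return activations * mask / keep_prob


rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
activations = np.ones((4, 5))
dropped = dropout(activations, rate=0.5, rng=rng)

print(dropped)
```

In Keras the same idea is available as a ready-made `Dropout` layer, so a hand-written version like this is rarely needed in practice.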
- LeNet-5
- AlexNet
- GoogLeNet
- ResNet