![](https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/workshop-banner.png)

# Introduction to computer vision with Python
> This workshop introduces several computer vision fundamentals.

The notebook source is located in [the GitHub repository](https://github.com/RohanGautam/intro-to-opencv-workshop/blob/master/notebook/cv_workshop.ipynb).

Some resources you might find helpful:
- [Intro to colab](https://colab.research.google.com/notebooks/welcome.ipynb)
- [Intro to python](https://github.com/RohanGautam/intro-to-python-workshop)

## Libraries we'll be using 
- [Numpy](https://github.com/numpy/numpy) has powerful multidimensional arrays, and is a fundamental package for scientific computing with Python.
- [Matplotlib](https://github.com/matplotlib/matplotlib) is a comprehensive library for creating static, animated, and interactive visualizations in Python.
- [Skimage](https://scikit-image.org/) (scikit-image) is a collection of algorithms for working with images.
- [OpenCV](https://docs.opencv.org/4.5.2/d6/d00/tutorial_py_root.html) is an open source computer vision library, used here through its Python bindings.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2 
from skimage import io

## Loading, Displaying and Saving

Let's load some images! Here, we'll be loading (downloading) images from URLs. We could also have read the images locally from our filesystem.


In [None]:
puppy_url = "https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/doggo.jpg"
image_rgb = io.imread(puppy_url) # could have also been a path to an image in your filesystem

# for historical reasons, opencv uses BGR for everything, so we have to convert it
image = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2BGR)

In [None]:
def show_cv_image(image):
    """Utility function to show an OpenCV image in a Jupyter notebook.
    On your own system, you can use the `cv2.imshow` method instead. matplotlib expects RGB,
    so we convert OpenCV's BGR image to RGB before showing it, which keeps the colors correct."""
    if image.ndim == 2:
        # single-channel (grayscale) image: there are no channels to reorder
        plt.imshow(image, cmap="gray", vmin=0, vmax=255)
    else:
        # convert BGR to RGB (reorder the image channels), then show
        plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

In [None]:
show_cv_image(image)

In [None]:
show_cv_image(cv2.hconcat((image_rgb, image)))

In [None]:
cv2.imwrite("puppy.jpg", image)

In [None]:
# EXERCISE
# Find an image on Google Images, and get its URL by right-clicking > "Copy image address" (Chrome) / "Copy image link" (Firefox).
# Load it up in this cell, convert the image to OpenCV's BGR format, and show it using the `show_cv_image`
# convenience function!
# Assign it to the variable name `my_image`. We'll use it in further exercises below!

## Image basics

### Pixels
An image is a set of pixels. It doesn't get finer-grained than that. You can think of an image as a grid of pixels.

Pixels are usually represented in one of two ways:
- Color : In the RGB color space, each pixel consists of 3 numbers, one each for red, green, and blue. Each of them ranges from 0 to 255.
- Grayscale : In a grayscale image, each pixel is represented by one number between 0 and 255. 0 for black, 255 for white.

Some examples of a single pixel color in the RGB color space:
- (255,255,255) : We fill up all the buckets for white
- (0,0,0) : Empty the buckets for black
- (255,0,0): Fill up only the red bucket for pure red

Start to see the pattern? We get all the colors by tweaking the RGB values.
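
To make the "buckets" idea concrete, here is a small sketch that builds a 1x3 RGB image by hand with NumPy. The variable name `tiny_rgb` is made up for illustration and is not used elsewhere in the notebook.

```python
import numpy as np

# A tiny 1x3 RGB image: one row, three pixels, three channels per pixel.
# The uint8 dtype is what limits every channel to the range 0..255.
tiny_rgb = np.array([[
    (255, 255, 255),  # white: every bucket full
    (0, 0, 0),        # black: every bucket empty
    (255, 0, 0),      # pure red: only the red bucket full
]], dtype=np.uint8)

print(tiny_rgb.shape)           # (1, 3, 3) -> (height, width, channels)
print(tiny_rgb[0, 2].tolist())  # [255, 0, 0], the red pixel
```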

---
> RGB is the most common channel order, so why does OpenCV use BGR? The reason is historical: BGR ordering was popular among camera manufacturers and software providers when OpenCV was first developed, so that is what it adopted. By the time the convention changed, switching the default would have broken a lot of programs for people already using OpenCV. So BGR stuck 🤷
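
For 3-channel images, RGB and BGR hold the same numbers in a different order, so converting between them amounts to reversing the channel axis. A minimal NumPy sketch of that idea, using a made-up 1x1 image:

```python
import numpy as np

# One pure-red pixel stored in RGB order (hypothetical 1x1 image).
pixel_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last (channel) axis turns RGB into BGR, and BGR back into RGB.
pixel_bgr = pixel_rgb[..., ::-1]

print(pixel_rgb[0, 0].tolist())  # [255, 0, 0]
print(pixel_bgr[0, 0].tolist())  # [0, 0, 255]
```

This is why pure red is `(0, 0, 255)` whenever we hand a color to an OpenCV function.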

In [None]:
image

In [None]:
# `image` is a NumPy array, so it has attributes such as `shape` that we can access.
print(image.shape)
print(f"Width: {image.shape[1]}px")
print(f"Height: {image.shape[0]}px")
print(f"Channels: {image.shape[2]}")

### The image coordinate system
Understanding the image coordinate system is fundamental to how we will access pixels. The origin is at the top-left corner: `x` increases to the right and `y` increases downwards.

Note that it begins at zero, as Python is zero-indexed! So an 8x8 grid like the one below goes from (0,0) at the top-left corner to (7,7) at the bottom-right corner.
![](https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/image-coord-system.png)


In [None]:
image[0,0] # note that this is in BGR!
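
One thing that trips people up: a NumPy image is indexed as `[row, column]`, which means `[y, x]`, not `[x, y]`. A small sketch on a made-up 8x8 grayscale grid:

```python
import numpy as np

# Hypothetical 8x8 grayscale grid, all black to start with.
grid = np.zeros((8, 8), dtype=np.uint8)

x, y = 6, 1        # 7th column from the left, 2nd row from the top
grid[y, x] = 255   # NumPy indexing is [row, column], i.e. [y, x]

print(grid[1, 6])  # 255: the pixel we just set
print(grid[6, 1])  # 0: swapping the indices addresses a different pixel
```

OpenCV's drawing functions, on the other hand, take points as `(x, y)`, so it pays to double-check the order whenever you switch between the two.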

### Cropping and Drawing on images

In [None]:
# crop the image to 900px by 900px starting from the top-left corner by "slicing the array"
# the slice order is image[start_y:end_y, start_x:end_x]
show_cv_image(image[0:900, 0:900])
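
A slice of a NumPy array is a view: it shares memory with the original, so writing into a crop also changes the image it came from. The sketch below, on a made-up 4x4 array, shows the difference between a view and an explicit copy.

```python
import numpy as np

# Hypothetical 4x4 grayscale image filled with zeros.
img = np.zeros((4, 4), dtype=np.uint8)

view = img[0:2, 0:2]                # rows 0-1, columns 0-1: shares memory with img
independent = img[0:2, 0:2].copy()  # a separate array with its own memory

view[:] = 255       # also changes the top-left corner of img
independent[:] = 7  # img is unaffected

print(img[0, 0])                             # 255
print(np.shares_memory(img, view))           # True
print(np.shares_memory(img, independent))    # False
```

This is the reason for working on a copy of the image before modifying pixels.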

In [None]:
from copy import deepcopy as copy
img_copy = copy(image)

# directly modifying pixels of `img_copy`
img_copy[0:900, 0:900] = (0,255,0) #(blue, GREEN, red)

# "horizontally concatenate" the original and modified image to show the change
show_cv_image(cv2.hconcat((image, img_copy)))


In [None]:
# making a diagonal on an image
diagonal_len = 900
img_copy = copy(image)
for i in range(diagonal_len):
    img_copy[i,i] = (255,0,0)
show_cv_image(cv2.hconcat((image, img_copy)))
    

OpenCV also has drawing utilities if we want to draw other basic shapes, lines, etc. 

In [None]:
img_line = copy(image)
img_circle = copy(image)
img_rectangle = copy(image)

green = (0,255,0)
blue = (255,0,0)
red = (0,0,255)

cv2.line(img_line, (0,0), (900,900), green, 20)
cv2.rectangle(img_rectangle, (200,200), (600,900), blue, 20)
cv2.circle(img_circle, (500,500), 200, red, 20)

show_cv_image(cv2.hconcat((image, img_line, img_circle, img_rectangle)))

We should be proud, we just learnt how to access and manipulate pixels to our liking! Let's do an exercise before we carry on.

In [None]:
# EXERCISE
# In our first exercise, you downloaded and displayed your own image, called `my_image`. This is where we use it!
# 1. Print the pixel at index [0,0].
# 2. Draw a circle anywhere on your image using the opencv utility!

_SELF EXPLORATION_: Challenge yourself with this problem! Add some cells below and get to making some \~art\~.
In the picture below, the following are randomly generated:
- The center of the circle
- The radius of the circle
- The color of the circle (a random number between 0-255 for B, G and R each)

This gives you a unique piece of art every time you run your code!

Hint: Python has a built-in module called `random` which is used to generate random numbers. `random.randint` might be a useful function to check out.

Hint: You can create a blank canvas of size 300x300 like so : `canvas = np.zeros((300, 300, 3), dtype = "uint8")`. Execute and play with it to see how the array is created! Don't hesitate to google if you're confused. Googling is a big part of learning how to program.
![](https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/art-circles.png)

## Image processing
We’ll start off with basic image transformations. Then, we’ll explore other types of image processing techniques, including image arithmetic, bitwise operations, and masking.

### Image Transformations
We'll take a look at basic image transformations like translation, rotation, and so on.

#### Translation
The shifting of an image along the `x` and `y` axis.

We use [Affine transformations](https://docs.opencv.org/3.4/d4/d61/tutorial_warp_affine.html) to carry out these transformations. An Affine transformation is a transformation of the image that preserves lines and parallelism (but not necessarily distances and angles). 

![](https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/affine_matrix.png)
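
In other words, an affine transform sends each point `(x, y)` to `A @ (x, y) + b`, where `A` is a 2x2 matrix and `b` is a 2-element offset, usually packed together as one 2x3 matrix. The sketch below applies such a matrix to a single point; `apply_affine` is a hypothetical helper written for illustration, not an OpenCV function.

```python
import numpy as np

def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix [A | b]."""
    A = matrix[:, :2]  # 2x2 linear part (rotation, scale, shear)
    b = matrix[:, 2]   # offset added after the linear part
    return A @ np.asarray(point, dtype=np.float64) + b

# Identity for A plus a shift of 200 px along both x and y.
translation = np.array([[1, 0, 200],
                        [0, 1, 200]], dtype=np.float64)

print(apply_affine(translation, (10, 20)).tolist())  # [210.0, 220.0]
```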

In [None]:
x = 200
y = 200

h = image.shape[0]
w = image.shape[1]

# The matrix below embeds a 2x2 identity matrix (`A`, so the image is not rotated or scaled) and a 2x1 column `b`,
# which is the offset added to every pixel's (x, y) position.
matrix = np.float32([
    [1, 0, x],
    [0, 1, y]
])

# pass the image, matrix, and the (width,height)
shifted = cv2.warpAffine(image, matrix, (w,h))
show_cv_image(shifted)

In [None]:
# Exercise: experiment with negative x and y values in the transform matrix above and see what happens!

#### Rotation

Rotation turns an image about a point by a given angle.

In [None]:
h = image.shape[0]
w = image.shape[1]

center = (w//2, h//2)
# can also specify the scale of an image. Note that we don't have to manually define the matrix
rotation_matrix = cv2.getRotationMatrix2D(center, 45, 1)
rotated = cv2.warpAffine(image, rotation_matrix, (w,h))
show_cv_image(rotated)
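
If you are curious what a rotation matrix looks like inside, the sketch below builds one by hand, following the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D` as I understand it. The function name `rotation_matrix_2d` is hypothetical, and the center `(100, 50)` is an arbitrary example value.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build a 2x3 matrix that rotates about `center` by `angle_deg` degrees and scales."""
    cx, cy = center
    angle = np.deg2rad(angle_deg)
    alpha = scale * np.cos(angle)
    beta = scale * np.sin(angle)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

M = rotation_matrix_2d((100, 50), 45, 1.0)

# The center of rotation is the one point that must stay where it is.
center_after = M @ np.array([100, 50, 1.0])
print(np.allclose(center_after, [100, 50]))  # True
```

The third column is what keeps the chosen center fixed; without it, the image would rotate about the top-left corner instead.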

In [None]:
# EXERCISE: Rotate your image (`my_image`) by 75 degrees!

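Not every image operation is done through `cv2.warpAffine`. Cropping and blurring, for example, are not affine transforms, and OpenCV also provides dedicated functions for common operations such as flipping and resizing. Some examples are below!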

In [None]:
# 1 flips horizontally (left-right), 0 flips vertically (top-bottom), and -1 does both
flipped = cv2.flip(image, 1)
show_cv_image(flipped)
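
Flipping just reverses the order of rows or columns, so the same effect can be sketched with NumPy slicing. The 2x3 array below is made up so the result is easy to read.

```python
import numpy as np

# Hypothetical 2x3 grayscale image.
img = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)

horizontal = img[:, ::-1]  # mirror left-right, like flip code 1
vertical = img[::-1, :]    # mirror top-bottom, like flip code 0
both = img[::-1, ::-1]     # like flip code -1

print(horizontal.tolist())  # [[3, 2, 1], [6, 5, 4]]
print(vertical.tolist())    # [[4, 5, 6], [1, 2, 3]]
print(both.tolist())        # [[6, 5, 4], [3, 2, 1]]
```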

In [None]:
# you'll have to write custom logic to preserve aspect ratios. Preserving them will be a self exploration exercise!
resized = cv2.resize(image, (500, 500), interpolation = cv2.INTER_AREA)
show_cv_image(resized)
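
As a starting point for the aspect-ratio exercise, here is one possible way to work out the target size from a desired width. `size_for_width` is a hypothetical helper, and the example dimensions are made up.

```python
def size_for_width(original_width, original_height, target_width):
    """Return (width, height) with the target width and the original aspect ratio."""
    ratio = target_width / original_width
    # Never let the height collapse to zero for very wide images.
    return target_width, max(1, round(original_height * ratio))

print(size_for_width(1200, 900, 500))  # (500, 375)
```

The result is in `(width, height)` order, which is the order `cv2.resize` expects for its size argument.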

### Image arithmetic

#### Addition and subtraction

In [None]:
# a matrix with 100 as every element
M = np.ones(image.shape, dtype="uint8") * 100
M

In [None]:
image.shape == M.shape

In [None]:
added = cv2.add(image, M) # brings the pixels closer to white. If value goes above 255, it clips it and keeps it at 255.
subtracted = cv2.subtract(image, M) # brings pixels closer to black

show_cv_image(cv2.hconcat((image, added, subtracted)))
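
Why use `cv2.add` instead of a plain `+`? NumPy's `uint8` arithmetic wraps around past 255 instead of clipping, which turns bright pixels dark. The sketch below shows the difference on two made-up values and emulates the clipping behavior with NumPy alone.

```python
import numpy as np

a = np.array([200, 50], dtype=np.uint8)
b = np.array([100, 100], dtype=np.uint8)

wrapped = a + b  # plain uint8 arithmetic wraps around modulo 256

# Emulate saturating arithmetic: widen the type, compute, clip, narrow again.
saturated_add = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)
saturated_sub = np.clip(a.astype(np.int16) - b, 0, 255).astype(np.uint8)

print(wrapped.tolist())        # [44, 150]
print(saturated_add.tolist())  # [255, 150]
print(saturated_sub.tolist())  # [100, 0]
```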


### Masks
Now we are ready to explore masking, an extremely powerful and useful technique in computer vision and image processing.

It helps us focus only on the areas that interest us.

Let's look at an example directly.

In [None]:
# make an all-black canvas to draw a mask on. Note that it has no channel axis, as it's a grayscale image.
mask = np.zeros(image.shape[:2], dtype="uint8") 
# calculate the image center
(cX, cY) = (image.shape[1] // 2, image.shape[0] // 2) 
# pass in the mask, the start and end points of the rectangle's diagonal, the color (255 is white), and the mode
# (-1) means fill the rectangle
cv2.rectangle(mask, (cX-200, cY-200), (cX+200, cY+200), 255, -1)
show_cv_image(mask)

In [None]:
mask.shape

In [None]:
img_copy = copy(image)
# a "bitwise and" keeps the area corresponding to the white pixels in the mask
masked_image = cv2.bitwise_and(img_copy,img_copy, mask=mask)
# show_cv_image(img_copy)
show_cv_image(cv2.hconcat((img_copy, masked_image)))
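
Conceptually, masking means "keep the pixel where the mask is white, otherwise make it black". Here is a NumPy-only sketch of that idea on a made-up 4x4 image. It illustrates the concept and is not how OpenCV implements it.

```python
import numpy as np

# Hypothetical 4x4 BGR image where every pixel is (10, 20, 30).
img = np.full((4, 4, 3), (10, 20, 30), dtype=np.uint8)

# Single-channel mask: white (255) in the central 2x2 block, black elsewhere.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255

# Keep pixels where the mask is non-zero and set the rest to black.
# mask[..., None] adds a channel axis so the mask broadcasts over B, G and R.
masked = np.where(mask[..., None] > 0, img, 0).astype(np.uint8)

print(masked[1, 1].tolist())  # [10, 20, 30]
print(masked[0, 0].tolist())  # [0, 0, 0]
```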

In [None]:
img_copy = copy(image)
flipped_mask = cv2.bitwise_not(mask)  # invert the mask without overwriting `mask` itself
# with the inverted mask, the "bitwise and" keeps everything outside the rectangle
masked_image = cv2.bitwise_and(img_copy,img_copy, mask=flipped_mask)
# show_cv_image(img_copy)
show_cv_image(cv2.hconcat((img_copy, masked_image)))

Masks might not seem interesting right now, but they help us focus our computation to specific parts of an image. This will all come together soon!

## Mini project! Counting coins
![](https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/coins.png)

This is the process we'll be following.
- We'll first blur the image to smooth out the fine detailing on the coins. This makes the next step easier.
- Edge detection! Blurring helps us avoid detecting unnecessary edges.
- Find the closed curves, known as "contours".


In [None]:
# get the image 
coins_rgb = io.imread("https://raw.githubusercontent.com/RohanGautam/intro-to-opencv-workshop/master/assets/coins.png")
coins_bgr = cv2.cvtColor(coins_rgb, cv2.COLOR_RGB2BGR) # convert RGB to OpenCV's BGR order
coins = cv2.cvtColor(coins_bgr, cv2.COLOR_BGR2GRAY) # convert it to grayscale
show_cv_image(coins)
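
Grayscale conversion is a weighted sum of the three channels, with green counting the most because our eyes are most sensitive to it. The sketch below uses the commonly documented weights (0.299 for red, 0.587 for green, 0.114 for blue). `bgr_to_gray` is a hypothetical helper, and its output may differ from OpenCV's by a rounding step.

```python
import numpy as np

def bgr_to_gray(bgr):
    """Weighted channel sum; expects a uint8 array of shape (H, W, 3) in BGR order."""
    weights = np.array([0.114, 0.587, 0.299])  # B, G, R
    return np.clip(np.rint(bgr @ weights), 0, 255).astype(np.uint8)

# Made-up 1x3 BGR image: white, black and pure red.
pixels = np.array([[[255, 255, 255], [0, 0, 0], [0, 0, 255]]], dtype=np.uint8)
print(bgr_to_gray(pixels).tolist())  # [[255, 0, 76]]
```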

In [None]:
# (5,5) is the kernel size.
# A gaussian blur works by replacing each pixel with the weighted mean of a 5x5 pixel region around it.
blurred = cv2.GaussianBlur(coins, (5,5), 0)
show_cv_image(cv2.hconcat((coins, blurred)))
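
To see what "weighted mean" refers to, the sketch below builds a 5x5 Gaussian kernel by hand. The value `sigma=1.0` is an assumption made for illustration. Passing `0` to `cv2.GaussianBlur` lets OpenCV choose its own sigma from the kernel size, so these weights will not match OpenCV's exactly.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized size x size Gaussian kernel (sigma is an assumed value)."""
    offsets = np.arange(size) - (size - 1) / 2  # e.g. [-2, -1, 0, 1, 2]
    weights_1d = np.exp(-(offsets ** 2) / (2 * sigma ** 2))
    kernel = np.outer(weights_1d, weights_1d)   # 2D kernel = outer product of the 1D weights
    return kernel / kernel.sum()                # normalize so the weights sum to 1

kernel = gaussian_kernel()
print(kernel.shape)                           # (5, 5)
print(bool(np.isclose(kernel.sum(), 1.0)))    # True
print(bool(kernel[2, 2] == kernel.max()))     # True: the center pixel gets the largest weight
```

Each blurred pixel is the sum of its 5x5 neighborhood multiplied element by element with this kernel, so nearby pixels contribute more than distant ones.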

In [None]:
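# Canny takes two thresholds: threshold1 (30) and threshold2 (150).
# Any gradient value larger than threshold2 is considered to be an edge.
# Any value below threshold1 is considered not to be an edge.
# Values in between count as edges only if they are connected to a strong edge.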
canny = cv2.Canny(blurred, 30, 150)
show_cv_image(cv2.hconcat((blurred, canny)))
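
The two thresholds split gradient values into three groups. The sketch below shows only that classification step, with a hypothetical helper `classify_gradients` and made-up gradient values. The real Canny algorithm also thins the edges and follows weak pixels to check whether they touch a strong one, and the exact handling of values that equal a threshold is an assumption here.

```python
import numpy as np

def classify_gradients(magnitude, low=30, high=150):
    """Label each gradient magnitude as 'strong', 'weak' or 'none'."""
    labels = np.full(magnitude.shape, "none", dtype=object)
    labels[magnitude >= low] = "weak"    # kept only if linked to a strong edge
    labels[magnitude > high] = "strong"  # always treated as an edge
    return labels

magnitudes = np.array([10, 90, 200])
print(classify_gradients(magnitudes).tolist())  # ['none', 'weak', 'strong']
```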

In [None]:
# find the contours, ie, the closed curves in the canny output picture.
# A contour is a curve of points, with no
# gaps in the curve. Contours are extremely useful for such
# things as shape approximation and analysis.
(contours, hierarchy) = cv2.findContours(canny.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# we want to draw contours on top of the original(BGR) image
contours_img = coins_bgr.copy() 
for i in range(len(contours)):
    # draw contours of index i
    cv2.drawContours(contours_img, contours, i, (0,255,0), 2)
show_cv_image(contours_img)

In [None]:
## and drum roll ....
print(f"I count {len(contours)} coins!!")

# nice.

---
#### How you might continue...

In [None]:
# create a mask
mask = np.zeros(coins_bgr.shape[:2], dtype="uint8")
for c in (contours):
    cv2.fillPoly(mask, pts =[c], color=(255,255,255))
show_cv_image(mask)

In [None]:
show_cv_image(cv2.bitwise_and(coins_bgr, coins_bgr, mask=mask))

... and keep building on top of this, depending on your application!