Disclaimer - This workshop was made for the $PATH to iNTUition workshop series leading to iNTUition v3.0.
OpenCV is an open source library of pre-implemented computer vision algorithms. OpenCV is written in C++ and was originally meant to be used from C++, but bindings are now available for Python, Java, MATLAB and even Node.js.
Clone this repository or download the zip file from GitHub.
- Install Python 2.7 (for OSX and Linux users, Python 2.7 is usually installed by default).
- Add Python to your system path. (For Windows users, navigate to This PC -> System Properties or Settings -> Advanced System Settings -> Environment Variables -> Path -> Edit and then add `C:\Python27` and `C:\Python27\Scripts` to the Path.)
- Restart your terminal or command prompt.
- On your terminal/command prompt, enter `pip install numpy` to install numpy.
- For OpenCV install instructions, refer here.
- (Additional) Install a text editor such as Sublime Text to edit your code.
- (Additional) For Windows users, I would advise downloading a bash shell such as Git Bash.
- Navigate to the directory with the script in your command prompt or terminal, e.g. `cd <Path to Repository>/Task\ 1`.
- Run on command prompt or terminal with `python <script>.py`.
Images are seen by computers as a collection of pixels. An RGB image, which appears to us as a colored image, is seen by a computer as a collection of pixels whose values range from 0 to 255 (for the usual 8-bit images). An RGB image has 3 components for each pixel denoting Red, Green and Blue. A grayscaled image, on the other hand, has 1 component per pixel denoting intensity, from 0 (black) to 255 (white).
Computer vision makes use of these pixel values to draw conclusions about an image. A simple example: a transition between neighboring pixels from a value of 0 to 1 is not very significant, whereas a transition from 0 to 255 could be considered an edge.
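The idea of "a large jump between neighbors is an edge" can be sketched with plain numpy. The pixel row and the `EDGE_THRESHOLD` value below are made up for illustration; real edge detectors are more careful than this.

```python
import numpy as np

# One row of an 8-bit grayscaled image: dark pixels followed by bright ones.
row = np.array([0, 1, 0, 2, 255, 254, 255], dtype=np.uint8)

# Cast to a signed type first so the subtraction cannot wrap around.
jumps = np.abs(np.diff(row.astype(np.int16)))

# EDGE_THRESHOLD is an illustrative value, not an OpenCV constant.
EDGE_THRESHOLD = 100
edge_positions = np.where(jumps > EDGE_THRESHOLD)[0]

print("jumps between neighbors: %s" % jumps.tolist())
print("edge starts after pixel index: %s" % edge_positions.tolist())
```

Only the jump from 2 to 255 exceeds the threshold, so only that position is reported.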
Write a program to simply read and display an image with OpenCV. Then use the `cv2.imwrite()` function to store this image in your directory.
Write a program to capture a video and display it with OpenCV. In addition, write the grayscaled frames to a file. Try to read this file back in later.
Let us explore drawing functions. OpenCV allows you to draw on images with drawing functions. The options are lines, circles, rectangles, polygons and text.
Exercise - Recreate the Olympic rings as closely as possible!
Let us now explore Image Operations. OpenCV-Python stores images as numpy arrays. These arrays have several properties that tell you useful things about the image.
- Shape of the image returns the number of rows, columns and channels. A grayscaled image has no channels entry (its shape is just rows and columns), so the shape can tell you whether the image is grayscaled or not.
- Size of the image returns the total number of array elements, i.e. rows x columns x channels (this equals the number of pixels only for a grayscaled image).
- Dtype of the image returns the datatype of the pixel values.
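These three properties can be inspected on small stand-in arrays. The array sizes here are arbitrary; an image loaded with `cv2.imread` behaves the same way.

```python
import numpy as np

color = np.zeros((4, 6, 3), dtype=np.uint8)  # 4 rows, 6 columns, 3 channels
gray = np.zeros((4, 6), dtype=np.uint8)      # 4 rows, 6 columns, no channel axis

print("color shape: %s" % (color.shape,))
print("gray shape: %s" % (gray.shape,))
print("color size: %d" % color.size)  # counts every element: 4 * 6 * 3
print("gray size: %d" % gray.size)    # 4 * 6
print("dtype: %s" % color.dtype)

# A two-entry shape means there is a single channel.
is_gray = len(gray.shape) == 2
```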
You can also pick a region of interest from this image.
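Because the image is a numpy array, a region of interest is just a slice, with rows first and columns second. The small array below stands in for an image. One thing to be aware of is that a slice is a view into the original array, so `.copy()` is needed for an independent region.

```python
import numpy as np

# Stand-in 6x8 grayscaled image whose pixel values are 0, 1, 2, ...
image = np.arange(48, dtype=np.uint8).reshape(6, 8)

# Rows 1 to 3 and columns 2 to 5 (the end index is exclusive).
roi = image[1:4, 2:6]

# Take an independent copy before modifying anything.
roi_copy = roi.copy()

# Writing through the slice changes the original image as well.
roi[0, 0] = 255
```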
Another operation is an Image Blend, which combines two images of the same size as a weighted sum.
Exercise - Try to overwrite a region of the image with a black image.
Color spaces are an important concept in image processing. A few commonly known color spaces are RGB, HSV and GRAY.
Color maps let you map the pixel values of a grayscaled image to colors. A few known color maps are RAINBOW, AUTUMN and so on.
Exercise - Try to get the image looking close to a thermal camera output.
Canny Edge Detection is a commonly used computer vision algorithm to detect edges in images. It looks at transitions from black to white in a grayscaled image; where neighboring pixels along a continuous curve show the same strong transition, that curve is marked as an edge.
The Hough Transform is another commonly used computer vision algorithm to detect lines or circles in an image. The Hough Line Transform makes use of the parametric equation of a line to determine whether the edge pixels lie on a line or not.
Exercise - Find an image online with two circles. Detect the two circles with OpenCV and connect the centers with a line.
Haar Cascades are based on a paper by Viola and Jones on Rapid Object Detection. This algorithm can be used to train models to detect different types of objects.
Exercise - Extend the detector to detect eyes. Then, extend the code to detect faces and eyes in a video.
If you have any questions or feedback on the workshop, write to me at nikv96@gmail.com. I'd love to hear what you think about this workshop. :)
If you find any mistakes/typos/bugs, please post an issue or create a pull request and I'll take a look at it! Thanks!