From d7a7e94ef7297a0143256d53fce0226271997427 Mon Sep 17 00:00:00 2001 From: Abhijeet Date: Mon, 14 Jul 2014 19:10:14 +0530 Subject: [PATCH 1/3] ANPR notebook added --- .../computer_vision/ANPR.ipynb | 1066 +++++++++++++++++ 1 file changed, 1066 insertions(+) create mode 100644 doc/ipython-notebooks/computer_vision/ANPR.ipynb diff --git a/doc/ipython-notebooks/computer_vision/ANPR.ipynb b/doc/ipython-notebooks/computer_vision/ANPR.ipynb new file mode 100644 index 00000000000..a1a9577428a --- /dev/null +++ b/doc/ipython-notebooks/computer_vision/ANPR.ipynb @@ -0,0 +1,1066 @@ +{ + "metadata": { + "name": "", + "signature": "sha256:0224f014906318b507669d07c79b43a19f6049e114bfa8ff5121c58f4ab1dcd7" + }, + "nbformat": 3, + "nbformat_minor": 0, + "worksheets": [ + { + "cells": [ + { + "cell_type": "heading", + "level": 1, + "metadata": {}, + "source": [ + "Automatic Number Plate Recognition in Shogun" + ] + }, + { + "cell_type": "heading", + "level": 4, + "metadata": {}, + "source": [ + "By Abhijeet Kislay (GitHub ID: kislayabhi)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This notebook is about performing ANPR to detect automobile license plates in photographs taken between 1-2 metres from a car. We will be introduced to techniques related to image segmentation, feature extraction, pattern recognition , and two important pattern recogntion algorithms SVM and ANN. " + ] + }, + { + "cell_type": "heading", + "level": 2, + "metadata": {}, + "source": [ + "Background" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***Automatic Number Plate Recognition (ANPR)***, is a surveillance method that uses ***Optical Character Recognition (OCR)*** and other methods such as segmentations and\n", + "detection to read vehicle registration plates.\n", + "\n", + "The best results in an **ANPR** system can be obtained with an infrared (IR) camera,\n", + "because the segmentation steps for detection and OCR segmentation are easy, clean,\n", + "and minimize errors. Sadly, we do not use IR photographs here. That is we are going to try and get same results with regular photographs only!\n", + "\n", + "Each country has different license plate sizes and specifications; it is useful to know\n", + "these specifications in order to get the best results and reduce errors. The algorithms\n", + "used here are intended to explain the basics of **ANPR** and the specifications\n", + "for license plates from **Croatia** (Why? because I found it's license plate database fairly easily on the internet), but we can extend them to any country or specification.(check references for the link)\n", + "\n", + "I have loaded few of those plates here.\n", + "\n", + "\n", + "\n", + "The whole algorithmic approach is structured as following:\n", + "1. **Plate Detection**\n", + "2. **Segmentation**\n", + "3. **Classification**\n", + "4. **Plate Recognition**\n", + "5. **OCR Segmentation**\n", + "6. **Feature Extraction**\n", + "7. **OCR Classification**\n", + "\n", + "We will go through each of these steps one by one, covering all glitches and glory and prospects that may further enhance this framework in future." + ] + }, + { + "cell_type": "heading", + "level": 2, + "metadata": {}, + "source": [ + "1. Plate Detection" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First load a car image. 
It's better to see what we are dealing here" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "#Use the following function when reading an image through OpenCV and displaying through plt.\n", + "def showfig(image, ucmap):\n", + " #There is a difference in pixel ordering in OpenCV and Matplotlib.\n", + " #OpenCV follows BGR order, while matplotlib likely follows RGB order.\n", + " if len(image.shape)==3 :\n", + " b,g,r = cv2.split(image) # get b,g,r\n", + " image = cv2.merge([r,g,b]) # switch it to rgb\n", + " imgplot=plt.imshow(image, ucmap)\n", + " imgplot.axes.get_xaxis().set_visible(False)\n", + " imgplot.axes.get_yaxis().set_visible(False)\n", + " \n", + " \n", + "import matplotlib.pyplot as plt\n", + "import cv2\n", + "import numpy as np\n", + "%matplotlib inline \n", + "\n", + "plt.rcParams['figure.figsize'] = 10, 10 \n", + "\n", + "# Actual Code starts here\n", + "plt.title('Sample Car')\n", + "image_path=\"../../../data/ANPR/sample_2.jpg\"\n", + "carsample=cv2.imread(image_path)\n", + "showfig(carsample,None)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this step we have to detect all the plates in the current camera frame. Two broad categories in which they can be defined are:\n", + "\n", + "1. ** Segmentation**\n", + "2. **Classification**" + ] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "Segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Segmentation is the process of dividing an image into multiple segments. This\n", + "process is to simplify the image for analysis and make feature extraction easier.\n", + "\n", + "One important feature that we can exploit from Number plates are the high number of vertical edges. But before that, we need to do some handy preprocessing of the current image namely:\n", + "\n", + "1. **grayscale conversion** : color won't help us in this task\n", + "2. **Remove Noise** : A 5x5 Gaussian blur to remove unwanted vertical edges\n", + "\n", + "To find the vertical edges, we will use a Sobel filter and find its first horizontal derivative." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "plt.rcParams['figure.figsize'] = 7,7\n", + "\n", + "# convert into grayscale\n", + "gray_carsample=cv2.cvtColor(carsample, cv2.COLOR_BGR2GRAY)\n", + "showfig(gray_carsample, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "# blur the image\n", + "blur=cv2.GaussianBlur(gray_carsample,(5,5),0)\n", + "showfig(blur, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "# find the sobel gradient. 
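we take the first derivative in x (dx=1, dy=0), which responds to vertical\n",
+    "# strokes such as plate characters. cv2.CV_8U keeps the output 8-bit, which\n",
+    "# the Otsu thresholding in the next step expects. We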
use the kernel size to be 3\n", + "sobelx=cv2.Sobel(blur, cv2.CV_8U, 1, 0, ksize=3)\n", + "showfig(sobelx, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After a Sobel Filter, we apply a threshold filter to obtain a binary image with a threshold value obtained through Otsu's Method.Otsu's algorithm needs an 8-bit input\n", + "image and Otsu's method automatically determines the optimal threshold value:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "#Otsu thresholding\n", + "_,th2=cv2.threshold(sobelx, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)\n", + "showfig(th2, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "By applying a close morphological operation, we can remove blank spaces between\n", + "each vertical edge line, and connect all regions that have a high number of edges. In\n", + "this step we have the possible regions that can contain plates.\n", + "\n", + "First we define our structural element to use in our morphological operation. We will\n", + "use the **getStructuringElement()** function to define a structural rectangular element\n", + "with a 23 x 2 dimension size in our case; this may be different in other image sizes and use this structural element in a close morphological operation using the **morphologyEx()** function:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "#Morphological Closing\n", + "se=cv2.getStructuringElement(cv2.MORPH_RECT,(23,2))\n", + "closing=cv2.morphologyEx(th2, cv2.MORPH_CLOSE, se)\n", + "showfig(closing, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After all these preprocessing steps, we have regions in the image that have the possibility of containing license plates. \n", + "\n", + "This calls for the use of **findContours()** function. This function retrieves the contours of a binary image. We only need the external contours. Hierarchical relationships between the contours does not matter here but later in this notebook, we will be discussing it in a bit more detail." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "contours,_=cv2.findContours(closing, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lets visualize all the detected contours approximated to there rectangular BoundingBoxes. OpenCV has **minAreaRect()** function exactly for this task. We use **BoxPoints()** function to extract all the four co-ordinates of the rectangle which is then used to draw the boundingbox." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "for cnt in contours:\n", + " rect=cv2.minAreaRect(cnt) \n", + " box=cv2.cv.BoxPoints(rect) \n", + " box=np.int0(box) \n", + " cv2.drawContours(carsample, [box], 0, (0,255,0),2)\n", + "showfig(carsample, None)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There's too much errors! Well, Lets try to remove the outliers by validating against their area and aspect ratio. 
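For intuition about the thresholds below, note that a standard single-row European-style plate measures about 520 x 110 mm (these physical dimensions are an assumption of mine, not something the image database guarantees):\n",
+    "\n",
+    "```python\n",
+    "plate_w_mm, plate_h_mm = 520.0, 110.0\n",
+    "print plate_w_mm/plate_h_mm     # aspect ratio ~4.7, comfortably above 3\n",
+    "\n",
+    "# if a plate spans ~200 px of width in our photos (again an assumption,\n",
+    "# for shots taken at 1-2 m), its pixel area comes out near the figure below:\n",
+    "plate_w_px = 200.0\n",
+    "plate_h_px = plate_w_px*plate_h_mm/plate_w_mm\n",
+    "print plate_w_px*plate_h_px     # ~8460 px, close to the 8000 mentioned below\n",
+    "```\n",
+    "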
\n", + "\n", + "* A normal number plate should have an aspect ratio of atleast more than 3.\n", + "* Outstanding area should be around 8000. Lets make a rough range(maybe there's a exception) of area between 3000 to 16000 pixels. We just don't want the actual number plate to disappear!\n", + "\n", + "To carry out this task we define a separate function." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "#validate a contour. We validate by estimating a rough area and aspect ratio check.\n", + "def validate(cnt): \n", + " rect=cv2.minAreaRect(cnt) \n", + " box=cv2.cv.BoxPoints(rect) \n", + " box=np.int0(box) \n", + " output=False\n", + " width=rect[1][0]\n", + " height=rect[1][1]\n", + " if ((width!=0) & (height!=0)):\n", + " if (((height/width>3) & (height>width)) | ((width/height>3) & (width>height))):\n", + " if((height*width<16000) & (height*width>3000)): \n", + " output=True\n", + " return output\n", + "\n", + "#Lets draw validated contours with red.\n", + "for cnt in contours:\n", + " if validate(cnt):\n", + " rect=cv2.minAreaRect(cnt) \n", + " box=cv2.cv.BoxPoints(rect) \n", + " box=np.int0(box) \n", + " cv2.drawContours(carsample, [box], 0, (0,0,255),2)\n", + "showfig(carsample, None)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is a lot better. We are now reduced to few possibilities that may contain a number plate. This is awesome!\n", + "But mind it. We are not going to get more strict and remove those few outliers left using the above mentioned checks! That will be killing the robustness of the system against any new car image!\n", + "\n", + "We need to exploit something other than this. You may see that our License Plates generally have a white background. That means we can use a **flood fill** algorithm.\n", + "\n", + "**flood fill** is very similar to the old Fill color that we used to have in Paint! It tries to spread your choosen color from the point of origin to every direction untill it faces a tangible boundary preventing it to go any further.\n", + "\n", + "For applying this algorithm, we need to have the origin (also known as **seeds**). Since we have no idea how to choose a specific **seed** within these validated rectangles, we will randomize it and hope that atleast one of them suceeds in exploiting the white background of a actual License Plate and **floodfills** a considerable chunk of it" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "# defining a function doing this will come handy.\n", + "def generate_seeds(centre, width, height):\n", + " minsize=int(min(width, height))\n", + " seed=[None]*10\n", + " for i in range(10):\n", + " random_integer1=np.random.randint(1000)\n", + " random_integer2=np.random.randint(1000)\n", + " seed[i]=(centre[0]+random_integer1%int(minsize/2)-int(minsize/2),centre[1]+random_integer2%int(minsize/2)-int(minsize/2))\n", + " return seed" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We want to select the white region and we need several seeds to touch at least one\n", + "white pixel. 
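As a quick illustration of the seeding, here is a toy check that reuses the generate_seeds() defined above (the centre and box size are arbitrary values of mine):\n",
+    "\n",
+    "```python\n",
+    "np.random.seed(0)   # fixed only to make the illustration repeatable\n",
+    "print generate_seeds((130, 40), 100, 30)\n",
+    "# every seed lands within min(width, height)/2 = 15 px of the centre, so for\n",
+    "# a genuine plate region most seeds should hit its white background\n",
+    "```\n",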
+    "Now, for each seed obtained from the previous function, we use the **floodFill()** function to draw a new mask image that stores the closest cropping region.\n",
+    "\n",
+    "The **floodFill()** function fills a connected component with a color, starting from a seed point, and records the filled area in a separate mask image. How far the fill spreads is controlled by the maximal lower and upper brightness/color difference allowed between neighbouring pixels (or between a pixel and the seed).\n",
+    "\n",
+    "The parameters we need are:\n",
+    "* **newval**: the new color that we want to fill with\n",
+    "* **lodiff & updiff**: the maximal lower and maximal upper brightness/color difference\n",
+    "* **flags**: a combination of **lower bits** and **upper bits**\n",
+    "\n",
+    "Here:\n",
+    "* The **lower bits** contain the connectivity value; it determines which neighbours of a pixel are considered.\n",
+    "* The **upper bits** are a combination of **CV_FLOODFILL_FIXED_RANGE** and **CV_FLOODFILL_MASK_ONLY**\n",
+    "\n",
+    "where:\n",
+    "* **CV_FLOODFILL_FIXED_RANGE** compares every candidate pixel against the seed pixel instead of against its neighbours.\n",
+    "* **CV_FLOODFILL_MASK_ONLY** fills only the mask and leaves the image itself unchanged."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "collapsed": false,
+   "input": [
+    "#masks are nothing but those floodfilled images, one per seed.\n",
+    "def generate_mask(image, seed_point):\n",
+    "    h=image.shape[0]\n",
+    "    w=image.shape[1]\n",
+    "    #OpenCV wants its mask to be exactly two pixels greater than the source image.\n",
+    "    mask=np.zeros((h+2, w+2), np.uint8)\n",
+    "    #We choose a color difference of (50,50,50). That's an empirical guess.\n",
+    "    lodiff=50\n",
+    "    updiff=50\n",
+    "    connectivity=4\n",
+    "    newmaskval=255\n",
+    "    flags=connectivity+(newmaskval<<8)+cv2.cv.CV_FLOODFILL_FIXED_RANGE+cv2.cv.CV_FLOODFILL_MASK_ONLY\n",
+    "    _=cv2.floodFill(image, mask, seed_point, (255, 0, 0),\n",
+    "                    (lodiff, lodiff, lodiff), (updiff, updiff, updiff), flags)\n",
+    "    return mask"
+   ],
+   "language": "python",
+   "metadata": {},
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "collapsed": false,
+   "input": [
+    "# we will need a fresh copy of the image on which to draw the masks.\n",
+    "carsample_mask=cv2.imread(image_path)\n",
+    "\n",
+    "# for viewing the different masks later\n",
+    "mask_list=[]\n",
+    "\n",
+    "for cnt in contours:\n",
+    "    if validate(cnt):\n",
+    "        rect=cv2.minAreaRect(cnt)\n",
+    "        centre=(int(rect[0][0]), int(rect[0][1]))\n",
+    "        width=rect[1][0]\n",
+    "        height=rect[1][1]\n",
+    "        seeds=generate_seeds(centre, width, height)\n",
+    "\n",
+    "        #now for each seed, we generate a mask\n",
+    "        for seed in seeds:\n",
+    "            # plot a tiny circle at the present seed.\n",
+    "            cv2.circle(carsample, seed, 1, (0,0,255), -1)\n",
+    "            # generate the mask corresponding to the current seed.\n",
+    "            mask=generate_mask(carsample_mask, seed)\n",
+    "            mask_list.append(mask)"
+   ],
+   "language": "python",
+   "metadata": {},
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "collapsed": false,
+   "input": [
+    "#We plot the first ten masks here\n",
+    "plt.rcParams['figure.figsize'] = 15,4\n",
+    "fig = plt.figure()\n",
+    "plt.title('Masks!')\n",
+    "for mask_no in range(10):\n",
+    "    fig.add_subplot(2, 5, mask_no+1)\n",
+    "    showfig(mask_list[mask_no], plt.get_cmap('gray'))"
+   ],
+   "language": "python",
+   "metadata": {},
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will apply our area and aspect ratio checks once more, this time on the bounding boxes fitted to the masks above."
+   ]
+  },
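+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "(A quick aside on the mechanics, since the next cell relies on it: a mask is an image, not a contour, so we first turn it into a point set with np.argwhere(mask.transpose()==255). The transpose is needed because OpenCV expects points as (x, y) while NumPy indexes as (row, column). A minimal sketch on a synthetic mask -- the white block is an arbitrary example of mine:)\n",
+    "\n",
+    "```python\n",
+    "toy=np.zeros((63, 260), np.uint8)\n",
+    "toy[20:30, 40:80]=255                  # a 40x10 white block\n",
+    "pts=np.argwhere(toy.transpose()==255)  # (x, y) points, as OpenCV expects\n",
+    "print cv2.minAreaRect(pts)             # centre ~(59.5, 24.5), size ~(39, 9)\n",
+    "```"
+   ]
+  },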
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "validated_masklist=[]\n", + "for mask in mask_list:\n", + " contour=np.argwhere(mask.transpose()==255)\n", + " if validate(contour):\n", + " validated_masklist.append(mask)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "try:\n", + " assert (len(validated_masklist)!=0)\n", + "except AssertionError:\n", + " print \"No valid masks could be generated\"" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Most probable masks are now here. But there may be those cases where almost same masks are repeated. This is possible as the seeds were random and there can always be more than 1 seed producing the same mask." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "# We check for repetation of masks here.\n", + "#from scipy import sum as\n", + "#import scipy.sum as scipy_sum\n", + "# This function quantifies the difference between two images in terms of RMS.\n", + "def rmsdiff(im1, im2):\n", + " diff=im1-im2\n", + " output=False\n", + " if np.sum(abs(diff))/float(min(np.sum(im1), np.sum(im2)))<0.01:\n", + " output=True\n", + " return output\n", + "\n", + "# final masklist will be the final list of masks we will be working on.\n", + "final_masklist=[]\n", + "index=[]\n", + "for i in range(len(validated_masklist)-1):\n", + " for j in range(i+1, len(validated_masklist)):\n", + " if rmsdiff(validated_masklist[i], validated_masklist[j]):\n", + " index.append(j)\n", + "for mask_no in list(set(range(len(validated_masklist)))-set(index)):\n", + " final_masklist.append(validated_masklist[mask_no])" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that the segmentation process is finished and we have valid regions, we can remove any possible rotation, crop the image region, resize the image, and equalize the light of cropped image regions." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "cropped_images=[]\n", + "for mask in final_masklist:\n", + " contour=np.argwhere(mask.transpose()==255)\n", + " rect=cv2.minAreaRect(contour)\n", + " width=int(rect[1][0])\n", + " height=int(rect[1][1])\n", + " centre=(int(rect[0][0]), int(rect[0][1]))\n", + " box=cv2.cv.BoxPoints(rect) \n", + " box=np.int0(box)\n", + " #check for 90 degrees rotation\n", + " if ((width/float(height))>1):\n", + " # crop a particular rectangle from the source image\n", + " cropped_image=cv2.getRectSubPix(carsample_mask, (width, height), centre)\n", + " else:\n", + " # crop a particular rectangle from the source image\n", + " cropped_image=cv2.getRectSubPix(carsample_mask, (height, width), centre)\n", + "\n", + " # convert into grayscale\n", + " cropped_image=cv2.cvtColor(cropped_image, cv2.COLOR_BGR2GRAY)\n", + " # equalize the histogram\n", + " cropped_image=cv2.equalizeHist(cropped_image)\n", + " # resize to 260 cols and 63 rows. (Just something I have set as standard here)\n", + " cropped_image=cv2.resize(cropped_image, (260, 63))\n", + " cropped_images.append(cropped_image)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lets see these cropped regions." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "_=plt.subplots_adjust(hspace=0.000)\n", + "number_of_subplots=len(cropped_images)\n", + "for i,v in enumerate(xrange(number_of_subplots)):\n", + " v = v+1\n", + " ax1 = plt.subplot(number_of_subplots,1,v)\n", + " showfig(cropped_images[i], plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "Classification" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After we preprocess and segment all possible parts of an image, we need now to decide if each segment is or is not a license plate. To do this, we will use Shogun's Support Vector Machines Framework.\n", + "\n", + "Basically we will be training a 2 class LibSVM. One class for positive training image of License Plate and other class for the negatives.\n", + "\n", + "I cropped almost 198 positive license plates images from the before mentioned database. Along with it, a set of 79 negative images are also cropped. These are already histogram equalized and reshaped into a size of 63 rows and 260 columns.(I have choosen this as the standard here) \n", + "\n", + "Lets see a part of it." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "import os\n", + "def get_imlist(path):\n", + " return [[os.path.join(path,f) for f in os.listdir(path) if f.endswith('.jpg')]]\n", + "\n", + "\n", + "training_sample=[] \n", + "#We plot 1st ten positive and negative license plates here\n", + "path_train='../../../data/ANPR/svm_train/positive/'\n", + "filenames=np.array(get_imlist(path_train))\n", + "for i in range(10):\n", + " temp=cv2.imread(filenames[0][i])\n", + " temp=cv2.cvtColor(temp, cv2.COLOR_BGR2GRAY)\n", + " training_sample.append(temp)\n", + "path_train='../../../data/ANPR/svm_train/negative/'\n", + "filenames=np.array(get_imlist(path_train))\n", + "for i in range(10):\n", + " temp=cv2.imread(filenames[0][i])\n", + " temp=cv2.cvtColor(temp, cv2.COLOR_BGR2GRAY)\n", + " training_sample.append(temp)\n", + " \n", + "plt.rcParams['figure.figsize'] = 15,4\n", + "fig = plt.figure()\n", + "plt.title('first 10 are positives and rest 10 are negatives')\n", + "for image_no in range(20):\n", + " fig.add_subplot(4, 5, image_no+1)\n", + " showfig(training_sample[image_no], plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below is function **get_svm()**. It will initialize a 2 class LibSVM on the training and testing dataset." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from modshogun import *\n", + "def get_vstacked_data(path):\n", + " filenames=np.array(get_imlist(path))\n", + " #read the image\n", + " #convert the image into grayscale.\n", + " #change its data-type to double.\n", + " #flatten it\n", + " vmat=[]\n", + " for i in range(filenames[0].shape[0]):\n", + " temp=cv2.imread(filenames[0][i])\n", + " temp=cv2.cvtColor(temp, cv2.COLOR_BGR2GRAY)\n", + " temp=cv2.equalizeHist(temp)\n", + " temp=np.array(temp, dtype='double')\n", + " temp=temp.flatten()\n", + " vmat.append(temp)\n", + " vmat=np.vstack(vmat)\n", + " return vmat\n", + "\n", + "def get_svm():\n", + " \n", + " #set path for positive training images\n", + " path_train='../../../data/ANPR/svm_train/positive/'\n", + " pos_trainmat=get_vstacked_data(path_train)\n", + " \n", + " #set path for negative training images\n", + " path_train='../../../data/ANPR/svm_train/negative/'\n", + " neg_trainmat=get_vstacked_data(path_train)\n", + "\n", + " #form the observation matrix\n", + " obs_matrix=np.vstack([pos_trainmat, neg_trainmat])\n", + "\n", + " print obs_matrix.shape\n", + " \n", + " #shogun works in a way in which columns are samples and rows are features.\n", + " #Hence we need to transpose the observation matrix\n", + " obs_matrix=obs_matrix.T\n", + "\n", + " #get the labels. Positive training images are marked with +1 and negative with -1\n", + " labels=np.ones(obs_matrix.shape[1])\n", + " labels[pos_trainmat.shape[0]:obs_matrix.shape[1]]*=-1\n", + " \n", + " #convert the observation matrix and the labels into Shogun RealFeatures and BinaryLabels structures resp. .\n", + " sg_features=RealFeatures(obs_matrix)\n", + " sg_labels=BinaryLabels(labels)\n", + "\n", + " #Initialise a basic LibSVM in Shogun.\n", + " width=2\n", + " #kernel=GaussianKernel(sg_features, sg_features, width)\n", + " kernel=LinearKernel(sg_features, sg_features)\n", + " C=1.0\n", + " svm=LibSVM(C, kernel, sg_labels)\n", + " _=svm.train()\n", + " \n", + " dfdf=svm.apply(sg_features)\n", + " #print dfdf.get_labels()\n", + " return svm\n" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lets run the above function to train this SVM." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "svm=get_svm()" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our classifier is ready to predict a possible cropped image using the **apply()**\n", + "function of our SVM class; this function returns the class identifier i. In our case,\n", + "we label a plate class with 1 and no plate class with -1. Then for each detected region\n", + "that can be a plate, we use SVM to classify it as a plate or no plate, and save only\n", + "the correct responses." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "segmentation_output=[]\n", + "for cropped_image in cropped_images:\n", + " cropped_image=np.array([np.double(cropped_image.flatten())])\n", + " sg_cropped_image=RealFeatures(cropped_image.T)\n", + " output=svm.apply(sg_cropped_image)\n", + " print output.get_labels()[0]\n", + " # if it passes we append it\n", + " if(output.get_labels()[0]==1):\n", + " segmentation_output.append(cropped_image[0])" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "try:\n", + " assert (len(segmentation_output)!=0)\n", + "except AssertionError:\n", + " print \"SVM couldn't find a single License Plate here. Restart to crosscheck!The current framework is closing\"" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "heading", + "level": 2, + "metadata": {}, + "source": [ + "Plate recognition\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The second step in license plate recognition aims to retrieve the characters of the license plate with optical character recognition. For each detected plate, we proceed to segment the plate for each character, and use an Artificial Neural Network (ANN) machine-learning algorithm to recognize the character\n", + "\n", + "Again I created the dataset for training this ANN(Matlab scripts rescued me in carrying this task) by cropping the each letter from the License Plate Database. \n", + "\n", + "Unlike the plate detection feature-extraction step that is used in SVM, we don't use all of the image pixels; we will apply more common features used in optical character recognition containing horizontal and vertical accumulation histograms and a low-resolution image sample(10 rows and 5 cols here).\n", + "\n", + "For each character, we count the number of pixels in a row or column with a non-zero value and use them as features. Thus the three features that we intend to use here are:\n", + "1. **horizontal histogram**\n", + "2. **vertical histogram**\n", + "3. **10x5 downsampled character image**\n" + ] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "Feature Extraction" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "im=cv2.imread(\"../../../data/ANPR/ann/Z/12.jpg\")\n", + "im=cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)\n", + "im=cv2.resize(im, (5,10), interpolation=cv2.INTER_AREA)\n", + "_,im=cv2.threshold(im, 70, 255, cv2.THRESH_BINARY)\n", + "\n", + "horz_hist=np.sum(im==255, axis=0)\n", + "vert_hist=np.sum(im==255, axis=1)\n", + "plt.rcParams['figure.figsize'] = 10,4\n", + "fig = plt.figure()\n", + "plt.title('Downsampled character Z with its horz. and vert. histogram respectively')\n", + "fig.add_subplot(131)\n", + "showfig(im, plt.get_cmap('gray'))\n", + "fig.add_subplot(132)\n", + "plt.bar(range(0,5), horz_hist)\n", + "fig.add_subplot(133)\n", + "_=plt.bar(range(0,10), vert_hist)\n" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "OCR Segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But first we need to remove the bad bounding boxes that our **findContour()** may give us. It is very similar to the one we did previously except that here we have a different aspect ratio and area to be checked. We define a function for that." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def validate_ann(cnt):\n", + " rect=cv2.minAreaRect(cnt) \n", + " box=cv2.cv.BoxPoints(rect) \n", + " box=np.int0(box) \n", + " output=False\n", + " width=rect[1][0]\n", + " height=rect[1][1]\n", + " if ((width!=0) & (height!=0)):\n", + " if (((height/width>1.12) & (height>width)) | ((width/height>1.12) & (width>height))):\n", + " if((height*width<1700) & (height*width>100)):\n", + " if((max(width, height)<64) & (max(width, height)>35)):\n", + " output=True\n", + " return output " + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Form ANN with one hidden layer. Few things to be kept in mind are:\n", + "\n", + "* **number of classes**=32\n", + "* **number of features per sample image**=5(horz. hist) + 10(vert. hist)+ 10x5(flatten form of the downsampled image) i.e 65" + ] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "OCR Classification" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "values=['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','J','K','L','M','N','P','R','S','T','U','V','W','X','Z']\n", + "keys=range(32)\n", + "data_map=dict(zip(keys, values))\n", + "\n", + "def get_ann(data_map):\n", + " feature_mat=[] \n", + " label_mat=[]\n", + " for keys in data_map:\n", + " path_train=\"../../../data/ANPR/ann/%s\"%data_map[keys]\n", + " filenames=get_imlist(path_train)\n", + " perfeature_mat=[]\n", + " perlabel_mat=[]\n", + " \n", + " for image in filenames[0]:\n", + " raw_image=cv2.imread(image)\n", + " raw_image=cv2.cvtColor(raw_image, cv2.COLOR_BGR2GRAY)\n", + " \n", + " #resize the image into 5 cols(width) and 10 rows(height)\n", + " raw_image=cv2.resize(raw_image,(5, 10), interpolation=cv2.INTER_AREA)\n", + " #Do a hard thresholding.\n", + " _,th2=cv2.threshold(raw_image, 70, 255, cv2.THRESH_BINARY)\n", + " \n", + " #generate features\n", + " horz_hist=np.sum(th2==255, axis=0)\n", + " #print horz_hist\n", + " vert_hist=np.sum(th2==255, axis=1)\n", + " #print vert_hist\n", + " sample=th2.flatten()\n", + " \n", + " #concatenate these features together\n", + " feature=np.concatenate([horz_hist, vert_hist, sample])\n", + " \n", + " # append these features together along with their respective labels\n", + " perfeature_mat.append(feature)\n", + " perlabel_mat.append(keys)\n", + " \n", + " feature_mat.append(perfeature_mat)\n", + " label_mat.append(perlabel_mat)\n", + " \n", + " # These are the final product.\n", + " bigfeature_mat=np.vstack(feature_mat)\n", + " biglabel_mat=np.hstack(label_mat)\n", + " \n", + " # As usual. We need to convert them into double type for Shogun.\n", + " bigfeature_mat=np.array(bigfeature_mat, dtype='double')\n", + " biglabel_mat=np.array(biglabel_mat, dtype='double')\n", + " \n", + " #shogun works in a way in which columns are samples and rows are features.\n", + " #Hence we need to transpose the observation matrix\n", + " obs_matrix=bigfeature_mat.T\n", + "\n", + " #convert the observation matrix and the labels into Shogun RealFeatures and MulticlassLabels structures resp. 
.\n", + " sg_features=RealFeatures(obs_matrix)\n", + " sg_labels=MulticlassLabels(biglabel_mat)\n", + " \n", + " #initialize a simple ANN in Shogun with one hidden layer.\n", + " layers=DynamicObjectArray()\n", + " layers.append_element(NeuralInputLayer(65))\n", + " layers.append_element(NeuralLogisticLayer(65))\n", + " layers.append_element(NeuralSoftmaxLayer(32))\n", + " net=NeuralNetwork(layers)\n", + " net.quick_connect()\n", + " net.initialize()\n", + "\n", + " net.io.set_loglevel(MSG_INFO)\n", + " net.l1_coefficient=3e-4\n", + " net.epsilon = 1e-6\n", + " net.max_num_epochs = 600\n", + "\n", + " net.set_labels(sg_labels)\n", + " net.train(sg_features) \n", + " return net" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Train the ANN." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "net=get_ann(data_map)" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For each output of the **findContours()** function, we validate it with the aspect ratio checks and apply ANN over it." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "for svm_output in segmentation_output:\n", + " car_number=[]\n", + " x_distance=[]\n", + " working_image=np.resize(np.uint8(svm_output),(63,260))\n", + " #we follow same preprocessing routines\n", + " working_image=cv2.equalizeHist(working_image)\n", + " working_image=cv2.GaussianBlur(working_image,(5,5),0)\n", + " _,th2=cv2.threshold(working_image, 75, 255, cv2.THRESH_BINARY_INV)\n", + " contours=np.copy(th2)\n", + " crop_copy=np.copy(th2)\n", + " contours,h=cv2.findContours(contours, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)\n", + " count=-1\n", + " for cnt in contours:\n", + " count=count+1\n", + " if (validate_ann(cnt)):\n", + " rect=cv2.minAreaRect(cnt)\n", + " box=cv2.cv.BoxPoints(rect)\n", + " box=np.int0(box)\n", + " cv2.drawContours(working_image, [box], 0, (0, 255, 0),1)\n", + " centre=rect[0]\n", + " cropped=cv2.getRectSubPix(crop_copy,(int(min(rect[1])), int(max(rect[1]))) , centre)\n", + " cropped_resize=cv2.resize(cropped, (5,10), interpolation=cv2.INTER_AREA)\n", + " _, th2=cv2.threshold(cropped_resize, 70, 255, cv2.THRESH_BINARY)\n", + " \n", + " #generate the respective features\n", + " horz_hist=np.sum(th2==255, axis=0)\n", + " vert_hist=np.sum(th2==255, axis=1)\n", + " sample=th2.flatten()\n", + " \n", + " feature_set=np.concatenate([horz_hist, vert_hist, sample])\n", + " feature_set=np.array([np.double(feature_set)])\n", + " feature_set=feature_set.T\n", + " \n", + " testfeature=RealFeatures(feature_set)\n", + " output=net.apply_multiclass(testfeature)\n", + " data_alpha=data_map[output.get_labels()[0]]\n", + " \n", + " car_number.append(data_alpha)\n", + " x_distance.append(centre[0])\n", + "\n", + " print [car_number for (x_distance, car_number) in sorted(zip(x_distance, car_number))]\n", + " plt.figure()\n", + " showfig(working_image, plt.get_cmap('gray'))" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "heading", + "level": 3, + "metadata": {}, + "source": [ + "References:\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "* Mastering OpenCV with Practical Computer Vision Projects, Ch-5\n", + "* Automatic license plate recognition by Shyang-Lih Chang\n", + "* A Neural Network Based Artificial Vision System for Licence Plate Recognition by Sorin Draghici\n", + "* 
Image database : http://www.zemris.fer.hr/projects/LicensePlates/english/results.shtml" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [], + "language": "python", + "metadata": {}, + "outputs": [] + } + ], + "metadata": {} + } + ] +} \ No newline at end of file From 9b5c8bc916ada3abc41b059c521af0af76efdbf0 Mon Sep 17 00:00:00 2001 From: Abhijeet Date: Thu, 24 Jul 2014 08:10:19 +0530 Subject: [PATCH 2/3] tweaked findcontours --- .../computer_vision/ANPR.ipynb | 52 +++++++++++-------- 1 file changed, 31 insertions(+), 21 deletions(-) diff --git a/doc/ipython-notebooks/computer_vision/ANPR.ipynb b/doc/ipython-notebooks/computer_vision/ANPR.ipynb index a1a9577428a..a203df7e686 100644 --- a/doc/ipython-notebooks/computer_vision/ANPR.ipynb +++ b/doc/ipython-notebooks/computer_vision/ANPR.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:0224f014906318b507669d07c79b43a19f6049e114bfa8ff5121c58f4ab1dcd7" + "signature": "sha256:0186c6faa82bc5612ecc93b5286634a5ec7caf0e0d2172893440ba94ff809a50" }, "nbformat": 3, "nbformat_minor": 0, @@ -28,7 +28,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This notebook is about performing ANPR to detect automobile license plates in photographs taken between 1-2 metres from a car. We will be introduced to techniques related to image segmentation, feature extraction, pattern recognition , and two important pattern recogntion algorithms SVM and ANN. " + "This notebook is about performing ANPR to detect automobile license plates in photographs taken between 1-2 metres from a car. We will be introduced to techniques related to image segmentation, feature extraction, pattern recognition , and two important pattern recognition algorithms SVM and ANN. " ] }, { @@ -55,10 +55,25 @@ "used here are intended to explain the basics of **ANPR** and the specifications\n", "for license plates from **Croatia** (Why? because I found it's license plate database fairly easily on the internet), but we can extend them to any country or specification.(check references for the link)\n", "\n", - "I have loaded few of those plates here.\n", - "\n", - "\n", + "I have loaded few of those plates here." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from IPython.display import Image\n", "\n", + "Image(filename='../../../data/ANPR/sample_plates.png')" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ "The whole algorithmic approach is structured as following:\n", "1. **Plate Detection**\n", "2. **Segmentation**\n", @@ -68,7 +83,7 @@ "6. **Feature Extraction**\n", "7. **OCR Classification**\n", "\n", - "We will go through each of these steps one by one, covering all glitches and glory and prospects that may further enhance this framework in future." + "We will go through each of these steps one by one." ] }, { @@ -83,7 +98,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "First load a car image. It's better to see what we are dealing here" + "First load a car image. It's better to see what we are dealing with" ] }, { @@ -111,7 +126,7 @@ "\n", "# Actual Code starts here\n", "plt.title('Sample Car')\n", - "image_path=\"../../../data/ANPR/sample_2.jpg\"\n", + "image_path=\"../../../data/ANPR/sample_1.jpg\"\n", "carsample=cv2.imread(image_path)\n", "showfig(carsample,None)" ], @@ -281,10 +296,10 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "There's too much errors! 
Well, let's try to remove the outliers by validating them against their area and aspect ratio. \n",
     "\n",
     "* A normal number plate should have an aspect ratio of atleast more than 3.\n",
-    "* Outstanding area should be around 8000. Lets make a rough range(maybe there's a exception) of area between 3000 to 16000 pixels. We just don't want the actual number plate to disappear!\n",
+    "* An outstanding candidate has an area of around 8000 pixels. Let's allow a rough range (there may be exceptions) of 3000 to 16000 pixels -- we don't want the actual number plate to disappear!\n",
     "\n",
     "To carry out this task we define a separate function."
    ]
   },
@@ -324,12 +339,12 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This is a lot better. We are now reduced to few possibilities that may contain a number plate. This is awesome!\n",
+    "This has improved the results, but we can still do better.\n",
     "But mind it. We are not going to get more strict and remove those few outliers left using the above mentioned checks! That will be killing the robustness of the system against any new car image!\n",
     "\n",
     "We need to exploit something other than this. You may see that our License Plates generally have a white background. That means we can use a **flood fill** algorithm.\n",
     "\n",
-    "**flood fill** is very similar to the old Fill color that we used to have in Paint! It tries to spread your choosen color from the point of origin to every direction untill it faces a tangible boundary preventing it to go any further.\n",
+    "**flood fill** is very similar to the Fill color tool you might have used in MS Paint or other drawing programs! It spreads your chosen color from the point of origin in every direction until it meets a boundary that prevents it from going any further.\n",
     "\n",
     "For applying this algorithm, we need to have the origin (also known as **seeds**). Since we have no idea how to choose a specific **seed** within these validated rectangles, we will randomize it and hope that atleast one of them suceeds in exploiting the white background of a actual License Plate and **floodfills** a considerable chunk of it"
    ]
   },
@@ -674,8 +689,6 @@
     "\n",
     "    #form the observation matrix\n",
     "    obs_matrix=np.vstack([pos_trainmat, neg_trainmat])\n",
-    "\n",
-    "    print obs_matrix.shape\n",
     "    \n",
     "    #shogun works in a way in which columns are samples and rows are features.\n",
     "    #Hence we need to transpose the observation matrix\n",
@@ -697,9 +710,8 @@
     "    svm=LibSVM(C, kernel, sg_labels)\n",
     "    _=svm.train()\n",
     "    \n",
-    "    dfdf=svm.apply(sg_features)\n",
-    "    #print dfdf.get_labels()\n",
-    "    return svm\n"
+    "    _=svm.apply(sg_features)\n",
+    "    return svm"
    ],
    "language": "python",
    "metadata": {},
@@ -778,7 +790,7 @@
    "source": [
     "The second step in license plate recognition aims to retrieve the characters of the license plate with optical character recognition. For each detected plate, we proceed to segment the plate for each character, and use an Artificial Neural Network (ANN) machine-learning algorithm to recognize the character\n",
     "\n",
-    "Again I created the dataset for training this ANN(Matlab scripts rescued me in carrying this task) by cropping the each letter from the License Plate Database. \n",
+    "The dataset we use for training this ANN is generated from cropped images of each letter in the License Plate Database. 
We can use a standard ocr data set also.\n", "\n", "Unlike the plate detection feature-extraction step that is used in SVM, we don't use all of the image pixels; we will apply more common features used in optical character recognition containing horizontal and vertical accumulation histograms and a low-resolution image sample(10 rows and 5 cols here).\n", "\n", @@ -904,9 +916,7 @@ " \n", " #generate features\n", " horz_hist=np.sum(th2==255, axis=0)\n", - " #print horz_hist\n", " vert_hist=np.sum(th2==255, axis=1)\n", - " #print vert_hist\n", " sample=th2.flatten()\n", " \n", " #concatenate these features together\n", @@ -995,7 +1005,7 @@ " _,th2=cv2.threshold(working_image, 75, 255, cv2.THRESH_BINARY_INV)\n", " contours=np.copy(th2)\n", " crop_copy=np.copy(th2)\n", - " contours,h=cv2.findContours(contours, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)\n", + " contours,_=cv2.findContours(contours, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)\n", " count=-1\n", " for cnt in contours:\n", " count=count+1\n", From a2d5aad437cd6f4c11f6ca3eecff63f074a39259 Mon Sep 17 00:00:00 2001 From: Abhijeet Date: Mon, 4 Aug 2014 17:17:14 +0530 Subject: [PATCH 3/3] addded a try catch for import cv2 --- doc/ipython-notebooks/computer_vision/ANPR.ipynb | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/doc/ipython-notebooks/computer_vision/ANPR.ipynb b/doc/ipython-notebooks/computer_vision/ANPR.ipynb index a203df7e686..f27f89c8e27 100644 --- a/doc/ipython-notebooks/computer_vision/ANPR.ipynb +++ b/doc/ipython-notebooks/computer_vision/ANPR.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:0186c6faa82bc5612ecc93b5286634a5ec7caf0e0d2172893440ba94ff809a50" + "signature": "sha256:650e37209aa8a6eb3b058f63464f9152da49f31ec783fd9324c8494ad42e4bb8" }, "nbformat": 3, "nbformat_minor": 0, @@ -108,7 +108,7 @@ "#Use the following function when reading an image through OpenCV and displaying through plt.\n", "def showfig(image, ucmap):\n", " #There is a difference in pixel ordering in OpenCV and Matplotlib.\n", - " #OpenCV follows BGR order, while matplotlib likely follows RGB order.\n", + " #OpenCV follows BGR order, while matplotlib follows RGB order.\n", " if len(image.shape)==3 :\n", " b,g,r = cv2.split(image) # get b,g,r\n", " image = cv2.merge([r,g,b]) # switch it to rgb\n", @@ -116,9 +116,12 @@ " imgplot.axes.get_xaxis().set_visible(False)\n", " imgplot.axes.get_yaxis().set_visible(False)\n", " \n", - " \n", + "#import Opencv library\n", + "try:\n", + " import cv2\n", + "except ImportError:\n", + " print \"You must have OpenCV installed\"\n", "import matplotlib.pyplot as plt\n", - "import cv2\n", "import numpy as np\n", "%matplotlib inline \n", "\n",