# Hobbits and Histograms Tutorial

## A How-To Guide to Building Your First Image Search Engine in Python

This tutorial is provided on pyimagesearch and can be found at this link: https://www.pyimagesearch.com/2014/01/27/hobbits-and-histograms-a-how-to-guide-to-building-your-first-image-search-engine-in-python/

## Overview:
Build an image search engine<br>
Learn the 4 steps that are required

## Goal:

We have 25 images in our dataset, categorized into five different locations in The Lord of the Rings. We will build an image search engine on this data. Our goal: given a query (input) image from one of the categories, return all five images from that category in the top 10 search results.
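
The goal can be written down as a small check. The sketch below assumes that the category is the part of the filename before the final hyphen (for example, `Dol-Guldur-001.png` would belong to `Dol-Guldur`). `category_of` and `meets_goal` are hypothetical helper names introduced for illustration, and the ranking is made up, not a real search result.

```python
def category_of(filename):
    # "Dol-Guldur-001.png" -> "Dol-Guldur" (assumed naming scheme)
    return filename.rsplit("-", 1)[0]


def meets_goal(query, ranked_results, per_category=5, top_k=10):
    """Return True if the top_k results contain per_category distinct
    images from the same category as the query."""
    target = category_of(query)
    hits = {name for name in ranked_results[:top_k] if category_of(name) == target}
    return len(hits) >= per_category


# Hypothetical ranking, best match first
ranked = [
    "Goblin-001.png", "Goblin-003.png", "Dol-Guldur-002.png",
    "Goblin-002.png", "Goblin-004.png", "Dol-Guldur-001.png",
    "Goblin-005.png", "Dol-Guldur-004.png",
]
print(meets_goal("Goblin-001.png", ranked))      # True: all five Goblin images present
print(meets_goal("Dol-Guldur-001.png", ranked))  # False: only three Dol-Guldur images
```

A misspelled prefix would count as a separate category under this scheme, so the filenames need to be consistent for the check to be meaningful.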

## 4 Steps to Building an Image Search Engine

(1) Define your descriptor<br>
(2) Index your dataset<br>
(3) Define your similarity metric<br>
(4) Search: apply the descriptor to your query image, sort the results by similarity, and examine them
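
The four steps above can be sketched end to end on synthetic data. This is a minimal illustration, not the tutorial's implementation: it uses NumPy's `np.histogramdd` as a stand-in descriptor, assumes a chi-squared distance as the similarity metric (a common choice for comparing histograms, but an assumption here), and the names `describe`, `chi2_distance` and `search` are hypothetical.

```python
import numpy as np


def describe(image, bins=(2, 2, 2)):
    # Step 1: coarse 3D color histogram, normalized to sum to 1 and flattened
    pixels = image.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=bins, range=[(0, 256)] * 3)
    return (hist / hist.sum()).flatten()


def chi2_distance(a, b, eps=1e-10):
    # Step 3: chi-squared distance; 0 means identical histograms
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))


def search(index, query_features):
    # Step 4: rank every indexed image by distance to the query, closest first
    scores = {name: chi2_distance(feats, query_features)
              for name, feats in index.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic stand-ins for real images: 4x4 pictures
dark = np.full((4, 4, 3), 20, dtype=np.uint8)
bright = np.full((4, 4, 3), 230, dtype=np.uint8)
mixed = np.concatenate([dark[:2], bright[:2]])  # half dark, half bright

# Step 2: index the dataset (filename -> feature vector)
index = {
    "dark.png": describe(dark),
    "bright.png": describe(bright),
    "mixed.png": describe(mixed),
}

results = search(index, describe(dark))
print([name for name, _ in results])  # ['dark.png', 'mixed.png', 'bright.png']
```

The query matches itself with distance 0, the half-dark image comes next, and the fully bright image is ranked last.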

## Step 1: The Descriptor - A 3D RGB Color Histogram

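We compute a 3D color histogram with 8 bins per channel, which gives 8 × 8 × 8 = 512 bins in total. The histogram is then flattened into a one-dimensional NumPy array so it can be used as a feature vector.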
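
To make the shapes concrete, here is a small sketch that builds an 8-bins-per-channel histogram and flattens it. It uses NumPy's `np.histogramdd` on a synthetic random image as a stand-in for `cv2.calcHist` on a real photo, so it only illustrates the array shapes, not the exact OpenCV call.

```python
import numpy as np

# Synthetic 32x32 image with random colors (stand-in for a real photo)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# One row per pixel, one column per color channel
pixels = image.reshape(-1, 3).astype(np.float64)

# 3D histogram: 8 bins along each of the three channel axes
hist, _ = np.histogramdd(pixels, bins=(8, 8, 8), range=[(0, 256)] * 3)
print(hist.shape)            # (8, 8, 8)

# Flattening turns the 8x8x8 cube into a single 512-element feature vector
features = hist.flatten()
print(features.shape)        # (512,)
print(int(features.sum()))   # 1024, one count per pixel
```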

In [12]:
import imutils
import cv2
import os
import pickle

In [26]:
# Create class for RGB Histogram
class RGBHistogram:
    def __init__(self, bins):
        # num bins in histogram
        self.bins = bins
    
    def describe(self, image):
        # compute a normalized 3D color histogram; images loaded with
        # cv2.imread are in BGR order, so the three axes are B, G, R
        hist = cv2.calcHist([image], [0, 1, 2], None, self.bins,
                            [0, 256, 0, 256, 0, 256])

        # the cv2.normalize call differs between OpenCV 2.4 and later versions
        if imutils.is_cv2():
            hist = cv2.normalize(hist)
        else:
            hist = cv2.normalize(hist, hist)

        # return histogram as flattened array
        return hist.flatten()

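It is good practice to define image descriptors as classes rather than functions, because you rarely extract features from a single image alone. A class stores its parameters (here, the number of bins per channel) once and applies them identically to every image, which keeps the resulting feature vectors comparable.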

## Step 2: Indexing our Dataset

Apply our image descriptor to each image in the dataset and store the resulting feature vectors in a dictionary keyed by filename.

In [29]:
# The index dictionary maps each filename to its feature vector
index = {}

# Initialize the descriptor object with 8 bins per channel
desc = RGBHistogram([8, 8, 8])

# Loop over every file in the images directory
for _, _, files in os.walk(os.getcwd() + '/images'):
    for file in files:
        # Build the image path
        path = os.getcwd() + '/images/' + file
        print(path)
        # Load the image, describe it and update the index
        image = cv2.imread(path)
        cv2.imshow('image', image)
        features = desc.describe(image)
        index[file] = features
        
# # Save index to pickle file
# f = open(os.getcwd() + '/index.pkl', 'wb')
# f.write(pickle.dumps(index))
# f.close()

C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Dol-Guldur-001.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Dol-Guldur-002.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Dol-Guldur-003.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Dol-Guldur-004.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Dol-Guldur-005.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Goblin-001.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Goblin-002.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Goblin-003.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Goblin-004.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/Golbin-005.png
C:\Users\conno\Documents\GitHub\pyimagesearch\hobbits_histograms/images/index.pkl


error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\highgui\src\window.cpp:352: error: (-215:Assertion failed) size.width>0 && size.height>0 in function 'cv::imshow'
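
The assertion failure above is most likely explained by the last path printed before it: `index.pkl` is not an image, `cv2.imread` returns `None` for a file it cannot decode, and `cv2.imshow` then rejects the empty input. One way to avoid this, sketched below under the assumption that all dataset images use a known set of extensions, is to filter filenames before loading them and to keep the pickled index outside the image directory. `IMAGE_EXTENSIONS`, `list_image_paths`, `save_index` and `load_index` are hypothetical names introduced for this example.

```python
import os
import pickle

# Assumed set of image extensions; adjust to the dataset at hand
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_image_paths(root):
    """Return sorted paths of files under root that look like images."""
    paths = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext in IMAGE_EXTENSIONS:
                # Join with dirpath so files in subdirectories resolve correctly
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def save_index(index, path):
    # The context manager closes the file even if pickling fails
    with open(path, "wb") as f:
        pickle.dump(index, f)


def load_index(path):
    # Read a previously saved index back into a dictionary
    with open(path, "rb") as f:
        return pickle.load(f)
```

In the indexing loop, iterating over `list_image_paths(os.path.join(os.getcwd(), 'images'))` would skip `index.pkl`. Checking `if image is None: continue` right after `cv2.imread` remains a useful second guard against corrupt image files.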

