cswingle/opencv_3d_rgb_histograms

RGB Histogram

The idea is to generate a 3-D RGB histogram (one bin for each combination of R, G, and B value ranges) for every image in a series, so we can order the images by similarity or search for the images most similar to an existing image.

The code is largely based on:

http://www.pyimagesearch.com/2014/01/27/hobbits-and-histograms-a-how-to-guide-to-building-your-first-image-search-engine-in-python/
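
As a rough illustration of what a 3-D RGB histogram is, here is a minimal pure-Python sketch. It is not the code from this repository or from the linked tutorial: the function name `rgb_histogram_3d`, the flattened list layout, and the normalization to fractions are all assumptions made for the example.

```python
def rgb_histogram_3d(pixels, bins=8):
    """Return a flattened, normalized 3-D RGB histogram (hypothetical helper).

    pixels: iterable of (r, g, b) tuples with channel values in 0..255.
    The result has bins**3 entries, one per (R bin, G bin, B bin) combination.
    """
    counts = [0] * (bins ** 3)
    total = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        r_bin = r * bins // 256
        g_bin = g * bins // 256
        b_bin = b * bins // 256
        # Flatten the 3-D bin coordinate into a single list index.
        counts[(r_bin * bins + g_bin) * bins + b_bin] += 1
        total += 1
    if total == 0:
        return [0.0] * len(counts)
    # Normalize so images of different sizes are comparable.
    return [c / total for c in counts]


if __name__ == "__main__":
    hist = rgb_histogram_3d([(0, 0, 0), (255, 255, 255), (10, 10, 10)], bins=8)
    print(len(hist), hist[0], hist[-1])
```

Normalizing by the pixel count is a design choice of this sketch; it keeps histograms from large and small images on the same scale before they are compared.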

Files:

  • chi_square_distance.sql: A SQL function to calculate the chi-square distance between two histogram arrays.
  • chi_square_distance.c: C version of the chi-square distance calculation.
  • Makefile: Makefile for the C shared library. After building, copy the library into the PostgreSQL library directory and register the function:
$ sudo cp chi_square_distance.so $(pg_config --pkglibdir)
CREATE FUNCTION chi_square_distance(a double precision[], b double precision[])
RETURNS double precision
AS 'chi_square_distance', 'chi_square_distance'
LANGUAGE C STRICT;
  • generate_photolog_histograms.py: A Python script to populate the photolog_histograms table.
  • rgb_histogram.py: A Python script that builds a CSV of 3D RGB histogram distances for all photos in a directory.
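
For readers unfamiliar with the metric, the following is a minimal sketch of one common form of the chi-square distance between two histograms. It assumes the formulation with a 0.5 factor and a small `eps` term to avoid division by zero; the SQL and C versions in this repository may differ in those details.

```python
def chi_square_distance(a, b, eps=1e-10):
    """Chi-square distance between two equal-length histograms (sketch).

    Assumed form: 0.5 * sum((a_i - b_i)^2 / (a_i + b_i + eps)).
    eps is an assumption of this sketch; it guards against empty bins.
    """
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


if __name__ == "__main__":
    # Identical histograms are at distance 0; disjoint ones are far apart.
    print(chi_square_distance([0.5, 0.5], [0.5, 0.5]))
    print(chi_square_distance([1.0, 0.0], [0.0, 1.0]))
```

The distance is symmetric, which is why each pair of images only needs to be computed once.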

Updating distances:

INSERT INTO photolog_histogram_distances
(a_id, b_id, bins, distance)
(SELECT a.id, b.id, 32,
   chi_square_distance(a.rgb_histogram, b.rgb_histogram)
 FROM photolog_histograms AS a
   INNER JOIN photolog_histograms AS b
      ON a.id < b.id AND a.bins = b.bins
 WHERE a.bins = 32 AND a.id || '-' || b.id NOT IN
 (SELECT a_id || '-' || b_id FROM photolog_histogram_distances
  WHERE bins = 32));

REFRESH MATERIALIZED VIEW all_histogram_distances;

Notes:

  • generate_photolog_histograms.py took 45 minutes with 32 bins.
  • rgb_histogram.py took a little over 29 minutes to do every pairwise distance twice (1-2 and 2-1) with 8 bins.
  • chi_square_distance.sql took just under two hours to do every pairwise distance once with 8 bins.
  • chi_square_distance.c took 79 seconds with 8-bin histograms and 450 seconds with 16-bin histograms. Computing 32-bin histogram distances took 54 minutes.
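
The timings above grow with both the histogram size and the number of image pairs. The sketch below only works through that arithmetic; the helper names are hypothetical, and the image count in the usage example is a placeholder, not the size of the actual photo collection.

```python
from math import comb


def histogram_length(bins):
    """Number of entries in a 3-D RGB histogram with `bins` bins per channel."""
    return bins ** 3


def pair_count(n_images, ordered=False):
    """Number of pairwise distances for n_images.

    ordered=False counts each pair once (as with a.id < b.id in the SQL);
    ordered=True counts both directions (1-2 and 2-1).
    """
    unordered = comb(n_images, 2)
    return 2 * unordered if ordered else unordered


if __name__ == "__main__":
    for bins in (8, 16, 32):
        print(bins, histogram_length(bins))
    # 1000 is a placeholder image count chosen only for illustration.
    print(pair_count(1000), pair_count(1000, ordered=True))
```

Each doubling of the bin count multiplies the array length by eight, so a steep rise in run time from 8 to 16 to 32 bins is expected.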

About

Routines to calculate 3d RGB histogram distances in Python, SQL and C.
