# Automated Retail Shelf Analytics

__Note__: Click the following button to show or hide the code.

In [1]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Show/Hide Code"></form>''')

## Introduction

In today's competitive and evolving __Consumer Packaged Goods (CPG)__ market, shelf management is critical for CPG manufacturers and retailers. __Automated Retail Shelf Analytics__ is a powerful, accurate, and efficient way for them to collect, measure, and analyze what is happening on the physical shelf. It helps them improve operational efficiency, elevate the shopper experience, and maximize revenue. __Automated Retail Shelf Analytics__ harnesses the power of __advanced computer vision__ to transform store shelf images into a comprehensive set of SKU-level metrics and insights:
1. Count of each SKU
2. Linear shelf share of each SKU
3. List of out-of-stock SKUs
4. Metrics related to the placement of SKUs
5. Promos and price tags displayed against the SKUs

Once shelf pictures are captured and uploaded to the server, this technology automatically recognizes all SKUs, computes the KPIs listed above, and delivers actionable reports to the sales reps.



![title](images/intro_rsa.jpg)

## Objective

To develop an __Intelligent Retail Shelf Analytics__ system that accepts multiple shelf images at once and generates the metrics and actionable insights listed above.

## System Pipeline


The modules of the system are shown below. Each one is explained in detail in the following sections.

![title](images/System_pipeline_rsa.png)

## Dataset

Because this project was carried out for one of our CPG clients at __Fractal Analytics__, I cannot present results on the actual shelf data. Instead, I present all the details using a public dataset that is available on __GitHub__ for similar research. The dataset can be downloaded from https://github.com/gulvarol/grocerydataset.

A few details regarding the dataset are as follows:
1. It was collected from around 40 grocery stores using 4 cameras.
2. It consists of products from 10 different cigarette brands.
3. It consists of a total of 354 shelf images.
4. Out of these 354 shelf images, 300 were used for training and the rest for testing.
5. An annotation line is shown below. The first entry is the shelf image name (C1_P04_N1_S3_2.JPG), followed by the number of products on that shelf (42). Each product then contributes the (x, y, w, h) of its bounding box (e.g., 1024, 1660, 228, 336) and its brand id (0).

![title](images/Annotation_rsa.png)
6. It provides only bounding box co-ordinates, with no usable class information for these boxes.
7. Co-ordinates are given as (x, y, w, h), where (x, y) is the co-ordinate of the upper-left corner and (w, h) are the width and height of the bounding box.
8. All the bounding boxes are labeled as class 0. Therefore, the data for classification had to be labeled manually.
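
The annotation layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the fields are whitespace-separated and that every product contributes exactly five integers (x, y, w, h, brand id). The names `Box`, `parse_annotation_line`, and `to_corners` are hypothetical helpers, and the second box in the sample line is made up for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int          # upper-left corner, horizontal
    y: int          # upper-left corner, vertical
    w: int          # box width
    h: int          # box height
    brand_id: int   # 0 for every box in this dataset


def parse_annotation_line(line):
    """Split one annotation line into (image name, list of boxes)."""
    fields = line.split()
    image_name, n_products = fields[0], int(fields[1])
    values = [int(v) for v in fields[2:]]
    if len(values) != 5 * n_products:
        raise ValueError("expected 5 values per product")
    boxes = [Box(*values[i:i + 5]) for i in range(0, len(values), 5)]
    return image_name, boxes


def to_corners(box):
    """Convert (x, y, w, h) to (x_min, y_min, x_max, y_max)."""
    return box.x, box.y, box.x + box.w, box.y + box.h


# Hypothetical two-product line; only the first box comes from the example above.
sample = "C1_P04_N1_S3_2.JPG 2 1024 1660 228 336 0 1300 1660 230 340 0"
name, boxes = parse_annotation_line(sample)
corners = to_corners(boxes[0])
```

Converting to corner form is convenient because the detection module reports boxes by their corners rather than by width and height.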

![title](images/Shelf_1_rsa.JPG)

## Implementation details and results

### Image Stitching Module

A few implementation details of this module are as follows:
1. It accepts one or more images from one or more shelves.
2. The images of each shelf must be taken with some overlap.
3. Images must be supplied in left-to-right order.
4. It stitches two images at a time and repeats the process.
5. Images must be resized so that they have equal height.
6. It uses __SIFT descriptors__ to stitch images based on matching key-points between them.
7. The number of matched key-points must be greater than 5% of the total number of pixels in the smaller image; otherwise, the overlap is considered insufficient and stitching cannot be performed.
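
The equal-height requirement in item 5 is not part of the stitching code in this notebook, so here is a minimal sketch of one way to compute the target sizes. It assumes both images are scaled down to the smaller of the two heights while keeping their aspect ratios; `equal_height_sizes` is a hypothetical helper name.

```python
def equal_height_sizes(shape1, shape2):
    """Return (width, height) targets that give both images the smaller height."""
    h1, w1 = shape1[:2]
    h2, w2 = shape2[:2]
    target_h = min(h1, h2)
    # Scale each width by the same factor as its height to keep the aspect ratio.
    size1 = (max(1, round(w1 * target_h / h1)), target_h)
    size2 = (max(1, round(w2 * target_h / h2)), target_h)
    return size1, size2


# Shapes follow the NumPy/OpenCV convention (rows, cols, channels).
size1, size2 = equal_height_sizes((1200, 1600, 3), (600, 900, 3))
```

Each returned tuple is in (width, height) order, which is the order `cv2.resize` expects for its `dsize` argument.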

In [None]:
#This section contains the code for image stitching module.

import sys
import cv2
import numpy as np
import os
import errno

fileLog = open('myapp.txt','a')

# Use the keypoints to stitch the images
def get_stitched_image(img1, img2, M):

	# shape[:2] is (rows, cols): w1/w2 hold the image heights and h1/h2 the widths
	w1,h1 = img1.shape[:2]
	w2,h2 = img2.shape[:2]

	# Get the canvas dimensions
	img1_dims = np.float32([ [0,0], [0,w1], [h1, w1], [h1,0] ]).reshape(-1,1,2)
	img2_dims_temp = np.float32([ [0,0], [0,w2], [h2, w2], [h2,0] ]).reshape(-1,1,2)


	# Get relative perspective of second image
	img2_dims = cv2.perspectiveTransform(img2_dims_temp, M)
	#print(img1_dims,img2_dims)
	# Resulting dimensions
	result_dims = np.concatenate( (img1_dims, img2_dims), axis = 0)

	# Getting images together
	# Calculate dimensions of match points
	[x_min, y_min] = np.int32(result_dims.min(axis=0).ravel() - 0.5)
	[x_max, y_max] = np.int32(result_dims.max(axis=0).ravel() + 0.5)
	
	# Create output array after affine transformation 
	transform_dist = [-x_min,-y_min]
	transform_array = np.array([[1, 0, transform_dist[0]], 
								[0, 1, transform_dist[1]], 
								[0,0,1]]) 
	#print(x_max-x_min, y_max-y_min)
	# Warp images to get the resulting image
	result_img = cv2.warpPerspective(img2, transform_array.dot(M), 
									(x_max-x_min, y_max-y_min))
	result_img[transform_dist[1]:w1+transform_dist[1], 
				transform_dist[0]:h1+transform_dist[0]] = img1

	# Return the result
	return result_img

# Find SIFT and return Homography Matrix
def get_sift_homography(img1, img2):

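	# Initialize SIFT (OpenCV >= 4.4 exposes it as cv2.SIFT_create();
	# older contrib builds provide cv2.xfeatures2d.SIFT_create())
	sift = cv2.SIFT_create() if hasattr(cv2, 'SIFT_create') else cv2.xfeatures2d.SIFT_create()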

	# Extract keypoints and descriptors
	k1, d1 = sift.detectAndCompute(img1, None)
	k2, d2 = sift.detectAndCompute(img2, None)

	# Bruteforce matcher on the descriptors
	bf = cv2.BFMatcher()
	matches = bf.knnMatch(d1,d2, k=2)

	# Keep only good matches using the ratio test
	verify_ratio = 0.8 # Source: stackoverflow
	verified_matches = []
	for m1,m2 in matches:
		# Add to array only if the best match is clearly better than the second best
		if m1.distance < verify_ratio * m2.distance:
			verified_matches.append(m1)

	# Minimum number of matches
	#min_matches = 1000
	fileLog.write("VERIFIED MATCHES: "+str(len(verified_matches)))
	#if len(verified_matches) > min_matches:
		
	# Array to store matching points
	img1_pts = []
	img2_pts = []

	# Add matching points to array
	for match in verified_matches:
		img1_pts.append(k1[match.queryIdx].pt)
		img2_pts.append(k2[match.trainIdx].pt)
	img1_pts = np.float32(img1_pts).reshape(-1,1,2)
	img2_pts = np.float32(img2_pts).reshape(-1,1,2)
	
	# Compute homography matrix
	M, mask = cv2.findHomography(img1_pts, img2_pts, cv2.RANSAC, 5.0)
	return (M,len(verified_matches))
	#else:
	#	print('Error: Not enough matches')
	#	exit()

# Equalize Histogram of Color Images
def equalize_histogram_color(img):
	img_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
	img_yuv[:,:,0] = cv2.equalizeHist(img_yuv[:,:,0])
	img = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2BGR)
	return img

def stitch_two(img1,img2):
	img1 = equalize_histogram_color(img1)
	img2 = equalize_histogram_color(img2)
	M,len_verified_matches =  get_sift_homography(img1, img2)
	stitch_res = get_stitched_image(img2, img1, M)
	return (stitch_res,len_verified_matches)
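
# Greedily stitch img1 with the best-matching remaining image, then recurse
def one_stitch_iter(filepath, img1, img2_idx, stitch_out = None, stitchIndex = None, size_thresh = 0.8):
	# Avoid mutable default arguments, which are shared across calls
	stitch_out = [] if stitch_out is None else stitch_out
	stitchIndex = [] if stitchIndex is None else stitchIndex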
	len_verified_matches_lst = []
	temp_stitch_lst = []
	min_matches = 1000
	# print("inone img2_idx", img2_idx)
	
	for i in img2_idx:
		img2 = cv2.imread(filepath[i])
		temp_stitch, len_verified_matches = stitch_two(img1, img2)
		#print(temp_stitch.shape,img1.shape,img2.shape)
		if(temp_stitch.shape[1] > size_thresh*(img1.shape[1] + img2.shape[1]) and len_verified_matches > min_matches):
			len_verified_matches_lst.append(len_verified_matches)
		else:
			len_verified_matches_lst.append(-1)
		temp_stitch_lst.append(temp_stitch)

	max_len_verified_matches, max_len_verified_matches_idx = np.max(len_verified_matches_lst), np.argmax(len_verified_matches_lst)
	
	if(max_len_verified_matches > -1):
		add_stitch_res = True
		img1 = temp_stitch_lst[max_len_verified_matches_idx]
		stitchIndex.append(img2_idx[max_len_verified_matches_idx])
		img2_idx.remove(img2_idx[max_len_verified_matches_idx])
		stitch_out.append(temp_stitch_lst[max_len_verified_matches_idx])
	
		if(len(img2_idx) != 0):
			stitch_out, img2_idx, stitchIndex = one_stitch_iter(filepath, img1, img2_idx, stitch_out, stitchIndex)

	return (stitch_out, img2_idx, stitchIndex)

def main(filepath):
	path = os.path.join('/data1/naquib.alam/images', filepath)

	#Get number of input images
	filelist = []

	for f in os.listdir(path):
		filelist.append(os.path.join(path, f))
	n_images = len(filelist)
	filelist = sorted(filelist)
	# print(filelist)
	# print(filelist)

	# img1 = cv2.imread(filepath[0])
	res_images = []
	res_img_names = []
	stitch_possible = True
	cnt = 0
	# Get input set of images
	img2_idx = list(range(0, n_images))
	stitch_res = []
	# while True:
	stitch_indices = []
	while len(img2_idx) > 1:
		# print("ID:", img2_idx)
		
		img1 = cv2.imread(filelist[img2_idx[0]])
		temp = [img2_idx[0]]
		img2_idx.pop(0)
		# print("IDX:", img2_idx)
		stitch_out, img2_idx, stitch_index_temp = one_stitch_iter(filelist, img1, img2_idx, [], temp)
		# print("stitch_index_temp: ",stitch_index_temp)
		if(len(stitch_out) > 0):
			stitch_res.append(stitch_out[-1])
		else:
			stitch_res.append(img1)

		stitch_indices.append(stitch_index_temp)
		fileLog.write("LEN:"+str(len(stitch_res)))
		#print(img2_idx)

	if(len(img2_idx) == 1):
		stitch_res.append(cv2.imread(filelist[img2_idx[0]]))
		stitch_indices.append(img2_idx)
		
	fileLog.write("LEN FINAL:"+str(len(stitch_res)))

	# Create the output directory and any missing parents; do nothing if it already exists
	os.makedirs(os.path.join('/data1/naquib.alam/static/temp', filepath), exist_ok=True)


	# Save the results
	for i in range(len(stitch_res)):
		res_image_name = '/data1/naquib.alam/static/temp/' + filepath + '/result_' + str(i+1) + '.jpg'
		# res_image_nameC = 'static/temp/resultC_'+str(i)+'.JPG'
		# if stitch_res[i].shape[0] > 1200:
			# rows = 1200
			# cols = int(1200*stitch_res[i].shape[1]/stitch_res[i].shape[0])
		# else:
		try:
			os.mkdir(os.path.join(os.path.join('/data1/naquib.alam/static/temp', filepath), "result_"+str(i+1)))
		except OSError as exc:
			if exc.errno != errno.EEXIST:
				raise
			pass
		for index in stitch_indices[i]:
			path = os.path.join(os.path.join('/data1/naquib.alam/static/temp', filepath), "result_"+str(i+1))
			cv2.imwrite(os.path.join(path, str(index)+".jpg"), cv2.imread(filelist[index]))
		rows = stitch_res[i].shape[0]
		cols = stitch_res[i].shape[1]

		cv2.imwrite(res_image_name, stitch_res[i])
		# cv2.imwrite(res_image_nameC, cv2.resize(stitch_res[i], (rows, cols)))

	# Close the log once, after all results have been written
	fileLog.close()

	# Save the inputs 
	# for f in filelist:
	# 	res_image_nameC = 'static/images/'
	# 	img = cv2.imread(f);
	# 	if img.shape[0] > 1200:
	# 		rows = 1200
	# 		cols = int(1200*img.shape[1]/img.shape[0])
	# 	else:
	# 		rows = img.shape[0]
	# 		cols = img.shape[1]

	# 	cv2.imwrite(res_image_nameC + f[30:], cv2.resize(img, (rows, cols)))

	# for file in os.path.join("/data1/naquib.alam/images", filepath):
	# 	if not os.path.isfile(os.path.join(os.path.join("/data1/naquib.alam/images", filepath), file)):
	# 		continue
	# 	os.remove(os.path.join(os.path.join("/data1/naquib.alam/images", filepath), file))
	# os.remove(os.path.join("/data1/naquib.alam/images", filepath))

	return stitch_indices

# Call main function
if __name__=='__main__':
	# print(sys.argv[1])
	main(sys.argv[1])

![title](images/Stitch_input_rsa.png)  
  
__Note:__ Red rectangles here represent the overlapping region between adjacent images to be stitched.

![title](images/Stitch_output_rsa.png)

### Object Detection Module

A few implementation details of this module are as follows:
1. Object detection was used only to draw bounding boxes (localization), not for classification.
2. All bounding boxes were treated as a single class since we did not have any class information.
3. For object detection, __YOLO__ in __Keras__ and the __Google API__ in __TensorFlow__ were tried.
4. The __Google Object Detection API__ was found to give better results.
5. Within the Google API, different architectures were tried, and the __Faster_rcnn_resnet50_coco__ architecture was found to perform better than the others.
6. Pre-trained weights were fine-tuned on this dataset. For this purpose, the __config__ and __labelmap.txt__ files were modified for our dataset. The labelmap.txt file contains the class information, and the config file contains the image size, the architecture, the training settings, and the paths to train.record, val.record, and labelmap.txt.
7. The fine-tuned model is exported as a TensorFlow graph proto (.pb) file, which is loaded for inference.
8. At inference time, this module outputs 4 tensors:
    1. __num_detections__
    2. __detection_scores__
    3. __detection_boxes__
    4. __detection_classes__
9. All boxes with a score of less than 0.3 were ignored.
10. Overlapping boxes were removed using __Non Maximum Suppression (NMS)__.
11. An __F1-score__ of __0.798__ was achieved.
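
Items 9 and 10 can be illustrated with a small NumPy sketch of score filtering followed by greedy Non Maximum Suppression. This is a generic textbook version rather than the exact overlap-removal routine used later in this notebook. The IoU threshold of 0.5 and the function name `non_max_suppression` are assumptions; only the 0.3 score threshold comes from the list above.

```python
import numpy as np


def non_max_suppression(boxes, scores, score_thresh=0.3, iou_thresh=0.5):
    """Return indices of boxes kept after score filtering and greedy NMS.

    boxes: (N, 4) array of [ymin, xmin, ymax, xmax]; scores: (N,) array.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    candidates = np.where(scores >= score_thresh)[0]
    # Process candidates from the highest to the lowest score.
    order = candidates[np.argsort(-scores[candidates])]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box, clamped at zero.
        inter_h = np.clip(np.minimum(boxes[best, 2], boxes[rest, 2])
                          - np.maximum(boxes[best, 0], boxes[rest, 0]), 0, None)
        inter_w = np.clip(np.minimum(boxes[best, 3], boxes[rest, 3])
                          - np.maximum(boxes[best, 1], boxes[rest, 1]), 0, None)
        inter = inter_h * inter_w
        union = np.maximum(areas[best] + areas[rest] - inter, 1e-12)
        # Drop every remaining box that overlaps the best box too much.
        order = rest[inter / union <= iou_thresh]
    return keep


example_boxes = [[0, 0, 1, 1], [0, 0, 0.9, 0.9], [2, 2, 3, 3], [0, 0, 1, 1]]
example_scores = [0.9, 0.8, 0.7, 0.2]
kept = non_max_suppression(example_boxes, example_scores)
```

In the example, the second box is suppressed because it overlaps the first one heavily, and the fourth box is dropped by the score threshold before suppression starts.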

In [None]:
# This section contains code for object detection module.
import pandas as pd
import cv2 as cv
from PIL import Image
import numpy as np
import tensorflow as tf

df = pd.read_csv('/data1/naquib.alam/data/test_labels.csv')

with tf.gfile.FastGFile('/data1/naquib.alam/model/frozen_inference_graph.pb','rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sess = tf.Session()
sess.graph.as_default()
tf.import_graph_def(graph_def, name='')

for name in df.filename.unique():
    #name = df.filename[0]
    # Read and preprocess an image.
    img = Image.open('/data1/naquib.alam/CIGDataset/ShelfImages/'+name)
    inp = np.array(img)
    inp = inp[:, :, ::-1].copy()
    rows = inp.shape[0]
    cols = inp.shape[1]
    out = sess.run([sess.graph.get_tensor_by_name('num_detections:0'),
                sess.graph.get_tensor_by_name('detection_scores:0'),
                sess.graph.get_tensor_by_name('detection_boxes:0'),
                sess.graph.get_tensor_by_name('detection_classes:0')],
               feed_dict={'image_tensor:0': inp.reshape(1, inp.shape[0], inp.shape[1], 3)})
    num_detections = int(out[0][0])
    
    for i in range(num_detections):
        classId = int(out[3][0][i])
        score = float(out[1][0][i])
        bbox = [float(v) for v in out[2][0][i]]
        # Draw only detections above the confidence threshold
        if score > 0.3:
            # detection_boxes are normalized [ymin, xmin, ymax, xmax]
            x = bbox[1] * cols
            y = bbox[0] * rows
            right = bbox[3] * cols
            bottom = bbox[2] * rows
            cv.rectangle(inp, (int(x), int(y)), (int(right), int(bottom)), (255,0,0), thickness=2)
    

    print('Saving file name: ',name)
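    # inp was flipped to BGR above; convert back to RGB before saving with PIL
    result = Image.fromarray(cv.cvtColor(inp, cv.COLOR_BGR2RGB))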
    result.save('/data1/naquib.alam/CIGDataset/Results/'+name)
    print('Saved')

### Object Classification Module

A few implementation details of this module are as follows:
1. Since all the bounding boxes were annotated with class 0, we had to label the data for classification manually. For this purpose, the objects detected in the training data were selected and then refined to build a cleaner dataset for the classification module.
2. A total of 6550 images were split (stratified) as follows:
    1. Training data: 85%
    2. Validation data: 10%
    3. Test data: 5%
3. All images were divided into __11 classes__, including an __Other__ class that groups all the classes with very few images.

4. Different __pre-trained__ architectures were tried, and __VGG16__ was found to work better than the others.
5. The added __FC layers__ and the __last 10 layers__ of the network were fine-tuned.
6. A few things that improved the accuracy are as follows:
    1. __Data augmentation__
    2. __Discriminative fine-tuning__
    3. __Gradual unfreezing__
    4. __Stochastic Gradient Descent With Restarts__
    5. __Adam optimizer__
7. An __F1-score__ of __0.976__ was achieved.
8. The overall score of detection and classification combined was __0.778__.
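
Of the techniques listed in item 6, __Stochastic Gradient Descent With Restarts__ is the easiest to show in isolation. The sketch below computes a cosine-annealed learning rate that jumps back to its maximum at the start of each cycle. All parameter values (`lr_max`, `lr_min`, `cycle_len`, `cycle_mult`) and the function name are assumptions chosen for illustration, not the settings used in this project.

```python
import math


def sgdr_learning_rate(epoch, lr_max=1e-3, lr_min=1e-5, cycle_len=5, cycle_mult=2):
    """Cosine-annealed learning rate with warm restarts."""
    # Find the cycle that `epoch` falls into and the position inside that cycle.
    t_cur, t_i = epoch, cycle_len
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= cycle_mult
    # Anneal from lr_max towards lr_min over the course of the cycle.
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))


# Learning rate for the first 15 epochs: one cycle of 5 epochs, then one of 10.
schedule = [sgdr_learning_rate(epoch) for epoch in range(15)]
```

A function like this could be plugged into a Keras `LearningRateScheduler` callback, which calls it with the epoch index at the start of each epoch.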

In [None]:
# This cell contains code for classification module.

import os, sys, math, shutil, PIL, json
import numpy as np
import tensorflow as tf
import scipy.io as sio
import keras 
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras.callbacks import ModelCheckpoint
from PIL import Image
from keras.models import model_from_json
from sklearn.metrics import confusion_matrix
from collections import Counter

def train_test_img_split(base_dir,dosplit=False):
    if dosplit is False:
        return
    #print("returned_2")
    folders=os.listdir(base_dir)
    folders=sorted(np.array(folders).astype(int))
    folders=np.array(folders).astype(str)
    #print(folders)
    train_img=[]
    val_img=[]
    test_img=[]
    train_test_img_path=[]
    for x in folders:
        class_files=os.listdir(os.path.join(base_dir,x))
        len_class_files=len(class_files)
        #file_sel_idx=np.random.shuffle(np.arange(len_class_files)
        end_train_idx=int(len_class_files*0.85)
        end_val_idx=int(len_class_files*0.95)
        #train_files=class_files[:end_train_idx]
        #test_files=class_files[end_train_idx:]
        train_files=class_files[:end_train_idx]
        val_files=class_files[end_train_idx:end_val_idx]
        test_files=class_files[end_val_idx:]
        train_img.append(train_files)
        val_img.append(val_files)
        test_img.append(test_files)
        train_test_img_path.append(os.path.join(base_dir,x))
    return(train_test_img_path,train_img,val_img,test_img,folders)
    
def create_train_val_test_dir(train_test_img_src_path,base_dir_split,train_img,val_img,test_img,create_dir=False):
    if create_dir is False:
        return
    #print("Returned_2")
    # Use "valid" so the folder name matches get_number_images and fine_tune_dense
    dirs=["train","valid","test"]
    for i,x in enumerate(dirs):
        if (os.path.exists(os.path.join(base_dir_split,x))) is False:
            os.mkdir(os.path.join(base_dir_split,x))
        if i==0:
            for j,y in enumerate(train_img):
                if (os.path.exists(os.path.join(base_dir_split,x,str(j)))) is False:
                    os.mkdir(os.path.join(base_dir_split,x,str(j)))
                for z in y:
                    shutil.copy(os.path.join(train_test_img_src_path[j],z),os.path.join(base_dir_split,x,str(j)))
        elif i==1:
            for j,y in enumerate(val_img):
                if (os.path.exists(os.path.join(base_dir_split,x,str(j)))) is False:
                    os.mkdir(os.path.join(base_dir_split,x,str(j)))
                for z in y:
                    shutil.copy(os.path.join(train_test_img_src_path[j],z),os.path.join(base_dir_split,x,str(j)))
        else:
            for j,y in enumerate(test_img):
                if (os.path.exists(os.path.join(base_dir_split,x,str(j)))) is False:
                    os.mkdir(os.path.join(base_dir_split,x,str(j)))
                for z in y:
                    shutil.copy(os.path.join(train_test_img_src_path[j],z),os.path.join(base_dir_split,x,str(j)))

def get_number_images(base_dir):
    files=["train","valid","test"]
    n_images=[0,0,0]
    for i,x in enumerate(files):
        folders=os.listdir(os.path.join(base_dir,x))
        for y in folders:
            class_image=os.listdir(os.path.join(base_dir,x,y))
            n_images[i]+=len(class_image)
    return n_images      
       
def get_class_weights(y, smooth_factor=0):
    """
    Returns the weights for each class based on the frequencies of the samples
    :param smooth_factor: factor that smooths extremely uneven weights
    :param y: list of true labels (the labels must be hashable)
    :return: dictionary with the weight for each class
    """
    counter = Counter(y)
    print("COUNTER: ",counter)
    if smooth_factor > 0:
        p = max(counter.values()) * smooth_factor
        for k in counter.keys():
            counter[k] += p

    majority = max(counter.values())

    return {cls: float(majority / count) for cls, count in counter.items()}

def fine_tune_dense(base_dir_split,n_images,do_train=False):
    if do_train is False:
        return
    summ_file=open('model_summary.txt','w')
    train_dir=os.path.join(base_dir_split,"train")
    val_dir=os.path.join(base_dir_split,"valid")
    #test_dir=os.path.join(base_dir_split,"test")
    img_width, img_height = 224,224
    n_class=16
    batch_size=32
    epochs=20
    learning_rate=0.001
    decay_rate=1/epochs
    nb_train_samples=n_images[0]
    nb_validation_samples=n_images[1]
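    # Keras expects integer step counts; round up so every sample is seen once per epoch
    step_per_epoch=int(math.ceil(nb_train_samples/batch_size))
    validation_step=int(math.ceil(nb_validation_samples/batch_size))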
    model = applications.VGG16(weights='imagenet', include_top=False,input_shape = (224,224,3))
    print('Model loaded.')
    #return model
    
    top_model = Sequential()
    for layer in model.layers:
        top_model.add(layer)
    top_model.add(Flatten(input_shape=model.output_shape[1:]))
    #return model
    top_model.add(Dense(256, activation='relu',kernel_initializer='he_normal',bias_initializer=keras.initializers.Constant(value=0.1)))
    top_model.add(Dropout(0.2))
    top_model.add(Dense(n_class, activation='softmax',kernel_initializer='he_normal',bias_initializer=keras.initializers.Constant(value=0.1)))
    #model.add(top_model)
    print('Top dense layer added')
    #return top_model
    for layer in top_model.layers[:19]:
        layer.trainable = False
    sgd=optimizers.SGD(lr=learning_rate, momentum=0.9, decay=decay_rate)
    top_model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    # prepare data augmentation configuration
    train_datagen = ImageDataGenerator(rotation_range=15,width_shift_range=0.1,height_shift_range=0.1,zoom_range=0.1,rescale=1. / 255)
    val_datagen = ImageDataGenerator(rescale=1. / 255)

    train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(img_height, img_width),batch_size=batch_size,class_mode='categorical')

    validation_generator = val_datagen.flow_from_directory(
    val_dir,target_size=(img_height, img_width),batch_size=batch_size, class_mode='categorical')
    #print("Train_Class_indices:",train_generator.class_indices)
    #print("Val_Class_Indices:",validation_generator.class_indices)
    class_weight=get_class_weights(train_generator.classes,0.1)
    #print("CLASS_WEIGHTS: ",class_weight)
    checkpoint_path='CIGDataset/SplitData/models/KerasTest/VGG16_Adam_30_0.hdf5'
    checkpoint= ModelCheckpoint(checkpoint_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    callback_list=[checkpoint]  
    hist=top_model.fit_generator(
    train_generator,
    steps_per_epoch=step_per_epoch,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_step,
    class_weight=class_weight,
    callbacks= callback_list)
    
    model_json=top_model.to_json()
    with open("CIGDataset/SplitData/models/KerasTest/VGG16_Adam_30_0.json","w") as json_file:
        json.dump(model_json,json_file)
    top_model.save_weights("CIGDataset/SplitData/models/KerasTest/VGG16_Adam_30_0.h5") 
    with open('CIGDataset/SplitData/models/KerasTest/VGG16_Hist_Adam_30_0.json',"w") as json_file:
        json.dump(hist.history,json_file)
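    # summary() prints to stdout and returns None, so redirect it with print_fn instead
    top_model.summary(print_fn=lambda line: summ_file.write(line + '\n'))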
    summ_file.close()
    return top_model

def do_predict(model_json,weights,base_dir_split,n_images,dopredict=False):
    if dopredict is False:
        return
    test_dir=os.path.join(base_dir_split,"test")
    nb_test_samples=n_images[2]
    #nb_test_samples=7724
    img_height,img_width=224,224
    batch_size=32
    # The architecture was saved with json.dump, so read it back with json.load
    with open(model_json,'r') as arch_file_json:
        model_json=json.load(arch_file_json)
    loaded_model=model_from_json(model_json)
    loaded_model.load_weights(weights)
    test_datagen=ImageDataGenerator(rescale=1. / 255)
    test_generator=test_datagen.flow_from_directory(test_dir,target_size=(img_height,img_width),batch_size=batch_size,class_mode='categorical',shuffle=False)
    print("test class indices: ",test_generator.class_indices)
    #print("test classes: ",test_generator.classes)
    # Use exactly ceil(n / batch_size) steps so the predictions line up with test_generator.classes
    pred_label=loaded_model.predict_generator(test_generator,int(math.ceil(nb_test_samples/batch_size)))
    print("Shape: ",pred_label.shape)
    y_true=test_generator.classes
    y_pred=np.argmax(pred_label,1)
    print(confusion_matrix(y_true,y_pred))
    #print(np.max(pred_label,1),np.argmax(pred_label,1))
    return pred_label
    
if __name__ == "__main__":
    base_dir_whole= r"/data1/naquib.alam/CIGDataset/CroppedImages"
    base_dir_split= r"/data1/naquib.alam/CIGDataset/SplitData"
    #print(base_dir_whole,base_dir_split)
    #x="/data1/naquib.alam/ShelfMonitoring/ObjectDetection/CiggarateDataset/grocerydataset/CroppedImageCategoryRefined/WholeData"
    #print("Trax",type(base_dir_whole),x==base_dir_whole)
    #train_test_img_src_path,train_img,val_img,test_img,folders = train_test_img_split(base_dir_whole)
    #print(type(train_test_img_src_path),type(train_img),type(val_img),type(test_img))
    #create_train_val_test_dir(train_test_img_src_path,base_dir_split,train_img,val_img,test_img)
    n_images = get_number_images(base_dir_split)
    print(n_images)
    model = fine_tune_dense(base_dir_split,n_images,True)
    #pred_label=do_predict("model_30_32_LIT.json","weights_30_32_LIT.h5",base_dir_split,n_images,True)
    #print(np.argmax(pred_label,1))  


![title](images/Object_detection_1_rsa.png)
  
1. Bounding boxes of different colors represent the different cigarette brands assigned by the classifier.
2. Red dots represent the misclassified objects.
3. Yellow dots represent boxes which were not detected.

### Metrics Calculation Module


A few implementation details of this module are as follows:
1. Outputs from the previous modules were used to calculate all the relevant metrics.
2. The number of racks on a shelf was found by sorting the bounding boxes by their first co-ordinate (the top edge, `ymin`, in the detection output) and then finding the transition points, i.e., unusually large gaps.
3. The length (width) of each SKU was found from the predicted width (w) of its bounding box.
4. The length of each rack was also found in order to calculate the linear shelf share.
5. All this information was used to calculate the aforementioned metrics.
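
The rack assignment and the linear shelf share can be sketched in plain Python. The gap rule (a new rack starts where the gap exceeds 30% of the largest gap) mirrors the `_find_racks` method in the cell below. The shelf-share function is an assumption: it divides each SKU's summed box width by the summed width of all boxes, which is only one possible definition. Both function names are hypothetical.

```python
from collections import defaultdict


def assign_racks(tops, gap_frac=0.3):
    """Label each box with a rack index based on large gaps between sorted top edges."""
    order = sorted(range(len(tops)), key=lambda i: tops[i])
    gaps = [tops[order[k + 1]] - tops[order[k]] for k in range(len(order) - 1)]
    max_gap = max(gaps) if gaps else 0.0
    racks = [0] * len(tops)
    current = 0
    for k, gap in enumerate(gaps):
        # A gap that is large relative to the biggest gap marks the start of a new rack.
        if max_gap > 0 and gap > gap_frac * max_gap:
            current += 1
        racks[order[k + 1]] = current
    return racks


def linear_shelf_share(widths, skus):
    """Fraction of the total occupied width taken by each SKU."""
    total = sum(widths)
    per_sku = defaultdict(float)
    for width, sku in zip(widths, skus):
        per_sku[sku] += width
    return {sku: width / total for sku, width in per_sku.items()}


# Normalized top edges and widths of five hypothetical boxes on two racks.
tops = [0.10, 0.52, 0.11, 0.50, 0.12]
widths = [0.2, 0.1, 0.2, 0.3, 0.2]
skus = ["A", "B", "A", "B", "C"]
racks = assign_racks(tops)
share = linear_shelf_share(widths, skus)
```

To compute the share per rack instead of per shelf, the same function can be applied to the widths and SKUs of the boxes that carry a given rack index.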


In [None]:
#this cell contains the code for metrics calculation and promo detection module.

import os 
import re
import cv2 as cv
import numpy as np
from PIL import Image
from matplotlib import style
style.use("ggplot")
import random
import tensorflow as tf
import pandas as pd
import sys
import json
import time, datetime, errno
from sklearn.cluster import KMeans
import pytesseract

from keras.models import model_from_json

class GetOutOfLoop(Exception):
    pass

class ShelfShare(object):
    
    graph_def = None
    loaded_model = None
    noOfPromo = None
    
    def __init__(self, path = '/data1/naquib.alam/model/frozen_inference_graph.pb',
                 model = "/data1/naquib.alam/model_31_32_2_sgd.json",
                 weights = "/data1/naquib.alam/weights_31_32_2_sgd.h5"):
    # def trial(self, path = '/data1/naquib.alam/model/frozen_inference_graph.pb',
    #              model = "/data1/naquib.alam/model_31_32_2_sgd.json",
    #              weights = "/data1/naquib.alam/weights_31_32_2_sgd.h5"):
        if self.graph_def is None:
            self._graph_load(path)
        if self.loaded_model is None:
            self._load_classifier(model, weights)
        if self.noOfPromo is None:
            self.noOfPromo = []
        
    def _graph_load(self, path):
        graph_path = path
        with tf.gfile.FastGFile(graph_path,'rb') as f:
            self.graph_def = tf.GraphDef()
            self.graph_def.ParseFromString(f.read())
            # print('Detection module loaded...')
    
    def _load_classifier(self, model, weights):
        model_json = model
        arch_file_json = open(model_json,'r')
        model_json = json.load(arch_file_json)
        arch_file_json.close()
        self.loaded_model = model_from_json(model_json)
        self.loaded_model.load_weights(weights)
        # print('Classifier loaded...')
        
    def _image_load(self, path = '/data1/naquib.alam/CIGDataset/ShelfImages/C4_P06_N1_S4_1.JPG'):
        image_path = path
        img = Image.open(image_path)
        self.ing = np.array(img)
        self.ing = self.ing[:, :, ::-1].copy()
        # print('Image uploaded for processing...')
    
    def _find_object(self):
        
        self.imgCropTest = self.ing
        self.ing = cv.resize(self.ing,(600,600))
                
        self._out = self._sess.run([self._sess.graph.get_tensor_by_name('num_detections:0'),
            self._sess.graph.get_tensor_by_name('detection_scores:0'),
            self._sess.graph.get_tensor_by_name('detection_boxes:0'),
            self._sess.graph.get_tensor_by_name('detection_classes:0')],
           feed_dict={'image_tensor:0': self.ing.reshape(1, self.ing.shape[0], self.ing.shape[1], 3)})
        # print("Bounding boxes calculated...")
        self._sess.close()
        return self._out
    
    def _process_boxes(self, inputfolder):
        num_detections = int(self._out[0][0])
        #bound_box = self._out[2][0]
        bound_box = self._remove_overlap(self._out[2][0])
        bound_box = self._remove_overlap(bound_box, True)
        self._out[2][0] = bound_box
        #num_detections = bound_box[(bound_box[:,0] != 0) & (bound_box[:,1] != 0) & (bound_box[:,2] != 0) & (bound_box[:,3] != 0),:].shape[0]
        Y = np.zeros([num_detections,2])
        #Y[:,0] = bound_box[np.any([bound_box[:,0]!=0, bound_box[:,1]!=0],axis=0),0]
        #Y[:,1] = bound_box[np.any([bound_box[:,0]!=0, bound_box[:,1]!=0],axis=0),2]
        Y = bound_box[0:num_detections,:]

        rack_info = self._find_racks(Y)
        self._draw_boxes(rack_info)
        results = self._classify_boxes(bound_box, rack_info)
        self._write_results(results, rack_info[1], inputfolder)
        return results
        
    def _remove_overlap(self, bound_box, flag=False):
        if not flag:
            indexS = np.argsort(bound_box[:,1])
        else:
            indexS = np.argsort(bound_box[:,3])
        indexSR = np.argsort(indexS)
        bound_box = bound_box[indexS,:]
        # Start at 1 so that the first pair of boxes (1, 0) is also compared
        for i in range(1, bound_box.shape[0]):
            if (bound_box[i-1,0] == 0) & (bound_box[i-1,1] == 0) & (bound_box[i-1,2] == 0) & (bound_box[i-1,3] == 0):
                continue
            area1 = 1000000*(bound_box[i,3]-bound_box[i,1])*(bound_box[i,2]-bound_box[i,0])
            area2 = 1000000*(bound_box[i-1,3]-bound_box[i-1,1])*(bound_box[i-1,2]-bound_box[i-1,0])
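            # Clamp each side of the intersection at zero so disjoint boxes cannot yield a positive product
            overlap_w = np.maximum(0, np.minimum(bound_box[i,3], bound_box[i-1,3])-np.maximum(bound_box[i,1], bound_box[i-1,1]))
            overlap_h = np.maximum(0, np.minimum(bound_box[i,2], bound_box[i-1,2])-np.maximum(bound_box[i,0], bound_box[i-1,0]))
            overlap_area = 1000000*overlap_w*overlap_h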

            if overlap_area >= 0.8*np.minimum(area2,area1):
                if area2 > area1:
                    bound_box[i,:] = 0
                else:
                    bound_box[i-1,:] = 0
                
        return bound_box[indexSR,:]

    def _find_racks(self, Y):
        trial = Y[:,0]
        ind = np.argsort(trial)
        trial = np.sort(trial)
        diff = trial[1:]-trial[:-1]
        # diff2 = diff[1:]-diff[:-1]
    
        labInd = np.zeros(len(trial))
        for i in range(len(diff)):
            if diff[i] > np.max(diff)*0.3:
                labInd[i+1] = labInd[i]+1
            else:
                labInd[i+1] = labInd[i]
    
        col = [(255,0,0),(0,255,0),(0,0,255)]
        for i in range(3,len(np.unique(labInd))):
            # random.randint is inclusive at both ends, so 255 is the largest valid channel value
            col.append((255,random.randint(0,255),random.randint(0,255)))
        # print('Rack information calculated...')
        return ind, labInd, col
    
    def _draw_boxes(self, rack_info):
        num_detections = int(self._out[0][0])
        rowsCT = self.imgCropTest.shape[0]
        colsCT = self.imgCropTest.shape[1]
        self.ing = cv.resize(self.ing, (colsCT,rowsCT))

        col = rack_info[2]
        labInd = rack_info[1]
        ind = rack_info[0]

        if (rowsCT >= 1500) or (colsCT >= 1500):
            thickness = 10
        elif (rowsCT >= 500) or (colsCT >= 500):
            thickness = int(4 + (colsCT/1000 - 0.5)*6)  # cv.rectangle needs an integer thickness
        else:
            thickness = 4
                
        for i in range(num_detections):
            index = np.where(ind == i)
            score = float(self._out[1][0][i])
            bbox = [float(v) for v in self._out[2][0][i]]
            if (bbox[0] == 0) & (bbox[1] == 0) & (bbox[2] == 0) & (bbox[3] == 0):
                continue 
            if score > 0.3:
                x = bbox[1] * colsCT
                y = bbox[0] * rowsCT
                right = bbox[3] * colsCT
                bottom = bbox[2] * rowsCT
                cv.rectangle(self.ing, (int(x), int(y)), (int(right), int(bottom)), col[int(labInd[index[0][0]])], thickness=thickness)
    
        #cv.imshow('TensorFlow MobileNet-SSD', self.ing)

    def _draw_promo_boundary(self):
        # print("Running Contour Detection...")
        imgray = cv.cvtColor(self.ing_promo, cv.COLOR_BGR2GRAY)
        ret, thresh = cv.threshold(imgray, 127, 255, 0)
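        # findContours returns 3 values in OpenCV 3.x and 2 in 4.x; the last two are always (contours, hierarchy)
        contours, hierarchy = cv.findContours(thresh, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)[-2:]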
        # print("Total number of contours:", len(contours))
        area_contours=[]

        for cnt in contours:
            area_contours.append(cv.contourArea(cnt))
        area_contours = np.array(area_contours)
        top_area_args = np.argsort(-area_contours)[:30]
        top_area_contours = area_contours[top_area_args]
        rect_xywh_txt = []

        for idx in top_area_args:
            cnt = contours[idx]
            x, y, w, h = cv.boundingRect(cnt)
            im_ocr = self.ing_promo[y:y+h, x:x+w]
            im_ocr_gray = cv.cvtColor(im_ocr, cv.COLOR_BGR2GRAY)
            _, im_ocr_thresh = cv.threshold(im_ocr_gray, 127, 255, 0)

            text = pytesseract.image_to_string(Image.fromarray(im_ocr_thresh))
            # Keep only regions in which OCR finds some non-whitespace text
            if text.strip():
                rect_xywh_txt.append([x,y,w,h])
        
        if len(rect_xywh_txt) == 0:
            return
        
        rect_xywh_sorted = sorted(rect_xywh_txt, key = lambda t:t[1])
        tolerance_y = int(self.ing_promo.shape[0]*0.06)
        rect_xywh_promos = []
        rect_xywh_promos.append([rect_xywh_sorted[0]])
        current_miny_rect = rect_xywh_sorted[0]
        idx = 0
        for rect in rect_xywh_sorted[1:]:
            if rect[1] < current_miny_rect[1] + tolerance_y:
                rect_xywh_promos[idx].append(rect)
            else:
                idx = idx+1
                rect_xywh_promos.append([rect])
                current_miny_rect = rect
                
        rect_xywh_promo_combined = []
        
        for rects in rect_xywh_promos:
            rects_arr = np.array(rects)
            n_rows = rects_arr.shape[0]
            right_arr = (rects_arr[:, 0] + rects_arr[:, 2]).reshape(n_rows, 1)
            bottom_arr= (rects_arr[:, 1] + rects_arr[:, 3]).reshape(n_rows, 1)
            rects_arr= np.hstack((rects_arr, right_arr, bottom_arr))
            
            min_arr = np.min(rects_arr, axis = 0)
            max_arr = np.max(rects_arr, axis=0)
            min_x, min_y = min_arr[0], min_arr[1]
            max_right, max_bottom = max_arr[4], max_arr[5]
            
            rect_xywh_promo_combined.append([min_x, min_y, max_right, max_bottom])

        rowsCT = self.imgCropTest.shape[0]
        colsCT = self.imgCropTest.shape[1]
        if (rowsCT >= 1500) or (colsCT >= 1500):
            thickness = 10
        elif (rowsCT >= 500) or (colsCT >= 500):
            thickness = int(4 + (colsCT/1000 - 0.5)*6)  # cv.rectangle needs an integer thickness
        else:
            thickness = 4
            
        for xywh in rect_xywh_promo_combined:
            cv.rectangle(self.imgCropTest, (int(xywh[0]), int(xywh[1])), (int(xywh[2]), int(xywh[3])), (0,255,240), thickness=thickness)

        self.noOfPromo.append(len(rect_xywh_promo_combined))

        return
    
    def _classify_boxes(self, bound_box, rack_info):
        ind = rack_info[0]
        labInd = rack_info[1]
                
        rowsCT = self.imgCropTest.shape[0]
        colsCT = self.imgCropTest.shape[1]

        self.ing_promo = self.ing.copy()
        
        # print(rowsCT, colsCT)
        x_max = np.amax(bound_box[:,3])
        x_min = np.amin(bound_box[bound_box[:,1] > 0,1])

        shelfRatios = np.zeros((len(np.unique(labInd)),16))
        shelfCoverage = np.zeros((len(np.unique(labInd)),16))
        shelfStats = [None]*16
        boxClass = [None]*16
        length = np.zeros((len(np.unique(labInd)),1))
        infoList = []
        # include = [False]*16

        # for visibility index
        finalCoverData = []
        placements = []
        coverage = []
        xInfo = [None]*16
        vacInfo = []

        size = np.ones((16,1))
        # print('Starting classification module...')
        noOfShelves = len(np.unique(labInd))
        for i in range(noOfShelves):
            index = np.where(labInd == i)
            bbox = bound_box[ind[index],:]

            # To remove zero size boxes
            if len(np.unique(bbox[:,0])) == 1:
                continue
            infoList.append(i)
            
            rowCoverDataA = []
            rowCoverDataB = []
            rowUncoverData = []
            rowPlacement = []

            # To remove boxes at the edges of the images
            bbox = np.sort(bbox, axis = 0)
            if 0 in np.unique(bbox):
                locs = np.unique(np.where(bbox == 0)[0])
                # print(locs)
                if len(np.unique(bbox[locs,:])) == 1:
                    bbox = bbox[np.unique(np.where(bbox != 0)[0]),:]
                else:
                    for loc in locs:
                        if len(np.where(bbox[loc, :] == 0)[0]) == 4:
                            bbox[loc,:] = np.nan
                    bbox = bbox[~np.any(np.isnan(bbox), axis=1)]

            maxX = 0
            minX = np.amin(bbox[:,1])
            images = np.zeros((bbox.shape[0],224,224,3)) 
            for j in range(bbox.shape[0]):
                testImg = self.imgCropTest[int(rowsCT*bbox[j,0]):int(rowsCT*bbox[j,2])+1,int(colsCT*bbox[j,1]):int(colsCT*bbox[j,3])+1]
                testImgR = cv.resize(testImg,(224,224))
                images[j,:,:,:] = testImgR
            pred_label = self.loaded_model.predict(images/255)

            if bbox[:,[1,3]].shape[0] < 3:
                cluster_info = [0]*(bbox[:,[1,3]].shape[0])
            else:
                cluster_info = KMeans(n_clusters = 3).fit_predict(bbox[:,[1,3]])
            
            lastLabel = 0
            newClusterLabels = []
            for cluster in range(len(cluster_info)):
                if cluster == 0:
                    newClusterLabels.append(lastLabel)
                else:
                    if cluster_info[cluster - 1] != cluster_info[cluster]:
                        lastLabel += 1
                    newClusterLabels.append(lastLabel)
            cluster_info = newClusterLabels

            # print("Cluster Info: ", cluster_info)

            y_min = int(np.min(bbox[:,0]*rowsCT))
            y_max = int(np.max(bbox[:,2]*rowsCT))
            self.ing_promo[y_min:y_max, :] = 0
            # print(pred_label, np.sum(pred_label, axis=1))

            #print(np.max(pred_label,1),np.argmax(pred_label,1))
            for j in range(bbox.shape[0]):
                # if np.max(pred_label[j]) < 0.3:
                #     continue
                if shelfStats[np.argmax(pred_label[j,:])] is None:
                    shelfStats[np.argmax(pred_label[j,:])] = []
                    boxClass[np.argmax(pred_label[j,:])] = []
                    xInfo[np.argmax(pred_label[j,:])] = [np.zeros((noOfShelves, 1)), np.zeros((noOfShelves, 1)), np.zeros((noOfShelves, 1))]
                    # include[np.argmax(pred_label[j,:])] = True

                shelfStats[np.argmax(pred_label[j,:])].append(bbox[j,3]-np.maximum(bbox[j,1],maxX))
                boxClass[np.argmax(pred_label[j,:])].append([i,j])
                shelfRatios[i,np.argmax(pred_label[j,:])] += 1
                shelfCoverage[i,np.argmax(pred_label[j,:])] += bbox[j,3]-np.maximum(bbox[j,1],maxX)

                rowPlacement.append(np.argmax(pred_label[j,:]))
                rowCoverDataA.append(bbox[j,3])
                rowCoverDataB.append(np.maximum(bbox[j,1], maxX))

                xInfo[np.argmax(pred_label[j,:])][cluster_info[j]][i] += 1
                
                maxX = bbox[j,3]
                size[np.argmax(pred_label[j,:])] = np.minimum(size[np.argmax(pred_label[j,:])],bbox[j,3]-bbox[j,1])

                if j == 0:
                    # print("xmin", infoList.index(i), bbox[j, 1], x_min)
                    vacInfo.append([infoList.index(i), bbox[j, 1] - x_min, -1, np.argmax(pred_label[j+1,:])])
                    # print("x", infoList.index(i), bbox[j+1, 1], bbox[j, 3])
                    vacInfo.append([infoList.index(i), bbox[j+1, 1] - bbox[j, 3], -1, np.argmax(pred_label[j+1,:])])
                elif j == bbox.shape[0]-1:
                    blankSpace = x_max - bbox[j, 3]
                    # print("xmax", infoList.index(i), x_max, bbox[j, 3])
                    vacInfo.append([infoList.index(i), x_max - bbox[j, 3], np.argmax(pred_label[j-1,:]), -1])
                else:
                    # print("x", infoList.index(i), bbox[j+1, 1], bbox[j, 3])
                    vacInfo.append([infoList.index(i), bbox[j+1, 1] - bbox[j, 3], np.argmax(pred_label[j-1,:]), np.argmax(pred_label[j+1,:])])

                # if bbox[j,1] > maxX:
                #     rowUncoverData.append()

            placements.append(rowPlacement)
            coverage.append([rowCoverDataA, rowCoverDataB])
            length[i] = 0.12*(x_max - x_min) + 0.88*(maxX - minX)
            # print("i ", length)

        self._draw_promo_boundary()

        # print('Shelf coverage calculated...')
        # print("All Shelves: ", xInfo)
        
        shelfRatios = shelfRatios[infoList,:]
        shelfCoverage = shelfCoverage[infoList,:]
        length = length[infoList,:]
        for i in range(16):
            if xInfo[i] is not None:
                xInfo[i][0] = xInfo[i][0][infoList]
                xInfo[i][1] = xInfo[i][1][infoList]
                xInfo[i][2] = xInfo[i][2][infoList]
        # print("Selected Shelves: ", xInfo)

        weights = self._generate_weights(len(placements))
        heatBucket = np.zeros((17, 1))
        for i in range(len(weights)):
            S = 100 * weights[i] / np.sum(weights)
            for j in range(len(placements[i])):
                x2 = 1000*coverage[i][1][j]
                x1 = 1000*coverage[i][0][j]
                n = (2*length[i][0] - x1 - x2) * (x2 - x1)
                d = np.square(1000*length[i][0])
                heatBucket[placements[i][j]] += S*n/d
        # print(heatBucket, np.sum(heatBucket))
        # heatBucket = heatBucket / np.sum(heatBucket)
        heatBucket[16] = np.sum(heatBucket)
        heatBucket = heatBucket[heatBucket != 0]
        heatBucket[heatBucket < 0] = 0
        if heatBucket[-1:] > 100:
            heatBucket[-1:] = 100
            heatBucket[:-1] = 100*heatBucket[:-1]/np.sum(heatBucket[:-1])
        # print("HB:    ",heatBucket)

        # toReturn = self._process_for_data(shelfRatios, shelfCoverage, shelfStats, length, size, heatBucket, xInfo)
        toReturn = self._process_for_data(shelfRatios, shelfCoverage, shelfStats, length, vacInfo, heatBucket, xInfo, infoList)
        # for i in range(16):
        #     try:
        #         if toReturn[2][i] != None:
        #             toReturn[2][i][0] = toReturn[2][i][0][infoList]
        #             toReturn[2][i][1] = toReturn[2][i][1][infoList]
        #             toReturn[2][i][2] = toReturn[2][i][2][infoList]
        #     except ValueError as ex:
        #         pass

        locs = toReturn[4]
        for i in range(len(locs)):
            # print(locs[i])
            boxLoc = boxClass[locs[i][0]][locs[i][1]]
            # print(boxLoc)
            index = np.where(labInd == boxLoc[0])
            bbox = bound_box[ind[index],:]
            bbox = np.sort(bbox, axis = 0)
            finalBox = bbox[boxLoc[1],:]
            # print(finalBox)
            x = finalBox[1] * colsCT
            y = finalBox[0] * rowsCT
            right = finalBox[3] * colsCT
            bottom = finalBox[2] * rowsCT
            # print(int(x), int(y), int(right), int(bottom))
            cv.rectangle(self.ing, (int(x), int(y)), (int(right), int(bottom)), (255,255,255), thickness=8)

        return toReturn[0], toReturn[1], toReturn[2], toReturn[3], toReturn[5] 

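    def _generate_weights(self, n_shelves):
        # One visibility weight per detected shelf; only 2 to 5 shelves are supported.
        # The parameter was renamed from `len` so that it no longer shadows the builtin.
        if n_shelves == 2:
            return [1, 1]
        elif n_shelves == 3:
            return [1, 3, 2]
        elif n_shelves == 4:
            return [2, 2.5, 2.5, 1]
        elif n_shelves == 5:
            return [1, 3, 5, 4, 2]

        return []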

    def _max_find(self, one, two):
        if one < 3:
            maxOne = 0
        elif one <= 5:
            maxOne = 5
        elif one < 10:
            maxOne = 10
        else:
            maxOne = 15
        
        if two < 3:
            maxTwo = 0
        elif two <= 5:
            maxTwo = 5
        elif two < 10:
            maxTwo = 10
        else:
            maxTwo = 15

        return maxOne, maxTwo

    def _contest_compliance(self, one, two):
        maxOne, maxTwo = self._max_find(one, two)
        # print(maxOne, maxTwo)
        if maxOne == 0:
            return 2
        elif maxTwo == 0:
            return 1
        else:
            if ((one+1)/maxOne > 1) and ((two+1)/maxTwo > 1):
                return 0
            if ((one+1)/maxOne < 1) and ((two+1)/maxTwo < 1):
                if (one+1)/maxOne == (two+1)/maxTwo:
                    return 0
                elif (one+1)/maxOne > (two+1)/maxTwo:
                    return 1
                else:
                    return 2
            else:
                if ((one+1)/maxOne > 1):
                    return 2
                else:
                    return 1

    # def _process_for_data(self, shelfRatios, shelfCoverage, shelfStats, length, size, heatBucket, xInfo):
    def _process_for_data(self, shelfRatios, shelfCoverage, shelfStats, length, vacInfo, heatBucket, xInfo, infoList):

        # prodNames = ['Kent','Chesterfield','2000','Muratti','Monte_Carlo','Pall_Mall','LD','LM','Camel','Marlboro','Parliament','Lark','Lucky_Strike','Davidoff','Viceroy','Winston','West']
        prodNames = ['Kent','Chesterfield','2000','Muratti','MonteCarlo','PallMall','LM','LuckyStrike','Marlboro','Parliament','Lark','Other','Davidoff','Viceroy','Winston','West']

        shlfDf = pd.DataFrame(np.transpose(shelfRatios))
        shlfDf.set_index([prodNames], inplace = True)
        shlfDf.columns = ['Shelf_'+ str(i + 1) for i in range(shlfDf.shape[1])]
        shlfDf['Total'] = shlfDf.sum(axis = 1)
        shlfDf.Total.replace(0, np.nan, inplace = True)
        shlfDf.dropna(subset = ['Total'], axis = 0, inplace = True)
        shlfDf['Number_of_Products_(%)'] = 100*shlfDf.Total/shlfDf.Total.sum()

        df = pd.DataFrame(shelfCoverage)
        df.columns = prodNames
        df['Length Of Rack'] = length
        df = df.T
        df.columns = ['Shelf_' + str(i+1) for i in range(df.shape[1])]
        df['Occupied_Product_Area_(%)'] = df.sum(axis=1)
        df['Occupied_Product_Area_(%)'].replace(0,np.nan,inplace=True)
        df.dropna(subset=['Occupied_Product_Area_(%)'],axis=0,inplace=True)

        #statsDf = pd.DataFrame(index = [prodNames], columns = ["Mean", "Std. Dev", "Median", "Q1", "Q3"])
        #for j in range(len(shelfStats)):
           # prodStat = shelfStats[j]
            #if prodStat == None:
                #continue
            #statsDf.at[prodNames[j], "Mean"] = np.mean(prodStat)
            #statsDf.at[prodNames[j], "Std. Dev"] = np.std(prodStat)
            #statsDf.at[prodNames[j], "Median"] = np.median(prodStat)
            #statsDf.at[prodNames[j], "Q1"] = np.percentile(prodStat,25)
            #statsDf.at[prodNames[j], "Q3"] = np.percentile(prodStat,75)
        #statsDf.dropna(subset=['Mean'],axis=0,inplace=True)

        statsDf = pd.read_fwf('/data1/naquib.alam/stats/ShelfStatDF_354.txt')
        statsDf.set_index([statsDf['Unnamed: 0'].tolist()], inplace = True)
        statsDf.drop(columns = ['Unnamed: 0'], inplace = True)
        statsDf = statsDf.loc[df.index.tolist()[:-1]]

        vacancyComp = [None]*16
        # print(vacInfo)
        for i in range(len(vacInfo)):            
            stn = str(int(vacInfo[i][0]) + 1)
            # print(i)
            if vacInfo[i][2] == vacInfo[i][3]:
                width = statsDf['Mean'][prodNames[vacInfo[i][2]]] - 1.08*statsDf['Std. Dev'][prodNames[vacInfo[i][2]]]
                numberOfProdsPossible = np.floor(vacInfo[i][1]/width)
                # print("If Out: ", i, vacInfo[i][0], vacInfo[i][1], vacInfo[i][2], vacInfo[i][3], width, numberOfProdsPossible)
                if numberOfProdsPossible <= 0:
                    continue
                # print(vacancyComp[vacInfo[i][2]])
                # `is None` avoids the ambiguous truth value of comparing a NumPy array with None
                if vacancyComp[vacInfo[i][2]] is None:
                    vacancyComp[vacInfo[i][2]] = np.zeros((len(infoList), 1))
                
                vacancyComp[vacInfo[i][2]][vacInfo[i][0]] += numberOfProdsPossible
            else:
                # print("Else Out: ", vacInfo[i])
                # print('Shelf_'+stn, prodNames[vacInfo[i][2]])
                if (vacInfo[i][2] == -1) or (vacInfo[i][3] == -1):
                    if vacInfo[i][2] == -1:
                        prodOneCount = 0
                        prodTwoCount = shlfDf['Shelf_'+stn][prodNames[vacInfo[i][3]]]
                        width = [10, statsDf['Mean'][prodNames[vacInfo[i][3]]]]
                    else:
                        prodOneCount = shlfDf['Shelf_'+stn][prodNames[vacInfo[i][2]]]
                        prodTwoCount = 0
                        width = [statsDf['Mean'][prodNames[vacInfo[i][2]]], 10]
                else:
                    prodOneCount = shlfDf['Shelf_'+stn][prodNames[vacInfo[i][2]]]
                    prodTwoCount = shlfDf['Shelf_'+stn][prodNames[vacInfo[i][3]]]
                    width = [statsDf['Mean'][prodNames[vacInfo[i][2]]] - statsDf['Std. Dev'][prodNames[vacInfo[i][2]]], statsDf['Mean'][prodNames[vacInfo[i][3]]] - statsDf['Std. Dev'][prodNames[vacInfo[i][3]]]]

                # print(i, width, prodOneCount, prodTwoCount)
                
                blankSpace = vacInfo[i][1]
                flag = False

                try:
                    # print("Blankspace: ", blankSpace, width[0], width[1])
                    if (blankSpace < width[0]) and (blankSpace < width[1]):
                        raise Exception

                    while blankSpace > 0:
                        # print("Enter while for different boxes")
                        try:
                            # print("-_-")
                            if flag == False:
                                chosenProd = self._contest_compliance(prodOneCount, prodTwoCount)
                            # print("chosenIn: ",chosenProd)
                            
                            if chosenProd == 0:
                                chosenProd = 1

                            numberOfProdsPossible = np.floor(blankSpace/width[chosenProd-1])
                            if numberOfProdsPossible < 0:
                                numberOfProdsPossible = 0
                            # print("In: ", i, vacInfo[i][0], blankSpace, vacInfo[i][2], vacInfo[i][3], chosenProd, numberOfProdsPossible)

                            if numberOfProdsPossible == 0:
                                chosenProd = -chosenProd + 3
                                if flag == False:
                                    flag = True
                                    raise GetOutOfLoop
                                else:
                                    raise Exception 
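                            # `is None` keeps this check valid once the slot already holds an array;
                            # one row per retained shelf, matching the other branch above
                            if vacancyComp[vacInfo[i][1+chosenProd]] is None:
                                vacancyComp[vacInfo[i][1+chosenProd]] = np.zeros((len(infoList), 1))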

                            vacancyComp[vacInfo[i][1+chosenProd]][vacInfo[i][0]] += 1
                            blankSpace -= width[chosenProd-1]
                            if chosenProd == 1:
                                prodOneCount += 1
                            else:
                                prodTwoCount += 1

                        except GetOutOfLoop:
                            pass

                except Exception:
                    pass
        # print(vacancyComp)
        
        vacancyDf = pd.DataFrame([], ['Shelf_'+ str(i + 1) for i in range(shlfDf.shape[1]-2)], [])
        count = 0
        for j in range(len(vacancyComp)):
            # Count SKUs without any vacancy; add a column for every SKU that has one
            if vacancyComp[j] is None:
                count += 1
                continue
            vacancyDf[prodNames[j]] = vacancyComp[j].ravel()
        # print(vacancyDf)
        if count == 16:
            vacancyDf[shlfDf.index.tolist()[0]] = np.zeros((shlfDf.shape[1]-2,1))


        quantFail = [None]*statsDf.shape[0]
        lenFail = [None]*statsDf.shape[0]
        locs = []

        horzDf = pd.DataFrame([], ["Left", "Middle", "Right"], [])
        for j in range(len(prodNames)):
            if xInfo[j] is not None:
                horzDf[prodNames[j]] = xInfo[j]
        # print(horzDf)

        for j in range(statsDf.shape[0]):
            prodStat = shelfStats[prodNames.index(statsDf.index.tolist()[j])]
            #print("Product: ", statsDf.index.tolist()[j])
            #print("Data from Image: ", prodStat)
            #print("Overall Data: ", statsDf.iloc[j,:])
            iqf = statsDf.iloc[j, 1] #- statsDf.iloc[j, 3]
            #print('IQF: ', iqf)
            for i in range(len(prodStat)):
                if (prodStat[i] < statsDf.iloc[j,0] - 3*iqf) or (prodStat[i] > statsDf.iloc[j,0] + 3*iqf):
                    locs.append([prodNames.index(statsDf.index.tolist()[j]),i])
                    
            pQLess = sum(a < statsDf.iloc[j,0] - 3*iqf for a in prodStat)
            pQMore = sum(a > statsDf.iloc[j,0] + 3*iqf for a in prodStat)
            #print(pQLess, pQMore)
            quantFail[j] = pQLess + pQMore
            
            pSLess = sum(a for a in prodStat if a < statsDf.iloc[j,0] - 3*iqf)
            pSMore = sum(a for a in prodStat if a > statsDf.iloc[j,0] + 3*iqf)
            lenFail[j] = pSLess + pSMore

        quantFail = [a for a in quantFail if a != None]
        quantFail.append(shlfDf.Total.sum())
        lenFail = [a for a in lenFail if a != None]
        lenFail.append(df.iloc[df.shape[0]-1, df.shape[1]-1])
        
        statsDf = pd.DataFrame([quantFail, lenFail])
        columns = shlfDf.index.tolist()
        columns.append("Total")
        statsDf.columns = columns
        statsDf = statsDf.T
        statsDf.columns = ["Shelf_1", "Shelf_2"]
        statsDf["Heat"] = heatBucket
        
        # size = size[size != 1]

        # return shlfDf, df, size, statsDf, locs, horzDf
        return shlfDf, df, vacancyDf, statsDf, locs, horzDf
    
    def _write_results(self, results, labInd, inputfolder):
        # result = Image.fromarray(self.ing)
        # if self.ing.shape[0] > 1200:
            # rows = 1200
            # cols = int(1200*self.ing.shape[1]/self.ing.shape[0])
        # else:
        rows = self.ing.shape[0]
        cols = self.ing.shape[1]

        if rows > 1200:
            aspectRatio = rows/cols
            cols = 1200
            rows = int(cols/aspectRatio)


        fullpath = os.path.join('/data1/naquib.alam/static/results', inputfolder)

        # resultC = Image.fromarray(cv.resize(self.ing,(rows,cols)))
        r = re.compile(r'result_\d+.jpg$')
        l = []
        for f in os.listdir(fullpath):
            if not os.path.isfile(os.path.join(fullpath, f)):
                continue
            # Remember the last listed file whose name starts with 'result_'
            if f[0:7] == 'result_':
                l = f
                # print('So So True ', True)

        if len(l) == 0:
            end = 1
        else:
            end = int(l[7:-4])+1
        # print(end)

        font = cv.FONT_HERSHEY_SIMPLEX
        txt = ""
        listOfIndices = results[0].index.tolist()
        rowsCT = self.imgCropTest.shape[0]
        colsCT = self.imgCropTest.shape[1]
        if (rowsCT >= 1500) or (colsCT >= 1500):
            thickness = 10
            inc = 70
        elif (rowsCT >= 500) or (colsCT >= 500):
            thickness = 5 + (colsCT/1000 - 0.5)*5
            inc = 20 + (colsCT/1000 - 0.5)*50
        else:
            thickness = 5
            inc = 20

        y = rowsCT - 50
        for i in range(results[0].shape[0]):
            txt = "{}: {}".format(listOfIndices[i], results[0]['Total'].iloc[i])  # positional lookup on a label-indexed Series
            cv.putText(self.ing, txt, (50, y), font, 2, (255,255,255), 4)
            y -= inc
        txt = "{}: {}".format('Total', results[0]['Total'].sum())
        cv.putText(self.ing, txt, (50, y), font, 2, (255,255,255), 4)
            
        
        cv.imwrite(fullpath + '/result_' + str(end) + '.jpg', cv.resize(self.ing, (rows, cols)))
        cv.imwrite(fullpath + '/resultP_' + str(end) + '.jpg', cv.resize(self.imgCropTest, (rows, cols)))
        # cv.imwrite('/data1/naquib.alam/static/resultC_'+str(end)+'.jpg', cv.resize(self.ing,(rows,cols)))
        # result.save('/data1/naquib.alam/static/result_'+str(end)+'.jpg')
        # resultC.save('/data1/naquib.alam/static/resultC_'+str(end)+'.jpg')
        # print('Resulting image saved...')
        
        df = results[1]
        # size = results[2]
        vacancy = results[2]

        with open(fullpath + '/ProdCount_' + str(end) + '.txt', 'w') as f:
           f.write(round(results[0]).to_string())
        with open(fullpath + '/ProdHorz_' + str(end) + '.txt', 'w') as f:
           f.write(round(results[4].T).to_string())
        with open(fullpath + '/ProdShare_' + str(end) + '.txt', 'w') as f:
           occupy = df["Occupied_Product_Area_(%)"]
           occupy = occupy[-1:]
           # print(occupy.to_string()) 
           resultDf = round(100*df/df.iloc[df.shape[0]-1])
           resultDf = resultDf.iloc[[i for i in range(resultDf.shape[0]-1)]]
           resultDf["Length"] = occupy[0]
           f.write(resultDf.to_string())

        # print(vacancy)

        if vacancy.T.shape[0] != 0:
            with open(fullpath + '/ProdVacancy_' + str(end) + '.txt', 'w') as f:
                # vacancy = np.zeros((resultDf.shape[0], resultDf.shape[1]-2))

                # lengthOfRacks = df.iloc[df.shape[0]-1]
                # for i in range(df.shape[1]-1):
                #     st = 'Shelf_' + str(i+1)
                #     if resultDf[st].sum() < 99.9:
                #         space = (100 - resultDf[st].sum())*lengthOfRacks[st]/100
                #         vacancy[:,i] = space/size
                # vacancy[(vacancy < 1) & (vacancy > 0.95)] = 1
                # vacancy = np.floor(vacancy)

                # vacDf = pd.DataFrame(vacancy)
                # vacDf.set_index(results[0].index,inplace=True)

                # vacDf.columns = ['Shelf_' + str(i+1) for i in range(df.shape[1]-1)]

                f.write(vacancy.T.to_string())
               
               #f.write('Rackwise Shelf Share:\n')
               # for i in range(len(np.unique(labInd))):
               #     val = 100*df.loc[i].values/df.loc[i,'sum']
               #     f.write('Rack ' + str(i) + '\n' + str(val[:-1]) + '\n')
               #     f.write(str(np.sum(val[:-1])) + '\n')
               # val = 100*df.loc[len(np.unique(labInd))].values/df.loc[len(np.unique(labInd)),'sum']
               # f.write('\nOverall Shelf Share: \n'+ str(val[:-1]) + '\n')
               # f.write(str(np.sum(val[:-1])))

        with open(fullpath + '/ProdSkew_' + str(end) + '.txt', 'w') as f:
           f.write(round(results[3], 2).to_string())

        # print('Shelf-Share calculated and saved...\n')
        
    def main(self, inputfile, inputfolder):
        # self.trial()
        self._image_load(inputfile)
        self._sess = tf.Session()
        self._sess.graph.as_default()
        tf.import_graph_def(self.graph_def, name='')
        
        self._find_object()
        dataTable = self._process_boxes(inputfolder)
        return dataTable
        
def main_fn(inputfolder):
    shelf = ShelfShare()
    # for f in os.listdir('static'):
    #     if os.path.isfile(os.path.join('static', f)):
    #         os.remove(os.path.join('static',f))

    fullpath = os.path.join('/data1/naquib.alam/static/results', inputfolder)
    # Create the results folder (and its parent) if they do not exist yet
    os.makedirs(fullpath, exist_ok=True)

    path = os.path.join('/data1/naquib.alam/static/temp', inputfolder)
    noOfShelves = 0
    table = None  # stays None when no input image is found
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            if f[0:7] == "result_":
                table = shelf.main(os.path.join(path, f), inputfolder)
                noOfShelves += 1

    with open(fullpath + '/timeStamp.txt', 'w') as f:
        f.write(str(datetime.datetime.fromtimestamp(time.time()).strftime('%d-%m-%Y %H:%M:%S')) + "\n"+inputfolder + "\nABC\n" + str(noOfShelves) + "\nPromo " + str(shelf.noOfPromo))

    return table

if __name__=='__main__':
    import sys
    # The folder name comes from the command line; main_fn() joins it with the base paths itself
    main_fn(sys.argv[1])

The generated metrics are shown below:
1. __SKU count__  
  ![title](images/Product_count_rsa.png)
  
  
2. __SKU shelf share__
  ![title](images/Product_shelf_share_rsa.png)
  
  
3. __SKU vacancy__  
A few important points about how this metric is calculated:
    1. The average width of each SKU is computed over the training images.
    2. Each vacancy is checked to see whether an SKU can fit in it.
    3. The SKU to place in a vacancy is chosen in the following order of priority:
        1. Adjacent SKUs
        2. An SKU from the rack above
        3. An SKU from the rack below
        4. Any other SKU, based on predicted count
 
   ![title](images/Product_vacancy_rsa.png)
  
  
  
4. __Out-of-stock SKUs__  
A few important points about how this metric is calculated:
  
    1. To find out-of-stock SKUs, users are asked for the shelf IDs.
    2. The compliance table for these shelf IDs is looked up in the database.
    3. For each SKU on that shelf, the predicted count is compared with the value from the database.
    4. An alarm is raised depending on how far the predicted count falls below the database count, expressed as a percentage of that count:
        1. __Low__: between 75% and 50%
        2. __Medium__: between 50% and 25%
        3. __High__: between 25% and 0%
        4. __Critical__: zero products detected
   ![title](images/Out_of_stock_rsa.png)
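
The alarm levels listed above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function names `stock_alarm` and `out_of_stock_report` are hypothetical, the compliance table is represented as a plain dictionary rather than a database query, and the handling of the exact boundary values (25%, 50%, 75%) is a guess, since the text only says "between".

```python
def stock_alarm(predicted_count, expected_count):
    """Return an alarm level, or None when stock is above 75% of the expected count.

    Assumption: boundary values fall into the more severe level.
    """
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    if predicted_count <= 0:
        return "Critical"
    ratio = predicted_count / expected_count
    if ratio <= 0.25:
        return "High"
    if ratio <= 0.50:
        return "Medium"
    if ratio <= 0.75:
        return "Low"
    return None


def out_of_stock_report(predicted, expected):
    """Compare predicted SKU counts with the compliance counts of one shelf.

    Both arguments are {sku_name: count} dictionaries (a stand-in for the
    database lookup). SKUs missing from `predicted` count as zero.
    """
    report = {}
    for sku, expected_count in expected.items():
        level = stock_alarm(predicted.get(sku, 0), expected_count)
        if level is not None:
            report[sku] = level
    return report


report = out_of_stock_report(
    predicted={"Kent": 3, "Marlboro": 10},
    expected={"Kent": 10, "Marlboro": 10, "West": 5},
)
print(report)
```

An SKU that is fully stocked does not appear in the report, so the report lists only the SKUs that need attention.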
        
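For the SKU vacancy metric (item 3 above), the fit check and the priority order can be sketched as follows. The `_process_for_data` code above estimates how many units fit in a gap as `floor(gap / width)`, using the mean SKU width reduced by 1.08 standard deviations when both neighbours are the same SKU; the sketch reuses that rule. Everything else is a simplification: `units_that_fit` and `pick_sku_for_gap` are hypothetical names, and the first-fit walk over a priority list replaces the `_contest_compliance` tie-break used in the actual code.

```python
import math


def units_that_fit(gap_width, mean_width, std_width=0.0, k=1.08):
    """Number of units of one SKU that fit into a gap of the given width."""
    effective_width = mean_width - k * std_width
    if gap_width <= 0 or effective_width <= 0:
        return 0
    return int(math.floor(gap_width / effective_width))


def pick_sku_for_gap(gap_width, candidates, widths):
    """Return (sku, units) for the first candidate that fits, else (None, 0).

    `candidates` lists SKU names in priority order: adjacent SKUs first,
    then SKUs from the rack above, the rack below, and any others.
    `widths` maps each SKU name to a (mean_width, std_width) tuple.
    """
    for sku in candidates:
        mean_width, std_width = widths[sku]
        units = units_that_fit(gap_width, mean_width, std_width)
        if units > 0:
            return sku, units
    return None, 0


# Widths are fractions of the image width, as in the normalised boxes above
widths = {"Kent": (0.10, 0.0), "West": (0.04, 0.0)}
print(pick_sku_for_gap(0.25, ["Kent", "West"], widths))
print(pick_sku_for_gap(0.05, ["Kent", "West"], widths))
```

In the second call the adjacent SKU is too wide for the gap, so the next candidate in the priority list is tried.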

### Promo Detection Module
A few implementation details of this module are as follows:
1. Since this dataset has no annotations for displayed promotions, we could not use deep-learning-based techniques.
2. All regions predicted to contain an SKU were filled with zeros (black pixels).
3. All rack dividers were filled with zeros too.
4. The entire image was __thresholded__, and then __Contour Detection__ was applied.
5. False positives were filtered using contour areas (the code above keeps the 30 largest contours and drops any in which OCR finds no text).


![title](images/Promo_detection_rsa.png)
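
The masking, thresholding, and area-filtering steps can be illustrated with a dependency-free sketch. It is a stand-in, not the module's implementation: the code above uses `cv.threshold` and `cv.findContours`, whereas this sketch swaps in a 4-connected flood fill so that it runs on plain Python lists. The function name `find_promo_candidates` and the `min_area` parameter are hypothetical, and rack dividers would simply be passed as additional boxes to black out.

```python
from collections import deque


def find_promo_candidates(gray, sku_boxes, threshold=127, min_area=4):
    """Return bright regions as (x, y, w, h, area), largest first.

    gray: list of rows of 0-255 intensities.
    sku_boxes: (top, left, bottom, right) pixel boxes, end-exclusive.
    """
    img = [row[:] for row in gray]  # work on a copy
    height, width = len(img), len(img[0])

    # Step 1: black out everything already explained by an SKU detection
    for top, left, bottom, right in sku_boxes:
        for y in range(max(0, top), min(bottom, height)):
            for x in range(max(0, left), min(right, width)):
                img[y][x] = 0

    # Step 2: binary threshold (same rule as cv.THRESH_BINARY)
    mask = [[px > threshold for px in row] for row in img]

    # Step 3: collect connected bright regions with a flood fill
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            area = 0
            top, left, bottom, right = y0, x0, y0, x0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Step 4: drop small regions as false positives
            if area >= min_area:
                regions.append((left, top, right - left + 1, bottom - top + 1, area))
    return sorted(regions, key=lambda r: -r[4])


# Toy 6x8 image: one bright patch under an SKU box, one promo-like patch, one noise pixel
image = [[0] * 8 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 200
for y in range(4, 6):
    for x in range(5, 8):
        image[y][x] = 220
image[0][7] = 255

candidates = find_promo_candidates(image, sku_boxes=[(1, 1, 4, 4)])
print(candidates)
```

Only the patch that is neither covered by an SKU box nor smaller than `min_area` survives. In the real module each surviving region is additionally passed through OCR before it is counted as a promo.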