# INTRODUCTION:


 - The goal of this mini-project is to compare the prices of various FMCG products across online stores and recommend the store that offers the best price to the user.
 - The product data is obtained via web scraping.
    
>Group Name: __DATA SCIENTISTS__ <br>
>Group Members: __Soujanya M, Ramya B, Suresh R__

# OVERALL APPROACH: 

The exercise consists of the following four tasks:
 
    A. Data Acquisition  
    B. Data Cleaning 
    C. Data Integration 
    D. Exploratory Data Analysis and Recommendation
  

# Data Acquisition

## Approach:

 1.  Identify the set of categories and products that we want to scrape from various online e-commerce platforms.
     >Around 30 categories and 150 items were identified for web scraping
 2.  Choose e-commerce platforms 
     >The following platforms were chosen for scraping
     - Big Basket
     - DMart
     - Grofers
 3.  Search the products on the browser and observe the URL/AJAX calls made as a part of fetching data from server
 4.  Identify various CSS classes for product attributes like Product name, MRP, Special Price
 5.  Write code to scrape each of the platforms and store the results in a separate CSV file for later use
     > The results are stored in the respective CSV files as follows
     - big_basket.csv
     - dmart.csv
     - grofers.csv

## FMCG Categories and Products 

In [16]:
#Necessary Library Imports
from bs4 import BeautifulSoup
import requests 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

In [17]:
#The input file products.csv contains the product categories and brands to be scraped
df_products = pd.read_csv("products.csv" )
print("Total categories to be scrapped ", len(df_products["Product"].unique()))  
print("Total products to be scrapped ", df_products.shape[0])  

Total categories to be scrapped  30
Total products to be scrapped  142


## Scraping Bigbasket

 - The Big Basket URL for the first page differs from the URL for subsequent pages.
 - So, we need to maintain two separate URLs in the code.
 - Since we search for very specific products, scraping 2 pages is more than enough.
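
As a minimal sketch of the two-URL scheme described above, the helper below builds one URL per page without making any request. The function name `bigbasket_urls` is hypothetical, the templates are copied from the search function in this notebook, and the sketch assumes that `max_pages = 2` is meant to include page 2.

```python
# URL templates copied from the notebook; the helper name is hypothetical.
URL_PAGE_1 = "https://www.bigbasket.com/custompage/getsearchdata/?type=deck&slug={0}"
URL_PAGE_N = ("https://www.bigbasket.com/product/get-products/"
              "?slug={0}&page={1}&tab_type=[%22all%22]&sorted_on=relevance&listtype=ps")


def bigbasket_urls(search_brand, search_product, max_pages=2):
    """Return one search URL per page, page 1 first."""
    # Spaces in the search string are replaced with +, as in the notebook
    slug = (search_brand + " " + search_product).replace(" ", "+")
    urls = [URL_PAGE_1.format(slug)]
    # Assumption: max_pages is inclusive, so page 2 is part of the list
    for page in range(2, max_pages + 1):
        urls.append(URL_PAGE_N.format(slug, page))
    return urls


for url in bigbasket_urls("3 Roses", "Tea"):
    print(url)
```

Keeping URL construction separate from the request makes it easy to check the page range before any network call is made.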

In [11]:
#Big Basket Scrapping
def search_bigbasket(search_brand, search_product):
    
    #Max pages to scrap
    max_pages = 2
    
    #Product full name
    searchStr = search_brand + " " + search_product
    
    #URL for page 1
    url_page_1 = "https://www.bigbasket.com/custompage/getsearchdata/?type=deck&slug={0}"
    
    #URL for subsequent pages
    url_for_page_n = "https://www.bigbasket.com/product/get-products/?slug={0}&page={1}&tab_type=[%22all%22]&sorted_on=relevance&listtype=ps"
    
    #Replace spaces in the product name with +
    search_str_encoded = searchStr.replace(" ","+")
    
    #Form actual URL for page 1
    url = url_page_1.format(search_str_encoded)
 
    #Necessary request headers
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
    
    #Get the response for page 1
    productsInfo = requests.get(url, headers=headers).json()
    
    #Extract product data into a dictionary object
    products =  productsInfo['json_data']['tab_info'][0]['product_info']['products']
    product_data=[]
    for product in  products:
        options = product['all_prods']
        if(not options): 
            product_data.append({"search_brand": search_brand, "search_product": search_product, "shop": "Big Basket", "product_name": str(product['p_brand']).strip() + " " + str(product['p_desc']).strip(), "weight":product['w'], "mrp": product['mrp'], "special_price": product['sp'] })
            continue
        for option  in product['all_prods']:
            product_data.append({"search_brand": search_brand, "search_product": search_product, "shop": "Big Basket",  "product_name": str(product['p_brand']).strip() + " " + str(product['p_desc']).strip(), "weight":option['w'],
                                 "mrp": option['mrp'], "special_price": option['sp'] }) 
 
    
    #Do it for subsequent pages     
    #range() excludes its end value, so add 1 to actually fetch page 2 when max_pages = 2
    for page in range(2, max_pages + 1):
        newurl = url_for_page_n.format(search_str_encoded, page) 
        productsInfo = requests.get(newurl, headers=headers).json()
        products =  productsInfo['tab_info']['product_map']['all']['prods']

        for product in  products:
            options = product['all_prods']
            if(not options): 
                product_data.append({"search_brand": search_brand, "search_product": search_product, "shop": "Big Basket",    "product_name": str(product['p_brand']).strip() + " " + str(product['p_desc']).strip(), "weight":product['w'], "mrp": product['mrp'], "special_price": product['sp'] })
                continue
            for option  in product['all_prods']:
                product_data.append({"search_brand": search_brand, "search_product": search_product,  "shop": "Big Basket",  "product_name": str(product['p_brand']).strip() + " " + str(product['p_desc']).strip(), "weight":option['w'], "mrp": option['mrp'], "special_price": option['sp'] })

    #Return the scrapped data           
    return product_data
 

## Scraping DMart

 - The URL for DMart is the same for all pages as long as we include the page number.
 - So, we need to maintain a URL with a placeholder for the page number in the code.
 - DMart requires a store ID to be sent in the request header. Otherwise, it returns wrong results.
 - Since we search for very specific products, scraping 2 pages is more than enough.
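
The points above can be sketched as a small helper that prepares the URL and headers for each page without sending anything. The name `dmart_requests` is hypothetical; the URL template, the `storeId` header and its value are taken from the search function in this notebook, and the inclusive page range is an assumption.

```python
# Values copied from the notebook; the helper name is hypothetical.
USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36")
URL_PAGE = "https://digital.dmart.in/api/v1/search/{0}?page={1}"


def dmart_requests(search_brand, search_product, max_pages=2, store_id="10657"):
    """Return a (url, headers) pair for every page to be fetched."""
    slug = (search_brand + " " + search_product).replace(" ", "+")
    # The store ID travels in the request header, not in the URL
    headers = {"User-Agent": USER_AGENT, "storeId": store_id}
    # Assumption: max_pages is inclusive
    return [(URL_PAGE.format(slug, page), headers)
            for page in range(1, max_pages + 1)]


for url, headers in dmart_requests("3 Roses", "Tea"):
    print(url, headers["storeId"])
```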

In [12]:
#Dmart Scrapping
def search_dmart(search_brand, search_product):
    #Max pages to scrap
    max_pages = 2
    #Product full name
    searchStr = search_brand + " " + search_product
    #URL for page 
    url_page  = "https://digital.dmart.in/api/v1/search/{0}?page={1}"
    #Necessary request headers
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
    
    #DMart requires store Id 
    headers['storeId']='10657'
    #Replace spaces in the product name with +
    search_str_encoded = searchStr.replace(" ","+")
    
    #Extract product data into a dictionary object
    product_data=[]
    #range() excludes its end value, so add 1 to fetch max_pages pages
    for page in range(1, max_pages + 1):
        url = url_page.format(search_str_encoded, page)
        productsInfo = requests.get(url, headers=headers).json()
        #If the scrapping did not yield any products, just continue with next product
        products=productsInfo.get( 'suggestionView')
        if(not products):
            continue
     
        for product in  products:
            skus = product['skus']
            for sku in skus:  
                if(sku['defining']):
                    name =   sku.get( 'name',  product['name']).strip()
                    product_data.append({"search_brand": search_brand, "search_product": search_product, "shop": "DMart",   "product_name": name,"weight": sku['defining'][0]['volume'], "mrp": sku['price_MRP'], "special_price": sku['price_SALE'] })
    #Return the scrapped data           
    return product_data
 

## Scraping Grofers

 - The URL for Grofers is the same for all pages as long as we include the start position of the page.
 - Grofers requires a special header to be sent. Otherwise, it won't return any results.
 - So, we need to maintain a URL with a placeholder for the start position in the code.
 - Since we search for very specific products, scraping 2 pages is more than enough.
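
Grofers pages are addressed by a start position rather than a page number. The sketch below reproduces the offset arithmetic used in the search function of this notebook (page size 48, and a start of 0 replaced by 1). The function name is hypothetical and the inclusive page range is an assumption.

```python
def grofers_start_positions(max_pages=2, page_size=48):
    """Return the value of the `start` query parameter for each page."""
    positions = []
    # Assumption: max_pages is inclusive
    for page in range(1, max_pages + 1):
        start = (page - 1) * page_size
        # The notebook sends 1 instead of 0 for the first page
        positions.append(start if start else 1)
    return positions


print(grofers_start_positions())
```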

In [13]:
#Grofers Scrapping
def search_grofers(search_brand, search_product):
    #Max pages to scrap
    max_pages = 2
    #Product full name
    searchStr = search_brand + " " + search_product
    #URL for page. This should include Longitude and Latitude as well. 
    url_page  = "https://grofers.com/v5/search/merchants/26659/products/?lat=17.4196281427546&lon=78.3790778036223&q={0}&suggestion_type=0&t=1&start={1}&size=48"
    #Necessary request headers
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
    #Grofers requires this header 
    headers['app_client'] = 'consumer_web' 
    #Replace spaces in the product name with +
    search_str_encoded = searchStr.replace(" ","+")
    #Extract product data into a dictionary object
    product_data=[]
    #range() excludes its end value, so add 1 to fetch max_pages pages
    for page in range(1, max_pages + 1):
        start_pos = (page-1)*48
        if(start_pos==0):
            start_pos = 1
        #Create page specific URL
        url = url_page.format(search_str_encoded, start_pos)
        productsInfo = requests.get(url, headers=headers).json()
        #If the scrapping did not yield any products, just continue with next product
        products= productsInfo.get('products')
        if(not products):
            break
        for product in  products:
            prod_variants = product['variant_info']
            for var in prod_variants:
                product_data.append({"search_brand": search_brand, "search_product": search_product,   "shop": "Grofers",  "product_name": var["line_1"],"weight": var["unit"], "mrp": var["mrp"], "special_price": var["price"]})
    #Return the scrapped data    
    return product_data
 

## Perform Scraping for all online portals

 - The following section scrapes all 3 portals for the products listed in the input file.
 - Ensure the items.csv file is present in the current directory before calling this method.

In [14]:
# Scrapping for all sites
def perform_scrapping():
    #Read all items for which scrapping is needed
    df_input   = pd.read_csv("items.csv" )
    #Column order shared by the result files of all three shops
    column_names =   ["search_brand","search_product", "shop",   "product_name", "weight", "mrp", "special_price"]
    
    #Collect rows in plain lists; DataFrame.append was removed in pandas 2.0
    rows_bigbasket, rows_dmart, rows_grofers = [], [], []
    
    #For each of the products, perform scrapping and keep the first 5 results per shop
    for ( idx , search_brand, search_product) in df_input.itertuples():
  
        print('Scrapping BigBasket for ', idx, search_brand, search_product)
        rows_bigbasket.extend(search_bigbasket(search_brand, search_product)[:5])

        print('Scrapping Grofers for ', idx, search_brand, search_product)
        rows_grofers.extend(search_grofers(search_brand, search_product)[:5])

        print('Scrapping DMart for ', idx, search_brand, search_product)
        rows_dmart.extend(search_dmart(search_brand, search_product)[:5])
  
    #Build one dataframe per shop from the collected rows
    df_bigbasket = pd.DataFrame(rows_bigbasket, columns = column_names)
    df_dmart = pd.DataFrame(rows_dmart, columns = column_names)
    df_grofers = pd.DataFrame(rows_grofers, columns = column_names)
    
    #Store final results for each of the stores in separate csv file
    df_bigbasket.to_csv('big_basket.csv', index=False,encoding='utf-8')
    df_grofers.to_csv('grofers.csv', index=False, encoding='utf-8')
    df_dmart.to_csv('dmart.csv', index=False, encoding='utf-8')
 

## Scraping Results
This section shows the results of the scraping run.

In [15]:
#Call scrapping routine
perform_scrapping()
#Load scrapped data into data frames and look at the metrics
df_bb  = pd.read_csv("big_basket.csv")
df_dmart  = pd.read_csv("dmart.csv")
df_grofers  = pd.read_csv("grofers.csv")
 
print("Bigbasket returned products count:" , df_bb.shape[0] )
print("DMart returned products count:" , df_dmart.shape[0] )
print("Grofers returned products count:" , df_grofers.shape[0] )

Scrapping BigBasket for  0 3 Roses Tea
Scrapping Grofers for  0 3 Roses Tea
Scrapping BigBasket for  0 3 Roses Tea
Scrapping BigBasket for  1 7 Up Soft Drink
Scrapping Grofers for  1 7 Up Soft Drink
Scrapping BigBasket for  1 7 Up Soft Drink
Scrapping BigBasket for  2 ACT II Popcorn
Scrapping Grofers for  2 ACT II Popcorn
Scrapping BigBasket for  2 ACT II Popcorn
Scrapping BigBasket for  3 Adidas Deodrant
Scrapping Grofers for  3 Adidas Deodrant
Scrapping BigBasket for  3 Adidas Deodrant
Scrapping BigBasket for  4 Anurag Cooking Oil
Scrapping Grofers for  4 Anurag Cooking Oil
Scrapping BigBasket for  4 Anurag Cooking Oil
Scrapping BigBasket for  5 Appy Fizz Juice
Scrapping Grofers for  5 Appy Fizz Juice
Scrapping BigBasket for  5 Appy Fizz Juice
Scrapping BigBasket for  6 Aswini Hair Oil
Scrapping Grofers for  6 Aswini Hair Oil
Scrapping BigBasket for  6 Aswini Hair Oil
Scrapping BigBasket for  7 Bisleri Mineral Water
Scrapping Grofers for  7 Bisleri Mineral Water
Scrapping BigBasket f

IndexError: list index out of range
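
The run above stops with an `IndexError`, so no products after that point are scraped. The output does not show which lookup failed, so the following is only a sketch of one possible safeguard: a hypothetical wrapper `safe_search` that returns an empty list when a search function fails on an unexpected response. The two stand-in search functions exist only to demonstrate the behaviour.

```python
def safe_search(search_fn, search_brand, search_product):
    """Call a search function and return [] if the response cannot be parsed."""
    try:
        return search_fn(search_brand, search_product)
    except (KeyError, IndexError, ValueError) as exc:
        # KeyError/IndexError: an expected field or list entry is missing
        # ValueError: the body could not be decoded as JSON
        print("Skipping", search_brand, search_product, "->", repr(exc))
        return []


# Stand-in search functions, used only for this demonstration
def failing_search(search_brand, search_product):
    empty_tabs = []
    return empty_tabs[0]  # raises IndexError, like the run above


def working_search(search_brand, search_product):
    return [{"search_brand": search_brand, "search_product": search_product}]


print(safe_search(failing_search, "Bisleri", "Mineral Water"))
print(safe_search(working_search, "3 Roses", "Tea"))
```

With such a wrapper, a call like `safe_search(search_dmart, search_brand, search_product)` would let the loop continue with the next product, at the cost of silently missing rows that should be reviewed afterwards.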

# Data Cleaning

## Approach:

 For each of the files generated in the previous section, perform the following cleanup steps
 1. Split the weight column into separate weight and measure columns
 2. Calculate the derived column discount based on MRP and Special Price
 3. Drop duplicate products that may have appeared due to wrong search results returned by the site
 4. Re-arrange columns in proper order
 5. Save the final set of products in the respective CSV files as follows
     - big_basket_cleaned.csv
     - dmart_cleaned.csv
     - grofers_cleaned.csv
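
The steps above can be sketched on a tiny, made-up data frame. The product names and prices are invented for illustration; the column names and the discount formula follow the cleanup routines in this notebook.

```python
import pandas as pd

# Made-up rows; the first two are duplicates of the same product and weight
df = pd.DataFrame({
    "product_name": ["Brand A Tea", "Brand A Tea", "Brand B Oil"],
    "weight": ["500 g", "500 g", "1 L"],
    "mrp": [200.0, 200.0, 150.0],
    "special_price": [180.0, 180.0, 150.0],
})

# 1. Split "500 g" into weight "500" and measure "g"
df[["weight", "measure"]] = df["weight"].str.split(expand=True)
# 2. Discount as a percentage of MRP
df["discount"] = (df["mrp"] - df["special_price"]) * 100 / df["mrp"]
# 3. Keep one row per product and weight
df = df.drop_duplicates(subset=["product_name", "weight"], keep="last")
# 4. Re-arrange the columns
df = df[["product_name", "weight", "measure", "mrp", "special_price", "discount"]]

print(df)
```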

## Clean up for Big Basket

In [None]:
df  = pd.read_csv("big_basket.csv",  engine='python')
print(df.product_name.unique())
print(df.weight.unique())

<font color=red>Inference: Clean up is required as follows.</font>

 - Some products are combo packs. Remove them, as identifying the individual product becomes difficult during recommendation.
 - Remove products whose weight contains characters other than numbers.
 - Drop duplicate products that may have appeared due to wrong search results returned by the site.
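
A minimal sketch of the first two cleanup rules, applied to made-up weight strings. The marker list matches the one used for Big Basket in this notebook; the sample values are invented.

```python
import pandas as pd

# Made-up weight strings covering the cases named above
weights = pd.Series(["500 g", "2 x 100 g", "1 kg + 200 g", "Combo Pack", "250 ml", "Large"])

# Markers that indicate a combo pack; + and ( are escaped because the
# joined string is used as a regular expression
combo_markers = ["x", "X", r"\+", "Comb", "each", r"\("]
is_combo = weights.str.contains("|".join(combo_markers))
kept = weights[~is_combo]

# The first token must be a number, optionally with one decimal point
number = kept.str.split().str[0]
is_numeric = number.str.replace(".", "", n=1, regex=False).str.isnumeric()
kept = kept[is_numeric]

print(kept.tolist())
```

Note that a bare `x` also matches any weight text that merely contains that letter, so the marker list trades some valid rows for simplicity.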

In [None]:
#Cleanup routine for bigbasket
def clean_bigbasket_data():
    #Read the dataset
    df  = pd.read_csv("big_basket.csv",  engine='python')
    # Pattern to identify combo packs
    searchfor = ['x', 'X', r'\+', 'Comb', 'each', r'\(']
    #Drop the rows with above patterns
    df.drop( df[ df["weight"].str.contains('|'.join(searchfor)) ].index,inplace=True  )
    #Separate weight in to weight, measure
    df[['weight','measure']] = df.weight.str.split(expand=True) 
    #Remove rows that have invalid weight entries (at most one decimal point is allowed)
    df.drop( df[~df["weight"].str.replace('.','',n=1,regex=False).str.isnumeric() ].index,inplace=True  )
    #Check if product name contains the brand we searched for
    df['good_product_ind'] = [x[1] in x[0] for x in zip(df['product_name'], df['search_brand'])]
    #Drop the products that don't meet the above criteria
    df.drop( df[~df["good_product_ind"] ].index,inplace=True  )   
    #Create weight_measure column by combining weight and measure
    df[ 'weight_measure' ] = df.weight.str.strip() + df.measure.str.strip()
    #Re-order columns; copy so that the discount column is added to an independent frame
    df = df[['search_product','search_brand', 'shop' ,'product_name','weight', 'measure', 'weight_measure', 'mrp','special_price']].copy()
    #Calculate discount
    df['discount'] = (df.mrp - df.special_price) *100 / df.mrp 
    #Remove duplicate entries for given product and weight
    df.drop_duplicates(subset=['product_name','weight'], keep='last', inplace=True)
    #Save the results to big_basket_cleaned
    df.to_csv('big_basket_cleaned.csv', index=False)

## Clean up for DMart

In [None]:
df  = pd.read_csv("dmart.csv",  engine='python')
print(df.product_name.unique())
print(df.weight.unique())

<font color=red>Inference: Clean up is required as follows.</font>

 - Some products are combo packs. Remove them, as identifying the individual product becomes difficult during recommendation.
 - Remove products whose weight contains characters other than numbers.
 - Drop duplicate products that may have appeared due to wrong search results returned by the site.
 - The product name contains the weight. Remove it.
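
The last point can be sketched as follows. The sample names are invented and assume, as the cleanup routine in this notebook does, that the weight follows a colon in the product name. The sketch also shows the `gm` to `g` normalisation applied to the measure column.

```python
import pandas as pd

# Made-up rows; the "name : weight" layout is an assumption taken from
# the split on ":" used in the DMart cleanup routine
df = pd.DataFrame({
    "product_name": ["Brand A Tea : 250 gm", "Brand B Soap"],
    "measure": ["gm", "g"],
})

# Keep only the part of the name before the colon
df["product_name"] = df["product_name"].str.split(":").str[0].str.strip()
# Use "g" instead of "gm" so that sizes match the other shops
df["measure"] = df["measure"].str.replace("gm", "g", regex=False)

print(df)
```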

In [None]:
#Cleanup routine for dmart
def clean_dmart_data():
    #Read the dataset
    df  = pd.read_csv("dmart.csv")
    # Pattern to identify combo packs
    searchfor = ['x', 'X', 'W', 'Bags','Sachets','Drops','Wipes','U','Wipe','Pellets','Cubes','unit','units','tablets',r"\+",'XL','L','M','S','Set','Pink']
    #Drop the rows with above patterns
    df.drop( df[ df["weight"].str.contains('|'.join(searchfor)) ].index,inplace=True  )
    #Drop the rows with zero weights
    df.drop( df[ df["weight"] == "0" ].index,inplace=True  )
    #Separate weight in to weight, measure
    df[['weight','measure']] = df.weight.str.split(expand=True) 
    #Replace the gm measure with g to make it consistent with other shops 
    df[ 'measure' ] = df.measure.str.replace('gm','g') 
    #Remove rows that have invalid weight entries (at most one decimal point is allowed)
    df.drop( df[~df["weight"].str.replace('.','',n=1,regex=False).str.isnumeric() ].index,inplace=True  )
    #Create weight_measure column by combining weight and measure
    df[ 'weight_measure' ] = df.weight.str.strip() + df.measure.str.strip()
    #Check if product name contains the brand we searched for
    df['good_product_ind'] = [x[1] in x[0] for x in zip(df['product_name'], df['search_brand'])]
    #Drop the products that don't meet the above criteria
    df.drop( df[~df["good_product_ind"] ].index,inplace=True  )   
    #Re-order columns; copy so that the discount column is added to an independent frame
    df = df[['search_product','search_brand', 'shop' ,'product_name','weight', 'measure', 'weight_measure', 'mrp','special_price']].copy()
    #Calculate discount
    df['discount'] = (df.mrp - df.special_price) *100 / df.mrp 
    #Replace the measure which is part of the product name with blank
    df['product_name'] = df.product_name.str.split(":").str[0].str.strip()
    #Remove duplicate entries for given product and weight
    df.drop_duplicates(subset=[  'product_name','weight' ],keep='last', inplace=True)
    #Save the results to dmart_cleaned
    df.to_csv('dmart_cleaned.csv', index=False)
 


## Clean up for Grofers

In [None]:
df  = pd.read_csv("grofers.csv",  engine='python')
print(df.product_name.unique())
print(df.weight.unique())

<font color=red>Inference: Clean up is required as follows.</font>

 - Some products are combo packs. Remove them, as identifying the individual product becomes difficult during recommendation.
 - Remove products whose weight contains characters other than numbers.
 - Drop duplicate products that may have appeared due to wrong search results returned by the site.

In [None]:
#Cleanup routine for grofers
def clean_grofers_data():
    #Read the dataset
    df  = pd.read_csv("grofers.csv")
    # Pattern to identify combo packs
    searchfor = ['x', 'X', ',', 'Refills', 'Bags','Sachets','Drops','Wipes','U','Wipe','Pellets','Cubes','unit','units','tablets',r"\+",'XL','L','M','S','Set','Pink']
    #Drop the rows with above patterns
    df.drop( df[ df["weight"].str.contains('|'.join(searchfor)) ].index,inplace=True  )
    #Separate weight in to weight, measure
    df[['weight','measure']] = df.weight.str.split(expand=True) 
    #Remove rows that have invalid weight entries (at most one decimal point is allowed)
    df.drop( df[~df["weight"].str.replace('.','',n=1,regex=False).str.isnumeric() ].index,inplace=True  )
    #Create weight_measure column by combining weight and measure
    df[ 'weight_measure' ] = df.weight.str.strip() + df.measure.str.strip()
    #Check if product name contains the brand we searched for
    df['good_product_ind'] = [x[1] in x[0] for x in zip(df['product_name'], df['search_brand'])]
    #Drop the products that don't meet the above criteria
    df.drop( df[~df["good_product_ind"] ].index,inplace=True  )    
    #Re-order columns; copy so that the discount column is added to an independent frame
    df = df[['search_product','search_brand', 'shop' ,'product_name','weight', 'measure', 'weight_measure', 'mrp','special_price']].copy()
    #Calculate discount
    df['discount'] = (df.mrp - df.special_price) *100 / df.mrp 
    #Remove duplicate entries for given product and weight
    df.drop_duplicates(subset=[  'product_name','weight'],keep='last', inplace=True)
    #Save the results to grofers_cleaned
    df.to_csv('grofers_cleaned.csv', index=False)


## Clean up for All

In [None]:
#Run cleanup 
clean_bigbasket_data()
clean_dmart_data()
clean_grofers_data()

df_bb  = pd.read_csv("big_basket_cleaned.csv")
df_dmart  = pd.read_csv("dmart_cleaned.csv")
df_grofers  = pd.read_csv("grofers_cleaned.csv")
 
print("Bigbasket products count after cleanup:" , df_bb.shape[0] )
print("DMart products count after cleanup:" , df_dmart.shape[0] )
print("Grofers products count after cleanup:" , df_grofers.shape[0] )

# Data Integration

## Approach:

 Combine the files generated in the previous section into a single file as follows
 1. Load the following scraped and cleaned files into separate data frames 
     - big_basket_cleaned.csv
     - dmart_cleaned.csv
     - grofers_cleaned.csv 
 2. Join all 3 data frames into a single data frame
 3. Sort the products by product type and name so that related products are together irrespective of the shop
 4. Write the data frame into a final file
      - combined.csv
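
The join and sort steps can be sketched on two small, made-up data frames. The shop names come from this notebook; the product rows are invented.

```python
import pandas as pd

# Made-up cleaned results for two shops
df_shop_1 = pd.DataFrame({
    "search_product": ["Tea", "Oil"],
    "product_name": ["Brand A Tea", "Brand B Oil"],
    "shop": ["Big Basket", "Big Basket"],
})
df_shop_2 = pd.DataFrame({
    "search_product": ["Tea"],
    "product_name": ["Brand A Tea"],
    "shop": ["DMart"],
})

# Stack the frames, then sort so that the same product from different
# shops ends up in adjacent rows
combined = (
    pd.concat([df_shop_1, df_shop_2])
    .sort_values(by=["search_product", "product_name"])
    .reset_index(drop=True)
)

print(combined)
```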

In [None]:
#Combine cleaned files into combined.csv file
def combine_products():
    #Load cleaned files into separate dataframes 
    df_dmart  = pd.read_csv("dmart_cleaned.csv")
    df_grofers  = pd.read_csv("grofers_cleaned.csv")
    df_bb  = pd.read_csv("big_basket_cleaned.csv")
   
    #Combine these files into a single data frame
    df_combined = pd.concat([df_bb, df_dmart, df_grofers])
    
    #Sort them based on searched product and resultant product name
    df_combined.sort_values(by=[ 'search_product' , 'product_name'], inplace=True)
    #Reset the index
    df_combined = df_combined.reset_index(drop=True)
    #Store the results in final file combined.csv
    df_combined.to_csv('combined.csv', index=False)
     
    return df_combined

In [None]:
#Combine Products 
combine_products()
#Read combined products into data frame
df_combined  = pd.read_csv("combined.csv")
print("Final Scrapped and Cleaned Product Count:" , df_combined.shape[0] )

# EDA & RECOMMENDATION

## Approach:

 Analyse the combined file generated in the previous section as follows
 1. Load the scraped, cleaned and combined file into a data frame 
     - combined.csv
 2. Perform EDA
 3. Provide a user interface where users choose a product category, name and size, and recommend the shop to buy from along with the discount offered
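
The core of the recommendation step, without the user interface, can be sketched as a plain function. The name `recommend` and the sample rows are hypothetical, and the sketch uses exact, case-insensitive matching instead of the substring matching used by the widget handlers in this notebook.

```python
import pandas as pd


def recommend(df, product_type, product_name, size):
    """Return (shop, special_price, discount) for the cheapest match, or None."""
    matches = df[
        (df["search_product"].str.lower() == product_type.lower())
        & (df["product_name"].str.lower() == product_name.lower())
        & (df["weight_measure"].str.lower() == size.lower())
    ]
    if matches.empty:
        return None
    # The cheapest offer comes first after sorting by special price
    best = matches.sort_values(by="special_price").iloc[0]
    return best["shop"], best["special_price"], round(best["discount"], 2)


# Made-up rows shaped like combined.csv
sample = pd.DataFrame({
    "search_product": ["Tea", "Tea", "Tea"],
    "product_name": ["Brand A Tea", "Brand A Tea", "Brand A Tea"],
    "weight_measure": ["500g", "500g", "500g"],
    "shop": ["Big Basket", "DMart", "Grofers"],
    "special_price": [190.0, 180.0, 185.0],
    "discount": [5.0, 10.0, 7.5],
})

print(recommend(sample, "Tea", "Brand A Tea", "500g"))
```

Returning `None` for an empty match avoids the index error that `iloc[0]` would raise when no shop stocks the chosen size.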

## EDA  


In [None]:
#Read the file into a dataframe
items = pd.read_csv('combined.csv')

In [None]:
#print shape of the combined dataset
print("The total count of products available across shops ",  items.shape[0])

In [None]:
#print the first few rows
items.head()

In [None]:
#Check if there are any null values in the final dataset
items.isnull().sum()

In [None]:
#Which shops are part of the dataset
items.shop.unique()

In [None]:
#Unique product categories
items.search_product.unique()

In [None]:
#Scatter plot for all attributes
sns.pairplot(items)
plt.show()

In [None]:
#Heatmap for all attributes
plt.figure(figsize = (10,8))
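#Correlation is only defined for numeric columns, so select them explicitly
corr = items.select_dtypes(include='number').corr()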
sns.heatmap(corr, annot=True)
plt.show()



Inference: The heat map above shows a strong positive correlation between mrp and special_price.

In [None]:
## How many products does each shop have to offer?
prods_shops_df = items[['shop','search_product']]
prod_shops_counts =prods_shops_df.groupby(['shop'],as_index =False).count()
prod_shops_counts.plot.bar(x="shop", y="search_product", rot=70, title="Products across Vendors");
 

Inference: Big Basket has the highest number of products available while Grofers has the least. Customers are most likely to find a product in Big Basket.

In [None]:
#How many products are available for each product category overall
products_count =prods_shops_df.groupby(['search_product'],as_index =False).count()
dataFrame = pd.DataFrame(data=products_count);
y_pos = np.arange(len(dataFrame))
# Draw a vertical bar chart
ax = dataFrame.plot.bar(x="search_product", y="shop", rot=80, title="Products and their Availability");
ax.set_xlabel("Product")
ax.set_ylabel("Count")


In [None]:
###  Number of products sold per product category by each shop 
pd.crosstab(items.shop, items.search_product, margins=True, margins_name="Total",rownames=['shop']).T

In [None]:
#  Which Individual Products  have Discount
discounts_df = items[ (items.discount > 0.0)]
  
discounts_shop_products = discounts_df[['search_product','shop']]
discounts_per_shop =discounts_shop_products.groupby(['shop'],as_index =False).count()
ax = discounts_per_shop.plot.bar(x="shop", y="search_product", rot=70, title="Products across Vendors with discount");
ax.set_xlabel("Shop")
ax.set_ylabel("Product count with discount") 
 

Inference: DMart offers the most discounts. So customers looking for discounts should visit DMart.
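
A raw count of discounted products favours shops that list more products. As a possible cross-check, the share of discounted products per shop can be computed as sketched below. The rows and shop names are made up, so the numbers say nothing about the real stores.

```python
import pandas as pd

# Made-up rows shaped like the shop and discount columns of combined.csv
items_sample = pd.DataFrame({
    "shop": ["Shop A", "Shop A", "Shop B", "Shop B", "Shop B"],
    "discount": [0.0, 5.0, 10.0, 0.0, 2.5],
})

# Flag discounted rows, then count, sum and average the flag per shop
items_sample["has_discount"] = items_sample["discount"] > 0
summary = items_sample.groupby("shop")["has_discount"].agg(["count", "sum", "mean"])
summary.columns = ["products", "discounted", "share_discounted"]

print(summary)
```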

## Recommendation  


<font color=red>Note: Ensure combined.csv file is present in current path before executing following recommendation code.</font>

In [None]:
#Import necessary libraries for user interface
from ipywidgets import interact, Dropdown, HTML, Layout, Box, Label
from IPython.display import display
import warnings
warnings.filterwarnings('ignore')

#Read data into dataframe
df_combined = pd.read_csv("combined.csv")

#Unique product types that the user can buy
products_list = df_combined["search_product"].unique()     

#Create dropdown widgets for the user to interact
product_type_dropdown = Dropdown(description="Product Type:", options = products_list)
product_names_dropdown = Dropdown(description="Product:")
product_sizes_dropdown = Dropdown(description="Size:")
recommendation_html = HTML(
    value=" ",
    placeholder='',
    description='',  layout={'width': 'max-content'}
)

box = Box(
    [
        Label(value='Recommendation:'),
        recommendation_html
    ]
) 
#Event handler when product category changes
def update_product_name_options(change):  
    product_type = product_type_dropdown.value         
    df_product = df_combined[  df_combined.search_product.str.lower().str.contains(product_type.lower())    ]
    products = df_product["product_name"].unique()
    product_names_dropdown.options = products
product_type_dropdown.observe(update_product_name_options )
 
#Event handler when product name changes
def update_product_size_options(change): 
    if change['name'] == 'value' and (change['new'] != change['old']):
        product_type = product_type_dropdown.value
        product_name = product_names_dropdown.value
        df_product = df_combined[  df_combined.search_product.str.lower().str.contains(product_type.lower())    ]
        product_name = product_name.replace("+", r"\+")
        df_product = df_product[  df_product.product_name.str.lower().str.contains(product_name.lower())   ] 
        product_sizes = df_product["weight_measure"].unique()
        product_sizes_dropdown.options = product_sizes
product_names_dropdown.observe(update_product_size_options )

#Event handler when product size changes
def recommend_product(change):
    if change['name'] == 'value' and (change['new'] != change['old']):
        product_type = product_type_dropdown.value
        product_name = product_names_dropdown.value
        product_size = product_sizes_dropdown.value
        recommendation_html.value=""
        product_name = product_name.replace("+", r"\+")
        if(product_size):
            df_final_list = df_combined[ df_combined.search_product.str.lower().str.contains(product_type.lower()) & 
                                         df_combined.product_name.str.lower().str.contains(product_name.lower()) & 
                                         df_combined.weight_measure.str.lower().str.contains(product_size.lower()) ] 
            df_final_list.sort_values(by='special_price', inplace=True)
            discount = round(  df_final_list[ "discount"].iloc[0] , 2)
            recommendation_html.value= "Buy it from <b>" + df_final_list.shop.iloc[0] + "</b> for Rs. "  + str(df_final_list[ "special_price"].iloc[0])  + " at a discount of " + str(discount) + "%"
product_sizes_dropdown.observe(recommend_product )

#Display all user interface elements 
display(product_type_dropdown)
display(product_names_dropdown)
display(product_sizes_dropdown)
display(box)
update_product_name_options(None )
    