## Customer Analytics at Bigbasket: Product Recommendations


**1. What are the problems Bigbasket is trying to address? Please be as specific as possible.**

Bigbasket, India's largest online grocery retailer, is striving to tackle two crucial challenges in its business:

* **Enhancing Product Search Efficiency:** Many customers find it difficult to search for products on their mobile devices while placing orders on Bigbasket.com. With an extensive range of items to choose from, the process is time-consuming and unpleasant, especially for customers who order on the go or in a hurry. Bigbasket aims to streamline this process to improve customer satisfaction and encourage repeat orders. In short, ordering must be fast and the relevant products must be prominent; otherwise Bigbasket risks losing many good customers.

* **Reducing Forgetfulness During Checkout:** Another significant issue is customers frequently forgetting to purchase essential items when checking out their orders. This results in additional orders being placed for the forgotten items, leading to increased operational costs for Bigbasket. The company might have to make two trips to the same location within a short period, leading to higher supply chain expenses. Alternatively, customers might buy the forgotten items from nearby stores, resulting in lower order values for Bigbasket in the future.

To address these challenges and enhance the overall customer experience, Bigbasket plans to implement predictive analytics solutions. By introducing the "Smart Basket" feature, the company aims to offer personalized shopping lists based on individual customers' purchase behavior and history. This would help simplify the ordering process, enabling customers to find their regular items quickly. Additionally, the "Did you forget?" feature will provide targeted product recommendations during checkout, minimizing the chances of customers forgetting essential items and increasing order values.

Through these initiatives, Bigbasket aims to optimize their online grocery shopping experience, retain customers, and reduce operational costs in the competitive Indian market.

**2. What is/are some of the fundamental differences in the recommender systems requirements between Bigbasket and other eCommerce companies such as Amazon or Flipkart?**

Bigbasket, Amazon, and Flipkart are all prominent players in the eCommerce space, but they cater to different market segments and product categories, which leads to specific requirements for their recommender systems.

* **Basket Size and Purchase Frequency:** Bigbasket specializes in online grocery retailing, where customers tend to buy a larger number of products per order and purchase daily-use items frequently. Amazon and Flipkart, on the other hand, sell a much broader range of products, and their customers' purchases tend to be more infrequent and diverse, going well beyond groceries. As a result, the recommender system for Bigbasket needs to be fine-tuned to handle grocery-specific preferences.

* **Specific Forgetfulness Scenario:** One unique challenge faced by Bigbasket is customers frequently forgetting to add essential grocery items to their carts before checkout. This situation is less common on other eCommerce platforms, which typically focus on suggesting related or similar items based on user behavior and past purchases, and where customers can easily add more items to the cart or cancel the order and place a new one with the updated cart. Thus, Bigbasket needs a robust recommender system that addresses forgetfulness by providing relevant and timely reminders to customers.

* **Mobile Interface Limitation:** Many Bigbasket customers place orders through their mobile devices on the go. This poses a challenge as they have to navigate a vast product catalog on a smaller screen, making the shopping experience more cumbersome. Therefore, the recommender system for Bigbasket should be optimized for mobile users, offering concise and user-friendly recommendations that facilitate quick and efficient product discovery. By contrast, this is less of a problem on Amazon and Flipkart, where the interface is simple to navigate and ordering is easy.

* **Repetitive Purchases and Inventory Considerations:** Grocery shopping often involves customers buying the same products repeatedly. This pattern contrasts with other eCommerce platforms where users may explore a more diverse range of products regularly. Consequently, Bigbasket's recommender system must be adept at identifying and suggesting commonly purchased items accurately. Additionally, the system needs to factor in real-time inventory updates, considering perishable items' availability and expiration dates to avoid recommending products that cannot be fulfilled promptly.

* **Delivery Efficiency:** Efficient order fulfillment and delivery are crucial for customer satisfaction in the grocery retailing industry. The recommender system should take into account logistics and delivery schedules to suggest items that can be conveniently delivered together, enhancing the overall shopping experience.

In conclusion, the unique characteristics of Bigbasket's online grocery retailing business demand specific adaptations in their recommender system. Understanding the differences in customer behavior, forgetfulness patterns, mobile usage, inventory, and delivery requirements allows Bigbasket to provide personalized and seamless shopping experiences for their customers in the competitive online grocery market.

**3. Discuss what sort of recommendation technique(s) is/are more appropriate for Bigbasket, in general & why? Elaborate by connecting different methods and techniques with specific use cases.**

For Bigbasket, an online grocery retailer facing unique challenges, certain recommendation techniques prove more appropriate for their platform.

* **Collaborative Filtering with Purchase History:** Leveraging collaborative filtering, Bigbasket can offer ``personalized recommendations based on customers' past purchase behavior.`` By analyzing users' buying history, the system identifies patterns and suggests **items frequently bought together by customers with similar preferences.** This addresses the forgetfulness scenario, prompting users to add frequently purchased items they may have missed during checkout.

* **Content-Based Filtering with Product Attributes:** Content-based filtering aligns with Bigbasket's needs, as it relies on product attributes to understand user preferences. By considering factors like categories, brands, and nutritional information, the system suggests relevant products, enhancing product search efficiency and tailoring recommendations to individual preferences.

* **Hybrid Recommendation Approach:** A combination of **collaborative and content-based filtering** allows Bigbasket to provide comprehensive recommendations. Presenting a "Smart Basket" based on purchase history (collaborative filtering) while incorporating similar attribute products (content-based filtering) optimizes the shopping experience.

* **Real-Time Inventory Based Recommendations:** Addressing inventory challenges, real-time inventory-based recommendations ensure that only available and fresh products are suggested. By considering live inventory updates, Bigbasket avoids suggesting items that turn out to be out of stock at delivery time, enhancing customer satisfaction.

In conclusion, by combining collaborative filtering, content-based filtering, hybrid recommendations & real-time inventory-based techniques, Bigbasket can tailor its recommendations, improve efficiency, reduce forgetfulness, and optimize the shopping experience for its customers in the competitive online grocery market.
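The collaborative-filtering idea above can be illustrated with a small item-to-item similarity sketch. It is only a minimal illustration under stated assumptions: the member IDs and purchase counts below are hypothetical, not taken from the case dataset, and cosine similarity is one common choice of similarity measure rather than the method Bigbasket actually uses.

```python
from math import sqrt

# Hypothetical member -> {category: purchase count} data (illustrative only).
purchases = {
    "M1": {"Beans": 3, "Brinjals": 2, "Almonds": 0},
    "M2": {"Beans": 1, "Brinjals": 1, "Almonds": 0},
    "M3": {"Beans": 0, "Brinjals": 0, "Almonds": 4},
}


def item_vector(item):
    """Purchase counts of one item across all members, in a fixed member order."""
    return [purchases[member].get(item, 0) for member in sorted(purchases)]


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_items(target, top_n=2):
    """Rank the other items by how similar their purchase pattern is to target."""
    items = {item for counts in purchases.values() for item in counts}
    scores = [
        (other, cosine(item_vector(target), item_vector(other)))
        for other in items
        if other != target
    ]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


neighbours = similar_items("Beans")
for item, score in neighbours:
    print(f"{item}: similarity {score:.3f}")
```

In this toy data, members who buy Beans also buy Brinjals, so Brinjals ranks as the closest neighbour, while Almonds (bought only by a different member) scores zero. A content-based or hybrid variant would replace or blend these purchase vectors with product-attribute vectors.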


**4. What sort of data challenges do you anticipate while building a recommendation engine for Bigbasket? Explore the data set and try applying different techniques on the dataset to identify the challenges that one would face while building the engine for Bigbasket.**

Several data challenges can surface when constructing a recommendation engine for Bigbasket:

1. **Sparse Data:** Each order contains only a small fraction of the catalog, so the order-by-product matrix is mostly zeros. Such sparse interactions between users and products make it difficult to generate accurate recommendations for all users.

2. **Cold Start Problem:** Providing relevant recommendations for new users or products with limited data can be challenging as the engine lacks sufficient information to understand their preferences.

3. **Data Volume:** Managing a large volume of data, including user interactions, product details, and historical purchases, demands efficient storage and processing capabilities.

4. **Data Variety:** The dataset may encompass various data types, such as categorical, numerical, and textual data, requiring specialized techniques for processing and analysis.

5. **Data Quality:** Ensuring data accuracy and cleanliness is vital for generating reliable recommendations, as noisy or erroneous data can lead to inaccurate results.

6. **Dynamic Data:** User preferences and product popularity may change over time, necessitating continuous updates to keep the recommendations up-to-date.

7. **Cold Start Products:** New products with limited purchase history may not have enough data for recommendations, leading to biased or inaccurate suggestions.

8. **User Behavior Shifts:** As user preferences and behavior shift over time, the engine must adapt to these changes and provide relevant recommendations accordingly.

9. **Scalability:** With a growing user base and product catalog, the engine must handle increased data volume and provide timely recommendations without compromising performance.

Addressing these challenges involves employing advanced data processing techniques, implementing effective algorithms, and conducting continuous evaluations to ensure the recommendation engine's accuracy and scalability.
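To make the sparsity challenge concrete, the share of empty cells in an order-by-category matrix can be measured directly. The sketch below uses a handful of hypothetical line items shaped like the POS DATA rows (an order ID and a category description); the values are illustrative only. On the real data, and assuming the binary matrix `encoded_binary_vars` from the cells below has been built, a comparable figure could be obtained with `1 - encoded_binary_vars.values.mean()`.

```python
# Hypothetical (order, category) line items shaped like the POS DATA rows.
line_items = [
    (1, "Beans"), (1, "Sugar"),
    (2, "Beans"),
    (3, "Almonds"), (3, "Sugar"), (3, "Bread"),
    (4, "Bread"),
]

orders = {order for order, _ in line_items}
categories = {category for _, category in line_items}

# A cell is "filled" when the category appears at least once in the order.
filled_cells = len(set(line_items))
total_cells = len(orders) * len(categories)

# Sparsity is the fraction of order/category combinations never observed.
sparsity = 1 - filled_cells / total_cells
print(f"{filled_cells} of {total_cells} cells filled, sparsity = {sparsity:.2%}")
```

A high sparsity value means most item combinations are rarely or never observed together, which is why a minimum support threshold has to be chosen carefully when mining rules.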

In [None]:
# Importing the required libraries...
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

# Recommender System Libraries...
from mlxtend.frequent_patterns import apriori, association_rules

In [None]:
# Importing the Dataset...

# sheet_name="POS DATA" selects the sheet to load, since the Excel workbook contains two sheets.

grocery = pd.read_excel("~/Downloads/IMB575-XLS-ENG.xls", sheet_name="POS DATA")

In [None]:
# Preview the Dataset
grocery.head()

Unnamed: 0,Member,Order,SKU,Created On,Description
0,M09736,6468572,34993740,22-09-2014 22:45,Other Sauces
1,M09736,6468572,15669800,22-09-2014 22:45,Cashews
2,M09736,6468572,34989501,22-09-2014 22:45,Other Dals
3,M09736,6468572,7572303,22-09-2014 22:45,Namkeen
4,M09736,6468572,15669856,22-09-2014 22:45,Sugar


In [None]:
# Check the Data Info
grocery.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62141 entries, 0 to 62140
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Member       62141 non-null  object
 1   Order        62141 non-null  int64 
 2   SKU          62141 non-null  int64 
 3   Created On   62141 non-null  object
 4   Description  62141 non-null  object
dtypes: int64(2), object(3)
memory usage: 2.4+ MB


In [None]:
# Convert the dataset into a transactional (Order x Description) count table, using the product Description as the item
encoded = grocery.groupby(['Order', 'Description'])['Description'].count().unstack().fillna(0)

In [None]:
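def map_units(x):
    # Binarize purchase counts: 1 if the item appears in the order, otherwise 0
    return 1 if x >= 1 else 0

# Apply the binarization to every cell of the Order x Description table
encoded_binary_vars = encoded.applymap(map_units)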

In [None]:
# Generate frequent itemsets using Apriori algorithm with minimum support of 0.05
frequent_itemsets = apriori(encoded_binary_vars, min_support=0.05, use_colnames=True)

In [None]:
# Generate association rules with minimum lift of 1.0
association_rules_df = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

association_rules_df.sort_values('confidence', ascending = False, inplace = True)
association_rules_df[['antecedents','consequents','support','confidence','lift']].head(10)

Unnamed: 0,antecedents,consequents,support,confidence,lift
331,"(Gourd & Cucumber, Root Vegetables, Brinjals)",(Other Vegetables),0.064505,0.769559,1.798855
319,"(Gourd & Cucumber, Beans, Root Vegetables)",(Other Vegetables),0.086086,0.761603,1.780259
275,"(Gourd & Cucumber, Beans, Brinjals)",(Other Vegetables),0.070824,0.752852,1.759801
305,"(Beans, Root Vegetables, Brinjals)",(Other Vegetables),0.075355,0.729792,1.705899
318,"(Other Vegetables, Root Vegetables, Gourd & Cu...",(Beans),0.086086,0.71627,1.794848
192,"(Gourd & Cucumber, Beans)",(Other Vegetables),0.131632,0.715489,1.672466
258,"(Gourd & Cucumber, Root Vegetables)",(Other Vegetables),0.120186,0.713881,1.668707
233,"(Gourd & Cucumber, Brinjals)",(Other Vegetables),0.100274,0.709106,1.657546
276,"(Gourd & Cucumber, Other Vegetables, Brinjals)",(Beans),0.070824,0.706302,1.76987
246,"(Root Vegetables, Brinjals)",(Other Vegetables),0.109336,0.689992,1.612867


**5. What are the implementation and deployment challenges of a recommendation engine for Bigbasket?**

Implementing a recommendation engine for Bigbasket comes with its own set of challenges, requiring careful consideration:

* **Real-time and Batch Processing:** The recommendation engine must seamlessly handle real-time user interactions and historical data analysis for generating long-term suggestions.

* **Selecting the Right Algorithms:** Choosing appropriate recommendation algorithms is crucial to provide accurate and relevant suggestions based on Bigbasket's product offerings and user behavior.

* **Personalization at Scale:** Delivering personalized recommendations to a large user base demands efficient data processing and algorithms.

* **Balancing Personalization and Diversity:** Striking a balance between personalized recommendations and diverse suggestions enhances the shopping experience.

* **Scalable Infrastructure:** To accommodate growing user traffic and data, the engine's infrastructure must be scalable and ensure uninterrupted service.

* **Continuous Model Updates:** Regular updates based on user behavior and market trends keep the recommendations accurate and relevant.

To tackle these challenges, robust infrastructure, thoughtful algorithm selection, and adherence to best practices are essential. Continuous monitoring and optimization ensure the recommendation engine's efficiency, guided by user feedback and performance metrics. By addressing these factors, Bigbasket can deploy a successful recommendation engine, enhancing customer engagement and loyalty in the competitive online grocery market.

**6. Write an R/Python script to generate five consumer-agnostic “good-quality” association rules sorted by their lift ratios? Please include the output (copy-paste/screenshot) ─ those five rules along with metrics that quantify the “goodness” of those rules ─ in your submission. What is the support of the first rule? Explain how it has been calculated for this rule. What is the confidence of the first rule? Explain how it has been calculated for this rule. What is the lift ratio of the first rule? Explain how it has been calculated for the rule. Based on the five rules you identified, suggest a couple of action plans that can benefit Bigbasket.**


Note: the rules were generated in the answer to Question 4 above, where the output is sorted by confidence.

The table lists association rules derived from the frequent itemsets found by the Apriori algorithm, a method commonly used in recommender systems. Each row presents one rule with its antecedents, consequents, support, confidence, and lift.

* **Antecedents:** The antecedents are the set of items forming the "if" part of the association rule. For example, the antecedents (Gourd & Cucumber, Root Vegetables, Brinjals) describe a combination of items found together in the same order, i.e. checked out in the same cart.


* **Consequents:** The consequents are sets of items that are **likely to occur when the antecedents are present,** forming the "then" part of the association rule. For instance, the consequents (Other Vegetables) indicate items that are frequently associated with the given antecedents.


* **Support:** The support measures the proportion of transactions in the dataset that contain both the antecedents and consequents. It indicates **how frequently the association rule occurs.** For instance, a support value of 0.064505 for the rule (Gourd & Cucumber, Root Vegetables, Brinjals) -> (Other Vegetables) suggests that this combination is present in about 6.45% of all transactions.


* **Confidence:** Confidence gauges the reliability or strength of the association rule. It is the proportion of transactions containing the antecedents that also contain the consequents. For example, a **confidence value of 0.769559 for the rule (Gourd & Cucumber, Root Vegetables, Brinjals) -> (Other Vegetables) means that when (Gourd & Cucumber, Root Vegetables, Brinjals) is present in a transaction, there is a 76.96% chance that (Other Vegetables) will also be present.**


* **Lift:** Lift indicates how much more likely the consequents are to occur when the antecedents are present compared to when they occur independently. **It signifies the strength of the association between the antecedents and consequents.** A lift value ``greater than 1 suggests a positive association``. For example, a lift value of 1.798855 for the rule (Gourd & Cucumber, Root Vegetables, Brinjals) -> (Other Vegetables) implies that the occurrence of (Gourd & Cucumber, Root Vegetables, Brinjals) increases the likelihood of (Other Vegetables) being present by approximately 79.89%.
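The arithmetic behind the first rule can be retraced from the three reported values. The table does not show the individual supports of the antecedents or the consequent, so the sketch below backs them out from the definitions confidence = support(A and B) / support(A) and lift = confidence / support(B). The implied supports are therefore derived quantities, not figures read from the notebook output.

```python
# Values copied from the first row of the rules table above.
support_rule = 0.064505   # support(A and B): share of orders containing all four items
confidence = 0.769559     # support(A and B) / support(A)
lift = 1.798855           # confidence / support(B)

# Back out the supports that the table does not show directly.
support_antecedents = support_rule / confidence   # implied support(A)
support_consequent = confidence / lift            # implied support(B)

# Equivalent formulation of lift: observed co-occurrence divided by the
# co-occurrence expected if A and B were independent.
lift_from_supports = support_rule / (support_antecedents * support_consequent)

print(f"implied support(A) = {support_antecedents:.4f}")
print(f"implied support(B) = {support_consequent:.4f}")
print(f"lift from supports = {lift_from_supports:.6f}")
print(f"uplift over baseline = {lift - 1:.2%}")
```

Read this way, the antecedent set appears in roughly 8.4% of orders and Other Vegetables in roughly 42.8%, so a confidence of about 77% is well above the baseline rate at which Other Vegetables is bought.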


These association rules are valuable in a recommender system, as they help identify strong relationships between items and enable the system to suggest relevant products to customers based on their purchase history. This enhances the shopping experience, potentially leading to increased sales, customer satisfaction, and improved inventory management for the business.
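Since the question asks for rules ordered by lift, while the table above is ordered by confidence, the ten displayed rows can be re-ranked as a quick check. This is a sketch over the displayed rows only: the full rule set may contain rules with higher lift that did not make the top ten by confidence. On the full DataFrame, `association_rules_df.sort_values('lift', ascending=False).head(5)` would be the direct equivalent.

```python
# (row id, antecedents, consequents, support, confidence, lift), copied from
# the ten rows displayed above. Row 318 is truncated in the notebook output
# and is kept exactly as shown.
rules = [
    (331, "Gourd & Cucumber, Root Vegetables, Brinjals", "Other Vegetables", 0.064505, 0.769559, 1.798855),
    (319, "Gourd & Cucumber, Beans, Root Vegetables", "Other Vegetables", 0.086086, 0.761603, 1.780259),
    (275, "Gourd & Cucumber, Beans, Brinjals", "Other Vegetables", 0.070824, 0.752852, 1.759801),
    (305, "Beans, Root Vegetables, Brinjals", "Other Vegetables", 0.075355, 0.729792, 1.705899),
    (318, "Other Vegetables, Root Vegetables, Gourd & Cu...", "Beans", 0.086086, 0.71627, 1.794848),
    (192, "Gourd & Cucumber, Beans", "Other Vegetables", 0.131632, 0.715489, 1.672466),
    (258, "Gourd & Cucumber, Root Vegetables", "Other Vegetables", 0.120186, 0.713881, 1.668707),
    (233, "Gourd & Cucumber, Brinjals", "Other Vegetables", 0.100274, 0.709106, 1.657546),
    (276, "Gourd & Cucumber, Other Vegetables, Brinjals", "Beans", 0.070824, 0.706302, 1.76987),
    (246, "Root Vegetables, Brinjals", "Other Vegetables", 0.109336, 0.689992, 1.612867),
]

# Sort by lift (last field), highest first, and keep the best five.
top_five = sorted(rules, key=lambda rule: rule[5], reverse=True)[:5]

for row_id, antecedents, consequents, support, confidence, lift in top_five:
    print(f"{row_id}: ({antecedents}) -> ({consequents}) "
          f"support={support:.4f} confidence={confidence:.4f} lift={lift:.4f}")
```

Among the displayed rows, the first rule by confidence is also the first by lift, so the support, confidence, and lift explanations above apply to it unchanged.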

**7. Bigbasket is interested in introducing a “Did you forget?” feature to identify items a customer may have forgotten. Discuss how this feature can be created following data-driven approaches.**

Introducing a ``"Did you forget?"`` feature for Bigbasket can be achieved through data-driven approaches that harness customer data and predictive analytics. Here's how this feature can be developed:

* Data Collection: Bigbasket must collect and store customer data, including purchase history, shopping patterns, and interactions on the platform. This data forms the basis for building the recommendation engine.

* Customer Segmentation: Segmenting customers based on their shopping behavior, preferences, and purchase frequency allows the recommendation engine to personalize suggestions for each customer segment effectively.

* Predictive Analytics: Implementing predictive analytics techniques, such as machine learning algorithms, aids in predicting the items a customer is likely to purchase in the future. This involves analyzing historical data to identify patterns and trends.

* Collaborative Filtering: Collaborative filtering is a popular recommendation technique that relies on user-item interactions. It identifies items frequently purchased together by different users and suggests them as potential forgotten items to the current customer.

* Item Association Rules: Utilizing association rule mining helps identify relationships between different items in the shopping cart. It allows the system to recommend items that are often purchased together, even if the customer hasn't added them initially.

* Contextual Recommendations: Consider contextual factors like time of day, day of the week, or past purchase behavior to tailor the recommendations more accurately. For example, suggesting breakfast items in the morning or weekend-specific products.

* Real-time Data Processing: To offer timely recommendations, the engine should process data in real-time, continuously updating and adapting to the user's actions and preferences.

* A/B Testing: Implementing A/B testing helps validate the effectiveness of the "Did you forget?" feature and fine-tune its performance. It involves comparing different recommendation strategies to identify the most effective one.

* Continuous Improvement: Regularly monitor the feature's performance, gather user feedback, and collect relevant performance metrics. Use this information to make iterative improvements and enhance the overall customer experience.

By combining these data-driven approaches, Bigbasket can create a "Did you forget?" feature that proactively suggests items to customers, improving customer satisfaction, increasing order frequency, and reducing the likelihood of forgotten items in future purchases. This data-driven approach ensures that the recommendations are accurate, relevant, and personalized for each individual customer, thereby enhancing the overall shopping experience.
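The association-rule part of this approach can be sketched as a lookup: find the rules whose antecedents are already in the cart, and suggest their consequents that are still missing. The four rules below are copied from the rules table in the answer to Question 4; the function name `did_you_forget` and the `min_confidence` threshold of 0.7 are hypothetical choices made for illustration, not part of the original notebook.

```python
# (antecedents, consequents, confidence), copied from the rules table above.
rules = [
    (frozenset({"Gourd & Cucumber", "Beans", "Brinjals"}), frozenset({"Other Vegetables"}), 0.752852),
    (frozenset({"Gourd & Cucumber", "Beans"}), frozenset({"Other Vegetables"}), 0.715489),
    (frozenset({"Gourd & Cucumber", "Other Vegetables", "Brinjals"}), frozenset({"Beans"}), 0.706302),
    (frozenset({"Root Vegetables", "Brinjals"}), frozenset({"Other Vegetables"}), 0.689992),
]


def did_you_forget(cart, rules, min_confidence=0.7):
    """Suggest items implied by the cart, ranked by the best matching rule's confidence."""
    cart = set(cart)
    best = {}
    for antecedents, consequents, confidence in rules:
        # A rule applies only if it is confident enough and fully covered by the cart.
        if confidence < min_confidence or not antecedents <= cart:
            continue
        # Suggest only the consequents that are not in the cart yet.
        for item in consequents - cart:
            best[item] = max(best.get(item, 0.0), confidence)
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))


suggestions = did_you_forget(["Gourd & Cucumber", "Beans", "Brinjals"], rules)
for item, confidence in suggestions:
    print(f"Did you forget {item}? (confidence {confidence:.2%})")
```

Keeping only the highest-confidence rule per suggested item avoids showing the same reminder several times when multiple rules point to it, which matters on a small mobile screen.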

**8. Using Python, Pick a customer of your choice. Now, present an example of product sets A and B such that when a shopper has added product set A in his/her cart, the recommendation engine should suggest product sets B as “Did you forget items?” Support your answers using output generated by your scripts.**

The above can be achieved in the following steps:

* Filter the DataFrame with a Random Selected ID

* Create Recommender System using Apriori Algorithm

* Generate the Association Rules based on the Recommender System created in the above step

* Create Did you Forget Item List

In [None]:
# Preview the Dataset
grocery.head()

Unnamed: 0,Member,Order,SKU,Created On,Description
0,M09736,6468572,34993740,22-09-2014 22:45,Other Sauces
1,M09736,6468572,15669800,22-09-2014 22:45,Cashews
2,M09736,6468572,34989501,22-09-2014 22:45,Other Dals
3,M09736,6468572,7572303,22-09-2014 22:45,Namkeen
4,M09736,6468572,15669856,22-09-2014 22:45,Sugar


In [None]:
grocery.Member.unique()

array(['M09736', 'M39021', 'M47229', 'M76390', 'M77779', 'M78365',
       'M78720', 'M82651', 'M84827', 'M86304', 'M86572', 'M90375',
       'M91098', 'M96365', 'M99030', 'M99206', 'M04158', 'M08075',
       'M09303', 'M12050', 'M12127', 'M14746', 'M16218', 'M16611',
       'M18732', 'M22037', 'M25900', 'M27458', 'M27871', 'M31101',
       'M31908', 'M31966', 'M32039', 'M32409', 'M32449', 'M32480',
       'M32655', 'M33064', 'M33422', 'M33491', 'M33558', 'M33745',
       'M33767', 'M34566', 'M35070', 'M35464', 'M35538', 'M35649',
       'M36366', 'M36432', 'M36702', 'M36876', 'M37253', 'M37600',
       'M38622', 'M40184', 'M41700', 'M41747', 'M41781', 'M42182',
       'M42513', 'M42827', 'M43189', 'M43831', 'M43977', 'M44156',
       'M45375', 'M45470', 'M46325', 'M46328', 'M46575', 'M46687',
       'M48101', 'M48154', 'M48938', 'M50038', 'M50094', 'M50420',
       'M50767', 'M51043', 'M51278', 'M52629', 'M54100', 'M54345',
       'M54382', 'M54619', 'M54796', 'M55932', 'M56255', 'M563

In [None]:
# Selecting a Member randomly using sample function from Pandas
grocery.sample(1, random_state = 1)

Unnamed: 0,Member,Order,SKU,Created On,Description
10774,M14746,8048328,15669977,16-04-2014 08:03,Almonds


In [None]:
# The sampled row belongs to Member ID M14746. First, build a Member x Description purchase-count table for all members.

encode_member = grocery.groupby(['Member', 'Description'])['Description'].count().unstack().fillna(0)
encode_member.head()

Description,After Shave,Agarbatti,Almonds,Aluminium Foil & Cling Wrap,Antiseptics,Avalakki / Poha,Ayurvedic,Ayurvedic Food,Baby Care Accessories,Baby Cereal,...,Vanaspati,Veg & Fruit,Vermicelli,Vinegar,Wafers,Washing Bars,Whole Grains,Whole Spices,Womens Deo,Yogurt & Lassi
Member,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
M04158,0.0,0.0,8.0,9.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,1.0,0.0,0.0,0.0,9.0,0.0,8.0
M08075,0.0,0.0,2.0,0.0,0.0,8.0,0.0,0.0,0.0,0.0,...,0.0,0.0,2.0,0.0,1.0,0.0,1.0,34.0,0.0,8.0
M09303,0.0,0.0,6.0,12.0,0.0,8.0,0.0,1.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,0.0,28.0,0.0,0.0
M09736,0.0,0.0,8.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,2.0,8.0,0.0,33.0,0.0,1.0
M12050,0.0,0.0,5.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0


In [None]:
# Filtering the DataFrame with the Member ID: M14746

filtered_df_member = grocery[grocery['Member'] == 'M14746']

In [None]:
filtered_df_member.head()

Unnamed: 0,Member,Order,SKU,Created On,Description
10304,M14746,6436519,15669760,2014-02-10 22:31:00,Whole Spices
10305,M14746,6436519,34938526,2014-02-10 22:31:00,Other Sweets
10306,M14746,6436519,15669818,2014-02-10 22:31:00,Whole Spices
10307,M14746,6436519,7598777,2014-02-10 22:31:00,Agarbatti
10308,M14746,6505503,7569801,15-09-2014 12:43,Namkeen


In [None]:
# Count the rows (purchased line items) recorded for this member...

filtered_df_member.shape

(647, 5)

**We pick member 'M14746' to build the "Did you forget?" algorithm. The filtered data contains 647 rows, i.e. 647 purchased line items spread across this member's orders (not 647 separate orders), which indicates a frequent buyer at Bigbasket.**

In [None]:
# Build the Order x Description count table for this member; missing combinations are filled with 0
member_table = filtered_df_member.groupby(['Order', 'Description'])['Description'].count().unstack().fillna(0)
member_table

Description,Agarbatti,Almonds,Aluminium Foil & Cling Wrap,Avalakki / Poha,Ayurvedic,Banana,Beans,Besan,Boiled Rice,Bread,...,Shampoo,Sooji & Rava,Sugar Cubes,Toffee & Candy,Toor Dal,Urad Dal,Vermicelli,Whole Grains,Whole Spices,Yogurt & Lassi
Order,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
6436519,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0
6505503,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6543425,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6577820,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
6608795,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8332823,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
8348714,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8363220,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8373867,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
# Mapping with the function created above...

encoded_member = member_table.applymap(map_units)
encoded_member.head(2)

Description,Agarbatti,Almonds,Aluminium Foil & Cling Wrap,Avalakki / Poha,Ayurvedic,Banana,Beans,Besan,Boiled Rice,Bread,...,Shampoo,Sooji & Rava,Sugar Cubes,Toffee & Candy,Toor Dal,Urad Dal,Vermicelli,Whole Grains,Whole Spices,Yogurt & Lassi
Order,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
6436519,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
6505503,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [None]:
# Apriori Algorithm..

frequent_member = apriori(encoded_member, min_support = 0.05, use_colnames = True)

In [None]:
frequent_member

Unnamed: 0,support,itemsets
0,0.112676,(Agarbatti)
1,0.154930,(Almonds)
2,0.070423,(Avalakki / Poha)
3,0.521127,(Beans)
4,0.070423,(Besan)
...,...,...
738,0.056338,"(Raisins, Gourd & Cucumber, Yogurt & Lassi, Ot..."
739,0.056338,"(Raisins, Yogurt & Lassi, Other Vegetables, Ro..."
740,0.056338,"(Raisins, Gourd & Cucumber, Other Vegetables, ..."
741,0.056338,"(Raisins, Gourd & Cucumber, Yogurt & Lassi, Ot..."


In [None]:
# Generate association rules with minimum lift of 1.0
association_rules_df = association_rules(frequent_member, metric="lift", min_threshold=1.0)

association_rules_df.sort_values('confidence', ascending = False, inplace = True)
association_rules_df[['antecedents','consequents','support','confidence','lift']].head(10)

Unnamed: 0,antecedents,consequents,support,confidence,lift
10451,"(Gourd & Cucumber, Root Vegetables, Cashews, O...","(Other Dry Fruits, Raisins)",0.056338,1.0,5.916667
1475,"(Yogurt & Lassi, Beans, Almonds)",(Gourd & Cucumber),0.056338,1.0,1.868421
1473,"(Gourd & Cucumber, Yogurt & Lassi, Almonds)",(Beans),0.056338,1.0,1.918919
6255,"(Gourd & Cucumber, Yogurt & Lassi, Other Dry F...",(Cashews),0.056338,1.0,4.176471
11013,"(Almonds, Cashews, Other Vegetables, Yogurt & ...","(Other Dry Fruits, Raisins)",0.056338,1.0,5.916667
6256,"(Gourd & Cucumber, Yogurt & Lassi, Cashews, Br...",(Other Dry Fruits),0.056338,1.0,2.958333
12218,"(Cashews, Root Vegetables, Other Dry Fruits, Y...","(Gourd & Cucumber, Other Vegetables, Raisins)",0.056338,1.0,8.875
12219,"(Other Vegetables, Root Vegetables, Other Dry ...","(Gourd & Cucumber, Yogurt & Lassi, Raisins)",0.056338,1.0,14.2
11924,"(Gourd & Cucumber, Root Vegetables, Other Vege...","(Yogurt & Lassi, Raisins, Brinjals)",0.056338,1.0,14.2
6260,"(Gourd & Cucumber, Yogurt & Lassi, Cashews)","(Other Dry Fruits, Brinjals)",0.056338,1.0,4.176471


In [None]:
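# Recommendation engine suggesting "Did you forget?" items:
# print one reminder per rule (antecedents = items in the cart, consequents = suggestions)
for index, row in association_rules_df.iterrows():
    ant = ', '.join(row['antecedents'])
    conseq = ', '.join(row['consequents'])
    # Adjacent f-strings are concatenated, so the message stays on one output line
    print(f"Now that you have bought these items: {ant}, you are suggested these items , "
          f"    since you might have forgot: {conseq}")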

Now that you have bought these items: Gourd & Cucumber, Root Vegetables, Cashews, Other Vegetables, you are suggested these items ,     since you might have forgot: Other Dry Fruits, Raisins
Now that you have bought these items: Yogurt & Lassi, Beans, Almonds, you are suggested these items ,     since you might have forgot: Gourd & Cucumber
Now that you have bought these items: Gourd & Cucumber, Yogurt & Lassi, Almonds, you are suggested these items ,     since you might have forgot: Beans
Now that you have bought these items: Gourd & Cucumber, Yogurt & Lassi, Other Dry Fruits, Brinjals, you are suggested these items ,     since you might have forgot: Cashews
Now that you have bought these items: Almonds, Cashews, Other Vegetables, Yogurt & Lassi, Brinjals, you are suggested these items ,     since you might have forgot: Other Dry Fruits, Raisins
Now that you have bought these items: Gourd & Cucumber, Yogurt & Lassi, Cashews, Brinjals, you are suggested these items ,     since you migh

Now that you have bought these items: Other Dry Fruits, Root Vegetables, Raisins, you are suggested these items ,     since you might have forgot: Yogurt & Lassi, Cashews, Brinjals
Now that you have bought these items: Yogurt & Lassi, Raisins, Brinjals, you are suggested these items ,     since you might have forgot: Root Vegetables
Now that you have bought these items: Yogurt & Lassi, Cashews, Raisins, Brinjals, you are suggested these items ,     since you might have forgot: Other Vegetables, Root Vegetables
Now that you have bought these items: Cashews, Raisins, Yogurt & Lassi, you are suggested these items ,     since you might have forgot: Root Vegetables
Now that you have bought these items: Raisins, Root Vegetables, Other Dry Fruits, you are suggested these items ,     since you might have forgot: Cashews
Now that you have bought these items: Yogurt & Lassi, Root Vegetables, you are suggested these items ,     since you might have forgot: Raisins, Brinjals
Now that you have boug

Now that you have bought these items: Other Vegetables, Cashews, Raisins, Brinjals, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Root Vegetables
Now that you have bought these items: Cashews, Root Vegetables, you are suggested these items ,     since you might have forgot: Other Vegetables, Other Dry Fruits, Yogurt & Lassi, Raisins
Now that you have bought these items: Other Vegetables, Root Vegetables, Other Dry Fruits, you are suggested these items ,     since you might have forgot: Beans, Raisins
Now that you have bought these items: Gourd & Cucumber, Beans, Almonds, you are suggested these items ,     since you might have forgot: Yogurt & Lassi
Now that you have bought these items: Yogurt & Lassi, Other Dry Fruits, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Cashews, Root Vegetables, Brinjals
Now that you have bought these items: Yogurt & Lassi, Other Dry Fruits, you are suggested these items ,     since y

Now that you have bought these items: Cashews, Other Dry Fruits, Raisins, you are suggested these items ,     since you might have forgot: Beans
Now that you have bought these items: Beans, Raisins, Brinjals, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Other Dry Fruits
Now that you have bought these items: Gourd & Cucumber, Cashews, Raisins, you are suggested these items ,     since you might have forgot: Beans
Now that you have bought these items: Cashews, Other Dry Fruits, Raisins, you are suggested these items ,     since you might have forgot: Yogurt & Lassi
Now that you have bought these items: Gourd & Cucumber, Beans, you are suggested these items ,     since you might have forgot: Other Vegetables
Now that you have bought these items: Almonds, Other Dry Fruits, you are suggested these items ,     since you might have forgot: Beans, Brinjals
Now that you have bought these items: Beans, Other Dry Fruits, Brinjals, you are suggested these item

Now that you have bought these items: Gourd & Cucumber, Beans, Brinjals, you are suggested these items ,     since you might have forgot: Other Dry Fruits
Now that you have bought these items: Other Vegetables, Organic F&V, Other Dry Fruits, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Beans
Now that you have bought these items: Organic F&V, Raisins, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Cashews, Other Dry Fruits
Now that you have bought these items: Gourd & Cucumber, Cashews, Brinjals, you are suggested these items ,     since you might have forgot: Other Dry Fruits, Beans, Raisins
Now that you have bought these items: Gourd & Cucumber, Cashews, Brinjals, you are suggested these items ,     since you might have forgot: Raisins, Other Vegetables, Root Vegetables, Other Dry Fruits, Yogurt & Lassi
Now that you have bought these items: Gourd & Cucumber, Cashews, Brinjals, you are suggested these items ,    

Now that you have bought these items: Root Vegetables, Brinjals, you are suggested these items ,     since you might have forgot: Yogurt & Lassi, Other Dry Fruits
Now that you have bought these items: Root Vegetables, Brinjals, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Yogurt & Lassi, Other Vegetables, Raisins
Now that you have bought these items: Gourd & Cucumber, Organic F&V, you are suggested these items ,     since you might have forgot: Cashews, Brinjals
Now that you have bought these items: Other Vegetables, Root Vegetables, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Yogurt & Lassi, Raisins, Brinjals
Now that you have bought these items: Other Dry Fruits, Brinjals, you are suggested these items ,     since you might have forgot: Raisins, Gourd & Cucumber, Other Vegetables, Root Vegetables, Cashews
Now that you have bought these items: Gourd & Cucumber, Organic F&V, you are suggested these items ,    

Now that you have bought these items: Beans, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Other Vegetables, Brinjals, Bread
Now that you have bought these items: Beans, you are suggested these items ,     since you might have forgot: Other Vegetables, Organic F&V, Root Vegetables, Brinjals
Now that you have bought these items: Beans, you are suggested these items ,     since you might have forgot: Gourd & Cucumber, Toor Dal, Raisins, Brinjals
Now that you have bought these items: Gourd & Cucumber, you are suggested these items ,     since you might have forgot: Yogurt & Lassi, Almonds, Raisins
Now that you have bought these items: Gourd & Cucumber, you are suggested these items ,     since you might have forgot: Yogurt & Lassi, Almonds
Now that you have bought these items: Gourd & Cucumber, you are suggested these items ,     since you might have forgot: Yogurt & Lassi, Beans, Brinjals
Now that you have bought these items: Gourd & Cucumber, you are
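
The loop above prints every mined rule. At checkout, the "Did you forget?" feature would more plausibly keep only the rules whose antecedents are already in the customer's cart. The following is a minimal sketch of that filtering step, not the notebook's actual implementation. The function name `did_you_forget`, the `top_n` parameter, the toy rules and the `confidence` and `lift` keys are assumptions made for illustration.

```python
def did_you_forget(cart, rules, top_n=3):
    """Suggest items missing from `cart`, using rules whose antecedents are all in the cart."""
    cart = set(cart)
    best = {}  # item -> (confidence, lift) of the strongest rule that suggests it
    for rule in rules:
        if set(rule["antecedents"]) <= cart:          # rule applies to this cart
            score = (rule["confidence"], rule["lift"])
            for item in set(rule["consequents"]) - cart:  # skip items already in the cart
                if item not in best or score > best[item]:
                    best[item] = score
    ranked = sorted(best, key=lambda item: best[item], reverse=True)
    return ranked[:top_n]


# Hypothetical rules, for illustration only (not taken from the mined output)
toy_rules = [
    {"antecedents": {"Beans", "Almonds"}, "consequents": {"Yogurt & Lassi"},
     "confidence": 0.9, "lift": 3.0},
    {"antecedents": {"Beans"}, "consequents": {"Brinjals", "Almonds"},
     "confidence": 0.6, "lift": 2.0},
    {"antecedents": {"Raisins"}, "consequents": {"Cashews"},
     "confidence": 0.8, "lift": 4.0},
]

suggestions = did_you_forget({"Beans", "Almonds"}, toy_rules)
print(suggestions)  # ['Yogurt & Lassi', 'Brinjals']
```

With the rules held in a DataFrame, the rows could be passed in as `association_rules_df.to_dict("records")`, assuming the frame carries `confidence` and `lift` columns under those names.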

**9. How do we find similarities between users based on what customers buy? Can collaborative filtering be used to create values given this dataset? Present a use case where finding similarities between user baskets can be helpful to Bigbasket.**


* The Smart Basket feature for Bigbasket aims to offer personalized recommendations to individual customers based on their unique purchasing patterns and historical data. This entails employing two main data-driven approaches: association rules and collaborative filtering.

* With association rules, the focus is on co-purchase patterns in past orders: items that are frequently bought together are used to identify products a customer may have forgotten to add to the current cart. Recommendations are matched against the individual user's cart, which keeps them relevant and precise.

* On the other hand, collaborative filtering involves examining the purchase history of various customers to uncover similarities in their preferences and behavior. By identifying patterns in users' buying habits, the system can generate personalized recommendations that align with their interests.

* **For example, if one customer frequently purchases apples, bananas, and oranges, and another regularly buys apples and oranges, collaborative filtering can suggest bananas to the latter based on similarities in their preferences with the former.**

* To create the collaborative filtering model, Bigbasket can choose between two approaches: user-user collaborative filtering and item-item collaborative filtering. The former finds similarities between different users' purchasing patterns, while the latter identifies similarities between different items or products.

* By combining these data-driven methods, Bigbasket can create an effective Smart Basket feature, providing each customer with relevant and tailored recommendations that match their preferences and needs. This personalized approach enhances the overall shopping experience and increases customer satisfaction, ultimately contributing to Bigbasket's growth and success in the online grocery market.
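
The apples, bananas and oranges example above can be sketched as user-user collaborative filtering on binary purchase data. This is an illustrative sketch under stated assumptions, not Bigbasket's method: the `baskets` dictionary is made-up data, cosine similarity is only one possible similarity measure (Jaccard is a common alternative), and the helper names `cosine_similarity` and `recommend` are hypothetical.

```python
from math import sqrt

# Hypothetical purchase histories: customer -> set of items ever bought
baskets = {
    "A": {"apples", "bananas", "oranges"},
    "B": {"apples", "oranges"},
    "C": {"rice", "toor dal"},
}


def cosine_similarity(items_a, items_b):
    """Cosine similarity between two binary purchase vectors given as item sets."""
    if not items_a or not items_b:
        return 0.0
    return len(items_a & items_b) / sqrt(len(items_a) * len(items_b))


def recommend(user, baskets, top_n=3):
    """Score items the user has not bought by the summed similarity of users who bought them."""
    own = baskets[user]
    scores = {}
    for other, items in baskets.items():
        if other == user:
            continue
        sim = cosine_similarity(own, items)
        if sim == 0.0:
            continue  # users with no overlap contribute nothing
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


print(recommend("B", baskets))  # ['bananas']
```

Customer B shares two of three items with customer A and nothing with customer C, so only A's extra item is suggested. An item-item variant would apply the same similarity to the set of customers who bought each item instead.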

**10. How do you propose to build a Smart Basket for a customer on Bigbasket?**

To build a Smart Basket for a customer on Bigbasket, we propose a robust strategy that leverages data-driven approaches and predictive analytics. Here are the key steps involved:

1. **Data Collection:** Bigbasket must gather and store customer data, including their purchase history, browsing behavior, product interactions, and order frequency. This data will serve as the foundation for developing personalized Smart Baskets.

2. **Customer Segmentation:** Customers should be segmented based on their shopping behavior, preferences, and frequency. **Grouping similar customers together using clustering techniques enables tailored recommendations.**

3. **Predictive Analytics:** Utilize predictive analytics to analyze historical data and forecast what products a customer is likely to purchase in the future. **Machine learning algorithms like collaborative filtering and item association rules** will be instrumental in providing personalized recommendations.

4. **Real-time Data Processing:** Implement real-time data processing to update the Smart Basket as customers interact with the platform. This ensures that recommendations remain relevant during the customer's current shopping session.

5. **Contextual Recommendations:** Consider contextual factors such as **time of day, day of the week, and seasonal trends to offer contextually relevant product recommendations.** For instance, suggesting fresh fruits in the morning or party snacks before weekends.

6. **Item Association Rules:** Utilize association rule mining to identify relationships between different products in the customer's cart and recommend additional items based on frequent co-purchases by other customers.

7. **Customer Ratings:** Incorporate customer feedback and product ratings into the recommendation engine. Items with positive reviews and high ratings can be given priority in the Smart Basket.

8. **Continuous Improvement:** Continuously monitor the Smart Basket's performance and gather user feedback to make iterative enhancements. The recommendation engine should adapt and learn from customer interactions and behavior.

9. **A/B Testing:** Conduct A/B testing to compare different recommendation strategies and fine-tune the Smart Basket's performance. This allows for optimization and validation of the effectiveness of the recommendations.

10. **Hybrid Approach:** Adopt a hybrid approach by combining various recommendation techniques like collaborative filtering, content-based filtering, and contextual recommendations. This ensures a well-rounded and personalized shopping experience.

By integrating these strategies, Bigbasket can create a dynamic Smart Basket that offers personalized and contextually relevant product recommendations to each customer. The Smart Basket will continuously adapt to customer preferences and behaviors, resulting in increased engagement, loyalty, and overall customer satisfaction with the platform.
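
As a starting point for the steps above, a Smart Basket could simply list the items a customer orders regularly. The sketch below assumes that "regular" means appearing in at least half of the customer's past orders. The function name `build_smart_basket`, the `min_share` and `max_items` parameters, and the sample history are hypothetical; a fuller version would layer in the collaborative filtering, contextual and rating signals described in the list.

```python
from collections import Counter


def build_smart_basket(order_history, min_share=0.5, max_items=10):
    """Return items that appear in at least `min_share` of a customer's past orders."""
    if not order_history:
        return []
    # Count each item once per order, even if it was added several times
    counts = Counter(item for order in order_history for item in set(order))
    n_orders = len(order_history)
    regulars = [
        (item, count / n_orders)
        for item, count in counts.items()
        if count / n_orders >= min_share
    ]
    # Most frequently ordered first; ties broken alphabetically
    regulars.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in regulars[:max_items]]


# Hypothetical order history for one customer
history = [
    ["Beans", "Root Vegetables", "Yogurt & Lassi"],
    ["Beans", "Yogurt & Lassi", "Raisins"],
    ["Beans", "Bread"],
    ["Yogurt & Lassi", "Beans", "Cashews"],
]

smart_basket = build_smart_basket(history)
print(smart_basket)  # ['Beans', 'Yogurt & Lassi']
```

Beans appear in all four orders and Yogurt & Lassi in three of four, so both clear the 50% threshold, while the one-off items are left out. The threshold is a tuning choice that A/B testing could help set.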