#### Imports

In [7]:
import pickle
import nbimporter
import pandas as pd
from utility import * # import all the functions

## Recommender

The recommender first looks up the association rules for the requested product. To handle cases where a
product lacks association rules, such as newly introduced items, the system calculates similarity scores
against the products that appear as single-item antecedents in the rules. It identifies the most similar
product and uses its association rules to generate recommendations. This approach keeps the type of
recommendations consistent. Similarly, if the rules yield fewer than `n` items, the system falls back on the
"similar items" method, using related products as a basis to fill the remaining slots.

In [9]:
# Recommender function
def recommender(product_id, rules, df, n=5):
    """
    Recommend products based on association rules and similarity.

    Parameters:
        product_id (str): The product ID for which recommendations are needed.
        rules (pd.DataFrame): A DataFrame of association rules with columns 'antecedents' and 'consequents'.
        df (pd.DataFrame): Product data with a 'product_id' column, used for the similarity search.
        n (int): Number of recommendations to return.

    Returns:
        list: Up to n recommended product IDs.
    """
    
    # Step 1: Find associated items
    associated_items = find_associated_items(product_id, rules)
    
    # no rule for chosen item
    if len(associated_items) == 0:
        # find the most similar antecedents to the product_id
        
        # get all the single antecedents in the rules
        antecedents = [
            list(item)[0]
            for item in rules["antecedents"]
            if len(item) == 1
        ]
        
        # restrict df so the similarity search runs only among the antecedents and product_id
        antecedents.append(product_id)
        df_antecedents = df[df['product_id'].isin(antecedents)]

        # the most similar antecedents to a product_id
        the_most_similar_product_id = get_similar_items(product_id, df_antecedents, n = 1)['product_id'].iloc[0]
        
        associated_items = find_associated_items(the_most_similar_product_id, rules)
    
    # get the top n associated items
    if len(associated_items) >= n:
        return associated_items[:n]
    
    # if there are not enough associated items, top up with items similar to the associated ones
    else:
        number_of_similar_items = n - len(associated_items)
        
        # for each of the consequents find similar items and combine them in one dataframe;
        # if there are no consequents at all, fall back to items similar to product_id itself
        seeds = associated_items if len(associated_items) > 0 else [product_id]
        similar_items_dataframes = [get_similar_items(seed, df) for seed in seeds]
        
        all_similar_items = pd.concat(similar_items_dataframes, axis=0)
        
        # order by rrf_score, highest first
        all_similar_items = all_similar_items.sort_values(by='rrf_score', ascending=False)
        
        # drop repeated products and the ones that are already recommended
        all_similar_items = all_similar_items.drop_duplicates(subset='product_id')
        already_used = list(associated_items) + [product_id]
        all_similar_items = all_similar_items[~all_similar_items['product_id'].isin(already_used)]
        
        # take only the top number_of_similar_items
        similar_products = list(all_similar_items['product_id'])[:number_of_similar_items]
        
        return associated_items + similar_products
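
The three paths through `recommender` can be traced on toy data. The sketch below is a simplified stand-in, not the notebook's implementation: plain dicts replace the DataFrames, candidates are taken in list order instead of being sorted by a score, and `recommend_sketch`, `RULES` and `SIMILAR` are hypothetical names that only mimic what `find_associated_items` and `get_similar_items` provide.

```python
# Hypothetical toy data: plain dicts stand in for the rules DataFrame and
# for the output of the similarity helper.
RULES = {                    # antecedent -> consequents, in rule order
    "cleanser": ["toner", "serum", "moisturizer"],
    "mask": ["serum"],
}
SIMILAR = {                  # product -> other products, most similar first
    "new_cleanser": ["cleanser", "toner"],
    "serum": ["essence", "toner", "mask"],
    "mask": ["peel", "serum"],
}


def recommend_sketch(product_id, rules, similar, n=3):
    """Simplified illustration of the three paths, not the notebook's code."""
    associated = list(rules.get(product_id, []))

    # No rule for the product: borrow the rules of the most similar antecedent.
    if not associated:
        for candidate in similar.get(product_id, []):
            if candidate in rules:
                associated = list(rules[candidate])
                break

    # Enough rule-based items: return the top n.
    if len(associated) >= n:
        return associated[:n]

    # Too few: top up with items similar to the associated products,
    # skipping the query product and anything already recommended.
    result = list(associated)
    seeds = associated or [product_id]
    for seed in seeds:
        for candidate in similar.get(seed, []):
            if len(result) < n and candidate != product_id and candidate not in result:
                result.append(candidate)
    return result


print(recommend_sketch("cleanser", RULES, SIMILAR))      # rules suffice
print(recommend_sketch("new_cleanser", RULES, SIMILAR))  # borrowed rules
print(recommend_sketch("mask", RULES, SIMILAR))          # rules plus similar items
```

Skipping the query product and already recommended items during the top-up keeps the final list free of duplicates, which matters once several seeds share the same neighbours.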
        

### Load data

In [None]:
association_rules = pd.read_csv('processed_data/association_rules.csv')
association_rules = preprocess_rules(association_rules) 
skincare_df = pd.read_csv("processed_data/skincare.csv")
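
`recommender` treats each entry of `rules["antecedents"]` as a collection (`len(item)`, `list(item)[0]`), while values read back from a CSV arrive as strings, so some conversion has to happen after loading. What `preprocess_rules` actually does is not shown in this notebook. The following is only a sketch of one way such a conversion could work, assuming the CSV stores item sets in the `frozenset({'...'})` text form; the helper name `parse_itemset` is hypothetical.

```python
import ast


def parse_itemset(text):
    """Turn a string such as "frozenset({'P1', 'P2'})" into a frozenset.

    Hypothetical helper: assumes the item sets were written to the CSV as
    the repr of Python frozensets.
    """
    text = text.strip()
    prefix, suffix = "frozenset(", ")"
    if text.startswith(prefix) and text.endswith(suffix):
        text = text[len(prefix):-len(suffix)]
    if not text:  # "frozenset()" is the empty set
        return frozenset()
    # literal_eval only evaluates Python literals, never arbitrary code
    return frozenset(ast.literal_eval(text))


print(parse_itemset("frozenset({'P1', 'P2'})") == frozenset({"P1", "P2"}))
```

On a DataFrame, a helper like this could be applied per column, for example `rules['antecedents'].apply(parse_itemset)`.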

### Example of usage

In [16]:
random_product_id = skincare_df['product_id'].sample(n=1).iloc[0]

product_name = skincare_df[skincare_df['product_id'] == random_product_id]['product_name'].iloc[0]
merged = get_similar_items(random_product_id, skincare_df)['product_name'].head(5)

print(f"The most similar products to the {product_name} are: ")
print(merged)

The most similar products to the Baba Bomb Moisturizer are: 
0    Drink Up Intensive Overnight Hydrating Mask wi...
1    The Camellia Oil 2-in-1 Makeup Remover & Cleanser
2       Pillowgasm Vitamin-Rich Cherry Glow Sleep Mask
3                SEA Mermaid Skin Hyaluronic H2O Serum
4    Kombucha 2-in-1 No-Rinse Cleanser & Prebiotic ...
Name: product_name, dtype: object


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['similarity_score_ingredients'] = similarities
