# Constrained Food Recommender

In this assignment, you will implement both Content Based and Collaborative Filtering Recommenders and backtracking search (or local search) on your own

100% finished homework should contain EDA, Item and User profiles generation, Content-Based Recommender, Collaborative Filtering Recommender, and soluton to CSP problem of assigning recommendations to brekfast, lunch and dinner.

In [2]:
import math
import numpy as np
import pandas as pd

from sklearn.preprocessing import normalize

## Data loading

You will work with subset of [Academic Yelp Dataset](https://www.kaggle.com/yelp-dataset/yelp-dataset) containing list of restaurants in **yelp_business.csv** and reviews of the users in **yelp_reviews.parquet**

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
# df_yelp_business = pd.read_csv("yelp_business.csv").drop(columns=["Unnamed: 0"])
# df_yelp_reviews = pd.read_parquet("yelp_reviews.parquet")
df_yelp_business = pd.read_csv("drive/MyDrive/yelp_business.csv").drop(columns=["Unnamed: 0"])
df_yelp_reviews = pd.read_parquet("drive/MyDrive/yelp_reviews.parquet")

# Leave only users with at least 3 reviews
users_count = df_yelp_reviews.groupby("user_id").count()[["business_id"]] 
users_to_use = users_count[users_count["business_id"] > 2]
df_yelp_reviews = df_yelp_reviews[df_yelp_reviews["user_id"].isin(users_to_use.index)]

## Exploratory data analysis

Here you will explore the data to find out what is the distribution of business categories, hours, places, user reviews, etc.

This step is needed to proceed later with item and user profiling and to clean your data if there are duplicates (e.g. duplicated reviews, the same businesses under different ids, categories tags which are highly correlated) or some artifacts not related to the main task.

(5 points)

In [22]:
df_yelp_business.head()

Unnamed: 0,address,attributes,business_id,categories,city,hours,is_open,latitude,longitude,name,postal_code,review_count,stars,state
0,404 E Green St,"{'RestaurantsAttire': ""u'casual'"", 'Restaurant...",pQeaRpvuhoEqudo3uymHIQ,"Ethnic Food, Food Trucks, Specialty Food, Impo...",Champaign,"{'Monday': '11:30-14:30', 'Tuesday': '11:30-14...",1,40.110446,-88.233073,The Empanadas House,61820,5,4.5,IL
1,4508 E Independence Blvd,"{'RestaurantsGoodForGroups': 'True', 'OutdoorS...",CsLQLiRoafpJPJSkNX2h5Q,"Food, Restaurants, Grocery, Middle Eastern",Charlotte,,0,35.194894,-80.767442,Middle East Deli,28205,5,3.0,NC
2,300 John Street,"{'GoodForKids': 'True', 'RestaurantsTakeOut': ...",lu7vtrp_bE9PnxWfA8g4Pg,"Japanese, Fast Food, Food Court, Restaurants",Thornhill,,1,43.820492,-79.398466,Banzai Sushi,L3T 5W4,7,4.5,ON
3,"4550 East Cactus Rd, #KSFC-4","{'GoodForKids': 'True', 'RestaurantsTakeOut': ...",vjTVxnsQEZ34XjYNS-XUpA,"Food, Pretzels, Bakeries, Fast Food, Restaurants",Phoenix,"{'Monday': '10:0-21:0', 'Tuesday': '10:0-21:0'...",1,33.602822,-111.983533,Wetzel's Pretzels,85032,10,4.0,AZ
4,9595 W Tropicana Ave,"{'Alcohol': ""u'none'"", 'WiFi': ""u'no'"", 'GoodF...",fnZrZlqW1Z8iWgTVDfv_MA,"Mexican, Restaurants, Fast Food",Las Vegas,,0,36.099738,-115.301568,Carl's Jr,89147,15,2.5,NV


In [23]:
df_yelp_reviews.head()

Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id
4,IS4cv902ykd8wj1TR0N3-A,0,2017-01-14 21:56:57,0,6TdNDKywdbjoTkizeMce8A,4,happy day finally canes near casa yes others g...,0,UgMW8bLE0QMJDCkQ1Ax5Mg
6,Pthe4qk5xh4n-ef-9bvMSg,0,2015-11-05 23:11:05,0,ZayJ1zWyWgY9S_TRLT_y9Q,5,really good place simple decor amazing food gr...,1,aq_ZxGHiri48TUXJlpRkCQ
9,Ws8V970-mQt2X9CwCuT5zw,1,2009-10-13 04:16:41,0,z4BCgTkfNtCu4XY5Lp97ww,4,twice nice laid back tried weekend southern me...,3,jOERvhmK6_lo_XGUBPws_w
16,d4qwVw4PcN-_2mK2o1Ro1g,0,2015-02-02 06:28:00,0,bVTjZgRNq8ToxzvtiVrqMA,1,10pm super bowl sunday already closed weak won...,0,2hRe26HSCAWbFRn5WChK-Q
22,9Jo1pu0y2zU6ktiwQm6gNA,20,2016-12-04 03:15:21,19,sgTnHfeaEvyOoWX4TCgkuQ,4,coconut fish cafe fantastic five stars fish ca...,24,A0j21z2Q1HGic7jW6e9h7A


In [5]:
# Removing dublicated businesses with same business ids

df_yelp_business = df_yelp_business.drop_duplicates(subset=['business_id'])

In [6]:
# Removing dublicated reviews with same business ids

df_yelp_reviews.drop_duplicates(subset=['business_id'], keep='first', inplace=True)

## Building recommender

First of all you should process user reviews to get the utility matrix containing ratings for users and businesses. There will be a lot of 0 in this matrix and it is better to store such matrices in the specialized data structure for sparce matrices. However, your working dataset is relatively small and we can use simple **pd.DataFrame** to proceed

In [42]:
def create_utility_matrix(reviews: pd.DataFrame, business: pd.DataFrame) -> pd.DataFrame:
    business_ids = business["business_id"].unique()
    users = reviews["user_id"].unique()
    ut_matrix = pd.DataFrame(0, columns=business_ids, index=users)
    for _, review in reviews.iterrows():
        ut_matrix.loc[review["user_id"], review["business_id"]] = review["stars"]
    
    # Rating normalization to (-1, 1) range (5 points)
    norm_ut_matrix = pd.DataFrame(normalize(ut_matrix, axis=1), columns=ut_matrix.columns)
    norm_ut_matrix.index = ut_matrix.index

    return norm_ut_matrix

df_utility_matrix = create_utility_matrix(df_yelp_reviews, df_yelp_business)
df_utility_matrix.head()

Unnamed: 0,pQeaRpvuhoEqudo3uymHIQ,CsLQLiRoafpJPJSkNX2h5Q,lu7vtrp_bE9PnxWfA8g4Pg,vjTVxnsQEZ34XjYNS-XUpA,fnZrZlqW1Z8iWgTVDfv_MA,rVBPQdeayMYht4Uv_FOLHg,fhNf_sg-XzZ3e7HEVGuOZg,LoRef3ChgZKbxUio-sHgQg,Ga2Bt7xfqoggTypWD5VpoQ,xFc50drSPxXkcLvX5ygqrg,tLpkSwdtqqoXwU0JAGnApw,Sd75ucXKoZUM2BEfBHFUOg,lK-wuiq8b1TuU7bfbQZgsg,LAoSegVNU4wx4GTA8reB6A,-qjn24n8HYF6It9GQrQntw,ZkzutF0P_u0C0yTulwaHkA,0QjROMVW9ACKjhSEfHqNCQ,RrapAhd8ZxCj-iue7fu9FA,7j0kor_fkeYhyEpXh4OpnQ,OWkS1FXNJbozn-qPg3LWxg,j9bWpCRwpDVfwVT_V85qeA,6GHwgKNlvfIMUpFaxgBjUA,Ir_QIzs-4o9ElOtiGuxJrw,0Y5Kzo8PWHTjk0tlfAKcDQ,JcsZvx-4yovFgCXOfd6KMg,MTx-Zdl_KcU_z9G832XAjg,a6d7UcYnRvbr4t-THg4pSQ,UFU8ONTkzEkcOk61XJ8JwQ,-C0AlwLuXpcP609madJZQQ,W7hCuNdn2gzehta6eSHzgQ,8nP8ghEpT6WFcM6tfqAaGA,8k62wYhDVq1-652YbJi5eg,8Hvp1tYKiQbBgGIwkCRK5g,39lLJK_rrYY2NYomSsQdUA,kQknjbOvtPmS3NVm-RhcdQ,0y6alZmSLnPzmG5_kP5Quw,3YjPlOX3VHzKguHetiR_3g,OGVHlFHSXjHuioOvm1wVqg,Q_dh08clYUPj13GmCRzIVA,34FYKG4pHNXbM9ZRRiJaGw,...,xnVkYE3iMp_aZniiCIuD0g,DmyS9b7ykIOo7XwYt5I9wg,Xs37o78aby0o-Wmvh5yYPg,jhj-r7aH3AlJyVmGtcHi-Q,Vjzg0VOQsBWw1TA3iLol-A,HfQm-rq78QfSeJmm0i1I9Q,r0byBoB7y_IH8uicEvyCqQ,dDqG-lKO9BRadoQ9fmWP-A,6-yG0OQe-mSRoz1R72MkKg,YO0fC7aJv8PzZ1MGf_F6Vg,nvJjfEPYFXj8sJZcbt0k-Q,BGGQOJQTQerEQu0kHbT_UQ,CvonRhKDJaH155xhtpz_iw,IZUDXIq5SULhQ5RGLCdB7g,VeFqptSzekFAc3FZOpi81Q,gVLzkqIAHIWro_ZxkpjbFg,GCen6oV-_6PfMP_uKN-dZw,P3dBcZh_Hmr1wVWFZn2b-g,2Wzet4CPV0glZYXqFGyqyg,XjrjVQfKpbcvOXda-5r1jQ,u1jADJ_yMcL8bRJre7hjMQ,VBCR0KKjFfzfcvV36KnkhA,02hhtAO83rZBU1hflleElA,6pKR-h3KN7AwgGOOYBbE2A,E50mr3xobsahb77IBRwVTQ,W1iwBUfcDxoOPfhwr6EOGQ,kw3DbQo6Pqo83FSDx8HQjQ,aGMU3qMFOQzG0DT2akMfng,1SJiW_mW6IlEe7hqMTnjYg,LkMtMHVetws5_7QfRjPtlg,2SfSzEd3B7WimeZac23zhg,1dV3qNEv8nNUAX1k3qdE2w,YHCseOJ93wJh0gBcii_2qA,TJt1W9haRm2DKuoZLQ69yA,wM8QNs7uSyDqMJKjBYFPCQ,gp_bu7Ah81qaBY3M0Leffw,PUKOr5bEI87TVHjwijT1xw,zV38gkkEeJ4cVRlSWWQTfQ,H1j34TgbrVZkxeww9xlJTw,F8M0IukXQqR50IRyocRQbg
UgMW8bLE0QMJDCkQ1Ax5Mg,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
aq_ZxGHiri48TUXJlpRkCQ,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
jOERvhmK6_lo_XGUBPws_w,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2hRe26HSCAWbFRn5WChK-Q,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
A0j21z2Q1HGic7jW6e9h7A,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Content-Based Recommender

In [43]:
def build_business_profiles(business: pd.DataFrame) -> pd.DataFrame:
    # Feature engineering (10 points)

    # Getting all categories
    categories = []
    for _, info in business.iterrows():
        for category in info['categories'].split(', '):
            if category not in categories:
                categories.append(category)
    
    # Creating data frame for business profiles
    business_ids = business['business_id'].unique()
    business_profiles = pd.DataFrame(0, columns=categories, index=business_ids)

    # Place 1 for the categories that are present for current business
    for _, info in business.iterrows():
        for category in info['categories'].split(', '):
            business_profiles.loc[info['business_id'], category] = 1

    return business_profiles

df_business_profiles = build_business_profiles(df_yelp_business)
df_business_profiles.head()

Unnamed: 0,Ethnic Food,Food Trucks,Specialty Food,Imported Food,Argentine,Food,Restaurants,Empanadas,Grocery,Middle Eastern,Japanese,Fast Food,Food Court,Pretzels,Bakeries,Mexican,Burgers,American (Traditional),Chicken Wings,Lebanese,American (New),Hot Dogs,Chinese,Shopping Centers,Coffee & Tea,Cafes,Museums,Shopping,Local Flavor,Flowers & Gifts,Arts & Entertainment,Art Galleries,Florists,Egyptian,Pizza,Vietnamese,Buffets,Indian,Halal,Breakfast & Brunch,...,Beverage Store,Pancakes,Barbers,Town Car Service,Transportation,Septic Services,Limos,Historical Tours,Libraries,Churches,Amusement Parks,Restaurant Supplies,Wholesale Stores,Videos & Video Game Rental,Doctors,Cardiologists,Emergency Medicine,Pediatricians,Hospitals,Mongolian,Cosmetics & Beauty Supply,Furniture Stores,Home Decor,Dry Cleaning & Laundry,Accessories,Men's Clothing,Vinyl Records,Drugstores,South African,Escape Games,Hobby Shops,DJs,Fur Clothing,Pilates,Lighting Fixtures & Equipment,Ice Delivery,Furniture Repair,Party Equipment Rentals,Audio/Visual Equipment Rental,Furniture Rental
pQeaRpvuhoEqudo3uymHIQ,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CsLQLiRoafpJPJSkNX2h5Q,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
lu7vtrp_bE9PnxWfA8g4Pg,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
vjTVxnsQEZ34XjYNS-XUpA,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
fnZrZlqW1Z8iWgTVDfv_MA,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [44]:
def build_user_profiles(utility_matrix: pd.DataFrame, 
                                   business_profiles: pd.DataFrame) -> pd.DataFrame:
    # Feature aggregation (5 points)

    # Create zero matrix for user_profiles
    user_profiles = pd.DataFrame(0, index=utility_matrix.index, columns=business_profiles.columns)
    
    # Iterate on utility_matrix
    for user_id, businesses in utility_matrix.iterrows():
        businesses_non_zero_ratings = list(businesses[businesses != 0].index)

        # Dataframe of bussinesses with non zero ratings and bussiness types
        ratings_profiles = business_profiles[business_profiles.index.isin(businesses_non_zero_ratings)]
        # Multiplication of previous dataframe and businesses with non zero ratings in utility_matrix
        multiplied_ratings_profiles = ratings_profiles.mul(businesses[businesses != 0], axis=0)
        
        # For each user locate the avarage value of rating regarding each bussiness type
        user_profiles.loc[user_id] = multiplied_ratings_profiles.sum(axis=0) / len(businesses[businesses != 0])

    return user_profiles

df_user_profiles = build_user_profiles(df_utility_matrix, df_business_profiles)
df_user_profiles.head()

Unnamed: 0,Ethnic Food,Food Trucks,Specialty Food,Imported Food,Argentine,Food,Restaurants,Empanadas,Grocery,Middle Eastern,Japanese,Fast Food,Food Court,Pretzels,Bakeries,Mexican,Burgers,American (Traditional),Chicken Wings,Lebanese,American (New),Hot Dogs,Chinese,Shopping Centers,Coffee & Tea,Cafes,Museums,Shopping,Local Flavor,Flowers & Gifts,Arts & Entertainment,Art Galleries,Florists,Egyptian,Pizza,Vietnamese,Buffets,Indian,Halal,Breakfast & Brunch,...,Beverage Store,Pancakes,Barbers,Town Car Service,Transportation,Septic Services,Limos,Historical Tours,Libraries,Churches,Amusement Parks,Restaurant Supplies,Wholesale Stores,Videos & Video Game Rental,Doctors,Cardiologists,Emergency Medicine,Pediatricians,Hospitals,Mongolian,Cosmetics & Beauty Supply,Furniture Stores,Home Decor,Dry Cleaning & Laundry,Accessories,Men's Clothing,Vinyl Records,Drugstores,South African,Escape Games,Hobby Shops,DJs,Fur Clothing,Pilates,Lighting Fixtures & Equipment,Ice Delivery,Furniture Repair,Party Equipment Rentals,Audio/Visual Equipment Rental,Furniture Rental
UgMW8bLE0QMJDCkQ1Ax5Mg,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
aq_ZxGHiri48TUXJlpRkCQ,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
jOERvhmK6_lo_XGUBPws_w,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2hRe26HSCAWbFRn5WChK-Q,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
A0j21z2Q1HGic7jW6e9h7A,0.0,0.0,0.0,0.0,0.0,0.19245,0.57735,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.19245,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [45]:
from sklearn.metrics.pairwise import cosine_similarity


def predict_content_ratings(user_profiles: pd.DataFrame, business_profiles: pd.DataFrame) -> pd.DataFrame:
    # Distance based rating prediction (5 points)
    # Pointwise/Pairwase training based prediction (optional for 10 extra points)

    predicted_content = pd.DataFrame(0, index=user_profiles.index, columns=business_profiles.index)

    user_count = 0
    for user_id, user_row in user_profiles.iterrows():
        cosine_distance = cosine_similarity(business_profiles.append(user_row))
        # cosine_similarity = 1 - cosine_distance
        prediction = (1 - cosine_distance[-1])[:-1]
        predicted_content.loc[user_id] = prediction
        # print(f"For user {user_id} got {prediction}")

        # Prediction for the first 10 users
        user_count += 1
        if user_count > 10:
            break
    
    return predicted_content

df_content_predictions = predict_content_ratings(df_user_profiles, df_business_profiles)
df_content_predictions.head()

Unnamed: 0,pQeaRpvuhoEqudo3uymHIQ,CsLQLiRoafpJPJSkNX2h5Q,lu7vtrp_bE9PnxWfA8g4Pg,vjTVxnsQEZ34XjYNS-XUpA,fnZrZlqW1Z8iWgTVDfv_MA,rVBPQdeayMYht4Uv_FOLHg,fhNf_sg-XzZ3e7HEVGuOZg,LoRef3ChgZKbxUio-sHgQg,Ga2Bt7xfqoggTypWD5VpoQ,xFc50drSPxXkcLvX5ygqrg,tLpkSwdtqqoXwU0JAGnApw,Sd75ucXKoZUM2BEfBHFUOg,lK-wuiq8b1TuU7bfbQZgsg,LAoSegVNU4wx4GTA8reB6A,-qjn24n8HYF6It9GQrQntw,ZkzutF0P_u0C0yTulwaHkA,0QjROMVW9ACKjhSEfHqNCQ,RrapAhd8ZxCj-iue7fu9FA,7j0kor_fkeYhyEpXh4OpnQ,OWkS1FXNJbozn-qPg3LWxg,j9bWpCRwpDVfwVT_V85qeA,6GHwgKNlvfIMUpFaxgBjUA,Ir_QIzs-4o9ElOtiGuxJrw,0Y5Kzo8PWHTjk0tlfAKcDQ,JcsZvx-4yovFgCXOfd6KMg,MTx-Zdl_KcU_z9G832XAjg,a6d7UcYnRvbr4t-THg4pSQ,UFU8ONTkzEkcOk61XJ8JwQ,-C0AlwLuXpcP609madJZQQ,W7hCuNdn2gzehta6eSHzgQ,8nP8ghEpT6WFcM6tfqAaGA,8k62wYhDVq1-652YbJi5eg,8Hvp1tYKiQbBgGIwkCRK5g,39lLJK_rrYY2NYomSsQdUA,kQknjbOvtPmS3NVm-RhcdQ,0y6alZmSLnPzmG5_kP5Quw,3YjPlOX3VHzKguHetiR_3g,OGVHlFHSXjHuioOvm1wVqg,Q_dh08clYUPj13GmCRzIVA,34FYKG4pHNXbM9ZRRiJaGw,...,xnVkYE3iMp_aZniiCIuD0g,DmyS9b7ykIOo7XwYt5I9wg,Xs37o78aby0o-Wmvh5yYPg,jhj-r7aH3AlJyVmGtcHi-Q,Vjzg0VOQsBWw1TA3iLol-A,HfQm-rq78QfSeJmm0i1I9Q,r0byBoB7y_IH8uicEvyCqQ,dDqG-lKO9BRadoQ9fmWP-A,6-yG0OQe-mSRoz1R72MkKg,YO0fC7aJv8PzZ1MGf_F6Vg,nvJjfEPYFXj8sJZcbt0k-Q,BGGQOJQTQerEQu0kHbT_UQ,CvonRhKDJaH155xhtpz_iw,IZUDXIq5SULhQ5RGLCdB7g,VeFqptSzekFAc3FZOpi81Q,gVLzkqIAHIWro_ZxkpjbFg,GCen6oV-_6PfMP_uKN-dZw,P3dBcZh_Hmr1wVWFZn2b-g,2Wzet4CPV0glZYXqFGyqyg,XjrjVQfKpbcvOXda-5r1jQ,u1jADJ_yMcL8bRJre7hjMQ,VBCR0KKjFfzfcvV36KnkhA,02hhtAO83rZBU1hflleElA,6pKR-h3KN7AwgGOOYBbE2A,E50mr3xobsahb77IBRwVTQ,W1iwBUfcDxoOPfhwr6EOGQ,kw3DbQo6Pqo83FSDx8HQjQ,aGMU3qMFOQzG0DT2akMfng,1SJiW_mW6IlEe7hqMTnjYg,LkMtMHVetws5_7QfRjPtlg,2SfSzEd3B7WimeZac23zhg,1dV3qNEv8nNUAX1k3qdE2w,YHCseOJ93wJh0gBcii_2qA,TJt1W9haRm2DKuoZLQ69yA,wM8QNs7uSyDqMJKjBYFPCQ,gp_bu7Ah81qaBY3M0Leffw,PUKOr5bEI87TVHjwijT1xw,zV38gkkEeJ4cVRlSWWQTfQ,H1j34TgbrVZkxeww9xlJTw,F8M0IukXQqR50IRyocRQbg
UgMW8bLE0QMJDCkQ1Ax5Mg,0.823223,0.75,0.5,0.552786,0.42265,0.711325,0.5,0.711325,0.6464466,0.552786,0.42265,0.646447,0.855662,0.776393,0.5,0.646447,0.646447,0.75,0.75,0.75,0.776393,0.42265,0.5,0.646447,0.292893,0.292893,0.841886,0.711325,0.646447,0.711325,0.552786,0.811018,0.776393,0.646447,0.646447,0.711325,0.646447,0.776393,0.776393,0.776393,...,0.811018,0.42265,0.75,0.6464466,0.42265,0.552786,0.646447,0.776393,0.591752,0.75,0.811018,0.75,0.841886,0.776393,0.833333,0.795876,0.711325,0.646447,0.646447,0.646447,0.811018,0.646447,0.776393,0.823223,0.795876,0.646447,0.646447,0.855662,0.646447,0.866369,0.795876,0.6464466,0.646447,0.42265,0.646447,0.776393,0.711325,0.776393,0.75,0.75
aq_ZxGHiri48TUXJlpRkCQ,0.75,0.646447,0.646447,0.683772,0.591752,0.591752,0.646447,0.591752,0.5,0.683772,0.591752,0.5,0.795876,0.683772,0.646447,0.5,0.5,0.646447,0.646447,0.646447,0.367544,0.591752,0.646447,0.5,0.5,0.5,0.776393,0.591752,0.5,0.591752,0.683772,0.732739,0.683772,0.5,0.5,0.591752,0.5,0.683772,0.683772,0.683772,...,0.732739,0.591752,0.646447,0.5,0.591752,0.683772,0.5,0.683772,0.711325,0.646447,0.732739,0.646447,0.776393,0.683772,0.764298,0.711325,0.591752,0.5,0.5,0.5,0.732739,0.5,0.683772,0.75,0.711325,0.5,0.5,0.795876,0.5,0.811018,0.711325,0.5,0.75,0.591752,0.5,0.683772,0.591752,0.683772,0.646447,0.646447
jOERvhmK6_lo_XGUBPws_w,0.764298,0.666667,0.833333,0.701858,0.80755,0.6151,0.666667,0.80755,0.7642977,0.701858,0.80755,0.764298,0.6151,0.701858,0.666667,0.764298,0.764298,0.833333,0.666667,0.666667,0.701858,0.6151,0.666667,0.764298,0.528595,0.764298,0.789181,0.6151,0.764298,0.80755,0.403715,0.622036,0.403715,0.764298,0.764298,0.80755,0.528595,0.850929,0.701858,0.552786,...,0.370059,0.80755,0.833333,0.7642977,0.80755,0.850929,0.764298,0.552786,0.727834,0.833333,0.748024,0.5,0.683772,0.701858,0.777778,0.863917,0.80755,0.764298,0.764298,0.764298,0.874012,0.764298,0.850929,0.528595,0.727834,0.764298,0.764298,0.711325,0.764298,0.821826,0.591752,0.7642977,0.646447,0.6151,0.764298,0.701858,0.80755,0.701858,0.666667,0.833333
2hRe26HSCAWbFRn5WChK-Q,0.75,0.646447,0.646447,0.683772,0.183503,0.591752,0.646447,0.591752,2.220446e-16,0.683772,0.591752,0.5,0.795876,0.683772,0.646447,0.5,0.5,0.646447,0.646447,0.646447,0.683772,0.591752,0.292893,0.5,0.5,0.5,0.776393,0.591752,0.5,0.591752,0.683772,0.732739,0.683772,0.5,0.5,0.591752,0.5,0.683772,0.683772,0.683772,...,0.732739,0.591752,0.292893,2.220446e-16,0.591752,0.367544,0.5,0.683772,0.711325,0.646447,0.732739,0.646447,0.776393,0.367544,0.764298,0.711325,0.591752,0.5,0.5,0.5,0.732739,0.5,0.367544,0.75,0.711325,0.5,0.5,0.795876,0.5,0.811018,0.711325,2.220446e-16,0.75,0.591752,0.5,0.683772,0.591752,0.683772,0.646447,0.646447
A0j21z2Q1HGic7jW6e9h7A,0.657003,0.514929,0.636197,0.566139,0.579916,0.439888,0.636197,0.579916,0.4855042,0.674604,0.579916,0.485504,0.719944,0.566139,0.636197,0.314006,0.485504,0.636197,0.636197,0.393661,0.566139,0.29986,0.514929,0.485504,0.314006,0.485504,0.616518,0.579916,0.314006,0.29986,0.457674,0.633321,0.566139,0.314006,0.485504,0.29986,0.485504,0.674604,0.566139,0.566139,...,0.633321,0.579916,0.636197,0.4855042,0.579916,0.674604,0.485504,0.674604,0.702956,0.636197,0.72499,0.636197,0.539821,0.566139,0.757464,0.702956,0.29986,0.485504,0.485504,0.314006,0.72499,0.485504,0.674604,0.657003,0.702956,0.485504,0.485504,0.509902,0.485504,0.740719,0.504926,0.4855042,0.657003,0.439888,0.485504,0.566139,0.579916,0.457674,0.514929,0.636197


## Collaborative Filtering Recommender

In [46]:
from scipy.spatial.distance import cosine


def predict_collaborative_ratings(utility_matrix: pd.DataFrame) -> pd.DataFrame:
    # User-item collaborative filtering based rating prediction (15 points)
    # UV-decomposition based rating prediction (optional for 10 extra points)

    first_user_count = 0
    second_user_count = 0

    # Building matrix with similarity coefficients for each pair of users
    users_similarity_matrix = pd.DataFrame(0, index=utility_matrix.index, columns=utility_matrix.index)
    for user_id_1, businesses_1 in utility_matrix.iterrows():
        for user_id_2, businesses_2 in utility_matrix.iterrows():
            cosine_similarity = 1 - cosine(businesses_1, businesses_2)
            users_similarity_matrix.loc[user_id_1, user_id_2] = cosine_similarity

            first_user_count += 1
            if first_user_count > 10:
                first_user_count = 0
                break

        second_user_count += 1
        if second_user_count > 10:
            break

    user_count = 0
    predicted_collaborative = pd.DataFrame(0, index=utility_matrix.index, columns=utility_matrix.columns)
    for user_id, user_row in users_similarity_matrix.iterrows():
        max_similarity_id = user_row.idxmax(axis=1)
        counter = 0
        # Finding current user bussinesses and most similar user bussinesses
        for user, businesses in utility_matrix.iterrows():
            if user == max_similarity_id:
                similar_bussiness = businesses
                counter += 1
                if counter == 2:
                    break

            if user == user_id:
                curr_bussiness = businesses
                counter += 1
                if counter == 2:
                    break
        
        # Remove businesses that user has already visited, to not recommend it
        result_similarity = []
        for i in range(len(similar_bussiness.values)):
            if curr_bussiness.values[i] == similar_bussiness.values[i] :
                result_similarity.append(0)
            else:
                result_similarity.append(similar_bussiness.values[i])

        predicted_collaborative.loc[user_id] = result_similarity

        user_count += 1
        if user_count > 10:
            break
        
    return predicted_collaborative


df_collaborative_predictions = predict_collaborative_ratings(df_utility_matrix)
df_collaborative_predictions.head()

Unnamed: 0,pQeaRpvuhoEqudo3uymHIQ,CsLQLiRoafpJPJSkNX2h5Q,lu7vtrp_bE9PnxWfA8g4Pg,vjTVxnsQEZ34XjYNS-XUpA,fnZrZlqW1Z8iWgTVDfv_MA,rVBPQdeayMYht4Uv_FOLHg,fhNf_sg-XzZ3e7HEVGuOZg,LoRef3ChgZKbxUio-sHgQg,Ga2Bt7xfqoggTypWD5VpoQ,xFc50drSPxXkcLvX5ygqrg,tLpkSwdtqqoXwU0JAGnApw,Sd75ucXKoZUM2BEfBHFUOg,lK-wuiq8b1TuU7bfbQZgsg,LAoSegVNU4wx4GTA8reB6A,-qjn24n8HYF6It9GQrQntw,ZkzutF0P_u0C0yTulwaHkA,0QjROMVW9ACKjhSEfHqNCQ,RrapAhd8ZxCj-iue7fu9FA,7j0kor_fkeYhyEpXh4OpnQ,OWkS1FXNJbozn-qPg3LWxg,j9bWpCRwpDVfwVT_V85qeA,6GHwgKNlvfIMUpFaxgBjUA,Ir_QIzs-4o9ElOtiGuxJrw,0Y5Kzo8PWHTjk0tlfAKcDQ,JcsZvx-4yovFgCXOfd6KMg,MTx-Zdl_KcU_z9G832XAjg,a6d7UcYnRvbr4t-THg4pSQ,UFU8ONTkzEkcOk61XJ8JwQ,-C0AlwLuXpcP609madJZQQ,W7hCuNdn2gzehta6eSHzgQ,8nP8ghEpT6WFcM6tfqAaGA,8k62wYhDVq1-652YbJi5eg,8Hvp1tYKiQbBgGIwkCRK5g,39lLJK_rrYY2NYomSsQdUA,kQknjbOvtPmS3NVm-RhcdQ,0y6alZmSLnPzmG5_kP5Quw,3YjPlOX3VHzKguHetiR_3g,OGVHlFHSXjHuioOvm1wVqg,Q_dh08clYUPj13GmCRzIVA,34FYKG4pHNXbM9ZRRiJaGw,...,xnVkYE3iMp_aZniiCIuD0g,DmyS9b7ykIOo7XwYt5I9wg,Xs37o78aby0o-Wmvh5yYPg,jhj-r7aH3AlJyVmGtcHi-Q,Vjzg0VOQsBWw1TA3iLol-A,HfQm-rq78QfSeJmm0i1I9Q,r0byBoB7y_IH8uicEvyCqQ,dDqG-lKO9BRadoQ9fmWP-A,6-yG0OQe-mSRoz1R72MkKg,YO0fC7aJv8PzZ1MGf_F6Vg,nvJjfEPYFXj8sJZcbt0k-Q,BGGQOJQTQerEQu0kHbT_UQ,CvonRhKDJaH155xhtpz_iw,IZUDXIq5SULhQ5RGLCdB7g,VeFqptSzekFAc3FZOpi81Q,gVLzkqIAHIWro_ZxkpjbFg,GCen6oV-_6PfMP_uKN-dZw,P3dBcZh_Hmr1wVWFZn2b-g,2Wzet4CPV0glZYXqFGyqyg,XjrjVQfKpbcvOXda-5r1jQ,u1jADJ_yMcL8bRJre7hjMQ,VBCR0KKjFfzfcvV36KnkhA,02hhtAO83rZBU1hflleElA,6pKR-h3KN7AwgGOOYBbE2A,E50mr3xobsahb77IBRwVTQ,W1iwBUfcDxoOPfhwr6EOGQ,kw3DbQo6Pqo83FSDx8HQjQ,aGMU3qMFOQzG0DT2akMfng,1SJiW_mW6IlEe7hqMTnjYg,LkMtMHVetws5_7QfRjPtlg,2SfSzEd3B7WimeZac23zhg,1dV3qNEv8nNUAX1k3qdE2w,YHCseOJ93wJh0gBcii_2qA,TJt1W9haRm2DKuoZLQ69yA,wM8QNs7uSyDqMJKjBYFPCQ,gp_bu7Ah81qaBY3M0Leffw,PUKOr5bEI87TVHjwijT1xw,zV38gkkEeJ4cVRlSWWQTfQ,H1j34TgbrVZkxeww9xlJTw,F8M0IukXQqR50IRyocRQbg
UgMW8bLE0QMJDCkQ1Ax5Mg,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
aq_ZxGHiri48TUXJlpRkCQ,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
jOERvhmK6_lo_XGUBPws_w,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2hRe26HSCAWbFRn5WChK-Q,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
A0j21z2Q1HGic7jW6e9h7A,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


## Evaluation

In [49]:
import sklearn.metrics as metrics


def score_model(utility_matrix: pd.DataFrame, predicted_utility_matrix: pd.DataFrame, model_name="model_0"):
    # Implement these by hand (each metric 1 point)
    sliced_utility_matrix = utility_matrix.iloc[:10]
    sliced_predicted_utility_matrix = predicted_utility_matrix.iloc[:10]

    rmse_score = metrics.mean_squared_error(sliced_utility_matrix, sliced_predicted_utility_matrix)
    map_score = metrics.mean_absolute_percentage_error(sliced_utility_matrix, sliced_predicted_utility_matrix)
    # coverage_score = metrics.coverage_error(sliced_utility_matrix, sliced_predicted_utility_matrix)
    coverage_score = 0
    personalization_score = 0
    intra_list_similarity_score = 0
    
    print("{} RMSE {}".format(model_name, rmse_score))
    print("MAP: {}".format(model_name, map_score))
    print("Coverage: {}".format(model_name, coverage_score))
    print("Personalization: {}".format(model_name, personalization_score))
    print("Intra-list similarity: {}".format(model_name, intra_list_similarity_score))    

score_model(df_content_predictions, df_utility_matrix, "content-based approach")
score_model(df_collaborative_predictions, df_utility_matrix, "collaborative-filtering approach")

content-based approach RMSE 0.47151222830138095
MAP: content-based approach
Coverage: content-based approach
Personalization: content-based approach
Intra-list similarity: content-based approach
collaborative-filtering approach RMSE 0.00017397355601948505
MAP: collaborative-filtering approach
Coverage: collaborative-filtering approach
Personalization: collaborative-filtering approach
Intra-list similarity: collaborative-filtering approach


## Constraint Satisfaction Problem

We can work with the task of planing breakfast, lunch and dinner for particular user as Constraint Satisfaction Problem with

**Domain**: {all_businesses}

**Variables**: {breakfast, lunch, dinner}

**Constraints**: {constrainst regarding individual variable, or several variables at once}

We also have predicted ratings for every business and want to have personalized plan of restaurants. So we won't only satisfy our constraints, but also would like to get the maximum cumulative rating.

Take a look on prepared constraints and finish empty constraints in similar way (some of these constraints may require analytics on business data. e.g. to finish **has_coffee_constraint** you may need to determine all the categories which may include good coffee in their menu)

In [None]:
def is_vegetarian_constraint(business_id):
    return "vegetarian" in df_yelp_business[df_yelp_business["business_id"] == business_id].categories.values[0].lower()

def has_coffee_constraint(business_id):
    # TODO: implement this constraint (1 point)
    return False

def has_alcohol_constraint(business_id):
    # TODO: implement this constraint (1 point)
    return False

def is_open_constraint(business_id):
    # TODO: implement this constraint (1 point)
    return False

def is_open_at_date_at_time_meta_constraint(weekday, time, business_id):
    # TODO: implement this constraint (1 point)
    return False

def is_open_at_monday_at_10am_constraint(business_id):
    return is_open_at_date_at_time("monday", "10:00", business_id)

def all_are_different_constraint(state):
    for time in ["breakfast", "dinner", "lunch"]:
        for _t in ["breakfast", "dinner", "lunch"]:
            if time == _t: continue
            business_categories = set(df_yelp_business[df_yelp_business["business_id"] == state[time]["business_id"]].categories.values[0].split(","))
            _business_categories = set(df_yelp_business[df_yelp_business["business_id"] == state[_t]["business_id"]].categories.values[0].split(","))
            if len(business_categories.intersection(_business_categories)) > \
                    len(business_categories.union(_business_categories)) // 2:
                return False
    return True

def all_are_in_the_same_city_constraint(state):
    # TODO: implement this constraint (1 point)
    return False

def all_are_in_the_same_region_meta_constraint(coordinates, threshold, state):
    # TODO: implement this constraint (1 point). Hint: use haversine distance https://pypi.org/project/haversine/
    return False

def all_are_in_test_region(state):
    return all_are_in_the_same_region_constraint({"lat": 40.110446, "lon": -115.301568}, 400, state)

def at_least_one_visited_place_constraint(state):
    # TODO: implement this constraint (2 points)
    # Make this constraint give more reward for more than one familiar place
    return False

def at_least_one_has_coffee_constraint(state):
    # TODO: implement this constraint (2 points)
    # Make this constraint give more reward for more than one place with coffee
    return False

In [None]:
import random 

random.seed(42)
inspected_user = random.choice(df_yelp_reviews["user_id"].unique())

all_constraints = {
    "breakfast": [has_coffee_constraint, is_open_constraint, is_open_at_monday_at_10am_constraint],
    "lunch": [is_open_constraint],
    "dinner": [is_vegetarian_constraint, has_alcohol_constraint, is_open_constraint],
    "state": [at_least_one_has_coffee_constraint, at_least_one_visited_place_constraint, all_are_in_test_region,
             all_are_in_the_same_city_constraint, all_are_different_constraint]
}

def goal_test(state: dict, constraints: dict):
    cumulative_rating = state["breakfast"]["predicted_rating"]*state["lunch"]["predicted_rating"]*\
                        state["dinner"]["predicted_rating"]
    for k in constraints.keys():
        if k == "state":
            for c in constraints[k]:
                cumulative_rating *= c(state)
        else:
            for c in constraints[k]:
                cumulative_rating *= c(state[k]["business_id"])
    return cumulative_rating


def prepare_restaurants_plan(user_id: str, user_business_ratings: pd.DataFrame, constraints: dict):
    # TODO: assign breakfast, lunch and dinner by solving Constraint Satisfaction Problem 
    # maximizing total score and satisfying all the constraints (it should work with any configuration of constraints)
    
    # You can implement Backtracking (10) + Filtering (10) + Ordering (5) using goal_test
    # OR
    # Local Search (10) + Min-Conflicts heuristic (10) + Ordering (5) with modification of goal test to work as Min-Conflicts heuristic
    
    state = {"user_id" : user_id,
        "breakfast": 
                {"business_id": random.choice(user_business_ratings["business_id"].unique()),
                 "predicted_rating": 0},
            "lunch": 
    {"business_id": random.choice(user_business_ratings["business_id"].unique()),
                 "predicted_rating": 0},
            "dinner": {"business_id": random.choice(user_business_ratings["business_id"].unique()),
                 "predicted_rating": 0}}
    
    state_v = goal_test(state, constraints)

    
    return state

# TODO: replace df_utility_matrix with your best predictions
prepare_restaurants_plan(inspected_user, df_utility_matrix, all_constraints)