**Collaborative Filtering**

Collaborative methods work with the interaction matrix, also called the rating matrix in the rare case where users rate items explicitly. The machine learning task is to learn a function that predicts the utility of each item to each user. The matrix is typically huge and very sparse: most of its values are missing.


Collaborative filtering methods are based on collecting and analyzing a large amount of information on user behaviors, activities or preferences and predicting what users will like based on their similarity to other users.

The fundamental assumption behind collaborative filtering is that similar user preferences over items can be exploited to recommend those items to a user who has not seen or used them before. In simpler terms, we assume that users who agreed in the past (purchased the same product or viewed the same movie) will agree in the future.

**The aim is to find these user and item dependencies in the matrix.**

Example (factoring a number, by analogy with factoring a matrix):

24 = 8 * 3 <br>
     ^   ^ <br>
     |   | <br>
     factors
     
<b>Matrix Factorization</b>: Matrix factorization finds two smaller rectangular matrices whose product approximates the large rating matrix (RM). These factors retain the dependencies and properties of the rating matrix. One can be seen as the user matrix (UM), where rows represent users and columns are the k latent factors. The other is the item matrix (IM), where rows are the k latent factors and columns represent items. Here k < number of items and k < number of users.
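
To make the idea concrete, here is a minimal sketch of matrix factorization by gradient descent on a toy rating matrix. It is not the notebook's own implementation: the function names (`factorize`, `rmse`), the hyperparameters, the L2 regularization term, and the convention that `0` marks a missing rating are all assumptions made for illustration.

```python
import numpy as np

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Approximate R (users x items) as U @ V, fitting observed entries only.

    Assumption of this sketch: a 0 in R means "rating missing".
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))  # user matrix (UM)
    V = rng.normal(scale=0.1, size=(k, n_items))  # item matrix (IM)
    mask = R > 0                                  # True where a rating exists
    for _ in range(steps):
        err = mask * (R - U @ V)                  # error on observed cells only
        U_grad = err @ V.T - reg * U              # descent direction for U
        V_grad = U.T @ err - reg * V              # descent direction for V
        U += lr * U_grad
        V += lr * V_grad
    return U, V

def rmse(R, U, V):
    """Root mean squared error over the observed (non-zero) entries of R."""
    mask = R > 0
    diff = (R - U @ V)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy 5 users x 4 items rating matrix; zeros are unobserved ratings.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

U, V = factorize(R, k=2)
print("RMSE on observed ratings:", rmse(R, U, V))
print(np.round(U @ V, 2))  # dense matrix, missing cells now filled with predictions
```

The product `U @ V` is dense, so every cell that was missing in `R` receives a predicted rating. Choosing `k` smaller than both dimensions forces the model to share structure across users and items rather than memorize individual ratings.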

My Approach:

- 1) Creating user-restaurant matrix 
- 2) Checking for sparsity
- 3) Handling sparsity using matrix factorization, where gradient descent is applied to lower the RMSE and fill in the missing values in the matrix.
- 4) Applying cosine similarity to calculate similarity scores between users.
- 5) Applying top-K nearest neighbours to find the top K recommended restaurants for the queried user ID.
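
Steps 1, 2, 4 and 5 above can be sketched on a toy frame as follows. This is an illustrative outline under stated assumptions, not the notebook's actual code: the `reviews` data is made up (only the column names mirror the review data), the helper `top_k_neighbours` is hypothetical, and missing ratings are filled with 0 before computing similarity, whereas the approach above would use the matrix completed by factorization. The sketch stops at the neighbour lookup and does not rank restaurants.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the review data; column names mirror the real frame.
reviews = pd.DataFrame({
    "user_id":     ["u1", "u1", "u2", "u2", "u3", "u3", "u4"],
    "business_id": ["b1", "b2", "b1", "b3", "b2", "b3", "b1"],
    "stars":       [5, 3, 4, 2, 1, 5, 5],
})

# Step 1: user-restaurant matrix (rows = users, columns = restaurants).
matrix = reviews.pivot_table(index="user_id", columns="business_id", values="stars")

# Step 2: sparsity = share of cells with no rating.
sparsity = matrix.isna().to_numpy().mean()
print(f"sparsity: {sparsity:.2%}")

# Step 4: cosine similarity between user rows.
# Assumption: missing ratings are treated as 0 in this sketch.
X = matrix.fillna(0).to_numpy()
norms = np.linalg.norm(X, axis=1, keepdims=True)
unit = X / np.where(norms == 0, 1, norms)   # guard against all-zero rows
sim = pd.DataFrame(unit @ unit.T, index=matrix.index, columns=matrix.index)

# Step 5: the k users most similar to the queried user (excluding the user).
def top_k_neighbours(user_id, k=2):
    return sim[user_id].drop(user_id).nlargest(k)

print(top_k_neighbours("u1", k=2))
```

The neighbours' ratings for restaurants the queried user has not reviewed would then be the candidates for recommendation.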

## Importing necessary settings and modules

In [1]:
import pandas as pd

In [2]:
bus_reviews = pd.read_csv('output/business_review.csv', encoding="ISO-8859-1", index_col=0)

In [3]:
bus_reviews.head()

| Unnamed: 0 | review_id | user_id | business_id | stars | name | restuarant_stars | review_count | attributes | categories | city |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3bMgLXMLzm89C_0mkbIFOA | j9hC9EmCsS2S2ZtbsK-l0g | 5Q4Gw1pyZnG8IlFNozxIlw | 2 | Native New Yorker Restaurant | 3.0 | 22 | {'RestaurantsTakeOut': 'True', 'Ambience': "{'... | Restaurants | Gilbert |
| 1 | slhog3p6YaoVEej3USo2Iw | IJ1wbXUh_B5Yn6U1YWHy4g | 5Q4Gw1pyZnG8IlFNozxIlw | 1 | Native New Yorker Restaurant | 3.0 | 22 | {'RestaurantsTakeOut': 'True', 'Ambience': "{'... | Restaurants | Gilbert |
| 2 | noxz5btWWJjSpbUBmregVQ | _L-JKT5OahgimFlVOo708w | 5Q4Gw1pyZnG8IlFNozxIlw | 4 | Native New Yorker Restaurant | 3.0 | 22 | {'RestaurantsTakeOut': 'True', 'Ambience': "{'... | Restaurants | Gilbert |
| 3 | gmiMXHSb6loIDxopmOWHSA | x_I7IDsFeT4vVcEBZLqr7A | 5Q4Gw1pyZnG8IlFNozxIlw | 5 | Native New Yorker Restaurant | 3.0 | 22 | {'RestaurantsTakeOut': 'True', 'Ambience': "{'... | Restaurants | Gilbert |
| 4 | xfgHmu5n7cg-uNoU2C1lZg | ceLFSre4hrzkT5VpSlt2Lg | 5Q4Gw1pyZnG8IlFNozxIlw | 3 | Native New Yorker Restaurant | 3.0 | 22 | {'RestaurantsTakeOut': 'True', 'Ambience': "{'... | Restaurants | Gilbert |


## User-based collaborative filtering

During exploratory analysis, we saw that Toronto has more restaurants than any other city, so we keep only the Toronto data.

In [4]:
bus_reviews = bus_reviews[bus_reviews['city'] == 'Toronto']
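
The choice of city can be re-checked by counting distinct restaurants per city before filtering. The following is a small sketch on a made-up frame; `toy`, `restaurants_per_city`, and `top_city` are hypothetical names, and only the `business_id` and `city` columns are taken from the real data.

```python
import pandas as pd

# Made-up rows: one row per review, as in the real frame.
toy = pd.DataFrame({
    "business_id": ["b1", "b1", "b2", "b3", "b4"],
    "city":        ["Toronto", "Toronto", "Toronto", "Gilbert", "Toronto"],
})

# Count distinct restaurants (not reviews) per city, largest first.
restaurants_per_city = (
    toy.groupby("city")["business_id"].nunique().sort_values(ascending=False)
)
top_city = restaurants_per_city.index[0]

# Keep only the reviews for the city with the most restaurants.
toronto_only = toy[toy["city"] == top_city]
print(restaurants_per_city)
```

Counting with `nunique` rather than row counts matters here, because a restaurant with many reviews would otherwise inflate its city's total.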

In [7]:
bus_reviews

| Unnamed: 0 | review_id | user_id | business_id | stars | name | restuarant_stars | review_count | attributes | categories | city |
|---|---|---|---|---|---|---|---|---|---|---|
| 190 | j99pOG3iUs85SagiXSaDVA | y6zJHfOZ1TQylruZOGWL5w | 9DLBwCS4hu_xaiENCbYXbQ | 4 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 191 | NcLaiNYI8UTDnj2tbaY6_Q | DBHCFW3mSmmOEpONHVu1rQ | 9DLBwCS4hu_xaiENCbYXbQ | 3 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 192 | 8ENIH-hnN9RPOXk51DPzzQ | XN6InnJ-XfH4vRHNvHyYBw | 9DLBwCS4hu_xaiENCbYXbQ | 5 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 193 | Cnvo0tIOH8G0zG9_uNsoyA | SptpVlrkT_hImZTyf6RItQ | 9DLBwCS4hu_xaiENCbYXbQ | 5 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 194 | WOq0IEABXmwzMINp5lBhSA | 0LiZRLlzGXI4ugV_p1Rrbg | 9DLBwCS4hu_xaiENCbYXbQ | 5 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 195 | SFMpXt35e1HklPDhdITBeQ | zbyoPxKKqK1NHQ8_wUvqdQ | 9DLBwCS4hu_xaiENCbYXbQ | 5 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 196 | bzVGArTcOvxOBpt3BGi9MA | hTsDirHtFsOeCC3TTypc-g | 9DLBwCS4hu_xaiENCbYXbQ | 1 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 197 | wFt301O9q3pGR6zM17trzg | 6J5N1B9-COnPtkMxNit3ug | 9DLBwCS4hu_xaiENCbYXbQ | 1 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 198 | 18pqmbBC9lvukqR9PWrWlw | accbcGY98ffb7H4FY1xrpA | 9DLBwCS4hu_xaiENCbYXbQ | 5 | Chae Chester Fried Chicken Express | 4.0 | 9 | {'RestaurantsPriceRange2': '1', 'RestaurantsRe... | Restaurants | Toronto |
| 341 | MXYgoQnZINRF1XYjPxAZmA | _Y5uwE-SfShOon4goDSRAA | Xxr4n8peOKw0csYEgVl0hA | 1 | Wing Machine | 2.5 | 5 | {'GoodForKids': 'True', 'RestaurantsReservatio... | Restaurants | Toronto |
