## Destination Recommender System with Implicit Feedback

Here, we build an implicit feedback recommender system using the [implicit](https://github.com/benfred/implicit) package.

The cells of the user-item matrix are populated by a given user's degree of preference towards an item, which can come in the form of:

1. **explicit feedback:** direct feedback towards an item (e.g., destination ratings), as seen in the collaborative filtering approach
2. **implicit feedback:** indirect behaviour towards an item (e.g., transaction history, browsing history, search behaviour)

Implicit feedback makes assumptions about a user's preference based on their actions towards items. Take Netflix, for example. If you binge-watch a show and blaze through all its seasons in a week, there's a high chance that you like that show. However, if you start watching a series and stop halfway through the first episode, there's reason to suspect that you don't like it.


### Step 1: Import Dependencies

In [None]:
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

import implicit

### Step 2: Load the Data

In [None]:
ratings = pd.read_csv("sample_user_ratings.csv")
destinations = pd.read_csv("sample_destinations.csv")

In [None]:
ratings.head()

In [None]:
destinations.head()

#### Note: In this case, we treat destination ratings as the number of times a user viewed a destination's profile.

### Step 3: Transforming the Data

As with collaborative filtering, we need to transform the `ratings` dataframe into a user-item matrix in which one axis represents users and the other represents destinations. Note that `create_X()` below places destinations in the rows and users in the columns. The cells of this matrix are populated with implicit feedback: in this case, the number of times a user viewed a destination's profile.

The `create_X()` function outputs a sparse matrix **X** with four mapper dictionaries:
- **user_mapper:** maps user id to user index
- **destination_mapper:** maps destination id to destination index
- **user_inv_mapper:** maps user index to user id
- **destination_inv_mapper:** maps destination index to destination id

We need these dictionaries because they tell us which index of the utility matrix corresponds to which user ID or destination ID.

The **X** (user-item) matrix is a `scipy.sparse.csr_matrix`, which stores only the non-zero entries.


In [None]:
def create_X(df):
    """
    Generates a sparse matrix from ratings dataframe.
    
    Args:
        df: pandas dataframe
    
    Returns:
        X: sparse matrix
        user_mapper: dict that maps user id's to user indices
        user_inv_mapper: dict that maps user indices to user id's
        destination_mapper: dict that maps destination id's to destination indices
        destination_inv_mapper: dict that maps destination indices to destination id's
    """
    N = df['user_id'].nunique()
    M = df['destination_id'].nunique()

    user_mapper = dict(zip(np.unique(df["user_id"]), list(range(N))))
    destination_mapper = dict(zip(np.unique(df["destination_id"]), list(range(M))))
    
    user_inv_mapper = dict(zip(list(range(N)), np.unique(df["user_id"])))
    destination_inv_mapper = dict(zip(list(range(M)), np.unique(df["destination_id"])))
    
    user_index = [user_mapper[i] for i in df['user_id']]
    destination_index = [destination_mapper[i] for i in df['destination_id']]

    # Rows are destinations and columns are users (item-user layout).
    X = csr_matrix((df["rating"], (destination_index, user_index)), shape=(M, N))
    
    return X, user_mapper, destination_mapper, user_inv_mapper, destination_inv_mapper

In [None]:
X, user_mapper, destination_mapper, user_inv_mapper, destination_inv_mapper = create_X(ratings)
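
To make the mappers and the sparse matrix more concrete, here is a minimal sketch on a made-up toy dataset. The ids and view counts below are hypothetical and are not taken from `sample_user_ratings.csv`. It uses `np.unique(..., return_inverse=True)` as an alternative way to get the index of every id, and keeps the same orientation as `create_X()` (destinations in rows, users in columns).

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy interactions: one entry per (user, destination) pair.
user_ids = np.array([10, 10, 42, 77])
destination_ids = np.array([501, 502, 502, 509])
views = np.array([3, 1, 5, 2])

# np.unique returns the sorted unique ids; with return_inverse=True it also
# returns the position of every original id within that sorted array.
unique_users, user_index = np.unique(user_ids, return_inverse=True)
unique_destinations, destination_index = np.unique(destination_ids, return_inverse=True)

toy_user_mapper = {int(u): i for i, u in enumerate(unique_users)}
toy_destination_mapper = {int(d): i for i, d in enumerate(unique_destinations)}

# Same orientation as create_X(): destinations in rows, users in columns.
X_toy = csr_matrix(
    (views, (destination_index, user_index)),
    shape=(len(unique_destinations), len(unique_users)),
)

print(toy_user_mapper)
print(toy_destination_mapper)
print(X_toy.toarray())  # dense view, only sensible for tiny matrices
```

Only the four observed interactions are stored; every other cell is an implicit zero.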

### Creating Destination Title Mappers

We need to interpret a destination title from its index in the user-item matrix and vice versa. Let's create 2 helper functions that make this interpretation easy:

- `get_destination_index()` - converts a destination title to destination index

- `get_destination_title()` - converts a destination index to destination title

In [None]:
from fuzzywuzzy import process

def destination_finder(title):
    """Return the destination title that most closely matches `title`."""
    all_titles = destinations['title'].tolist()
    closest_match = process.extractOne(title, all_titles)  # (best_title, score)
    return closest_match[0]

destination_title_mapper = dict(zip(destinations['title'], destinations['destination_id']))
destination_title_inv_mapper = dict(zip(destinations['destination_id'], destinations['title']))

def get_destination_index(title):
    fuzzy_title = destination_finder(title)
    destination_id = destination_title_mapper[fuzzy_title]
    destination_idx = destination_mapper[destination_id]
    return destination_idx

def get_destination_title(destination_idx): 
    destination_id = destination_inv_mapper[destination_idx]
    title = destination_title_inv_mapper[destination_id]
    return title 

It's time to test it out! Let's get the destination index of `Swayambhunath Temple`. 

In [None]:
get_destination_index('Swayambhunath Temple')

Let's pass this index value into `get_destination_title()`. We're expecting Swayambhunath Temple to get returned.

In [None]:
get_destination_title(3)

Great! These helper functions will be useful when we want to interpret our recommender results.
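
One thing to keep in mind: as far as I'm aware, `process.extractOne()` returns its best candidate even when the query is nothing like any title, so a typo-ridden or unrelated query still maps to some destination. The sketch below shows the same idea with a similarity cutoff. It swaps in the standard library's `difflib` instead of `fuzzywuzzy` so that it runs without extra dependencies; `find_title()` and the three titles are hypothetical and only serve as an illustration.

```python
from difflib import get_close_matches

# Hypothetical titles; in the notebook they come from destinations['title'].
toy_titles = ["Swayambhunath Temple", "Phewa Lake", "Pashupatinath Temple"]

def find_title(query, choices, cutoff=0.6):
    """Return the closest title, or None if nothing is similar enough."""
    # Compare in lower case, but return the title with its original casing.
    lowered = {choice.lower(): choice for choice in choices}
    matches = get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

print(find_title("swayambunath temple", toy_titles))  # misspelled query
print(find_title("zzzz", toy_titles))                 # nothing similar
```

Returning `None` for a poor match lets the caller decide what to do, rather than silently recommending neighbours of an unrelated destination.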

### Step 4: Building Our Implicit Feedback Recommender Model


We've transformed and prepared our data so that we can start creating our recommender model.

The [implicit](https://github.com/benfred/implicit) package is built around a linear algebra technique called [matrix factorization](https://en.wikipedia.org/wiki/Matrix_factorization_(recommender_systems)), which can help us discover latent features underlying the interactions between users and destinations. These latent features give a more compact representation of user tastes and item descriptions. Matrix factorization is particularly useful for very sparse data and can enhance the quality of recommendations. The algorithm works by factorizing the original user-item matrix into two factor matrices:

- user-factor matrix (n_users, k)
- item-factor matrix (k, n_items)

We are reducing the dimensions of our original matrix into "taste" dimensions. We cannot directly interpret what each of the $k$ latent features represents. However, we could imagine that one latent feature may represent users who like historical sites, while another may represent scenic natural destinations.

$$X_{mn} \approx P_{mk} \times Q_{nk}^T = \hat{X}$$


In traditional matrix factorization, such as SVD, we would attempt to solve for both factor matrices at once, which can be very computationally expensive. As a more practical alternative, we can use a technique called Alternating Least Squares (ALS). With ALS, we solve for one factor matrix at a time:

- Step 1: hold user-factor matrix fixed and solve for the item-factor matrix
- Step 2: hold item-factor matrix fixed and solve for the user-factor matrix

We alternate between Step 1 and 2 above until the product of the user-factor matrix and the item-factor matrix is approximately equal to the original X (user-item) matrix. This approach is less computationally expensive and can be run in parallel.
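
The two alternating steps can be sketched in a few lines of NumPy. This is a simplified illustration, not what the package does internally: it runs plain regularized least squares on a small dense matrix and leaves out the confidence weighting used for implicit feedback. The function name `als_sketch`, the toy matrix and the parameter values are all assumptions made for this example.

```python
import numpy as np

def als_sketch(X, k=2, reg=0.1, n_iters=20, seed=0):
    """Plain regularized ALS on a dense matrix X of shape (m, n)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    P = rng.normal(scale=0.1, size=(m, k))  # user-factor matrix
    Q = rng.normal(scale=0.1, size=(n, k))  # item-factor matrix
    reg_eye = reg * np.eye(k)
    errors = []
    for _ in range(n_iters):
        # Step 1: hold P fixed and solve (P^T P + reg I) Q^T = P^T X for Q.
        Q = np.linalg.solve(P.T @ P + reg_eye, P.T @ X).T
        # Step 2: hold Q fixed and solve (Q^T Q + reg I) P^T = Q^T X^T for P.
        P = np.linalg.solve(Q.T @ Q + reg_eye, Q.T @ X.T).T
        errors.append(float(np.linalg.norm(X - P @ Q.T)))
    return P, Q, errors

# Toy matrix with two "taste" groups: users 0-1 view items 0-1, users 2-3 view items 2-3.
X_als = np.array([
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 5.0, 5.0],
])

P, Q, errors = als_sketch(X_als)
print(np.round(P @ Q.T, 1))  # reconstruction of X_als
print(errors[-1])            # remaining reconstruction error
```

Each step is an ordinary ridge regression with a closed-form solution, which is why fixing one factor matrix at a time is cheap.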

The [implicit](https://github.com/benfred/implicit) package implements matrix factorization using Alternating Least Squares (see docs [here](https://implicit.readthedocs.io/en/latest/als.html)). Let's initiate the model using the `AlternatingLeastSquares` class.

In [None]:
model = implicit.als.AlternatingLeastSquares(factors=50)

This model comes with a couple of hyperparameters that can be tuned to generate optimal results:

- factors ($k$): number of latent factors
- regularization ($\lambda$): prevents the model from overfitting during training

In this tutorial, we'll set $k = 50$ and $\lambda = 0.01$ (the default). In a real-world scenario, I highly recommend tuning these hyperparameters before generating recommendations.

The next step is to fit our model to the matrix **X**.

In [None]:
model.fit(X)

Now, let's test out the model's recommendations. We can use the model's `similar_items()` method, which returns the destinations most similar to a given destination. We can use our `get_destination_index()` helper to get the index of the destination that we're interested in.

In [None]:
destination_of_interest = 'phewa lake'

destination_index = get_destination_index(destination_of_interest)
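# Assumes implicit >= 0.5, where similar_items() returns a pair of arrays (ids, scores).
related = model.similar_items(destination_index, N=6)
related_ids = list(related[0])  # keep only the destination indices
print(related_ids)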

The output of `similar_items()` is not user-friendly. We'll need to use our `get_destination_title()` function to interpret what our results are. 

In [None]:
matched_title = destination_finder(destination_of_interest)  # resolve the fuzzy match once
print(f"Because you watched the profile of {matched_title}...")
for r in related_ids:
    recommended_title = get_destination_title(r)
    # The query destination is its own nearest neighbour, so skip it
    if recommended_title != matched_title:
        print(recommended_title)

When we treat user ratings as implicit feedback, the results look pretty good! You can test out other destinations by changing the `destination_of_interest` variable.
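
Conceptually, item-to-item similarity compares the latent factor vectors of two destinations. The sketch below assumes cosine similarity and uses a made-up item-factor matrix with $k = 2$; `most_similar()` is a hypothetical helper, not part of the package.

```python
import numpy as np

# Hypothetical item-factor matrix: 4 destinations, k = 2 latent factors.
toy_item_factors = np.array([
    [0.9, 0.1],  # destination 0
    [0.8, 0.2],  # destination 1
    [0.1, 0.9],  # destination 2
    [0.2, 0.7],  # destination 3
])

def most_similar(item_idx, factors, n=2):
    """Rank items by cosine similarity to the item at item_idx."""
    norms = np.linalg.norm(factors, axis=1)
    scores = factors @ factors[item_idx] / (norms * norms[item_idx])
    ranked = np.argsort(-scores)  # highest similarity first
    return [(int(i), float(scores[i])) for i in ranked[:n]]

similar = most_similar(0, toy_item_factors)
print(similar)
```

The query item always ranks first because it is identical to itself, which is why the loop above filters out the original title before printing.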

### Step 5: Generating User-Item Recommendations

A cool feature of [implicit](https://github.com/benfred/implicit) is that you can pull personalized recommendations for a given user. Let's test it out on a user in our dataset.

In [None]:
user_id = 95

In [None]:
user_ratings = ratings[ratings['user_id']==user_id].merge(destinations[['destination_id', 'title']])
user_ratings = user_ratings.sort_values('rating', ascending=False)
print(f"Number of destinations rated by user {user_id}: {user_ratings['destination_id'].nunique()}")

User 95 viewed the profiles of 19 destinations. Their highest-rated destinations are below:

In [None]:
user_ratings = ratings[ratings['user_id']==user_id].merge(destinations[['destination_id', 'title']])
user_ratings = user_ratings.sort_values('rating', ascending=False)
top_5 = user_ratings.head()
top_5

Their lowest-rated destinations:

In [None]:
bottom_5 = user_ratings[user_ratings['rating']<3].tail()
bottom_5

Let's see what recommendations our model will generate for user 95.

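We'll use the `recommend()` method, which takes the index of the user of interest and the user-item matrix. Because `create_X()` built **X** with destinations as rows, we transpose it first so that each row corresponds to a user.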
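
Under the hood, a personalized recommendation boils down to scoring every destination against the user's latent factors and keeping the best ones the user has not interacted with yet. Here is a minimal sketch under that assumption; the factor matrices, the `seen` dictionary and `recommend_top_n()` are made up for illustration.

```python
import numpy as np

# Hypothetical factors with k = 2: 2 users and 4 destinations.
toy_user_factors = np.array([
    [1.0, 0.0],  # user 0 leans toward latent factor 0
    [0.0, 1.0],  # user 1 leans toward latent factor 1
])
toy_dest_factors = np.array([
    [0.9, 0.1],
    [0.8, 0.3],
    [0.1, 0.9],
    [0.2, 0.7],
])
seen = {0: [0], 1: [2]}  # destination indices each user already interacted with

def recommend_top_n(user_idx, n=2):
    """Score all destinations for one user and return the top n unseen ones."""
    scores = toy_dest_factors @ toy_user_factors[user_idx]
    scores[seen[user_idx]] = -np.inf  # never recommend what was already seen
    top = np.argsort(-scores)[:n]
    return [(int(i), float(scores[i])) for i in top]

print(recommend_top_n(0))
print(recommend_top_n(1))
```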

In [None]:
print(len(ratings["user_id"].unique()))

In [None]:
X_t = X.T.tocsr()  # transpose so that rows are users and columns are destinations
user_idx = user_mapper[user_id]
recommendations = model.recommend(user_idx, X_t, N=10)
recommendations

We can't interpret the results as is since destinations are represented by their index. We'll have to loop over the list of recommendations and get the destination title for each destination index. 

In [None]:
print("Here are some destinations personally recommended based on your previous ratings:")
for r in recommendations:
    recommended_title = get_destination_title(r[0])
    print(recommended_title)
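
The loop above reads each recommendation as `r[0]`, which fits the list of `(index, score)` pairs that, to my knowledge, `recommend()` returned in implicit releases before 0.5. From 0.5 onward it returns a pair of arrays `(ids, scores)` and expects only the selected user's row (for example `X_t[user_idx]`) as its second argument. If you are unsure which version you have, a small helper like the hypothetical `extract_item_ids()` below handles both output layouts. The sample outputs are made up.

```python
import numpy as np

def extract_item_ids(result):
    """Return a plain list of item indices from either output layout."""
    # Newer layout: a tuple of two arrays, (ids, scores).
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[0], np.ndarray):
        ids, _scores = result
        return [int(i) for i in ids]
    # Older layout: a list of (index, score) pairs.
    return [int(i) for i, _score in result]

# Made-up outputs in both layouts.
new_style = (np.array([7, 2, 5]), np.array([0.91, 0.85, 0.60]))
old_style = [(7, 0.91), (2, 0.85), (5, 0.60)]

print(extract_item_ids(new_style))
print(extract_item_ids(old_style))
```

With the resulting list, each index can be passed to `get_destination_title()` in the same way as before.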