# Evaluation of Matrix Factorization 

## Baseline Model

To assess the relevance of our recommendations, we will compare our model to a baseline model. The baseline simply recommends the 20 most popular artists regardless of user, where popularity is measured by the total number of listens. This baseline makes no personalised recommendations: every user receives the same 20 artists. It is an extremely simple model that we expect our more advanced regularized matrix factorization model to outperform.

In [None]:
import numpy as np

class Baseline():
    """
    Baseline model. Recommends the most popular artists across the entire dataset.
    """
    def __init__(self, n_recs):
        self.n_recs = n_recs
    
    def fit(self, item_user):
        """
        Input: item_user, csr matrix of shape (n_items, n_users). Calculates the most popular artists.
        """
        print("Fitting baseline...")
        # sum listens over all users for each artist (row) without densifying the matrix
        total_plays = np.asarray(item_user.sum(axis=1)).ravel()
        
        # get indices of the n_recs most popular artists
        idx = (-total_plays).argsort()[:self.n_recs]
        self.idx = idx
    
    def predict(self, X=None):
        # returns indices of the most popular artists (identical for every user)
        return self.idx

In [None]:
model_baseline = Baseline(n_recs = 20)
model_baseline.fit(train)

In [None]:
def baseline():
    # total listens per artist
    artist_pop = user_artists.groupby('artistID').weight.sum().rename('listens').reset_index()
    # attach the first two columns of `artists` to each artist ID
    artist_pop = pd.merge(artist_pop, artists.iloc[:, 0:2], left_on='artistID', right_on='id')
    # keep the 20 artists with the most listens
    bylistens = artist_pop.sort_values('listens', ascending=False).head(20)
    return bylistens

## Metrics

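We will use the following metrics, each computed on the top k recommendations per user:
- Recall: the fraction of a user's test-set artists that appear in their top k recommendations
- Precision: the fraction of a user's top k recommendations that appear in their test set
- Coverage: the fraction of all artists in the catalog that are recommended to at least one user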
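To make these metrics concrete, here is a minimal sketch in plain Python on toy data. The function names (`precision_at_k_single`, `recall_at_k_single`, `catalog_coverage`) and the toy item IDs are hypothetical and chosen only for illustration; they are not part of LightFM or of the evaluation code in this notebook.

```python
def precision_at_k_single(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant for one user."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def recall_at_k_single(recommended, relevant, k):
    """Fraction of one user's relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0  # assumption: users with no relevant items score 0 here
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(relevant)


def catalog_coverage(all_recommendations, n_items):
    """Fraction of the catalog recommended to at least one user."""
    recommended_items = set()
    for recs in all_recommendations:
        recommended_items.update(recs)
    return len(recommended_items) / n_items


# Toy example: one user's ranked recommendations and their held-out items
recommended = [3, 1, 7, 5]
relevant = {1, 5, 9}
print(precision_at_k_single(recommended, relevant, k=4))
print(recall_at_k_single(recommended, relevant, k=4))

# Toy example: two users' recommendation lists over a catalog of 10 items
print(catalog_coverage([[3, 1], [1, 7]], n_items=10))
```

Note that a model recommending the same list to every user, like our baseline, can never cover more than k items of the catalog, which is why coverage is a useful complement to precision and recall.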

In [None]:
import numpy as np
from lightfm.evaluation import precision_at_k, recall_at_k
from tqdm.notebook import tqdm

In [None]:
def evaluate_lightfm(model, original, train, test, user_features=None, item_features=None, n_rec = 20):
    """
    Calculates evaluation metrics (Precision, Recall and Coverage)
    
    Parameters:
    - model: specified model for recommender system
    - original: matrix of full interactions
    - train: matrix of training set (not used by the current metric calls)
    - test: matrix of test set
    - user_features: a matrix of user features built from metadata
    - item_features: a matrix of item features built from metadata
    - n_rec: number of recommendations to be made
    
    Output:
    - coverage: catalog coverage of model specified
    - precision: precision of model specified
    - recall: recall of model specified
    
    
    """
    
    print("Evaluating Model...")
    
    print("Calculating Coverage...")
    catalog = set()
    item_ids = np.arange(original.shape[1])
    for user in tqdm(range(original.shape[0])):
        # get scores for this particular user for all items
        rec_scores = model.predict(user, item_ids, user_features=user_features, item_features=item_features)

        # get top n_rec items to recommend
        rec_items = (-rec_scores).argsort()[:n_rec]

        # add this user's recommendations to the set of items recommended at least once
        catalog.update(rec_items.tolist())

    # catalog coverage: fraction of all items recommended to at least one user
    coverage = len(catalog) / float(original.shape[1])
    
    print("Calculating Recall at k...")
    recall = recall_at_k(model, test, user_features=user_features, item_features=item_features, k = n_rec).mean()

    print("Calculating Precision at k...")
    precision = precision_at_k(model, test, user_features=user_features, item_features=item_features, k = n_rec).mean()
    
    return coverage, precision, recall
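
The function above relies on LightFM's `predict` interface, so it cannot score the baseline directly. As a rough sketch of how the same three metrics could be computed for a fixed list of recommendations shared by all users, consider the helper below. The name `evaluate_fixed_recs` is hypothetical, and we assume a small dense test array of shape (n_users, n_items); a sparse matrix would need to be converted or handled with sparse operations. Users with no test interactions are skipped when averaging, which is an assumption made here for the sketch.

```python
import numpy as np


def evaluate_fixed_recs(rec_items, test):
    """
    Coverage, precision and recall for one recommendation list given to every user.

    rec_items: 1-D sequence of item indices recommended to all users
    test: dense array of shape (n_users, n_items) with held-out interactions
    """
    test = np.asarray(test)
    rec_items = np.asarray(rec_items)

    relevant = test > 0                            # True where a user has a test interaction
    hits = relevant[:, rec_items].sum(axis=1)      # recommended items each user actually played
    n_relevant = relevant.sum(axis=1)              # number of test items per user
    has_test = n_relevant > 0                      # assumption: skip users with no test items

    precision = (hits[has_test] / len(rec_items)).mean()
    recall = (hits[has_test] / n_relevant[has_test]).mean()
    coverage = len(np.unique(rec_items)) / test.shape[1]
    return coverage, precision, recall


# Toy example: 3 users, 4 items, and a fixed list recommending items 0 and 2
toy_test = np.array([
    [1, 0, 0, 2],
    [0, 0, 3, 0],
    [0, 0, 0, 0],
])
coverage, precision, recall = evaluate_fixed_recs([0, 2], toy_test)
print(coverage, precision, recall)
```

With the baseline from earlier, `rec_items` would be the output of `model_baseline.predict()`, provided the test matrix is oriented with users as rows.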