# Exercise 2
> This exercise builds on the lectures' material, namely Lectures 3, 4, 6, 7 and 8.

The aims of this exercise are to:
 - Make your first recommender by implementing user-based collaborative filtering (CF) (c.f. Lecture 4).
 - Make your first recommendations using the Spotlight recommender toolkit on explicit data (c.f. Lecture 8).
 - Develop and evaluate baseline recommender systems (c.f. Lecture 3).
 - Start to think about explicit vs. implicit learners.
 - Evaluate your results using Spotlight (c.f. Lecture 6 & 7).


There are 10 tasks to increase your understanding of the content of the Recommender Systems course. Each of these tasks has corresponding questions in the quiz.

NB: Parts A, B & C are independent. It is important to properly manage your time to ensure that you have time to answer all parts of the Exercise.

In [None]:
#Standard setup
import pandas as pd
import numpy as np
import torch
from typing import List, Tuple, Sequence
SEED=20

We'll be using MovieLens again. Let's load it into dataframes.


In [None]:
!curl -o ml-latest-small.zip http://files.grouplens.org/datasets/movielens/ml-latest-small.zip
# backup location
#!curl -o ml-latest-small.zip http://www.dcs.gla.ac.uk/~craigm/recsysHM/ml-latest-small.zip
!unzip -o ml-latest-small.zip

In [None]:
ratings_df = pd.read_csv("ml-latest-small/ratings.csv")
movies_df = pd.read_csv("ml-latest-small/movies.csv")

# we're going to treat userIds as strings, and similarly for movieIds. This will prevent confusion later on.
ratings_df['userId'] = "u" + ratings_df['userId'].astype(str)
ratings_df['movieId'] = "m" + ratings_df['movieId'].astype(str)
movies_df['movieId'] = "m" +  movies_df['movieId'].astype(str)

In [None]:
ratings_df.head()

In [None]:
movies_df.head()

# Part A. User-based CF

You can generate a matrix of ratings with the ratings_df dataframe. Note that in the matrix, the unrated items are filled with 0 (this means they have no impact upon the calculated Cosine value, but you need to be careful about them in other situations).

In [None]:
r_df_matrix = ratings_df.pivot_table(index='userId', columns='movieId', values='rating').fillna(0)
r_df_matrix

The left hand bold column is the [index of the dataframe](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html) - that is, an attribute of the dataframe that allows fast lookup of rows. In this case, userId has become our index column.

You can get the index values for all users using `.index`:

In [None]:
r_df_matrix.index

You can also use [.loc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html) to access rows, by their "index". For instance, we can get all ratings of a specific user with userId='u1'.

In [None]:
r_df_matrix.loc['u1']
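
To see what the pivot and `.loc` are doing, here is a minimal sketch on a toy dataframe. The names `toy_ratings_df`, `toy_matrix` and `toy_row` are made up for illustration and are not part of the exercise.

```python
import pandas as pd

# a hypothetical, tiny ratings table with the same columns we use from ratings_df
toy_ratings_df = pd.DataFrame({
    "userId": ["u1", "u1", "u2"],
    "movieId": ["m1", "m2", "m2"],
    "rating": [4.0, 3.0, 5.0],
})

# one row per user, one column per movie; unrated cells become 0
toy_matrix = toy_ratings_df.pivot_table(
    index="userId", columns="movieId", values="rating"
).fillna(0)

# .loc looks up a row by its index value (here, a userId string)
toy_row = toy_matrix.loc["u2"]
print(toy_matrix)
print(toy_row)
```

Here u2 never rated m1, so that cell is 0 after `fillna(0)`. Remember that such a 0 means "no rating", not "a rating of zero".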

## Task 0 - Matrix Analysis

This task is concerned with examining `r_df_matrix`. Let's define *density* as the percentage of the matrix that has been filled in (i.e. contains user interactions). How dense is `r_df_matrix`? Express your answer as a percentage (in range 0 to 100), rounded to 2 decimal places.

Hints:
 - You can obtain a NumPy array from a DataFrame using `.to_numpy()` if you prefer
 - Similarly, you can ask for the `.shape` of a dataframe

In [None]:
# Add your solution here

User-based CF heavily relies upon Cosine similarity. We are providing a Cosine similarity implementation based on numpy operations. We also show how to use `df.loc` to get all the ratings of a given user from `r_df_matrix` as a Series - we then make this into a numpy array using the [.values](https://pandas.pydata.org/docs/reference/api/pandas.Series.values.html) property.


In [None]:
def cos_sim(a, b):
  from numpy.linalg import norm
  from numpy import dot
  return dot(a, b)/(norm(a)*norm(b))

print('Cosine similarity between userId=1 and itself is:')
print(cos_sim(r_df_matrix.loc['u1'].values, r_df_matrix.loc['u1'].values))

print('Cosine similarity between userId=1 and userId=607 is:')
print(cos_sim(r_df_matrix.loc['u1'].values, r_df_matrix.loc['u607'].values))
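
As a small sanity check on how zero-filled entries behave, the sketch below uses made-up rating vectors (not MovieLens data). `toy_cos_sim` is a hypothetical copy of the `cos_sim` formula above, so that the block runs on its own.

```python
import numpy as np

def toy_cos_sim(a, b):
    # same formula as cos_sim above: dot product over the product of the norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# two hypothetical users who have both rated the same two movies
user_a = np.array([5.0, 3.0])
user_b = np.array([4.0, 1.0])

# add a third movie that neither user has rated (filled with 0)
user_a_padded = np.array([5.0, 3.0, 0.0])
user_b_padded = np.array([4.0, 1.0, 0.0])

sim_original = toy_cos_sim(user_a, user_b)
sim_padded = toy_cos_sim(user_a_padded, user_b_padded)

# now suppose only user a rated the third movie: the dot product is unchanged,
# but user a's norm grows, so the cosine drops
user_a_extra = np.array([5.0, 3.0, 4.0])
sim_one_sided = toy_cos_sim(user_a_extra, user_b_padded)

print(sim_original, sim_padded, sim_one_sided)
```

A movie that neither user rated leaves the cosine unchanged. A movie rated by only one of them does change it, via that user's norm.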

## Task 1. Get the most similar users.

User-based CF is based on user-neighbourhoods. In this task, you will implement a function `get_most_similar_users(userId : str, k : int = 10)` that will identify the userIds of the k most similar users to the specified userId, and their corresponding cosine similarities.

In determining the most similar users, you should break ties based on their position in the array - for instance, if two users are tied as 2nd most similar user, the user who appears earlier should be 2nd, and the later user 3rd.

You should exclude the compared user itself when generating a list of the most similar users.

NB: We are using Python type hints to remind you what the function parameters (`str`, `int`) and return type (`Tuple[Sequence[str], Sequence[float]]`) should be.

Hints:
 - The `cos_sim` function should be used here.
 - Higher `cos_sim` means more similar.
 - Try SciPy's [`rankdata()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rankdata.html) function. Given an array, `rankdata()` tells you positions in sorted rank order. For instance:
```
>>> rankdata([5.9, 2.1, 4.3])
array([3., 1., 2.])
```
It also has support for addressing ties.
 - The first return component of [np.nonzero](https://numpy.org/doc/stable/reference/generated/numpy.nonzero.html) can be used to return the indices of the elements that are non-zero. E.g.
 ```
 >>> np.array([True,False]).nonzero()[0]
array([0])
```
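
To illustrate the two hints together, here is a sketch on made-up scores (the `toy_` names are hypothetical). It assumes you want rank 1 for the highest score, which is why the scores are negated, and it uses `method="ordinal"`, one of the documented tie-handling options of `rankdata()`, which gives tied values distinct ranks in order of appearance.

```python
import numpy as np
from scipy.stats import rankdata

# hypothetical similarity scores; positions 1 and 2 are tied for the highest value
toy_scores = np.array([0.2, 0.9, 0.9, 0.5])

# negate so that the largest score receives rank 1;
# "ordinal" breaks ties by position, earlier first
toy_ranks = rankdata(-toy_scores, method="ordinal")

# positions of the two best-ranked entries (returned in array order, not rank order)
top2_positions = (toy_ranks <= 2).nonzero()[0]

print(toy_ranks)
print(top2_positions)
```

Whether this is the most convenient route for your own implementation is up to you; it is only meant to show how the two functions behave.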


In [None]:
from scipy.stats import rankdata

def get_most_similar_users(userId : str, k : int = 10) -> Tuple[Sequence[str], Sequence[float]]:
  # Add your solution here

  topk_userids = ["u?"] # a list/numpy array of k userIds of top-k users
  topk_cosines = [0.0]  # a list/numpy array of k cosine similarity values
  return (topk_userids, topk_cosines)

print(get_most_similar_users(userId='u3', k=1))

# Add your solution here (cosine similarity > 0)


You can now answer the questions corresponding to Task 1 in the quiz.

In [None]:
print(get_most_similar_users(userId='u10', k=1))

In [None]:
print(get_most_similar_users(userId='u10', k=10))

In [None]:
print(get_most_similar_users(userId='u500', k=2))

## Task 2. Predict ratings via user-based CF.

Now you should implement your user-based CF, within a `predict_rating()` function. The aim of this function is to predict the rating of a given userId for a given itemId.

Your implementation should make use of your `get_most_similar_users()` implementation above, using k=10 nearest neighbours.
- **If a neighbour has not rated a movie (i.e. rating = 0), you should skip that neighbour.**
- **If you are unable to make a predicted rating, you should return 0.**

Hint:
 - You may wish to revise user-based CF from Lecture 4.

In [None]:
def predict_rating(userId : str, movieId : str) -> float:
  # Add your solution here
  predicted = "?" # predicted rating value
  return predicted

print("Predicted rating:", predict_rating(userId='u1', movieId='m1'))

print("Actual rating:", r_df_matrix.loc['u1']['m1'])

You can complete answering the quiz questions for Task 2.

## Task 3. Predict ratings via user-based CF with Mean-center normalisation.

Users usually rate differently: (1) some rate high, while others low. (2) Some use more of the scale than others. However, the user-based CF we implemented above ignores these differences. To this end, we can apply normalisation to compensate. In this task, you will implement user-based CF with Mean-Center Normalisation.

Provide implementations for `mean_rating(userId : str)` and `predict_rating_MC(userId : str, movieId : str)`.

Hints:
- See lecture 4 about user-based CF with Mean-center normalisation.
- Check if the predicted rating for a given user makes sense (i.e. what did the user rate before)?

In [None]:
def mean_rating(userId : str) -> float:
  # Add your solution here
  mean_rating = '?' # mean-centering value
  return mean_rating

print("Mean rating of user u5:", mean_rating('u5') )

def predict_rating_MC(userId : str, movieId : str) -> float:
  # Add your solution here
  predicted = "?" # predicted rating value with mean-centering
  return predicted

print("Predicted rating:", predict_rating_MC('u1', 'm1'))
print("Actual rating:", r_df_matrix.loc['u1']['m1'])

Now answer the questions for Task 3 in the quiz.

# Part B - Explicit Matrix Factorisation using Spotlight

In this part, we will investigate explicit matrix factorisation.

We're going to use the Spotlight library - see https://github.com/maciejkula/spotlight - and its documentation at https://maciejkula.github.io/spotlight/

You can install this directly from Git; below, we use Craig's patched version.


In [None]:
!pip install git+https://github.com/cmacdonald/spotlight.git@master#egg=spotlight

Now we can get onto some real recommendation work. Spotlight has a handy [Interactions](https://maciejkula.github.io/spotlight/interactions.html) object, which encapsulates the basics of a recommendation dataset.

In fact, there are handy loaders for a few standard datasets including MovieLens, but let's make our own, so that we can match back to the dataframe.

Interactions need *numbers* to uniquely identify each item and user. Unfortunately, our MovieLens uses numbers, but these aren't consecutive (i.e. we have missing movieIds values). They are also strings (i.e. movieIds start with "m" and userIds start with "u").

Hence, for both movies and users, we have to assign numbers that start from 0. We will call these **iids** and **uids**.

We use [defaultdict](https://docs.python.org/3/library/collections.html#collections.defaultdict) to convert the MovieLens strings down to consecutive integers for use in Spotlight, in the `uid_map` and `iid_map` objects. We'll keep the reverse mapping around too, in case we want to look up the actual movieId given the iid recorded by Spotlight (etc).

*NB*: This is a *really* important concept to understand. Put simply, WE -- as humans -- deal with external representations (userId, movieId, in this dataset prefixed with "u" and "m" respectively). On the other hand, Spotlight can only deal with integers starting from 0 for both items and users (we call these "iids" and "uids").

In [None]:
from collections import defaultdict
from itertools import count

#create userId -> uid mapping dictionary. the next assigned value is the current size.
uid_map = defaultdict(count().__next__)
#ditto for movieId -> iid
iid_map = defaultdict(count().__next__)

#uids is an array of integers corresponding to the userId for every row in ratings_df
#uid_map does the assignment of new uid values, or reusing the uid value assigned for
#each userId
uids = np.array([uid_map[uid] for uid in ratings_df["userId"].values ], dtype=np.int32)
#similar for iids
iids = np.array([iid_map[iid] for iid in ratings_df["movieId"].values ], dtype=np.int32)

#freeze uid_map and iid_map so no more mappings are created
uid_map.default_factory = None
iid_map.default_factory = None

uid_rev_map = {v: k for k, v in uid_map.items()}
iid_rev_map = {v: k for k, v in iid_map.items()}
num_items = len(iid_map)
num_users = len(uid_map)

print("%d users %d items" % (num_users, num_items))

ratings = ratings_df["rating"].values.astype(np.float32)
timestamps = ratings_df["timestamp"].values.astype(np.int32)

To be clear, `uid_map` and `iid_map` are just dictionaries - you can use them to look up the uid (iid) assigned to a given user (movie).

Similarly, `uid_rev_map` (`iid_rev_map`) can be used to recover the userId (movieId) for a given uid (iid).

In [None]:
print("userId %s got uid %d" % ("u556", uid_map["u556"]))
print("movieId %s got iid %d" % ("m54001", iid_map["m54001"]))
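
If the `defaultdict(count().__next__)` idiom is new to you, the following sketch shows it on a handful of made-up ids. All `toy_` names are hypothetical and independent of `uid_map` and `iid_map`.

```python
from collections import defaultdict
from itertools import count

# each unseen key is assigned the next integer from the counter: 0, 1, 2, ...
toy_map = defaultdict(count().__next__)

# looking up a key assigns it an id the first time, and reuses that id afterwards
toy_ids = [toy_map[x] for x in ["u7", "u3", "u7", "u9"]]

# freeze the mapping: unseen keys now raise KeyError instead of getting a new id
toy_map.default_factory = None

# reverse mapping, from the integer id back to the original string
toy_rev_map = {v: k for k, v in toy_map.items()}

try:
    toy_map["u42"]
    frozen = False
except KeyError:
    frozen = True

print(toy_ids, toy_rev_map, frozen)
```

Note that ids are assigned in order of first appearance, not in sorted order of the original strings.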

Furthermore, we will use user u556 as one of our illustrative users. You will remember from Exercise 1 that they rated a number of fantasy movies highly.



## On towards Matrix Factorisation (MF)

Now let's build a Spotlight [Interactions](https://maciejkula.github.io/spotlight/interactions.html) object. This contains everything that Spotlight needs to train a model. We can split it up randomly into train, validation and test subsets.

NB: we use a SEED (20) to make our results reproducible.

In [None]:
from spotlight.interactions import Interactions
from spotlight.cross_validation import random_train_test_split

dataset = Interactions(user_ids=uids,
                                  item_ids=iids,
                                  ratings=ratings,
                                  timestamps=timestamps)

#let's initialise the seed, so that the split is repeatable and reproducible
train_valid, test = random_train_test_split(dataset, random_state=np.random.RandomState(SEED))
train, valid = random_train_test_split(train_valid, random_state=np.random.RandomState(SEED))

Let's see how big the three datasets are. What is the train/test split percentage size?

In [None]:
print(train)
print(valid)
print(test)

Here, you can see that following the collaborative filtering task model (see Lecture 6), all users, and all items, are present in both training and test sets.

Now, you can think of the Interactions objects as being partitions of the rating matrix. But we don't store each one as a single big matrix. Instead, we record three one-dimensional arrays:

  * one for the ids of the users
  * one for the ids of the items
  * one for the actual rating values.

Each of these arrays is the size of the number of ratings (64534 for the training set).

In essence, Interactions is a sparse matrix - for each rating, we record its x and y position, as well as the rating itself.
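
This layout is essentially the coordinate (COO) sparse format. As a rough sketch, here are three toy arrays (hypothetical values, not taken from MovieLens) turned back into a small dense matrix using SciPy's `coo_matrix`. Spotlight itself is not involved here.

```python
import numpy as np
from scipy.sparse import coo_matrix

# four hypothetical ratings: the k-th entry of each array describes the k-th rating
toy_user_ids = np.array([0, 0, 1, 2], dtype=np.int32)
toy_item_ids = np.array([0, 2, 1, 2], dtype=np.int32)
toy_ratings = np.array([4.0, 3.5, 5.0, 1.0], dtype=np.float32)

# rows are users, columns are items; only the four rated cells are stored
toy_sparse = coo_matrix((toy_ratings, (toy_user_ids, toy_item_ids)), shape=(3, 3))

# expanding to a dense matrix fills every unrated cell with 0
toy_dense = toy_sparse.toarray()
print(toy_dense)
```

With only 4 of 9 cells filled the saving is small, but for a real rating matrix most cells are empty, so storing three short arrays is far cheaper.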


In [None]:
print(train.item_ids.shape)
print(train.user_ids.shape)
print(train.ratings.shape)

For instance, let's look at the first rating:

In [None]:
print("uid %d gave iid %d a rating of %0.1f" % (train.user_ids[0], train.item_ids[0], train.ratings[0]))

Let's take our favourite fantasy adventure fan from Exercise 1, userId u556. We can take a look at their training ratings:

In [None]:
# map userId to the internal uid value
userId = "u556"
uid = uid_map.get(userId)

# see which ratings are for this user. Use this to filter the item and ratings arrays.
# here we are filtering a numpy array based on an array of True/False values. It's just
# like filtering a Pandas data frame.
print(train.item_ids[train.user_ids == uid])
print(train.ratings[train.user_ids == uid])

We can now learn a model. Let's start with a matrix factorisation for explicit data.  We train the model using the `fit` method. This is just like the `fit` in Sklearn - we're fitting  a model to the specified training data.

This might take up to a minute.

**NB:**  Spotlight can support using GPUs which we could use to slightly speed up training time, but that will make our life more difficult later on, so let's ignore this for now.

In [None]:
from spotlight.factorization.explicit import ExplicitFactorizationModel
import time

emodel = ExplicitFactorizationModel(n_iter=10,
                                    embedding_dim=32, #this is Spotlight default
                                    use_cuda=False,
                                    random_state=np.random.RandomState(SEED) # ensure results are repeatable
)
current = time.time()

emodel.fit(train, verbose=True)

end = time.time()
diff = end - current
print("Training took %d seconds "% (diff))

How well did we do? Let's take a look at the recommendations for our specific user, userId u556.



In [None]:
userId = "u556"

# convert the string to the internal integer
uid = uid_map.get(userId)
print("One test item_id for userId %s (uid %d) is " % (userId, uid))

# pick one rating that the user made
testItemId = test.item_ids[test.user_ids == uid][0]
print("Test movieId is %s iid %d " % (iid_rev_map.get(testItemId), testItemId ) )


#here 0 is a dummy item, which Spotlight needs for some reason...
#we discard its prediction using [1]
predicted = emodel.predict( np.array([uid]), item_ids=np.array([0, testItemId]) )[1]

#what was the actual score of the user for that movie?
#we can get the appropriate row from the ratings dataframe, then extract that value
actual = ratings_df[(ratings_df.movieId==iid_rev_map.get(testItemId)) & (ratings_df.userId==userId)]["rating"].values[0]


def getMovieTitle(iid):
  return movies_df[movies_df['movieId'] == iid_rev_map.get(iid)]["title"].values[0]

print("Predicted rating for '%s' was %f, actual rating %0.1f, error was %f" % (getMovieTitle(testItemId), predicted, actual, abs(predicted-actual) ))


So this is interesting - while we saw above that this user liked fantasy movies, we predicted a rating of $\sim 2.5$, but the user gave this particular movie a 3.5.

We can also ask for **all** of the recommendations for a given user:

In [None]:
allpreds = emodel.predict( np.array([uid]) )

print(allpreds)
print(allpreds.size)

#we can recover the original rating for our test item
print(allpreds[testItemId])

# let's just check this matches the prediction we obtained earlier for the same item
print(abs(allpreds[testItemId] - predicted) < 0.1)

## Latent Factors aka Embeddings

Let's see how these recommendations are made. Remember from Lecture 8 that the prediction is made based on the dot product of the user's and item's latent factors (also known as "embeddings").

We can access these embeddings directly from the emodel object. Each embedding has 32 dimensions, which is what we set when configuring Spotlight's Explicit Factorisation Model.

In [None]:
#the embedding of an item is a PyTorch tensor of size 32
#a PyTorch tensor can be thought of as having similar semantics to a numpy array.
print(emodel._net.item_embeddings.weight[0].shape)
emodel._net.item_embeddings.weight[0]


We can check how Spotlight makes its prediction. The key line is https://github.com/maciejkula/spotlight/blob/master/spotlight/factorization/representations.py#L89

This takes the (dot-)product of the user's "embedding" (latent factor) and the item's embedding. On top of these are added "user_biases" and "item_biases". What do you think these last two components are for?

Let's reproduce this for our favourite user...

In [None]:
# uid=555 for u556
# testItemId is our item of interest

dotprod = (emodel._net.user_embeddings.weight[uid] * emodel._net.item_embeddings.weight[testItemId]).sum(0)
user_bias = emodel._net.user_biases(torch.tensor([uid]))
item_bias = emodel._net.item_biases(torch.tensor([testItemId], dtype=torch.long))

print(getMovieTitle(testItemId))

dotprod + user_bias + item_bias
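
The same arithmetic can be sketched in plain numpy with made-up, 3-dimensional embeddings. All values and `toy_` names below are hypothetical; they only mirror the structure of the computation above (dot product plus the two biases), not the trained model.

```python
import numpy as np

# hypothetical latent factors for one user and one item
toy_user_embedding = np.array([0.5, -1.0, 2.0])
toy_item_embedding = np.array([1.0, 0.5, 0.25])

# hypothetical scalar biases, one for the user and one for the item
toy_user_bias = 0.3
toy_item_bias = -0.1

# element-wise product summed over the dimensions is the dot product
toy_dotprod = float((toy_user_embedding * toy_item_embedding).sum())

# the predicted score adds both biases on top of the dot product
toy_score = toy_dotprod + toy_user_bias + toy_item_bias
print(toy_dotprod, toy_score)
```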

## Task 4. Examining Latent Factors

Let's take a look at item-item similarities. Write a function `mostsimilar(targetIId, model)` that identifies the movie most similar to the specified target iid, based on the Cosine similarity of their item embedding vectors.

What's the closest movie to "Harry Potter and the Deathly Hallows: Part 1 (2010)" , which is movieId m81834 in the MovieLens dataset?

Hint:
 - Since we're working with PyTorch tensors (rather than the numpy vectors used in Part A), you should use [`nn.functional.cosine_similarity(x, y, dim=0)`](https://pytorch.org/docs/stable/nn.functional.html#cosine-similarity) to calculate the cosine similarity between two vectors x & y, as demonstrated below between two orthogonal vectors:

In [None]:
import torch.nn as nn
nn.functional.cosine_similarity(
     torch.tensor([1.0,0]),
     torch.tensor([0,1.0],), dim=0)

In [None]:
import torch.nn as nn

def mostsimilar(targetIId : int, model):
  highest=0
  highestCos=0

  #you may assume that model._num_items provides the total number of items


  # Add your solution here
  #####################

  print(train.num_items)
  print("targetMovieId = %s '%s' (iid %d)" % (iid_rev_map.get(targetIId), getMovieTitle(targetIId), targetIId))
  print("mostSimilar = %s (iid %d) with cosine of %f " % ( iid_rev_map.get(highest), highest, highestCos))


mostsimilar(iid_map["m81834"], emodel)

Hopefully, you can see a correspondence between `"m81834"` and its nearest movie.

In [None]:
mostsimilar(iid_map["m88125"], emodel)

In [None]:
mostsimilar(iid_map["m44"], emodel)

## Evaluating performance

Finally, let's see how good we are at our rating predictions. Handily, Spotlight implements a few common evaluation measures for us to inspect.

In [None]:
from spotlight.evaluation import rmse_score

train_rmse = rmse_score(emodel, train)
test_rmse = rmse_score(emodel, test)

print('Train RMSE {:.3f}, test RMSE {:.3f}'.format(train_rmse, test_rmse))
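
As a reminder of what RMSE measures, here is the textbook definition applied to three made-up ratings. We assume this matches what `rmse_score()` computes; check Spotlight's source if you need to be certain.

```python
import numpy as np

# hypothetical actual and predicted ratings for three user-item pairs
toy_actual = np.array([4.0, 3.0, 5.0])
toy_predicted = np.array([3.5, 3.0, 4.0])

# square each error, average, then take the square root
toy_squared_errors = (toy_actual - toy_predicted) ** 2
toy_rmse = float(np.sqrt(toy_squared_errors.mean()))
print(toy_rmse)
```

Because the errors are squared before averaging, the single error of 1.0 contributes four times as much as the error of 0.5.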


## Task 5. Tuning

Now we wish to tune the number of latent factors. The task here is to train and evaluate new instances of ExplicitFactorizationModel using different numbers of latent factors, while leaving the other parameters unchanged (i.e. `n_iter=10, use_cuda=False, random_state=np.random.RandomState(SEED)`).

You should also record the training times for different numbers of latent factors.

You should vary the factors in `[8,16,32,64]`. Evaluate and record the RMSE values of the resulting models on (i) the training set (`train`), (ii) the  validation set (`valid`) and (iii) the test set (`test`). Use matplotlib to create a graph showing how the training, validation and test RMSE change as the number of latent factors is varied. Use [plt.savefig()](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html) to save a PNG of your graph.

You can now answer the questions about Task 5 in the quiz.

In [None]:
# Add your solution here

## Evaluating Other Models

When evaluating models, it's important to compare to some reasonable baselines.

Fortunately, Spotlight's `rmse_score()` method can be used to evaluate any Python object that adheres to the specification of the `predict()` function. For instance, we can make a baseline "static" scoring model, which returns the same scores for each user. This set of scores is passed as a numpy array in the constructor.


In [None]:
class StaticModel:

  def __init__(self, staticscores):
    self.numitems = len(staticscores)
    assert isinstance(staticscores, np.ndarray), "Expected a numpy array"
    assert staticscores.dtype == np.float32 or staticscores.dtype == np.float64, "Expected a numpy array of floats"
    self.staticscores = staticscores

  # uids are the user(s) we are requesting recommendations for;
  # returns an array of scores, one for each item
  # the array is duplicated for each user requested
  def predict(self, uids, iids=None):
    # this model returns the same static scores, regardless of userid

    # we respond to one or more uids
    uids = [uids] if isinstance(uids, int) else uids

    # if iids is specified, we filter predictions for those iids
    # if iids is not specified, predict() returns the prediction for each item _in iid order_
    iids = np.arange(self.numitems) if iids is None else iids
    return [self.staticscores[iids] for u in uids]

For instance, we can make a static baseline that just returns 0 for every item, regardless of the user.

In [None]:
mydummymodel = StaticModel(np.zeros(num_items))

print("Asking for 2 users, one item: " + str(mydummymodel.predict([0,1],0)))
print("Asking for one item: " + str(mydummymodel.predict(0,0)))
print("Asking for two items: " + str(mydummymodel.predict(0,[0,1])))
print("RMSE of our dummy model: %f" % rmse_score(mydummymodel, test))

## Task 6. Popularity-based Recommenders

This task asks you to implement other baseline recommenders.

**Using ratings_df**, create three new instances of StaticModel as baselines:

(a). the number of ratings for each item - you must linearly normalise this to be in the range 0-5.

(b). the number of 5 scores received by an item - you must linearly normalise this to be in the range 0-5.

(c). the average rating value for each item (no need to normalise - scores are already 0-5)

Evaluate your baseline models in terms of RMSE, as well as providing their scores for particular iids, as requested in the quiz.

Hints:
 - You may find iterating over a dataframe using iterrows() useful - e.g. see  https://stackoverflow.com/a/16476974
 - The order that predict() returns scores for items is VERY IMPORTANT. Think carefully about the assumed order that predict() returns item scores for, and how you can recover that order when working with ratings_df.

In [None]:
# Add your solution here

# Part C - Implicit Recommendation

This part of the lab uses a music dataset from [Last.fm](https://www.last.fm/) -- a Spotify-like music streaming service -- that was obtained by a researcher at Pompeu Fabra University (Barcelona, Spain). The relevant citation is:

```
  @book{Celma:Springer2010,
      	author = {Celma, O.},
      	title = {{Music Recommendation and Discovery in the Long Tail}},
       	publisher = {Springer},
       	year = {2010}
      }
 ```

You can have more information about the dataset at [this link](http://ocelma.net/MusicRecommendationDataset/lastfm-1K.html).

## Dataset preparation

The full Last.fm dataset is 2.4GB uncompressed, so we focus on a sample with 200k listens. You can download and load the sample into a DataFrame using just one line of code:

In [None]:
listens_df = pd.read_csv("https://www.dcs.gla.ac.uk/~craigm/recsysH/lastfm-200ksample-listens_df.csv.gz")


Let's look at the dataset. Note that we don't have any explicit ratings by the users. We just know what they interacted with (and when).

In [None]:
listens_df.head()

## An implicit recommendation approach

Let's move away from explicit recommendation to implicit.

We will continue using the [Spotlight](https://github.com/maciejkula/spotlight/) toolkit for our recommender.

We can construct [Interactions](https://maciejkula.github.io/spotlight/interactions.html) objects for Spotlight in the same way as before. The only difference is that this time we do not record the user's ratings.


In [None]:
from collections import defaultdict
from itertools import count

#we can't trust the musicbrainz ids to exist, so let's build item ids based on artist & trackname attributes
LFMiid_map = defaultdict(count().__next__)
LFMiids = np.array([LFMiid_map[str(artist)+"/"+str(trackname)] for artist,trackname in listens_df[["artist","trackname"]].values ], dtype=np.int32)

LFMuid_map = defaultdict(count().__next__)
LFMuids = np.array([LFMuid_map[uid] for uid in listens_df["user"].values ], dtype=np.int32)
#freeze LFMuid_map and LFMiid_map so no more mappings are created
LFMuid_map.default_factory = None
LFMiid_map.default_factory = None

LFMuid_rev_map = {v: k for k, v in LFMuid_map.items()}
LFMiid_rev_map = {v: k for k, v in LFMiid_map.items()}

from spotlight.interactions import Interactions
from spotlight.cross_validation import random_train_test_split

#NB: we will set num_users and num_items here - it's good practice.
imp_dataset = Interactions(user_ids=LFMuids, item_ids=LFMiids, num_users=len(LFMuid_map), num_items=len(LFMiid_map))
#we could add the timestamps here if we were doing sequence recommendation

#what have we got.
print(imp_dataset)

In [None]:
from spotlight.cross_validation import random_train_test_split

itrain, itest = random_train_test_split(imp_dataset, random_state=np.random.RandomState(SEED))
print(itrain)
print(itest)

Let's run Spotlight's implicit Matrix Factorisation on this dataset. Here, we use a *pointwise* loss, which just tries to predict whether the user will like the item or not. It does not use the BPR loss function (more on that later).

**Warning**: this dataset is difficult for the learner - this *will* take a few minutes to learn... Use the time to read on.

In [None]:
from spotlight.factorization.implicit import ImplicitFactorizationModel
import time

imodel = ImplicitFactorizationModel(n_iter=5,
                                    embedding_dim=32, #this is Spotlight default
                                    use_cuda=False,
                                    random_state=np.random.RandomState(SEED) # ensure results are repeatable
)
current = time.time()

imodel.fit(itrain, verbose=True)
end = time.time()
diff = end - current
print("Training took %d seconds" % (diff))

Again, we can look at the predictions. We make a prediction (a score) for ALL items for user uid 0. Note that the scores vary in magnitude - indeed, we're not predicting a rating, we just need to have scores in order to rank the items in descending order.

In [None]:
print(imodel.predict(0))
print(len(imodel.predict(0)))

Now that we have the scores of all items for a given user, we need to identify the top-scored ones, i.e. those that we would present to the user.

## Task 7. Track Analysis

Write a function `tracksForUser(user)` to identify the artist name & track of the top K (e.g. K=4) items based on their prediction scores from `imodel` for a given user index (i.e. 0..964). What are the 10 top-scored tracks recommended for user uid 4?

Hints:


 - You may find [`np.argwhere()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.argwhere.html) useful. It returns only the positions of an array that are True. For instance:
```
>>> np.argwhere([True, False])
array([[0]])
```
 Alternatively, you can sort and then slice.
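
For the "sort and then slice" route, one possible sketch uses `np.argsort()` on made-up scores. The `toy_` names are hypothetical, and mapping the resulting positions back to artist and track names is left to you.

```python
import numpy as np

# hypothetical prediction scores, where the position in the array is the item id
toy_scores = np.array([0.1, 2.5, -1.0, 1.7])

# argsort sorts ascending, so negate the scores to get the highest-scored items first
toy_order = np.argsort(-toy_scores)

# slice off the first K positions
toy_top2 = toy_order[:2]
print(toy_order, toy_top2)
```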



In [None]:
#Add your solution here

## Task 8. Artist Analysis

Look at the artists actually listened to by uid 4, and compare/contrast with the predictions of the recommender. It's useful to examine how many times each artist was listened to.

Hints:
 - Use a groupby on a suitable subset of the `listens_df` dataframe.
 - Sort by descending frequency of listen.

In [None]:
#Add your solution here

I observed that uid 4 listened frequently to "Radiohead" (rank 3), while a Radiohead song was among the top 10 ranked songs in our predicted model.

## Evaluating an implicit recommender




We can examine the MRR of the implicit model we have learned. We pass it the test set (which contains knowledge of what the user *actually* clicked), as our ground truth.

In the second variant, we also pass the training data. Take a look at the implementation of [mrr_score()](https://github.com/cmacdonald/spotlight/blob/master/spotlight/evaluation.py#L8) to understand what it is doing, and why.

**Questions for you to consider**
 - Why is the second score lower?
 - Would this be the same for all recommendation settings?
 - In the implementation, why are the scores negated, why do we use [rankdata()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rankdata.html)?

We will use the first variant for this Lab.

In [None]:
from spotlight.evaluation import mrr_score

#evaluate on this dataset takes approx 1 minute
!date
print(mrr_score(imodel, itest).mean())
!date
print(mrr_score(imodel, itest,  train=itrain).mean())
!date


How to interpret an MRR score - we know it has a range [0,1] with 1 being best. 1 means, on average across all users, we make a relevant prediction at rank 1; 0.5 means, on average, at rank 2. This is a very rough rule-of-thumb - MRR isn't a linear measure, so  a few poor predictions affect the average more than a few good ones.
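
To make the non-linearity concrete, here is a simplified sketch with three made-up users. It assumes one reciprocal rank per user, taken from the rank of their first relevant item; how `mrr_score()` aggregates within a user is defined by its implementation, so treat this only as an illustration of the measure.

```python
import numpy as np

# hypothetical rank of the first relevant item for each of three users
toy_first_relevant_ranks = np.array([1, 2, 4])

# reciprocal rank per user, then the mean over users
toy_rr = 1.0 / toy_first_relevant_ranks
toy_mrr = float(toy_rr.mean())

# for comparison: the reciprocal of the mean rank gives a different number
toy_mean_rank = float(toy_first_relevant_ranks.mean())
print(toy_rr, toy_mrr, 1.0 / toy_mean_rank)
```

The MRR here is about 0.58, whereas the reciprocal of the mean rank is about 0.43, so an MRR cannot simply be inverted to recover an "average rank".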

**More information:**

rankdata() is a very useful function. Here's an example of its output:
```python
>>> rankdata([0, 2, 3, 2])
array([1. , 2.5, 4. , 2.5])
```

It tells us the RANK of the number at each position. So the first element of the array (value 0) was the smallest, so is "rank 1"; the highest value gets "rank 4"; the other two values are tied, so they get equal ranks (2.5 is halfway between 2 & 3). We can adjust this tie-breaking behaviour using the `method=` kwarg.

You can now answer all questions for Task 8.

## Task 9. Listens and Recommendations

We now want to analyse how the recommender's predictions differ from what it was trained on. This helps us understand in which situations the recommender performs well or not.

We can see the model's performance by using `mrr_score(imodel, itest)`.

*   Pick the user with the lowest uid that has the highest RR. How many listens (i.e. how many times they have listened to any song) did they have in the training dataset (as represented by `itrain`)?
*   Similarly, pick the user with the lowest uid that had the lowest RR. How many listens did they have in the training dataset (`itrain`)?

Hints:
 * What does an Interactions object contain?

In [None]:
#Add your solution here

Next, make a numpy array containing the number of listens for each uid in `itrain`. Plot a histogram of the distribution - like in Exercise 1, use matplotlib's histogram functionality, the default number of bins and use `log=True`.

Save the PNG for uploading to the quiz when prompted.

In [None]:
#Add your solution here

Many users have very few listens. Let's set 20 listens as a threshold.

Let's define users with < 20 listens as cold-start users.
Based on `itrain`, how many cold-start users are there?
Looking back to our evaluation results, what is the MRR for ONLY these users, versus "normal" users with 20 or more listens?


In [None]:
#Add your solution here

## Task 10 - BPR

Finally, let's compare the *pointwise* implicit factorisation model with Bayesian Personalised Ranking (BPR). BPR is a key recommendation model in the literature, which is widely used today as a baseline in many research papers.

Train an ImplicitFactorizationModel on the Last.fm dataset (i.e. `itrain`) using identical settings as before, except adding `loss='bpr'`. Record the time taken to train, and then evaluate its effectiveness in terms of MRR. Do NOT use the `train=itrain` argument to `mrr_score()`.

In [None]:
#Add your solution here

# End of Exercise

As part of your submission, you should complete the Exercise 2 quiz on Moodle.
You will need to upload your notebook, complete with the **results** of executing the code (inc figures and plots).