# RecSys M/H 2023 - Exercise 3 Template

The aims of this exercise are:
 - Explore a different recommendation dataset
 - Develop and evaluate baseline recommender systems
 - Implement hybrid recommender models
 - Explore diversification issues in recommender systems
 - Revise other material from the lectures.

As usual, there is a corresponding Quiz on Moodle for this Exercise, which should be answered as you proceed. For more details, see the Exercise 3 specification.



# Part-Pre. Preparation

## Pre 1. Setup Block

This exercise will use the [Goodreads]() dataset for books. These blocks set up the data files, Python packages, etc.

In [None]:
!rm -rf ratings* books* to_read* test*

!curl -o ratings.csv "https://www.dcs.gla.ac.uk/~craigm/recsysH/coursework/final-ratings.csv"
!curl -o books.csv "https://www.dcs.gla.ac.uk/~craigm/recsysH/coursework/final-books.csv"
!curl -o to_read.csv "https://www.dcs.gla.ac.uk/~craigm/recsysH/coursework/final-to_read.csv"
!curl -o test.csv "https://www.dcs.gla.ac.uk/~craigm/recsysH/coursework/final-test.csv"

In [None]:
#Standard setup
import pandas as pd
import numpy as np
import torch
!pip install git+https://github.com/cmacdonald/spotlight.git@seed#egg=spotlight
from spotlight.interactions import Interactions
SEED=20
BPRMF=None

## Pre 2. Data Preparation

Let's load the `goodbooks` dataset into dataframes.
- `ratings.csv`: It contains ratings sorted by time. Ratings go from one to five.
- `to_read.csv`: It provides IDs of the books marked "to read" by each user, as <user_id, book_id> pairs.
- `books.csv`: It has metadata for each book (goodreads IDs, authors, title, average rating, etc.).

In [None]:
#load in the csv files
ratings_df = pd.read_csv("ratings.csv")
books_df = pd.read_csv("books.csv")
to_read_df = pd.read_csv("to_read.csv")
test = pd.read_csv("test.csv")

In [None]:
## Test
to_read_df.head()

In [None]:
#cut down the number of items and users
counts=ratings_df[ratings_df["book_id"] < 2000].groupby(["book_id"]).count().reset_index()
valid_books=counts[counts["user_id"] >= 10][["book_id"]]

books_df = books_df.merge(valid_books, on="book_id")
ratings_df = ratings_df[ratings_df["user_id"] < 2000].merge(valid_books, on="book_id")
to_read_df = to_read_df[to_read_df["user_id"] < 2000].merge(valid_books, on="book_id")
test = test[test["user_id"] < 2000].merge(valid_books, on="book_id")


#stringify the id columns
def str_col(df):
  if "user_id" in df.columns:
    df["user_id"] = "u" + df.user_id.astype(str)
  if "book_id" in df.columns:
    df["book_id"] = "b" + df.book_id.astype(str)

str_col(books_df)
str_col(ratings_df)
str_col(to_read_df)
str_col(test)

Here we construct the Interactions objects from `ratings.csv`, `to_read.csv` and `test.csv`. We manually specify the num_users and num_items parameters to all Interactions objects, in case the test set differs from your training sets.
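
The ID mapping used for these objects relies on `defaultdict(count().__next__)`, which hands out the next free integer the first time a key is looked up. The sketch below illustrates that behaviour on made-up book IDs; the `toy_*` names are hypothetical and are not part of the exercise.

```python
from collections import defaultdict
from itertools import count

import numpy as np

# Each unseen key is assigned the next integer (0, 1, 2, ...) on first lookup.
toy_iid_map = defaultdict(count().__next__)

toy_book_ids = ["b7", "b3", "b7", "b9"]  # hypothetical external IDs
toy_iids = np.array([toy_iid_map[b] for b in toy_book_ids], dtype=np.int32)

# Reverse map: internal integer ID -> external string ID.
toy_iid_rev_map = {v: k for k, v in toy_iid_map.items()}

print(toy_iids)         # [0 1 0 2]
print(toy_iid_rev_map)  # {0: 'b7', 1: 'b3', 2: 'b9'}
```

Because internal IDs depend on lookup order, they are only meaningful together with the map that produced them. Note also that indexing a `defaultdict` with an unknown key silently adds it; `key in toy_iid_map` checks membership without growing the map.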

In [None]:
from collections import defaultdict
from itertools import count

from spotlight.cross_validation import random_train_test_split

iid_map = defaultdict(count().__next__)


rating_iids = np.array([iid_map[iid] for iid in ratings_df["book_id"].values], dtype = np.int32)
test_iids = np.array([iid_map[iid] for iid in test["book_id"].values], dtype = np.int32)
toread_iids = np.array([iid_map[iid] for iid in to_read_df["book_id"].values], dtype = np.int32)


uid_map = defaultdict(count().__next__)
test_uids = np.array([uid_map[uid] for uid in test["user_id"].values], dtype = np.int32)
rating_uids = np.array([uid_map[uid] for uid in ratings_df["user_id"].values], dtype = np.int32)
toread_uids = np.array([uid_map[uid] for uid in to_read_df["user_id"].values], dtype = np.int32)


uid_rev_map = {v: k for k, v in uid_map.items()}
iid_rev_map = {v: k for k, v in iid_map.items()}


rating_dataset = Interactions(user_ids=rating_uids,
                               item_ids=rating_iids,
                               ratings=ratings_df["rating"].values,
                               num_users=len(uid_rev_map),
                               num_items=len(iid_rev_map))

toread_dataset = Interactions(user_ids=toread_uids,
                               item_ids=toread_iids,
                               num_users=len(uid_rev_map),
                               num_items=len(iid_rev_map))

test_dataset = Interactions(user_ids=test_uids,
                               item_ids=test_iids,
                               num_users=len(uid_rev_map),
                               num_items=len(iid_rev_map))

print(rating_dataset)
print(toread_dataset)
print(test_dataset)

#here we define the validation set
toread_dataset_train, validation = random_train_test_split(toread_dataset, random_state=np.random.RandomState(SEED))

num_items = test_dataset.num_items
num_users = test_dataset.num_users

Finally, this is some utility code that we will use in the exercise.

In [None]:
def getAuthorTitle(iid):
  bookid = iid_rev_map[iid]
  row = books_df[books_df.book_id == bookid]
  return row.iloc[0]["authors"] + " / " + row.iloc[0]["title"]

print("iid 0: " + getAuthorTitle(0) )

## Pre 3. Example Code

To evaluate some of your hand-implemented recommender systems (e.g. Q1, Q4), you will need to instantiate objects that match the specification of a Spotlight model, which `mrr_score()` etc. expects.


Here is an example recommender object that returns 0 for each item, regardless of user.

In [None]:
from spotlight.evaluation import mrr_score, precision_recall_score

class dummymodel:

  def __init__(self, numitems):
    self.predictions=np.zeros(numitems)

  #uid is the user we are requesting recommendations for;
  #returns an array of scores, one for each item
  def predict(self, uid):
    #this model returns all zeros, regardless of userid
    return( self.predictions )

#let's evaluate the effectiveness of dummymodel

print(mrr_score(dummymodel(num_items), test_dataset, train=rating_dataset, k=100).mean())
#as expected, a recommendation model that gives 0 scores for all items obtains an MRR score of 0

In [None]:
#note that mrr_score() displays a progress bar if you set verbose=True
print(mrr_score(dummymodel(num_items), test_dataset, train=rating_dataset, k=100, verbose=True).mean())

# Part-A. Combination of Recommendation Models

## Task 1. Explicit & Implicit Matrix Factorisation Models

Create and train three matrix factorisation systems:

(NOTE: Different models will be trained using DIFFERENT datasets)
 - "EMF": explicit MF, trained on the **ratings** Interactions object (`rating_dataset`)
 - "IMF": implicit MF, trained on the **toread** Interactions object (`toread_dataset_train`)
 - "BPRMF": implicit MF with the BPR loss function (`loss='bpr'`), trained on the **toread** Interactions object (`toread_dataset_train`)

Use a variable of the same name for these models, as we will use some of them later (e.g. `BPRMF`).

Normally, the hyper-parameters (e.g. `embedding_dim`) would be tuned separately for each model using the `validation` set, but here, to simplify the exercise, we use a fixed setting of those hyper-parameters and keep a fixed random seed.
  
In all cases, you must use the standard initialisation arguments, i.e.
`n_iter=10, embedding_dim=32, use_cuda=False, random_state=np.random.RandomState(SEED)`.

Evaluate each of these models in terms of Mean Reciprocal Rank on the test set. MRR can be obtained using:
```python
mrr_score(X, test_dataset, train=rating_dataset, k=100, verbose=True).mean()
```
where X is an instance of a Spotlight model. Do NOT change the `k` or `train` arguments. You MUST use these arguments for MRR for all of the rest of this Exercise.
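
To make the metric concrete, here is a minimal sketch of a reciprocal rank computation for a single user. It is an illustration under stated assumptions, not Spotlight's implementation: `reciprocal_rank` is a hypothetical helper, it takes the best-ranked relevant item, it gives tied items their average rank, and it ignores the masking of `train` items that `mrr_score()` performs.

```python
import numpy as np

def reciprocal_rank(scores, relevant_iids, k=100):
    """1/rank of the best-ranked relevant item, or 0.0 if none has rank <= k."""
    scores = np.asarray(scores, dtype=float)
    best = 0.0
    for iid in relevant_iids:
        s = scores[iid]
        # Average rank under ties: items strictly better, plus half of the other tied items.
        rank = 1 + np.sum(scores > s) + (np.sum(scores == s) - 1) / 2.0
        if rank <= k:
            best = max(best, 1.0 / rank)
    return float(best)

toy_scores = [0.1, 0.9, 0.4, 0.7]
print(reciprocal_rank(toy_scores, relevant_iids=[2]))       # item 2 has rank 3, so 1/3
print(reciprocal_rank(toy_scores, relevant_iids=[0], k=3))  # rank 4 is outside the top 3, so 0.0
print(reciprocal_rank(np.zeros(300), relevant_iids=[5]))    # all items tied at rank 150.5, so 0.0
```

The last call shows why, under this tie-handling assumption, a model that scores every item identically obtains a reciprocal rank of 0 once there are many more items than `k`.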

### Implement the explicit MF model

In [None]:
# Add your solution here

Now you can answer quiz question 1


### Implement the implicit MF model

In [None]:
# Add your solution here

Now you can answer quiz question 2

### Implement the BPRMF model

In [None]:
# Add your solution here
# use BPRMF as the name of your model
BPRMF = None

Now you can answer quiz question 3

## Task 2. Hybrid Model

In this task, you are expected to create new hybrid recommendation models that
combine two of the models from Task 1, namely IMF and BPRMF.

(a) Linearly combine the *scores* from IMF and BPRMF. Use **CombSUM** as your data fusion function, and normalise both sets of input scores into the range 0..1 using [sklearn's minmax_scale() function](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.minmax_scale.html) before combining them.

(b) Apply a pipelining recommender, where the top 100 items are obtained from IMF and re-ranked using the scores of BPRMF. Items not returned by IMF get a score of 0.

To implement these hybrid models, you should create new classes that abide by the Spotlight model contract (namely, it has a `predict(self, uid)` function that returns a score for *all* items).
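
The two hybrid strategies can be sketched on toy data as follows. This is an illustration only: `ToyModel`, `CombSumSketch` and `PipelineSketch` are hypothetical names, and the toy models return hand-picked scores rather than the output of trained IMF or BPRMF models.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

class ToyModel:
    """Hypothetical stand-in for a trained model: the same fixed scores for every user."""
    def __init__(self, scores):
        self.scores = np.asarray(scores, dtype=np.float64)

    def predict(self, uid):
        return self.scores

class CombSumSketch:
    """CombSUM: add the min-max normalised scores of two models."""
    def __init__(self, model_a, model_b):
        self.model_a, self.model_b = model_a, model_b

    def predict(self, uid):
        return minmax_scale(self.model_a.predict(uid)) + minmax_scale(self.model_b.predict(uid))

class PipelineSketch:
    """Keep the top-n items of `first`, score them with `second`, give every other item 0."""
    def __init__(self, first, second, n=100):
        self.first, self.second, self.n = first, second, n

    def predict(self, uid):
        first_scores = self.first.predict(uid)
        second_scores = self.second.predict(uid)
        top = np.argsort(-first_scores)[: self.n]  # indices of the n highest first-stage scores
        out = np.zeros_like(second_scores)
        out[top] = second_scores[top]
        return out

toy_a = ToyModel([1.0, 3.0, 2.0, 0.0])
toy_b = ToyModel([10.0, -1.0, 5.0, 20.0])

comb_scores = CombSumSketch(toy_a, toy_b).predict(0)
pipe_scores = PipelineSketch(toy_a, toy_b, n=2).predict(0)
print(comb_scores)
print(pipe_scores)  # items 1 and 2 keep toy_b's scores; items 0 and 3 get 0
```

Both classes return one score per item, as the model contract requires. In the pipeline output, item 1 survives the first stage but has a negative second-stage score, so it ends up below the items that were filtered out and set to 0; raw model scores can behave this way, and the sketch deliberately leaves that untouched.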

Evaluate each model in terms of MRR. How many users are improved, how many are degraded compared to the BPRMF baseline?
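
Counting improved and degraded users amounts to comparing two per-user score arrays element by element. The sketch below assumes that the arrays returned by `mrr_score()` for the two models are aligned, i.e. position `i` refers to the same user in both; `compare_per_user` and the toy values are hypothetical.

```python
import numpy as np

def compare_per_user(baseline_rr, candidate_rr, tol=1e-12):
    """Return (improved, degraded, unchanged) user counts for candidate vs. baseline."""
    baseline_rr = np.asarray(baseline_rr, dtype=float)
    candidate_rr = np.asarray(candidate_rr, dtype=float)
    delta = candidate_rr - baseline_rr
    improved = int(np.sum(delta > tol))    # candidate strictly better
    degraded = int(np.sum(delta < -tol))   # candidate strictly worse
    unchanged = len(delta) - improved - degraded
    return improved, degraded, unchanged

toy_baseline = [0.5, 0.0, 1.0, 0.25]  # made-up per-user reciprocal ranks
toy_hybrid = [1.0, 0.0, 0.5, 0.25]
print(compare_per_user(toy_baseline, toy_hybrid))  # (1, 1, 2)
```

The small tolerance guards against floating-point noise being counted as a change.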

Finally, pass your instantiated model object to the `test_Hybrid_a()` (for (a)) or `test_Hybrid_b()` (for (b)) functions, as appropriate, and record the results in the quiz. For example, if your model for (b) is called `pipeline`, then you would run:
```python
test_Hybrid_b(pipeline)
```

You now have sufficient information to answer the Task 2 quiz questions.

In [None]:
def test_Hybrid_a(combsumObj):
  for i, u in enumerate([5, 20]):
    print("Hybrid a test case %d" % i)
    print(np.count_nonzero(combsumObj.predict(u) > 1))

def test_Hybrid_b(pipeObj):
  for i, iid in enumerate([3, 0]):
    print("Hybrid b test case %d" % i)
    print(pipeObj.predict(0)[iid])



In [None]:
# Add your solutions here and evaluate them

In [None]:
#Now test your hybrid approaches for the quiz

#test_Hybrid_a(linearModel)
#test_Hybrid_b(pipeModel)


# Part-B. Analysing Recommendation Models

## Utility methods

Below, we provide a function, `get_top_K(model, uid : int, k : int)` which, when provided with a Spotlight model, will provide the top k predictions for the specified uid. The iids, their scores, and their embeddings are returned.
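
The selection step inside `get_top_K()` is based on `scipy.stats.rankdata`, which by default assigns tied items their average rank. The sketch below isolates that step on toy scores to show its effect; `top_k_indices` is a hypothetical helper that mirrors the filtering and ordering logic, without the normalisation or the embeddings.

```python
import numpy as np
from scipy.stats import rankdata

def top_k_indices(scores, k):
    """Keep items whose rank is <= k, ordered by descending score."""
    scores = np.asarray(scores, dtype=float)
    ranks = rankdata(-scores)  # rank 1 = highest score; ties share the average rank
    kept = np.argwhere(ranks <= k).flatten()
    return kept[(-scores[kept]).argsort()]

print(top_k_indices([0.1, 0.9, 0.5, 0.7], k=2))  # [1 3]
print(top_k_indices([0.5, 0.5, 0.2], k=1))       # [] because both leaders have rank 1.5
```

With tied scores, the number of returned items can therefore differ from `k`, which is worth keeping in mind when a model produces many identical scores.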

In [None]:
from typing import Sequence, Tuple

def get_top_K(model, uid : int, k : int) -> Tuple[ Sequence[int], Sequence[float],  np.ndarray ] :
  #returns iids, their (normalised) scores in descending order, and item embeddings for the top k predictions of the given uid.

  from sklearn.preprocessing import minmax_scale

  from scipy.stats import rankdata
  # get scores from model
  scores = model.predict(uid)

  # map scores into the range 0..1 over the entire item space
  scores = minmax_scale(scores)

  #compute their ranks
  ranks = rankdata(-scores)

  # get and filter iids, scores and embeddings
  rtr_scores = scores[ranks <= k]
  rtr_iids = np.argwhere(ranks <= k).flatten()
  if hasattr(model, '_net'):
    embs = model._net.item_embeddings.weight[rtr_iids].detach()
  else:
    # not a model that has any embeddings
    embs = np.zeros([k,1])

  # identify correct ordering using numpy.argsort()
  ordering = (-1*rtr_scores).argsort()

  #return iids, scores and their embeddings in descending order of score
  return rtr_iids[ordering], rtr_scores[ordering], embs[ordering]

if BPRMF is not None: # BPRMF is the model name defined in Task 1
  iids, scores, embs = get_top_K(BPRMF, 0, 10)
  print("Returned iids: %s" % str(iids))
  print("Returned scores: %s" % str(scores))
  print("Returned embeddings: %s" % str(embs))
else:
  print("You need to define BPRMF in Task 1")

## Task 3. Evaluation of Non-personalised Models
Implement the following four (non-personalised) baselines for ranking books based on their statistics:
 - Average rating, obtained from `ratings_df` (column `rating`)
 - Number of ratings, obtained from books_df (column `ratings_count`)
 - Number of 5* ratings, obtained from books_df (column `ratings_5`)
 - Fraction of 5* ratings, calculated from the two sources of evidence above, i.e (columns  `ratings_5` and `ratings_count`).

Evaluate these in terms of MRR using the provided test data. You may use the StaticModel class below.

Hints:
 - As in Exercise 2, the order of items returned by predict() is _critical_. You may wish to refer to iid_map.
 - For all models, you need to ensure that your values are not cast to ints. If you are extracting values from a Pandas series, it is advised to use [.astype(np.float32)](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.astype.html).
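
The two hints above can be illustrated with a small sketch that places a statistic from a dataframe into the internal item order. Everything here is toy data: `toy_iid_map`, `toy_books` and `scores_in_iid_order` are hypothetical, and giving a score of 0 to items missing from the dataframe is an assumption, not a requirement of the exercise.

```python
import numpy as np
import pandas as pd

toy_iid_map = {"b10": 0, "b30": 1, "b20": 2}  # external id -> internal id
toy_books = pd.DataFrame({
    "book_id": ["b20", "b10", "b30"],         # deliberately in a different order
    "ratings_count": [50, 200, 10],
})

def scores_in_iid_order(df, column, iid_map):
    """Return a float32 array whose position i holds the value for internal item id i."""
    # Assumes book_id values are unique in df.
    values = df.set_index("book_id")[column].astype(np.float32)
    out = np.zeros(len(iid_map), dtype=np.float32)
    for book_id, iid in iid_map.items():
        if book_id in values.index:
            out[iid] = values.loc[book_id]
    return out

toy_scores = scores_in_iid_order(toy_books, "ratings_count", toy_iid_map)
print(toy_scores)  # values 200, 10, 50, i.e. in internal-id order
```

An array built this way satisfies the type checks in `StaticModel` and keeps position `i` aligned with internal item id `i`, regardless of the row order of the dataframe.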


In [None]:
class StaticModel:

  def __init__(self, staticscores):
    self.numitems = len(staticscores)
    #print(self.numitems)
    assert isinstance(staticscores, np.ndarray), "Expected a numpy array"
    assert staticscores.dtype == np.float32 or staticscores.dtype == np.float64, "Expected a numpy array of floats"
    self.staticscores = staticscores

  def predict(self, uid):
    #this model returns the same scores for each user
    return self.staticscores

In [None]:
# Add your solution here
# And answer the quiz questions

## Task 4. Qualitatively Examining Recommendations

From now on, we will consider the `BPRMF` model.

In Recommender Systems, the ground truth (i.e. our list of books that the user has added to their "to_read" shelf) can be very incomplete. For instance, this can be because the user is not aware of the book yet.

For this reason, it is important to "eyeball" the recommendations, to understand what the system is surfacing, and whether the recommendations make sense. In this way, we understand if the recommendations are reasonable, even if they are for books that the user has not actually read according to the test dataset.

First, write a function, which given a uid (int), prints the *title and authors* of:
 - (a) the books that the user has previously shelved (c.f. `toread_dataset_train`)
 - (b) the books that the user will read in the future (c.f. `test_dataset`)
 - (c) the top 10 books that the user was recommended by `BPRMF` - you can make use of `get_top_K()`.

You can use the previously defined `getAuthorTitle()` function in your solution.
You will also want to compare books in (c) with those in (a) and (b).

Then, we will examine two specific users, namely uid 1805 (user u336) and uid 179 (user u1331), to analyse if their recommendations make sense. Refer to the Task 4 quiz questions.


In [None]:
# Add your solution here

# Part-C. Diversity of Recommendations

This part of the exercise is concerned with diversification, as covered in Lecture 11.

## Task 5. Measuring Intra-List Diversity


For the BPR implicit factorisation model, implement the Intra-list diversity measure (see Lecture 11) of the top 5 scored items based on their item embeddings in the `BPRMF` model.

Implement your ILD as a function with the specification:
```python
def measure_ild(top_books : Sequence[int], K : int=5) -> float
```
where:
 - `top_books` is a list or a Numpy array of iids that have been returned for a particular user. For instance, it can be obtained from `get_top_K()`.
 - `K` is the number of top-ranked items to consider from `top_books`.
 - Your implementation should use the item embeddings stored in the `BPRMF` model.

Calculate the ILD (with k=5). Using your code for Task 4, identify the books previously shelved and recommended for the specific users requested in the quiz, and use these to analyse the recommendations.

Hints:
 - As can be seen in `get_top_K()`, item embeddings can be obtained from `BPRMF._net.item_embeddings.weight[iid]`.
 - For obtaining the cosine similarity of PyTorch tensors, use `nn.functional.cosine_similarity(, , axis=0)`.
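
As a point of reference, the sketch below computes an intra-list diversity on toy NumPy embeddings. It assumes that ILD is the mean of (1 - cosine similarity) over all unordered pairs among the top `K` items; check this against the definition in Lecture 11, which may differ. `ild_sketch` takes the embedding matrix as an explicit argument and is a hypothetical helper, not the `measure_ild()` signature required above.

```python
from itertools import combinations

import numpy as np

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def ild_sketch(top_items, item_embeddings, K=5):
    """Mean pairwise (1 - cosine similarity) over the first K items."""
    items = list(top_items)[:K]
    pairs = list(combinations(items, 2))
    if not pairs:  # fewer than two items: no pairs to compare
        return 0.0
    total = sum(1.0 - cosine(item_embeddings[i], item_embeddings[j]) for i, j in pairs)
    return total / len(pairs)

toy_embs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(ild_sketch([0, 1], toy_embs))     # identical directions, so 0.0
print(ild_sketch([0, 1, 2], toy_embs))  # pair distances 0, 1 and 1, so 2/3
```

A value near 0 means the listed items point in almost the same direction in the embedding space, while larger values indicate a more varied list.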


In [None]:
# Add your solution here
def measure_ild(top_books : Sequence[int], K : int=5) -> float:
  ILD = 0.0
  return ILD

## Task 6. Implement MMR Diversification

Develop a Maximal Marginal Relevance (MMR) diversification technique, to re-rank the top-ranked recommendations for a given user.

Your function should adhere to the specification as follows:
```python
def mmr(iids : Sequence[int], scores : Sequence[float], embs : np.ndarray, alpha : float) -> Sequence[int]:
```

where:
 - iids is a list of iids,
 - scores are their corresponding scores (in descending order),
 - embs is their embeddings,
 - alpha controls the diversification tradeoff.

The function returns a re-ordering of iids. As in previous Exercises, type hints are provided for clarity; a Sequence can be a list or numpy array.

Hints:
 - As above, for obtaining the cosine similarity of PyTorch tensors, use nn.functional.cosine_similarity(, , axis=0).
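
The greedy selection at the heart of MMR can be sketched on NumPy arrays as follows. This assumes the common formulation in which each candidate is valued at `alpha * score - (1 - alpha) * max similarity to the items already selected`; the placement of `alpha` in the lecture's formula may differ, so treat this as an illustration rather than a reference solution. `mmr_sketch` and the toy data are hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mmr_sketch(iids, scores, embs, alpha):
    """Greedy re-ordering that trades relevance against similarity to selected items."""
    embs = np.asarray(embs, dtype=float)
    remaining = list(range(len(iids)))
    selected = []
    while remaining:
        best_pos, best_val = None, -np.inf
        for pos in remaining:
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max((cosine(embs[pos], embs[s]) for s in selected), default=0.0)
            val = alpha * scores[pos] - (1.0 - alpha) * redundancy
            if val > best_val:  # strict '>' keeps the earlier item on ties
                best_pos, best_val = pos, val
        selected.append(best_pos)
        remaining.remove(best_pos)
    return [iids[pos] for pos in selected]

toy_iids = [10, 11, 12]
toy_scores = [0.9, 0.8, 0.7]
toy_embs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

print(mmr_sketch(toy_iids, toy_scores, toy_embs, alpha=0.5))  # [10, 12, 11]
print(mmr_sketch(toy_iids, toy_scores, toy_embs, alpha=1.0))  # [10, 11, 12]
```

With `alpha=0.5`, item 11 is pushed down because its embedding duplicates that of item 10, which has already been selected; with `alpha=1.0`, the similarity term vanishes and the original score order is kept.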

To use your `mmr()` function, provide it with the outputs of `get_top_K()`. For example, to obtain an MMR reordering of the top 10 predictions of uid 0, we can run:
```python
mmr( *get_top_K(BPRMF, 0, 10), 0.5)
```

Thereafter, we provide test cases for your MMR implementation, which you should report in the quiz. We also ask for the ILD values before and after the application of MMR.


In [None]:
from typing import Sequence
def mmr(iids : Sequence[int], scores : Sequence[float], embs : np.ndarray, alpha : float) -> Sequence[int]:

  assert len(iids) == len(scores)
  assert len(iids) == embs.shape[0]
  assert len(embs.shape) == 2


  rtr_iids=iids

  #input your solution here; it should return a re-ordering of iids, such that the first ranked item is first in the list

  return rtr_iids

In [None]:
def run_MMR_testcases(mmrfn):
  example_embeddings1 = torch.tensor([[1.0,1.0],[1.0,1.0],[0,1.0],[0.1, 1.0]])
  example_embeddings2 = torch.tensor([[1.0,1.0],[1.0,1.0],[0.02,1.0],[0.01,1.0]])
  print("Testcase 0 : %s" % mmrfn([1,2,3,4], [0.5, 0.5, 0.5, 0.5],  example_embeddings1, 0.5)[0] )
  print("Testcase 1 : %s" % mmrfn([1,2,3,4], [0.5, 0.5, 0.5, 0.5],  example_embeddings1, 0.5)[1] )
  print("Testcase 2 : %s" % mmrfn([1,2,3,4], [4, 3, 2, 1],  example_embeddings1, 1)[1] )
  print("Testcase 3 : %s" % mmrfn([1,2,3,4], [0.99, 0.98, 0.97, 0.001],  example_embeddings2, 0.001)[1] )
  print("Testcase 4 : %s" % mmrfn([1,2,3,4], [0.99, 0.98, 0.97, 0.001],  example_embeddings2, 0.5)[1] )



Now we can analyse the impact of our MMR implementation. Let's consider again uid 179 (user u1331).

Apply MMR on the top 10 results obtained from the BPRMF model using `get_top_K()`, with an alpha value of 0.5. The following code should help:
```python
mmr( *get_top_K(BPRMF, 179, 10), 0.5)
```

Finally, analyse the returned books. Calculate the ILD (with `k=5`), and examine the authors and titles (using `getAuthorTitle()`).

Now answer the questions in Task 6 of the Moodle quiz.


In [None]:
#add your solution here

# Task 7 Content-related questions

This task is not a practical task. Instead, the quiz contains questions that test your understanding of related content from the course.

# End of Exercise

As part of your submission, you should complete the Exercise 3 quiz on Moodle.
You will need to upload your notebook, complete with the **results** of executing the code.