# Fun Time! Application on Recommender Systems

We've seen basic operations for loading, filtering, and processing datasets using NumPy, Pandas, and SciPy.

Now we've arrived at the fun part: we will implement recommender systems using these libraries.

We will consider recommendation lists of length $n$, i.e., each algorithm returns a list of $n$ items.


In [2]:
import numpy as np
import pandas as pd
from lenskit.algorithms import Recommender, item_knn, user_knn
from scipy import sparse

## Dataset Loading

In [3]:
urm_df = data_df = pd.read_csv("data/ml-100k/u.data",
                               names=["user", "item", "rating", "timestamp"],
                               # dtype is given as a column -> dtype mapping
                               dtype={"user": np.int32, "item": np.int32, "rating": np.int32, "timestamp": np.int64},
                               delimiter="\t")
urm_csr = sparse.load_npz("data/urm_csr.npz")
urm_csc = sparse.load_npz("data/urm_csc.npz")

## Constants

In [5]:
recommendation_length = 10
(num_users, num_items), num_interactions = urm_csr.shape, urm_csr.nnz

rng = np.random.default_rng()

print(recommendation_length, num_users, num_items)

10 944 1683


In [7]:
density = urm_csr.nnz / (num_users * num_items)
density

0.06294248567429025

## Recommender: Random

![Random Recommender](images/random.jpg)

The name is more or less self-explanatory. But just to make it clear, it recommends $n$ random items from the catalog.


In [8]:
def random_item_recommender():
    return rng.permutation(np.arange(1,num_items + 1))[:recommendation_length]
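
As a minimal sketch of the same idea, the list can also be drawn with `Generator.choice` and `replace=False`, which makes the "no repeated items" requirement explicit. The function name, the `n_items` and `seed` parameters, and the catalog size of 1682 are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np

def sample_random_items(n_items, length, seed=None):
    """Return `length` distinct item ids drawn uniformly from 1..n_items."""
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no item appears twice in the list.
    return rng.choice(np.arange(1, n_items + 1), size=length, replace=False)

# Illustrative call: a fixed seed makes the draw reproducible across runs.
recs = sample_random_items(n_items=1682, length=10, seed=42)
```

Passing a seed is optional; it only matters when the same random list must be reproduced, for example in a test.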

## Recommender: Top Popular

![Similarity](images/similarity.jpeg)

This is one of the most basic recommenders out there. It just recommends the most popular items to all users.

The interesting part is counting the number of times users have interacted with each item.

When we have a CSC Sparse matrix, we can get the number of elements stored in each column using the `indptr` attribute [Reference](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html#scipy.sparse.csc_matrix) and the `np.ediff1d` function.

More specifically, for a matrix like this:

```python
[[1, 0, 4],
 [0, 0, 5],
 [2, 3, 6]]
```
We have that `indptr = np.array([0, 2, 3, 6])` (it always has one more element than the matrix has columns), `indices = np.array([0, 2, 2, 0, 1, 2])`, and `data = np.array([1, 2, 3, 4, 5, 6])`. To find the row and value of every stored entry, column by column, we slice `indices` and `data` with consecutive entries of `indptr`:




In [9]:
example_matrix = sparse.csc_matrix(np.array([[1, 0, 4], [0, 0, 5], [2, 3, 6]]))

for column_idx in range(3):
    row_ranges = example_matrix.indices[example_matrix.indptr[column_idx]:example_matrix.indptr[column_idx+1]]
    values = example_matrix.data[example_matrix.indptr[column_idx]:example_matrix.indptr[column_idx+1]]
    print(f"{column_idx = } - {row_ranges = } - {values = }")

column_idx = 0 - row_ranges = array([0, 2], dtype=int32) - values = array([1, 2])
column_idx = 1 - row_ranges = array([2], dtype=int32) - values = array([3])
column_idx = 2 - row_ranges = array([0, 1, 2], dtype=int32) - values = array([4, 5, 6])


Thanks to the way `indptr` is constructed, we can deduce that the number of non-zero elements in each column $i$ is simply the difference `indptr[i+1] - indptr[i]`, which is what `np.ediff1d` computes for all columns at once. After counting the stored elements in each column, we sort the column indices with `np.argsort` and take the last $n$ indices in reverse order (`argsort` sorts in ascending order), so the most popular item comes first.

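We add 1 at the end because ids begin at 1. This is only correct if column $i$ of the matrix holds item id $i + 1$. The shape printed earlier (944 users and 1683 items, one more than the 943 users and 1682 movies in ml-100k) suggests that the ids were used directly as indices; in that case the offset shifts every recommendation by one and should be dropped.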
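
To make the counting and sorting steps concrete, here is a small sketch on the 3x3 example matrix from above. The variable names and the cutoff `k = 2` are chosen only for illustration.

```python
import numpy as np
from scipy import sparse

example = sparse.csc_matrix(np.array([[1, 0, 4], [0, 0, 5], [2, 3, 6]]))

# Consecutive differences of indptr give the number of stored entries per column.
counts = np.ediff1d(example.indptr)

# argsort is ascending, so reverse it and keep the first k column indices.
k = 2
top_columns = np.argsort(counts)[::-1][:k]

print(counts)       # [2 1 3]
print(top_columns)  # [2 0]
```

Column 2 holds three entries and column 0 holds two, so they are the two most "popular" columns, in that order.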

In [11]:
def top_popular_item_recommender():
    return np.argsort(np.ediff1d(urm_csc.indptr))[num_items - 1: num_items - 1 - recommendation_length:-1] + 1

## What are we recommending?

The dataset also ships with item metadata, such as the movie title, release date, IMDb URL, and genres. This is the structure of the data.

| movie id | movie title | release date | video release date | IMDb URL | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |

First, let's load that information into a DataFrame:

In [12]:
item_features = pd.read_csv("data/ml-100k/u.item", 
                            names=["movie_id", "movie_title", "release_date", "video_release_date", 
                                   "IMDB URL", "unknown", "Action", "Adventure", "Animation", "Children's", 
                                   "Comedy", "Crime", "Documentary", "Drama", "Fantasy",
                                   "Film-Noir", "Horror", "Musical", "Mystery", "Romance", "Sci-Fi",
                                   "Thriller", "War", "Western"],
                            delimiter="|",
                            parse_dates=True,
                            # builtin bool instead of np.bool, which some NumPy releases do not provide
                            dtype={"movie_id": np.int32,
                                   "movie_title": "object",
                                   "video_release_date": "object",
                                   "IMDB URL": "object",
                                   "unknown": "object",
                                   "Action": bool,
                                   "Adventure": bool,
                                   "Animation": bool,
                                   "Children's": bool,
                                   "Comedy": bool,
                                   "Crime": bool,
                                   "Documentary": bool,
                                   "Drama": bool,
                                   "Fantasy": bool,
                                   "Film-Noir": bool,
                                   "Horror": bool,
                                   "Musical": bool,
                                   "Mystery": bool,
                                   "Romance": bool,
                                   "Sci-Fi": bool,
                                   "Thriller": bool,
                                   "War": bool,
                                   "Western": bool},
                            encoding='latin-1')
item_features

Unnamed: 0,movie_id,movie_title,release_date,video_release_date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
0,1,Toy Story (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Toy%20Story%2...,0,False,False,True,True,...,False,False,False,False,False,False,False,False,False,False
1,2,GoldenEye (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?GoldenEye%20(...,0,True,True,False,False,...,False,False,False,False,False,False,False,True,False,False
2,3,Four Rooms (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Four%20Rooms%...,0,False,False,False,False,...,False,False,False,False,False,False,False,True,False,False
3,4,Get Shorty (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Get%20Shorty%...,0,True,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,5,Copycat (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Copycat%20(1995),0,False,False,False,False,...,False,False,False,False,False,False,False,True,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1677,1678,Mat' i syn (1997),06-Feb-1998,,http://us.imdb.com/M/title-exact?Mat%27+i+syn+...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1678,1679,B. Monkey (1998),06-Feb-1998,,http://us.imdb.com/M/title-exact?B%2E+Monkey+(...,0,False,False,False,False,...,False,False,False,False,False,True,False,True,False,False
1679,1680,Sliding Doors (1998),01-Jan-1998,,http://us.imdb.com/Title?Sliding+Doors+(1998),0,False,False,False,False,...,False,False,False,False,False,True,False,False,False,False
1680,1681,You So Crazy (1994),01-Jan-1994,,http://us.imdb.com/M/title-exact?You%20So%20Cr...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


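And now we create a function that takes a recommendation list as input and returns the rows of the DataFrame whose ids match. Note that `isin` keeps the DataFrame's own row order, so the result is sorted by movie id rather than by position in the recommendation list.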

In [13]:
def get_item_features(item_ids):
    return item_features[item_features.movie_id.isin(item_ids)]
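
Because `isin` returns rows in the DataFrame's own order, the ranking of a recommendation list is lost in the output. One possible way to keep it, sketched here on a toy catalog, is to index by `movie_id` and select with `.loc`. The helper name `get_item_features_ordered` and the toy data are hypothetical and not part of the notebook.

```python
import pandas as pd

# Toy catalog standing in for item_features (hypothetical data).
toy_items = pd.DataFrame({
    "movie_id": [1, 2, 3, 4],
    "movie_title": ["A", "B", "C", "D"],
})

def get_item_features_ordered(items_df, item_ids):
    """Return the rows for item_ids in the same order as item_ids."""
    # .loc with a list of labels preserves the order of that list.
    return items_df.set_index("movie_id").loc[list(item_ids)].reset_index()

ranked = get_item_features_ordered(toy_items, [3, 1, 4])
print(ranked["movie_title"].tolist())  # ['C', 'A', 'D']
```

Unlike `isin`, `.loc` raises a `KeyError` when an id is missing from the catalog, which can be useful for spotting off-by-one mistakes in the recommended ids.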

In [16]:
get_item_features(random_item_recommender())

Unnamed: 0,movie_id,movie_title,release_date,video_release_date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
39,40,"To Wong Foo, Thanks for Everything! Julie Newm...",01-Jan-1995,,http://us.imdb.com/M/title-exact?To%20Wong%20F...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
169,170,Cinema Paradiso (1988),01-Jan-1988,,http://us.imdb.com/M/title-exact?Nuovo%20cinem...,0,False,False,False,False,...,False,False,False,False,False,True,False,False,False,False
249,250,"Fifth Element, The (1997)",09-May-1997,,http://us.imdb.com/M/title-exact?Fifth%20Eleme...,0,True,False,False,False,...,False,False,False,False,False,False,True,False,False,False
281,282,"Time to Kill, A (1996)",13-Jul-1996,,http://us.imdb.com/M/title-exact?Time%20to%20K...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
575,576,Cliffhanger (1993),01-Jan-1993,,http://us.imdb.com/M/title-exact?Cliffhanger%2...,0,True,True,False,False,...,False,False,False,False,False,False,False,False,False,False
659,660,Fried Green Tomatoes (1991),01-Jan-1991,,http://us.imdb.com/M/title-exact?Fried%20Green...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
756,757,Across the Sea of Time (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Across%20The%...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
850,851,Two or Three Things I Know About Her (1966),01-Jan-1966,,http://us.imdb.com/M/title-exact?Deux%20ou%20t...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1271,1272,Talking About Sex (1994),01-Jan-1994,,http://us.imdb.com/M/title-exact?Talking%20Abo...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1526,1527,Senseless (1998),09-Jan-1998,,http://us.imdb.com/M/title-exact?imdb-title-12...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


In [17]:
get_item_features(top_popular_item_recommender())

Unnamed: 0,movie_id,movie_title,release_date,video_release_date,IMDB URL,unknown,Action,Adventure,Animation,Children's,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
1,2,GoldenEye (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?GoldenEye%20(...,0,True,True,False,False,...,False,False,False,False,False,False,False,True,False,False
50,51,Legends of the Fall (1994),01-Jan-1994,,http://us.imdb.com/M/title-exact?Legends%20of%...,0,False,False,False,False,...,False,False,False,False,False,True,False,False,True,True
100,101,Heavy Metal (1981),08-Mar-1981,,http://us.imdb.com/M/title-exact?Heavy%20Metal...,0,True,True,True,False,...,False,False,True,False,False,False,True,False,False,False
121,122,"Cable Guy, The (1996)",14-Jun-1996,,"http://us.imdb.com/M/title-exact?Cable%20Guy,%...",0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
181,182,GoodFellas (1990),01-Jan-1990,,http://us.imdb.com/M/title-exact?GoodFellas%20...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
258,259,George of the Jungle (1997),01-Jan-1997,,http://us.imdb.com/M/title-exact?George+of+the...,0,False,False,False,True,...,False,False,False,False,False,False,False,False,False,False
286,287,Marvin's Room (1996),18-Dec-1996,,http://us.imdb.com/M/title-exact?Marvin's%20Ro...,0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
288,289,Evita (1996),25-Dec-1996,,http://us.imdb.com/M/title-exact?Evita%20(1996),0,False,False,False,False,...,False,False,False,True,False,False,False,False,False,False
294,295,Breakdown (1997),02-May-1997,,http://us.imdb.com/M/title-exact?Breakdown%20%...,0,True,False,False,False,...,False,False,False,False,False,False,False,True,False,False
300,301,In & Out (1997),19-Sep-1997,,http://us.imdb.com/Title?In+%26+Out+(1997),0,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
