# Group Recommender Systems - Tutorial 2 (Lab 2)

In this tutorial, we will focus on Group Recommender Systems. After completing this tutorial, you will be able to: 
- Implement the basic aggregation strategies for group recommendations.
- Generate a simple textual explanation for such strategies.

#### Summary

1. Selection of a random group in our dataset
2. Aggregation Strategies for Group Recommenders
3. Explanations for Group Recommenders


#### 1. Selection of a random group in our dataset

First, we need a group! We will select a random set of five users from our dataset. For simplicity, we only consider users with more than 200 ratings.

In [1]:
preprocessed_dataset_folder = "../preprocessed_dataset"

import pandas as pd
ratings_df = pd.read_csv(preprocessed_dataset_folder+"/ratings.csv") 
movies_df = pd.read_csv(preprocessed_dataset_folder+"/movies.csv", index_col="item")


In [2]:
users_ratings = ratings_df.groupby(['user']).count()
selected = users_ratings['rating'] > 200
selected_users = users_ratings.loc[selected]
random_selected = selected_users.sample(n=5) # sample() returns n random rows; here, a dataframe with five rows
select_column_df = random_selected.reset_index()['user'] # reset_index() creates a new index and turns the user id into a column, which we then select by name
group_users = list(select_column_df) # convert the selected column into a plain list of user ids
print(group_users)

[305, 103, 561, 156, 105]


Let us assume we want to recommend to this group a list of 10 movies that nobody in the group has seen yet. We first need to determine the list of possible candidates. For simplicity, we will only consider movies for which we have more than 10 ratings.

In [3]:
group_ratings = ratings_df.loc[ratings_df['user'].isin(group_users)]
all_movies = set(movies_df.index.tolist())
num_ratings_df = ratings_df.groupby(['item']).count()
considered_movies = set(num_ratings_df.loc[num_ratings_df['user'] > 10].reset_index()['item'])

group_seen_movies = set(group_ratings['item'].tolist())
group_unseen_movies = considered_movies - group_seen_movies

print(len(all_movies))
print(len(considered_movies))
print(len(group_seen_movies))
print(len(group_unseen_movies))

4633
1421
989
627


Now, we need to estimate each individual's preferences for the unseen movies. To do so, we use the Lenskit library, with the same CF recommender used in the previous example. To generate the Dataframe of user-item pairs to pass as input to the *predict* function, we use the [product](https://docs.python.org/3/library/itertools.html#itertools.product) function of the itertools module, which takes two iterables as input and returns every possible pair made of one element from each (their Cartesian product). The result is passed to the Dataframe constructor, which generates a Dataframe containing one pair per row.
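As a minimal sketch of what `itertools.product` produces, the toy example below builds the pairs for two users and three items. The identifiers (`toy_users`, `toy_items`, `toy_pairs_df`) and their values are invented for illustration and are not part of the dataset.

```python
import itertools
import pandas as pd

# Toy identifiers, invented for illustration only
toy_users = [1, 2]
toy_items = [10, 20, 30]

# product() yields every (user, item) tuple: 2 users x 3 items = 6 pairs
toy_pairs_df = pd.DataFrame(
    list(itertools.product(toy_users, toy_items)),
    columns=['user', 'item'],
)
print(toy_pairs_df)
```

Each user appears once per item, so the number of rows is always the product of the two input sizes.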

In [4]:
import itertools
from lenskit.algorithms import Recommender
from lenskit.algorithms.user_knn import UserUser

user_user = UserUser(15, min_nbrs=3)  # Minimum (3) and maximum (15) number of neighbors to consider
recsys = Recommender.adapt(user_user)
recsys.fit(ratings_df)
group_unseen_df = pd.DataFrame(list(itertools.product(group_users, group_unseen_movies)), columns=['user', 'item'])
group_unseen_df['predicted_rating'] = recsys.predict(group_unseen_df)
display(group_unseen_df)

| | user | item | predicted_rating |
|---|---|---|---|
| 0 | 305 | 3 | 4.112944 |
| 1 | 305 | 2052 | 3.315190 |
| 2 | 305 | 2053 | 2.744430 |
| 3 | 305 | 7 | 3.037984 |
| 4 | 305 | 9 | 3.043037 |
| ... | ... | ... | ... |
| 3130 | 105 | 2021 | 4.118675 |
| 3131 | 105 | 135143 | 3.930969 |
| 3132 | 105 | 55282 | 3.821803 |
| 3133 | 105 | 4084 | 3.663001 |


We now have our predicted ratings.
We can apply an aggregation strategy to generate the group recommendations.

#### 2. Aggregation Strategies for Group Recommenders

Let's implement some of the aggregation strategies seen in the lecture today.

##### Additive strategy

The Additive strategy uses the sum of all individual ratings as the group rating. The recommended items are then the ones with the highest group rating. We can implement it by grouping our *group_unseen_df* Dataframe by *item* and computing the *sum*.

In [5]:
# Additive strategy

additive_df = group_unseen_df.groupby('item').sum()
additive_df = additive_df.join(movies_df['title'], on='item')
additive_df = additive_df.sort_values(by="predicted_rating", ascending=False).reset_index()[['item', 'title', 'predicted_rating']]
display(additive_df.head(10))

| | item | title | predicted_rating |
|---|---|---|---|
| 0 | 3451 | guess who's coming to dinner | 24.429428 |
| 1 | 1411 | hamlet | 23.91999 |
| 2 | 3836 | kelly's heroes | 23.19465 |
| 3 | 951 | his girl friday | 23.120324 |
| 4 | 1283 | high noon | 22.803631 |
| 5 | 3435 | double indemnity | 22.54956 |
| 6 | 1041 | secrets & lies | 22.48458 |
| 7 | 2160 | rosemary's baby | 22.448222 |
| 8 | 3213 | batman: mask of the phantasm | 22.22885 |
| 9 | 1103 | rebel without a cause | 22.225097 |


##### Least Misery strategy

The Least Misery strategy uses the minimum of all individual ratings as the group rating. The recommended items are then the ones with the highest group rating. As before, we can implement it by grouping our *group_unseen_df* Dataframe by *item* and computing the *min*.

In [6]:
# least misery

least_misery_df = group_unseen_df.groupby('item').min()
least_misery_df = least_misery_df.join(movies_df['title'], on='item')
least_misery_df = least_misery_df.sort_values(by="predicted_rating", ascending=False).reset_index()[['item', 'title', 'predicted_rating']]
display(least_misery_df.head(10))

| | item | title | predicted_rating |
|---|---|---|---|
| 0 | 951 | his girl friday | 4.370239 |
| 1 | 3836 | kelly's heroes | 4.344944 |
| 2 | 3451 | guess who's coming to dinner | 4.331141 |
| 3 | 1287 | ben-hur | 4.236517 |
| 4 | 1242 | glory | 4.219783 |
| 5 | 1283 | high noon | 4.148872 |
| 6 | 1411 | hamlet | 4.147944 |
| 7 | 1103 | rebel without a cause | 4.124804 |
| 8 | 6787 | all the president's men | 4.120403 |
| 9 | 2160 | rosemary's baby | 4.115756 |


##### Most Pleasure strategy

The Most Pleasure strategy uses the maximum of all individual ratings as the group rating. The recommended items are then the ones with the highest group rating. Again, we can implement it by grouping our *group_unseen_df* Dataframe by *item* and computing the *max*.

In [7]:
# most pleasure

most_pleasure_df = group_unseen_df.groupby('item').max()
most_pleasure_df = most_pleasure_df.join(movies_df['title'], on='item')
most_pleasure_df = most_pleasure_df.sort_values(by="predicted_rating", ascending=False).reset_index()[['item', 'title', 'predicted_rating']]
display(most_pleasure_df.head(10))

| | item | title | predicted_rating |
|---|---|---|---|
| 0 | 1411 | hamlet | 5.236559 |
| 1 | 3451 | guess who's coming to dinner | 5.154168 |
| 2 | 951 | his girl friday | 4.917308 |
| 3 | 1283 | high noon | 4.875535 |
| 4 | 3836 | kelly's heroes | 4.85948 |
| 5 | 3435 | double indemnity | 4.84541 |
| 6 | 2160 | rosemary's baby | 4.735055 |
| 7 | 1041 | secrets & lies | 4.726601 |
| 8 | 3683 | blood simple | 4.714523 |
| 9 | 2968 | time bandits | 4.704793 |
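To see how the Additive, Least Misery and Most Pleasure strategies can disagree, here is a small sketch on a toy table. The users, items and ratings below are invented for illustration; only the column names follow *group_unseen_df*.

```python
import pandas as pd

# Toy predicted ratings for 3 users and 3 items (values invented for illustration)
toy_df = pd.DataFrame({
    'user': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'item': ['A', 'B', 'C'] * 3,
    'predicted_rating': [5.0, 3.6, 4.8,   # user 1
                         1.0, 3.6, 3.0,   # user 2
                         4.5, 3.6, 4.0],  # user 3
})

# Compute the three group scores per item in a single pass
scores = (
    toy_df.groupby('item')['predicted_rating']
    .agg(['sum', 'min', 'max'])
    .rename(columns={'sum': 'additive', 'min': 'least_misery', 'max': 'most_pleasure'})
)
print(scores)

# Best item under each strategy (label of the row with the highest score)
winners = scores.idxmax()
print(winners)
```

In this toy group each strategy picks a different item: C has the highest total, B is the only item nobody dislikes, and A has the single highest individual rating even though user 2 rates it very low.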


##### Fairness strategy

For the Fairness strategy, we fix an ordering of the group members, and at each round one group member chooses his/her best remaining item. Hence, we first compute the preference list of each group member separately. Then we iterate over the group members in turn: at each iteration, we take the top item from the list of the member whose turn it is, remove it from all members' lists, and add it to the result list. Finally, we create a dataframe and enrich it with the information of the selected movies.

In [8]:
# Fairness

import pandas as pd

def generate_preference_list(user):
    individual_df = group_unseen_df.loc[group_unseen_df['user']==user]
    return list(individual_df.sort_values(by="predicted_rating", ascending=False).reset_index()['item'])

individual_preference_lists = dict()
for member in group_users:
    individual_preference_lists[member] = generate_preference_list(member)
    
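result = list()
for i in range(10):
    user = group_users[i % len(group_users)]  # members take turns in a fixed order
    user_best = individual_preference_lists[user].pop(0)  # best remaining item for this member
    # remove the chosen item from everyone's list so it cannot be picked twice
    for member in group_users:
        if user_best in individual_preference_lists[member]:
            individual_preference_lists[member].remove(user_best)
    result.append(user_best)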
    
fairness_df = pd.DataFrame(result, columns=['item']).join(movies_df['title'], on='item')
display(fairness_df)

| | item | title |
|---|---|---|
| 0 | 3451 | guess who's coming to dinner |
| 1 | 1411 | hamlet |
| 2 | 951 | his girl friday |
| 3 | 28 | persuasion |
| 4 | 3836 | kelly's heroes |
| 5 | 7247 | chitty chitty bang bang |
| 6 | 1283 | high noon |
| 7 | 1287 | ben-hur |
| 8 | 3435 | double indemnity |
| 9 | 1041 | secrets & lies |
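The round-robin logic can also be written as a small standalone function, which makes it easier to test on toy data. This is only a sketch: the function name `fairness_ranking`, its parameters, and the toy preference lists are assumptions made for illustration. Unlike the cell above, it also records who chose each item.

```python
def fairness_ranking(preferences, order, k):
    """Round-robin selection: members take turns picking their best remaining item.

    preferences: dict mapping each user to a list of items, best first
    order: list of users giving the turn order
    k: number of turns to play
    """
    # Work on copies so the caller's lists are left untouched
    remaining = {user: list(items) for user, items in preferences.items()}
    picked = []
    for turn in range(k):
        user = order[turn % len(order)]
        if not remaining[user]:
            continue  # this member has nothing left to choose, skip the turn
        best = remaining[user].pop(0)
        picked.append((user, best))
        # Remove the chosen item from every other list so it is not picked twice
        for items in remaining.values():
            if best in items:
                items.remove(best)
    return picked

# Toy preference lists (invented for illustration), best item first
toy_preferences = {
    1: ['A', 'B', 'C', 'D'],
    2: ['A', 'C', 'B', 'D'],
    3: ['D', 'A', 'B', 'C'],
}
toy_result = fairness_ranking(toy_preferences, order=[1, 2, 3], k=4)
print(toy_result)
```

User 2 would also have preferred item A, but user 1 chose first, so user 2 falls back to the next item on his/her list. The result therefore depends on the turn order.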


In [9]:
# To check individual evaluations on a specific item
group_unseen_df.loc[group_unseen_df['item']==3740]

| | user | item | predicted_rating |
|---|---|---|---|


#### EXERCISE

Implement the Approval Voting and Plurality Voting strategies for group recommendations.

#### 3. Explanations for Group Recommenders

Let us now look at a simple way to generate basic explanations for the group recommendation strategies implemented above. For the Additive, Least Misery and Most Pleasure strategies, we will use social-choice based explanations as defined in [Barile et al., 2021](http://ceur-ws.org/Vol-2955/paper11.pdf). For the Fairness strategy, we will use a generic formulation:

- Additive: "i_k has been recommended to the group since it achieves the highest total rating."
- Least Misery: "i_k has been recommended to the group since no group members has a real problem with it."
- Most Pleasure: "i_k has been recommended to the group since it achieves the highest of all individual group members."
- Fairness: "i_k has been recommended to the group since it is the favourite for u_j, and it's his/her turn to choose."

In [10]:
explanations = {
    "ADD" : "<item> has been recommended to the group since it achieves the highest total rating.\n",
    "LMS" : "<item> has been recommended to the group since no group members has a real problem with it.\n",
    "MPL" : "<item> has been recommended to the group since it achieves the highest of all individual group members.\n",
    "FAI" : "<item> has been recommended to the group since it is the favourite for <user>, and it's his/her turn to choose.\n"
}

# Present explanations for the first item of each strategy
movie_title = additive_df['title'].iloc[0]
print("Recommendation: " + movie_title.title())
print(explanations["ADD"].replace("<item>", "The movie \"" + movie_title.title() + "\""))

movie_title = least_misery_df['title'].iloc[0]
print("Recommendation: " + movie_title.title())
print(explanations["LMS"].replace("<item>", "The movie \"" + movie_title.title() + "\""))

movie_title = most_pleasure_df['title'].iloc[0]
print("Recommendation: " + movie_title.title())
print(explanations["MPL"].replace("<item>", "The movie \"" + movie_title.title() + "\""))

movie_title = fairness_df['title'].iloc[0]
user = group_users[0]
print("Recommendation: " + movie_title.title())
print(explanations["FAI"]
      .replace("<item>", "The movie \"" + movie_title.title() + "\"")
      .replace("<user>", "the user with id " + str(user)))

Recommendation: Guess Who'S Coming To Dinner
The movie "Guess Who'S Coming To Dinner" has been recommended to the group since it achieves the highest total rating.

Recommendation: His Girl Friday
The movie "His Girl Friday" has been recommended to the group since no group members has a real problem with it.

Recommendation: Hamlet
The movie "Hamlet" has been recommended to the group since it achieves the highest of all individual group members.

Recommendation: Guess Who'S Coming To Dinner
The movie "Guess Who'S Coming To Dinner" has been recommended to the group since it is the favourite for the user with id 305, and it's his/her turn to choose.
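The repeated `replace` calls can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original notebook: the names `toy_templates` and `explain` are hypothetical, and only two of the templates are copied to keep it short. It uses `string.capwords` instead of `str.title()`, because `str.title()` also capitalizes the letter after an apostrophe, which is why the output above reads "Who'S".

```python
import string

# Two of the templates from the cell above, repeated so the sketch is self-contained
toy_templates = {
    "ADD": "<item> has been recommended to the group since it achieves the highest total rating.",
    "FAI": "<item> has been recommended to the group since it is the favourite for <user>, and it's his/her turn to choose.",
}

def explain(strategy, title, user=None):
    """Fill the template of the given strategy with a movie title and, optionally, a user id."""
    # capwords() capitalizes only the first letter of each whitespace-separated word
    text = toy_templates[strategy].replace("<item>", 'The movie "' + string.capwords(title) + '"')
    if user is not None:
        text = text.replace("<user>", "the user with id " + str(user))
    return text

print(explain("ADD", "guess who's coming to dinner"))
print(explain("FAI", "guess who's coming to dinner", user=305))
```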



##### EXERCISE

Implement the explanation for the Approval Voting and Plurality Voting strategies for group recommendations, and print the corresponding explanation for the best movie for the group.