# Notebook #3: Investigation on Item Provider Fairness

This notebook considers the directors of movies in Movielens as the item providers and investigates how unfairness based on gender groups affects providers' group visibility and exposure with respect to their representation in the item catalog. We also showcase an upsampling strategy that upsamples interactions involving items of a minority group to improve fairness.

**Note:** While gender is by no means a binary construct, to the best of our knowledge no dataset with non-binary genders exists. We therefore consider a binary feature, as offered by the currently publicly available platforms.

## Set up the working environment for this tutorial

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
%cd /content/drive/My Drive/bias-recsys-tutorial/notebooks

In [None]:
import sys 
import os

sys.path.append(os.path.join('..'))

In [None]:
import pandas as pd
import numpy as np
import math

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
from helpers.train_test_splitter import *
from helpers.instances_upsampler import *
from models.pointwise import PointWise
from models.pairwise import PairWise
from models.mostpop import MostPop
from models.random import Random
from helpers.utils import *

In [None]:
data_path = '../data/'

## Data analysis: representation of gender-based groups of providers in the dataset 

Given that we are considering a pre-processing approach, we will go through the whole experimental pipeline again. To this end, we will first load the Movielens 1M dataset and inspect to what extent the different groups are represented in the catalog.

In [None]:
dataset = 'ml1m'          
method = 'utime_pfair'
user_field = 'user_id'
item_field = 'item_id'
rating_field = 'rating'
time_field = 'timestamp'
type_field = 'type_id'

In [None]:
data = pd.read_csv('../data/datasets/' + dataset + '.csv', encoding='utf8')

Now, we also need to append the gender information to the original dataset. We will leverage a csv that, for each item, gives the percentage of directors belonging to each of the two genders. Please note that more than one director can be associated with a movie.

In [None]:
dirgender = pd.read_csv('../data/datasets/' + dataset + '-dir-gender' + '.csv', encoding='utf8')

In [None]:
dirgender.head()

In [None]:
len(dirgender[dirgender['gender_1'] > 0]) / len(dirgender), len(dirgender[dirgender['gender_1'] == 0]) / len(dirgender)

We can observe that the first gender group (female group) is a minority provider group in this dataset, with a representation of 5% in the catalog, while the other gender group (male group) accounts for 83% of the catalog. The two percentages do not sum to 100% because, for some movies, we were not able to retrieve the gender of the respective directors. For simplicity, we assume that items of providers whose gender is unknown are part of the minority group, together with the female group.
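
The gap between the two percentages can be checked directly by counting the missing values. The following is a minimal sketch on a toy table: only the `gender_1` column name comes from this notebook, while the `toy` table and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for dirgender; the values are illustrative, not taken from Movielens.
toy = pd.DataFrame({
    'item_id': [1, 2, 3, 4, 5],
    'gender_1': [1.0, 0.0, 0.0, np.nan, 0.5],
})

n = len(toy)
share_minority = (toy['gender_1'] > 0).sum() / n   # at least one director in the first group
share_majority = (toy['gender_1'] == 0).sum() / n  # no director in the first group
share_unknown = toy['gender_1'].isna().sum() / n   # gender information missing

# NaN fails both comparisons, so only the three shares together add up to 1.
print(share_minority, share_majority, share_unknown)
```

Running the same three counts on `dirgender` would show how much of the catalog falls into the unknown case.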

In [None]:
dirgender['minority'] = dirgender['gender_1'].apply(lambda x: 1.0 if math.isnan(x) else x)
dirgender['majority'] = dirgender['gender_2'].apply(lambda x: 0.0 if math.isnan(x) else x)

In [None]:
del dirgender['gender_1']
del dirgender['gender_2']

In [None]:
len(dirgender[dirgender['minority'] > 0]) / len(dirgender), len(dirgender[dirgender['minority'] == 0]) / len(dirgender)

In [None]:
original_minority_rep = len(dirgender[dirgender['minority'] > 0]) / len(dirgender)

Hence, in this notebook, we will consider a minority group with a representation of 16% in the catalog. 

In [None]:
data = pd.merge(data, dirgender, on='item_id')

In [None]:
data.sample(n=10, random_state=1)
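
Note that `pd.merge` performs an inner join by default, so interactions whose item has no row in the gender table would be dropped silently. A minimal sketch of how this could be checked, using toy tables whose names and values are hypothetical:

```python
import pandas as pd

# Toy interactions and item metadata; names and values are illustrative only.
toy_data = pd.DataFrame({'user_id': [1, 1, 2], 'item_id': [10, 11, 12]})
toy_meta = pd.DataFrame({'item_id': [10, 11], 'minority': [1.0, 0.0]})

# A left join with indicator=True keeps every interaction and flags unmatched ones.
checked = pd.merge(toy_data, toy_meta, on='item_id', how='left', indicator=True)
n_unmatched = int((checked['_merge'] == 'left_only').sum())
print(n_unmatched)
```

If the count is zero on the real data, the inner join used above loses no interactions.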

## Data analysis: providers' group visibility and exposure in recommendations

We will use the same cutoffs we have configured in the first notebook. 

**IMPORTANT BOOKMARK** Please bookmark this point. 

In [None]:
cutoffs = np.array([5, 10, 20])

In [None]:
model_types = ['utime_pairwise', 'utime_random', 'utime_mostpop']

In [None]:
metrics = {}
for model_type in model_types:
    metrics[model_type] = load_obj(os.path.join(data_path, 'outputs/metrics/' + dataset + '_' + model_type + '_metrics.pkl'))

In [None]:
plt.rcParams.update({'font.size': 16.5})
plt.figure(figsize=(30, 7.5))

plt.subplot(131)
plt.title(r'Precision')
plt.xlabel('Cutoff Value')
plt.ylabel('Precision')
for model_type in model_types:
    plt.plot(cutoffs, [np.mean(metrics[model_type]['precision'][k,:]) for k in range(len(cutoffs))], label=model_type)
plt.xticks(cutoffs)
plt.legend()
plt.grid(axis='y')

plt.subplot(132)
plt.title(r'Recall')
plt.xlabel('Cutoff Value')
plt.ylabel('Recall')
for model_type in model_types:
    plt.plot(cutoffs, [np.mean(metrics[model_type]['recall'][k,:]) for k in range(len(cutoffs))], label=model_type)
plt.xticks(cutoffs)
plt.legend()
plt.grid(axis='y')

plt.subplot(133)
plt.title(r'NDCG')
plt.xlabel('Cutoff Value')
plt.ylabel('NDCG')
for model_type in model_types:
    plt.plot(cutoffs, [np.mean(metrics[model_type]['ndcg'][k,:]) for k in range(len(cutoffs))], label=model_type)
plt.xticks(cutoffs)
plt.legend()
plt.grid(axis='y')

plt.tight_layout()

In [None]:
plt.rcParams.update({'font.size': 16.5})
plt.figure(figsize=(30, 7.5))

plt.subplot(121)
plt.title(r'Disparate Visibility')
plt.xlabel('Cutoff Value')
plt.ylabel('Disparate Visibility')
for model_type in model_types:
    plt.plot(cutoffs, [abs(np.mean(metrics[model_type]['visibility'][k,:]) - original_minority_rep) for k in range(len(cutoffs))], label=model_type)
plt.xticks(cutoffs)
plt.legend()
plt.grid(axis='y')

plt.subplot(122)
plt.title(r'Disparate Exposure')
plt.xlabel('Cutoff Value')
plt.ylabel('Disparate Exposure')
for model_type in model_types:
    plt.plot(cutoffs, [abs(np.mean(metrics[model_type]['exposure'][k,:]) - original_minority_rep) for k in range(len(cutoffs))], label=model_type)
plt.xticks(cutoffs)
plt.legend()
plt.grid(axis='y')

plt.tight_layout()
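
The two plots above compare the minority group's share in the recommendations with its share in the catalog. The exact definitions used inside the models are not shown in this notebook, so the following is only a minimal sketch under assumptions: visibility is taken as the fraction of top-k slots filled by minority items, and exposure as the same fraction weighted by a logarithmic position discount. The function `visibility_and_exposure` and the toy inputs are hypothetical.

```python
import numpy as np

def visibility_and_exposure(top_k_items, is_minority):
    """Share of slots (visibility) and of position-discounted weight (exposure)
    given to minority items in one ranked list. Assumed definitions."""
    flags = np.array([is_minority[i] for i in top_k_items], dtype=float)
    weights = 1.0 / np.log2(np.arange(len(top_k_items)) + 2)  # rank 1 gets weight 1.0
    visibility = flags.mean()
    exposure = (flags * weights).sum() / weights.sum()
    return visibility, exposure

# Toy example: items 3 and 7 belong to the minority group.
is_minority = {1: 0, 3: 1, 5: 0, 7: 1, 9: 0}
vis, expo = visibility_and_exposure([1, 5, 3, 9, 7], is_minority)

catalog_share = 0.16  # minority representation reported earlier in this notebook
print(abs(vis - catalog_share), abs(expo - catalog_share))
```

In this toy list the minority items sit at ranks 3 and 5, so their exposure is lower than their visibility: the discount penalizes items placed far from the top.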

## Sample treatment to increase fairness between providers' groups: pre-processing

This part will show how to improve fairness among provider groups by upsampling interactions involving the minority group of providers. This example is a didactic version of the work proposed by Boratto et al. (2020b). 

First, we split the data into train and test sets again and prepare the data needed to initialize a recommendation model.

In [None]:
smode = 'utime_pfair'
train_ratio = 0.80        
min_train_samples = 8
min_test_samples = 2
min_time = None
max_time = None
step_time = 1000

In [None]:
if smode == 'uftime_pfair':
    traintest = fixed_timestamp(data, min_train_samples, min_test_samples, min_time, max_time, step_time, user_field, item_field, time_field, rating_field)
elif smode == 'utime_pfair':
    traintest = user_timestamp(data, train_ratio, min_train_samples+min_test_samples, user_field, item_field, time_field)
elif smode == 'urandom_pfair':
    traintest = user_random(data, train_ratio, min_train_samples+min_test_samples, user_field, item_field)

In [None]:
train = traintest[traintest['set']=='train'].copy()
test = traintest[traintest['set']=='test'].copy()
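
The splitting helpers are not shown in this notebook. As a rough illustration of a per-user temporal split, the sketch below labels each user's oldest interactions as train and the most recent ones as test. The function `split_per_user_by_time` is hypothetical, it omits the minimum-interaction filter, and it is not claimed to match `user_timestamp`.

```python
import pandas as pd

def split_per_user_by_time(df, train_ratio, user_col='user_id', time_col='timestamp'):
    """Label each user's oldest interactions as train and the most recent as test."""
    df = df.sort_values([user_col, time_col]).copy()
    rank = df.groupby(user_col).cumcount()                   # 0-based position in the user's history
    size = df.groupby(user_col)[user_col].transform('size')  # number of interactions per user
    df['set'] = 'test'
    df.loc[rank < (size * train_ratio).astype(int), 'set'] = 'train'
    return df

# Toy data: two users with five interactions each, in shuffled time order.
toy = pd.DataFrame({
    'user_id': [1] * 5 + [2] * 5,
    'item_id': list(range(10)),
    'timestamp': [5, 1, 3, 2, 4, 10, 30, 20, 50, 40],
})
toy_split = split_per_user_by_time(toy, train_ratio=0.8)
print(toy_split['set'].value_counts().to_dict())
```

Splitting by time within each user keeps the test interactions strictly later than that user's training interactions, which mimics predicting future behaviour.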

In [None]:
len(train[train['minority'] > 0]) / len(train), len(train[train['minority'] == 0]) / len(train)

In [None]:
users = list(np.unique(traintest[user_field].values))
items = list(np.unique(traintest[item_field].values))

In [None]:
items_metadata = traintest.drop_duplicates(subset=['item_id'], keep='first')
category_per_item = items_metadata[type_field].values

Then, we identify the set of items belonging to the minority group and we run the upsampling strategy. 

In [None]:
items_w_min = np.unique(traintest[traintest['minority'] > 0]['item_id'].values)
items_w_maj = np.unique(traintest[traintest['minority'] == 0]['item_id'].values)

In [None]:
items_map = traintest.drop_duplicates(subset='item_id', keep='first')
item_group = {i: (0.0 if v > 0 else 1.0) for i, v in zip(items_map['item_id'].values, items_map['minority'].values)}

In [None]:
umode = 'fake'
utarget = 0.30

In [None]:
if umode == 'real':
    train = real(train, 'minority', target=utarget)
if umode == 'fake':
    train = fake(train, 'minority', items_w_min, target=utarget)
if umode == 'fakebypop':
    train = fakeByPop(train, 'minority', items_w_min, target=utarget)

In [None]:
train.sample(n=10, random_state=1)

Hence, we upsampled the representation of the minority group in the training interactions toward the target set by `utarget`.
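
The helpers `real`, `fake`, and `fakeByPop` are not shown in this notebook. To give an idea of what upsampling with synthetic interactions might look like, here is a minimal sketch: it appends random (user, minority item) pairs until the minority share of the interactions reaches the target. The function `upsample_with_synthetic` is hypothetical and is not claimed to reproduce the behaviour of `fake`.

```python
import math
import numpy as np
import pandas as pd

def upsample_with_synthetic(train, minority_items, target, seed=0):
    """Append synthetic (user, minority item) pairs until the minority share reaches target."""
    rng = np.random.default_rng(seed)
    n, m = len(train), int((train['minority'] > 0).sum())
    # Solve (m + x) / (n + x) >= target for the number x of rows to add.
    x = max(0, math.ceil((target * n - m) / (1.0 - target)))
    # Random pairs may repeat an existing interaction; this sketch does not filter them.
    extra = pd.DataFrame({
        'user_id': rng.choice(train['user_id'].unique(), size=x),
        'item_id': rng.choice(minority_items, size=x),
        'minority': 1.0,
    })
    return pd.concat([train, extra], ignore_index=True)

# Toy training set: 1 of 10 interactions involves a minority item (id 99).
toy_train = pd.DataFrame({
    'user_id': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    'item_id': [10, 11, 10, 12, 11, 13, 10, 11, 12, 99],
    'minority': [0.0] * 9 + [1.0],
})
upsampled = upsample_with_synthetic(toy_train, minority_items=[99], target=0.30)
share = (upsampled['minority'] > 0).mean()
print(len(upsampled), round(share, 3))
```

The same share computation, `(train['minority'] > 0).mean()`, can be run on the upsampled `train` to verify the proportion actually reached.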

In [None]:
train['rating'] = 1.0

Now, we can run another instance of the pairwise algorithm. 

In [None]:
model_type = 'pairwise'
model = PairWise(users, items, train, test, category_per_item, item_field, user_field, rating_field)

In [None]:
model.train(no_epochs=5)

In [None]:
model.predict()

In [None]:
predictions = model.get_predictions()

In [None]:
save_obj(predictions, os.path.join(data_path, 'outputs/predictions/' + dataset + '_' + smode + '_' + model_type + '_scores.pkl'))

Now, we need to specify how items are associated with provider groups.

In [None]:
save_obj(item_group, os.path.join(data_path, 'datasets', 'ml1m-item-group'))

In [None]:
model.test(item_group=item_group, cutoffs=cutoffs)

In [None]:
metrics = model.get_metrics()

In [None]:
save_obj(metrics, os.path.join(data_path, 'outputs/metrics/' + dataset + '_' + smode + '_' + model_type + '_metrics.pkl'))

In [None]:
model.show_metrics(index_k=int(np.where(cutoffs == 10)[0][0]))

Now, we go back to the **IMPORTANT BOOKMARK** mentioned above, set cutoffs = np.array([5, 10]), and add 'utime_pfair_pairwise' to the model_types list. Then, we can rerun all the plotting cells to compare the results obtained with this strategy against those of the baseline recommendation algorithms.