# Mitigating bias in recommender systems

This tutorial introduces fairness in recommender systems. A recommender system aims to recommend the items that best match each user's preferences. Here we focus on the task of correctly predicting users' music preferences.

A recommender system can be biased in multiple ways. For example, we may be concerned that the artists in our database will not get equal representation (item fairness). Alternatively, our main concern may be that different groups of users (e.g. male/female users) will get different music recommendations (user fairness). In the following, we show how to explore the data for fairness and how to measure these types of fairness using the holisticai library.

## Importing the data

We start by importing the example dataset, which is included in the library. The [dataset](https://www.kaggle.com/datasets/ravichaubey1506/lastfm) contains the artists downloaded by each user, together with personal information about the user, specifically sex and country of origin. A user can download more than one artist. We add a "score" column that is set to 1 for every row, so that each row counts as one interaction.

In [1]:
# sys path
import sys
sys.path.append('../../')

In [2]:
import numpy as np
import pandas as pd
import sys
sys.path.append('../')
from holisticai.datasets import load_last_fm

bunch = load_last_fm()
lastfm = bunch['frame']
lastfm['score'] = 1
lastfm

,user,artist,sex,country,score
0,1.0,red hot chili peppers,f,Germany,1
1,1.0,the black dahlia murder,f,Germany,1
2,1.0,goldfrapp,f,Germany,1
3,1.0,dropkick murphys,f,Germany,1
4,1.0,le tigre,f,Germany,1
...,...,...,...,...,...
289950,19718.0,bob dylan,f,Canada,1
289951,19718.0,pixies,f,Canada,1
289952,19718.0,the clash,f,Canada,1
289953,19718.0,a tribe called quest,f,Canada,1


We now need to convert the dataframe into an interaction matrix, where every row is a user and every column is an artist. We can use the formatting function provided in the library; its output dataframe can be passed directly to the bias metric functions for recommenders.

In [3]:
# import formatters
from holisticai.utils import recommender_formatter

# Each interaction results in a non-nan entry in the dataframe.
df_pivot, p_attr = recommender_formatter(lastfm, users_col='user', groups_col='sex', items_col='artist', scores_col='score', aggfunc='mean')

In [4]:
df_pivot

user,...and you will know us by the trail of dead,2pac,3 doors down,30 seconds to mars,311,36 crazyfists,44,50 cent,65daysofstatic,Edith piaf,...,weezer,wilco,within temptation,wolfgang amadeus mozart,wu-tang clan,yann tiersen,yeah yeah yeahs,yellowcard,yo la tengo,zero 7
1.0,,,,,,,,,,,...,,,,,,,,,,
3.0,,,,,,,,,,,...,,,,,,,,,,
4.0,,,,,,,,,,,...,,,,,,,,,,
5.0,,,,,,,,,,,...,,,,,,,,,,
6.0,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19713.0,,,,,,,,,,,...,,,,,,,,,,
19714.0,,,,,,,,,,,...,,,,,,,,,,
19715.0,,,,,,,,,,,...,,,,1.0,,,,,,
19717.0,,,,,,,,,,,...,,,,,,,,,,


In [5]:
print(f'Number of Unique Users : {df_pivot.shape[0]}')
print(f'Number of Unique Artists : {df_pivot.shape[1]}')

Number of Unique Users : 15000
Number of Unique Artists : 1004


## Train a Model

There are many ways to recommend artists to users. We will use item-based collaborative filtering, since it is one of the simplest and most intuitive approaches. For each artist, we compute its similarity to every other artist. We then recommend artists to a user by looking at which artists they already like and choosing the artists most similar to those.
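
The idea can be sketched on a toy matrix before applying it to the real data. This is an illustrative sketch only: the 3x4 matrix and the helper name `recommend` are hypothetical, and similarity is taken to be the raw co-occurrence count between items.

```python
import numpy as np

# Rows are users, columns are items; 1 means the user interacted with the item.
toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

# Item-item similarity as co-occurrence counts (dot products of item columns).
item_sim = toy.T @ toy

def recommend(interactions, similarity, user, k):
    """Return the indices of the k unseen items most similar to the user's liked items."""
    liked = np.nonzero(interactions[user])[0]
    # Sum the similarity rows of every liked item to score all candidate items.
    scores = similarity[liked].sum(axis=0).astype(float)
    # Never recommend something the user already has.
    scores[liked] = -np.inf
    return np.argsort(scores)[::-1][:k]

print(recommend(toy, item_sim, user=0, k=1))
```

User 0 likes items 0 and 1, which co-occur with item 2 through user 1, so item 2 is ranked first.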

In [6]:
index_to_artist = dict(zip(range(len(df_pivot.columns)),df_pivot.columns))
artist_to_index = dict(zip(df_pivot.columns,range(len(df_pivot.columns))))
user_gender_dict = dict(zip(df_pivot.index, p_attr))

In [7]:
data_matrix = df_pivot.fillna(0).to_numpy()
data_matrix

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])

In [8]:
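# Import linear_kernel
from sklearn.metrics.pairwise import linear_kernel
# Compute item-item similarity as dot products of the item columns (co-occurrence counts).
# linear_kernel does not normalize, so this equals cosine similarity only for L2-normalized columns.
cosine_sim = linear_kernel(data_matrix.T, data_matrix.T)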
cosine_sim.shape

(1004, 1004)
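
Note that `linear_kernel` returns plain dot products, so for a binary interaction matrix the entries are co-occurrence counts, which favour popular artists. The sketch below, on a hypothetical two-item matrix with an illustrative helper `item_cosine`, shows how the values change once each item column is L2-normalized.

```python
import numpy as np

# Item a was downloaded by 4 users; item b by only one of those users.
toy = np.array([
    [1.0, 1.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
])

# Unnormalized similarity: what a linear (dot-product) kernel computes.
dot_sim = toy.T @ toy

def item_cosine(interactions):
    """Cosine similarity between item columns (columns with no interactions stay zero)."""
    norms = np.linalg.norm(interactions, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty columns
    normalized = interactions / norms
    return normalized.T @ normalized

cos_sim = item_cosine(toy)
print(dot_sim)
print(cos_sim)
```

Whether to normalize is a design choice: dot products keep the popularity signal, while cosine similarity removes it.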

In [9]:
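def display_items(arr):
    # Map item indices back to artist names.
    return [index_to_artist[x] for x in arr]

def items_liked_by_user(data_matrix, u):
    # Indices of the items that user u has interacted with.
    return np.nonzero(data_matrix[u])[0]

def recommended_items(data_matrix, similarity_matrix, u, k):
    liked = items_liked_by_user(data_matrix, u)
    # Score every item by its total similarity to the items the user liked.
    arr = np.sum(similarity_matrix[liked,:], axis=0)
    # Zero out items the user already has so they rank last.
    arr[liked] = 0
    # Indices of the k highest-scoring items (in ascending score order).
    return np.argsort(arr)[-k:]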

The resulting recommendations all make sense to a human evaluator.

In [10]:
def explode(arr, num_items):
    out = np.zeros(num_items)
    out[arr] = 1
    return out

new_recs = [explode(recommended_items(data_matrix, cosine_sim, u, 10), len(df_pivot.columns)) for u in range(df_pivot.shape[0])]
new_df_pivot = pd.DataFrame(new_recs, columns = df_pivot.columns)
new_df_pivot

artist,...and you will know us by the trail of dead,2pac,3 doors down,30 seconds to mars,311,36 crazyfists,44,50 cent,65daysofstatic,Edith piaf,...,weezer,wilco,within temptation,wolfgang amadeus mozart,wu-tang clan,yann tiersen,yeah yeah yeahs,yellowcard,yo la tengo,zero 7
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14995,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
14996,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
14997,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
14998,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Evaluate Bias of Model

We will now show how we can calculate various metrics of fairness for recommender systems. In this example, we will cover both metrics for item fairness and for user fairness (equality of outcome).
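
As a rough illustration of what user fairness can mean here, the sketch below compares how often each item is recommended to two user groups. It is a simplified assumption for illustration, not a holisticai metric: the function name `group_exposure_gap` and the toy inputs are hypothetical.

```python
import numpy as np

def group_exposure_gap(mat_pred, groups, group_a, group_b):
    """Per-item difference in recommendation rate between two user groups."""
    groups = np.asarray(groups)
    # Share of users in each group who were recommended each item.
    rate_a = mat_pred[groups == group_a].mean(axis=0)
    rate_b = mat_pred[groups == group_b].mean(axis=0)
    return rate_a - rate_b

# Toy example: 3 users, 2 items; True means the item was recommended.
toy_pred = np.array([
    [True, False],
    [True, False],
    [False, True],
])
toy_groups = ["f", "f", "m"]

gap = group_exposure_gap(toy_pred, toy_groups, "f", "m")
print(gap)
```

Values near zero mean both groups see an item about equally often; large positive or negative values indicate that one group is recommended the item far more than the other.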

### Non Negative Matrix Factorization

In [11]:
from holisticai.utils.models.recommender.matrix_factorization.non_negative import NonNegativeMF
from holisticai.bias.metrics import recommender_bias_metrics

mf = NonNegativeMF(K=40)
mf.fit(data_matrix)

rankings = mf.predict(data_matrix, top_n=10)
mat = rankings.pivot(index='X', columns='Y', values='score').fillna(0).to_numpy()
recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')

Metric,Value,Reference
Aggregate Diversity,1.0,1
GINI index,0.644667,0
Exposure Distribution Entropy,5.426239,-
Average Recommendation Popularity,817.774267,-
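
To build intuition for the item-based metrics, here is a minimal NumPy sketch of three of them computed from a boolean recommendation matrix (users by items). These are common textbook definitions and are an assumption here: the library's exact formulas (for example the logarithm base or normalization) may differ, and the function names are illustrative. The sketch assumes at least one recommendation is present.

```python
import numpy as np

def aggregate_diversity(mat_pred):
    """Fraction of catalogue items recommended to at least one user (1 is ideal)."""
    return float(np.mean(mat_pred.sum(axis=0) > 0))

def exposure_gini(mat_pred):
    """Gini coefficient of per-item recommendation counts (0 means perfectly even)."""
    counts = np.sort(mat_pred.sum(axis=0).astype(float))
    n = counts.size
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * counts) / (n * counts.sum()))

def exposure_entropy(mat_pred):
    """Shannon entropy (natural log) of the item exposure distribution."""
    counts = mat_pred.sum(axis=0).astype(float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)))

# Even exposure: each of 4 users is recommended a different item.
even = np.eye(4, dtype=bool)
# Skewed exposure: all 3 users are recommended the same single item.
skewed = np.zeros((3, 4), dtype=bool)
skewed[:, 0] = True

print(aggregate_diversity(even), exposure_gini(even), exposure_entropy(even))
print(aggregate_diversity(skewed), exposure_gini(skewed), exposure_entropy(skewed))
```

In the even case diversity is 1 and the Gini coefficient is 0, matching the reference values in the table; the skewed case shows how concentration on one item lowers diversity and entropy and raises the Gini coefficient.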


### Debiasing Learning Matrix Factorization

In [None]:
from holisticai.bias.mitigation import DebiasingLearningMF

mf = DebiasingLearningMF(K=40, normalization='Vanilla', lamda=0.08, metric='mse', bias_mode='Regularized', seed=1)
mf.fit(data_matrix)

rankings = mf.predict(data_matrix, top_n=10)
mat = rankings.pivot(index='X', columns='Y', values='score').fillna(0).to_numpy()
recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')

### Blind Spot Aware Matrix Factorization

In [None]:
from holisticai.bias.mitigation import BlindSpotAwareMF

mf = BlindSpotAwareMF(K=40, beta=0.02, steps=10, alpha=0.002, lamda=0.008, verbose=1)
mf.fit(data_matrix)

rankings = mf.predict(data_matrix, top_n=10)
mat = rankings.pivot(index='X', columns='Y', values='score').fillna(0).to_numpy()
recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')

### Popularity Propensity Matrix Factorization

In [None]:
from holisticai.bias.mitigation import PopularityPropensityMF

mf = PopularityPropensityMF(K=40, beta=0.02, steps=100, verbose=1)
mf.fit(data_matrix)

rankings = mf.predict(data_matrix, top_n=10)
mat = rankings.pivot(index='X', columns='Y', values='score').fillna(0).to_numpy()
recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')

### FairRec

In [12]:
from holisticai.bias.mitigation import FairRec
from holisticai.bias.metrics import recommender_bias_metrics

fr = FairRec(rec_size=10, MMS_fraction=0.5)
fr.fit(data_matrix)

# Score FairRec's own recommendations.
recommendations = fr.predict(data_matrix, top_n=10)
mat = recommendations.pivot(index='X', columns='Y', values='score').fillna(0).to_numpy()
recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')