# **Bias mitigation with "Popularity propensity" and "Two-sided fairness"**

This demo shows how to apply the "Popularity propensity" and "Two-sided fairness" methods to improve fairness in recommender systems.

- [Popularity propensity](#Method-Popularity-propensity)
  - [Traditional implementation](#traditional-implementation)
  - [Pipeline implementation](#Pipeline-implementation)
- [Two-sided fairness](#Method-Two-sided-fairness)
  - [Traditional implementation](#Traditional-implementation-for-FairRec)
  - [Pipeline implementation](#Pipeline-implementation-for-FairRec)

First, install the `holisticai` package if you haven't already:
```bash
pip install "holisticai[all]"
```
Then, import the necessary libraries.

In [2]:
import numpy as np
import pandas as pd
from holisticai.datasets import load_dataset
from holisticai.bias.metrics import recommender_bias_metrics
from holisticai.bias.mitigation import PopularityPropensityMF

np.random.seed(0)
import warnings
warnings.filterwarnings("ignore")

Load the preprocessed "LastFM" dataset.

In [3]:
dataset = load_dataset('lastfm')
df_pivot, p_attr = dataset['data_pivot'], dataset['p_attr']

In [4]:
def explode(arr, num_items):
    out = np.zeros(num_items)
    out[arr] = 1
    return out
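
As a quick illustration of the `explode` helper above, the sketch below converts a list of recommended item indices into a 0/1 indicator vector over the whole catalogue. The six-item catalogue and the chosen indices are hypothetical toy values, not taken from the LastFM data.

```python
import numpy as np

def explode(arr, num_items):
    """Turn an array of item indices into a 0/1 indicator vector of length num_items."""
    out = np.zeros(num_items)
    out[arr] = 1
    return out

# Hypothetical user who was recommended items 0, 3 and 4 out of a 6-item catalogue.
vector = explode(np.array([0, 3, 4]), 6)
print(vector)  # [1. 0. 0. 1. 1. 0.]
```

Stacking one such vector per user gives a users-by-items matrix in the same layout as `df_pivot`.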

## **Bias mitigation**

### **Method: Popularity propensity**

### **Traditional implementation**

First, we will show the traditional implementation of the "Popularity Propensity" method.

In [5]:
mf = PopularityPropensityMF(K=40, beta=0.02, steps=100, verbose=1)
data_matrix = df_pivot.fillna(0).to_numpy()
mf.fit(data_matrix)

In [5]:
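def recommended_items(model, data_matrix, k):
    # Items the user has already interacted with are not candidates.
    recommended_items_mask = data_matrix > 0
    candidate_index = ~recommended_items_mask
    # Zero out the predicted scores of already-seen items.
    candidate_rating = model.pred * candidate_index
    # Indices of the k highest-scoring candidates for each user.
    return np.argsort(-candidate_rating, axis=1)[:, :k]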
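
The masking step in `recommended_items` multiplies the predictions by a boolean mask, so already-seen items get a score of exactly 0. The sketch below reproduces that behaviour on toy data and compares it with a variant that masks with `-inf` instead. The variant `recommended_items_masked`, the `SimpleNamespace` stand-in for a fitted model, and all numbers are assumptions made for illustration. Whether `PopularityPropensityMF` can actually produce negative predictions is not stated in this demo.

```python
from types import SimpleNamespace

import numpy as np

def recommended_items(model, data_matrix, k):
    # Same logic as in the demo: seen items are multiplied by 0.
    seen = data_matrix > 0
    candidate_rating = model.pred * ~seen
    return np.argsort(-candidate_rating, axis=1)[:, :k]

def recommended_items_masked(model, data_matrix, k):
    # Hypothetical variant: seen items get -inf, so they always rank last.
    seen = data_matrix > 0
    candidate_rating = np.where(seen, -np.inf, model.pred)
    return np.argsort(-candidate_rating, axis=1)[:, :k]

# Toy interaction matrix (2 users x 5 items); values > 0 mean "already seen".
data = np.array([[5.0, 0.0, 0.0, 3.0, 0.0],
                 [0.0, 4.0, 0.0, 0.0, 0.0]])

# Stand-in for a fitted model exposing a `pred` matrix; the second user
# has some negative predicted scores.
toy_model = SimpleNamespace(pred=np.array([[4.8, 2.0, 3.5, 3.1, 1.5],
                                           [1.0, 4.2, -0.5, 2.5, -1.0]]))

top3_multiply = recommended_items(toy_model, data, 3)
top3_masked = recommended_items_masked(toy_model, data, 3)
print(top3_multiply)
print(top3_masked)
```

For the first user both versions agree. For the second user, the multiplication version ranks the already-seen item 1 (score 0) above unseen items with negative predictions, while the `-inf` version never returns a seen item as long as enough unseen items exist.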

In [6]:
new_items = recommended_items(mf, data_matrix, 10)
new_recs = [explode(new_items[u], len(df_pivot.columns)) for u in range(df_pivot.shape[0])]
new_df_pivot_db = pd.DataFrame(new_recs, columns = df_pivot.columns)

mat = new_df_pivot_db.replace(0,np.nan).to_numpy()
df_popularity = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_popularity

| Metric | Value | Reference |
|---|---|---|
| Aggregate Diversity | 0.999004 | 1 |
| GINI index | 0.440891 | 0 |
| Exposure Distribution Entropy | 6.579432 | - |
| Average Recommendation Popularity | 278.3216 | - |
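
To give some intuition for the item-based metrics reported above, here is a minimal sketch that computes three of them on a toy 0/1 recommendation matrix using common textbook definitions. This is not `holisticai`'s implementation: the function names, the natural-log base for the entropy, and the toy matrix are all assumptions, and the library may differ in normalisation or edge-case handling.

```python
import numpy as np

def aggregate_diversity(rec):
    # Fraction of catalogue items recommended to at least one user (ideal: 1).
    return (rec.sum(axis=0) > 0).mean()

def gini_index(rec):
    # Gini coefficient of per-item exposure counts (ideal: 0, i.e. equal exposure).
    counts = np.sort(rec.sum(axis=0))
    n = counts.size
    ranks = np.arange(1, n + 1)
    return ((2 * ranks - n - 1) @ counts) / (n * counts.sum())

def exposure_entropy(rec):
    # Shannon entropy (natural log, an assumption) of the exposure distribution.
    counts = rec.sum(axis=0)
    p = counts[counts > 0] / counts.sum()
    return -(p * np.log(p)).sum()

# Toy matrix: 3 users x 4 items, two recommendations per user.
rec = np.array([[1, 1, 0, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 0]])

print(aggregate_diversity(rec))  # 0.75: item 3 is never recommended
print(gini_index(rec))
print(exposure_entropy(rec))
```

Here item 0 takes half of all recommendation slots, which pushes the Gini index away from 0. A matrix in which every item is recommended equally often would give a Gini index of 0 under this definition.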


### **Pipeline implementation**

In [7]:
from holisticai.pipeline import Pipeline

inprocessing_model = PopularityPropensityMF(K=40, beta=0.02, steps=100, verbose=1)

pipeline = Pipeline(
    steps=[
        ("bm_inprocessing", inprocessing_model),
    ]
)

pipeline.fit(data_matrix)

rankings = pipeline.predict(data_matrix, top_n=10)
mat = rankings.pivot(columns='Y', index='X', values='score').replace(np.nan, 0).to_numpy()
df = recommender_bias_metrics(mat_pred=mat > 0, metric_type='item_based')
df_pop_pipeline = df.copy()
df_pop_pipeline

| Metric | Value | Reference |
|---|---|---|
| Aggregate Diversity | 1.0 | 1 |
| GINI index | 0.441953 | 0 |
| Exposure Distribution Entropy | 6.578349 | - |
| Average Recommendation Popularity | 275.996493 | - |
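
The pipeline cell above reshapes the long-format `rankings` table into a users-by-items matrix before computing the metrics. The sketch below shows that reshaping on a tiny hand-made table. The column names `X`, `Y`, and `score` come from the `pivot` call in the demo; the rows themselves are hypothetical, and `fillna(0)` is used here as an equivalent of `replace(np.nan, 0)`.

```python
import pandas as pd

# Hypothetical long-format rankings: one row per (user X, item Y) recommendation.
rankings = pd.DataFrame({
    "X": [0, 0, 1, 1],
    "Y": [2, 1, 3, 0],
    "score": [3.5, 2.0, 2.5, 1.0],
})

# One row per user, one column per recommended item; missing pairs become NaN.
wide = rankings.pivot(columns="Y", index="X", values="score")

# Replace NaN with 0, then threshold to get a boolean "was recommended" mask.
mat = wide.fillna(0).to_numpy()
rec_mask = mat > 0
print(rec_mask)
```

Two details are worth keeping in mind with this approach: an item that is never recommended to any user does not appear as a column after the pivot, and the `> 0` test would also discard any recommendation whose score is zero or negative.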


### **Method: Two-sided fairness**

### **Traditional implementation for FairRec**

Now, we will show the traditional implementation of the "Two-sided fairness" method.

In [8]:
from holisticai.bias.mitigation import FairRec

fr = FairRec(rec_size=10, MMS_fraction=0.5)
fr.fit(data_matrix)
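
For intuition about the `MMS_fraction` argument, the sketch below works through the exposure guarantee described in the FairRec literature: with `m` users, `n` items, and `k` recommendations per user, each item is guaranteed at least `floor(alpha * m * k / n)` recommendation slots. It assumes that `MMS_fraction` plays the role of `alpha`, which this demo does not confirm; the helper name and the catalogue sizes are hypothetical.

```python
import math

def min_exposure_guarantee(num_users, num_items, rec_size, mms_fraction):
    # Maximin-share style lower bound on the number of slots each item receives.
    return math.floor(mms_fraction * num_users * rec_size / num_items)

# Hypothetical sizes: 1000 users, 500 items, 10 recommendations per user.
print(min_exposure_guarantee(1000, 500, 10, 0.5))  # 10
print(min_exposure_guarantee(1000, 500, 10, 1.0))  # 20
```

Under this reading, a larger `MMS_fraction` shifts the balance toward item-side (producer) fairness, while a smaller value leaves more slots to be filled purely by user relevance.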

In [9]:
recommendations = fr.recommendation
new_recs = [explode(recommendations[key], len(df_pivot.columns)) for key in recommendations.keys()]

new_df_pivot_db = pd.DataFrame(new_recs, columns = df_pivot.columns)

mat = new_df_pivot_db.replace(0,np.nan).to_numpy()

df_tsf = recommender_bias_metrics(mat_pred=mat, metric_type='item_based')
df_tsf

| Metric | Value | Reference |
|---|---|---|
| Aggregate Diversity | 1.0 | 1 |
| GINI index | 0.421428 | 0 |
| Exposure Distribution Entropy | 6.567894 | - |
| Average Recommendation Popularity | 317.154227 | - |


### **Pipeline implementation for FairRec**

In [10]:
from holisticai.pipeline import Pipeline

inprocessing_model = FairRec(rec_size=10, MMS_fraction=0.5)

pipeline = Pipeline(
    steps=[
        ("bm_inprocessing", inprocessing_model),
    ]
)

pipeline.fit(data_matrix)

rankings  = pipeline.predict(data_matrix, top_n=10)
mat = rankings.pivot(columns='Y',index='X',values='score').replace(np.nan,0).to_numpy()
df_tsf_pipeline = recommender_bias_metrics(mat_pred=mat>0, metric_type='item_based')
df_tsf_pipeline

| Metric | Value | Reference |
|---|---|---|
| Aggregate Diversity | 1.0 | 1 |
| GINI index | 0.421428 | 0 |
| Exposure Distribution Entropy | 6.567894 | - |
| Average Recommendation Popularity | 317.154227 | - |
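
To compare the four runs side by side, one option is to collect the reported metrics into a single table. The sketch below does this with the values copied from the outputs in this demo; the column labels are chosen here for readability and are not produced by `holisticai`.

```python
import pandas as pd

metrics = [
    "Aggregate Diversity",
    "GINI index",
    "Exposure Distribution Entropy",
    "Average Recommendation Popularity",
]

# Values copied from the four output tables in this demo.
comparison = pd.DataFrame(
    {
        "Popularity propensity": [0.999004, 0.440891, 6.579432, 278.3216],
        "Popularity propensity (pipeline)": [1.0, 0.441953, 6.578349, 275.996493],
        "Two-sided fairness": [1.0, 0.421428, 6.567894, 317.154227],
        "Two-sided fairness (pipeline)": [1.0, 0.421428, 6.567894, 317.154227],
    },
    index=metrics,
)
print(comparison)
```

In these runs, the two-sided fairness method gives a slightly lower GINI index than the popularity propensity method, but a higher average recommendation popularity. The traditional and pipeline implementations of two-sided fairness report identical values.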
