# Contents

- **User Fairness in Recommendation Systems**  
  1. Introduction  
  2. Fairness in Recommendation Systems  
     2.1. Causes of Unfairness  
  3. Dimensions of Fairness in Recommendation Systems  
     3.1. Fairness Based on the Type  
     3.2. Fairness Based on the Subject  
     3.3. Fairness Based on the Target  
  4. Fairness Perspectives and Underlying Concepts  
     4.1. Consistent Fairness  
     4.2. Calibrated Fairness  
     4.3. Envy-Free Fairness  
     4.4. Rawlsian Maximin Fairness  
     4.5. Maximin-Shared Fairness  
     4.6. Counterfactual Fairness  
  5. Conclusion of the User Fairness Taxonomy  
  - References

# User fairness in recommendation systems

## 1. Introduction  
Recommender systems are intelligent software tools that help users navigate vast information landscapes by predicting and suggesting content aligned with their preferences, interests, and needs (Roy & Dutta, 2022). As digital platforms grow, these systems play a crucial role in mitigating information overload and enhancing user decision-making (Shah et al., 2016). They analyze explicit data, such as ratings, and implicit signals, such as viewing time and click behavior, to generate personalized recommendations using content-based filtering, collaborative filtering, and hybrid approaches (Roy & Dutta, 2022).  
The impact of recommender systems extends across industries. In e-commerce, they personalize shopping experiences, boosting engagement and sales (Roy & Dutta, 2022). Streaming platforms rely on them to improve user retention through tailored content recommendations (Shah et al., 2016). Their applications also extend to social media content management, healthcare treatment planning, tourism recommendations, and education via adaptive learning materials (Roy & Dutta, 2022).  
As demand for personalization increases, recommender systems have become integral to digital services, influencing user engagement, retention, and business outcomes (Shah et al., 2016). However, their growing influence raises critical concerns regarding fairness, particularly in how they impact different user groups and stakeholders (Boratto et al., 2022). Addressing fairness challenges in recommendation systems is essential to ensuring equitable experiences and responsible deployment in various domains.

## 2. Fairness in Recommendation Systems  
Fairness in recommender systems ensures equitable treatment for users, items, and stakeholders, by preventing favoritism and discrimination (Jin et al., 2023; Wu et al., 2023). However, fairness is context-dependent, varying based on system objectives—such as ensuring equal exposure for items, reducing disparities among user groups, or balancing multi-stakeholder interests (Li et al., 2023). While fairness broadly implies equal treatment, its definition requires refinement in recommendation contexts (Wu et al., 2023).  
The significance of fairness arises from its influence on decision-making, shaping what content users engage with and how items gain visibility (Li et al., 2024). Unfair recommendations can reinforce societal biases, limit opportunities for underrepresented groups, and undermine trust in platforms (Jin et al., 2023). For instance, in movie recommendations, a system that primarily promotes blockbuster films may systematically reduce exposure for smaller or niche films, disadvantaging users with diverse preferences and limiting visibility for smaller content creators. Addressing fairness helps create a more inclusive recommendation environment, ensuring diverse and relevant content reaches all users equitably.

### 2.1. Causes of Unfairness  
Unfairness in recommender systems occurs when the processes used to generate and distribute recommendations lead to systematically unequal treatment of users, items, or stakeholders (Li et al., 2023). Algorithms may inadvertently prioritize certain content or user groups, creating disparities in visibility, access, and overall user experience. This imbalance is not limited to individual users; it also affects content visibility, influencing access to opportunities such as educational resources and product recommendations (Tommasel & Assent, 2023).  
Without targeted fairness interventions, systems tend to amplify popularity bias (i.e. favoring widely consumed content while limiting exposure for niche or new items), thereby reinforcing existing inequalities (Li et al., 2023; Wang & Liu, 2022; Li et al., 2021; Jin et al., 2023). Addressing these issues requires a balanced approach that combines equality, by giving all users and items equal exposure, with equity, which adjusts for structural disadvantages (Wu et al., 2023).  

#### 2.1.1. Different Types of Biases  
Recommender systems, as data-driven technologies, are prone to biases that create systemic disadvantages and reinforce inequalities in information access (Li et al., 2021; Li et al., 2023). These biases emerge at multiple stages—data collection, model training, and evaluation—leading to unfair outcomes (Jin et al., 2023; Tommasel & Assent, 2023).  
Data bias occurs when training datasets reflect societal inequalities or imbalanced user interactions (Chen et al., 2021; Li et al., 2023). This includes exposure bias, where users primarily see a subset of available content, and historical bias, such as gender disparities in recommendations (Jin et al., 2023; Vassøy & Langseth, 2024). Algorithmic bias further exacerbates unfairness—models optimizing for engagement or accuracy often amplify popularity bias, favoring frequently interacted-with items while reducing visibility for equally relevant but less popular content (Jin et al., 2023; Wang & Liu, 2022). Similarly, collaborative filtering tends to prioritize active users, reinforcing group-based disparities (Li et al., 2021).  
These biases are intensified by feedback loops, where biased recommendations perpetuate existing patterns, further marginalizing underrepresented users and items (Jin et al., 2023; Wang & Liu, 2022). This contributes to filter bubbles and information cocoons, restricting diversity in recommendations (Tommasel & Assent, 2023). Addressing these biases requires balancing accuracy, diversity, and inclusivity through adaptive, long-term interventions (Wu et al., 2023; Vassøy & Langseth, 2024).

#### 2.1.2. Cold-Start Problems  
Cold-start problems arise when recommender systems lack sufficient data on new users or items, making personalized recommendations difficult (Jin et al., 2023; Cohen et al., 2017; Vartak et al., 2017). Since models rely on historical interactions, new users often receive generic, popularity-based suggestions, disadvantaging them compared to established users with richer interaction histories (Li et al., 2021).  
A key fairness concern is that cold-start users inherit biases from warm-start data, perpetuating existing imbalances and leading to less relevant or even discriminatory recommendations (Jin et al., 2023; Cohen et al., 2017). Similarly, new items struggle with visibility, as their lack of interactions limits exposure, reinforcing popularity bias (Jin et al., 2023). This contributes to the Matthew Effect, where popular items continue to dominate while niche or new items remain unseen, exacerbating long-term unfairness (Jin et al., 2023; Tommasel & Assent, 2023).

#### 2.1.3. Multiple Stakeholder Perspectives  
Fairness in recommender systems is complex due to competing demands from multiple stakeholders. Optimizing fairness for one group can create unfairness for another, making it difficult to satisfy all fairness requirements simultaneously (Wang et al., 2023; Li et al., 2023; Wu et al., 2023). For instance, prioritizing item-side fairness by reducing exposure inequality may conflict with user-side fairness, which seeks balanced recommendation utility across users. Multi-sided fairness aims to balance these concerns rather than favoring a single perspective (Li et al., 2023).

## 3. Dimensions of Fairness in Recommendation Systems  
Fairness is a multidimensional concept that can be understood through various lenses—ranging from how decisions are made, to how outcomes are distributed, to whom fairness is applied, and finally, to what level it is targeted. Each dimension captures a different facet of fairness and reflects distinct ethical, technical, and operational concerns. The following sections provide a structured taxonomy for conceptualizing fairness in recommender systems based on its type, subject, and target.

### 3.1. Fairness Based on The Type  
Fairness in recommender systems can be divided into process fairness and outcome fairness, each addressing different aspects of just and equitable recommendations.  
Process fairness, grounded in procedural justice theory, emphasizes the transparency, consistency, and inclusivity of the decision-making process (Lee et al., 2019). It focuses on whether users understand how recommendations are generated, whether the criteria are clearly defined, and whether users can influence or contest outcomes.  
Outcome fairness, rooted in distributive justice, focuses on the equitable allocation of recommendations to prevent systematic disadvantages (Li et al., 2023). It ensures that users with similar qualifications or preferences receive comparable recommendations, regardless of attributes like gender or race. By promoting inclusive and balanced recommendation outcomes, outcome fairness aims to mitigate structural inequalities (Deldjoo et al., 2024).

### 3.2. Fairness Based on The Subject  
Fairness in recommender systems can be conceptualized by identifying the subjects it affects—namely users, items, providers, platforms, and multiple stakeholders—each presenting unique fairness challenges.  
User fairness aims to ensure that all users receive equitable recommendation quality, diversity, and relevance, regardless of their interaction frequency or demographic characteristics (Leonhardt et al., 2018; Li et al., 2021). Traditional systems often favor highly active users, unintentionally disadvantaging those with fewer interactions (Burke et al., 2018), leading to persistent disparities in user experience (Li et al., 2021). Mitigation strategies include fairness-aware re-ranking and synthetic attribute generation when demographic data is unavailable (Burke et al., 2018). User fairness fosters trust and long-term engagement (Leonhardt et al., 2018) and is supported by legal frameworks like the EU Digital Services Act, which mandates transparency in algorithmic impacts (European Commission, 2022; Jin et al., 2023).  
Item fairness ensures that all items—regardless of popularity, novelty, or origin—have equal chances of being recommended (Rampisela et al., 2023). Popularity bias often limits exposure for less popular or new items (Li et al., 2024). Fairness-aware models can rebalance exposure based on item relevance and potential user interest, promoting diversity and preventing content dominance (Zhu et al., 2021; Rampisela et al., 2023).  
Provider fairness focuses on ensuring equitable visibility for content providers, preventing dominance by larger entities (Cheng et al., 2021; Boratto et al., 2022). Without fairness safeguards, smaller or independent creators may be marginalized, reducing content diversity and harming user choice (Wu et al., 2023; Wang et al., 2023). Supporting provider fairness contributes to a more competitive and diverse ecosystem (Deldjoo et al., 2023; Wu et al., 2023).  
Platform fairness seeks to balance user satisfaction, provider exposure, and business objectives. Commercial incentives, such as maximizing engagement or revenue, may conflict with fairness goals by favoring high-revenue content over relevance (Deldjoo et al., 2023). Maintaining platform fairness requires ethical alignment across stakeholders (Wu et al., 2023).  
Hybrid fairness, or multi-stakeholder fairness, addresses the challenge of balancing competing interests among users, providers, and platforms (Li et al., 2021). Single-sided optimization can disadvantage other groups, but fairness-aware strategies—like proportional exposure or diversity-promoting algorithms—can help reconcile these trade-offs and create a more inclusive recommendation environment (Jin et al., 2023).

### 3.3. Fairness Based on The Target  
Fairness in recommender systems can also be categorized by its target, distinguishing between individual fairness and group fairness. These two perspectives differ in focus—individual fairness ensures equitable treatment at the user level, while group fairness aims to reduce disparities among predefined demographic groups.  
Individual fairness emphasizes that users with similar characteristics should receive similar recommendations, often based on sensitive attributes, feature similarities, or latent traits (Li et al., 2021; Jin et al., 2023; Cheng et al., 2021). This approach prioritizes personalization and aims to fairly reflect user preferences and needs (Leonhardt et al., 2018). For instance, in healthcare recommendations, patients with comparable medical profiles should receive equally high-quality suggestions (Jin et al., 2023). However, operationalizing individual fairness is challenging due to the lack of consensus on how to define and measure user similarity (Li et al., 2021).  
Group fairness, on the other hand, ensures that different demographic groups—such as those defined by race, gender, or socioeconomic status—receive equitable treatment in recommendation outcomes (Boratto et al., 2022; Li et al., 2021; Wang & Liu, 2022; Jin et al., 2023). This perspective also relates to aggregated diversity, which aims to ensure that diverse content is fairly distributed across user groups (Leonhardt et al., 2018). A typical use case is in job recommendation systems, where fairness requires that equally qualified candidates from different racial backgrounds have equal chances of receiving favorable recommendations (Li et al., 2023). One common method to enforce group fairness is demographic parity, which ensures equal recommendation rates across groups. However, this can conflict with individual fairness and may lower overall utility, highlighting the need to balance both fairness perspectives (Jin et al., 2023; Li et al., 2021; Cheng et al., 2021).
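
As a rough illustration of demographic parity, the sketch below compares the rate at which users in two groups receive a favorable recommendation. The function names and the toy data are assumptions made for this example and do not come from the cited works.

```python
from typing import Sequence


def recommendation_rate(received: Sequence[bool]) -> float:
    """Share of users in a group who received the favorable recommendation."""
    if not received:
        raise ValueError("group must not be empty")
    return sum(received) / len(received)


def demographic_parity_gap(group_a: Sequence[bool], group_b: Sequence[bool]) -> float:
    """Absolute difference in recommendation rates; 0 means parity holds."""
    return abs(recommendation_rate(group_a) - recommendation_rate(group_b))


# Hypothetical data: True means the user was shown the job posting.
group_a = [True, True, False, True]
group_b = [True, False, False, False]
gap = demographic_parity_gap(group_a, group_b)
print(f"demographic parity gap: {gap:.2f}")
```

A gap close to zero suggests similar recommendation rates across the two groups, although it says nothing about whether similar individuals are treated alike.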

## 4. Fairness Perspectives and Underlying Concepts  
Fairness in recommender systems is context-dependent and involves multiple considerations. To evaluate fairness, it is essential to first define the subject of fairness. This taxonomy focuses specifically on user fairness, examining it at both individual and group levels.  
Fairness can be viewed from multiple perspectives. First, fairness can be static or dynamic, depending on whether it is measured at a single point in time or across multiple interactions (Cheng et al., 2021; Wang et al., 2023; Ge et al., 2021; Li et al., 2023). Second, it can be direct or indirect, referring to whether fairness is explicitly enforced within the system or evaluated externally as an outcome of the system’s outputs (Deldjoo et al., 2024; Council et al., 2004). Third, fairness can be associative or causal, based on whether fairness is assessed through statistical correlations or through causal reasoning and counterfactual analysis (Wu et al., 2020; Wang et al., 2023; Li et al., 2021).  

Following the framework proposed by Wang et al. (2023), user fairness is categorized into six key concepts: consistent fairness, calibrated fairness, envy-free fairness, counterfactual fairness, Rawlsian maximin fairness, and maximin-shared fairness. Within this taxonomy, all six fairness concepts are evaluated in a static context, meaning fairness is assessed at a single point in time rather than dynamically over user interactions. Additionally, although some fairness concepts could be enforced through constraints or model design, this taxonomy evaluates them through external evaluations after recommendations are generated, classifying them as indirect fairness. Lastly, the concepts are primarily rooted in associative fairness, using statistical measures to assess fairness outcomes, with the exception of counterfactual fairness, which relies on causal inference by examining the effect of sensitive attributes on recommendation outcomes.

### 4.1. Consistent Fairness  
Consistent fairness ensures that users (or groups) with similar preferences receive similar recommendation outcomes (Wang et al., 2023; Dwork et al., 2012; Mansoury et al., 2022; Li et al., 2023). In a movie recommendation system, suppose Alice, a female user, and Bob, a male user, both watched similar movies (e.g., highly rated action and thriller films) and rated them all with 5 stars. If Bob receives recommendations for similar action and thriller movies, while Alice is recommended romantic comedies despite her clear preference for action and thriller genres, this indicates a violation of consistent fairness.  
This form of fairness can be measured using the Gini coefficient, which quantifies the inequality in recommendation quality at the individual level (Wu et al., 2023; Fu et al., 2020; Ge et al., 2021; Leonhardt et al., 2018; Mansoury et al., 2020; Abdollahpouri, 2021). The Gini coefficient ($G$) is computed as:

$$
G = \frac{\sum_{i=1}^u \sum_{j=1}^u |r_i - r_j|}{2u^2 \bar{r}}
$$

where $r_i$ and $r_j$ are the recommendation scores (e.g. predicted ratings) for different users, $u$ is the total number of users, and $\bar{r}$ is the average predicted recommendation score across all users. The formula calculates the sum of absolute differences between all pairs of user recommendation scores, divided by twice the product of the square of the number of users and the average recommendation score. This normalization produces a value between 0 and 1. A Gini coefficient of 0 indicates perfect equality (all users receive the same quality of recommendations), while a value of 1 indicates perfect inequality. A high Gini coefficient indicates greater disparity and thus lower fairness. In recommender systems, values below 0.4 typically suggest more equitable distribution of recommendation quality.  
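
The Gini formula above can be sketched in a few lines of plain Python. This is a minimal illustration that follows the pairwise-difference definition directly. The function name, the zero-mean convention, and the toy scores are assumptions made for this example.

```python
from typing import Sequence


def gini_coefficient(scores: Sequence[float]) -> float:
    """Gini coefficient of per-user recommendation scores (0 = perfect equality)."""
    u = len(scores)
    if u == 0:
        raise ValueError("scores must not be empty")
    mean_score = sum(scores) / u
    if mean_score == 0:
        # Assumption: if every score is zero, treat the distribution as equal.
        return 0.0
    # Sum of absolute differences over all ordered pairs of users.
    total_abs_diff = sum(abs(r_i - r_j) for r_i in scores for r_j in scores)
    return total_abs_diff / (2 * u * u * mean_score)


equal = gini_coefficient([4.0, 4.0, 4.0, 4.0])   # 0.0
skewed = gini_coefficient([0.0, 0.0, 0.0, 4.0])  # 0.75
print(equal, skewed)
```

With a finite number of users, this form of the coefficient tops out at $(u-1)/u$ rather than exactly 1, which is why the fully skewed four-user example yields 0.75. The double loop is quadratic in the number of users, so a sorted-rank formulation would be preferable for large populations.
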
At both the individual and group levels, consistent fairness can be evaluated using variance, which measures disparities in recommendation quality across similar users or user groups (e.g., men and women) (Wang et al., 2023; Wu et al., 2023; Rastegarpanah et al., 2019). The idea is to compare how consistently a model treats similar users: users with similar preferences or profiles should receive similar satisfaction (predicted ratings). The formula for the variance ($\sigma_u^2$) at the individual level is:

$$
\sigma_u^2 = \frac{1}{n_u - 1} \sum_{i=1}^{n_u} (r_{u,i} - \bar{r}_u)^2
$$

where $u$ denotes a particular user, $n_u$ is the number of recommendation scores for user $u$, $r_{u,i}$ is the $i$-th recommendation score (e.g., predicted rating) for user $u$, and $\bar{r}_u$ is the average recommendation score for user $u$. At both levels, lower variance indicates more consistency within the recommendations, suggesting that the system is tailoring results reliably for that individual or group. However, a high variance does not necessarily mean that the system is unfair. It can also mean that users with similar profiles, for example based on gender or age, have different individual preferences. After all, not all members of a group are alike.  
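
A minimal sketch of the variance computation, assuming the scores for one user (or one group) are available as a plain list. It uses the sample variance with the "number of scores minus one" denominator, as in the formula above. The function name and sample values are illustrative assumptions.

```python
from typing import Sequence


def score_variance(scores: Sequence[float]) -> float:
    """Sample variance of the recommendation scores of one user or group."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed for a sample variance")
    mean_score = sum(scores) / n
    return sum((r - mean_score) ** 2 for r in scores) / (n - 1)


# Hypothetical predicted ratings for two users.
steady_user = score_variance([4.0, 4.5, 5.0])
erratic_user = score_variance([1.0, 3.0, 5.0])
print(steady_user, erratic_user)
```

The same function can be applied to the pooled scores of a demographic group to obtain a group-level value.
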
Another useful metric is Absolute Difference (AD), which measures the gap in recommendation quality between two groups (a protected and an unprotected group) (Wang et al., 2023; Wang et al., 2022). It is calculated as the absolute difference between the groups' average predicted recommendation scores:

$$
AD = |\bar{r}_{male} - \bar{r}_{female}|
$$

If male users receive an average prediction of 4.7 and female users receive 4.0, the AD is 0.7, indicating a gender gap in movie recommendations. This means that on average male users have better predictions and are more satisfied, but it can also just reflect the genuine differences that these two groups have in rating items and in their preferences. A lower AD indicates that the model treats both groups more similarly.
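
The worked example above can be reproduced with a short sketch. The per-user scores are made-up values chosen so that the group means are 4.7 and 4.0. The function name is an assumption.

```python
from typing import Sequence


def absolute_difference(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Absolute gap between the mean predicted scores of two groups."""
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one score")
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return abs(mean_a - mean_b)


# Hypothetical predicted ratings per user in each group.
male_scores = [4.6, 4.8, 4.7]
female_scores = [3.9, 4.0, 4.1]
ad = absolute_difference(male_scores, female_scores)
print(f"AD = {ad:.2f}")
```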

### 4.2. Calibrated Fairness  
Calibrated fairness requires that users receive recommendations proportional to their merit, such that more qualified users are allocated more relevant items (Wang et al., 2023; Li et al., 2023; Jabbari et al., 2017; Heidari et al., 2019). In a movie recommendation system, calibrated fairness would ensure that recommendations reflect user preferences rather than being skewed by popularity bias (Wang et al., 2023). Suppose Alice and Bob both rate crime and thriller movies with an average rating of 4.5. If Bob receives similarly aligned recommendations while Alice receives romantic comedies with an average predicted rating of 3.0, this reflects a violation of calibrated fairness.  
This can be measured using Kullback-Leibler (KL) divergence, which quantifies the divergence between a user’s expected preference distribution and the actual distribution of recommendation content (Tommasel & Assent, 2023; Li et al., 2022; Steck, 2018; Wan et al., 2020). It is defined as:

$$
D_{KL}(P \| Q) = \sum_i P(i) \log \left( \frac{P(i)}{Q(i)} \right)
$$

where $P$ is the user’s preference distribution, $Q$ is the actual recommendation distribution for that user, and $i$ indexes the categories (e.g., genres) over which both distributions are defined. $P$ can be computed by analyzing a user's historical interaction patterns, such as the frequency of genres in their positively rated items or the distribution of item attributes they've engaged with. $Q$ is derived from the set of items recommended to the user, analyzing the same attributes to create a comparable distribution. For example, if 70% of a user's positively rated movies are action films, but only 30% of their recommendations are action films, this creates a measurable divergence.  
The difference between a user's recommendation list and their history is quantified by comparing these two distributions. While the historical preference distribution $P$ captures what a user has enjoyed in the past, the recommendation distribution $Q$ represents what the system believes the user will enjoy in the future. Ideally, these should be well-calibrated without being identical, allowing for some novelty while respecting core preferences. A high KL divergence value indicates poor calibration.  
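
The 70% versus 30% example can be turned into a small calculation. The sketch below builds genre distributions from item lists and computes the KL divergence over genres. The smoothing parameter `alpha`, which mixes a little of $P$ into $Q$ to avoid division by zero, is an assumption of this sketch, as are the function names and the toy data.

```python
import math
from collections import Counter
from typing import List, Sequence


def genre_distribution(genres: Sequence[str], categories: Sequence[str]) -> List[float]:
    """Relative frequency of each category in a list of item genres."""
    if not genres:
        raise ValueError("genres must not be empty")
    counts = Counter(genres)
    return [counts[c] / len(genres) for c in categories]


def kl_divergence(p: Sequence[float], q: Sequence[float], alpha: float = 0.01) -> float:
    """KL(P || Q) over categories, with Q smoothed towards P (assumed choice)."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:  # terms with P(i) = 0 contribute nothing
            q_smoothed = (1 - alpha) * q_i + alpha * p_i
            total += p_i * math.log(p_i / q_smoothed)
    return total


categories = ["action", "comedy"]
history = ["action"] * 7 + ["comedy"] * 3           # 70% action in past likes
recommended = ["action"] * 3 + ["comedy"] * 7       # 30% action in recommendations
p = genre_distribution(history, categories)
q = genre_distribution(recommended, categories)
print(f"KL(P || Q) = {kl_divergence(p, q):.3f}")
```

A value near zero would indicate that the recommended genre mix closely tracks the user's history. Without smoothing (`alpha=0`), a genre the user likes but never gets recommended would raise a division-by-zero error.
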
Alternatively, the L1-norm can be used:

$$
L1 = \sum_i |P(i) - Q(i)|
$$

This captures the total deviation between the expected and actual recommendation distributions (Tommasel & Assent, 2023; Li et al., 2022; Wang et al., 2023; Steck, 2018; Biega et al., 2018; Borges & Stefanidis, 2019; Kirnap et al., 2021). In this formula, $P(i)$ represents the expected or ideal share of category $i$ in a user's recommendations, while $Q(i)$ represents the actually observed share. The formula sums the absolute differences between these distributions across all categories, providing a measure of overall disparity. A lower value suggests better alignment and higher fairness.
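
A minimal sketch of the L1 computation, under the assumption that both distributions are given as lists of shares over the same ordered set of categories (for example genres). The function name and numbers are illustrative.

```python
from typing import Sequence


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences between two distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same categories")
    return sum(abs(p_i - q_i) for p_i, q_i in zip(p, q))


# Hypothetical shares for the categories (action, comedy, drama).
expected = [0.7, 0.2, 0.1]
observed = [0.3, 0.4, 0.3]
distance = l1_distance(expected, observed)
print(f"L1 = {distance:.2f}")
```

For two probability distributions this value lies between 0 (identical) and 2 (no overlap), and unlike KL divergence it needs no smoothing when a category has zero share.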

### 4.3. Envy-Free Fairness  
Envy-free fairness ensures that users are not envious of others' recommendations, meaning they do not prefer someone else’s list over their own (Budish, 2011; Jin et al., 2023; Li et al., 2023; Wu et al., 2023; Ghodsi et al., 2018). For example, if Alice and Bob have similar movie preferences, but Alice receives irrelevant recommendations while Bob’s list matches her taste better, Alice may prefer Bob’s list over her own, violating envy-free fairness.  
This can be measured using the Mean Average Envy (MAE) metric. At the individual level, it measures how much a user “envies” other users who received higher predicted ratings for the same item. At the group level, it measures the average envy among users within the same demographic group by comparing the satisfaction (predicted ratings) derived from their own recommendations with that derived from the recommendations of others in the group (Wu et al., 2023; Wang et al., 2023). For group fairness, for any pair of users $u_i$ and $u_j$ within the same demographic group (e.g., female or male), the pairwise envy is calculated as:

$$
\text{envy}(u_i, u_j) = \max(f(l_{u_j}, u_i) - f(l_{u_i}, u_i), 0)
$$

where $f(l_{u_j}, u_i)$ is the satisfaction (often represented as the predicted rating) that user $u_i$ derives from $u_j$'s list and $f(l_{u_i}, u_i)$ is the satisfaction of their own list.  
The MAE for a group of $u$ users is then computed as:

$$
MAE = \frac{1}{u(u - 1)} \sum_{u_i \neq u_j} \text{envy}(u_i, u_j)
$$

The satisfaction can be thought of as the predicted rating the user receives from the model: the higher the predicted rating, the higher the satisfaction. Wu et al. (2023) note that the lower the MAE, the fairer the system, although a low value can also simply reflect that users' preferences differ. In group fairness, a higher MAE indicates greater disparity in predicted ratings among users within the same demographic group (e.g., male or female), pointing to potential fairness issues. MAE can also be used to measure individual fairness by computing how much a user envies other users' predicted ratings for the same item and averaging this across all such comparisons.
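
The pairwise envy and its group average can be sketched as follows. Here the satisfaction function $f$ is assumed to be the mean predicted rating that a user would give to the items in a list. That choice, together with the data layout and all names, is an assumption made for this example.

```python
from typing import Dict, List, Sequence

Ratings = Dict[str, Dict[str, float]]  # user -> item -> predicted rating


def satisfaction(predicted: Ratings, user: str, items: Sequence[str]) -> float:
    """Mean predicted rating of `user` for the given list of items."""
    return sum(predicted[user][item] for item in items) / len(items)


def envy(predicted: Ratings, rec_lists: Dict[str, List[str]], u_i: str, u_j: str) -> float:
    """How much u_i would prefer u_j's list over their own (never negative)."""
    own = satisfaction(predicted, u_i, rec_lists[u_i])
    other = satisfaction(predicted, u_i, rec_lists[u_j])
    return max(other - own, 0.0)


def mean_average_envy(predicted: Ratings, rec_lists: Dict[str, List[str]],
                      users: Sequence[str]) -> float:
    """Average envy over all ordered pairs of distinct users in a group."""
    u = len(users)
    if u < 2:
        return 0.0
    total = sum(envy(predicted, rec_lists, a, b) for a in users for b in users if a != b)
    return total / (u * (u - 1))


# Alice and Bob have identical tastes, but Alice gets the weaker list.
tastes = {"m1": 5.0, "m2": 4.0, "m3": 2.0, "m4": 1.0}
predicted = {"alice": dict(tastes), "bob": dict(tastes)}
rec_lists = {"alice": ["m3", "m4"], "bob": ["m1", "m2"]}
mae = mean_average_envy(predicted, rec_lists, ["alice", "bob"])
print(f"MAE = {mae}")
```

In this toy case only Alice envies Bob (by 3.0 rating points), so the average over the two ordered pairs is 1.5.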

### 4.4. Rawlsian Maximin Fairness  
Rawlsian Maximin Fairness (RMF) seeks to improve the recommendation quality for the least-advantaged users by maximizing the minimum utility received (Deldjoo et al., 2023; Zehlike & Castillo, 2020; Rawls, 2020; Li et al., 2023). In a movie recommendation system, RMF ensures that low-interaction users (e.g. Alice with 10 ratings) are not consistently disadvantaged compared to highly active users (e.g., Bob with 200 ratings).  
This is measured using the bottom-N average metric, which evaluates recommendation quality for the least-favored users (Wu et al., 2023; Zhu et al., 2017):

$$
B_n = \frac{1}{|U_{min}|} \sum_{i \in U_{min}} r_i
$$

where $U_{min}$ is the set of the $n$ least advantaged users (typically the bottom $n\%$ of users sorted by their utility or recommendation scores), and $r_i$ is the recommendation score for user $i$. This formula calculates the average recommendation quality specifically for those users with the lowest scores, making it possible to evaluate how well the system serves its most disadvantaged users. If the bottom-N average, which would cover low-interaction users like Alice, is significantly lower than the scores of highly active users like Bob, the system exhibits unfairness by failing to ensure equitable outcomes for less-active users.
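
A minimal sketch of the bottom-N average, assuming one utility score per user and taking $n$ as an absolute number of users rather than a percentage. The function name and the scores are illustrative assumptions.

```python
from typing import Sequence


def bottom_n_average(scores: Sequence[float], n: int) -> float:
    """Mean score of the n users with the lowest recommendation scores."""
    if not 1 <= n <= len(scores):
        raise ValueError("n must be between 1 and the number of users")
    worst = sorted(scores)[:n]  # ascending order, so the first n are the lowest
    return sum(worst) / n


# Hypothetical per-user utility scores for ten users.
user_scores = [4.8, 4.6, 4.5, 4.4, 4.2, 4.0, 3.9, 3.5, 2.8, 2.2]
bottom_two = bottom_n_average(user_scores, n=2)
overall = sum(user_scores) / len(user_scores)
print(f"bottom-2 average = {bottom_two:.2f}, overall average = {overall:.2f}")
```

Comparing the bottom-N average with the overall average gives a quick sense of how far the least-served users lag behind.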

### 4.5. Maximin-Shared Fairness  
Maximin-shared fairness (MSF) guarantees that each user or group receives at least a predefined minimum share of their recommendation utility (Budish, 2011; Wang & Liu, 2022; Ghodsi et al., 2018; Wang et al., 2023). MSF ensures that users like Alice, who have limited interaction histories, receive recommendations of sufficient quality rather than only serving the most active users. If a threshold (e.g., average rating 4.0) is used to define a fair share, and Alice receives 3.2, this violates MSF.  
Within this taxonomy, MSF is used as a group fairness evaluation metric. It can be measured using the fraction of satisfied users (FSU), which captures the proportion of users whose recommendation satisfaction meets or exceeds a predefined threshold (Patro et al., 2020):

$$
FSU = \frac{1}{|U|} \sum_{u \in U} \mathbf{1}_{(Q_u \geq Q_{min})}
$$

where $Q_u$ is the satisfaction for user $u$, and $Q_{min}$ is the minimum fair threshold. Within this taxonomy, the threshold is defined as the average predicted rating of each demographic group. Higher utility indicates that, on average, the user is receiving higher-rated recommendations. The higher the fraction, the fairer the system.
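
The FSU indicator sum can be sketched as below, both with a fixed threshold (the 4.0 from the example above) and with the group mean as threshold. The function names and the satisfaction values are assumptions made for illustration.

```python
from typing import Sequence


def fraction_satisfied(satisfaction: Sequence[float], threshold: float) -> float:
    """Share of users whose satisfaction meets or exceeds the threshold."""
    if not satisfaction:
        raise ValueError("satisfaction must not be empty")
    return sum(1 for q_u in satisfaction if q_u >= threshold) / len(satisfaction)


def fraction_satisfied_group_mean(satisfaction: Sequence[float]) -> float:
    """FSU with the group's own mean satisfaction used as the threshold."""
    group_mean = sum(satisfaction) / len(satisfaction)
    return fraction_satisfied(satisfaction, group_mean)


# Hypothetical mean predicted ratings for four users; 3.2 plays the role of Alice.
group = [4.5, 3.2, 4.0, 3.8]
fsu_fixed = fraction_satisfied(group, threshold=4.0)
fsu_mean = fraction_satisfied_group_mean(group)
print(fsu_fixed, fsu_mean)
```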

### 4.6. Counterfactual Fairness  
Counterfactual fairness holds when a user’s recommendations remain unchanged even if their sensitive attribute (e.g., gender) is hypothetically altered while all other features stay constant (Wang et al., 2023; Kusner et al., 2017; Cheng et al., 2021; Wu et al., 2023; Li et al., 2023; Yao & Huang, 2017). For example, in a movie recommendation system, counterfactual fairness would require that changing Alice's gender from female to male, while keeping all other preferences and behaviors constant, does not result in significantly different recommendations. If the system recommends romantic comedies when Alice is marked as female, but switches to recommending thrillers when her gender is changed to male, this indicates that gender is causally influencing the recommendation outcome, thereby violating counterfactual fairness.  
According to Wang et al. (2023) and Li et al. (2021), counterfactual fairness can be measured using statistical tests, such as adversarial classification. This involves training a classifier to predict sensitive attributes (e.g. gender or age) from the user embeddings produced by the recommendation model. If the classifier performs poorly based on metrics like Precision, Recall, F1 score, or AUC (Area Under the Curve), it indicates that sensitive information is not encoded in the embeddings, thereby suggesting that the system satisfies counterfactual fairness.
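
To make the evaluation step concrete, the sketch below computes the AUC of an attacker classifier from its output scores, using the pairwise (Mann-Whitney) formulation. The attacker itself is not shown: the scores are assumed to come from some classifier trained to predict the sensitive attribute from user embeddings and evaluated on held-out users. All names and numbers are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# 1 = one value of the sensitive attribute, 0 = the other (hypothetical labels).
sensitive = [1, 1, 0, 0]
leaky_scores = [0.9, 0.8, 0.2, 0.1]        # attacker separates the groups
uninformative_scores = [0.5, 0.5, 0.5, 0.5]  # attacker cannot tell them apart
print(auc(sensitive, leaky_scores), auc(sensitive, uninformative_scores))
```

An AUC near 0.5 means the attacker does no better than chance, which is the outcome this evaluation treats as evidence that the embeddings do not encode the sensitive attribute. An AUC near 1.0 points to leakage.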

## 5. Conclusion of the User Fairness Taxonomy  
As recommender systems increasingly shape digital experiences and decision-making, ensuring fairness has become a critical requirement for ethical and inclusive system design. This taxonomy offers a structured framework to understand, evaluate, and improve user fairness in these systems by addressing its underlying causes, dimensions, and evaluation strategies.

![Figure 1: Causes of Unfairness](attachment:image.png)  
*Figure 1: Causes of Unfairness*

First, fairness must be understood through its origins. As shown in Figure 1, these include biases in data, algorithms, and feedback loops, as well as structural challenges like cold-start problems and competing stakeholder interests. Next, fairness is unpacked across three key dimensions: Type (process vs. outcome), Subject (users, items, providers, platforms, and stakeholders), and Target (individual vs. group). Together, these dimensions, shown in Figure 2, provide a holistic view of who fairness applies to, how it should be implemented, and at what level.

![Figure 2: Fairness Dimensions](attachment:image-2.png)  
*Figure 2: Fairness Dimensions*

Finally, fairness is further explored through perspectives and concepts, including static vs. dynamic, direct vs. indirect, and associative vs. causal views. These guide the evaluation of six user fairness concepts (consistent, calibrated, envy-free, Rawlsian maximin, maximin-shared, and counterfactual fairness) using concrete metrics such as Gini coefficient, KL divergence, and classification scores, as illustrated in Figure 3.

![Figure 3: Fairness Perspectives, Concepts and Metrics](attachment:image-3.png)  
*Figure 3: Fairness Perspectives, Concepts and Metrics*

By integrating these layers, this taxonomy equips researchers and practitioners with the tools to identify unfairness, apply targeted interventions, and evaluate fairness outcomes, thereby supporting more equitable and trustworthy recommender systems.

## References

Abdollahpouri, H., Mansoury, M., Burke, R., & Mobasher, B. (2019).  
*The unfairness of popularity bias in recommendation*. arXiv preprint arXiv:1907.13286.

Biega, A. J., Gummadi, K. P., & Weikum, G. (2018).  
Equity of attention: Amortizing individual fairness in rankings.  
In *Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR’18)* (pp. 405–414).  
[https://doi.org/10.1145/3209978.3210063](https://doi.org/10.1145/3209978.3210063)

Boratto, L., Fenu, G., & Marras, M. (2022).  
Consumer fairness in recommender systems: Contextualizing definitions and mitigations.  
In *Advances in Information Retrieval* (pp. 552–566). Springer.  
[https://doi.org/10.1007/978-3-030-72113-8_35](https://doi.org/10.1007/978-3-030-72113-8_35)

Borges, R., & Stefanidis, K. (2019).  
Enhancing long-term fairness in recommendations with variational autoencoders.  
In *Proceedings of the 11th International Conference on Management of Digital EcoSystems (MEDES’19)* (pp. 95–102).  
[https://doi.org/10.1145/3297662.3365798](https://doi.org/10.1145/3297662.3365798)

Budish, E. (2011).  
The combinatorial assignment problem: Approximate competitive equilibrium from equal incomes.  
*Journal of Political Economy, 119*(6), 1061–1103.  
[https://doi.org/10.1086/664613](https://doi.org/10.1086/664613)

Burke, R., Kontny, J., & Sonboli, N. (2018).  
Synthetic attribute data for evaluating consumer-side fairness.  
In *Proceedings of the FATREC Workshop on Responsible Recommendation (FATREC’18)*, Vancouver, Canada.

Cheng, W., Zhang, Y., Zhang, J., Zhu, Y., & Liu, Y. (2021).  
Bias and debias in recommender system: A survey and future directions.  
*Frontiers of Computer Science, 15*(5), 1–22.  
[https://doi.org/10.1007/s11704-021-0444-0](https://doi.org/10.1007/s11704-021-0444-0)

Cohen, D., Aharon, M., Koren, Y., Somekh, O., & Nissim, R. (2017).  
Expediting exploration by attribute-to-feature mapping for cold-start recommendations.  
In *Proceedings of the Eleventh ACM Conference on Recommender Systems* (pp. 184–192).

Council, N.R., et al. (2004).  
*Measuring Racial Discrimination*. National Academies Press, London.

Deldjoo, Y., Jannach, D., Bellogín, A., Difonzo, A., & Zanzonelli, D. (2024).  
Fairness in recommender systems: Research landscape and future directions.  
*User Modeling and User-Adapted Interaction, 34*, 457–511.  
[https://doi.org/10.1007/s11257-023-09364-z](https://doi.org/10.1007/s11257-023-09364-z)

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012).  
Fairness through awareness.  
In *Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS’12)* (pp. 214–226).  
[https://doi.org/10.1145/2090236.2090255](https://doi.org/10.1145/2090236.2090255)

Fu, Z., Xian, Y., Gao, R., Zhao, J., Huang, Q., Ge, Y., Xu, S., Geng, S., Shah, C., Zhang, Y., et al. (2020).  
Fairness-aware explainable recommendation over knowledge graphs.  
In *Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval* (pp. 69–78).

Ge, Y., Liu, S., Gao, R., Xian, Y., Li, Y., Zhao, X., Pei, C., Sun, F., Ge, J., Ou, W., & Zhang, Y. (2021).  
Towards long-term fairness in recommendation.  
In *Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM’21)* (pp. 445–453).  
[https://doi.org/10.1145/3437963.3441824](https://doi.org/10.1145/3437963.3441824)

Ghodsi, M., Hajiaghayi, M., Seddighin, M., Seddighin, S., & Yami, H. (2018).  
Fair allocation of indivisible goods: Improvements and generalizations.  
In *Proceedings of the ACM Conference on Economics and Computation (EC’18)* (pp. 539–556).  
[https://doi.org/10.1145/3219166.3219238](https://doi.org/10.1145/3219166.3219238)

Grgić-Hlača, N., Zafar, M. B., Gummadi, K. P., & Weller, A. (2018).  
Beyond distributive fairness in algorithmic decision making: Feature selection for procedurally fair learning.  
In *Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)*, 32(1) (pp. 51–59).

Heidari, H., Ferrari, C., Gummadi, K. P., & Krause, A. (2019).  
Fairness behind a veil of ignorance: A welfare analysis for automated decision making.  
In *Proceedings of the 28th International Joint Conference on Artificial Intelligence* (pp. 5825–5831).  
[https://doi.org/10.24963/ijcai.2019/807](https://doi.org/10.24963/ijcai.2019/807)

Jabbari, S., Joseph, M., Kearns, M., Morgenstern, J., & Roth, A. (2017).  
Fairness in reinforcement learning.  
In *Proceedings of the 34th International Conference on Machine Learning* (pp. 1617–1626).  
[http://proceedings.mlr.press/v70/jabbari17a.html](http://proceedings.mlr.press/v70/jabbari17a.html)

Jin, D., Wang, L., Yang, X., & Shen, H. (2023).  
A survey of fairness-aware recommender systems.  
*Information Processing & Management, 60*(2), 103230.  
[https://doi.org/10.1016/j.ipm.2022.103230](https://doi.org/10.1016/j.ipm.2022.103230)

Kırnap, Ö., Diaz, F., Biega, A., Ekstrand, M., Carterette, B., & Yilmaz, E. (2021).  
Estimation of fair ranking metrics with incomplete judgments.  
In *Proceedings of the Web Conference (WWW’21)* (pp. 1065–1075).  
[https://doi.org/10.1145/3442381.3450080](https://doi.org/10.1145/3442381.3450080)

Lee, M., Jain, A., Cha, H. J., Ojha, S., & Kusbit, D. (2019).  
Procedural justice in algorithmic fairness: Leveraging transparency and outcome control for fair algorithmic mediation.  
*Proceedings of the ACM on Human-Computer Interaction, 3*(CSCW).  
[https://doi.org/10.1145/3359284](https://doi.org/10.1145/3359284)

Leonhardt, J., Anand, A., & Khosla, M. (2018).  
User fairness in recommender systems.  
In *Companion Proceedings of the 2018 World Wide Web Conference (WWW '18 Companion)* (pp. 101–102).  
[https://doi.org/10.1145/3184558.3186949](https://doi.org/10.1145/3184558.3186949)

Li, J., Ren, Y., Sanderson, M., & Deng, K. (2024).  
Explaining recommendation fairness from a user/item perspective.  
*ACM Transactions on Information Systems, 43*(1), Article 17.  
[https://doi.org/10.1145/3698877](https://doi.org/10.1145/3698877)

Li, S., Karatzoglou, A., & Gentile, C. (2021).  
Leave no user behind: Towards improving the utility of recommender systems for non-mainstream users.  
In *Proceedings of the 14th ACM International Conference on Web Search and Data Mining* (pp. 103–111).  
[https://doi.org/10.1145/3437963.3441769](https://doi.org/10.1145/3437963.3441769)

Li, S., Karatzoglou, A., & Gentile, C. (2023).  
Fairness in recommendation: Foundations, methods, and applications.  
*Information Processing & Management, 60*(3), 103338.  
[https://doi.org/10.1016/j.ipm.2022.103338](https://doi.org/10.1016/j.ipm.2022.103338)

Li, Y., Chen, H., Xu, S., Ge, Y., & Zhang, Y. (2021).  
Towards personalized fairness based on causal notion.  
In *Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’21)* (pp. 1054–1063).  
[https://doi.org/10.1145/3404835.3462966](https://doi.org/10.1145/3404835.3462966)

Li, Y., Hedia, M.-L., Ma, W., Lu, H., Zhang, M., Liu, Y., & Ma, S. (2022).  
Contextualized fairness for recommender systems in premium scenarios.  
*Big Data Research, 27*, 100300.  
[https://doi.org/10.1016/j.bdr.2021.100300](https://doi.org/10.1016/j.bdr.2021.100300)

Mansoury, M., Abdollahpouri, H., Pechenizkiy, M., Mobasher, B., & Burke, R. (2022).  
A graph-based approach for mitigating multi-sided exposure bias in recommender systems.  
*ACM Transactions on Information Systems (TOIS), 40*(2), 32:1–32:31.  
[https://doi.org/10.1145/3470948](https://doi.org/10.1145/3470948)

Rampisela, T. V., Maistro, M., Ruotsalo, T., & Lioma, C. (2023).  
Evaluation measures of individual item fairness for recommender systems: A critical study.  
*Manuscript submitted for publication*.

Rastegarpanah, B., Gummadi, K. P., & Crovella, M. (2019).  
Fighting fire with fire: Using antidote data to improve polarization and fairness of recommender systems.  
In *Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM’19)* (pp. 231–239).  
[https://doi.org/10.1145/3289600.3291002](https://doi.org/10.1145/3289600.3291002)

Rawls, J. (2020).  
*A theory of justice*. Harvard University Press.

Roy, D., & Dutta, M. (2022).  
A systematic review and research perspective on recommender systems.  
*Journal of Big Data, 9*(59).  
[https://doi.org/10.1186/s40537-022-00592-5](https://doi.org/10.1186/s40537-022-00592-5)

Shah, L., Gaudani, H. V., & Balani, P. (2016).  
Survey on recommendation system.  
*International Journal of Computer Applications, 137*(7), 43–49.  
[https://doi.org/10.5120/ijca2016908821](https://doi.org/10.5120/ijca2016908821)

Steck, H. (2018).  
Calibrated recommendations.  
In *Proceedings of the 12th ACM Conference on Recommender Systems (RecSys’18)* (pp. 154–162).

Tommasel, A., & Assent, I. (2023, December).  
Recommendation fairness and where to find it: An empirical study on fairness of user recommender systems.  
In *2023 IEEE International Conference on Big Data (BigData)* (pp. 4195–4204). IEEE.

Vartak, M., Thiagarajan, A., Miranda, C., Bratman, J., & Larochelle, H. (2017).  
A meta-learning perspective on cold-start recommendations for items.  
*Advances in Neural Information Processing Systems, 30*.

Vassøy, B., & Langseth, H. (2024).  
Consumer-side fairness in recommender systems: A systematic survey.  
*User Modeling and User-Adapted Interaction, 34*, 391–456.  
[https://doi.org/10.1007/s11257-020-09285-1](https://doi.org/10.1007/s11257-020-09285-1)

Wang, Y., & Liu, J. (2022).  
Study on fairness of group recommendation based on stochastic model checking.  
In *Proceedings of the 2022 International Conference on Artificial Intelligence and Computer Science* (pp. 457–465).  
[https://doi.org/10.1007/978-981-19-1419-1_49](https://doi.org/10.1007/978-981-19-1419-1_49)

Wang, Y., Zhang, Y., & Wang, Z. (2023).  
A survey on the fairness of recommender systems.  
*ACM Computing Surveys, 56*(5), 1–38.  
[https://doi.org/10.1145/3616865](https://doi.org/10.1145/3616865)

Wu, Y., Cao, J., & Xu, G. (2023).  
Fairness in recommender systems: Evaluation approaches and assurance strategies.  
*ACM Transactions on Knowledge Discovery from Data, 18*(1), 1–37.

Wu, Y., Sun, Y., Ma, Y., Liu, X., Zhang, Y., & Wu, Y. (2023).  
Fairness in recommender systems: Evaluation approaches.  
In *Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval* (pp. 2892–2897).  
[https://doi.org/10.1145/3539618.3591865](https://doi.org/10.1145/3539618.3591865)

Yao, S., & Huang, B. (2017).  
Beyond parity: Fairness objectives for collaborative filtering.  
In *Proceedings of the 31st Conference on Neural Information Processing Systems (NeurIPS 2017)*, Long Beach, CA, USA.

Zehlike, M., & Castillo, C. (2020).  
Reducing disparate exposure in ranking: A learning to rank approach.  
In *Companion Proceedings of the Web Conference 2020 (WWW ’20 Companion)* (pp. 2849–2855). ACM.

Zhu, Z., Kim, J., Nguyen, T., Fenton, A., & Caverlee, J. (2021).  
Fairness among new items in cold start recommender systems.  
In *Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21)* (pp. 767–776).  
[https://doi.org/10.1145/3404835.3462948](https://doi.org/10.1145/3404835.3462948)