#**Perception of Fairness in Large Language Models**

The purpose of this study is to understand the "Perception of Fairness" in LLMs. To do so, we pose to three LLMs the distributive-justice questions that previous studies asked human subjects. Across several experimental variations, we look for trends in the models' reasoning, their choices, and the variability of their responses, as a window into the definition of fairness encoded in the data LLMs are trained on.

**Variations**
- 3 LLMs, (i) Gemini (G), (ii) ChatGPT (C), and (iii) LLaMa (L)
- 10 instances of fair division from [Herreiner and Puppe [2007]](https://www-degruyter-com.ezaccess.libraries.psu.edu/document/doi/10.1515/auk-2007-0208/html) [HP07]. Other sources for instances: [Herreiner and Puppe [2010]](https://doi-org.ezaccess.libraries.psu.edu/10.1016/j.jebo.2010.06.002) [HP10] (indivisible resources), [Engelmann and Strobel [2004]](https://www.jstor.org/stable/3592796) [ES04] (income), [Konow [2003]](https://www.jstor.org/stable/3217459) [Konow03] (equity).
- Two ways of asking the question, (i) as a word problem (WP) or (ii) with a valuation table (VT).
- Three types of prompts, (i) vanilla prompt (VP), (ii) restricting the model to a single, best answer (SB), and (iii) asking for top-k answers (TK).
- Possible variations: Chain of Thought prompting, asking multiple questions in the same prompt, asking the model to provide its reasoning, ranking followed by aggregation.
- Possible next steps: finetuning models with sample questions and answers.
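
The valuation-table (VT) way of asking could be produced from the same numbers as the word problem. The sketch below is only one possible rendering: the helper name `valuation_table` and the Markdown table layout are assumptions, not the format used in this study. The values are those of Problem 1 from [HP07] as they appear later in this notebook.

```python
def valuation_table(values):
    """Render a {person: {good: value}} mapping as a Markdown table (hypothetical layout)."""
    goods = list(next(iter(values.values())))
    header = "| Person | " + " | ".join(goods) + " |"
    divider = "|" + " --- |" * (len(goods) + 1)
    rows = [
        "| " + person + " | " + " | ".join(str(vals[g]) for g in goods) + " |"
        for person, vals in values.items()
    ]
    return "\n".join([header, divider, *rows])


# Problem 1 valuations, copied from the word-problem prompt in this notebook.
problem_1 = {
    "Person 1": {"A": 49, "B": 46, "C": 5},
    "Person 2": {"A": 47, "B": 48, "C": 5},
}
table = valuation_table(problem_1)
print(table)
```

The resulting string can be appended to the common preamble of the prompt in place of the two sentences that list each person's values.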

**Metrics**
- Valid allocation: Is each item received in its entirety by at most one recipient?
- Valid arithmetic: Does the LLM use proper addition and subtraction, and does it identify and compare values correctly?
- Logical consistency: Does the LLM provide correct and coherent arguments?
- Attempted outcome: What notion does the LLM think is satisfied by the selected outcome?
- Apparent outcome: What notion is actually satisfied by the selected outcome?
- Attempted procedure: What procedure does the LLM claim to have applied?
- Apparent procedure: What procedure does the LLM actually apply?
- Actual support: What percentage of respondents chose the selected allocation when the study was conducted with human subjects?
- Clear winner: Does the LLM designate a single allocation as the fairest amongst the possible options?
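
The "Valid allocation" metric is the easiest one to check mechanically. A minimal sketch, assuming an allocation is written as a mapping from each person to the list of goods they receive (a representation chosen here for illustration), with discarded goods simply left out:

```python
def is_valid_allocation(allocation, goods):
    """True if every allocated good exists and is held by at most one recipient."""
    seen = set()
    for bundle in allocation.values():
        for good in bundle:
            if good not in goods or good in seen:
                return False
            seen.add(good)
    return True


goods = {"A", "B", "C"}
# Good C is discarded, which the problem statement allows.
print(is_valid_allocation({"Person 1": ["A"], "Person 2": ["B"]}, goods))
# Good A is given to both people, so the allocation is invalid.
print(is_valid_allocation({"Person 1": ["A"], "Person 2": ["A", "B"]}, goods))
```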

**Distributional Fairness Notions**
- Envy-freeness (EF)
- Inequality Aversion (IA)
- Rawlsian Maximin (RMM)
- Utilitarian Social Welfare (USW)
- Pareto Optimality (PO)
- Proportionality (Prop)
- Maximin Share Guarantee (MMS)
- Equity (of Need or Merit)
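
Several of these notions can be computed directly when labelling the "Apparent outcome" of a response. The sketch below covers USW, RMM, EF, and a simple stand-in for IA. Two assumptions: utilities are additive over goods, and inequality is measured as the gap between the highest and lowest utility, which is only one of several possible measures. The valuations are those of Problem 1 in this notebook.

```python
VALUES = {
    "Person 1": {"A": 49, "B": 46, "C": 5},
    "Person 2": {"A": 47, "B": 48, "C": 5},
}


def bundle_value(values, person, bundle):
    """Additive value that `person` assigns to a bundle of goods."""
    return sum(values[person][good] for good in bundle)


def utilities(values, allocation):
    """Each person's value for their own bundle."""
    return {p: bundle_value(values, p, allocation.get(p, ())) for p in values}


def utilitarian_welfare(values, allocation):
    return sum(utilities(values, allocation).values())


def rawlsian_minimum(values, allocation):
    return min(utilities(values, allocation).values())


def inequality_gap(values, allocation):
    """Assumed proxy for inequality: highest utility minus lowest utility."""
    u = utilities(values, allocation).values()
    return max(u) - min(u)


def is_envy_free(values, allocation):
    """No person values another person's bundle above their own."""
    return all(
        bundle_value(values, p, allocation.get(p, ()))
        >= bundle_value(values, p, allocation.get(q, ()))
        for p in values
        for q in values
        if p != q
    )


discard_c = {"Person 1": ["A"], "Person 2": ["B"]}
keep_c = {"Person 1": ["A", "C"], "Person 2": ["B"]}
for name, alloc in [("discard C", discard_c), ("keep C", keep_c)]:
    print(name, utilitarian_welfare(VALUES, alloc), rawlsian_minimum(VALUES, alloc),
          inequality_gap(VALUES, alloc), is_envy_free(VALUES, alloc))
```

On these two allocations the notions pull apart: giving C to Person 1 raises total welfare from 97 to 102, but Person 2 then values Person 1's bundle at 52 against 48 for their own, so envy-freeness is lost.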

**Procedures of Fairness**
- Highest valued good: give each person what they value the most.
- USW: give each good to the person who values it the most.
- Round-robin/Draft mechanism
- Adjusted Winner mechanism
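
Two of these procedures are simple enough to sketch, which helps when deciding which "Apparent procedure" a response matches. The tie-breaking rule (the first person listed wins) and the picking order in the round-robin are assumptions made for illustration. The Adjusted Winner mechanism is not covered here.

```python
def usw_procedure(values, goods):
    """Give each good to the person who values it the most (ties: first person listed)."""
    allocation = {person: [] for person in values}
    for good in goods:
        winner = max(values, key=lambda person: values[person][good])
        allocation[winner].append(good)
    return allocation


def round_robin(values, goods, order=None):
    """People take turns picking their most valued remaining good."""
    order = list(order or values)
    remaining = list(goods)
    allocation = {person: [] for person in values}
    turn = 0
    while remaining:
        picker = order[turn % len(order)]
        pick = max(remaining, key=lambda good: values[picker][good])
        allocation[picker].append(pick)
        remaining.remove(pick)
        turn += 1
    return allocation


# Problem 2 valuations, copied from the prompt later in this notebook.
problem_2 = {
    "Person 1": {"A": 5, "B": 47, "C": 45, "D": 3},
    "Person 2": {"A": 45, "B": 5, "C": 48, "D": 2},
    "Person 3": {"A": 23, "B": 25, "C": 32, "D": 20},
}
print(usw_procedure(problem_2, "ABCD"))
print(round_robin(problem_2, "ABCD"))
```

Note that neither procedure ever discards a good, whereas the problems used here allow that option.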



#Setup

In [None]:
!pip install -q -U google-generativeai

In [None]:
import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

In [None]:
# Used to securely store your API key
from google.colab import userdata

In [None]:
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

In [None]:
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-pro
models/gemini-pro-vision


In [None]:
model = genai.GenerativeModel('gemini-1.0-pro-latest')

#Experiments

##Variation 1: G-HP07-WP-VP

###Text description

####Problem 1

In [None]:
# Example taken from Herreiner and Puppe (2007)
prompt = "Consider a problem where goods need to be allocated among different individuals. \
          Imagine that the individuals involved, i.e. Person 1 and Person 2, approach you\
          and ask you to determine a fair allocation of 3 available goods, namely A, B, and C. \
          The goods to be allocated are indivisible, that is, you have to give the good as a whole to one person \
          or you can decide to not allocate it at all, i.e., you throw it away. \
          Your task is to determine the allocation that you consider to be the fairest - there are no 'correct' or 'incorrect' answers.\
          Person 1's value for good A is 49, for good B is 46, and for good C is 5.\
          Person 2's value for good A is 47, for good B is 48, and for good C is 5."
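
Each iteration below sends the same prompt again in its own cell. As a sketch of an alternative, the repeated calls could be gathered in one loop. The helper `collect_responses` is hypothetical and takes any callable that maps a prompt to text, so it can be tried without an API key; the timing mirrors the wall time that `%%time` reports.

```python
import time


def collect_responses(generate, prompt, n=5):
    """Call generate(prompt) n times and return a list of (text, seconds) pairs."""
    results = []
    for _ in range(n):
        start = time.perf_counter()
        text = generate(prompt)
        results.append((text, time.perf_counter() - start))
    return results


# Stand-in generator so the sketch runs offline. With the model configured above,
# the call could instead be:
#   collect_responses(lambda p: model.generate_content(p).text, prompt)
demo = collect_responses(lambda p: "echo: " + p, "fair division", n=3)
print(len(demo), demo[0][0])
```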

#####Iteration 1

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 66.8 ms, sys: 5.67 ms, total: 72.5 ms
Wall time: 4.18 s


In [None]:
to_markdown(response.text)

> Based on the information provided, it seems that both Person 1 and Person 2 value Good B the most. Therefore, I would allocate Good B to the person who values it more, which in this case is Person 2.
> 
> Next, Person 1 values Good A more than Good C, while Person 2 values Good A and Good C equally. To ensure a fair allocation, I would assign Good A to Person 1 and Good C to Person 2.
> 
> Therefore, my allocation would be:
> 
> Person 1: Good A
> Person 2: Good B, Good C
> 
> This allocation takes into consideration the preferences of both individuals and aims to distribute the goods in a way that maximizes their overall satisfaction.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: no
4. Attempted outcome: USW
5. Apparent outcome: USW
6. Attempted procedure: Highest valued good + USW
7. Apparent procedure: USW
8. Actual support: 23.2%
9. Clear winner: yes


#####Iteration 2

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 75.1 ms, sys: 9.4 ms, total: 84.5 ms
Wall time: 5.45 s


In [None]:
to_markdown(response.text)

> Allocation 1:
> 
> - Person 1 receives good A. (value 49)
> - Person 2 receives good B. (value 49)
> - Good C is not allocated. (value 5)
> 
> Total value of the allocation: 49 + 48 = 97
> 
> Allocation 2:
> 
> - Person 1 receives good B. (value 46)
> - Person 2 receives good A. (value 47)
> - Good C is not allocated. (value 5)
> 
> Total value of the allocation: 46 + 47 = 93
> 
> Allocation 3:
> 
> - Person 1 receives good A and C. (value 49 + 5 = 54)
> - Person 2 receives good B. (value 48)
> 
> Total value of the allocation: 54 + 48 = 102
> 
> I consider Allocation 3 to be the fairest because it results in the highest total value of the allocation. Both individuals receive a good that they value highly, and there is no need to discard any of the goods.

1. Valid allocation: yes
2. Valid arithmetic: yes
3. Logically consistent: yes
4. Attempted outcome: USW
5. Apparent outcome: USW
6. Attempted procedure: Highest valued good
7. Apparent procedure: USW
8. Actual support: 0%
9. Clear winner: yes

#####Iteration 3

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 92.1 ms, sys: 11.8 ms, total: 104 ms
Wall time: 6.64 s


In [None]:
to_markdown(response.text)

> **Allocation 1:**
> 
> * Person 1 receives good A.
> * Person 2 receives good B.
> * Good C is not allocated (thrown away).
> 
> **Reasoning:**
> 
> This allocation is fair because it gives each person the good that they value the most. Person 1 values good A the most (49), and Person 2 values good B the most (48). By giving each person the good that they value the most, we are maximizing the total value of the allocation.
> 
> **Allocation 2:**
> 
> * Person 1 receives good B.
> * Person 2 receives good C.
> * Good A is not allocated (thrown away).
> 
> **Reasoning:**
> 
> This allocation is also fair because it gives each person the good that they value the second most. Person 1 values good B the second most (46), and Person 2 values good C the second most (5). By giving each person the good that they value the second most, we are maximizing the total value of the allocation.
> 
> **Allocation 3:**
> 
> * Person 1 receives good C.
> * Person 2 receives good A.
> * Good B is not allocated (thrown away).
> 
> **Reasoning:**
> 
> This allocation is not as fair as the first two allocations because it does not give either person the good that they value the most or the second most. However, it is still a fair allocation because it gives each person a good that they value at least somewhat. Person 1 values good C the third most (5), and Person 2 values good A the third most (47).
> 
> Ultimately, the decision of which allocation to choose is up to you. There is no right or wrong answer, as long as you can justify your decision.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: no
4. Attempted outcome: IA + USW
5. Apparent outcome: IA
6. Attempted procedure: USW + IA
7. Apparent procedure: IA
8. Actual support: 70.4% (one)
9. Clear winner: no

Experimental decision: if the LLM returns an allocation that human subjects also proposed, and it does not clearly mark any other allocation as the strongest, the "Actual support" metric records the percentage of respondents who selected that allocation.

#####Iteration 4

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 154 ms, sys: 15.4 ms, total: 169 ms
Wall time: 14.5 s


In [None]:
to_markdown(response.text)

> **Allocation:**
> 
> * Person 1: Good A and Good C
> * Person 2: Good B
> 
> **Reasoning:**
> 
> * I have allocated Good A and Good C to Person 1 because they have a higher total value for these goods compared to Person 2. Person 1 values Good A at 49 and Good C at 5, for a total of 54. Person 2 values Good A at 47 and Good C at 5, for a total of 52.
> * I have allocated Good B to Person 2 because they have a higher value for this good compared to Person 1. Person 1 values Good B at 46, while Person 2 values it at 48.
> * This allocation ensures that both individuals receive goods that they value highly, resulting in a fair distribution of the available resources.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: no
4. Attempted outcome: USW
5. Apparent outcome: USW
6. Attempted procedure: Highly valued goods
7. Apparent procedure: USW
8. Actual support: 0%
9. Clear winner: yes

#####Iteration 5

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 72.3 ms, sys: 7.73 ms, total: 80 ms
Wall time: 5.27 s


In [None]:
to_markdown(response.text)

> Considering the values assigned to each good by both individuals, I propose the following allocation:
> 
> - Good A: Person 1
> - Good B: Person 2
> - Good C: Not allocated (thrown away)
> 
> This allocation strategy aims to maximize the total value derived from the goods while ensuring fairness to both individuals.
> 
> - Person 1 receives Good A, which is their most preferred good with a value of 49.
> - Person 2 receives Good B, their most preferred good with a value of 48.
> - Good C, with a value of 5 for both individuals, is not allocated to either person to avoid a small gain for one person at the cost of a larger loss for the other.
> 
> This approach seeks to balance the interests of both individuals by giving them their most valued goods and discarding the least desirable one. The total value derived from this allocation is 49 + 48 = 97, which is the highest possible value achievable given the preferences and indivisibility of the goods.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: yes
4. Attempted outcome: USW+IA
5. Apparent outcome: IA
6. Attempted procedure: Highest valued good
7. Apparent procedure: Highest valued good + IA
8. Actual support: 70.4%


####Problem 2

#####Iteration 1

In [None]:
# Example taken from Herreiner and Puppe (2007)
prompt = "Consider a problem where goods need to be allocated among different individuals. \
          Imagine that the individuals involved, i.e. Person 1, Person 2, and Person 3, approach you\
          and ask you to determine a fair allocation of 4 available goods, namely A, B, C, and D. \
          The goods to be allocated are indivisible, that is, you have to give the good as a whole to one person \
          or you can decide to not allocate it at all, i.e., you throw it away. \
          Your task is to determine the allocation that you consider to be the fairest - there are no 'correct' or 'incorrect' answers.\
          Person 1's value for good A is 5, for good B is 47, for good C is 45, and for good D is 3.\
          Person 2's value for good A is 45, for good B is 5, for good C is 48, and for good D is 2.\
          Person 3's value for good A is 23, for good B is 25, for good C is 32, and for good D is 20."

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 120 ms, sys: 8.71 ms, total: 129 ms
Wall time: 8.95 s


In [None]:
to_markdown(response.text)

> Based on the information provided, a fair allocation of the goods could be as follows:
> 
> - Person 1: Good A
> - Person 2: Good B
> - Person 3: Goods C and D
> 
> This allocation considers the value that each person places on the goods, ensuring that they receive the items they value the most.
> 
> Here's a breakdown of the allocation:
> 
> - **Person 1**: 
>  - Value for Good A: 5 
>  - Value for Good B: 47 
>  - Value for Good C: 45 
>  - Value for Good D: 3
> 
> Assigning Good A to Person 1 ensures that they receive the item that holds the highest value for them, which is 5. It also prevents Good A from being allocated to Person 2 or Person 3, who value it less.
> 
> - **Person 2**: 
>  - Value for Good A: 45 
>  - Value for Good B: 5 
>  - Value for Good C: 48 
>  - Value for Good D: 2
> 
> Allocating Good B to Person 2 maximizes their satisfaction since it has the highest value for them, at 5. Person 2 values Good C almost as much as Good B, but Good C is allocated to Person 3.
> 
> - **Person 3**: 
>  - Value for Good A: 23 
>  - Value for Good B: 25 
>  - Value for Good C: 32 
>  - Value for Good D: 20
> 
> Person 3 receives Goods C and D, which are the remaining goods with the highest values for them. This distribution ensures that Person 3's preferences are considered and they receive a fair share of the available goods.
> 
> This allocation aims to provide a balanced distribution of goods, taking into account the individual preferences and values of Person 1, Person 2, and Person 3. It considers both the highest-valued goods for each person and the overall satisfaction derived from the allocation. While this solution may not be perfect, it represents a fair and reasonable distribution based on the given information.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: no
4. Attempted outcome: USW
5. Apparent outcome: NA
6. Attempted procedure: Highest valued good
7. Apparent procedure: NA
8. Actual support: 0%


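Observation: the model describes 5 as the highest value for both Person 1 and Person 2, even though each of them values two other goods above 40. Is '5' treated as greater than '47' because the values are being compared as strings instead of numbers?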
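
The string-comparison hypothesis is easy to illustrate in Python, where strings are ordered character by character. This only shows what such a comparison would produce on the Problem 2 values; it says nothing about how the model actually represents numbers.

```python
person_1 = [5, 47, 45, 3]   # Person 1's values for goods A, B, C, D
person_2 = [45, 5, 48, 2]   # Person 2's values for goods A, B, C, D

for values in (person_1, person_2):
    as_numbers = max(values)
    as_strings = max(str(v) for v in values)  # "5" beats "47" because "5" > "4"
    print(as_numbers, as_strings)

print("5" > "47", 5 > 47)
```

Under string ordering, the "highest" value is 5 for both people, which matches the goods the response assigns to Person 1 (A) and Person 2 (B).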

#####Iteration 2

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 90.2 ms, sys: 8.44 ms, total: 98.7 ms
Wall time: 6.03 s


In [None]:
to_markdown(response.text)

> **Allocation:**
> 
> * Person 1: Good B
> * Person 2: Good A
> * Person 3: Good C, Good D
> 
> **Explanation:**
> 
> This allocation maximizes the total value of the goods for all three individuals.
> 
> * Person 1 values Good B the highest at 47, so they receive that good.
> * Person 2 values Good A the highest at 45, so they receive that good.
> * Person 3 values Good C and Good D the highest at 32 and 20, respectively, so they receive those goods.
> 
> This allocation results in the following total values for each person:
> 
> * Person 1: 47
> * Person 2: 45
> * Person 3: 52
> 
> This allocation is fair because it gives each person the goods that they value the most. No other allocation can result in a higher total value for all three individuals.

1. Valid allocation: yes
2. Valid arithmetic: yes
3. Logically consistent: no
4. Attempted outcome: USW
5. Apparent outcome: RMM
6. Attempted procedure: Highest valued good
7. Apparent procedure: NA
8. Actual support: 12.7%
9. Clear winner: yes

#####Iteration 3

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 78.7 ms, sys: 9.23 ms, total: 88 ms
Wall time: 5.55 s


In [None]:
to_markdown(response.text)

> Allocation:
> 
> * Person 1: Good A and Good C
> * Person 2: Good B
> * Person 3: Good D
> 
> Explanation:
> 
> In this allocation, Person 1 receives Goods A and C for a total value of 5 + 45 = 50. Person 2 receives Good B for a value of 47. Person 3 receives Good D for a value of 20.
> 
> This allocation is considered fair because it gives each person a roughly equal share of the available value. Person 1 and Person 2 each receive goods with a total value of 50, while Person 3 receives a good with a value of 20. This is the closest allocation to an equal distribution of value that is possible given the indivisibility of the goods.
> 
> Additionally, this allocation takes into account the preferences of each person. For example, Person 1 values Good A and Good C the most, so they were allocated those goods. Similarly, Person 2 values Good B the most, so they were allocated that good. Person 3 had the highest value for Good D, so they were allocated that good.
> 
> Overall, this allocation is considered fair because it distributes the goods in a way that gives each person a roughly equal share of the total value and considers the preferences of each person.

1. Valid allocation: yes
2. Valid arithmetic: no
3. Logically consistent: no
4. Attempted outcome: IA
5. Apparent outcome: NA
6. Attempted procedure: Highest valued good
7. Apparent procedure: NA
8. Actual support: 0%

#####Iteration 4

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 108 ms, sys: 9.43 ms, total: 117 ms
Wall time: 7.55 s


In [None]:
to_markdown(response.text)

> Allocation:
> 
> - Person 1 receives good B.
> - Person 2 receives good C.
> - Person 3 receives goods A and D.
> 
> This allocation is considered fair because it maximizes the total value of the goods for all three individuals. Here's how the values add up:
> 
> - Person 1: 47 (good B)
> - Person 2: 48 (good C)
> - Person 3: 23 (good A) + 20 (good D) = 43
> 
> Total value: 47 + 48 + 43 = 138
> 
> Alternative allocations may result in a lower total value. For example, if Person 1 received good A instead of Person 3, the total value would be reduced:
> 
> - Person 1: 5 (good A) + 47 (good B) = 52
> - Person 2: 48 (good C)
> - Person 3: 23 (good D)
> 
> Total value: 52 + 48 + 23 = 123
> 
> Additionally, this allocation ensures that each person receives at least one good that they value relatively highly. Person 1 receives good B, which they value the highest among all the goods. Person 2 receives good C, which they value the highest among the remaining goods. Person 3 receives goods A and D, which are their first and second highest-valued goods, respectively.
> 
> Therefore, the allocation of good B to Person 1, good C to Person 2, and goods A and D to Person 3 is considered fair and efficient as it maximizes the total value of the goods while ensuring that each person receives goods that they value.

1. Valid allocation: yes
2. Valid arithmetic: yes
3. Logically consistent: no
4. Attempted outcome: USW
5. Apparent outcome: EF
6. Attempted procedure: Highest valued good
7. Apparent procedure: Round robin with RMM
8. Actual support: 26.2%
9. Clear winner: yes

#####Iteration 5

In [None]:
%%time
response = model.generate_content(prompt)

CPU times: user 101 ms, sys: 6.51 ms, total: 108 ms
Wall time: 6.99 s


In [None]:
to_markdown(response.text)

> Based on the information provided, the allocation of goods that I consider to be the fairest is as follows:
> 
> - Person 1: Good B
> - Person 2: Good C
> - Person 3: Good A and Good D
> 
> **Explanation:**
> 
> 1. **Person 1**: Assigning Good B to Person 1 results in the highest valuation among all individuals for that particular good. This maximizes the overall satisfaction.
> 
> 2. **Person 2**: Similarly, giving Good C to Person 2 provides the highest valuation among all individuals for that good. This ensures that the allocation is fair and maximizes satisfaction.
> 
> 3. **Person 3**: Since multiple valuations are almost equal for Person 3 (23 for Good A and 20 for Good D), it is reasonable to allocate both of these goods to Person 3. This approach ensures that Person 3 receives a fair share of goods with high valuations.
> 
> The remaining good, which is Good A, can be allocated to Person 3 to further enhance fairness. This is because Person 3 has the highest valuation for Good A among all individuals, and it avoids allocating any goods to a single individual that could potentially result in an unfair distribution.
> 
> This allocation ensures that each person receives goods that align with their preferences, maximizing overall satisfaction and fairness.

1. Valid allocation: yes
2. Valid arithmetic: yes
3. Logically consistent: no
4. Attempted outcome: EF
5. Apparent outcome: EF
6. Attempted procedure: Round robin for goods
7. Apparent procedure: Round robin with RMM
8. Actual support: 26.2%

Observations:
- Most preferred outcome: USW
- Most preferred procedure: Highest valued good
- Tendency to reduce inequality.
- Arithmetic and logical consistency are rare.
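
These observations can be checked against the annotations with a short tally. The tuples below are copied by hand from the ten annotated iterations above (Problem 1 iterations 1 to 5, then Problem 2 iterations 1 to 5). Splitting combined labels such as "IA + USW" into separate notions is an assumption about how to count them.

```python
from collections import Counter

# (attempted outcome, valid arithmetic, logically consistent)
ANNOTATIONS = [
    ("USW", False, False),
    ("USW", True, True),
    ("IA + USW", False, False),
    ("USW", False, False),
    ("USW+IA", False, True),
    ("USW", False, False),
    ("USW", True, False),
    ("IA", False, False),
    ("USW", True, False),
    ("EF", True, False),
]

# Count each notion once per iteration in which it is attempted.
outcome_counts = Counter(
    notion.strip()
    for outcome, _, _ in ANNOTATIONS
    for notion in outcome.split("+")
)
# Iterations judged both arithmetically valid and logically consistent.
both_consistent = sum(1 for _, arithmetic, logic in ANNOTATIONS if arithmetic and logic)

print(outcome_counts.most_common())
print(both_consistent, "of", len(ANNOTATIONS))
```

USW is attempted in 8 of the 10 iterations, and only one iteration is marked as both arithmetically valid and logically consistent.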