# Violations of Preferential Equality and Preferential Compensation

In [31]:
from pref_voting.generate_profiles import * 
from pref_voting.voting_methods import *
from pref_voting.rankings import *
from pref_voting.profiles_with_ties import ProfileWithTies
from pref_voting.iterative_methods import top_n_instant_runoff_for_truncated_linear_orders

import glob
from zipfile import ZipFile
import os
import io
from tqdm.notebook import tqdm
import ray # optional for parallel processing

Recall that Preferential Equality (in its coalitional form) is violated when there are candidates $x,y$ and coalitions $I,J$ of voters of equal size such that if everyone in $I$ switches from ranking $x$ immediately above $y$ to ranking $y$ immediately above $x$, this will produce a different winner (or winners) than if everyone in $J$ switches from ranking $x$ immediately above $y$ to ranking $y$ immediately above $x$.

 Recall that Preferential Compensation (in its coaliation form) is violated when there are candidates $x,y$ and coalitions $I,J$ of voters of equal size such that if everyone in $I$ switches from ranking $x$ immediately above $y$ to ranking $y$ immediately above $x$, while everyone in $J$ switches from ranking $y$ immediately above $x$ to ranking $x$ immediately above $y$, this changes the winner(s).

### Helper functions

In [29]:
def same_ranking_extended_strict_pref(ranking1, ranking2, candidates): 
    # check if ranking1 and ranking2 have the same ranking of candidates
    for c1 in candidates:
        for c2 in candidates:
            if (ranking1.extended_strict_pref(c1, c2) and not ranking2.extended_strict_pref(c1, c2)) or (not ranking1.extended_strict_pref(c1, c2) and ranking2.extended_strict_pref(c1, c2)):
                return False
    return True

def get_winner_runner_up_loser(profile): 
    """Finds the winner, runner-up, and loser in a three-candidate IRV election."""
    pl_scores = profile.plurality_scores()
    
    # Loser = candidate with fewest first-place votes
    lowest_pl_score = sorted(set(pl_scores.values()))[0]
    loser = [c for c in pl_scores if pl_scores[c] == lowest_pl_score][0]

    # Winner = the IRV winner
    winner = instant_runoff_for_truncated_linear_orders(profile)[0]

    # Runner-up = the remaining candidate
    runner_up = [c for c in profile.candidates if c not in [winner, loser]][0]

    return winner, runner_up, loser

In [30]:
r1 = Ranking({
    "a":1, 
    "b":2, 
    "c":3})

r2 = Ranking({
    "a":1,
    "b":2}
)

same_ranking_extended_strict_pref(r1, r2, ["a", "b", "c"])

True

# Preferential Equality violations in three-candidate IRV elections

Assume a given profile represent an IRV election with three candidates. Assume the winner is candidate `a`, the runner up is candidate `b`, and the loser is candidate `c`.

Naively, there are six ways preferential equality could be violated, but only three are actually possible:

## Case A: Swapping $b$ and $c$

#### Violations possible in Case A1:

**Case A1.** Flipping any number of voters with $abc$ to $acb$ ($bc$ to $cb$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$. (Hence we do not test whether such flipping changes the winner in the code below.)

Now if flipping $\min(\text{num\_abc}, \text{num\_bca})$ voters with the ranking $bca$ to $cba$ ($bc$ to $cb$) changes the winner, then **this is a violation of Preferential Equality**. 

Note that if flipping some $k < \min(\text{num\_abc}, \text{num\_bca})$ voters from $bca$ to $cba$ changes the winner, because $b$ is eliminated in the first round, then flipping any number between $k$ and $\min(\text{num\_abc}, \text{num\_bca})$ voters will have the same effect, and no matter how many voters between $k$ and $\min(\text{num\_abc}, \text{num\_bca})$ we flip, the resulting profiles will be identical with respect to $a$ vs. $c$. 

Hence all we need to check in this case is whether flipping $\min(\text{num\_abc}, \text{num\_bca})$ from $bca$ to $cba$ changes the winner.

#### No violation possible in Case A2:

**Case A2.** Flipping any number of voters with $acb$ to $abc$ ($cb$ to $bc$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$.

Similarly, flipping any number of voters with the ranking $cba$ to $bca$ ($cb$ to $bc$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$.

## Case B: Swapping $a$ and $c$

#### Violation possible in Case B1:

**Case B1.** Flipping any number of voters with $bac$ to $bca$ ($ac$ to $ca$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$. (Hence we do not test whether such flipping changes the winner in the code below.)

Now if flipping up to $\min(\text{num\_bac}, \text{num\_acb})$ voters with the ranking $acb$ to the ranking $cab$ ($ac$ to $ca$) changes the winner, then **this is a violation of Preferential Equality**. 

Note that if flipping some $k < \min(\text{num\_bac}, \text{num\_acb})$ voters from $acb$ to $cab$ changes the winner, because either $a$ or $b$ is eliminated in the first round, then flipping any number between $k$ and $\min(\text{num\_bac}, \text{num\_acb})$ voters will have the same effect; for if $k$ flips causes $a$ to be eliminated, then more flips will also cause $a$ to be eliminated; and if $k$ flips causes $b$ to be eliminated and then $c$ to win, then more flips will either cause $b$ will to still be eliminated and $c$ to win or cause $a$ to be eliminated, in which case again the winner has changed. 

Hence all we need to check in this case is whether flipping $\min(\text{num\_bac}, \text{num\_acb})$ from $acb$ to $cab$ changes the winner. 

#### No violation possible in Case B2:

**Case B2.** Flipping any number of voters with $bca$ to $bac$ ($ca$ to $ac$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$.

Similarly, flipping some voters with the ranking $cab$ to $acb$ ($ca$ to $ac$) will not change the winner, since $c$ will still be eliminated in the first round, and we have not changed the ranking of $a$ vs. $b$.

## Case C: Swapping $a$ and $b$

#### No violation possible in Case C1:

*Case C1.** Flipping any number of voters with $cab$ to $cba$ ($ab$ to $ba$) might change the winner, since although $c$ will still be eliminated in the first round, we have changed the ranking of $a$ vs. $b$.

But then flipping the same number of voters with the ranking $abc$ to $bac$ ($ab$ to $ba$) will change the winner in the same way, since $c$ will still be eliminated in the first round, and we have changed the ranking of $a$ vs. $b$ in exactly the same way.

#### Violation possible in Case C2:

**Case C2.** Flipping any number of voters with $cba$ to $cab$ ($ba$ to $ab$) will not change the winner, since $c$ will still be eliminated in the first round, and then $a$ will still win (with even more votes in the second round that before).

Now if flipping up to $\min(\text{num\_cba}, \text{num\_bac})$ voters with the ranking $bac$ to $abc$ ($ba$ to $ab$) changes the winner (since then $b$ receives fewer first-place votes and hence might be eliminate in the first round instead of $c$), then **this is a violation of Preferential Equality**. 

Note that if flipping some $k < \min(\text{num\_cba}, \text{num\_bac})$ voters from $bac$ to $abc$ changes the winner, because $b$ is eliminated and then $c$ wins, then flipping more voters from $bac$ to $abc$ will have the same effect, since $b$ will be eliminated and the resulting profile will be the same with respect to $a$ vs. $c$. 

Hence all we need to check in this case is whether flipping any $\min(\text{num\_cba}, \text{num\_bac})$ from $bac$ to $abc$ changes the winner.

In [3]:
def has_irv_preferential_equality_violation(profile, winner, runner_up, loser): 

    ''' 
    Returns True only if the profile has a Preferential Equality Violation, assuming the profile represents a three-candidate IRV election.

    The winner is candidate `a`, the runner up is candidate `b`, and the loser is candidate `c`.

    '''

    abc = Ranking({
        winner:1, 
        runner_up:2, 
        loser:3})
    
    acb = Ranking({
        winner:1, 
        loser:2, 
        runner_up:3})
    
    bca = Ranking({
        runner_up:1, 
        loser:2, 
        winner:3})
    
    bac = Ranking({
        runner_up:1, 
        winner:2, 
        loser:3})
    
    cab = Ranking({
        loser:1, 
        winner:2, 
        runner_up:3})
    
    cba = Ranking({
        loser:1, 
        runner_up:2, 
        winner:3})

    ranking_types, rcounts = profile.rankings_counts

    # Check for Case A1:

    num_abc = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, abc, profile.candidates):
            num_abc += c
    
    new_prof = profile.replace_rankings(bca, cba, num_abc) # note that replace_ranking in effect switches min(num_abc,num_bca) voters from bca to cba


    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "A1"

    # Check for Case B1:

    num_bac = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, bac, profile.candidates):
            num_bac += c

    new_prof = profile.replace_rankings(acb, cab, num_bac) # note that replace_ranking in effect switches min(num_bac,num_acb) voters from acb to cab

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "B1"
    
    # Check for Case C2:

    num_cba = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, cba, profile.candidates):
            num_cba += c

    new_prof = profile.replace_rankings(bac, abc, num_cba) # note that replace_ranking in effect switches min(num_cba,num_bac) voters from bac to abc

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "C2"

    return False, None

# Preferential Compensation violations in three-candidate IRV elections

Naively, there are six ways preferential compensation could be violated, but only four are actually possible:

## Case A: Swapping $b$ and $c$

#### No violation in Case A1:

**Case A1.** Let $N$ be the min of the number of voters with $abc$ and the number with $cba$. If flipping $N$ voters with $abc$ to $acb$ ($bc$ to $cb$) while also flipping $N$ voters with $cba$ to $bca$ ($cb$ to $bc$) changes the winner, this is a violation of Preferential Compensation. But this will not change the winner, since $c$ will still be eliminated in the first round (since we are giving some of $c$'s first-place votes to $b$), and we have not changed the ranking of $a$ vs. $b$.

#### Violations possible in Case A2:

**Case A2.** Let $N$ be the min of the number of voters with $acb$ and the number of voters with $bca$. If flipping $N$ voters with $acb$ to $abc$ ($cb$ to $bc$) while also flipping $N$ voters with $bca$ to $cba$ ($bc$ to $cb$) changes the winner, this is a violation of Preferential Compensation.

## Case B: Swapping $a$ and $c$

#### No violations in Case B1:

**Case B1.** Let $N$ be the min of the number of voters with $bac$ and the number of voters with $cab$. If flipping $N$ voters with $bac$ to $bca$ ($ac$ to $ca$) while also flipping $N$ voters with $cab$ to $acb$ ($ca$ to $ac$) changes the winner, this is a violation of Preferential Compensation. But this will not change the winner, since $c$ will still be eliminated in the first round (since we are giving some of $c$'s first-place votes to $a$), and we have not changed the ranking of $a$ vs. $b$.

#### Violations possible in Case B2:

**Case B2.** Let $N$ be the min of the number of voters with $bca$ and the number of voters with $acb$. If flipping $N$ voters with $bca$ to $bac$ ($ca$ to $ac$) while also flipping $N$ voters with $acb$ to $cab$ ($ac$ to $ca$) changes the winner, this is a violation of Preferential Compensation.

## Case C: Swapping $a$ and $b$

#### Violations possible in Case C1:

**Case C1.** Let $N$ be the min of the number of voters with $cab$ and the number of voters with $abc$. If flipping $N$ voters with $cab$ to $cba$ ($ab$ to $ba$) while also flipping $N$ voters with $bac$ to $abc$ ($ba$ to $ab$) changes the winner (since $b$ might be eliminated in the first round instead of $c$), this is a violation of Preferential Compensation.

#### Violations possible in Case C2:

**Case C2.** Let $N$ be the min of the number of voters with $cba$ and the number of voters with $bac$. If flipping $N$ voters with $cba$ to $cab$ ($ba$ to $ab$) while also flipping $N$ voters with $abc$ to $bac$ ($ab$ to $ba$) changes the winner, this is a violation of Preferential Compensation.

In [4]:
def has_irv_preferential_compensation_violation(profile, winner, runner_up, loser): 
    ''' 
    Returns True only if the profile has a Preferential Compensations Violation, assuming the profile is a three-candidate IRV election.

    The winner is candidate `a`, the runner up is candidate `b`, and the loser is candidate `c`.

    '''

    abc = Ranking({
        winner:1, 
        runner_up:2, 
        loser:3})
    
    acb = Ranking({
        winner:1, 
        loser:2, 
        runner_up:3})
    
    bca = Ranking({
        runner_up:1, 
        loser:2, 
        winner:3})
    
    bac = Ranking({
        runner_up:1, 
        winner:2, 
        loser:3})
    
    cab = Ranking({
        loser:1, 
        winner:2, 
        runner_up:3})
    
    cba = Ranking({
        loser:1, 
        runner_up:2, 
        winner:3})

    ranking_types, rcounts = profile.rankings_counts

    num_abc = 0
    for r,c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, abc, profile.candidates):
            num_abc += c

    num_acb = 0
    for r,c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, acb, profile.candidates):
            num_acb += c

    num_bac = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, bac, profile.candidates):
            num_bac += c

    num_bca = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, bca, profile.candidates):
            num_bca += c

    num_cab = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, cab, profile.candidates):
            num_cab += c

    num_cba = 0
    for r, c in zip(ranking_types, rcounts): 
        if same_ranking_extended_strict_pref(r, cba, profile.candidates):
            num_cba += c

    # Case A2

    num = min(num_acb, num_bca)

    intermediate_prof = profile.replace_rankings(acb, abc, num)
    new_prof = intermediate_prof.replace_rankings(bca, cba, num)

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "A2"

    # Case B2

    num = min(num_acb, num_bca)

    intermediate_prof = profile.replace_rankings(acb, cab, num)
    new_prof = intermediate_prof.replace_rankings(bca, bac, num)

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "B2"
    
    # Case C1

    num = min(num_cab, num_abc)

    intermediate_prof = profile.replace_rankings(cab, cba, num)
    new_prof = intermediate_prof.replace_rankings(bac, abc, num)

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "C1"
    
    # Case C2

    num = min(num_cba, num_bac)

    intermediate_prof = profile.replace_rankings(cba, cab, num)
    new_prof = intermediate_prof.replace_rankings(abc, bac, num)

    if instant_runoff_for_truncated_linear_orders(new_prof) != [winner]: 
        return True, "C2"

    return False, None

In [5]:
def top_n_instant_runoff_for_truncated_linear_orders(
    profile, 
    n,
    curr_cands = None, 
    threshold = None, 
    hide_warnings = True): 
    """
    Returns the top n candidates according to the Instant Runoff method: Iteratively remove candidates until there are at most n candidates left.   Note that since there may be multiple candidates with the lowest plurality score, this may return less than n candidates.

    """
    
    assert all([not r.has_overvote() for r in profile.rankings]), "Instant Runoff is only defined when all the ballots are truncated linear orders."
    
    curr_cands = profile.candidates if curr_cands is None else curr_cands

    if len(curr_cands) <= n:
        return sorted(curr_cands)

    # we need to remove empty rankings during the algorithm, so make a copy of the profile
    prof2 = copy.deepcopy(profile) 
    
    _prof = prof2.remove_candidates([c for c in profile.candidates if c not in curr_cands])

    # remove the empty rankings
    _prof.remove_empty_rankings()
    
    remaining_candidates = _prof.candidates
        
    pl_scores = _prof.plurality_scores()
    
    while len(remaining_candidates) > n: 
        reduced_prof = _prof.remove_candidates([c for c in _prof.candidates if c not in remaining_candidates])
        
        # after removing the candidates, there might be some empty ballots.
        reduced_prof.remove_empty_rankings()

        pl_scores = reduced_prof.plurality_scores()
        min_pl_score = min(pl_scores.values())
        cands_to_remove = [c for c in pl_scores.keys() if pl_scores[c] == min_pl_score]

        if not hide_warnings and len(cands_to_remove) > 1: 
            print(f"Warning: multiple candidates removed in a round: {', '.join(map(str,cands_to_remove))}")
            
        if len(cands_to_remove) == len(reduced_prof.candidates): 
            # all remaining candidates have the same plurality score.
            remaining_candidates = reduced_prof.candidates
            break 

        remaining_candidates = [c for c in remaining_candidates if c not in cands_to_remove]

    if len(remaining_candidates) != n:
        if not hide_warnings:
            print(f"Warning: cannot reduce to exactly {n} candidates.")
        return None

    return sorted(remaining_candidates)

# Find violations

In [36]:
def find_violations(profiles): 

    num_profs = 0
    num_profs_no_absolute_maj_winner = 0

    num_pref_equality_violations = 0  
    num_pref_compensation_violations = 0  

    pref_eq_violation_types = {"A1": 0, "B1": 0, "C2": 0}
    pref_comp_violation_types = {"A2": 0, "B2": 0, "C1": 0, "C2": 0}

    for prof in tqdm(profiles): 

        if not prof.is_truncated_linear: 
            continue
        prof.remove_empty_rankings() 

        top_three = top_n_instant_runoff_for_truncated_linear_orders(prof, 3)

        if top_three is None:
            continue

        restricted_prof = prof.remove_candidates([c for c in prof.candidates if c not in top_three])

        restricted_prof.remove_empty_rankings() 
        
        if len(restricted_prof.candidates) != 3:
            continue



        irv_ws = instant_runoff_for_truncated_linear_orders(restricted_prof)

        if len(irv_ws) != 1: 
            continue
        
        num_profs += 1

        absolute_majority_winner = absolute_majority(restricted_prof)
        if len(absolute_majority_winner) == 1: 
            continue

        #print("Original profile:")
        #prof.display()
        #print("Top three:", top_three)
        #print("Restricted profile:")
        #restricted_prof.display()
        #print(restricted_prof.description())
        #print(restricted_prof.plurality_scores())
        #print(instant_runoff_for_truncated_linear_orders(restricted_prof))
        
        num_profs_no_absolute_maj_winner += 1

        winner, runner_up, loser = get_winner_runner_up_loser(restricted_prof)

        pref_equality_violation, violation_type = has_irv_preferential_equality_violation(restricted_prof, winner, runner_up, loser)
        if pref_equality_violation: 
            num_pref_equality_violations += 1

        if violation_type == "A1":
            pref_eq_violation_types["A1"] += 1
        elif violation_type == "B1":
            pref_eq_violation_types["B1"] += 1
        elif violation_type == "C2":
            pref_eq_violation_types["C2"] += 1

        pref_comp_violation, violation_type = has_irv_preferential_compensation_violation(restricted_prof, winner, runner_up, loser)

        if pref_comp_violation: 
            num_pref_compensation_violations += 1

        if violation_type == "A2":
            pref_comp_violation_types["A2"] += 1
        elif violation_type == "B2":
            pref_comp_violation_types["B2"] += 1
        elif violation_type == "C1":
            pref_comp_violation_types["C1"] += 1
        elif violation_type == "C2":
            pref_comp_violation_types["C2"] += 1

    print(f"{num_profs_no_absolute_maj_winner} out of {num_profs} relevant profiles have no absolute majority winner: {num_profs_no_absolute_maj_winner/num_profs}\n")

    print(f"Preferential Equality Violations\n{num_pref_equality_violations} violations out of {num_profs_no_absolute_maj_winner} profiles: {num_pref_equality_violations/num_profs_no_absolute_maj_winner}\n")

    print(f"Type A1: {pref_eq_violation_types['A1']}")
    print(f"Type B1: {pref_eq_violation_types['B1']}")
    print(f"Type C2: {pref_eq_violation_types['C2']}")

    print("")
    print(f"Preferential Compensation Violations\n{num_pref_compensation_violations} violations out of {num_profs_no_absolute_maj_winner} profiles: {num_pref_compensation_violations/num_profs_no_absolute_maj_winner}\n")

    print(f"Type A2: {pref_comp_violation_types['A2']}")
    print(f"Type B2: {pref_comp_violation_types['B2']}")
    print(f"Type C1: {pref_comp_violation_types['C1']}")
    print(f"Type C2: {pref_comp_violation_types['C2']}")

In [37]:
@ray.remote
def process_profile(prof):

    if not prof.is_truncated_linear: 
        return None
    prof.remove_empty_rankings() 

    top_three = top_n_instant_runoff_for_truncated_linear_orders(prof, 3)

    if top_three is None:
        return None

    restricted_prof = prof.remove_candidates([c for c in prof.candidates if c not in top_three])
    restricted_prof.remove_empty_rankings() 
        
    if len(restricted_prof.candidates) != 3:
        return None

    irv_ws = instant_runoff_for_truncated_linear_orders(restricted_prof)

    if len(irv_ws) != 1: 
        return None
    
    absolute_majority_winner = absolute_majority(restricted_prof)
    if len(absolute_majority_winner) == 1: 
        return {'no_absolute_majority': False}

    winner, runner_up, loser = get_winner_runner_up_loser(restricted_prof)

    pref_equality_violation, pref_equality_violation_type = has_irv_preferential_equality_violation(restricted_prof, winner, runner_up, loser)
    pref_compensation_violation, pref_compensation_violation_type = has_irv_preferential_compensation_violation(restricted_prof, winner, runner_up, loser)

    return {
        'no_absolute_majority': True,
        'pref_equality_violation': pref_equality_violation,
        'pref_equality_violation_type': pref_equality_violation_type,
        'pref_compensation_violation': pref_compensation_violation,
        'pref_compensation_violation_type': pref_compensation_violation_type
    }

def find_violations_parallel(profiles):

    futures = [process_profile.remote(prof) for prof in profiles]

    results = []
    for f in tqdm(futures):
        result = ray.get(f)
        if result is not None:
            results.append(result)

    num_profs = len(results)
    num_profs_no_absolute_maj_winner = sum(1 for r in results if r['no_absolute_majority'])
    num_pref_equality_violations = sum(1 for r in results if r.get('pref_equality_violation'))
    num_pref_compensation_violations = sum(1 for r in results if r.get('pref_compensation_violation'))

    pref_eq_violation_types = {"A1": 0, "B1": 0, "C2": 0}
    pref_comp_violation_types = {"A2": 0, "B2": 0, "C1": 0, "C2": 0}

    for r in results:
        if r.get('pref_equality_violation'):
            pref_eq_violation_types[r['pref_equality_violation_type']] += 1
        if r.get('pref_compensation_violation'):
            pref_comp_violation_types[r['pref_compensation_violation_type']] += 1

    print(f"{num_profs_no_absolute_maj_winner} out of {num_profs} profiles have no absolute majority winner: {num_profs_no_absolute_maj_winner/num_profs if num_profs else 0}\n")

    print(f"Preferential Equality Violations\n{num_pref_equality_violations} violations out of {num_profs_no_absolute_maj_winner} profiles: {num_pref_equality_violations/num_profs_no_absolute_maj_winner if num_profs_no_absolute_maj_winner else 0}\n")

    print(f"Type A1: {pref_eq_violation_types['A1']}")
    print(f"Type B1: {pref_eq_violation_types['B1']}")
    print(f"Type C2: {pref_eq_violation_types['C2']}")

    print("")
    print(f"Preferential Compensation Violations\n{num_pref_compensation_violations} violations out of {num_profs_no_absolute_maj_winner} profiles: {num_pref_compensation_violations/num_profs_no_absolute_maj_winner if num_profs_no_absolute_maj_winner else 0}\n")

    print(f"Type A2: {pref_comp_violation_types['A2']}")
    print(f"Type B2: {pref_comp_violation_types['B2']}")
    print(f"Type C1: {pref_comp_violation_types['C1']}")
    print(f"Type C2: {pref_comp_violation_types['C2']}")

In [38]:
#num_trials = 100
#num_cands = 3
#num_voters = 1001

#profiles = [generate_profile(num_cands, num_voters).to_profile_with_ties() for _ in range(1000)]

#find_violations(profiles)

## Stable Voting Website

In [39]:
profiles = [ProfileWithTies.read(fname) for fname in glob.glob('real_elections/stable_voting_dataset/*')]

find_violations(profiles)

  0%|          | 0/657 [00:00<?, ?it/s]

60 out of 130 relevant profiles have no absolute majority winner: 0.46153846153846156

Preferential Equality Violations
45 violations out of 60 profiles: 0.75

Type A1: 8
Type B1: 37
Type C2: 0

Preferential Compensation Violations
25 violations out of 60 profiles: 0.4166666666666667

Type A2: 10
Type B2: 12
Type C1: 0
Type C2: 3


## Preflib Dataset

In [40]:
parallel = True
display_profiles = False

profiles = []
elections = []
file_names = []

soi_profiles = []
toi_profiles = []
toc_profiles = []

print("Reading in .soi files...")
for fname in tqdm(glob.glob("real_elections/preflib_dataset/*.soi")):

    election_name = fname.split("/")[-1].split(".")[0]

    elections.append(election_name)
    file_names.append(fname)
    prof = ProfileWithTies.read(fname)

    if display_profiles:
        prof.display()

    soi_profiles.append(prof)
    profiles.append(prof)

#print(f"Checking {len(soi_profiles)} .soi files...")

#if parallel:
#    ray.init(ignore_reinit_error=True)
#    find_violations_parallel(soi_profiles)
#    ray.shutdown()
#else:
#    find_violations(soi_profiles)

print("Reading in .toi files...")
skipped_toi = 0
for fname in tqdm(glob.glob("real_elections/preflib_dataset/*.toi")):

    election_name = fname.split("/")[-1].split(".")[0]

    if election_name in elections: 
        print(f"Already have {election_name}.")
        skipped_toi += 1
        continue

    elections.append(election_name)
    file_names.append(fname)
    prof = ProfileWithTies.read(fname)

    if display_profiles:
        prof.display()

    toi_profiles.append(prof)
    profiles.append(prof)

print(f"Skipped {skipped_toi} .toi files.")
#print(f"Checking {len(toi_profiles)} .toi files...")

#if parallel:
#    ray.init(ignore_reinit_error=True)
#    find_violations_parallel(toi_profiles)
#    ray.shutdown()
#else:
#    find_violations(toi_profiles)

print("")
print("Reading in .toc files...")
skipped_toc = 0
for fname in tqdm(glob.glob("real_elections/preflib_dataset/*.toc")):

    election_name = fname.split("/")[-1].split(".")[0]
    
    if election_name in elections: 
        skipped_toc += 1
        continue

    elections.append(election_name)
    file_names.append(fname)
    prof = ProfileWithTies.read(fname)

    if display_profiles:
        prof.display()
    
    toc_profiles.append(prof)
    profiles.append(prof)

print(f"Skipped {skipped_toc} .toc files.")
#print(f"Checking {len(toc_profiles)} .toc files...")
#if parallel:
#    ray.init(ignore_reinit_error=True)
#    find_violations_parallel(toc_profiles)  
#    ray.shutdown()
#else:
#    find_violations(toc_profiles)

print("")
print("Checking all files together...")
if parallel:
    ray.init(ignore_reinit_error=True)
    find_violations_parallel(profiles)
    ray.shutdown()
else:
    find_violations(profiles)

Reading in .soi files...


  0%|          | 0/308 [00:00<?, ?it/s]

Reading in .toi files...


  0%|          | 0/34 [00:00<?, ?it/s]

Skipped 0 .toi files.

Reading in .toc files...


  0%|          | 0/85 [00:00<?, ?it/s]

Skipped 63 .toc files.

Checking all files together...


2025-04-03 09:21:57,782	INFO worker.py:1852 -- Started a local Ray instance.


  0%|          | 0/364 [00:00<?, ?it/s]

104 out of 150 profiles have no absolute majority winner: 0.6933333333333334

Preferential Equality Violations
10 violations out of 104 profiles: 0.09615384615384616

Type A1: 4
Type B1: 6
Type C2: 0

Preferential Compensation Violations
17 violations out of 104 profiles: 0.16346153846153846

Type A2: 4
Type B2: 5
Type C1: 0
Type C2: 8


## CIVS Dataset

In [11]:
import json
profiles = []
_civs_elections = json.load(open("real_elections/civs_dataset/2024-12-15.json"))

civs_elections = _civs_elections['elections']
profiles = []
for election in tqdm(civs_elections):
    if election["test"] == "yes":
        continue
    ballots = []
    num_candidates = election['num_choices']
    for b in election['ballots']:
        ballots.append({cand: rank for cand, rank in enumerate(b) if rank != "?"})
    profiles.append(ProfileWithTies(ballots, candidates=list(range(num_candidates))))
    
find_violations(profiles)

  0%|          | 0/22477 [00:00<?, ?it/s]

  0%|          | 0/22477 [00:00<?, ?it/s]

590 out of 1216 relevant profiles have no absolute majority winner: 0.48519736842105265

Preferential Equality Violations
393 violations out of 590 profiles: 0.6661016949152543

Type A1: 108
Type B1: 285
Type C2: 0

Preferential Compensation Violations
340 violations out of 590 profiles: 0.576271186440678

Type A2: 137
Type B2: 133
Type C1: 6
Type C2: 64


## Otis 2022 Dataset

In [12]:
parallel = True

if parallel:

    ray.init(ignore_reinit_error=True)

    items_to_skip = ['"skipped', 'overvote', 'undervote']

    @ray.remote
    def process_zip_file(file_path):
        enames, profiles = [], []
        with ZipFile(file_path, 'r') as zip_ref:
            for name in zip_ref.namelist():
                if name.endswith(".csv"):
                    with zip_ref.open(name) as f:
                        csv_text = f.read().decode('utf-8')
                        csv_buffer = io.StringIO(csv_text)
                        prof = ProfileWithTies.read(
                            csv_buffer,
                            file_format='csv',
                            csv_format='rank_columns',
                            items_to_skip=items_to_skip
                        )
                        enames.append(name)
                        profiles.append(prof)
        return enames, profiles

    zip_files = glob.glob("real_elections/otis_2022_dataset/*.zip")

    # Submit tasks to Ray
    futures = [process_zip_file.remote(file) for file in zip_files]

    # Gather results with progress bar
    results = []
    for f in tqdm(futures):
        results.append(ray.get(f))

    # Aggregate results
    all_enames, all_profiles = [], []
    for enames, profiles in results:
        all_enames.extend(enames)
        all_profiles.extend(profiles)

    profiles = all_profiles

    ray.shutdown()

else:

    #This will take about 17 minutes to run

    items_to_skip = [
        '"skipped', 
        'overvote', 
        'undervote']

    profiles = []
    enames = []
    for file in tqdm(glob.glob("real_elections/otis_2022_dataset/*.zip")):

        if not file.endswith(".csv") and not file.endswith(".zip"):
            continue
        # if file ends with .zip unzip the file and process it 
        if file.endswith(".zip"):
            with ZipFile(file, 'r') as zip_ref:
                # Iterate through each file inside the zip
                for name in zip_ref.namelist():
                    # Only process .csv files
                    if name.endswith(".csv"):
                        with zip_ref.open(name) as f:
                            # Read the CSV data into memory
                            csv_bytes = f.read()
                            # Decode bytes to string
                            csv_text = csv_bytes.decode('utf-8')
                            # Create a file-like StringIO object
                            csv_buffer = io.StringIO(csv_text)
                            
                            # Now pass this StringIO to ProfileWithTies.read
                            prof = ProfileWithTies.read(
                                csv_buffer,
                                file_format='csv',
                                csv_format='rank_columns',
                                items_to_skip=items_to_skip
                            )
                            enames.append(name)
                            profiles.append(prof)           

2025-04-02 22:30:40,556	INFO worker.py:1852 -- Started a local Ray instance.


  0%|          | 0/458 [00:00<?, ?it/s]

In [13]:
if parallel:

    ray.init(ignore_reinit_error=True)
    find_violations_parallel(profiles)
    ray.shutdown()
    
else:
    find_violations(profiles)

2025-04-02 22:33:19,598	INFO worker.py:1852 -- Started a local Ray instance.


  0%|          | 0/458 [00:00<?, ?it/s]

194 out of 448 profiles have no absolute majority winner: 0.4330357142857143

Preferential Equality Violations
33 violations out of 194 profiles: 0.17010309278350516

Type A1: 7
Type B1: 26
Type C2: 0

Preferential Compensation Violations
38 violations out of 194 profiles: 0.1958762886597938

Type A2: 6
Type B2: 16
Type C1: 1
Type C2: 15
