Merge pull request #121 from cdonnay/v2.0.0
V2.0.0
cdonnay committed Mar 1, 2024
2 parents 0606b97 + c393d45 commit a1c2526
Showing 31 changed files with 2,403 additions and 958 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ dist/
.ipynb_checkpoints
.idea
extra_data/
.venv
33 changes: 13 additions & 20 deletions docs/SCR_ballot_generators.md
@@ -16,30 +16,23 @@ The Impartial Anonymous Culture model has $\alpha = 1$. This means that the poin

## Candidate Simplex Models

### Plackett-Luce
### Name-Plackett-Luce
The name-Plackett-Luce model (n-PL) samples ranked ballots as follows. Assume there are $n$ blocs of voters. Within a bloc, say bloc $A$, voters have $n$ preference intervals, one for each slate of candidates. A bloc also has a fixed $n$-tuple of cohesion parameters $\pi_A = (\pi_{AA}, \pi_{AB},\dots)$; we require that $\sum_B \pi_{AB}=1$. To generate a ballot for a voter in bloc $A$, each preference interval $I_B$ is rescaled by the corresponding cohesion parameter $\pi_{AB}$, and then concatenated to create one preference interval.
Voters then sample without replacement from the combined preference interval.
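
The n-PL steps above can be sketched in plain Python. This is a minimal illustration, not VoteKit's implementation: the helper names `combine_intervals` and `sample_name_pl`, the dictionary layout, and the example numbers are all assumptions made for this sketch.

```python
import random

def combine_intervals(intervals, cohesion):
    """Rescale each slate's interval by its cohesion parameter and concatenate.

    `intervals` maps slate -> {candidate: share of that slate's interval};
    `cohesion` maps slate -> pi_{AB}. Both layouts are hypothetical.
    """
    combined = {}
    for slate, interval in intervals.items():
        for cand, length in interval.items():
            combined[cand] = cohesion[slate] * length
    return combined

def sample_name_pl(combined, rng):
    """Draw candidates without replacement, proportional to remaining length."""
    remaining = {c: w for c, w in combined.items() if w > 0}
    ranking = []
    while remaining:
        cands = list(remaining)
        # rng.choices normalizes the weights, which plays the role of
        # rescaling the shrunken interval back to length 1.
        pick = rng.choices(cands, weights=[remaining[c] for c in cands], k=1)[0]
        ranking.append(pick)
        del remaining[pick]
    return ranking

# Bloc A voters: 75% of the combined interval goes to A candidates.
intervals_A = {"A": {"A1": 0.6, "A2": 0.4}, "B": {"B1": 0.5, "B2": 0.5}}
cohesion_A = {"A": 0.75, "B": 0.25}
combined_A = combine_intervals(intervals_A, cohesion_A)
ballot = sample_name_pl(combined_A, random.Random(0))
```

Candidates with zero support are simply dropped in this sketch; how they are placed on a real ballot is left to the library.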

The Plackett-Luce model (PL) samples ranked ballots as follows. Given a bloc's preference interval, it samples candidates without replacement from the interval. That means when a candidate is selected, their portion of the interval is removed, and the interval is normalized to be length 1 again.
### Name-Bradley-Terry
The name-Bradley-Terry model (n-BT) samples ranked ballots as follows. Assume there are $n$ blocs of voters. Within a bloc, say bloc $A$, voters have $n$ preference intervals, one for each slate of candidates. A bloc also has a fixed $n$-tuple of cohesion parameters $\pi_A = (\pi_{AA}, \pi_{AB},\dots)$; we require that $\sum_B \pi_{AB}=1$. To generate a ballot for a voter in bloc $A$, each preference interval $I_B$ is rescaled by the corresponding cohesion parameter $\pi_{AB}$, and then concatenated to create one preference interval.
Voters then sample ballots proportional to pairwise probabilities of candidates. That is, the probability that the ballot $C_1>C_2>C_3$ is sampled is proportional to $P(C_1>C_2)P(C_2>C_3)P(C_1>C_3)$, where these pairwise probabilities are given by $P(C_1>C_2) = C_1/(C_1+C_2)$.
Here $C_i$ denotes the length of $C_i$'s share of the combined preference interval.
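
As a rough illustration of the n-BT weighting, the sketch below enumerates every full ranking, multiplies the pairwise probabilities, and normalizes. The function name `bt_ballot_probabilities` and the support values are assumptions for this example, and brute-force enumeration is only practical for a handful of candidates.

```python
from itertools import combinations, permutations

def bt_ballot_probabilities(support):
    """Map each full ranking to its normalized Bradley-Terry probability.

    `support` maps candidate -> length of its share of the combined interval.
    """
    weights = {}
    for ranking in permutations(support):
        w = 1.0
        # combinations() preserves order, so `hi` is always ranked above `lo`.
        for hi, lo in combinations(ranking, 2):
            w *= support[hi] / (support[hi] + support[lo])
        weights[ranking] = w
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}

probs = bt_ballot_probabilities({"C1": 0.5, "C2": 0.3, "C3": 0.2})
most_likely = max(probs, key=probs.get)
```

With only two candidates the normalizing constant is 1, so the ballot probability reduces to the single pairwise probability.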

- The PL model generates full ballots, with the caveat that any candidates with 0 support are listed as ties at the end of the ballot.
### Name-Cumulative
The name-Cumulative model (n-C) samples ranked ballots as follows. Assume there are $n$ blocs of voters. Within a bloc, say bloc $A$, voters have $n$ preference intervals, one for each slate of candidates. A bloc also has a fixed $n$-tuple of cohesion parameters $\pi_A = (\pi_{AA}, \pi_{AB},\dots)$; we require that $\sum_B \pi_{AB}=1$. To generate a ballot for a voter in bloc $A$, each preference interval $I_B$ is rescaled by the corresponding cohesion parameter $\pi_{AB}$, and then concatenated to create one preference interval. To generate a ballot, voters sample with replacement from the combined interval as many times as determined by the length of the desired ballot.
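
A minimal sketch of the sampling-with-replacement step, assuming the combined interval has already been built. The name `sample_cumulative` and the representation of a ballot as a vote count per candidate are choices made for this example only.

```python
import random
from collections import Counter

def sample_cumulative(combined, ballot_length, rng):
    """Sample `ballot_length` votes with replacement from the combined interval."""
    cands = list(combined)
    picks = rng.choices(cands, weights=[combined[c] for c in cands], k=ballot_length)
    # A candidate may be picked more than once, so count the votes.
    return Counter(picks)

combined = {"A1": 0.45, "A2": 0.30, "B1": 0.25, "B2": 0.0}
votes = sample_cumulative(combined, 3, random.Random(0))
```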

- It can be initialized directly from a set of preference intervals (one for each bloc), or by using [from_params](api.md#ballot-generators). This method uses cohesion and Dirichlet parameters.
### Slate-Plackett-Luce
The slate-Plackett-Luce model (s-PL) samples ranked ballots as follows. Assume there are $n$ blocs of voters. Within a bloc, say bloc $A$, voters have $n$ preference intervals, one for each slate of candidates. A bloc also has a fixed $n$-tuple of cohesion parameters $\pi_A = (\pi_{AA}, \pi_{AB},\dots)$; we require that $\sum_B \pi_{AB}=1$. Now the cohesion parameters play a different role than in the name models above. For s-PL, $\pi_{AB}$ gives the probability that we put a $B$ candidate in each position on the ballot. If we have already exhausted the number of $B$ candidates, we remove $\pi_{AB}$ and renormalize. Once we have a ranking of the slates on the ballot, we fill in candidate ordering by sampling without replacement from each individual preference interval (we do not concatenate them!).
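
The two-stage s-PL procedure can be sketched as follows. The function names, the dictionary layout, and the assumption that every cohesion parameter and every candidate's support is strictly positive are all specific to this illustration.

```python
import random

def sample_slate_order(cohesion, slate_sizes, rng):
    """Stage 1: choose a slate label for each ballot position."""
    remaining = dict(slate_sizes)
    order = []
    while any(remaining.values()):
        # Exhausted slates are dropped; rng.choices renormalizes the rest.
        open_slates = [s for s, n in remaining.items() if n > 0]
        slate = rng.choices(
            open_slates, weights=[cohesion[s] for s in open_slates], k=1
        )[0]
        order.append(slate)
        remaining[slate] -= 1
    return order

def fill_candidates(slate_order, intervals, rng):
    """Stage 2: fill each slot from that slate's own interval, without replacement."""
    pools = {s: dict(iv) for s, iv in intervals.items()}
    ranking = []
    for slate in slate_order:
        pool = pools[slate]
        cands = list(pool)
        pick = rng.choices(cands, weights=[pool[c] for c in cands], k=1)[0]
        ranking.append(pick)
        del pool[pick]
    return ranking

rng = random.Random(1)
cohesion_A = {"A": 0.8, "B": 0.2}
intervals = {"A": {"A1": 0.7, "A2": 0.3}, "B": {"B1": 0.5, "B2": 0.5}}
sizes = {s: len(iv) for s, iv in intervals.items()}
slate_order = sample_slate_order(cohesion_A, sizes, rng)
ballot = fill_candidates(slate_order, intervals, rng)
```

Note that the per-slate intervals are never concatenated here: cohesion only affects which slate occupies each position.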

- The PL model can handle arbitrarily many blocs.

- The PL model also requires information about what proportion of voters belong to each bloc.

### Bradley-Terry

The Bradley-Terry model (BT) samples ranked ballots as follows. Given a preference interval, the probability of sampling the ballot $A>B>C$ is equal to the product of the probabilities $P(A>B)P(B>C)P(A>C)$. One of these probabilities can be computed as $P(A>B) = A/(A+B)$, where we let $A$ denote both the candidate and the length of its interval.


- The BT model generates full ballots, with the caveat that any candidates with 0 support are listed as ties at the end of the ballot.

- It can be initialized directly from a set of preference intervals (one for each bloc), or by using [from_params](api.md#ballot-generators). This method uses cohesion and Dirichlet parameters.

- The BT model can handle arbitrarily many blocs.

- The BT model also requires information about what proportion of voters belong to each bloc.
### Slate-Bradley-Terry
The slate-Bradley-Terry model (s-BT) samples ranked ballots as follows. We assume there are 2 blocs of voters. Within a bloc, say bloc $A$, voters have 2 preference intervals, one for each slate of candidates. A bloc also has a fixed tuple of cohesion parameters $\pi_A = (\pi_A, 1-\pi_A)$. Now the cohesion parameters play a different role than in the name models above. For s-BT, we again start by filling out a ballot with bloc labels only. Now, the probability that we sample the ballot $A>A>B$ is proportional to $\pi_A^2$; just like name-Bradley-Terry, we are computing pairwise comparisons. In $A>A>B$, slate $A$ must beat slate $B$ twice. As another example, the probability of $A>B>A$ is proportional to $\pi_A(1-\pi_A)$. Once we have a ranking of the slates on the ballot, we fill in candidate ordering by sampling without replacement from each individual preference interval (we do not concatenate them!).
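
To make the slate-level weighting concrete, the sketch below enumerates all slate orderings for a two-bloc example and weights each by its cross-slate pairwise comparisons. The function name `slate_bt_probabilities` is hypothetical, and exhaustive enumeration is an assumption of this example that only suits small ballots.

```python
from itertools import combinations, permutations

def slate_bt_probabilities(pi_A, n_A, n_B):
    """Normalized probability of each slate ordering for bloc A voters."""
    labels = "A" * n_A + "B" * n_B
    weights = {}
    for order in set(permutations(labels)):
        w = 1.0
        # `hi` is ranked above `lo`; same-slate pairs contribute nothing.
        for hi, lo in combinations(order, 2):
            if hi == "A" and lo == "B":
                w *= pi_A
            elif hi == "B" and lo == "A":
                w *= 1 - pi_A
        weights["".join(order)] = w
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}

probs = slate_bt_probabilities(0.8, n_A=2, n_B=1)
```

For this input the unnormalized weights are $\pi_A^2$ for `AAB`, $\pi_A(1-\pi_A)$ for `ABA`, and $(1-\pi_A)^2$ for `BAA`, matching the two examples in the text.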

### Alternating-Crossover

5 changes: 2 additions & 3 deletions docs/SCR_simplex.md
@@ -44,12 +44,11 @@ The value $\alpha$ is never allowed to be 0 or $\infty$, so VoteKit uses an arbi

### Cohesion Parameters

When there are multiple blocs, or types, of voters, we utilize cohesion parameters to measure how much voters prefer candidates from their own bloc versus the opposing blocs. Suppose there are two blocs of voters, $X,Y$. We assume that voters from the $X$ bloc have some underlying [preference interval](SCR_preference_intervals.md) $I_{XX}$ for candidates within their bloc, and a different underlying preference interval $I_{XY}$ for the candidates in the opposing bloc . We then assume that voters in $X$ prefer $X$ candidates with proportion $\pi_X$.

In order to construct one preference interval for $X$ voters, we take $I_{XX}$ and scale it by $\pi_X$, then we take $I_{XY}$ and scale it by $1-\pi_X$, and finally we concatenate the two. As a concrete example, if $\pi_X = .75$, this means that 3/4 of the preference interval for $X$ voters is taken up by candidates from the $X$ bloc, and the other 1/4 by $Y$ candidates.
When there are multiple blocs, or types, of voters, we utilize cohesion parameters to measure how much voters prefer candidates from their own bloc versus the opposing blocs. In our name models, like `name_PlackettLuce` or `name_BradleyTerry`, the cohesion parameters operate as follows. Suppose there are two blocs of voters, $X,Y$. We assume that voters from the $X$ bloc have some underlying [preference interval](SCR_preference_intervals.md) $I_{XX}$ for candidates within their bloc, and a different underlying preference interval $I_{XY}$ for the candidates in the opposing bloc. In order to construct one preference interval for $X$ voters, we take $I_{XX}$ and scale it by $\pi_X$, then we take $I_{XY}$ and scale it by $1-\pi_X$, and finally we concatenate the two. As a concrete example, if $\pi_X = .75$, this means that 3/4 of the preference interval for $X$ voters is taken up by candidates from the $X$ bloc, and the other 1/4 by $Y$ candidates.

![](assets/cohesion_parameters.png)

In our slate models, like `slate_PlackettLuce`, the cohesion parameter is used to determine the probability of sampling a particular slate at each position in the ballot. How exactly this is done depends on the model. Then candidate names are filled in afterwards by sampling without replacement from each preference interval.
### Combining Dirichlet and Cohesion

When there are multiple blocs of voters, we need more than one $\alpha$ value for the Dirichlet distribution. Suppose there are two blocs of voters, $X,Y$. Then we need four values, $\alpha_{XX}, \alpha_{XY}, \alpha_{YX}, \alpha_{YY}$. The value $\alpha_{XX}$ determines what kind of preferences $X$ voters will have for $X$ candidates. The value $\alpha_{XY}$ determines what kind of preferences $X$ voters have for $Y$ candidates. We sample preference intervals from the candidate simplex using these $\alpha$ values, and then use cohesion parameters to combine them into a single interval, one for each bloc. This is how [from_params](api.md#ballot-generators) initializes different ballot generator models.
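
One way to picture this combination is the sketch below, which is an illustration under stated assumptions rather than what `from_params` actually does. It draws from a symmetric Dirichlet by normalizing gamma variates (a standard construction), and the names `sample_interval` and `bloc_interval` are invented for the example.

```python
import random

def sample_interval(alpha, candidates, rng):
    """Draw a preference interval from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in candidates]
    total = sum(draws)
    return {c: d / total for c, d in zip(candidates, draws)}

def bloc_interval(alpha_own, alpha_other, own, other, pi, rng):
    """Scale the two sampled intervals by pi and 1 - pi, then concatenate."""
    interval = {
        c: pi * v for c, v in sample_interval(alpha_own, own, rng).items()
    }
    interval.update(
        {c: (1 - pi) * v for c, v in sample_interval(alpha_other, other, rng).items()}
    )
    return interval

rng = random.Random(0)
# alpha_XX = 2.0 and alpha_XY = 0.5 are arbitrary example values.
interval_X = bloc_interval(2.0, 0.5, ["X1", "X2"], ["Y1", "Y2"], 0.75, rng)
```

Repeating the call with $\alpha_{YY}$, $\alpha_{YX}$, and $\pi_Y$ would give the corresponding interval for $Y$ voters.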
11 changes: 9 additions & 2 deletions docs/api.md
@@ -10,6 +10,10 @@ hide:
rendering:
heading_level: 4

### ::: votekit.pref_interval
rendering:
heading_level: 4

### ::: votekit.pref_profile
rendering:
heading_level: 4
@@ -39,13 +43,16 @@ hide:
members:
- BallotGenerator
- BallotSimplex
- PlackettLuce
- BradleyTerry
- slate_PlackettLuce
- name_PlackettLuce
- slate_BradleyTerry
- name_BradleyTerry
- AlternatingCrossover
- CambridgeSampler
- OneDimSpatial
- ImpartialCulture
- ImpartialAnonymousCulture
- name_Cumulative

## Elections
### ::: votekit.elections.election_types
9 changes: 7 additions & 2 deletions src/votekit/__init__.py
@@ -1,14 +1,19 @@
from .ballot_generator import ( # noqa
PlackettLuce,
BradleyTerry,
name_PlackettLuce,
name_BradleyTerry,
BallotSimplex,
ImpartialCulture,
ImpartialAnonymousCulture,
CambridgeSampler,
AlternatingCrossover,
name_Cumulative,
slate_BradleyTerry,
slate_PlackettLuce,
)
from .pref_interval import PreferenceInterval
from .ballot import Ballot # noqa
from .pref_profile import PreferenceProfile # noqa
from .pref_interval import PreferenceInterval # noqa
from .cleaning import ( # noqa
remove_empty_ballots,
deduplicate_profiles,
9 changes: 5 additions & 4 deletions src/votekit/ballot.py
@@ -12,10 +12,11 @@ class Ballot:
**Attributes**
`ranking`
: list of candidate ranking. Entry i of the list is a set of candidates ranked in position i.
: tuple of candidate ranking. Entry $i$ of the tuple is a frozenset of candidates ranked
in position $i$.
`weight`
: weight assigned to a given ballot. Defaults to 1.
: (Fraction) weight assigned to a given ballot. Defaults to 1.
`voter_set`
: optional set of voters who cast a given ballot.
@@ -24,7 +25,7 @@ class Ballot:
: optional ballot id.
"""

ranking: list[set] = field(default_factory=list)
ranking: tuple[frozenset, ...] = field(default_factory=tuple)
weight: Fraction = Fraction(1, 1)
voter_set: Optional[set[str]] = None
id: Optional[str] = None
@@ -62,7 +63,7 @@ def __eq__(self, other):
return True

def __hash__(self):
return hash(str(self.ranking))
return hash(self.ranking)

def __str__(self):
weight_str = f"Weight: {self.weight}\n"
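
The move from `list[set]` to `tuple[frozenset, ...]` is what makes `hash(self.ranking)` possible. The standalone sketch below shows the difference using plain built-ins rather than the `Ballot` class itself; the variable names are illustrative only.

```python
# A ranking in the old style (mutable) and the new style (immutable).
old_style = [{"A"}, {"B", "C"}]
new_style = tuple(frozenset(s) for s in old_style)

# Lists (and sets) are unhashable, so the old ranking cannot be hashed directly.
try:
    hash(old_style)
    list_hashable = True
except TypeError:
    list_hashable = False

# Equal frozensets hash equally regardless of how their elements were listed,
# so two equal rankings always produce the same hash.
same_hash = hash(new_style) == hash((frozenset({"A"}), frozenset({"C", "B"})))
```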