# Generating `PreferenceProfiles`
We have already seen one `PreferenceProfile` generator, the Impartial Culture model, in the Plotting and Ballot Graph tutorials. Now let's go through all of the generators included in `votekit`:
- Impartial Culture
- Impartial Anonymous Culture
- Plackett-Luce
- Bradley-Terry
- Alternating Crossover
- 1-D Spatial (`OneDimSpatial`)
- Cambridge Sampler

In [7]:
import votekit.ballot_generator as bg
from votekit.plots.profile_plots import plot_summary_stats

The three simplest generators to use are the Impartial Culture, Impartial Anonymous Culture, and 1-D Spatial models. For $m$ candidates and $n$ voters, the Impartial Culture model draws a `PreferenceProfile` uniformly at random from the $(m!)^n$ possible profiles. Remember, a `PreferenceProfile` can be thought of as a tuple of length $n$ that stores a linear ranking in each slot, one per voter.

The Impartial Anonymous Culture (IAC) model does the same thing, but treats profiles that are the same up to a permutation of the voters as identical. That is, the profile $(A>B, B>A)$ is now identical to $(B>A, A>B)$. In effect, the IAC model treats profiles as multisets of ballots rather than as ordered tuples.
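
To make the difference concrete, here is a small counting sketch that does not use `votekit`. The helper names `count_ic_profiles` and `count_iac_profiles` are hypothetical and exist only for this illustration. It counts ordered profiles as $(m!)^n$ and anonymous profiles as multisets of size $n$ drawn from the $m!$ rankings.

```python
from math import comb, factorial


def count_ic_profiles(m: int, n: int) -> int:
    """Ordered profiles: each of the n voters holds one of the m! rankings."""
    return factorial(m) ** n


def count_iac_profiles(m: int, n: int) -> int:
    """Anonymous profiles: multisets of size n over the m! rankings (stars and bars)."""
    return comb(factorial(m) + n - 1, n)


# Two candidates, two voters: the rankings are A>B and B>A.
print(count_ic_profiles(2, 2))   # 4 ordered profiles
print(count_iac_profiles(2, 2))  # 3 anonymous profiles
```

With two candidates and two voters, the four ordered profiles collapse to three anonymous ones because $(A>B, B>A)$ and $(B>A, A>B)$ are counted once.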

In [18]:
candidates = ["A", "B", "C"]
number_of_ballots = 50
# Impartial Culture
ic = bg.ImpartialCulture(candidates=candidates)
ic_profile = ic.generate_profile(number_of_ballots)

# Impartial Anonymous Culture
iac = bg.ImpartialAnonymousCulture(candidates=candidates)
iac_profile = iac.generate_profile(number_of_ballots)


The 1-D Spatial model assigns each candidate a random point on the real line, drawn from the standard normal distribution. It then does the same for each voter, and each voter ranks the candidates by their distance from the voter's own position, nearest first.
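
The description above can be sketched in plain Python. This is an illustration under stated assumptions, not the `votekit` implementation: the function name `sketch_one_dim_profile` and its `seed` parameter are hypothetical, and ballots are returned as plain tuples rather than as a `PreferenceProfile`.

```python
import random


def sketch_one_dim_profile(candidates, number_of_ballots, seed=None):
    """Return a list of rankings (tuples) from a 1-D spatial sketch."""
    rng = random.Random(seed)
    # One position per candidate, drawn from the standard normal distribution.
    candidate_position = {c: rng.gauss(0, 1) for c in candidates}

    ballots = []
    for _ in range(number_of_ballots):
        # Each voter gets an independent standard normal position.
        voter_position = rng.gauss(0, 1)
        # Rank candidates from nearest to farthest from the voter.
        ranking = sorted(
            candidates,
            key=lambda c: abs(candidate_position[c] - voter_position),
        )
        ballots.append(tuple(ranking))
    return ballots


sketch_ballots = sketch_one_dim_profile(["A", "B", "C"], 50, seed=0)
print(sketch_ballots[:3])
```

Because the candidate positions are fixed before any voter is drawn, every ballot in one profile is measured against the same points on the line.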

In [17]:
one_d = bg.OneDimSpatial(candidates=candidates)
one_d_profile = one_d.generate_profile(number_of_ballots)

To use the other models, we need a bit more information than just the candidates. Suppose there are two blocs (or groups) of voters, $Q$ and $R$. The $Q$ bloc is estimated to be about 70% of the voting population, while the $R$ bloc is about 30%. Within each bloc there is a preference for different candidates, which we record in the variable `pref_interval_by_bloc`.

In this example, suppose each bloc has two candidates running, but there is some crossover: some voters from bloc $Q$ actually prefer the candidates from bloc $R$. The $R$ bloc, being much more insular, gives no support to either of $Q$'s candidates.

In [19]:
candidates = ["Q1", "Q2", "R1", "R2"]

# the proportion of the voting population in each bloc
bloc_voter_prop = {"Q": 0.7, "R": 0.3}

# within each bloc, how much support each candidate receives
pref_interval_by_bloc = {
    "Q": {"Q1": 0.4, "Q2": 0.3, "R1": 0.2, "R2": 0.1},
    "R": {"Q1": 0, "Q2": 0, "R1": 0.4, "R2": 0.6}
}

For both the Plackett-Luce and Bradley-Terry models, this is all the information needed to generate profiles.

For each voter, the Plackett-Luce model samples from the list of candidates without replacement according to the distribution defined by that voter's bloc in `pref_interval_by_bloc`.
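
The sampling step for a single voter can be sketched with the standard library alone. This is a minimal illustration, not the `votekit` implementation: the function name `sketch_plackett_luce_ballot` is hypothetical, and the choice to drop zero-weight candidates (so that a ballot may be shorter than the candidate list) is an assumption made only for this sketch.

```python
import random


def sketch_plackett_luce_ballot(pref_interval, rng):
    """Draw one ranking by repeated weighted sampling without replacement."""
    # Assumption of this sketch: candidates with zero weight are never ranked.
    remaining = {c: w for c, w in pref_interval.items() if w > 0}
    ranking = []
    while remaining:
        names = list(remaining)
        weights = [remaining[c] for c in names]
        # Pick the next candidate in proportion to the remaining weights.
        choice = rng.choices(names, weights=weights, k=1)[0]
        ranking.append(choice)
        del remaining[choice]
    return tuple(ranking)


rng = random.Random(0)
q_interval = {"Q1": 0.4, "Q2": 0.3, "R1": 0.2, "R2": 0.1}
r_interval = {"Q1": 0, "Q2": 0, "R1": 0.4, "R2": 0.6}
print(sketch_plackett_luce_ballot(q_interval, rng))
print(sketch_plackett_luce_ballot(r_interval, rng))
```

In this sketch a voter from bloc $R$ can only ever rank R1 and R2, since Q1 and Q2 carry zero weight in that bloc's interval.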

# HOW DO THEY WORK

In [20]:
# Plackett-Luce
pl = bg.PlackettLuce(pref_interval_by_bloc=pref_interval_by_bloc,
                     bloc_voter_prop=bloc_voter_prop, 
                     candidates=candidates)

pl_profile = pl.generate_profile(number_of_ballots)

# Bradley-Terry
bt = bg.BradleyTerry(pref_interval_by_bloc=pref_interval_by_bloc,
                     bloc_voter_prop=bloc_voter_prop, 
                     candidates=candidates)

bt_profile = bt.generate_profile(number_of_ballots)

print(pl_profile)

ValueError: Fewer non-zero entries in p than size
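
The error text matches the message NumPy raises when `choice` is asked to sample without replacement more items than there are non-zero entries in `p`. A plausible reading, not confirmed against the `votekit` source, is that the generator tries to draw a full ranking of all four candidates for bloc $R$, which gives non-zero weight to only two of them. The check below is a sketch under that assumption; the helper name `blocs_with_too_few_supported_candidates` is hypothetical and is not part of `votekit`.

```python
def blocs_with_too_few_supported_candidates(pref_interval_by_bloc, ballot_length):
    """Map each bloc with fewer than ballot_length non-zero weights to that count."""
    short = {}
    for bloc, interval in pref_interval_by_bloc.items():
        # Count the candidates this bloc gives any support to.
        supported = sum(1 for weight in interval.values() if weight > 0)
        if supported < ballot_length:
            short[bloc] = supported
    return short


pref_interval_by_bloc = {
    "Q": {"Q1": 0.4, "Q2": 0.3, "R1": 0.2, "R2": 0.1},
    "R": {"Q1": 0, "Q2": 0, "R1": 0.4, "R2": 0.6},
}

print(blocs_with_too_few_supported_candidates(pref_interval_by_bloc, ballot_length=4))
# {'R': 2}
```

If that reading is right, bloc $R$ is the one that triggers the error, because only R1 and R2 have non-zero weight there.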

Alternating Crossover
Cambridge Sampler