# Profiles and Elections

A profile is a collection of ballots. An election is the rule/algorithm that converts the collection of ballots into a set of elected candidates. Colloquially, these two are rarely distinguished.


Mathematically, however, they are two separate stages of a voting pipeline. We can take the same profile and run it through many different election rules to see how the rule changes the outcome. Similarly, we can feed many different profiles through the same election rule to see how properties of the profiles affect which candidates are elected.
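
To make the distinction concrete, here is a minimal sketch in plain Python that runs one toy profile through two different rules. The ballots, the tuple representation, and the helper names `plurality_winner` and `borda_winner` are illustrative assumptions and are not VoteKit's API.

```python
from collections import Counter

# A toy profile: each ballot is a full ranking, most preferred first.
# (Plain tuples for illustration only; this is not a VoteKit PreferenceProfile.)
profile = (
    [("A", "B", "C")] * 4
    + [("B", "C", "A")] * 3
    + [("C", "B", "A")] * 2
)


def plurality_winner(ballots):
    """Elect the candidate with the most first-place votes (ties go alphabetically)."""
    first_place = Counter(ballot[0] for ballot in ballots)
    return max(sorted(first_place), key=lambda c: first_place[c])


def borda_winner(ballots):
    """Elect the candidate with the highest Borda score:
    n-1 points for first place, down to 0 for last (ties go alphabetically)."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for position, cand in enumerate(ballot):
            scores[cand] += n - 1 - position
    return max(sorted(scores), key=lambda c: scores[c])


print(plurality_winner(profile))  # A: 4 first-place votes vs. 3 for B and 2 for C
print(borda_winner(profile))      # B: Borda scores are A=8, B=12, C=7
```

The profile is identical in both calls; only the rule changes, and so does the winner.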

In this session we will learn how to use some of the ballot generator models built into VoteKit to create profiles of ballots, and then we will feed them through some common election methods.

## Generating Profiles

As we saw earlier, there are many models for generating ballots available to us, and these models have different knobs we can turn to model different voter behaviors. Here is how we can implement some of the most common in VoteKit.

The slate-Plackett-Luce (sPL) model models "impulsive" voters. These are voters who rank their candidates only by considering how much they like each individual candidate. Later we will see a model for "deliberative" voters. Both use the same setup, so we will focus on sPL first.
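
As a rough picture of what "impulsive" means, the sketch below samples ballots from the classic Plackett-Luce model using only the standard library: the voter repeatedly picks the next candidate with probability proportional to how much they like it. This is the plain model rather than the slate variant VoteKit implements, and the support values and the helper name `sample_pl_ranking` are made up for illustration.

```python
import random


def sample_pl_ranking(support, rng):
    """Draw one full ranking: repeatedly pick a remaining candidate
    with probability proportional to its support."""
    remaining = dict(support)
    ranking = []
    while remaining:
        cands = list(remaining)
        weights = [remaining[c] for c in cands]
        pick = rng.choices(cands, weights=weights, k=1)[0]
        ranking.append(pick)
        del remaining[pick]
    return ranking


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical support values for a single voter type.
support = {"X1": 0.6, "X2": 0.3, "Y1": 0.08, "Y2": 0.02}

ballots = [sample_pl_ranking(support, rng) for _ in range(1000)]
share_x1_first = sum(b[0] == "X1" for b in ballots) / len(ballots)
print(f"X1 is ranked first on {share_x1_first:.0%} of ballots")
```

With many ballots, the share of first-place rankings for X1 should land near its support of 0.6.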

The first parameter we need to decide is the number and size of our voting blocs. This could be Democrat/Republican, White/POC, or some other breakdown of the voters. You aren't limited to just two blocs, but the number of parameters grows quickly as you add more, so we will stick with two for now.
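
To get a feel for how quickly the parameter count grows, here is a small back-of-the-envelope sketch. It assumes one slate of candidates per bloc and counts only the three dictionaries used in this section (bloc proportions, cohesion parameters, and alphas); the helper `count_parameters` is hypothetical.

```python
def count_parameters(num_blocs):
    """Count the numbers we must choose, assuming one slate per bloc."""
    bloc_proportions = num_blocs        # one share per bloc
    cohesion = num_blocs * num_blocs    # each bloc's support for each slate
    alphas = num_blocs * num_blocs      # one alpha per (bloc, slate) pair
    return bloc_proportions + cohesion + alphas


for k in (2, 3, 4):
    print(k, "blocs:", count_parameters(k), "parameters")
# 2 blocs: 10 parameters
# 3 blocs: 21 parameters
# 4 blocs: 36 parameters
```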

In [None]:
from votekit.ballot_generator import slate_PlackettLuce

# the share of voters in each bloc; the proportions sum to 1
bloc_voter_prop = {"X": 0.8, "Y": 0.2}



In [None]:
# the candidates that belong to each slate
slate_to_candidates = {"X": ["X1", "X2"],
                       "Y": ["Y1", "Y2"]}

In [None]:
# the values of 0.9 indicate that these blocs are highly polarized;
# they prefer their own candidates much more than the opposing slate
cohesion_parameters = {"X": {"X": 0.9, "Y": 0.1},
                       "Y": {"Y": 0.9, "X": 0.1}}



In [None]:
# one Dirichlet parameter (alpha) for each bloc's view of each slate
alphas = {"X": {"X": 2, "Y": 1},
          "Y": {"X": 1, "Y": 0.5}}

# the from_params method samples each bloc's preference intervals
# from Dirichlet distributions governed by these alphas
pl = slate_PlackettLuce.from_params(
    slate_to_candidates=slate_to_candidates,
    bloc_voter_prop=bloc_voter_prop,
    cohesion_parameters=cohesion_parameters,
    alphas=alphas,
)
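
To build some intuition for the alphas, the following standard-library sketch draws support shares from a symmetric Dirichlet distribution by normalizing Gamma draws. It assumes each alpha acts as a single concentration parameter shared by all candidates in a slate; it is not VoteKit's internal code, and `sample_dirichlet` is a hypothetical helper.

```python
import random


def sample_dirichlet(alpha, n, rng):
    """Sample n shares that sum to 1 from a symmetric Dirichlet(alpha)
    by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)  # fixed seed so the run is repeatable
candidates = ["X1", "X2"]

for alpha in (0.5, 1, 2):
    shares = sample_dirichlet(alpha, len(candidates), rng)
    interval = dict(zip(candidates, shares))
    print(alpha, {c: round(s, 3) for c, s in interval.items()})
```

Small alphas tend to concentrate support on one candidate, while large alphas tend to spread it evenly across the slate.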

## Loading Profiles 

VoteKit allows you to load profiles of ballots from other sources. You can:
- load Scottish election csv files from https://github.com/mggg/scot-elex
- load csv files where each row represents a ballot and each column of the csv represents a ranking position
- load PreferenceProfiles made with VoteKit

In all cases, the file you want to load needs to be in the working directory of your Python file/notebook.
Here are a few examples!

### Scottish

Go to https://github.com/mggg/scot-elex and download a Scottish election of your choice! Put the csv file in your working directory, and change the name of the file below to match the name of your file.

In [1]:
from votekit.cvr_loaders import load_scottish

scottish_profile, num_seats, cand_list, cand_to_party, ward = load_scottish("west_dunbartonshire_2017_ward2.csv")

print(f"Election in ward {ward}")
print(f"Number of seats for election: {num_seats}")
print()
print("With the following candidates:")
for cand in cand_list:
    print(cand)


print("\nThe candidates have the following party IDs:")
for cand, party in cand_to_party.items():
    print(f"{cand} is identified with {party}.")

print("\nThe first 10 ballots of the profile are:")
print(scottish_profile.df.head(10).to_string())


Election in ward Leven
Number of seats for election: 4

With the following candidates:
Jim Bollan
Ian Dickson
George Drummond
Caroline Mcallister
Michelle Marie Mcginty
John Kelly Millar
Peter Parlane
Sean Quinn

The candidates have the following party IDs:
Jim Bollan is identified with West Dunbartonshire Community (WDuns).
Ian Dickson is identified with Scottish National Party (SNP).
George Drummond is identified with Liberal Democrat (LD).
Caroline Mcallister is identified with Scottish National Party (SNP).
Michelle Marie Mcginty is identified with Labour (Lab).
John Kelly Millar is identified with Labour (Lab).
Peter Parlane is identified with Conservative and Unionist Party (Con).
Sean Quinn is identified with Green (Gr).

The first 10 ballots of the profile are:
                 Ranking_1      Ranking_2              Ranking_3                 Ranking_4                 Ranking_5            Ranking_6          Ranking_7          Ranking_8 Voter Set Weight
Ballot Index               

### Load csv with ballot rows and ranking columns 
 
Go to https://github.com/mggg/VoteKit/tree/main/examples/data and download the cast vote record for the 2013 Minneapolis, Minnesota mayoral election. Put it in your working directory. Open the file and notice that each row is a ballot and each column represents one position in the ranking. Voters were allowed to rank up to three candidates.

In [2]:
from votekit.cvr_loaders import load_csv

mn_profile = load_csv("mn_2013_cast_vote_record.csv")

print(f"The maximum number of candidates you could rank was {mn_profile.max_ranking_length}.\n")

print("The candidates for the Minnesota 2013 mayoral race were:\n")

for cand in mn_profile.candidates:
    print(cand)

The maximum number of candidates you could rank was 3.

The candidates for the Minnesota 2013 mayoral race were:

ABDUL M RAHAMAN "THE ROCK"
DAN COHEN
JAMES EVERETT
MARK V ANDERSON
TROY BENJEGERDES
undervote
ALICIA K. BENNETT
BETSY HODGES
MARK ANDREW
MIKE GOULD
BILL KAHN
BOB FINE
CAM WINTON
DON SAMUELS
JACKIE CHERRYHOMES
JEFFREY ALAN WAGNER
JOHN LESLIE HARTWIG
KURTIS W. HANNA
JOSHUA REA
MERRILL ANDERSON
NEAL BAXTER
STEPHANIE WOODRUFF
UWI
BOB "AGAIN" CARNEY JR
TONY LANE
CAPTAIN JACK SPARROW
GREGG A. IVERSON
JAMES "JIMMY" L. STROUD, JR.
JAYMIE KELLY
CYD GORMAN
EDMUND BERNARD BRUYERE
DOUG MANN
CHRISTOPHER ROBIN ZIMMERMAN
RAHN V. WORKCUFF
JOHN CHARLES WILSON
OLE SAVIOR
overvote
CHRISTOPHER CLARK
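
The row-per-ballot layout described above can also be read with the standard library alone. The sketch below uses a made-up three-ballot file held in memory, with placeholder candidate names, and collapses identical rankings into weighted ballots. It only illustrates the file layout and says nothing about how `load_csv` works internally.

```python
import csv
import io
from collections import Counter

# Hypothetical miniature cast vote record: one ballot per row,
# one ranking position per column.
raw = io.StringIO(
    "A,B,C\n"
    "B,A,undervote\n"
    "A,B,C\n"
)

# Identical rankings are collapsed into one entry with a weight.
ballot_counts = Counter(tuple(row) for row in csv.reader(raw))

for ranking, weight in ballot_counts.items():
    print(weight, ranking)
# 2 ('A', 'B', 'C')
# 1 ('B', 'A', 'undervote')
```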


Let's take a moment to make a quick edit to this profile, and then save the result to a "pickle" file, which is a way of storing Python data. This is super useful, because raw cast vote records often need processing to be run through election methods, and the "pickle" format allows us to do our processing once, and then use the saved profile from then on.

Notice that the candidate list above contains three strange names: "overvote", "undervote", and "UWI". In this cast vote record, a ranking position in which the voter marked more than one candidate was recorded as "overvote", a skipped position as "undervote", and an unregistered write-in candidate as "UWI". Before an IRV election can be run on this profile, those three "candidates" need to be scrubbed from the ballots. VoteKit includes some cleaning functions that make this easy.


In [3]:
from votekit.cleaning import remove_and_condense

# remove deletes the listed "candidates" from each ballot
# condense moves up any lower ranked candidates
cleaned_mn_profile = remove_and_condense(["overvote", "undervote", "UWI"], mn_profile)

print(f"The difference between the raw profile's candidate list and the cleaned profile's list is {set(mn_profile.candidates)-set(cleaned_mn_profile.candidates)}.")

The difference between the raw profile's candidate list and the cleaned profile's list is {'undervote', 'UWI', 'overvote'}.
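
The idea behind removing and condensing can be pictured on plain tuples. This is a simplified sketch with made-up ballots and a hypothetical helper `scrub_ranking`; dropping ballots that end up empty is an assumption of the sketch, not a statement about VoteKit's behavior.

```python
def scrub_ranking(ranking, to_remove):
    """Drop unwanted entries; lower-ranked candidates move up to fill the gaps."""
    return tuple(cand for cand in ranking if cand not in to_remove)


junk = {"overvote", "undervote", "UWI"}

# Hypothetical ballots with up to three ranking positions.
ballots = [
    ("undervote", "A", "B"),
    ("A", "overvote", "UWI"),
    ("undervote", "undervote", "undervote"),
]

cleaned = [scrub_ranking(b, junk) for b in ballots]
# In this sketch, ballots with nothing left after scrubbing are dropped.
nonempty = [b for b in cleaned if b]

print(cleaned)   # [('A', 'B'), ('A',), ()]
print(nonempty)  # [('A', 'B'), ('A',)]
```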


And now we can save the processed profile. This allows us to load it back in later.

In [5]:
from votekit.pref_profile import PreferenceProfile

cleaned_mn_profile.to_pickle("cleaned_mn_profile.pkl")

loaded_mn_profile = PreferenceProfile.from_pickle("cleaned_mn_profile.pkl")

print(f"\nDoes the loaded profile match the cleaned profile: {loaded_mn_profile == cleaned_mn_profile}")
print(f"\nDoes the loaded profile match the uncleaned profile: {loaded_mn_profile == mn_profile}")



Does the loaded profile match the cleaned profile: True

Does the loaded profile match the uncleaned profile: False
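
For readers new to pickling, the round trip above mirrors what Python's built-in `pickle` module does for any object. Here is a minimal sketch that uses a plain dictionary as a stand-in for a processed profile; the data and file name are illustrative.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a processed profile: ranking -> weight (illustrative only).
processed = {("A", "B"): 2.0, ("B", "A"): 1.0}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "processed_profile.pkl"
    # Serialize the object to disk...
    with path.open("wb") as f:
        pickle.dump(processed, f)
    # ...and read it back as an equal Python object.
    with path.open("rb") as f:
        restored = pickle.load(f)

print(restored == processed)  # True
```

Because unpickling can execute arbitrary code, only load pickle files that come from a source you trust.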


## Running Elections
