<a href="https://colab.research.google.com/github/mggg/Training_Materials/blob/main/notebooks/technical/Tech_3_batching/Single_runs/simple_run_election.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install votekit

# Running an Election in VoteKit

As it turns out, once you know how to generate a `PreferenceProfile` object, running an election
is pretty straightforward. Calling the constructor for an `Election` object
runs the entire election, so you can access the results
immediately.

In [4]:
from gerrychain import Graph
import jsonlines as jl
import votekit.elections as elec
from votekit import PreferenceProfile
import votekit.ballot_generator as bg

In [21]:
bloc_voter_prop = {
    "W": 0.2,
    "C": 0.8
}
cohesion_parameters = {
    "W": {
        "W": 0.7,
        "C": 0.3
    },
    "C": {
        "W": 0.7,
        "C": 0.3
    }
}
alphas = {
    "W": {
        "W": 1,
        "C": 1
    },
    "C": {
        "W": 1,
        "C": 1
    }
}
slate_to_candidates = {
    "W": [
        "Whiteman",
        "Whisky",
        "Whence"
    ],
    "C": [
        "Candy",
        "Claire"
    ]
}

In [22]:
profile = bg.slate_PlackettLuce.from_params(
    bloc_voter_prop=bloc_voter_prop,
    cohesion_parameters=cohesion_parameters,
    alphas=alphas,
    slate_to_candidates=slate_to_candidates
).generate_profile(
    number_of_ballots=1000
)

In [39]:
election = elec.STV(profile, m=2)

In [42]:
?elec.STV

Init signature:
elec.STV(
    profile: votekit.pref_profile.pref_profile.PreferenceProfile,
    m: int = 1,
    transfer: Callable[[str, float, Union[tuple[votekit.ballot.Ballot], list[votekit.ballot.Ballot]], int], tuple[votekit.ballot.Ballot, ...]] = <function fractional_transfer at 0x150028360>,
    quota: str = 'droop',
    simultaneous: bool = True,
    tiebreak: Optional[str] = None,
)
Docstring:
STV elections. All ballots must have no ties.

Args:
    profile (PreferenceProfile):   PreferenceProfile to run election on.
    m (int, optional): Number of seats to be elected. Defaults to 1.
    transfer (Callable[[str, float, Union[tuple[Ballot], list[Ballot]], int], tuple[Ballot,...]], optional):
    Transfer method. Defaults to fractional transfer.
        Function signature is elected candidate, their number of first-place votes, the list of
        ballots with them ranked first, and

In any multi-round election, you can get all of the information about
how the election progressed by accessing the `election_states` attribute of the
`Election` object, which holds one `ElectionState` per round.

In [40]:
election

              Status  Round
Whisky       Elected      1
Whence       Elected      4
Candy      Remaining      4
Whiteman  Eliminated      3
Claire    Eliminated      2

In [25]:
election.get_elected()

(frozenset({'Whisky'}), frozenset({'Claire'}), frozenset({'Whence'}))
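The return value above is a tuple of frozensets rather than a flat list, and it appears to hold one set per round in which someone was elected. A minimal sketch of flattening it, using the literal output copied from the cell above; the helper name `flatten_winners` is hypothetical and not part of VoteKit.

```python
# Output of election.get_elected(), copied from the cell above.
elected = (frozenset({"Whisky"}), frozenset({"Claire"}), frozenset({"Whence"}))


def flatten_winners(elected_sets):
    """Return the winners as a flat list, keeping the order of the tuple.

    Candidates that share a set are sorted alphabetically so the result is
    deterministic (a frozenset has no inherent order).
    """
    return [name for round_set in elected_sets for name in sorted(round_set)]


winners = flatten_winners(elected)
print(winners)  # ['Whisky', 'Claire', 'Whence']
```

Sorting within each set only matters when several candidates are elected in the same round.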

In [26]:
for i in range(4):
    print()
    print(election.election_states[i])
    print(election.election_states[i].elected)
    print(election.election_states[i].remaining)
    print(election.election_states[i].scores)


ElectionState(round_number=0, remaining=(frozenset({'Whisky'}), frozenset({'Claire'}), frozenset({'Whence'}), frozenset({'Whiteman'}), frozenset({'Candy'})), elected=(frozenset(),), eliminated=(frozenset(),), tiebreaks={}, scores={'Whisky': np.float64(393.0), 'Whiteman': np.float64(157.0), 'Whence': np.float64(163.0), 'Candy': np.float64(66.0), 'Claire': np.float64(221.0)})
(frozenset(),)
(frozenset({'Whisky'}), frozenset({'Claire'}), frozenset({'Whence'}), frozenset({'Whiteman'}), frozenset({'Candy'}))
{'Whisky': np.float64(393.0), 'Whiteman': np.float64(157.0), 'Whence': np.float64(163.0), 'Candy': np.float64(66.0), 'Claire': np.float64(221.0)}

ElectionState(round_number=1, remaining=(frozenset({'Claire'}), frozenset({'Whence'}), frozenset({'Whiteman'}), frozenset({'Candy'})), elected=(frozenset({'Whisky'}),), eliminated=(frozenset(),), tiebreaks={}, scores={'Whiteman': np.float64(204.6946564885496), 'Whence': np.float64(220.08905852417303), 'Candy': np.float64(73.94910941475827), 
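The round-0 scores show why Whisky is elected first. The sketch below assumes the standard Droop quota definition, floor(ballots / (seats + 1)) + 1, which is what the `quota='droop'` default in the signature above presumably refers to. VoteKit's internal computation is not shown in this notebook, so treat the helper `droop_quota` as illustrative rather than as the library's code.

```python
def droop_quota(num_ballots: int, seats: int) -> int:
    """Standard Droop quota: floor(ballots / (seats + 1)) + 1."""
    return num_ballots // (seats + 1) + 1


# Round-0 first-place totals copied from the output above, as plain floats.
round0_scores = {
    "Whisky": 393.0,
    "Whiteman": 157.0,
    "Whence": 163.0,
    "Candy": 66.0,
    "Claire": 221.0,
}

# 1000 ballots and m=2, as in the profile and STV call earlier in the notebook.
quota = droop_quota(num_ballots=1000, seats=2)
over_quota = [name for name, score in round0_scores.items() if score >= quota]

print(quota)       # 334
print(over_quota)  # ['Whisky']
```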

So all we need now is a good way to generate many samples
across many different settings. Gathering many samples is easy: just
run the ballot generator and the election repeatedly and save the results.

In [27]:
from tqdm.notebook import tqdm
with jl.open('election_results.jsonl', 'w') as writer:
    # for _ in range(10):
    for _ in tqdm(range(30)):
        profile = bg.slate_PlackettLuce.from_params(
            bloc_voter_prop=bloc_voter_prop,
            cohesion_parameters=cohesion_parameters,
            alphas=alphas,
            slate_to_candidates=slate_to_candidates
        ).generate_profile(
            number_of_ballots=10000
        )
        election = elec.STV(profile, m=3)

        writer.write({
            "winners": [winner for winner_set in election.get_elected() for winner in winner_set],
        })

  0%|          | 0/30 [00:00<?, ?it/s]
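Once the results are on disk, they can be read back and summarized. A minimal sketch using only the standard library (`json` instead of `jsonlines`), assuming each line is a JSON object with a `"winners"` list as written above. The helper `tally_winners` and the sample records are made up for illustration and are not results from a real run.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def tally_winners(path):
    """Count how many runs each candidate won in a JSON Lines results file."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                counts.update(json.loads(line)["winners"])
    return counts


# Made-up records for illustration only.
sample_records = [
    {"winners": ["Whisky", "Claire", "Whence"]},
    {"winners": ["Whisky", "Whiteman", "Whence"]},
]

with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample_results.jsonl"
    sample_path.write_text("\n".join(json.dumps(r) for r in sample_records) + "\n")
    counts = tally_winners(sample_path)

print(counts["Whisky"], counts["Claire"])  # 2 1
```

Pointing `tally_winners` at `election_results.jsonl` would summarize the runs saved by the loop above.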

So the question then becomes: how do we make better predictions?
Well, the first thing that we need to do is gather some information about
the location in question. We'll look at our dual graph file in this
notebook, but commonly, you'll need something like census data to augment
your work.

In [28]:
graph = Graph.from_json("../../../../data/gerrymandria.json")

In [29]:
graph.nodes[0]

{'TOTPOP': 1,
 'x': 0,
 'y': 0,
 'county': '1',
 'district': '1',
 'precinct': 0,
 'muni': '1',
 'boundary_node': True,
 'boundary_perim': 1,
 'water_dist': '2',
 'WVAP': 0.8999532942809672,
 'POCVAP': 0.10004670571903285,
 'dem_percent': 0.4409831573065591,
 'rep_percent': 0.5590168426934409}

A good starting point for us here is to just get an estimate of the
statewide POCVAP and WVAP values.

WARNING!!! You should not do this in general! Talk to the people in the location you are trying to model to get better numbers! This is an (admittedly) ad-hoc way of giving us a starting point. We also anticipate that the turnout proportions are going to be one of the things that we vary as we create models for a specific place.

In [30]:
wvap_total = sum(d["WVAP"] for _, d in graph.nodes(data=True))
pocvap_total = sum(d["POCVAP"] for _, d in graph.nodes(data=True))
total_pop = sum(d["TOTPOP"] for _, d in graph.nodes(data=True))
print(f"Total WVAP: {wvap_total}")
print(f"\tTotal WVAP %: {wvap_total / total_pop * 100:.2f}")
print(f"Total POCVAP: {pocvap_total}")
print(f"\tTotal POCVAP %: {pocvap_total / total_pop * 100:.2f}")

Total WVAP: 47.91742737423775
	Total WVAP %: 74.87
Total POCVAP: 16.08257262576225
	Total POCVAP %: 25.13


These would be good starting points for some of the parameters for our
ballot generator, namely the `bloc_voter_prop` parameter.
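The step from column totals to bloc proportions can be sketched as a small helper. This is an illustration under stated assumptions: the function name `bloc_proportions` is hypothetical, mapping `WVAP` to bloc `"W"` and `POCVAP` to bloc `"C"` is a modeling choice, and the helper normalizes by the sum of the chosen columns rather than by `TOTPOP` (the two coincide in this dataset, since the totals above add up to 64).

```python
def bloc_proportions(node_attrs, columns):
    """Turn summed node columns into proportions that add up to 1.

    node_attrs: iterable of per-node attribute dicts
    columns: maps a bloc label to the node column that measures it
    """
    node_attrs = list(node_attrs)
    totals = {
        bloc: sum(attrs[col] for attrs in node_attrs)
        for bloc, col in columns.items()
    }
    grand_total = sum(totals.values())
    return {bloc: total / grand_total for bloc, total in totals.items()}


# A single stand-in "node" holding the statewide totals printed above.
nodes = [{"WVAP": 47.91742737423775, "POCVAP": 16.08257262576225}]

props = bloc_proportions(nodes, {"W": "WVAP", "C": "POCVAP"})
print({bloc: round(p, 2) for bloc, p in props.items()})  # {'W': 0.75, 'C': 0.25}
```

With the dual graph loaded above, the same helper could be called as `bloc_proportions((d for _, d in graph.nodes(data=True)), {"W": "WVAP", "C": "POCVAP"})`.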

In [31]:
bloc_voter_prop = {
    "W": 0.75,
    "C": 0.25
}
cohesion_parameters = {
    "W": {
        "W": 0.7,
        "C": 0.3
    },
    "C": {
        "W": 0.7,
        "C": 0.3
    }
}
alphas = {
    "W": {
        "W": 1,
        "C": 1
    },
    "C": {
        "W": 1,
        "C": 1
    }
}
slate_to_candidates = {
    "W": [
        "Whiteman",
        "Whisky",
        "Whence"
    ],
    "C": [
        "Candy",
        "Claire"
    ]
}

In [32]:
ballot_generator_kwargs = dict(
    bloc_voter_prop=bloc_voter_prop,
    cohesion_parameters=cohesion_parameters,
    alphas=alphas,
    slate_to_candidates=slate_to_candidates
)
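Before running many simulations from a settings dictionary like this, it can help to sanity-check it. The sketch below assumes that the bloc proportions and each row of `cohesion_parameters` are meant to sum to 1, as they do in this notebook. The function `check_generator_params` is hypothetical and is not a VoteKit API; the library may well perform its own validation.

```python
import math


def check_generator_params(bloc_voter_prop, cohesion_parameters, slate_to_candidates):
    """Raise ValueError if the settings are not internally consistent."""
    if not math.isclose(sum(bloc_voter_prop.values()), 1.0):
        raise ValueError("bloc_voter_prop must sum to 1")
    for bloc, row in cohesion_parameters.items():
        if bloc not in bloc_voter_prop:
            raise ValueError(f"unknown bloc {bloc!r} in cohesion_parameters")
        if set(row) != set(slate_to_candidates):
            raise ValueError(f"cohesion row for {bloc!r} must cover every slate")
        if not math.isclose(sum(row.values()), 1.0):
            raise ValueError(f"cohesion row for {bloc!r} must sum to 1")


# Same values as the cells above, repeated so the example runs on its own.
bloc_voter_prop = {"W": 0.75, "C": 0.25}
cohesion_parameters = {"W": {"W": 0.7, "C": 0.3}, "C": {"W": 0.7, "C": 0.3}}
slate_to_candidates = {"W": ["Whiteman", "Whisky", "Whence"], "C": ["Candy", "Claire"]}

check_generator_params(bloc_voter_prop, cohesion_parameters, slate_to_candidates)
print("settings look consistent")
```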

In [33]:
with jl.open('election_results2.jsonl', 'w') as writer:
    # for _ in range(10):
    for _ in tqdm(range(30)):
        profile = bg.slate_PlackettLuce.from_params(
            **ballot_generator_kwargs
        ).generate_profile(
            number_of_ballots=10000
        )
        election = elec.STV(profile, m=3)

        writer.write({
            "winners": [winner for winner_set in election.get_elected() for winner in winner_set],
        })

  0%|          | 0/30 [00:00<?, ?it/s]

Okay, this is a great starting point, but a pile of `for`
loops in a single notebook is difficult to read, audit, and scale.
We are going to need a better way to run and organize these experiments if we want
to keep track of all of the information we are generating.
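One possible first step, sketched here under assumptions, is to separate the description of the settings from the loop that runs them. The generator `iter_settings` and the parameter values are hypothetical. The sketch only builds settings dictionaries in the same shape as `ballot_generator_kwargs` and does not call VoteKit.

```python
from itertools import product


def iter_settings(w_shares, w_cohesions):
    """Yield one settings dict per (W share, W cohesion) combination."""
    for w_share, w_cohesion in product(w_shares, w_cohesions):
        yield {
            "bloc_voter_prop": {"W": w_share, "C": round(1 - w_share, 10)},
            "cohesion_parameters": {
                "W": {"W": w_cohesion, "C": round(1 - w_cohesion, 10)},
                "C": {"W": 0.7, "C": 0.3},  # held fixed, as in the cells above
            },
        }


settings = list(iter_settings(w_shares=[0.7, 0.75, 0.8], w_cohesions=[0.6, 0.7]))
print(len(settings))                   # 6
print(settings[0]["bloc_voter_prop"])  # {'W': 0.7, 'C': 0.3}
```

Each dictionary could then be merged with `alphas` and `slate_to_candidates` and saved alongside its results, so every output line records the settings that produced it.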