## Choosing Congressional Candidates

We'll be working with congressional candidates this semester. One thing we'll need is each candidate's website (and other covariates), and gathering those takes some legwork. Let's divide and conquer.

In [None]:
# best practice is to do all imports at the beginning. 
from random import sample, choices, seed
from collections import Counter

file_list_of_districts = "district_list.txt"
output_file_name = "district_assignments.txt"

class_members = ["alex","anna","tony","john",
                 "kailey","kaixuan","michelle",
                 "natalie","patrick","bobby",
                 "thomas","will"]

Let's read this into a list. Note the use of `next` in the cell below--let's talk about what it does. 

In [None]:
district_list = list()
state_list = list()

with open(file_list_of_districts) as f :
    next(f)
    for item in f :
        district_list.append(item.strip()) # Question: what does strip do?
        state_list.append(item[:2])
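
To see what `next` and `strip` are doing here, this is a minimal sketch that uses an in-memory stand-in for the file. The contents of `fake_file` are made up for illustration (I'm assuming a header line followed by one district per line); the real `district_list.txt` may look different.

```python
from io import StringIO

# Hypothetical stand-in for district_list.txt: a header, then one district per line.
fake_file = StringIO("district\nAL01\nAL02\nAK00\n")

# next() pulls exactly one item off the iterator, so this consumes the header line.
header = next(fake_file)

# The for-loop (here a comprehension) picks up where next() left off.
# strip() removes leading and trailing whitespace, including the newline.
remaining = [line.strip() for line in fake_file]

print(repr(header))
print(remaining)
```

Without the call to `next`, the header line would end up in the list as if it were a district.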

The `Counter` collection gives us an easy way to count how many times each item appears in a list. The result behaves like a dictionary that maps each item to its count.

In [None]:
state_count = Counter(state_list)

In [None]:
state_count.most_common(10)
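
Here is a small sketch of `Counter` on a toy list, so you can check the counts by eye. The list `toy_states` is made up and is not drawn from our district file.

```python
from collections import Counter

toy_states = ["MT", "CA", "CA", "TX", "CA", "TX"]  # made-up data

toy_count = Counter(toy_states)

# most_common(n) returns the n most frequent items as (item, count) pairs,
# ordered from most to least frequent.
top_two = toy_count.most_common(2)

# Unlike a plain dict, a Counter returns 0 for an item it has never seen.
missing = toy_count["ZZ"]

print(top_two, missing)
```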

Now we want to give everyone roughly the same number of districts. Let's figure out how many that is by dividing the number of districts by the number of class members.

In [None]:
num_students = len(class_members)
num_districts = len(district_list)

num_per_member = num_districts/num_students

Now that we've got that, let's look at what `sample` does. 

In [None]:
sample(k=round(num_per_member),population=district_list)
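
It helps to contrast `sample` with `choices`, since both show up in this notebook. The sketch below uses a made-up four-item population and an arbitrary seed: `sample` draws without replacement within a single call, while `choices` draws with replacement.

```python
from random import sample, choices, seed

seed(0)  # arbitrary seed, just so the sketch is repeatable

toy_population = ["a", "b", "c", "d"]  # made-up data

# sample never repeats an item within one call, so asking for all four
# items gives a reshuffled copy of the population.
without_replacement = sample(toy_population, k=4)

# choices can repeat items, so k may be larger than the population.
with_replacement = choices(toy_population, k=10)

print(without_replacement)
print(with_replacement)
```

Asking `sample` for more items than the population holds raises a `ValueError`; `choices` has no such limit.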

---

There are a few different ways we could allocate districts to the members of our class. I'm going to illustrate two of them so you can compare and contrast. Both approaches sample class members with replacement: the first builds the list up one draw at a time, and the second uses the `choices` function.

### First Assignment Approach

In [None]:
# Setting the random seed ensures we get the same output
seed(20180903)

In [None]:
district_assignment = [0] * len(district_list) # make a placeholder list of the appropriate length.

for idx in range(len(district_assignment)) :
    district_assignment[idx] = sample(k=1,population=class_members)[0] # sample returns a list of length k

Now we can use `zip` to merge the two. 

In [None]:
assignments = dict(zip(district_list,district_assignment))
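
If `zip` is new to you, here is a small sketch of what happens in that line. The labels and names in the two toy lists are made up for illustration.

```python
toy_districts = ["AL01", "AL02", "AK00"]  # made-up labels
toy_people = ["anna", "will", "anna"]

# zip pairs up items by position: first with first, second with second, ...
pairs = list(zip(toy_districts, toy_people))

# dict() turns those (key, value) pairs into a lookup table.
lookup = dict(pairs)

# zip stops at the end of the shorter input, silently dropping the rest.
short = list(zip(toy_districts, ["anna"]))

print(pairs)
print(lookup["AL02"])
print(short)
```

That last behavior is worth remembering: if the two lists had different lengths, some districts would quietly go unassigned.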

Let's look at the first 10 items in this dictionary. We'll use our `enumerate` trick. 

In [None]:
for idx, dist in enumerate(assignments) :
    print("".join([assignments[dist], 
                  " will get information for ",
                  dist]))
    
    if idx == 9 :
        break

And now we'll just write out the results. 

In [None]:
with open(output_file_name,'w') as ofile : # note 'w' flag
    ofile.write("\t".join(["district","class_member"]) + "\n")
    
    for dist, person in assignments.items() :
        ofile.write("\t".join([dist,person]) + "\n")
        

### Second Assignment Approach

We can do the same thing using `choices` and save some typing. I'll keep the variable names the same so you can trace what's going on.

In [None]:
district_assignment = choices(k=len(district_list),population=class_members)

assignments = dict(zip(district_list,district_assignment))

with open(output_file_name,'w') as ofile : # note 'w' flag
    ofile.write("\t".join(["district","class_member"]) + "\n")
    
    for dist, person in assignments.items() :
        ofile.write("\t".join([dist,person]) + "\n")

Let's use `Counter` to see how many districts each of us got. 

In [None]:
Counter(district_assignment)

Somewhat clumpy, but not the end of the world. If you've got some free time, try to figure out how you'd randomly assign districts to people so that everyone ends up with roughly the same number of districts.
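
One possible way to even things out, if you want something to compare your own attempt against: shuffle the districts, then deal them out round-robin like a deck of cards. This is only a sketch. The function name `balanced_assignment` and the toy inputs are made up here, and there are other reasonable ways to do it.

```python
from random import shuffle, seed
from collections import Counter

def balanced_assignment(districts, members):
    """Shuffle the districts, then deal them out to members round-robin."""
    shuffled = list(districts)  # copy so the caller's list is left untouched
    shuffle(shuffled)           # shuffle works in place
    # The modulo cycles through the members: 0, 1, 2, 0, 1, 2, ...
    return {dist: members[i % len(members)] for i, dist in enumerate(shuffled)}

seed(20180903)

toy_districts = [f"D{n:02d}" for n in range(1, 11)]  # ten made-up district labels
toy_members = ["alex", "anna", "tony"]

toy_assignments = balanced_assignment(toy_districts, toy_members)
toy_counts = Counter(toy_assignments.values())

print(toy_counts)
```

With this approach no two people differ by more than one district. The people at the front of the list pick up any extras, so you could shuffle the members as well if that seems unfair.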