In this challenge, you are tasked with helping a small, rural town modernize its vote-counting process. (Up until now, Uncle Cleetus has been trustfully tallying the votes one by one, but unfortunately, his concentration isn't what it used to be.)

You will be given a set of poll data called election_data.csv. The dataset is composed of three columns: Voter ID, County, and Candidate. Your task is to create a Python script that analyzes the votes and calculates each of the following:


The total number of votes cast
A complete list of candidates who received votes
The percentage of votes each candidate won
The total number of votes each candidate won
The winner of the election based on popular vote


As an example, your analysis should look similar to the one below:


  Election Results
  -------------------------
  Total Votes: 3521001
  -------------------------
  Khan: 63.000% (2218231)
  Correy: 20.000% (704200)
  Li: 14.000% (492940)
  O'Tooley: 3.000% (105630)
  -------------------------
  Winner: Khan
  -------------------------

In addition, your final script should both print the analysis to the terminal and export a text file with the results.

In [1]:
import pandas as pd
df = pd.read_csv("Resources/election_data.csv")

In [2]:
df.head()

   Voter ID County Candidate
0  12864552  Marsh      Khan
1  17444633  Marsh    Correy
2  19330107  Marsh      Khan
3  19865775  Queen      Khan
4  11927875  Marsh      Khan


In [5]:
df['Candidate'].value_counts()

Khan        2218231
Correy       704200
Li           492940
O'Tooley     105630
Name: Candidate, dtype: int64
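
value_counts() tallies how many rows carry each candidate's name. For comparison, here is a minimal sketch of the same tally using only the standard library, run on the five rows shown by df.head() above. The names `sample_csv` and `tally` are made up for this illustration and are not part of the notebook.

```python
import csv
import io
from collections import Counter

# The five rows shown by df.head(), written out as CSV text so the sketch runs on its own.
sample_csv = """Voter ID,County,Candidate
12864552,Marsh,Khan
17444633,Marsh,Correy
19330107,Marsh,Khan
19865775,Queen,Khan
11927875,Marsh,Khan
"""

# DictReader yields one dict per row, keyed by the header names.
reader = csv.DictReader(io.StringIO(sample_csv))
tally = Counter(row["Candidate"] for row in reader)

# most_common() orders candidates by vote count, highest first.
for name, votes in tally.most_common():
    print(f"{name}: {votes}")
```

On the full file, the same loop would read from an open file handle instead of the in-memory string.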

In [5]:
total_votes = df['Voter ID'].nunique()
print(f"Total votes: {total_votes}")

Total votes: 3521001


In [6]:
df.Candidate.unique()

array(['Khan', 'Correy', 'Li', "O'Tooley"], dtype=object)

In [7]:
Khan = df.Candidate[df.Candidate == 'Khan'].count()
Correy = df.Candidate[df.Candidate == 'Correy'].count()
Li = df.Candidate[df.Candidate == 'Li'].count()
OTooley = df.Candidate[df.Candidate == "O'Tooley"].count()

In [8]:
print(f"Khan: {round(Khan/total_votes*100, 3)}% ({Khan})")
print(f"Correy: {round(Correy/total_votes*100, 3)}% ({Correy})")
print(f"Li: {round(Li/total_votes*100, 3)}% ({Li})")
print(f"O'Tooley: {round(OTooley/total_votes*100, 3)}% ({OTooley})")

Khan: 63.0% (2218231)
Correy: 20.0% (704200)
Li: 14.0% (492940)
O'Tooley: 3.0% (105630)
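
The sample report shows percentages with three decimal places (63.000%), whereas round(..., 3) drops trailing zeros and prints 63.0%. One way to match the sample is the `.3f` format specifier. This is a minimal sketch that assumes the counts printed earlier; the `counts` dict and the `format_line` helper are hypothetical names used only for illustration.

```python
# Vote counts copied from the value_counts() output above.
counts = {"Khan": 2218231, "Correy": 704200, "Li": 492940, "O'Tooley": 105630}
total_votes = sum(counts.values())  # 3521001


def format_line(name, votes, total):
    """Return one report line with the percentage fixed at three decimals."""
    return f"{name}: {votes / total * 100:.3f}% ({votes})"


lines = [format_line(name, votes, total_votes) for name, votes in counts.items()]
for line in lines:
    print(line)
```

The `.3f` specifier always pads to three decimals, so the lines keep a uniform width regardless of the value.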


In [10]:
candidates = df.Candidate.unique()
# Build one record per candidate: the name plus the number of rows (votes) with that name.
candidate_data = [
    {"name": candidate, "vote_count": df.Candidate[df.Candidate == candidate].count()}
    for candidate in candidates
]
candidate_data

[{'name': 'Khan', 'vote_count': 2218231},
 {'name': 'Correy', 'vote_count': 704200},
 {'name': 'Li', 'vote_count': 492940},
 {'name': "O'Tooley", 'vote_count': 105630}]

In [11]:
def print_info(candidate_data, total_votes):
    print(f"{candidate_data['name']}: {round(candidate_data['vote_count']/total_votes*100, 3)}% ({candidate_data['vote_count']})")

In [12]:
for candidate in candidate_data:
    print_info(candidate, total_votes)

Khan: 63.0% (2218231)
Correy: 20.0% (704200)
Li: 14.0% (492940)
O'Tooley: 3.0% (105630)
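
The task also asks for the winner by popular vote and for the results to be both printed and exported to a text file. The following is a rough sketch of those two remaining steps, assuming the counts computed above. The `build_report` helper, the `counts` dict, and the output file name are hypothetical, and the file is written to the system temp directory only so that the sketch runs anywhere.

```python
import tempfile
from pathlib import Path

# Vote counts copied from the notebook output above.
counts = {"Khan": 2218231, "Correy": 704200, "Li": 492940, "O'Tooley": 105630}


def build_report(counts):
    """Assemble the full results report as a single string."""
    total = sum(counts.values())
    winner = max(counts, key=counts.get)  # candidate with the most votes
    rule = "-" * 25
    lines = ["Election Results", rule, f"Total Votes: {total}", rule]
    lines += [
        f"{name}: {votes / total * 100:.3f}% ({votes})"
        for name, votes in counts.items()
    ]
    lines += [rule, f"Winner: {winner}", rule]
    return "\n".join(lines)


report = build_report(counts)
print(report)  # terminal output

# Hypothetical output location; swap in whatever path the assignment expects.
output_path = Path(tempfile.gettempdir()) / "election_results.txt"
output_path.write_text(report + "\n", encoding="utf-8")
```

Building the report as one string first means the terminal and the file receive identical text.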