## PyPoll

![Vote-Counting](Images/Vote_counting.jpg)

* In this challenge, you are tasked with helping a small, rural town modernize its vote-counting process. (Up until now, Uncle Cleetus has been trustfully tallying the votes one by one, but unfortunately, his concentration isn't what it used to be.)

* You will be given a set of poll data called [election_data.csv](PyPoll/Resources/election_data.csv). The dataset is composed of three columns: `Voter ID`, `County`, and `Candidate`. Your task is to create a Python script that analyzes the votes and calculates each of the following:

  * The total number of votes cast

  * A complete list of candidates who received votes

  * The percentage of votes each candidate won

  * The total number of votes each candidate won

  * The winner of the election based on popular vote

* As an example, your analysis should look similar to the one below:

  ```text
  Election Results
  -------------------------
  Total Votes: 3521001
  -------------------------
  Khan: 63.000% (2218231)
  Correy: 20.000% (704200)
  Li: 14.000% (492940)
  O'Tooley: 3.000% (105630)
  -------------------------
  Winner: Khan
  -------------------------
  ```

* In addition, your final script should both print the analysis to the terminal and export a text file with the results.
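
The five quantities listed above can be computed in a single pass over the rows. The following is a minimal sketch that uses only the standard library (`csv` and `collections.Counter`) rather than the pandas approach taken in the notebook cells. `SAMPLE_CSV` is made-up data with the same three columns, and `tally_votes` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up sample with the same columns as election_data.csv (illustration only).
SAMPLE_CSV = """Voter ID,County,Candidate
1,Marsh,Khan
2,Marsh,Correy
3,Queen,Khan
4,Queen,Li
5,Marsh,Khan
"""

def tally_votes(csv_file):
    """Return (total_votes, Counter of votes per candidate) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row["Candidate"] for row in reader)
    return sum(counts.values()), counts

total, counts = tally_votes(io.StringIO(SAMPLE_CSV))
candidates = list(counts)                 # candidates in order of first appearance
percentages = {name: 100 * n / total for name, n in counts.items()}
winner = max(counts, key=counts.get)      # candidate with the most votes
```

For the real file, the same function could be called on `open(path, newline="")` instead of the in-memory sample.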

In [1]:
#import required libraries
import pandas as pd
import os
import numpy as np

In [2]:
#set path to current working directory
os.chdir(r"E:\SAU\ML\Practice\Shiny")  # raw string so the backslashes are not treated as escape sequences
os.getcwd()

'E:\\SAU\\ML\\Practice\\Shiny'
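
Changing the working directory ties the notebook to one machine. As an alternative sketch, the input path can be built with `pathlib`. The relative location below is taken from the link in the instructions (`PyPoll/Resources/election_data.csv`) and is an assumption about where the file sits relative to the directory the script runs from.

```python
from pathlib import Path

# Assumed location, taken from the link in the instructions above.
data_path = Path("PyPoll") / "Resources" / "election_data.csv"

# A Path object can be passed directly to pd.read_csv or open().
if not data_path.exists():
    print(f"Input file not found: {data_path}")
```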

In [3]:
data = pd.read_csv("homework_03-Python_Instructions_PyPoll_Resources_election_data.csv")
data.head()

|   | Voter ID | County | Candidate |
|---|----------|--------|-----------|
| 0 | 12864552 | Marsh  | Khan      |
| 1 | 17444633 | Marsh  | Correy    |
| 2 | 19330107 | Marsh  | Khan      |
| 3 | 19865775 | Queen  | Khan      |
| 4 | 11927875 | Marsh  | Khan      |


In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3521001 entries, 0 to 3521000
Data columns (total 3 columns):
Voter ID     int64
County       object
Candidate    object
dtypes: int64(1), object(2)
memory usage: 80.6+ MB


In [5]:
data.describe()

|       | Voter ID   |
|-------|------------|
| count | 3521001.0  |
| mean  | 15000820.0 |
| std   | 2886765.0  |
| min   | 10000000.0 |
| 25%   | 12500720.0 |
| 50%   | 15001980.0 |
| 75%   | 17500100.0 |
| max   | 20000000.0 |


In [6]:
data.shape

(3521001, 3)

In [7]:
#The total number of votes cast
total_votes = data['Voter ID'].count()
total_votes

3521001

In [8]:
#A complete list of candidates who received votes
#a = list(filter(lambda x: x != data["Voter ID"].isna(), data["Candidate"]))
candid = list(data["Candidate"].unique())
candid

['Khan', 'Correy', 'Li', "O'Tooley"]

In [9]:
#The percentage of votes each candidate won
percent_votes = (data["Candidate"].value_counts() / total_votes) * 100
percentage_votes = dict(percent_votes.round(3))  # keep three decimals, as in the example output
percentage_votes

{'Correy': 20.0, 'Khan': 63.0, 'Li': 14.0, "O'Tooley": 3.0}

In [10]:
#The total number of votes each candidate won
votes_per_candidate = dict(data["Candidate"].value_counts())
votes_per_candidate

{'Correy': 704200, 'Khan': 2218231, 'Li': 492940, "O'Tooley": 105630}
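
The dictionaries above carry the right numbers but not the `Name: 63.000% (2218231)` layout from the example. A possible way to produce those lines, sketched here with the counts copied from the cell output above, is an f-string with the `.3f` format specifier. `format_candidate_lines` is a hypothetical helper, not part of the original notebook.

```python
# Vote counts copied from the notebook output above.
votes_per_candidate = {"Khan": 2218231, "Correy": 704200, "Li": 492940, "O'Tooley": 105630}

def format_candidate_lines(counts):
    """Build one 'Name: pct% (votes)' line per candidate, highest count first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [f"{name}: {100 * n / total:.3f}% ({n})" for name, n in ranked]

for line in format_candidate_lines(votes_per_candidate):
    print(line)
```

Formatting at print time keeps the stored percentages unrounded, so no precision is lost before the final display.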

In [11]:
#The winner of the election based on popular vote.
win = max(votes_per_candidate.items(), key = lambda k:k[1])
win

('Khan', 2218231)
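
`max` returns only the first candidate with the highest count, so a tie would pass unnoticed. A small sketch that reports every candidate sharing the top count follows. `find_winners` and the tied sample counts are made up for illustration.

```python
def find_winners(counts):
    """Return all candidates that share the highest vote count."""
    top = max(counts.values())
    return [name for name, n in counts.items() if n == top]

# Counts from the notebook output above: a single clear winner.
print(find_winners({"Khan": 2218231, "Correy": 704200, "Li": 492940, "O'Tooley": 105630}))

# Made-up tied counts: both leaders are returned.
print(find_winners({"A": 10, "B": 10, "C": 3}))
```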

In [12]:
#desired output  
print("Election Results")
print("------------------------")
print("Total Votes: ",total_votes)
print("------------------------")
print("Percentage votes per candidate: ",percentage_votes)
print("Number of votes for each candidate: ",votes_per_candidate)
print("------------------------")
print("Winner: ",win[0])
print("------------------------")

Election Results
------------------------
Total Votes:  3521001
------------------------
Percentage votes per candidate:  {'Khan': 63.0, 'Correy': 20.0, 'Li': 14.0, "O'Tooley": 3.0}
Number of votes for each candidate:  {'Khan': 2218231, 'Correy': 704200, 'Li': 492940, "O'Tooley": 105630}
------------------------
Winner:  Khan
------------------------


In [13]:
output = pd.DataFrame({
    "Total Votes": [total_votes],
    "Percentage votes per candidate": [percentage_votes],
    "Number of votes for each candidate": [votes_per_candidate],
    "Winner of the election": [win[0]],
})
output

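|   | Number of votes for each candidate | Percentage votes per candidate | Total Votes | Winner of the election |
|---|------------------------------------|--------------------------------|-------------|------------------------|
| 0 | {'Khan': 2218231, 'Correy': 704200, 'Li': 4929... | {'Khan': 63.0, 'Correy': 20.0, 'Li': 14.0, 'O'... | 3521001 | Khan |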


In [14]:
output.to_csv("Polls_output.txt", index = False, header = True)
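
`to_csv` writes the dictionaries as quoted CSV cells rather than the plain-text layout requested in the instructions. One way to get that layout, sketched under the assumption that the counts are available as a dictionary, is to build the report as a single string, print it, and write the same string to a file. `build_report` is a hypothetical helper, and the file goes to a temporary directory here so the sketch does not overwrite anything.

```python
import tempfile
from pathlib import Path

def build_report(counts):
    """Return the election summary as a single newline-joined string."""
    total = sum(counts.values())
    rule = "-" * 25
    lines = ["Election Results", rule, f"Total Votes: {total}", rule]
    for name, n in counts.items():
        lines.append(f"{name}: {100 * n / total:.3f}% ({n})")
    lines += [rule, f"Winner: {max(counts, key=counts.get)}", rule]
    return "\n".join(lines)

# Counts copied from the notebook output above.
report = build_report({"Khan": 2218231, "Correy": 704200, "Li": 492940, "O'Tooley": 105630})
print(report)

# Write the same text to a file (temporary directory used for the sketch).
out_path = Path(tempfile.mkdtemp()) / "Polls_output.txt"
out_path.write_text(report + "\n", encoding="utf-8")
```

Because the terminal output and the file come from the same string, the two cannot drift apart.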