# 2011 VA House of Delegates Election

## Election Results

These precinct-level election results come directly from the Virginia Department of Elections, and they require significant cleaning. 

In [44]:
library(sf)
library(ggplot2)
library(dplyr)
library(tibble)
library(magrittr)

df <- read.csv(file = "../data/official-VA-2005-2019/2011-general.csv")
head(df, 1)
print(nrow(df))

Unnamed: 0_level_0,CandidateUid,FirstName,MiddleName,LastName,Suffix,TOTAL_VOTES,Party,WriteInVote,LocalityUid,LocalityCode,...,PrecinctName,DistrictUid,DistrictType,DistrictName,OfficeUid,OfficeTitle,ElectionUid,ElectionType,ElectionDate,ElectionName
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<int>,<chr>,<int>,...,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
1,,,,WRITE IN VOTES,,1,,1,{15B7E141-2D1D-44C2-A50A-AAE021BC9B7D},1,...,# AB - Central Absentee Precinct,,,,{60E9BA28-D184-4DAD-9A44-1F405822F9F4},Commissioner of Revenue,{EB178FD6-875D-4B0D-A295-900A0482F523},General,2011-11-08 00:00:00,2011 November General


[1] 57228


Now, this CSV file includes the results by precinct for every single race in Virginia's November 2011 general election, so I have some serious filtering to do. Currently, there are 57,228 records.
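
Before filtering, it can help to see how the records split across contest types. The sketch below tabulates a small made-up data frame by `DistrictType`; `toy` and all of its values (including "State Senate") are illustrative assumptions, and only the column names come from the real file.

```r
library(dplyr)

# Made-up rows standing in for the real file; only the column names match it.
toy <- data.frame(
  DistrictType = c("House of Delegates", "House of Delegates", "State Senate", ""),
  OfficeTitle  = c("Member House of Delegates", "Member House of Delegates",
                   "Member Senate of Virginia", "Commissioner of Revenue"),
  stringsAsFactors = FALSE
)

# Number of records per district type, largest first.
type_counts <- toy %>% count(DistrictType, sort = TRUE)
print(type_counts)
```

Running the same `count()` call on `df` would show which `DistrictType` values exist and how many rows each one contributes.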

Filters:
- `DistrictType` = "House of Delegates" -> 7002 records
- `Party` = Democratic or Republican (sorry third parties) -> 3868 records
- `PrecinctName` is neither "# AB - Central Absentee Precinct" nor "## Provisional" -> 3376 records
    - Provisional ballots and absentee ballots aren't assigned a precinct, so I can't use them to measure precinct-level election results

In [45]:
# Keep only major-party House of Delegates rows that belong to a real precinct
df <- df %>% 
    filter(DistrictType == "House of Delegates") %>%
    filter(Party %in% c("Democratic", "Republican")) %>%
    filter(!(PrecinctName %in% c("# AB - Central Absentee Precinct", "## Provisional")))
print(nrow(df))

[1] 3376
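
To check the intermediate counts listed in the filters above, each filter can be applied as its own step and the row count recorded after each one. This is a minimal sketch on made-up rows: `toy`, `step1` through `step3`, and `step_counts` are hypothetical names, and the values are illustrative only.

```r
library(dplyr)

# Made-up rows; column names match the real file, values are illustrative only.
toy <- data.frame(
  DistrictType = c("House of Delegates", "House of Delegates", "House of Delegates",
                   "House of Delegates", "State Senate"),
  Party        = c("Democratic", "Republican", "Independent", "Democratic", "Republican"),
  PrecinctName = c("101 - Example", "101 - Example", "101 - Example",
                   "# AB - Central Absentee Precinct", "101 - Example"),
  stringsAsFactors = FALSE
)

# Apply the three filters one at a time so each intermediate size is visible.
step1 <- toy %>% filter(DistrictType == "House of Delegates")
step2 <- step1 %>% filter(Party %in% c("Democratic", "Republican"))
step3 <- step2 %>%
  filter(!(PrecinctName %in% c("# AB - Central Absentee Precinct", "## Provisional")))

# Row counts before filtering and after each step.
step_counts <- c(all = nrow(toy), house = nrow(step1),
                 major_party = nrow(step2), precinct_only = nrow(step3))
print(step_counts)
```

Swapping `toy` for the unfiltered `df` should give the per-step record counts for the real file.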


Now I have 3376 records, where each record is one candidate running in one precinct. What I would like to do is produce a pivot table, where:
- index = `PrecinctName`
- columns
    - `G11DHOD` = all votes for Democratic candidates in that precinct
    - `G11RHOD` = all votes for Republican candidates in that precinct

In [46]:
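# Sum Democratic and Republican votes within each precinct.
# summarise() already returns one row per group, so distinct() is not needed.
df_precinct <- df %>%
    group_by(PrecinctName) %>%
    summarise(G11DHOD = sum(TOTAL_VOTES[Party == "Democratic"]),
              G11RHOD = sum(TOTAL_VOTES[Party == "Republican"]))

# write.csv() returns NULL, so write the file in a separate step
# instead of ending the pipeline that assigns df_precinct with it.
write.csv(df_precinct, "../mcmc/va-official-2011/2011-precinct-results.csv")

`summarise()` ungrouping output (override with `.groups` argument)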
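
One caveat worth checking: grouping by `PrecinctName` alone adds together any precincts that share a name across different localities. Whether that happens in this file is an assumption I have not verified here. The sketch below uses made-up rows (`toy`, `by_name`, and `by_locality` are hypothetical names) to show how adding `LocalityCode` to the grouping keeps such precincts separate.

```r
library(dplyr)

# Made-up rows: two localities that both have a precinct called "101 - Example".
toy <- data.frame(
  LocalityCode = c(1, 1, 2, 2),
  PrecinctName = c("101 - Example", "101 - Example", "101 - Example", "101 - Example"),
  Party        = c("Democratic", "Republican", "Democratic", "Republican"),
  TOTAL_VOTES  = c(100, 80, 40, 60),
  stringsAsFactors = FALSE
)

# Grouping by name only merges the two localities into a single row.
by_name <- toy %>%
  group_by(PrecinctName) %>%
  summarise(G11DHOD = sum(TOTAL_VOTES[Party == "Democratic"]),
            G11RHOD = sum(TOTAL_VOTES[Party == "Republican"]),
            .groups = "drop")

# Grouping by locality and name keeps one row per actual precinct.
by_locality <- toy %>%
  group_by(LocalityCode, PrecinctName) %>%
  summarise(G11DHOD = sum(TOTAL_VOTES[Party == "Democratic"]),
            G11RHOD = sum(TOTAL_VOTES[Party == "Republican"]),
            .groups = "drop")

print(by_name)
print(by_locality)
```

If `nrow()` differs between the two groupings on the real data, the name-only pivot is combining distinct precincts.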


Ok, so now I've calculated the Democratic and Republican vote totals by precinct. The next step will be to add the population and voting-age population by precinct, using IPUMS data.