# Data Truth & Joining Datasets
### Stephen Alger 
### March 5 2021
---
### Visualisation Objective: Irish General Election Results 2016
> Data Source(s): <br>
> https://data.gov.ie/dataset/candidate-details-for-general-election-2016 <br>
> https://data.gov.ie/dataset/general-election-2016-constituency-details <br>
> https://data.gov.ie/dataset/general-election-2016-count-details <br>
---


In [11]:
#Dependency Installation & Loading
install.packages("sqldf")
install.packages("cowplot")
install.packages("viridis")
install.packages("tidyverse")

library(viridis)
library(tidyverse)
library(cowplot)
library(sqldf)

#Set Environment Plot Size:
options(repr.plot.width=25, repr.plot.height=10)

#Define my Own Theme
my_Theme <- theme(plot.title = element_text(face="bold",size=40),
                  plot.subtitle = element_text(face="italic", size = 20),
                  axis.title = element_text(face="bold",size=20),
                  axis.text = element_text(size=20),
                  plot.caption = element_text(size = 16),
                  legend.title = element_text(size = 16),
                  legend.text  = element_text(size = 14))


The downloaded binary packages are in
	/var/folders/tj/n0crv2hj7zj5vxqbn5kjllf80000gn/T//Rtmp7WHzWS/downloaded_packages



---
# Utility Work: Load Data & Investigate The Dataset Characteristics


In [12]:
# Input File Names:
candidateFilePath = "./data/GE2016-candidate-details.csv"
constituencyFilePath = "./data/GE2016constituencydetails.csv"
electionCountFilePath = "./data/GE2016-count-details.csv"

In [37]:
#Candidate Data Set CSV into canddf - candidate dataframe
canddf <- read.csv(candidateFilePath, sep = ",", fileEncoding="latin1")
head(canddf)

#Check for NA Values - None Returned (blank strings, such as the empty Result in row 3 below, are not counted as NA)
which(is.na(canddf)) 

,Constituency,Surname,First.Name,Gender,Party,Party.Abbreviation,Count.Number,Required.To.Reach.Quota,Required.To.Save.Deposit,Votes,Result,Candidate.Id,Constituency.Number,Constituency.Ainm
,<fct>,<fct>,<fct>,<fct>,<fct>,<fct>,<int>,<int>,<int>,<int>,<fct>,<int>,<int>,<fct>
1,Galway West,Ó Cuív,Éamon,M,Fianna Fáil,F.F.,14,0,0,9539,Elected,165,23,Gaillimh Thiar
2,Louth,Adams,Gerry,M,Sinn Féin,S.F.,11,0,0,10661,Elected,160,31,Lú
3,Cork East,Ahern,Barbara,F,Fianna Fáil,F.F.,10,3781,0,4594,,169,4,Corcaigh Thoir
4,Waterford,Ahmed,Sheik Mohiuddin,M,Non-Party,NON-P.,9,0,2443,140,Excluded,160,38,Port Lairge
5,Dublin Mid-West,Akpoveta,Patrick,M,Non-Party,NON-P.,12,0,1813,288,Excluded,187,643,Baile Átha Cliath Thiar-Meán
6,Tipperary,Ambrose,Siobhán,F,Fianna Fáil,F.F.,7,0,0,4472,Excluded,160,36,Tiobraid Árann
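
One caveat on the NA check: `is.na()` only flags true `NA` values, and the `Result` column in the preview above has a blank entry (row 3) that it would not catch. A minimal sketch of counting blanks alongside NAs, assuming a hypothetical toy data frame `toy` built from three of the rows shown rather than the full file:

```r
# Toy rows copied from the head(canddf) output (hypothetical stand-in for the full file)
toy <- data.frame(
  Surname = c("Adams", "Ahern", "Ahmed"),
  Votes   = c(10661L, 4594L, 140L),
  Result  = c("Elected", "", "Excluded"),
  stringsAsFactors = FALSE
)

# True NA values per column
na_counts <- colSums(is.na(toy))

# Empty-string values per column; as.character() also covers factor columns
blank_counts <- sapply(toy, function(col) sum(as.character(col) == "", na.rm = TRUE))

na_counts
blank_counts
```

If blanks should be treated as missing from the start, passing `na.strings = c("", "NA")` to `read.csv` is one option worth trying on the real files.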


---
# Part One - Candidate Data Observation

In [100]:
# 1-a) SQL Query Style Execution: Get First&Surnames of Candidates in The Wexford Constituency 👍
wexfordCandidates = sqldf("SELECT Constituency, Surname, \"First.Name\" FROM canddf WHERE Constituency = 'Wexford'")
wexfordCandidates

Constituency,Surname,First.Name
<fct>,<fct>,<fct>
Wexford,Browne,James
Wexford,Byrne,Aoife
Wexford,Byrne,Malcolm
Wexford,Carthy,Ger
Wexford,D'arcy,Michael
Wexford,Dwyer,John
Wexford,Foxe,Caroline
Wexford,Hogan,Julie
Wexford,Howlin,Brendan
Wexford,Kehoe,Paul


In [75]:
# 1-b) Get The Number of Candidates for the Laois Constituency: 👍
countLaoisCandidates = sqldf("SELECT COUNT(*) FROM canddf WHERE Constituency = 'Laois'")
cat("The Laois Constituency Had:", countLaoisCandidates[1,1], "Candidates in the 2016 General Election.")

The Laois Constituency Had: 6 Candidates in the 2016 General Election.
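
The single-constituency count generalises to a count per constituency. A minimal base R sketch, assuming a hypothetical toy vector of constituency labels taken from rows printed earlier; the resulting numbers describe only these toy rows, not the real candidate totals:

```r
# Toy constituency labels copied from earlier outputs (hypothetical subset, not the full file)
toy <- data.frame(
  Constituency = c("Galway West", "Louth", "Cork East", "Wexford", "Wexford", "Wexford"),
  stringsAsFactors = FALSE
)

# One row per constituency with the number of candidate rows in it.
# On the real data the sqldf equivalent would be along the lines of:
#   SELECT Constituency, COUNT(*) FROM canddf GROUP BY Constituency
counts <- as.data.frame(table(Constituency = toy$Constituency), responseName = "Candidates")
counts
```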

In [94]:
# 1-c) Get Total Number of Constituencies
countConstituenciesA = sqldf('Select DISTINCT Constituency from canddf ORDER BY Constituency ASC')
cat("In the 2016 General Election there were", nrow(countConstituenciesA), "Constituencies - canddf")
countConstituenciesA

In the 2016 General Election there were 40 Constituencies - canddf

Constituency
<fct>
Carlow-Kilkenny
Cavan-Monaghan
Clare
Cork East
Cork North-Central
Cork North-West
Cork South-Central
Cork South-West
Donegal
Dublin Bay North


---
# Part Two - Constituency Data Observation

In [101]:
#Constituency Data Set CSV into constdf - constituency dataframe
constdf <- read.csv(constituencyFilePath, sep = ",", fileEncoding="latin1")
head(constdf)

#Check for NA Values in the constituency dataframe
which(is.na(constdf)) 

,Constituency.Name,Constituency.Ainm,Count.Number,Date.Of.Election,Number.Of.Candidates,Number.of.Seats,Quota,Required.Save.Deposit,Seats.Filled,Seats.in.Constituency,Spoiled,Total.Electorate,Total.Poll,Valid.Poll,Constituency.Number
,<fct>,<fct>,<int>,<fct>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>
1,Carlow Kilkenny,Ceatharlach-Cill Chainnigh,11,26/02/2016,15,5,11669,2918,5,5,505,107023,70514,70009,1
2,Cavan Monaghan,An Cabhán-Muineachán,10,26/02/2016,15,4,11931,2983,4,4,598,90618,60248,59650,2
3,Clare,An Clár,12,26/02/2016,16,4,11401,2851,4,4,407,83660,57407,57000,3
4,Cork East,Corcaigh Thoir,10,26/02/2016,15,4,10562,2641,4,4,445,83236,53251,52806,4
5,Cork North Central,Corcaigh Thuaidh-Lár,11,26/02/2016,14,4,10235,2559,4,4,516,81609,51690,51174,5
6,Cork North West,Corcaigh Thiar Thuaidh,9,26/02/2016,13,3,11740,2936,3,3,395,67589,47353,46958,6


In [95]:
# Get the number of constituencies in this dataset -> constituency dataset
countConstituenciesB = sqldf('Select DISTINCT "Constituency.Name" from constdf ORDER BY "Constituency.Name" ASC')
countConstituenciesB
cat("In the 2016 General Election there were", nrow(countConstituenciesB), "Constituencies - constdf")

Constituency.Name
<fct>
Carlow Kilkenny
Cavan Monaghan
Clare
Cork East
Cork North Central
Cork North West
Cork South Central
Cork South West
Donegal
Dublin Bay North


In the 2016 General Election there were 40 Constituencies - constdf
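
Both sources report 40 constituencies, but the names are spelled differently: the candidate file hyphenates ("Carlow-Kilkenny") where the constituency file uses spaces ("Carlow Kilkenny"), so a join on the raw name would miss those rows. A minimal sketch of spotting and normalising the mismatch, assuming only the first five names printed for each source; `normalise_name` is a hypothetical helper, and the full lists may contain differences beyond hyphens:

```r
# Names as printed by the two DISTINCT queries (first five of each source)
cand_names  <- c("Carlow-Kilkenny", "Cavan-Monaghan", "Clare", "Cork East", "Cork North-Central")
const_names <- c("Carlow Kilkenny", "Cavan Monaghan", "Clare", "Cork East", "Cork North Central")

# Names in the candidate file that have no exact match in the constituency file
raw_unmatched <- setdiff(cand_names, const_names)
raw_unmatched

# Hypothetical normaliser: lower-case and replace hyphens with spaces
normalise_name <- function(x) tolower(gsub("-", " ", x, fixed = TRUE))

# After normalising both sides, nothing in this sample is left unmatched
unmatched <- setdiff(normalise_name(cand_names), normalise_name(const_names))
length(unmatched)
```

Joining on `Constituency.Number` may sidestep the spelling issue entirely, provided the numbers agree across files, which is worth verifying first.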

In [96]:
# Create the comparison data frame first; assigning a column to an object that does not exist fails
dfCompareConstituency <- data.frame(CandidateSource = countConstituenciesA$Constituency)


In [9]:
#Count Data Set CSV into countdf - election count dataframe
countdf <- read.csv(electionCountFilePath, sep = ",", fileEncoding="latin1")
head(countdf)

,Constituency.Name,Candidate.surname,Candidate.First.Name,Result,Count.Number,Non_Transferable,Occurred.On.Count,Required.To.Reach.Quota,Required.To.Save.Deposit,Transfers,Votes,Total.Votes,Constituency.Number,Candidate.Id
,<fct>,<fct>,<fct>,<fct>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>
1,Louth,Adams,Gerry,,1,0,0,594,0,0,10661,10661,31,160
2,Louth,Adams,Gerry,,2,0,0,481,0,113,10661,10774,31,160
3,Louth,Adams,Gerry,,3,0,0,429,0,52,10661,10826,31,160
4,Louth,Adams,Gerry,,4,0,0,347,0,82,10661,10908,31,160
5,Louth,Adams,Gerry,,5,0,0,159,0,188,10661,11096,31,160
6,Louth,Adams,Gerry,Elected,7,0,6,0,0,0,10661,11278,31,160
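
Two things stand out in these previews. The count file holds one row per candidate per count, and `Candidate.Id` is not unique on its own: in the candidate preview, Adams, Ahmed and Ambrose all carry Id 160 in different constituencies. A minimal sketch of joining on both keys and keeping each candidate's last count, assuming hypothetical toy frames `cand` and `counts` copied from the rows shown rather than the full files:

```r
# Toy rows copied from the previews (hypothetical stand-ins for canddf and countdf)
cand <- data.frame(
  Surname = c("Adams", "Ahmed", "Ambrose"),
  Candidate.Id = c(160L, 160L, 160L),
  Constituency.Number = c(31L, 38L, 36L),
  stringsAsFactors = FALSE
)
counts <- data.frame(
  Candidate.Id = c(160L, 160L, 160L),
  Constituency.Number = c(31L, 31L, 31L),
  Count.Number = c(1L, 2L, 7L),
  Total.Votes = c(10661L, 10774L, 11278L)
)

# Joining on Candidate.Id alone wrongly pairs all three candidates with Adams's count rows
loose <- merge(cand, counts, by = "Candidate.Id")
nrow(loose)

# Joining on both keys keeps only rows belonging to the same candidate
joined <- merge(cand, counts, by = c("Candidate.Id", "Constituency.Number"))

# Keep the row with the highest Count.Number for each candidate
last_count <- ave(joined$Count.Number, joined$Candidate.Id, joined$Constituency.Number, FUN = max)
final <- joined[joined$Count.Number == last_count, ]
final
```

The composite key is an assumption drawn from the previews; it is worth confirming on the full data that `Constituency.Number` plus `Candidate.Id` identifies exactly one candidate.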
