# Programming for Data Analysis - Project
***

This notebook will contain my submission for the project piece of the assessment in the Programming for Data Analysis module, Winter 2021.

<br>

### Modules Required
***

In [1]:
# NumPy for numerical operations
import numpy as np

# Pandas
import pandas as pd

# Statistics
import statistics

<br>

### Problem Statement
***

For this project you must create a data set by simulating a real-world phenomenon of
your choosing. You may pick any phenomenon you wish – you might pick one that is
of interest to you in your personal or professional life. Then, rather than collect data
related to the phenomenon, you should model and synthesise such data using Python.
We suggest you use the numpy.random package for this purpose.

<br>

### Ideas
***

1. Covid hospital admissions with variables gender, age, vaccination status, underlying health condition

2. Voter turnout with possible variables age, socio-economic status, family history of voting, proximity to polling station, day of voting, political mobilisation, type of election, compulsory voting

<br>

### Intention & Scope
***

Simulate a dataset showing whether a person on the register of electors chooses to exercise their right to vote in a general or local election in Ireland. (Note: referendums and European elections are excluded because British citizens cannot vote in referendums and non-EU citizens cannot vote in European elections, so some people on the register of electors are ineligible to vote in them.)

<br>

### Variables to Investigate
***

Main point to investigate: whether a person votes in a particular election or not

Other variables to investigate:  

- age of citizen  
- socio-economic status  
- type of election  
- proximity to polling station  
- family history of voting  


<br>

### Who is Entitled to Vote?
***

Election: General election (and Presidential)  
Entitled to vote: Citizens 18 and over; British citizens resident  
To calculate estimated Voting-Age Population: Population aged 18 and over minus non-Irish citizens but including British citizens  

Election: Referendum  
Entitled to vote: Irish citizens aged 18 and over  
To calculate estimated Voting-Age Population: Population aged 18 and over minus all non-Irish citizens (i.e. excluding British citizens resident in Ireland)  

Election: Local elections  
Entitled to vote: All residents (of 12 months) aged 18 or over  
To calculate estimated Voting-Age Population: Population aged 18 and over  

Election: European elections  
Entitled to vote: Irish citizens and all residents who are citizens of EU Member States  
To calculate estimated Voting-Age Population: Population aged 18 and over minus non-EU citizens  

*Ref:* data.oireachtas.ie
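
The four rules above can be sketched as a small function. This is only an illustration: the function name, its parameters and the population counts used in the example are hypothetical placeholders, not real census figures.

```python
def estimated_vap(election_type, adults, non_irish, british, non_eu):
    """Estimate the voting-age population (VAP) for one election type.

    All arguments are counts of residents aged 18 and over.
    `non_irish` includes British citizens; `british` is that subset.
    (Hypothetical helper written for illustration only.)
    """
    if election_type == "general":
        # Adults minus non-Irish citizens, but keeping British citizens
        return adults - non_irish + british
    if election_type == "referendum":
        # Irish citizens only
        return adults - non_irish
    if election_type == "local":
        # All adult residents
        return adults
    if election_type == "european":
        # Adults minus non-EU citizens
        return adults - non_eu
    raise ValueError(f"Unknown election type: {election_type}")


# Placeholder counts, chosen only to show how the rules differ
example = dict(adults=1000, non_irish=100, british=20, non_eu=30)
for kind in ("general", "referendum", "local", "european"):
    print(kind, estimated_vap(kind, **example))
```

With these placeholder inputs, local elections give the largest estimate and referendums the smallest, which matches the ordering implied by the eligibility rules.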

<br>

### Investigating Voter Turnout in Ireland
***

From data.oireachtas.ie: Turnout amongst the 18-25 age groups is lower than average in most European countries.
Cross-national data from the European Social Survey, analysed by political scientist James
Sloam, found that average reported turnout since 2000 in 30 European countries among the
18-25 age category was 59% compared with 82% reported turnout amongst the population
as a whole. The Survey also found that there is a strong socio-economic dynamic to this
pattern: young people with low levels of educational achievement who are eligible to vote do
so in ‘alarmingly small numbers’ (average of 25% since across the 15 European States)...

...higher turnout has tended to be associated with middle class areas and lower turnouts with
working class areas for general elections up to 2002... very high turnout levels were
associated with mainly middle class and mainly settled areas (that is experiencing relatively
little population in-migration compared with other parts of the city). He also found that
turnout was, relatively speaking, higher in older working class communities which tend to be
settled. A key determinant of turnout, perhaps equally important to the socio-economic
background of an area, appears to be the extent to which an area is ‘settled’ or experiences
regular population change.

A downward trend in turnout at local elections is clear regardless of which measure is
used. Over the period from 1967 to 1999 turnout fell from 67 per cent to 50 per cent
(REG). This trend was reversed in 2004, when an official turnout of 59% was
recorded, a level almost maintained in 2009 (58% turnout).
However, in 2014, turnout dropped back to 51.6%, the second lowest official turnout level in Irish local
elections... turnout in local elections has tended to be higher in rural areas, the 2014 local elections saw a narrowing of this urban-rural difference...

| Year | Subject | Turnout | Result |
|------|---------|---------|--------|
| 1937 | Draft Constitution | 75.8% | Yes |
| 1959 | PR | 58.4% | No |
| 1968 | Redrawing of constituencies | 65.8% | No |
| 1968 | Constituencies | 65.8% | No |
| 1972 | Accession to the EC | 70.9% | Yes |
| 1972 | Reducing voting age to 18 | 50.7% | Yes |
| 1972 | Recognition of specified religions | 50.7% | Yes |
| 1979 | Adoption | 28.6% | Yes |
| 1979 | University representation in Seanad | 28.6% | Yes |
| 1983 | Right to life of unborn | 53.7% | Yes |
| 1984 | Extension of voting rights at Dáil elections | 47.5% | Yes |
| 1986 | Dissolution of marriage | 60.8% | No |
| 1987 | Single European Act | 44.1% | Yes |
| 1992 | Maastricht Treaty | 57.3% | Yes |
| 1992 | Right to life of unborn | 68.2% | No |
| 1992 | Right to travel | 68.2% | Yes |
| 1992 | Right to information | 68.1% | Yes |
| 1995 | Dissolution of marriage | 62.1% | Yes |
| 1996 | Bail | 29.2% | Yes |
| 1997 | Cabinet confidentiality | 47.2% | Yes |
| 1998 | Amsterdam Treaty | 56.2% | Yes |
| 1998 | British-Irish Agreement | 56.2% | Yes |
| 1999 | Local government | 51.1% | Yes |
| 2001 | Death penalty | 34.8% | Yes |
| 2001 | International Criminal Court | 34.8% | Yes |
| 2001 | Treaty of Nice | 34.8% | No |
| 2002 | Protection of life in pregnancy | 42.8% | No |
| 2002 | Treaty of Nice | 49.5% | Yes |
| 2004 | Citizenship | 59.9% | Yes |
| 2008 | Lisbon Treaty | 53.1% | No |
| 2009 | Lisbon Treaty | 59.0% | Yes |
| 2011 | Judges' remuneration | 55.9% | Yes |
| 2011 | Oireachtas inquiries | 55.9% | No |
| 2012 | Stability EMU | 50.6% | Yes |
| 2012 | Children | 33.5% | Yes |
| 2013 | Seanad abolition | 39.2% | No |
| 2013 | Court of Appeal | 39.2% | Yes |
| 2015 | Marriage equality | 60.5% | Yes |
| 2015 | Age of eligibility to be President | 60.5% | No |

Note that the eligible voting population for Referendums is smaller in size than that for
General Elections as it excludes British citizens resident in Ireland. Yet turnout (REG) is
reported as a proportion of the number of voters on the electoral register for general
elections. Actual turnout is therefore always some percentage points higher than reported
turnout... volatility in the level of turnout at referendums which has ranged from 28.6%
in the 1979 referendum on adoption rights and Seanad university representation to
75.8% in the 1937 referendum on the Constitution.
Research into the reasons for abstaining in referendums, considered in more detail in
Section 6, suggests that the perceived saliency and profile of the issue affects this
decision with referendums on moral issues and some European issues frequently
having higher turnout...

...turnout is lowest amongst students and the
unemployed... In referendums, post-poll surveys undertaken by the Referendum Commission suggest
that non-voters are circumstantial or voluntary (lack of interest) in equal numbers. A
substantial category of non-voters also cite ‘lack of/insufficient understanding’ which, while
slightly different to ‘lack of interest,’ also fits into the intentional rather than the circumstantial
category. The Commission has found a direct relationship between the level of
understanding of the referendum proposal, and the propensity to vote...

### Data for variable: overall turnout  
***

According to the article "Election Turnout in Ireland: measurement, trends and policy implications", published by the Oireachtas Library & Research Service in 2016, voter turnout is measured in two ways: as a percentage of all voters on the electoral register (REG), or as a percentage of the estimated voting-age population (VAP). Depending on the type of election, one measurement may be considered a more accurate reflection of voter turnout.  

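Observations: turnout shows a broadly downward trend in both general and local elections, with some temporary recoveries (for example 2007 and 2011 for general elections, 2004 for local elections).  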

**Turnout at general elections**  
For general elections, I will be using the VAP data where available. VAP is considered a better method of measuring turnout in GEs, as the electoral register is inflated by deceased people and/or duplicate records, which pushes the REG turnout figure down.  

1981 81%  
1982 78%  
1987 78%  
1989 73%  
1992 73%  
1997 71%  
2002 68%  
2007 72%  
2011 73%  
2016 69.28%  
2020 67.68%  

*Note:* only REG figures were available for the 2016 & 2020 GEs. To estimate the VAP figure, I took the difference between the VAP & REG figures for each of the previous 9 GEs and calculated the average difference, which showed VAP to be 4.78 percentage points higher than the corresponding REG figure. This was then added to the REG figures for GE 2016 (64.5%) and GE 2020 (62.9%) to get the figures shown above.  

*Ref:* European Movement Ireland & data.oireachtas.ie
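
The adjustment described in the note can be reproduced in a few lines. The variable names here are my own; the REG figures and the 4.78 point average gap are the ones quoted above.

```python
# Average gap (in percentage points) between VAP and REG turnout
# over the previous 9 general elections, as quoted in the note above
AVG_VAP_MINUS_REG = 4.78

# Official REG turnout figures for the two most recent general elections
reg_turnout = {2016: 64.5, 2020: 62.9}

# Estimated VAP turnout = REG turnout + average gap
est_vap_turnout = {
    year: round(reg + AVG_VAP_MINUS_REG, 2) for year, reg in reg_turnout.items()
}
print(est_vap_turnout)  # {2016: 69.28, 2020: 67.68}
```

Note that this assumes the gap between the two measures stayed roughly constant after 2011, which may not hold.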

In [10]:
# Calculating the mean turnout in the last 10 Irish general elections
gen_results = (78, 78, 73, 73, 71, 68, 72, 73, 69.28, 67.68)
mean_gen_results = statistics.mean(gen_results)
print("The mean voter turnout in the last 10 Irish general elections was", mean_gen_results, "%")

The mean voter turnout in the last 10 Irish general elections was 72.296 %
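
To put a rough number on the downward trend in general election turnout, one option is a straight-line fit through the eleven figures listed above. This is only a sketch: a linear fit is my own choice of summary, and with so few points the slope should be read as indicative rather than precise.

```python
import numpy as np

# General election years and turnout (%) as listed above
years = np.array([1981, 1982, 1987, 1989, 1992, 1997, 2002, 2007, 2011, 2016, 2020])
turnout = np.array([81, 78, 78, 73, 73, 71, 68, 72, 73, 69.28, 67.68])

# Degree-1 polynomial fit: turnout ~ slope * year + intercept
slope, intercept = np.polyfit(years, turnout, 1)
print(f"Fitted change in turnout: {slope:.2f} percentage points per year")
```

A negative slope is consistent with the observation that turnout has been falling over the period.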


**Turnout at local elections**  
I'm going to use the REG figure when looking at turnout in local elections, as this is considered more accurate in this kind of election due to the larger number of people entitled to vote (ref: data.oireachtas.ie).  

1967 67%  
1974 62.1%  
1979 63.6%  
1985 59%  
1991 55.6%  
1999 50.2%  
2004 58.6%    
2009 57.8%  
2014 51.7%  
2019 50.2%  

*Ref:* RTÉ & data.oireachtas.ie


In [3]:
# Calculating the average turnout in the last 10 Irish local elections
loc_results = (67, 62.1, 63.6, 59, 55.6, 50.2, 58.6, 57.8, 51.7, 50.2)
ave_loc_results = statistics.mean(loc_results)
print("The average voter turnout in the last 10 Irish local elections was", ave_loc_results, "%")

The average voter turnout in the last 10 Irish local elections was 57.58 %


<br>

**Turnout at Referendums**  

As was the case for General Elections, the VAP method of measuring turnout may be the better option for Referendums, as not everyone on the register is entitled to vote in them.  


<br>

### Data for variable: voter turnout in different age groups
***

**Voter turnout in 18-25 age group**  
2002 53.3% vs. 76.3% across all ages  
2007 69.2% vs. 79.2%  
2011 75.4% vs. 89.7%  

Average reported turnout since 2000 in 30 European countries among the 18-25 age category was 59%, compared with 82% reported turnout amongst the population as a whole.

*Ref:* data.oireachtas.ie
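
One way these figures could feed into a simulation is as a relative turnout ratio for the youngest age group. The sketch below computes that ratio for the three Irish elections listed above. Averaging the three ratios is my own simplification, not something taken from the source.

```python
# Reported turnout (%) in the 18-25 age group and across all ages
youth = {2002: 53.3, 2007: 69.2, 2011: 75.4}
overall = {2002: 76.3, 2007: 79.2, 2011: 89.7}

# Ratio of youth turnout to overall turnout in each election
ratios = {year: youth[year] / overall[year] for year in youth}
for year, ratio in ratios.items():
    print(year, round(ratio, 3))

# Simple (unweighted) average of the three ratios
mean_ratio = sum(ratios.values()) / len(ratios)
print("Mean youth-to-overall turnout ratio:", round(mean_ratio, 3))
```

A ratio below 1 means the 18-25 group voted at a lower rate than the population as a whole in that election.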

**Distribution of Age Across Irish Population**  

The following breakdown of the Irish population was copied from indexmundi.com:  

0-14 years: 21.15% (male 560,338/female 534,570)  

15-24 years: 12.08% (male 316,239/female 308,872)  

25-54 years: 42.19% (male 1,098,058/female 1,085,794)  

55-64 years: 10.77% (male 278,836/female 278,498)  

65 years and over: 13.82% (male 331,772/female 383,592) (2020 est.)  

_Ref:_ IndexMundi, as of September 2021.

<br>

For this project, we are interested in the following figures:  

15-24 years: 625111  
25-54 years: 2183852  
55-64 years: 557334  
65 years and over: 715364  

<br>

As we are only interested in the population aged 18 and over, we need to remove any persons aged 15, 16 and 17 from the total figure in the 15-24 years age group (those born in 2004, 2005 & 2006).  

61,972 born in 2004  
61,372 born in 2005  
65,425 born in 2006  
Total to remove: 188769  
_Ref:_ CSO
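
The subtraction can be checked with a short calculation. Using birth counts as a stand-in for the number of 15 to 17 year olds is an approximation, since it ignores migration and deaths since birth.

```python
# 15-24 age group: male + female counts from the IndexMundi breakdown
age15_24 = 316_239 + 308_872

# Births by year (CSO), used as a proxy for those aged 15, 16 and 17
births = {2004: 61_972, 2005: 61_372, 2006: 65_425}
under_18 = sum(births.values())

# Estimated population aged 18-24
age18_24_estimate = age15_24 - under_18
print("Total to remove:", under_18)
print("Estimated 18-24 population:", age18_24_estimate)
```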

<br>

Therefore, we will be working with the following figures:  

18-24 years: 436342  
25-54 years: 2183852  
55-64 years: 557334  
65 years and over: 715364  

In [4]:
# Creating some variables to represent these figures
age18_24 = 436342
age25_54 = 2183852
age55_64 = 557334
age65_up = 715364
total_register = age18_24 + age25_54 + age55_64 + age65_up
print("The estimated population aged 18 and over is", total_register)

The estimated population aged 18 and over is 3892892


In [5]:
# Breaking down the probability of a randomly drawn citizen being from one of these age groups
p_age18_24 = age18_24 / total_register
p_age25_54 = age25_54 / total_register
p_age55_64 = age55_64 / total_register
p_age65_up = age65_up / total_register

In [6]:
# Generating an array of voter age groups using the probability calculated above
np.random.choice(["18-24", "25-54", "55-64", "65+"], size=(100,), p=[p_age18_24, p_age25_54, p_age55_64, p_age65_up])

array(['25-54', '25-54', '25-54', '18-24', '25-54', '25-54', '18-24',
       '18-24', '25-54', '25-54', '65+', '65+', '25-54', '25-54', '18-24',
       '25-54', '25-54', '65+', '25-54', '25-54', '25-54', '25-54',
       '25-54', '25-54', '25-54', '55-64', '25-54', '25-54', '25-54',
       '55-64', '65+', '25-54', '25-54', '25-54', '25-54', '25-54',
       '25-54', '65+', '25-54', '65+', '55-64', '55-64', '25-54', '25-54',
       '55-64', '55-64', '25-54', '18-24', '25-54', '25-54', '25-54',
       '25-54', '55-64', '25-54', '65+', '25-54', '25-54', '25-54',
       '25-54', '25-54', '18-24', '55-64', '25-54', '25-54', '65+',
       '25-54', '25-54', '65+', '18-24', '55-64', '25-54', '25-54',
       '25-54', '18-24', '25-54', '18-24', '25-54', '25-54', '25-54',
       '25-54', '25-54', '65+', '25-54', '65+', '25-54', '65+', '25-54',
       '65+', '25-54', '25-54', '65+', '25-54', '25-54', '65+', '55-64',
       '18-24', '25-54', '55-64', '55-64', '25-54'], dtype='<U5')
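
As a sanity check, the sampled proportions can be compared with the target probabilities. The sketch below is one way to do that, assuming NumPy's `default_rng` generator in place of the legacy `np.random.choice` call. The seed and the sample size of 10,000 are arbitrary choices of mine.

```python
import numpy as np
import pandas as pd

# Population counts per age group, as derived above
counts = {"18-24": 436342, "25-54": 2183852, "55-64": 557334, "65+": 715364}
total = sum(counts.values())
probs = [n / total for n in counts.values()]

# Seeded generator so the sample is reproducible (seed value is arbitrary)
rng = np.random.default_rng(seed=2021)
sample = rng.choice(list(counts), size=10_000, p=probs)

# Compare target probabilities with the proportions observed in the sample
observed = pd.Series(sample).value_counts(normalize=True)
comparison = pd.DataFrame(
    {"target": pd.Series(dict(zip(counts, probs))), "observed": observed}
)
print(comparison.round(3))
```

With a sample this large, the observed column should sit close to the target column; a sample of 100 will wander further from it.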

<br>

### Data for variable: type of election
***

There are only two options: General Election or Local Election.  
In the last 40 years, there have been 11 general elections compared with just 7 local elections. The ratio of local elections to general elections is therefore 1 : 1.57; 61.1% of the last 18 elections in Ireland were general elections, while 38.9% were local elections.  

In [7]:
# Let's generate an array of election types using the probability discussed above
np.random.choice(["gen", "loc"], size=(100,), p=[0.61, 0.39])

array(['loc', 'gen', 'gen', 'loc', 'gen', 'gen', 'loc', 'gen', 'gen',
       'gen', 'gen', 'gen', 'gen', 'loc', 'loc', 'gen', 'loc', 'gen',
       'gen', 'loc', 'loc', 'loc', 'loc', 'gen', 'gen', 'gen', 'gen',
       'gen', 'gen', 'gen', 'loc', 'loc', 'gen', 'gen', 'loc', 'loc',
       'gen', 'gen', 'gen', 'gen', 'loc', 'gen', 'gen', 'gen', 'loc',
       'gen', 'gen', 'gen', 'gen', 'gen', 'gen', 'loc', 'gen', 'loc',
       'gen', 'loc', 'loc', 'gen', 'gen', 'gen', 'loc', 'gen', 'gen',
       'gen', 'gen', 'gen', 'gen', 'gen', 'gen', 'gen', 'gen', 'loc',
       'loc', 'gen', 'gen', 'gen', 'gen', 'loc', 'loc', 'loc', 'loc',
       'loc', 'loc', 'gen', 'loc', 'gen', 'loc', 'loc', 'gen', 'gen',
       'loc', 'loc', 'gen', 'loc', 'gen', 'gen', 'gen', 'gen', 'gen',
       'gen'], dtype='<U3')
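
The election type can then be combined with a voted/did-not-vote outcome. The following is a minimal sketch, assuming the probability of voting depends only on the election type and equals the mean turnout calculated earlier (72.296% for general, 57.58% for local). That is a deliberate simplification: it ignores age and the other variables listed earlier. The seed, sample size and column names are my own choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2021)  # arbitrary seed for reproducibility
n = 1_000                               # arbitrary number of simulated electors

# Mean turnout figures from earlier in the notebook, expressed as probabilities
p_vote = {"gen": 0.72296, "loc": 0.5758}

# Election type drawn with the 61% / 39% split discussed above
election = rng.choice(["gen", "loc"], size=n, p=[0.61, 0.39])

# Probability of voting for each simulated person, based on election type
prob = np.where(election == "gen", p_vote["gen"], p_vote["loc"])

# One Bernoulli draw per person: True means the person voted
voted = rng.random(n) < prob

df = pd.DataFrame({"election": election, "voted": voted})
print(df.groupby("election")["voted"].mean())
```

The grouped means should land near the two input probabilities, which gives a quick check that the simulation behaves as intended.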

<br>

### References
***


[] https://adriankavanaghelections.org/  

[] https://www.cso.ie/en/csolatestnews/pressreleases/2006pressreleases/reportonvitalstatistics2004/  

[] https://www.cso.ie/en/csolatestnews/pressreleases/2008pressreleases/reportonvitalstatistics2005/  

[] https://www.cso.ie/en/csolatestnews/pressreleases/2009pressreleases/reportonvitalstatistics2006/  

[] https://www.cso.ie/en/qnhs/qnhsmethodology/voterregistrationandparticipationmodule/

[] https://data.oireachtas.ie/ie/oireachtas/libraryResearch/2016/2016-01-28_l-rs-note-election-turnout-in-ireland-measurement-trends-and-policy-implications_en.pdf

[] https://www.europeanmovement.ie/irish-general-election-february-2020/  

[] https://www.fairvote.org/what_affects_voter_turnout_rates

[] https://www.indexmundi.com/ireland/demographics_profile.html  

[] https://www.ipa.ie/_fileUpload/Documents/LA_Times_Summer_2019.pdf  

[] https://www.maynoothuniversity.ie/research/spotlight-research/getting-out-vote-what-influences-voter-turnout

[] https://python-course.eu/numerical-programming/weighted-probabilities.php  

[] https://www.refcom.ie/previous-referendums/  

[] https://www.researchgate.net/publication/228215776_What_Affects_Voter_Turnout

[] https://www.rte.ie/news/elections-2019/results/#/local  

[] https://www.statista.com/statistics/710767/irish-population-by-age/  

[] https://www.tcd.ie/Political_Science/people/michael_gallagher/Election2016.php  

[] https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8  