# UK Midterm Predictions

We get the [UK Midterm Election](https://commonslibrary.parliament.uk/research-briefings/cbp-7529/) data, which covers general election results for the individual countries within the UK, voter turnout, spoilt ballot papers, and postal ballots at UK general elections, among other things. The data begins in 1918.

We get the [UK voter turnout](https://commonslibrary.parliament.uk/research-briefings/cbp-7529/) from the same source. 

We get the [UK inflation data](https://www.macrotrends.net/countries/GBR/united-kingdom/inflation-rate-cpi). We take the December figure from each relevant midterm year, which shows the % change in inflation. This gives a sense of how voters were perceiving and experiencing inflation leading up to the voting date.
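As a minimal sketch of the December selection described above, assuming a frame with a parsed `date` column: the frame `monthly`, the column `pct_change`, and all values below are placeholders for illustration, not the real inflation file.

```python
import pandas as pd

# Placeholder series: dates and values are illustrative, not real inflation figures
monthly = pd.DataFrame({
    "date": pd.to_datetime(
        ["1922-11-30", "1922-12-31", "1923-12-31", "1925-12-31"]
    ),
    "pct_change": [0.1, 0.2, 0.3, 0.4],
})

election_years = [1922, 1923]  # shortened list, for illustration only

# Keep rows that are both in December and in an election year
is_december = monthly["date"].dt.month == 12
is_election_year = monthly["date"].dt.year.isin(election_years)
december_rows = monthly.loc[is_december & is_election_year].reset_index(drop=True)
print(december_rows)
```

If the source file holds only one row per year, the month filter is redundant and filtering on the year alone is enough.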

We get the [employment data](https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/bulletins/earningsandemploymentfrompayasyouearnrealtimeinformationuk/october2022) by accessing [a millennium of macroeconomic data](https://www.bankofengland.co.uk/statistics/research-datasets). Specifically, we use the "Monthly administrative unemployment, vacancies and hours worked 1881-2015" data file. We iterate through the months of every year starting in 1915 (a personal choice). We use the seasonally adjusted and interpolated unemployment rates to give a more accurate representation of economic activity. This provides insight into how an increase or decrease in employment opportunities may affect election results.

Alternatively, we use the "Quarterly Employment Estimates, 1924-2016" data file, iterating through the quarters of every year starting in 1951 (due to gaps in the data). This likewise provides insight into how an increase or decrease in employment opportunities may affect election results.

All these data sources are validated.

## Cleaning Midterm Data

In [32]:
import pandas as pd

In [45]:
# use pandas read_csv to read in the midterm data

df_uk_midterm = pd.read_csv('uk_midterm_voteshare.csv')

df_uk_midterm.drop('Total', axis = 1, inplace = True)

df_uk_midterm.head()

Unnamed: 0,Year,CON2,LAB,LD3,PC/SNP,Other
0,1918,0.387,0.208,0.256,0.0,0.149
1,1922,0.385,0.297,0.288,0.0,0.03
2,1923,0.38,0.307,0.297,0.0,0.016
3,1924,0.468,0.333,0.178,0.0,0.021
4,1929,0.381,0.371,0.235,0.0,0.013


## Cleaning Voter Turnout Data ##

In [53]:
# use pandas read_csv to read in the voter turnout data
df_uk_turnout = pd.read_csv('uk_turnout.csv')

drop_lst = ['England', 'Wales', 'Scotland', 'Northern Ireland']
df_uk_turnout.drop(drop_lst, axis = 1, inplace = True)

# voter turnout - valid votes as % of electorate 
df_uk_turnout.head()

Unnamed: 0,Year,United Kingdom
0,1918,57.2%
1,1922,73.0%
2,1923,71.1%
3,1924,77.0%
4,1929,76.3%
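The turnout values in the output above are strings with a trailing percent sign, so they cannot be used in arithmetic as they stand. A minimal sketch of one way to convert them, using a small hand-built frame with the first rows shown above; the column name `turnout` is hypothetical and not used in the notebook.

```python
import pandas as pd

# Stand-in for df_uk_turnout, built from the first rows shown above
sample = pd.DataFrame({
    "Year": [1918, 1922, 1923],
    "United Kingdom": ["57.2%", "73.0%", "71.1%"],
})

# Strip the trailing '%' and convert to a fraction between 0 and 1
# ('turnout' is a hypothetical column name chosen for this sketch)
sample["turnout"] = (
    sample["United Kingdom"].str.rstrip("%").astype(float) / 100
)
print(sample[["Year", "turnout"]])
```

Storing turnout as a fraction keeps it on the same 0 to 1 scale as the vote shares in the midterm data.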


## Cleaning Inflation Data ##

In [130]:
from datetime import *

# use pandas read_csv to read in the CPI data
df_uk_inflation_all = pd.read_csv('UK_inflation_rate_cpi.csv')

# add a column that just takes the year of the date
df_uk_inflation_all['Year'] = pd.DatetimeIndex(df_uk_inflation_all['date']).year
df_uk_inflation_all = df_uk_inflation_all.dropna(axis = 1)
df_uk_inflation_all

# keep only the rows whose year is an election year
# list of years of election
# NOTE: 1980 was not a UK general election year; the 1955 and 1970 general elections are absent from this list
uk_election_years = [1918, 1922, 1923, 1924, 1929, 1931, 1935, 1945, 1950, 1951, 1959, 1964, 1966, 1980, 1974, 1979, 
                     1983, 1987, 1992, 1997, 2001, 2005, 2010, 2015, 2017, 2019]

# filter to the election years using the 'Year' column
df_uk_inflation = df_uk_inflation_all.loc[df_uk_inflation_all['Year'].isin(uk_election_years)]

# reset the indices to start from 0
# https://pynative.com/pandas-reset-index/
df_uk_inflation.reset_index(drop=True, inplace=True)

drop_inflation_lst = ['Inflation Rate', 'date']

df_uk_inflation = df_uk_inflation.drop(drop_inflation_lst, axis = 1)

df_uk_inflation.head()

KeyError: "['Inflation Rate'] not found in axis"
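
The `KeyError` above means that no column was named exactly `Inflation Rate` when the drop ran. Two possible causes, neither confirmed by the notebook: the CSV header is spelled differently (for example with extra whitespace), or the earlier `dropna(axis = 1)` had already removed the column because it contained a missing value. A minimal sketch of checking and dropping defensively, on a synthetic frame whose column names and values are assumptions:

```python
import pandas as pd

# Synthetic stand-in for the CSV; column names and values are assumptions,
# chosen to show a header that differs from 'Inflation Rate' by whitespace
demo = pd.DataFrame({
    "date": ["1918-12-31", "1922-12-31"],
    " Inflation Rate": [1.0, 2.0],
    " Annual Change": [0.5, 1.0],
})

# Normalise header whitespace, then inspect what is actually there
demo.columns = demo.columns.str.strip()
print(list(demo.columns))

# Drop only the columns that are present, so a missing name cannot raise
wanted_drops = ["Inflation Rate", "date"]
present = [c for c in wanted_drops if c in demo.columns]
demo = demo.drop(columns=present)
print(list(demo.columns))
```

Printing the column list before dropping shows which of the two causes applies to the real file.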

## Cleaning Employment Data ##

In [127]:
# data about employment growth 

# read the csv files 
df_uk = pd.read_csv('UK_employment_growth.csv')

# cleaned data for visualization
df_uk_employment = df_uk[['Year', 'Employment Growth']]

# keep only the rows whose year is an election year
# list of years of election
# NOTE: 1980 is included although it was not a general election year; 1955 and 1970 are absent
uk_election_years = [1918, 1922, 1923, 1924, 1929, 1931, 1935, 1945, 1950, 1951, 1959, 1964, 1966, 1980, 1974, 1979, 
                     1983, 1987, 1992, 1997, 2001, 2005, 2010, 2015, 2017, 2019]

# filter to the election years using the 'Year' column
df_uk_employ_grow = df_uk_employment.loc[df_uk_employment['Year'].isin(uk_election_years)]

# reset the indices of the new dataframe to start from 0
df_uk_employ_grow.reset_index(drop=True, inplace=True)

df_uk_employ_grow

Unnamed: 0,Year,Employment Growth
0,1951.0,0.34
1,1959.0,0.29
2,1964.0,0.19
3,1966.0,0.17
4,1974.0,-0.06
5,1979.0,0.22
6,1980.0,-0.12
7,1983.0,-0.44
8,1987.0,0.3
9,1992.0,-0.43
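
The `Year` values above print as floats (`1951.0`). One common cause in pandas is an integer column that contained missing values when it was read, though the notebook does not confirm this. A minimal sketch of a filter that also casts `Year` back to integers; `select_election_years` is a hypothetical helper, and the sample frame reuses the first two rows of the output above plus one empty row.

```python
import pandas as pd

# Stand-in for df_uk_employment: first rows of the output above,
# plus one row with a missing year to show why the column became float
sample = pd.DataFrame({
    "Year": [1951.0, 1959.0, None],
    "Employment Growth": [0.34, 0.29, None],
})

def select_election_years(df, years):
    """Hypothetical helper: keep rows whose Year is in `years`, with Year as int."""
    out = df.dropna(subset=["Year"]).copy()
    out["Year"] = out["Year"].astype(int)
    return out.loc[out["Year"].isin(years)].reset_index(drop=True)

result = select_election_years(sample, [1951, 1959])
print(result)
```

A single helper like this could also serve the inflation cell, so the election-year list would only need to be defined once.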
