## PEI Election Data

This notebook uses the [Yearly Political Party Contributions - Open Data](https://www.electionspei.ca/yearly-political-party-contributions-open-data) dataset published under [Elections PEI - Open Data](https://www.electionspei.ca/resources/open-data). We clean the contribution records first so they can be visualized.

In [1]:
# Dependencies and data.
import pandas as pd

df = pd.read_csv('../resources/election_contributions.csv')
df

,Year,Party,Last_Name,First_Name,Business_Name,Address,Community,Postal_Code,Province,Country,Amount
0,2011,Green Party of Prince Edward Island,Lanthier,Peter,,,Mermaid,,,,350
1,2011,Green Party of Prince Edward Island,Lanthier,Darcie,,,Mermaid,,,,400
2,2011,Green Party of Prince Edward Island,Munves,Barbara,,,Charlottetown,,,,500
3,2011,Green Party of Prince Edward Island,Green Party of Canada,,,,Malpeque Association,,,,5300
4,2011,The Island Party of P.E.I.,Ferguson,George,,,Murray River,,,,400
...,...,...,...,...,...,...,...,...,...,...,...
7746,2020,Progressive Conservative Association of Prince...,Walsh,Margaret Ann,,,Watervale,,PE,,"$1,500.00"
7747,2020,Progressive Conservative Association of Prince...,Wellner,William,,,Charlottetown,,PE,,"$3,000.00"
7748,2020,Progressive Conservative Association of Prince...,Wheatley,Ross,,,Stratford,,PE,,$310.26
7749,2020,Progressive Conservative Association of Prince...,Wheeler,Sean,,,Charlottetown,,PE,,$500.00


In [2]:
# Check datatypes.
df.dtypes

Year              int64
Party            object
Last_Name        object
First_Name       object
Business_Name    object
Address          object
Community        object
Postal_Code      object
Province         object
Country          object
Amount           object
dtype: object

In [3]:
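# Strip dollar signs and thousands separators, then convert Amount to float.
# A raw string keeps Python from treating \$ as an invalid escape sequence.
df['Amount'] = df['Amount'].replace(r'[\$,]', '', regex=True).astype(float)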
df

,Year,Party,Last_Name,First_Name,Business_Name,Address,Community,Postal_Code,Province,Country,Amount
0,2011,Green Party of Prince Edward Island,Lanthier,Peter,,,Mermaid,,,,350.00
1,2011,Green Party of Prince Edward Island,Lanthier,Darcie,,,Mermaid,,,,400.00
2,2011,Green Party of Prince Edward Island,Munves,Barbara,,,Charlottetown,,,,500.00
3,2011,Green Party of Prince Edward Island,Green Party of Canada,,,,Malpeque Association,,,,5300.00
4,2011,The Island Party of P.E.I.,Ferguson,George,,,Murray River,,,,400.00
...,...,...,...,...,...,...,...,...,...,...,...
7746,2020,Progressive Conservative Association of Prince...,Walsh,Margaret Ann,,,Watervale,,PE,,1500.00
7747,2020,Progressive Conservative Association of Prince...,Wellner,William,,,Charlottetown,,PE,,3000.00
7748,2020,Progressive Conservative Association of Prince...,Wheatley,Ross,,,Stratford,,PE,,310.26
7749,2020,Progressive Conservative Association of Prince...,Wheeler,Sean,,,Charlottetown,,PE,,500.00
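
The conversion above can be checked on a small sample before trusting it on the full column. The sketch below assumes the two formats visible in the table (plain numbers such as `350` and currency strings such as `$1,500.00`). The sample values and the helper name `clean_amount` are illustrative and not part of the original notebook.

```python
import pandas as pd


def clean_amount(series: pd.Series) -> pd.Series:
    """Strip dollar signs and thousands separators, then convert to float."""
    # Cast to str first so any already-numeric entries go through the same path.
    return series.astype(str).str.replace(r'[\$,]', '', regex=True).astype(float)


# Hypothetical sample mirroring the formats seen in the Amount column.
sample = pd.Series(['350', '$1,500.00', '$310.26'])
cleaned = clean_amount(sample)
print(cleaned.tolist())
```

If a value contains anything other than digits, a decimal point, `$`, or `,`, the final `astype(float)` raises a `ValueError`, which is a useful signal that the column holds a format not handled here.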


In [4]:
# Confirm that Amount is now float64.
df.dtypes

Year               int64
Party             object
Last_Name         object
First_Name        object
Business_Name     object
Address           object
Community         object
Postal_Code       object
Province          object
Country           object
Amount           float64
dtype: object

In [5]:
# Count the unique Business_Name values (NaN counts as one of them).
len(df.Business_Name.unique())

3

In [6]:
# Look at non-null rows.
df.loc[df.Business_Name.notnull()]

,Year,Party,Last_Name,First_Name,Business_Name,Address,Community,Postal_Code,Province,Country,Amount
4700,2017,Prince Edward Island Liberal Association,Clement,Gary,TD Securities,,Toronto,,ON,,2000.0
4907,2017,Prince Edward Island Liberal Association,MacLeod,Kenny,Kenny's Backhoeing Ltd.,,Montague,,PE,,400.0
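
The count of 3 from `unique()` includes `NaN`, which is consistent with the two named businesses shown above. A minimal sketch of the difference between `unique()`, `nunique()`, and `notna()`, using a made-up Series that reuses the two business names from the output:

```python
import numpy as np
import pandas as pd

# Hypothetical column: mostly missing, with two distinct business names.
names = pd.Series([np.nan, 'TD Securities', np.nan, "Kenny's Backhoeing Ltd."])

n_with_nan = len(names.unique())     # unique() keeps NaN as a value
n_distinct = names.nunique()         # nunique() ignores NaN by default
n_filled = int(names.notna().sum())  # number of rows that have a value

print(n_with_nan, n_distinct, n_filled)
```

Using `nunique()` or `notna().sum()` directly makes it clearer how sparse a column is before deciding to drop it.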


In [7]:
# Only two rows have a Business_Name, so drop the column.
df = df.drop(columns='Business_Name')
df

,Year,Party,Last_Name,First_Name,Address,Community,Postal_Code,Province,Country,Amount
0,2011,Green Party of Prince Edward Island,Lanthier,Peter,,Mermaid,,,,350.00
1,2011,Green Party of Prince Edward Island,Lanthier,Darcie,,Mermaid,,,,400.00
2,2011,Green Party of Prince Edward Island,Munves,Barbara,,Charlottetown,,,,500.00
3,2011,Green Party of Prince Edward Island,Green Party of Canada,,,Malpeque Association,,,,5300.00
4,2011,The Island Party of P.E.I.,Ferguson,George,,Murray River,,,,400.00
...,...,...,...,...,...,...,...,...,...,...
7746,2020,Progressive Conservative Association of Prince...,Walsh,Margaret Ann,,Watervale,,PE,,1500.00
7747,2020,Progressive Conservative Association of Prince...,Wellner,William,,Charlottetown,,PE,,3000.00
7748,2020,Progressive Conservative Association of Prince...,Wheatley,Ross,,Stratford,,PE,,310.26
7749,2020,Progressive Conservative Association of Prince...,Wheeler,Sean,,Charlottetown,,PE,,500.00


In [8]:
# Check Country information.
len(df.Country.unique())

2

In [9]:
# List the unique Country values.
df.Country.unique()

array([nan, 'CA'], dtype=object)

In [10]:
# Country is either missing or 'CA', so drop the column.
df = df.drop(columns='Country')
df

,Year,Party,Last_Name,First_Name,Address,Community,Postal_Code,Province,Amount
0,2011,Green Party of Prince Edward Island,Lanthier,Peter,,Mermaid,,,350.00
1,2011,Green Party of Prince Edward Island,Lanthier,Darcie,,Mermaid,,,400.00
2,2011,Green Party of Prince Edward Island,Munves,Barbara,,Charlottetown,,,500.00
3,2011,Green Party of Prince Edward Island,Green Party of Canada,,,Malpeque Association,,,5300.00
4,2011,The Island Party of P.E.I.,Ferguson,George,,Murray River,,,400.00
...,...,...,...,...,...,...,...,...,...
7746,2020,Progressive Conservative Association of Prince...,Walsh,Margaret Ann,,Watervale,,PE,1500.00
7747,2020,Progressive Conservative Association of Prince...,Wellner,William,,Charlottetown,,PE,3000.00
7748,2020,Progressive Conservative Association of Prince...,Wheatley,Ross,,Stratford,,PE,310.26
7749,2020,Progressive Conservative Association of Prince...,Wheeler,Sean,,Charlottetown,,PE,500.00


In [11]:
# Check Province column information.
len(df.Province.unique())

13

In [12]:
# List the unique Province values.
df.Province.unique()

array([nan, 'PE  ', 'PE', 'PE ', 'ON', 'NS', 'NB', 'QC', 'AB', 'BC', 'NL',
       'SK', 'PEI'], dtype=object)

In [13]:
# Strip surrounding whitespace (e.g. 'PE  ' and 'PE ' become 'PE').
df.Province = df.Province.str.strip()
# Replace PEI with PE to match formatting.
df.Province = df.Province.replace('PEI', 'PE')
df.Province.unique()

array([nan, 'PE', 'ON', 'NS', 'NB', 'QC', 'AB', 'BC', 'NL', 'SK'],
      dtype=object)
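
One way to guard against further variants appearing in later data is to compare the cleaned values against the set of codes expected. The sketch below is an assumption about how such a check could look: the `EXPECTED` set simply lists the codes from the output above, and the sample Series is made up.

```python
import numpy as np
import pandas as pd

# Two-letter codes that appear in the cleaned output above (assumed complete).
EXPECTED = {'PE', 'ON', 'NS', 'NB', 'QC', 'AB', 'BC', 'NL', 'SK'}

# Hypothetical raw values with the same problems as the Province column.
raw = pd.Series([np.nan, 'PE  ', 'PE', 'PE ', 'ON', 'PEI'])

# Same two steps as the cell above: strip whitespace, then map PEI to PE.
province = raw.str.strip().replace('PEI', 'PE')

# Any non-null value outside the expected set needs another look.
unexpected = set(province.dropna()) - EXPECTED
print(unexpected)
```

An empty set means every non-missing value is one of the expected codes.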

In [14]:
# Save this cleaned data to a CSV.
df.to_csv('../resources/transformed_data/election_contributions_transformed.csv')
df

,Year,Party,Last_Name,First_Name,Address,Community,Postal_Code,Province,Amount
0,2011,Green Party of Prince Edward Island,Lanthier,Peter,,Mermaid,,,350.00
1,2011,Green Party of Prince Edward Island,Lanthier,Darcie,,Mermaid,,,400.00
2,2011,Green Party of Prince Edward Island,Munves,Barbara,,Charlottetown,,,500.00
3,2011,Green Party of Prince Edward Island,Green Party of Canada,,,Malpeque Association,,,5300.00
4,2011,The Island Party of P.E.I.,Ferguson,George,,Murray River,,,400.00
...,...,...,...,...,...,...,...,...,...
7746,2020,Progressive Conservative Association of Prince...,Walsh,Margaret Ann,,Watervale,,PE,1500.00
7747,2020,Progressive Conservative Association of Prince...,Wellner,William,,Charlottetown,,PE,3000.00
7748,2020,Progressive Conservative Association of Prince...,Wheatley,Ross,,Stratford,,PE,310.26
7749,2020,Progressive Conservative Association of Prince...,Wheeler,Sean,,Charlottetown,,PE,500.00
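
By default, `to_csv` also writes the row index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. The sketch below illustrates this with a small made-up frame and an in-memory buffer instead of the real file. Whether to pass `index=False` when saving depends on how later notebooks read the CSV, which is not shown here.

```python
import io

import pandas as pd

# Hypothetical two-row frame standing in for the cleaned data.
df_small = pd.DataFrame({'Year': [2011, 2020], 'Amount': [350.0, 310.26]})

# Default behaviour: the index is written as an unnamed first column.
buf = io.StringIO()
df_small.to_csv(buf)
buf.seek(0)
with_index = pd.read_csv(buf)

# With index=False the index is omitted and the columns round-trip unchanged.
buf = io.StringIO()
df_small.to_csv(buf, index=False)
buf.seek(0)
without_index = pd.read_csv(buf)

print(list(with_index.columns))
print(list(without_index.columns))
```

Reading the default output with `pd.read_csv(..., index_col=0)` is an alternative that restores the index without changing the saved file.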
