R package of datasets related to Australian politicians (1901-2019)

AustralianPoliticians

I am still developing this package, and there could be breaking changes as part of improving things.

AustralianPoliticians is a collection of datasets related to Australian politicians. The datasets are:

  • all.rda: The main dataset.
  • by_division_mps.rda: Adds information about the division (‘seat’) of the politician.
  • by_party.rda: Adds information about the party of the politician.
  • by_state_senators.rda: Adds information about the state that a senator was representing.
  • list_prime_ministers.rda: Whether the politician was prime minister.
  • uniqueID_to_aphID.rda: A correspondence between the uniqueIDs used in these datasets and the IDs used by the Australian Parliament House.

The datasets are up to date as of 10 September 2019 (i.e. they include the deaths of Tim Fischer and Elaine Darling, and the appointment of Sarah Henderson).

If you have suggestions on how I could improve the datasets, or corrections, please don’t hesitate to get in touch.

Installation

You can install this package from GitHub with:

# install.packages("devtools")
devtools::install_github("RohanAlexander/AustralianPoliticians")

Example

This is an example of how to load the data:

library(tidyverse)

devtools::install_github("RohanAlexander/AustralianPoliticians")

all <- AustralianPoliticians::all %>% as_tibble()
by_division_mps <- AustralianPoliticians::by_division_mps %>% as_tibble()
by_party <- AustralianPoliticians::by_party %>% as_tibble()
by_state_senators <- AustralianPoliticians::by_state_senators %>% as_tibble()
list_prime_ministers <- AustralianPoliticians::list_prime_ministers %>% as_tibble()
uniqueID_to_aphID <- AustralianPoliticians::uniqueID_to_aphID %>% as_tibble()

You could then combine the tables using left_join:

all_individuals_with_their_division <- all %>% 
  left_join(by_division_mps, by = c("uniqueID"))

Monica Alexander has written a brief blog post where she uses the package to look at the life expectancy of Australian politicians:
https://www.monicaalexander.com/posts/2019-08-09-australian_politicians/

Dataset details

all.rda

This is the main dataset and contains one row per politician, with columns: uniqueID, surname, allOtherNames, firstName, commonName, displayName, earlierOrLaterNames, title, gender, birthDate, birthYear, deathDate, member, senator, wikidataID, wikipedia, adb, and comments.

uniqueID is usually the surname of the politician and the year that they were born, e.g. Abbott1859. In certain cases this is not enough to uniquely identify them and then we add the first name, e.g. AndersonCharles1897 and AndersonGordon1897. In cases where there is punctuation in the surname, e.g. Ashley-Brown or O’Brien, this has been removed but capitalisation has been retained, so those would become AshleyBrown or OBrien, respectively.
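The naming rule above can be sketched in a few lines of base R. This is only an illustration of the convention as described; `make_unique_id` is a hypothetical helper, not a function in the package, and the birth year used for O'Brien is made up.

```r
# Hypothetical helper (not part of AustralianPoliticians): build an ID
# from a surname, an optional first name, and a year of birth.
make_unique_id <- function(surname, birth_year, first_name = NULL) {
  # Drop every character that is not a letter, keeping capitalisation.
  strip_non_letters <- function(x) gsub("[^[:alpha:]]", "", x)
  first <- if (is.null(first_name)) "" else strip_non_letters(first_name)
  paste0(strip_non_letters(surname), first, birth_year)
}

make_unique_id("Abbott", 1859)                            # "Abbott1859"
make_unique_id("Anderson", 1897, first_name = "Charles")  # "AndersonCharles1897"
make_unique_id("O'Brien", 1900)                           # "OBrien1900" (year is illustrative)
```

The first name is only supplied when surname and birth year alone would clash, which mirrors how the IDs are described above.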

commonName is used to highlight the name that the politician tended to be known as, e.g. Ted instead of Edward. This is used in displayName, which is a politician's surname and their common name (if they had one) or first name, e.g. Abbott, Richard. In cases where this would not be unique, e.g. Francis Baker, an additional name has been added.

earlierOrLaterNames is mostly used to keep track of women changing their names at marriage. Similarly, title is mostly used to keep track of ‘Dr’. Both have been used inconsistently and should only be used sparingly.

Some politicians don’t have a complete birth date, and instead only have a year of birth. In these cases their entry for birthDate will be empty, but they will have a birthYear. All death dates are complete, but in the case of one politician – John William Croft – this has been inputted, as the circumstances and timing (even year) of his death are unknown.
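As a rough sketch of how the two birth columns might be used together, the base R snippet below takes the year from birthDate when it is present and falls back to birthYear otherwise. The rows are made-up toy values, and it assumes an empty birthDate appears as NA once loaded, which may not match the real data exactly.

```r
# Toy rows with made-up values; only the column names come from the README.
toy <- data.frame(
  uniqueID  = c("Example1900", "Sample1910"),
  birthDate = as.Date(c("1900-03-15", NA)),
  birthYear = c(1900L, 1910L),
  stringsAsFactors = FALSE
)

# Year taken from the full date where one exists (NA otherwise).
year_from_date <- as.integer(format(toy$birthDate, "%Y"))

# Fall back to birthYear when there is no complete birth date.
toy$birthYearFilled <- ifelse(is.na(year_from_date), toy$birthYear, year_from_date)

# Flag which politicians have a complete date of birth.
toy$hasFullBirthDate <- !is.na(toy$birthDate)
```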

member and senator are binary indicator variables used to signify whether the politician was in the lower or upper house. Most politicians were only in one or the other, but some were in both. One politician in the dataset was neither a senator nor an MP: Heather Elaine Hill. She remains in the dataset because she was elected to the Senate (and because this dataset needs to exactly match the AustralianElections one); however, her eligibility was challenged and her election was invalidated, so she was never a senator.
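The two indicators can be combined into a single label, as in the sketch below. The rows are toy data, and the snippet assumes the indicators are coded as 0 and 1, which should be checked against the actual columns.

```r
# Toy rows covering the four possible combinations (made-up IDs).
toy <- data.frame(
  uniqueID = c("A1900", "B1900", "C1900", "D1900"),
  member   = c(1, 0, 1, 0),
  senator  = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Derive one chamber label per politician from the two indicators.
toy$chamber <- ifelse(
  toy$member == 1 & toy$senator == 1, "both",
  ifelse(toy$member == 1, "lower house only",
         ifelse(toy$senator == 1, "upper house only", "neither"))
)

# Count politicians in each category.
table(toy$chamber)
```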

adb is a link to the Australian Dictionary of Biography.

by_division_mps.rda

This dataset adds information about the division (‘seat’) of the politician. One row per division-politician, with columns: uniqueID; mpsDivision; mpsState; mpsEnteredAtByElection; mpsFrom; mpsTo; mpsEndReason; mpsChangedSeat; and mpsComments.

Certain divisions change name. Sometimes this is minor, for instance Kingsford-Smith to Kingsford Smith, and sometimes it is total. In all cases this is treated as a change in division (the politician is treated as finishing with one division and moving to another), but mpsChangedSeat can be used to identify these cases and adjust for them if necessary.

mpsEnteredAtByElection is a binary indicator variable for whether the politician entered the seat following a by-election.

mpsChangedSeat is a binary indicator variable for whether the politician left a division because they were changing division, as opposed to losing an election or retiring.
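One way to adjust for division changes is to separate rows that ended with a change of division from rows that ended with the politician actually leaving the lower house. The sketch below uses toy rows with made-up values and assumes mpsChangedSeat is coded as 0 and 1.

```r
# Toy rows with made-up values; only the column names come from the README.
toy <- data.frame(
  uniqueID       = c("Example1900", "Example1900", "Sample1910"),
  mpsDivision    = c("Old Name", "New Name", "Elsewhere"),
  mpsFrom        = as.Date(c("1950-01-01", "1955-01-01", "1960-01-01")),
  mpsTo          = as.Date(c("1954-12-31", "1961-12-31", "1963-12-31")),
  mpsChangedSeat = c(1, 0, 0),
  stringsAsFactors = FALSE
)

# Rows that ended because the politician changed division (renames included).
moved <- toy[toy$mpsChangedSeat == 1, ]

# Rows that ended for another reason, e.g. losing an election or retiring.
exits <- toy[toy$mpsChangedSeat == 0, ]

# Number of such exits per politician.
table(exits$uniqueID)
```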

by_party.rda

This dataset adds information about the party of the politician. One row per party-politician, with columns: uniqueID; partyAbbrev; partyName; partyFrom; partyTo; partyChangedName; partySimplifiedName; partySpecificDateInputted; and partyComments.

Party can be a little confusing in cases where a politician changed party. In general, in this dataset, the to/from dates are set up so that when a politician is in parliament they will have the correct party. However, the dataset should not be used to say anything about when they are out of parliament. For instance, some politicians lost their seat, changed party, and then regained a seat in parliament. The dataset does not know when they changed party while they were out of parliament, and it assumes that they changed party either at the same time that they lost their seat or at the same time as they regained a seat. Similarly, there are plenty of cases where a politician ceased being a member of their party after leaving parliament; for instance, Malcolm Fraser left the Liberals. Again, that is not reflected in the dataset.
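Looking up a politician's party on a given date could look roughly like the sketch below. The rows and party abbreviations are made up, `party_on` is a hypothetical helper rather than a package function, and it assumes an ongoing membership has NA in partyTo. As noted above, the result is only meaningful for dates when the politician was in parliament.

```r
# Toy rows with made-up values; only the column names come from the README.
toy_party <- data.frame(
  uniqueID    = c("Example1900", "Example1900"),
  partyAbbrev = c("AAA", "BBB"),
  partyFrom   = as.Date(c("1950-01-01", "1960-01-01")),
  partyTo     = as.Date(c("1959-12-31", NA)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: party abbreviation(s) for one politician on one date.
party_on <- function(parties, id, date) {
  date <- as.Date(date)
  in_spell <- parties$uniqueID == id &
    parties$partyFrom <= date &
    # An NA end date is treated as "still a member" (an assumption).
    (is.na(parties$partyTo) | parties$partyTo >= date)
  parties$partyAbbrev[in_spell]
}

party_on(toy_party, "Example1900", "1955-06-01")  # "AAA"
party_on(toy_party, "Example1900", "1970-01-01")  # "BBB"
```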

Certain parties, such as the Nationals, changed their name at various points in time. For politicians who were members at the time, this is recorded as a party change in partyAbbrev and partyName. However, partySimplifiedName abstracts away from that.

Party name changes:

  • The Country Party changed to the National Country Party on 3 May 1975 according to http://nla.gov.au/nla.news-article110636121. It then changed from the National Country Party to the National Party of Australia on 17 October 1982 according to http://nla.gov.au/nla.news-article116476081. And finally, it changed from the National Party of Australia to The Nationals on 11 October 2003 according to the party website.
  • The Nick Xenophon Team changed to Centre Alliance on 10 April 2018, according to ABC news reports.

by_state_senators.rda

This dataset adds information about the state that a senator was representing. The variables are: uniqueID; senatorsState; senatorsFrom; senatorsTo; senatorsEndReason; senatorsSec15Sel; and senatorsComments.

This dataset is fairly similar to by_division_mps, except that it also has senatorsSec15Sel. This is a binary indicator variable that indicates whether the senator was appointed rather than elected.

list_prime_ministers.rda

This dataset adds information about whether the politician has been prime minister. One row per politician, with columns: uniqueID, wasPrimeMinister.

uniqueID_to_aphID.rda

This dataset adds a correspondence between the unique identifiers used in these datasets and the identifier used by the Australian Parliament House on its website. The main issue with the APH identifier is that it is not clear who it is referring to without looking it up. Additionally, in certain cases it changes from time to time, and it is easy to accidentally change the format by opening it in Excel.
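If the correspondence is ever exported to CSV and read back, reading every column as text is one way to avoid the identifier format being altered. The sketch below is illustrative only: the ID values are made up, and aphID is an assumed column name that is not confirmed by this README.

```r
# Made-up identifiers chosen because they look like numbers to a parser.
ids <- c("00123", "1E4")

# Write a small toy correspondence table to a temporary CSV file.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(uniqueID = c("Example1900", "Sample1910"), aphID = ids),
  path,
  row.names = FALSE
)

# Default reading guesses column types, so these IDs are likely parsed as numbers.
guessed <- read.csv(path)

# Forcing character columns keeps the identifiers exactly as written.
as_text <- read.csv(path, colClasses = "character")
as_text$aphID
```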

TODO

  • all.rda: The most recent entrants have incomplete uniqueIDs because their birthdays haven’t been published yet. This needs to be updated as soon as the birthdays are released.
  • all.rda: Need to go through and update the title field - it’s very inconsistent.
  • list_prime_ministers: Need to add the dates.

Roadmap

  • Add dataset of ministers with dates.
  • Add information about birthplace and education.
  • Add information about relationships, for instance father-son, etc.

Sources

In the first instance, the Parliamentary Handbook was the main source of information. This was augmented with information from Wikipedia, the Australian Dictionary of Biography, and the Senate Biographies wherever possible. Limited information was obtained from other sources, such as state parliaments and newspapers (via Trove), and these have generally been specified in the comments.

The uniqueID_to_aphID dataset was primarily drawn from a dataset put together by Patrick Leslie, and it was checked against a modern dataset from Open Australia, and Tim Sherratt’s Historic Hansard records for the Reps and Senate.

Acknowledgements

Thank you to Ben Readshaw, Edward Howlett, Kelly Lyons, Monica Alexander, Sharla Gelfand, and Simon Munzert for their help. Thank you to Patrick Leslie, who generously donated data.

The icon of parliaments used in the hex sticker was made by Freepik from www.flaticon.com

Citation

If you use AustralianPoliticians, please consider citing:

Alexander, Rohan. (2019). AustralianPoliticians: Datasets on Australian Politicians. Source: https://github.com/RohanAlexander/AustralianPoliticians.

Author information

Rohan Alexander (corresponding author and repository maintainer)
University of Toronto
Information Sciences
140 St George St
Toronto, ON, Canada
Email: rohan.alexander@utoronto.ca
