The goal of
AustralianPoliticians is to provide access to biographical
and political data about Australian federal politicians who served
between 1901 and 2021. This enhances the reproducibility of research
that uses this data. The datasets are:
- all: The main dataset.
- mps: Adds information about the division (‘seat’) of the politician.
- allbyparty: Adds information about the party of the politician.
- senators: Adds information about the state that a senator was representing.
The datasets are up-to-date as of 29 November 2021.
If you are using this for anything other than general interest, then please check the comments column in case there is a flag that could affect your results. You’re also welcome to get in touch so that I can make sure that the aspect you’re interested in is of a good enough quality for your purposes.
If you have suggestions on how I could improve the datasets, or corrections, please don’t hesitate to get in touch.
You can install the released version of
or the development version from GitHub with:
This is a example of how to load the data:
library(tidyverse) devtools::install_github("RohanAlexander/AustralianPoliticians") all <- AustralianPoliticians::get_auspol('all') party <- AustralianPoliticians::get_auspol('allbyparty') mps <- AustralianPoliticians::get_auspol('mps') senators <- AustralianPoliticians::get_auspol('senators')
You could then combine the tables using left_join:
all_individuals_with_their_division <- all %>% left_join(mps, by = c("uniqueID"))
Monica Alexander has written a brief blog post where she uses the
package to look at life expectancy of Australian politicians:
This is the main dataset and contains one row per politician, with columns: uniqueID, surname, allOtherNames, firstName, commonName, displayName, earlierOrLaterNames, title, gender, birthDate, birthYear, birthPlace, deathDate, member, senator, wikidataID, wikipedia, adb, and comments.
head(AustralianPoliticians::get_auspol('all')) #> # A tibble: 6 × 20 #> uniqueID surname allOtherNames firstName commonName displayName #> <chr> <chr> <chr> <chr> <chr> <chr> #> 1 Abbott1859 Abbott Richard Hartley Smith Richard <NA> Abbott, Richard #> 2 Abbott1869 Abbott Percy Phipps Percy <NA> Abbott, Percy #> 3 Abbott1877 Abbott Macartney Macartney Mac Abbott, Mac #> 4 Abbott1886 Abbott Charles Lydiard Aubrey Charles Aubrey Abbott, Aubrey #> 5 Abbott1891 Abbott Joseph Palmer Joseph <NA> Abbott, Joseph #> 6 Abbott1957 Abbott Anthony John Anthony Tony Abbott, Tony #> # … with 14 more variables: earlierOrLaterNames <chr>, title <chr>, #> # gender <chr>, birthDate <date>, birthYear <dbl>, birthPlace <chr>, #> # deathDate <date>, member <dbl>, senator <dbl>, wasPrimeMinister <dbl>, #> # wikidataID <chr>, wikipedia <chr>, adb <chr>, comments <chr>
uniqueID is usually the surname of the politician and the year that they were born, e.g. Abbott1859. In certain cases this is not enough to uniquely identify them and then we add the first name, e.g. AndersonCharles1897 and AndersonGordon1897. In cases where there is punctuation in the surname, e.g. Ashley-Brown or O’Brien, this has been removed but capitalisation has been retained, so those would become AshleyBrown or OBrien, respectively.
commonName is used to highlight the name that the politician tended to be known as e.g. Ted instead of Edward This is used in displayName which is a politicians surname and their common name (if they had one) or first name e.g. Abbott, Richard. In cases where this would not be unique, e.g. Francis Baker, an additional name has been added.
earlierOrLaterNames is mostly used to keep track of women changing their names at marriage. Similarly, title is mostly used to keep track of ‘Dr’, but both have been used inconsistently and should be only used sparingly.
Some politicians don’t have a complete birth date, and instead only have a year of birth. In these cases their entry for birthDate will be empty, but they will have a birthYear. All death dates are complete, but in the case of one politician – John William Croft – this has been inputted, as the circumstances and timing (even year) of his death are unknown. birthPlace is mostly taken from WikiData, with a few updates. There are some issues that need to be addressed here.
member and senator are binary indicator variables used to signify whether the politician was in the lower or upper house. Most politicians are only in one or the other, but some were in both. One politician in the dataset was neither a senator nor an MP - Heather Elaine Hill. She remains in the dataset because she was elected to the senate, and the need for this dataset to exactly match the AustralianElections one), however her eligibility was challenged and her election was invalidated, so she was never a senator.
adb is a link to the Australian Dictionary of Biography.
This dataset adds information about the division (‘seat’) of the politician. One row per division-politician, with columns: uniqueID; mpsDivision; mpsState; mpsEnteredAtByElection; mpsFrom; mpsTo; mpsEndReason; mpsChangedSeat; and mpsComments.
head(AustralianPoliticians::get_auspol('mps')) #> # A tibble: 6 × 9 #> uniqueID division stateOfDivision enteredAtByElect… mpFrom mpTo #> <chr> <chr> <chr> <chr> <date> <date> #> 1 Abbott1869 New England NSW No 1913-05-31 1919-11-03 #> 2 Abbott1886 Gwydir NSW No 1925-11-14 1929-10-12 #> 3 Abbott1886 Gwydir NSW No 1931-12-19 1937-03-28 #> 4 Abbott1891 New England NSW No 1940-09-21 1949-10-31 #> 5 Abbott1957 Warringah NSW Yes 1994-03-26 2019-05-18 #> 6 Abel1939 Evans NSW No 1975-12-13 1977-11-10 #> # … with 3 more variables: mpEndReason <chr>, mpChangedSeat <dbl>, #> # mpComments <chr>
Certain divisions change name. Sometimes this is minor, for instance Kingsford-Smith to Kingsford Smith, and sometimes it is total. In all cases this is being treated as change in division – the politician is treated as finishing with one division and moving to another – but changedSeat can be used to identify these cases and adjust for them if necessary.
byElection is a binary indicator variable as to whether the politician was entering the seat following a by-election.
changedSeat is a binary indicator variable as to whether the politician left a division because they were changing the division, as opposed to losing an election or retiring.
This dataset adds information about the party of the politician. One row per party-politician, with columns: uniqueID; partyAbbrev; partyName; partyFrom; partyTo; partyChangedName; partySimplifiedName; partySpecificDateInputted; and partyComments.
head(AustralianPoliticians::get_auspol('allbyparty')) #> # A tibble: 6 × 9 #> uniqueID partyAbbrev partyName partyFrom partyTo partyChangedName #> <chr> <chr> <chr> <date> <date> <dbl> #> 1 Abbott1859 CP Australian Coun… NA NA NA #> 2 Abbott1869 CP Australian Coun… 1919-11-03 NA NA #> 3 Abbott1869 LIB Liberal Party o… NA 1917-02-17 NA #> 4 Abbott1869 NAT Nationalist Par… 1917-02-17 1919-11-03 NA #> 5 Abbott1877 CP Australian Coun… NA NA NA #> 6 Abbott1886 CP Australian Coun… NA NA NA #> # … with 3 more variables: partySimplifiedName <chr>, #> # partySpecificDateInputted <dbl>, partyComments <chr>
Party can be a little confusing in cases where a politician changed party. In general, in this dataset, the to/from dates are set-up so that when a politician is in parliament they will have the correct party. However the dataset should not be used to say anything about when they are out of parliament. For instance, some politicians lost their seat, changed party, and then regained a seat in parliament. The dataset does not know when they changed party while they were out of parliament, and it assumes that they changed party either at the same time that they lost their seat or at the same time as they re-gained a seat. Similarly, there are plenty of cases where a politician has ceased being a member after they leave parliament, for instance, Malcolm Fraser left the Liberals. Again, that is not reflected in the dataset.
Certain parties, such as the Nationals, changed their name at various points in time. This is included as a party change for people at that time in partyAbbreviationParlHandbook and partyNameParlHandbook. However, partySimplified abstracts away from that.
Party name changes:
- The Country Party changed to the National Country Party on 3 May 1975 according to http://nla.gov.au/nla.news-article110636121. It then changed from the National Country Party to the National Party of Australia on 17 October 1982 according to http://nla.gov.au/nla.news-article116476081. And finally, it changed from the National Party of Australia to The Nationals on 11 October 2003 according to the party website.
- The Nick Xenophon Team changed to Centre Alliance on 10 April 2018, according to ABC news reports.
This dataset adds information about the state that a senator was representing. The variables are: uniqueID; senatorsState; senatorsFrom; senatorsTo; senatorsEndReason; senatorsSec15Sel; and senatorsComments.
head(AustralianPoliticians::get_auspol('senators')) #> # A tibble: 6 × 7 #> uniqueID senatorsState senatorFrom senatorTo senatorEndReason sec15Sel #> <chr> <chr> <date> <date> <chr> <dbl> #> 1 Abbott1859 VIC 1928-12-18 1929-06-30 Retired 1 #> 2 Abbott1869 NSW 1925-11-14 1929-06-30 Defeated 0 #> 3 Abbott1877 NSW 1935-07-01 1941-06-30 Defeated 0 #> 4 Abetz1958 TAS 1994-02-22 NA <NA> 1 #> 5 Adams1943 WA 2005-07-01 2012-03-31 Died 0 #> 6 Adamson1857 QLD 1920-07-01 1922-05-02 Died 0 #> # … with 1 more variable: senatorComments <chr>
This dataset is fairly similar to by_division_mps, expect that it also has senatorsSec15Sel This is a binary indicator variable and indicates whether the senator has been appointed rather than elected.
Add dataset of ministers with dates.
- Add information about education.
- Add information about relationships, for instance father-son, etc.
- Add voting based on JFG’s dataset.
In the first instance, the Parliamentary Handbook was the main source of information. This was augmented with information from Wikipedia, the Australian Dictionary of Biography, and the Senate Biographies wherever possible. Limited information was obtained from other sources, such as state parliaments and newspapers (via Trove), and these have generally been specified in the comments. birthPlace is mostly from WikiData.
The uniqueID_to_aphID dataset was primarily drawn from a dataset put together by Patrick Leslie, and it was checked against a modern dataset from Open Australia, and Tim Sherratt’s Historic Hansard records for the Reps and Senate.
In terms of developing the package I am especially grateful for the advice and help of:
- Ben Readshaw,
- Edward Howlett,
- Julia Haider,
- Kelly Lyons,
- Monica Alexander,
- Sharla Gelfand, and
- Simon Munzert.
Thank you to Patrick Leslie who generously donated data.
Rohan Alexander (corresponding author and repository maintainer)
University of Toronto
Faculty of Information and Department of Statistical Sciences 140 St George St
Toronto, ON, Canada
Paul A. Hodgetts University of Toronto
Faculty of Information 140 St George St
Toronto, ON, Canada