leops95/sots

U.S. State of the State transcripts

This repository contains the speech data used in the following manuscript: Figurative and Literal Language in U.S. Governor Speeches (L. Picard & D. Stammbach). It includes raw text files and metadata for 1,296 speeches (State of the State addresses from U.S. governors and State of the Union addresses from U.S. presidents), covering the period 1995–2022.

List of variables from metadata.csv:

| Variable | Definition |
| --- | --- |
| st_name | State full name |
| st_id | State two-letter identifier |
| year | Speech year of delivery |
| filename | Speech file name, found in folder speeches_raw/ |
| speaker | Speaker full name, in format Lastname_Firstname |
| party | Speaker political party (1 = republican, 2 = democratic, 3 = other) |
| age_endyear | Speaker age, defined at the end of the year |
| gender | Speaker gender (0 = male, 1 = female) |
| ethnicity | Speaker ethnicity (1 = caucasian, 2 = african american, 3 = asian american, 4 = native american, 5 = hispanic, 6 = other) |
| elec_lastyear | Indicator variable for speeches following an election year (1 = yes, 0 = no) |
| vshare_lastelec | Speaker vote share (0-100, or missing if elected through other means, see remarks) |
| last_term | Indicator variable for term-limited speakers, term level (1 = yes, 0 = no) |
| term_limit | Indicator variable for term-limited speakers, year level (1 = yes, 0 = no) |
| type | Speech type (sots = State of the State, sotu = State of the Union, budg = State of the Budget or Budget address, inaug = Inauguration speech, other = Mixed or other type) |
| quality | Speech text quality (as prepared/ocr/bulletpoints/youtube cc/quotes) |
| source | Source of speech, URL link |
| remarks | Optional remarks on election characteristics or governor tenure |
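
The coded variables above can be turned into readable labels when the metadata is loaded. The following is a minimal sketch, not part of the repository: it assumes metadata.csv is comma-separated, has a header row using the variable names listed above, stores the codes as numbers, and leaves missing values blank or as a non-numeric marker such as NA. The helper names (`decode_row`, `read_metadata`, `speech_path`) are hypothetical and introduced only for illustration.

```python
import csv
import math
from pathlib import Path

# Code-to-label mappings copied from the variable list above.
PARTY = {1: "republican", 2: "democratic", 3: "other"}
GENDER = {0: "male", 1: "female"}
ETHNICITY = {
    1: "caucasian",
    2: "african american",
    3: "asian american",
    4: "native american",
    5: "hispanic",
    6: "other",
}


def _number(value):
    """Parse a CSV cell as a float; return None if it is blank or non-numeric."""
    try:
        number = float((value or "").strip())
    except ValueError:
        return None
    return None if math.isnan(number) else number


def decode_row(row):
    """Return a copy of one metadata row with coded fields replaced by labels."""
    out = dict(row)
    for column, labels in (("party", PARTY), ("gender", GENDER), ("ethnicity", ETHNICITY)):
        number = _number(out.get(column))
        # Unknown or missing codes become None rather than raising an error.
        out[column] = labels.get(int(number)) if number is not None else None
    # Vote share is 0-100, or None when the speaker was not elected by vote.
    out["vshare_lastelec"] = _number(out.get("vshare_lastelec"))
    return out


def read_metadata(lines):
    """Decode every row from an iterable of CSV lines (for example, an open file)."""
    return [decode_row(row) for row in csv.DictReader(lines)]


def speech_path(row, root="speeches_raw"):
    """Build the path to a speech's raw text file from its metadata row."""
    return Path(root) / row["filename"]


# Typical use, assuming the script runs from the repository root:
#
#     with open("metadata.csv", newline="", encoding="utf-8") as handle:
#         rows = read_metadata(handle)
#     sots_only = [row for row in rows if row["type"] == "sots"]
#     text = speech_path(sots_only[0]).read_text(encoding="utf-8")
```

Keeping the mappings as plain dictionaries makes it easy to check them against the variable list. Returning `None` for unrecognised codes avoids silently mislabelling rows if the file's encoding of missing values differs from what is assumed here.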
