digitalhen/speechAnalysis

Speech Analysis

A collection of scripts for processing written speech, such as speech transcripts.

State of the Union

A script that processes every United States State of the Union speech from 1790 (Washington) to 2012 (Obama) and ranks terms by TF-IDF (term frequency-inverse document frequency). It is based on work for an assignment at the JMSC (at the University of Hong Kong).
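For readers unfamiliar with TF-IDF, here is a minimal sketch of the idea, not the repository's actual script. It assumes the plain formulation tf * log(N / df) with no smoothing, and the names tokenize and top_terms, as well as the dictionary-of-texts input, are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, key, n=20):
    """Return the n highest-scoring (term, score) pairs for documents[key].

    documents maps a label (for example a year) to the text of a speech.
    This is an illustrative helper, not part of the original script.
    """
    counts = {label: Counter(tokenize(text)) for label, text in documents.items()}

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    total_docs = len(counts)
    target = counts[key]
    total_terms = sum(target.values())

    scores = {}
    for term, count in target.items():
        tf = count / total_terms                       # share of this document
        idf = math.log(total_docs / doc_freq[term])    # rarity across documents
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


if __name__ == "__main__":
    speeches = {
        1900: "the treaty and the tariff",
        1901: "the railroad and the tariff",
        1902: "the war and the peace",
    }
    print(top_terms(speeches, 1900, n=3))
```

In this toy corpus "treaty" ranks first for 1900 because it appears in no other document, while "the" scores zero because it appears in all of them.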

There are a few different ways you can use this script:

  • Output the top terms for a specific year: python <script name> -y 2000
  • Output the top terms grouped by decade beginning with a specific year: python <script name> -d 1900

Additionally, you can set the number of terms to return with -t (the default is 20): python <script name> -d 1900 -t 5
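The options above could be parsed roughly as follows. This is a sketch only: the short flags -y, -d and -t and the default of 20 come from the usage notes, while the long option names, the build_parser function, and the choice to make -y and -d mutually exclusive are assumptions that may not match the real script.

```python
import argparse


def build_parser():
    """Build a parser for the -y, -d and -t options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Top TF-IDF terms in State of the Union speeches"
    )

    # Assumption: exactly one of -y or -d must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-y", "--year", type=int,
                      help="output the top terms for this year")
    mode.add_argument("-d", "--decade", type=int,
                      help="output the top terms by decade, starting at this year")

    parser.add_argument("-t", "--terms", type=int, default=20,
                        help="number of terms to return (default: 20)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, running the script with -d 1900 -t 5 yields decade=1900, terms=5 and year=None, and omitting -t leaves terms at 20.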

An infographic charting the top 20 terms for each decade from 1900 to the present was created. It visually demonstrates the change of focus from decade to decade. An article was also written about the findings.

About

TF-IDF analysis of U.S. State of the Union speeches from 1790-2012, identifying key terms by year and decade
