The code used for the blogpost "Text Analysis with tidytext".


This repository contains the code used for the post "Text Analysis with tidytext", which can be found here.

The repository contains:

  • all the code that is shown explicitly or referred to in the post, plus additional code; the scripts follow the structure of the post:
    • 1 - get text.R
    • 2 - prepare text for analysis.R
    • 3 - key themes.R
    • 4 - tf-idf.R
    • 5 - n-grams.R
    • 6 - sentiment analysis.R
    • 7 - sentence structure.R
  • the RStudio Project file (.Rproj file).
  • the plots subdirectory, which contains the ggplot2 plots created to analyse the speeches, both those shown in the post and additional ones.
  • the Transcripts subdirectory, which contains the raw speech text.
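
For readers new to tidytext, the kind of steps the scripts cover (tokenising, removing stop words, tf-idf and n-grams) can be sketched roughly as follows. This is a minimal illustration on made-up sentences, not the code from the scripts: the `speeches` data frame, its column names and its contents are assumptions introduced only for this example.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data standing in for the speech transcripts
speeches <- tibble(
  speech = c("A", "B"),
  text = c(
    "The economy is strong and the economy is growing.",
    "Our security depends on strong borders."
  )
)

# One row per word (lower-cased, punctuation stripped), stop words removed
words <- speeches %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# Word counts per speech, then term frequency-inverse document frequency
word_tf_idf <- words %>%
  count(speech, word, sort = TRUE) %>%
  bind_tf_idf(word, speech, n)

# One row per pair of consecutive words (bigrams), stop words kept
bigrams <- speeches %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)
```

Stop words are removed before counting so that frequent function words such as "the" do not dominate the word counts; the bigram step keeps them, since removing words first would join words that were not adjacent in the original text.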

If you see any errors in the code, please let me know. If you have any comments about the post, please feel free to contact me.

You can visit my website at
