This repository contains the code for the post Text Analysis with tidytext, which can be found here.
The repository contains:
- all the code that is shown explicitly or referred to in the post, plus additional code; the scripts follow the structure of the post:
  - 1 - get text.R
  - 2 - prepare text for analysis.R
  - 3 - key themes.R
  - 4 - tf-idf.R
  - 5 - n-grams.R
  - 6 - sentiment analysis.R
  - 7 - sentence structure.R
- the RStudio Project file (.Rproj file).
- the plots subdirectory, which contains the ggplot2 plots created to analyse the speeches: those shown in the post plus additional ones.
- the Transcripts subdirectory, which contains the raw speech text.
If you spot any errors in the code or have any comments about the post, please feel free to contact me.
You can visit my website at www.markrstevenson.com.