In this tutorial you learn how to extract text out of well-formed (digital-born) PDF files and apply tidyverse/tidytext tools to do a bit of text-mining, before learning the basics of sentiment analysis. We will practice with the IPCC (International Panel on Climate Change) Special Report on Climate Change. What sentiments do you expect to find in this sort of document?
- Julia Silge and David Robinson's Text Mining with R
For text mining & analysis stuff:
tidyverse
here
pdftools
textdata
tidytext
ggwordcloud
Note - before lab for sentiment analysis you will need to:
- Attach
tidytext
andtextdata
packages - In the console, Run:
get_sentiments(lexicon = "nrc")
- You should be prompted to install lexicon - choose yes!
- Once that's done, in the console run
get_sentiments(lexicon = "afinn")
- You should be prompted to install lexicon - choose yes!
- IPCC (International Panel on Climate Change) Special Report: Global Warming of 1.5 degrees.
After you grasp sentiment analysis, apply these skills to another document of choice. What do you think is the range of sentiments in the Game of Thrones?