ICS02: 8. Introduction to R
Sunoikisis Digital Classics, Spring 2019
Session 8. Introduction to programming through R
Thursday Feb 28, 16:00 UK = 18:00 EET
Convenors: Christopher Ohge & Gabriel Bodard (University of London)
YouTube link: https://youtu.be/X8iCDZVgWSA
This session introduces basic programming concepts using the R language. After an introductory lesson on regular expressions, R syntax, and basic R functions, we will use the tidytext package to perform text analysis tasks.
- Intro: common programming languages in DH
- Regular expressions
- Intro to R and tidytext
Installing R and RStudio
Before the session, make sure to download the R software package from http://www.r-project.org/.
- Click on "download R."
- Choose the appropriate CRAN mirror in your area for downloading (for me it's the UK > Imperial College London link).
- Download and install the appropriate R 3.5.2 binary for your operating system.
Then download the latest version of RStudio at https://www.rstudio.com.
- Click on "Download RStudio."
- Download the RStudio Desktop (free) version.
- Choose the appropriate installer: most of you will use either RStudio 1.1.463 - Windows Vista/7/8/10 or Mac OS X 10.6+.
- Hawkins, Laura F. 'Computational Models for Analyzing Data Collected from Reconstructed Cuneiform Syllabaries.' Digital Humanities Quarterly 12.1 (2018). Available: http://digitalhumanities.org:8081/dhq/vol/12/1/000368/000368.html (Wayback Machine version)
- Rockwell, G. 'What is Text Analysis, Really?' Literary and Linguistic Computing 18.2 (2003): 209-219. Available: http://www.geoffreyrockwell.com/publications/WhatIsTAnalysis.pdf
- Rydberg-Cox, Jeff. Statistical Methods for Studying Literature in R. Available: https://daedalus.umkc.edu/StatisticalMethods/index.html
- Silge, Julia, and David Robinson. Text Mining with R: A Tidy Approach. Available: https://www.tidytextmining.com/. Especially chapters 1–3.
- Jockers, Matthew. Text Analysis with R for Students of Literature (Springer, 2014). Especially chapters 1, 2, 6, 7, and 11.
- Regex Tester: https://www.regextester.com/
- Regex Quickstart: https://www.rexegg.com/regex-quickstart.html
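As a quick illustration of how a pattern tried out in one of the testers above carries over to R, here is a minimal base R sketch. The vector `words` and the string `line` are invented for illustration and are not taken from the session notebook; note that R needs POSIX classes such as `[:alpha:]` to be wrapped in a second pair of brackets.

```r
# Toy data, invented for illustration only.
words <- c("the", "", "1859", "--", "city", "l'amour", "two")

# grepl() returns TRUE for elements that match the pattern.
# Here: tokens made up only of letters and apostrophes, start to end.
is_word <- grepl("^[[:alpha:]']+$", words)
clean <- words[is_word]
print(clean)

# gsub() replaces every match. Here each run of non-letter characters
# is collapsed to a single space.
line <- "It was the best of times--it was the worst of times, 1859!"
spaced <- gsub("[^[:alpha:]]+", " ", line)
print(spaced)
```

`grepl()` is useful for filtering a vector, while `gsub()` is useful for cleaning a string before it is split into words.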
Create a regular expression to remove those non-words in
Create a regular expression for only dialogue words in dickens.words.v. Modify the Jockers for loop by confining your word frequency results to only dialogue.
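One possible starting point for the dialogue task, sketched in base R on an invented sentence rather than on the Dickens text itself. It assumes dialogue is marked with straight double quotes and that the pattern is applied to the raw text before it is split into words; a real Project Gutenberg file may use single or curly quotes, so the pattern would need adjusting.

```r
# Invented passage for illustration; not from the session notebook.
passage <- 'He said, "Come in." She waited. "Not yet," she replied.'

# Find every stretch of text between a pair of double quotes.
quoted <- regmatches(passage, gregexpr('"[^"]*"', passage))[[1]]

# Strip the quote marks, lowercase, and split on anything that is
# not a letter or an apostrophe.
dialogue_text <- tolower(gsub('"', "", quoted))
dialogue_words <- unlist(strsplit(dialogue_text, "[^a-z']+"))
dialogue_words <- dialogue_words[dialogue_words != ""]

# Frequency table restricted to dialogue words.
dialogue_freqs <- sort(table(dialogue_words), decreasing = TRUE)
print(dialogue_freqs)
```

The resulting vector can then be counted in the same way as any other word vector, whether with `table()` as here or inside a for loop.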
Using the gutenbergr package, load some new text files (more than one, please) that interest you. Create a tidy tibble of the textual data and choose a visualisation method for displaying your results.
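A minimal sketch of what a tidy workflow for this task might look like, assuming the dplyr, tidytext, and ggplot2 packages are installed. So that it runs without a network connection, a small invented tibble stands in for the downloaded books; the commented line shows how `gutenberg_download()` could supply real texts instead (the IDs 98 and 1400 are example choices, not ones prescribed by the session).

```r
library(dplyr)
library(tidytext)
library(ggplot2)

# With network access, gutenbergr could supply real texts, for example:
# books <- gutenbergr::gutenberg_download(c(98, 1400), meta_fields = "title")
# A tiny invented tibble with the same column names stands in here.
books <- tibble(
  title = c("Book A", "Book A", "Book B"),
  text  = c("The city was quiet.",
            "The quiet city slept.",
            "A river ran to the city.")
)

# Tidy format: one lowercase word per row, common function words removed.
tidy_books <- books %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# Word counts per book, most frequent first.
word_counts <- tidy_books %>%
  count(title, word, sort = TRUE)

# One bar chart panel per book, showing up to ten top words each.
top_words_plot <- word_counts %>%
  group_by(title) %>%
  slice_max(n, n = 10) %>%
  ungroup() %>%
  ggplot(aes(x = n, y = reorder(word, n))) +
  geom_col() +
  facet_wrap(~ title, scales = "free_y") +
  labs(x = "Occurrences", y = NULL)

print(top_words_plot)
```

A faceted bar chart is only one option; the same `word_counts` tibble could feed a word cloud or a comparison of relative frequencies between books.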
Based on your results, posit one or more new questions that you would like to investigate further. Modify one or more code blocks from Part I of the R Notebook to answer them.