Advanced Text Analysis with R

This repository contains the material for the Advanced Text Analysis with R course, part of the City University of Hong Kong Summer School in Social Science Research. I will use this page to publish lecture slides, handouts, data sets, etc. As the title indicates, the course will be taught almost entirely in R. If you don't use R yet, please make sure that you install R and RStudio on your laptop.

This repository hosts the slides (HTML and source code). The source code for all handouts is published on my learningR page.

**June 2nd (morning):**

Session 1: Organizing and Transforming data in R

In this introductory session you will learn how to use R to organize and transform your data: calculating columns, subsetting, transforming and merging data, and computing aggregate statistics. If time permits, we will also cover basic modelling and/or programming in R as desired.
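As a rough illustration of these operations, the sketch below uses base R only. The `articles` and `media` data frames and all of their columns are hypothetical toy data, not part of the course materials.

```r
# Hypothetical toy data (not from the course materials)
articles <- data.frame(
  id     = 1:4,
  medium = c("A", "A", "B", "B"),
  words  = c(100, 200, 300, 500),
  hits   = c(1, 4, 3, 10)
)
media <- data.frame(
  medium = c("A", "B"),
  type   = c("newspaper", "blog")
)

# Calculate a new column from existing columns
articles$hits_per_100 <- articles$hits / articles$words * 100

# Subset: keep only the rows with more than 150 words
long_articles <- articles[articles$words > 150, ]

# Merge: add the medium type to each article
merged <- merge(articles, media, by = "medium")

# Aggregate: total number of hits per medium type
totals <- aggregate(hits ~ type, data = merged, FUN = sum)
print(totals)
```

Packages such as dplyr offer alternative syntax for the same steps; base R is used here only to keep the sketch free of dependencies.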

Session 2: Visualizing and using APIs from R: Twitter, Facebook, NY Times

In this session we will look briefly at visualizing data in R. The main focus of the session is on using APIs from R. We will be looking at the Twitter, Facebook, and NY Times API, and also see how to access arbitrary web resources from R.

Session 3: Querying text with AmCAT and R

This is the first session that directly deals with text analysis. The goal of this session is to learn how to use AmCAT as a document management tool, upload data, and perform queries from R.

Session 4: Corpus Analysis and Text (pre)processing

In this session the focus is on the document-term matrix and what can be done with it: word clouds, comparison of different corpora, and topic models.
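To make the idea of a document-term matrix concrete, here is a minimal base R sketch. The three example documents are made up, and the tokenization (lowercasing and splitting on non-letters) is a simplifying assumption; the course itself may rely on dedicated text analysis packages instead.

```r
# Hypothetical example documents
docs <- c(
  d1 = "The cat sat on the mat",
  d2 = "The dog chased the cat",
  d3 = "Dogs and cats"
)

# Very simple tokenization: lowercase, then split on anything that is not a letter
tokens <- strsplit(tolower(unname(docs)), "[^a-z]+")
names(tokens) <- names(docs)

# One row per (document, term) occurrence
occurrences <- data.frame(
  doc  = rep(names(tokens), lengths(tokens)),
  term = unlist(tokens, use.names = FALSE)
)

# Cross-tabulate: rows are documents, columns are terms, cells are counts
dtm <- table(occurrences$doc, occurrences$term)
print(dtm)

# Total frequency of each term across the corpus, most frequent first
term_freq <- sort(colSums(dtm), decreasing = TRUE)
print(head(term_freq, 3))
```

Term frequencies like `term_freq` are the kind of input a word cloud needs, and the matrix itself is the usual starting point for comparing corpora or fitting a topic model.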

Session 5: Advanced text analysis: Machine learning and sentiment analysis

In this session we will do sentiment analysis using both a dictionary approach and machine learning. These techniques can also be applied to other forms of automatic content analysis, such as topic classification or frame analysis.
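The dictionary approach can be sketched in a few lines of base R. The word lists, example texts, and the `score_text` helper below are all hypothetical and far smaller than any real sentiment dictionary; the score is simply the number of positive words minus the number of negative words.

```r
# Hypothetical mini-dictionary (real dictionaries are much larger)
positive <- c("good", "great", "happy")
negative <- c("bad", "terrible", "sad")

# Hypothetical example texts
texts <- c(
  "A great and happy day",
  "A bad, terrible film",
  "Good start but sad ending"
)

# Score one text: count of positive words minus count of negative words
score_text <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  sum(tokens %in% positive) - sum(tokens %in% negative)
}

scores <- vapply(texts, score_text, numeric(1), USE.NAMES = FALSE)
print(data.frame(text = texts, score = scores))
```

A machine learning approach would instead learn which words signal sentiment from manually coded examples, which is why it carries over to tasks where no dictionary exists.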

Session 6: Advanced text analysis: Semantic Network Analysis and Visualization

In the last session we will look at semantic network analysis with word-window approaches and more advanced visualization techniques using ggplot2, igraph, and Gephi.
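A word-window approach treats two words as connected when they occur within a fixed number of tokens of each other. The base R sketch below is one possible way to build such an edge list; the `cooccurrences` function, the window size, and the example sentence are assumptions for illustration, not the method used in the course.

```r
# Count how often two different words occur within `window` tokens of each other
cooccurrences <- function(tokens, window = 2) {
  n <- length(tokens)
  from <- character(0)
  to <- character(0)
  if (n >= 2) {
    for (i in seq_len(n - 1)) {
      for (j in (i + 1):min(i + window, n)) {
        from <- c(from, tokens[i])
        to <- c(to, tokens[j])
      }
    }
  }
  # Treat edges as undirected: order each pair alphabetically
  a <- pmin(from, to)
  b <- pmax(from, to)
  keep <- a != b  # drop pairs of a word with itself
  edges <- as.data.frame(table(from = a[keep], to = b[keep]),
                         stringsAsFactors = FALSE)
  edges[edges$Freq > 0, ]
}

# Hypothetical example sentence, already tokenized
tokens <- c("the", "president", "signed", "the", "bill")
edges <- cooccurrences(tokens, window = 2)
print(edges)
```

An edge list in this shape (two node columns plus a weight) can be turned into a graph object, for example with `igraph::graph_from_data_frame`, or exported for use in Gephi.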
