
kwartler/GSERM_Text_Remote_student


GSERM Text Mining Course for Remote Participation

Each day, please run git pull to get the most up-to-date files and lessons.

Welcome Video

  • Welcome to the class video. For this class session we will not have virtual machines set up, so please forgive my mistake in the video. You will have to install R, RStudio and Git locally on your laptop.

Recorded Lesson URLs

| Day | URL | Topic |
| --- | --- | --- |
| Monday | Setup*: R, R-Studio & git; Vid2**; Vid3** | Setup, R Basics, String Manipulation, Text Organizations |
| Tuesday | Vid1, Vid2, Vid3 | Text Mining Visuals |
| Wednesday | Vid1, Vid2, Vid3, Vid4, Vid5 | Sentiment Analysis, Unsupervised Methods |
| Thursday | Vid1, Vid2, Vid3, Vid4 | Supervised Methods |
| Friday | Vid1, Vid2, Vid3 | Ethics, Data Sources, Syntactic Parsing & Lemmatization |

* Video 1 has been replaced with a presentation on R, RStudio & Git setup.
** Due to an editing error, you will have to download these videos instead of streaming them.

Live Session Videos (will be deleted July 31, 2022)

| Day | URL |
| --- | --- |
| Monday | 1st day |
| Tuesday | 2nd day, only second half :( |
| Wednesday | 3rd day |
| Thursday | 4th day |
| Friday | 5th day |

Lesson Structure

Each day's lesson is contained in the lesson folder. Each individual lesson folder will contain the following files and folders.

  • slides - A copy of the presentation covered in the recording. Provided because some students print the slides and take notes.
  • data - a subfolder containing the data we will work through together
  • scripts - commented scripts to demonstrate the lesson's concepts
  • HW - the daily homework will be in this folder.

Environment Setup

  • You must install R, RStudio & Git locally on your laptop or, if you know how to set one up, you can work from a server instance with all the software installed. www.rstudio.cloud is another option, but the free tier has significant time limitations. Part of day 1 will be devoted to ensuring everyone's setup works correctly.
  • If you encounter any errors during setup, don't worry! Please request technical help from Prof K. The qdap library is usually the trickiest because it requires Java and rJava, so if you get any errors, try removing qdap from the code below and rerunning. Installation will take a long time, so if possible please run it before class and at a time when you don't need your computer, e.g. at night. We will work to resolve any issues prior to class or during Monday's live session.

R Packages

# Easiest Method to run in your console
install.packages('pacman')
pacman::p_load(ggplot2, ggthemes, stringi, hunspell, qdap, spelling, tm, dendextend,
wordcloud, RColorBrewer, wordcloud2, pbapply, plotrix, ggalt, tidytext, textdata, dplyr, radarchart, 
lda, LDAvis, treemap, clue, cluster, fst, skmeans, kmed, text2vec, caret, glmnet, pROC, textcat, 
xml2, stringr, rvest, twitteR, jsonlite, docxtractr, readxl, udpipe, reshape2, openNLP, vtreat, e1071,
lexicon, echarts4r, lsa, yardstick, textreadr, pdftools, tesseract, mgsub, mapproj, ggwordcloud)

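# Additionally we will need this package from a different repo.
# `repos` is the full argument name; this repo distributes source packages,
# so type = 'source' may be needed on Windows and Mac.
install.packages('openNLPmodels.en', repos = 'http://datacube.wu.ac.at/', type = 'source')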

# You can install packages individually such as below if pacman fails.
install.packages('tm')

# Or, using base functions, pass several package names in a character vector built with `c()`
install.packages(c("lda", "LDAvis", "treemap"))
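
After the install commands have run, it can help to see which packages are still missing, for example if qdap failed because of Java. The following is a minimal sketch in base R. The helper name missing_packages() is hypothetical (it is not part of pacman or the course scripts), and the short package vector is only a sample of the full list above.

```r
# Hypothetical helper (not part of the course scripts): returns the names of
# packages that cannot be found in any of the current library paths.
# system.file() returns "" for a package that is not installed, and it does
# not load the package, so a broken Java setup will not interfere.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  unname(pkgs[!found])
}

# A few packages from the list above; extend with the full list as needed.
course_pkgs <- c("tm", "qdap", "tidytext", "wordcloud", "text2vec")

still_missing <- missing_packages(course_pkgs)
if (length(still_missing) == 0) {
  message("All checked packages are installed.")
} else {
  message("Still missing: ", paste(still_missing, collapse = ", "))
}
```

Anything reported as missing can then be installed on its own with install.packages(), as in the fallback above.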

Installing rJava (needed for qdap) on Mac

For most students, these two links have helped them install Java and then make sure R/RStudio can find it when loading qdap. Keep in mind that you don't have to install qdap to earn a good grade. It is used primarily for a few functions, including polarity().

Once Java is installed, running this command from the terminal often resolves the issue:

sudo R CMD javareconf

If this causes hardship, don't worry! It's only a small part of our overall learning, and I will cover an alternative in the live session.
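
To check whether R can find Java before trying to load qdap, a short test like the one below may help. It is a minimal sketch, not part of the course scripts: java_ready() is a hypothetical helper name, and it assumes that a failure to load rJava or to start the Java VM is what blocks qdap.

```r
# Hypothetical helper: TRUE if rJava is installed and a Java VM can be started.
java_ready <- function() {
  if (!nzchar(system.file(package = "rJava"))) {
    return(FALSE)  # rJava itself is not installed
  }
  tryCatch({
    status <- rJava::.jinit()  # start (or attach to) the Java VM
    status >= 0                # negative values indicate failure
  }, error = function(e) FALSE)
}

if (java_ready()) {
  # Ask the running JVM for its version string
  message("Java found, version: ",
          rJava::.jcall("java/lang/System", "S", "getProperty", "java.version"))
} else {
  message("R cannot find Java yet; try the javareconf command above, or skip qdap.")
}
```

If the check reports that Java is available, library(qdap) should have a better chance of loading. If it does not, the rest of the course packages can still be used without qdap.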

Homework & Case Due Dates

| HW | Covered in Class | Due |
| --- | --- | --- |
| HW1 | Monday | Tuesday |
| HW2 | Tuesday | Wednesday |
| HW3 | Wednesday | Thursday |
| HW4 | Thursday | Friday |
| Case | NA | July 1 |

Prerequisite Work
