love-borjeson/tm_ws_cloud

  1. Topic Modeling in R
  2. Learning and Modeling Philosophy
  3. Preparations
  4. License and Contact

Topic Modeling in R

Topic modeling workshop in R, data and scripts.

The workshop is hosted in RStudio Cloud, which means only lilliputian preparations for participants (no installation of packages, etc.).

The workshop goes through topic modeling; (tweaking) the Gibbs sampler; using and editing a stoplist; linguistically informing the model using part-of-speech tagging, lemmatization and keywords; finding the appropriate number of topics (hello K!); and, finally, exporting model results to the extra-R world (if there is such a world). Bonus scripts (6-7) include code to build an app that lets you interact with the model and the original data at the same time, plus some eye candy if you need to impress someone with your results.
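To give a flavor of the first steps listed above (editing a stoplist, then fitting a topic model with a Gibbs sampler), here is a minimal sketch. It is not taken from the workshop scripts: the toy documents, the stoplist, and the choice of the `topicmodels` package are assumptions made for illustration, and the sampler settings (`burnin`, `iter`, `alpha`) are arbitrary.

```r
# Hypothetical toy corpus; the workshop's real data is not shown here.
docs <- c(
  "the cat sat on the mat",
  "the dog chased the cat",
  "stocks fell as the market closed",
  "the market rallied and stocks rose"
)

# A stoplist is just a character vector, so editing it is plain vector work.
stoplist <- c("the", "on", "as", "and")
stoplist <- union(stoplist, "sat")    # add a word to the stoplist
stoplist <- setdiff(stoplist, "as")   # remove a word from the stoplist

# Tokenize on whitespace and drop stoplisted tokens.
tokens <- lapply(strsplit(tolower(docs), "\\s+"),
                 function(x) x[!x %in% stoplist])

# Build a documents-by-terms count matrix in base R.
vocab <- sort(unique(unlist(tokens)))
dtm <- t(sapply(tokens, function(x) table(factor(x, levels = vocab))))
rownames(dtm) <- paste0("doc", seq_along(docs))

# Fit an LDA model with Gibbs sampling, assuming 'topicmodels' is installed.
# K = 2 and the control values below are arbitrary illustration values.
if (requireNamespace("topicmodels", quietly = TRUE)) {
  fit <- topicmodels::LDA(
    dtm, k = 2, method = "Gibbs",
    control = list(seed = 1, burnin = 100, iter = 500, alpha = 0.1)
  )
  print(topicmodels::terms(fit, 3))   # top three terms per topic
}
```

Changing `k`, the stoplist, or the `control` list and re-running is the kind of tweaking the workshop walks through in more depth.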

A very brief introduction to topic modeling can be found here.

Learning and Modeling Philosophy

In this workshop we adopt the learning philosophy of Fast AI. Rather than starting off with the typical “Hello World” and building from the ground up (which would take years of training), we start at the other end with state-of-the-art modeling, using very practical, (re-)usable examples (Writing a master's thesis using topic modeling? These scripts should give you a head start!).

With this approach, many of the finer details of both the scripts and the underlying statistical “machinery” will be hard to grasp immediately, but that is OK: scripts and data are written and organized in such a way that each participant can return to whatever section of the workshop was unclear and gain a better understanding on their own. The scripts are plentifully commented and the only command ever needed is Ctrl+Enter.

This is a friendly, inclusive workshop. We believe that trying is the right thing to do, even when you fail. We thus encourage everyone who is interested to participate, regardless of prior knowledge. Should you feel that you need more preparation (theoretical, technical, or otherwise), that will not be the end of the world.

For this approach to work, however, we need participants to have confidence in their own cognitive abilities and to take responsibility for their own independent learning, during and after the workshop.

Preparations

You can take part in the workshop in two ways, in the cloud or locally:

Option 1. Run in RStudio Cloud (recommended for inexperienced users)

Get a free account here: https://rstudio.cloud

At the time of the workshop, navigate to the project here: https://rstudio.cloud/project/1285007

That's it.

Option 2. Run in your local IDE (full user control and functionality)

Switch to this repo: https://github.com/love-borjeson/tm_ws/

License and Contact

All material is unlicensed, i.e. donated to the public domain. For R, RStudio and packages, additional licenses may apply. Please feel free to give credit where credit is due.

Responsible for this workshop: Love Börjeson, Director of KBLab at the National Library of Sweden, and Chris Haffenden, Research Coordinator at KBLab at the National Library of Sweden.

About

Topic modeling repo for RStudio Cloud IDE
