aelshehawy/text-as-data-computational-text-analysis-oxford

Text as Data: Computational Text Analysis

This is a course offered in Trinity Term 2021 at the Department of Politics and International Relations, University of Oxford.
Course Convener: Ashrakat Elshehawy

We will use Python as the programming language for this course. The course starts with a short Python refresher and then introduces text processing techniques: tokenization, stemming, lemmatization, part-of-speech tagging, and named-entity recognition. It also provides insight into methods for managing and manipulating text data in Python. We will then cover numerical representations of text, such as word embeddings, and discuss metrics of text similarity. After that, we will focus on unsupervised machine learning methods, such as clustering and topic modelling, and on supervised machine learning methods, with a focus on classification techniques and sentiment analysis.
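To give a flavour of what tokenization, numerical representation, and text similarity mean in practice, here is a minimal sketch using only the Python standard library. It is not taken from the course sheets: the function names (`tokenize`, `bag_of_words`, `cosine_similarity`) and the example sentences are hypothetical, and the regex tokenizer is a deliberately crude stand-in for the tooling a full NLP pipeline would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (crude, illustrative)."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Represent a text numerically as a mapping from token to count."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    # Only tokens present in both texts contribute to the dot product.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example sentences, made up for illustration.
doc_a = bag_of_words("The parliament debated the new climate bill.")
doc_b = bag_of_words("The climate bill was debated in parliament.")
doc_c = bag_of_words("Football fans celebrated the victory.")

print(cosine_similarity(doc_a, doc_b))  # higher: the texts share most of their words
print(cosine_similarity(doc_a, doc_c))  # lower: the texts share only "the"
```

Raw counts let frequent function words such as "the" dominate the score, which is one reason preprocessing steps and weighted or embedding-based representations are usually preferred over this baseline.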

All Python sheets used in class, exercises, exercise solutions, and slides will be updated here on a weekly basis.

Quickly Access our Python Sheets:

  • Python Refresher 1 Link
  • Python Refresher 2 Link
  • Week 1 - Introduction to NLP, Preparing Corpora, and Text Preprocessing Link
  • Week 2 - Building an NLP Pipeline, Lemmatization, Stemming, POS-Tagging, NER. An introduction to dictionary methods and the use of counting for computational text analysis Link
  • Week 3 - Vector Space Representation and Unsupervised Techniques Link
  • Week 4 - Supervised Techniques - Classification and Sentiment Analysis Link
