CC6205 - Natural Language Processing

This is a course on natural language processing.


This course aims to provide a comprehensive introduction to Natural Language Processing (NLP) by covering essential concepts. We strive to strike a balance between traditional techniques, such as N-gram language models, Naive Bayes, and Hidden Markov Models (HMMs), and modern deep neural networks, including word embeddings, recurrent neural networks (RNNs), and transformers.

The course material draws from various sources. In many instances, sentences from these sources are directly incorporated into the slides. The neural network topics primarily rely on the book Neural Network Methods for Natural Language Processing by Goldberg. Non-neural network topics, such as Probabilistic Language Models, Naive Bayes, and HMMs, are sourced from Michael Collins' course and Dan Jurafsky's book. Additionally, some slides are adapted from online tutorials and other courses, such as Manning's Stanford course.


  1. Introduction to Natural Language Processing | (tex source file), video 1, video 2
  2. Vector Space Model and Information Retrieval | (tex source file), video 1, video 2
  3. Probabilistic Language Models | (tex source file), notes, video 1, video 2, video 3, video 4
  4. Text Classification and Naive Bayes | (tex source file), notes, video 1, video 2, video 3
  5. Linear Models | (tex source file), video 1, video 2, video 3, video 4
  6. Neural Networks | (tex source file), video 1, video 2, video 3, video 4
  7. Word Vectors | (tex source file), video 1, video 2, video 3
  8. Sequence Labeling and Hidden Markov Models | (tex source file), notes, video 1, video 2, video 3, video 4
  9. MEMMs and CRFs | (tex source file), notes 1, notes 2, video 1, video 2, video 3 (optional)
  10. Convolutional Neural Networks | (tex source file), video
  11. Recurrent Neural Networks | (tex source file), video 1, video 2
  12. Sequence to Sequence Models and Attention | (tex source file), video 1, video 2
  13. Transformer Architecture | (tex source file), video 1
  14. Contextualized Embeddings and Large Language Models | video 1, video 2, video 3
  15. Large Language Models Usage and Evaluation Patterns | video

NLP Libraries and Tools

  1. NLTK: Natural Language Toolkit
  2. Gensim
  3. spaCy: Industrial-strength NLP
  4. Torchtext
  5. AllenNLP: Open source project for designing deep learning-based NLP models
  6. HuggingFace Transformers
  7. ChatGPT
  8. Google Bard
  9. Stanza - A Python NLP Library for Many Human Languages
  10. FlairNLP: A very simple framework for state-of-the-art Natural Language Processing (NLP)
  11. WEFE: The Word Embeddings Fairness Evaluation Framework
  12. WhatLies: A library that tries to help you understand "What lies in word embeddings?"
  13. LASER: a library to calculate and use multilingual sentence embeddings
  14. Sentence Transformers: Multilingual Sentence Embeddings using BERT / RoBERTa / XLM-RoBERTa & Co. with PyTorch
  15. Datasets: a lightweight library with one-line dataloaders for many public datasets in NLP
  16. RiverText: A Python Library for Training and Evaluating Incremental Word Embeddings from Text Data Streams

Notes and Books

  1. Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin.
  2. Michael Collins' NLP notes.
  3. A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg.
  4. Natural Language Understanding with Distributed Representation by Kyunghyun Cho
  5. A Survey of Large Language Models
  6. Natural Language Processing Book by Jacob Eisenstein
  7. NLTK book
  8. Embeddings in Natural Language Processing by Mohammad Taher Pilehvar and Jose Camacho-Collados
  9. Dive into Deep Learning Book
  10. Contextual Word Representations: A Contextual Introduction by Noah A. Smith

Other NLP Courses

  1. CS224n: Natural Language Processing with Deep Learning, Stanford course
  2. Deep Learning in NLP: slides by Horacio Rodríguez
  3. David Bamman NLP Slides @Berkeley
  4. CS 521: Statistical Natural Language Processing by Natalie Parde, University of Illinois
  5. 10 Free Top Notch Natural Language Processing Courses

Videos

  1. Natural Language Processing MOOC videos by Dan Jurafsky and Chris Manning, 2012
  2. Natural Language Processing MOOC videos by Michael Collins, 2013
  3. Natural Language Processing with Deep Learning by Chris Manning and Richard Socher, 2017
  4. CS224N: Natural Language Processing with Deep Learning | Winter 2019
  5. Computational Linguistics I by Jordan Boyd-Graber, University of Maryland
  6. Visualizing and Understanding Recurrent Networks
  7. BERT Research Series by Chris McCormick
  8. Successes and Challenges in Neural Models for Speech and Language - Michael Collins
  9. More on Transformers: BERT and Friends by Jorge Pérez

Other Resources

  1. ACL Portal
  2. Awesome-nlp: A curated list of resources dedicated to Natural Language Processing
  3. NLP-progress: Repository to track the progress in Natural Language Processing (NLP)
  4. Corpora Mailing List
  5. 🤗 Open LLM Leaderboard
  6. Real World NLP Book: AllenNLP tutorials
  7. The Illustrated Transformer: a very illustrative blog post about the Transformer
  8. Better Language Models and Their Implications, OpenAI Blog
  9. Understanding LoRA and QLoRA — The Powerhouses of Efficient Finetuning in Large Language Models
  10. RNN effectiveness
  11. SuperGLUE: a benchmark of Natural Language Understanding Tasks
  12. decaNLP (The Natural Language Decathlon): a benchmark for studying general NLP models that can perform a variety of complex, natural language tasks
  13. Chatbot and Related Research Paper Notes with Images
  14. Ben Trevett's torchtext tutorials
  15. PLMpapers: a collection of papers about Pre-Trained Language Models
  16. The Illustrated GPT-2 (Visualizing Transformer Language Models)
  17. Linguistics, NLP, and Interdisciplinarity Or: Look at Your Data, by Emily M. Bender
  18. The State of NLP Literature: Part I, by Saif Mohammad
  19. From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
  20. 10 ML & NLP Research Highlights of 2019 by Sebastian Ruder
  21. Towards a Conversational Agent that Can Chat About…Anything
  22. The Super Duper NLP Repo: a collection of Colab notebooks covering a wide array of NLP task implementations
  23. The Big Bad NLP Database, a collection of nearly 300 well-organized, sortable, and searchable natural language processing datasets
  24. A Primer in BERTology: What we know about how BERT works
  25. How Self-Attention with Relative Position Representations works
  26. Deep Learning Based Text Classification: A Comprehensive Review
  27. Teaching NLP is quite depressing, and I don't know how to do it well by Yoav Goldberg
  28. The NLP index
  29. 100 Must-Read NLP Papers