Repository for DATS 6450 - Modern Natural Language Processing
Monday 6:10PM - 8:40PM, 1957 E St, Rm 112
Thursday 6:10PM - 8:40PM, Monroe Hall, Rm 115
Instructor Name: Tim Stacey
Office hours: Wednesday, 5:30-7:30, Corcoran Hall, Suite 426a
Teaching Assistant: TBA
Natural Language Processing (NLP) is a rapidly developing field that has become crucial as many struggle to keep up with the deluge of text data created on the web. Text mining, as it is sometimes called, has taken many forms over the last ten years, and the field has undergone a seismic shift in methodology as deep learning and neural networks have become ubiquitous.
As a result of completing this course, students will be able to:
- Understand how to apply natural language processing to real-world problems
- Understand the basics of neural network architecture for NLP
- Build a neural network to execute a variety of NLP-related tasks
- Gain experience with popular Python libraries for both NLP and deep learning
You are expected to have a basic understanding of Python, linear algebra, and probability.
This schedule is tentative and subject to change. Please check here regularly for updates.
Week Number | Topics | Readings | Assignments |
---|---|---|---|
1 | Introduction, regex, tf-idf | SLP: Chapters 2, 15 | Movie review classification (Due on Lecture 4) |
2 | Using tf-idf for classification and clustering | SLP: Chapter 15 | |
3 | Introduction to TensorFlow, logistic regression review | http://cs231n.github.io/optimization-1/ | |
4 | Feed forward networks (Quiz 1) | http://cs231n.github.io/neural-networks-1/, http://cs231n.github.io/linear-classify/ | |
5 | Word2vec, GloVe representations | SLP: Chapter 16, https://www.tensorflow.org/versions/r0.12/tutorials/word2vec/ | Word embedding assignment (Due on Lecture 8) |
6 | Classification, other word representations | | |
7 | Recurrent Neural Networks, LSTM (Quiz 2) | This series is probably the best reference: http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/ | Project 1 assigned |
8 | Dependency Parsing | SLP: Chapter 10 | |
9 | Language models (Quiz 3) | | |
10 | Seq2Seq, Attention | | |
11 | Translation | | Project 1 due |
12 | Convolutional Neural Networks (Quiz 4) | | Project 2 assigned |
13 | QA Systems I | | |
14 | QA Systems II (Quiz 5) | | Project 2 due during finals week |
Class slack channel - https://gwunlp.slack.com
Textbook - Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin https://web.stanford.edu/~jurafsky/slp3/
A great introduction to machine learning: http://www-bcf.usc.edu/~gareth/ISL/
NLP newsletter by Sebastian Ruder - http://newsletter.ruder.io
Lab41 - https://gab41.lab41.org/
Data science links - www.datatau.com
Assignments will be used to check learning and give feedback on areas for improvement. Reading prior to class, class attendance, and participation in activities are essential for success on this part of the course.
Details on requirements will be given during class periods. Most assignments will be due the next class period and can be submitted via Blackboard. We will work to provide feedback at the next class session.
Two group projects will be assigned over the semester to give students practice applying NLP principles and methods to various problems. Students will also build teamwork, communication, and technical skills.
Five "quizzes" will be given to ensure students are gaining knowledge. "Quizzes" is in quotation marks because these will be mini-assignments that can be completed in class, potentially with help from outside resources.
Your final grade will be determined by:
- Assignments (25%)
- Quizzes (25%)
- Project I (25%)
- Project II (25%)

Letter grades are assigned on the following scale:

- 93-100 A
- 90-92 A-
- 87-89 B+
- 83-86 B
- 80-82 B-
- 77-79 C+
- 73-76 C
- 70-72 C-
- <70 F
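
The weighting and letter scale above can be sketched as a short calculation. This is only an illustration, not an official grading tool: the function names and example scores are hypothetical, and the rounding rule (round half up to a whole percent before looking up the letter) is an assumption, since the scale does not say how fractional scores such as 92.5 are handled.

```python
import math

# Component weights as listed in the syllabus (fractions of the final grade).
WEIGHTS = {
    "assignments": 0.25,
    "quizzes": 0.25,
    "project_1": 0.25,
    "project_2": 0.25,
}

# Lower bound of each letter band from the syllabus scale, highest first.
LETTER_BANDS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
]


def final_percentage(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage to a letter using the syllabus bands."""
    # Assumption (not stated in the syllabus): round half up to a whole percent.
    rounded = math.floor(percentage + 0.5)
    for lower_bound, letter in LETTER_BANDS:
        if rounded >= lower_bound:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example_scores = {"assignments": 92, "quizzes": 85, "project_1": 88, "project_2": 95}
total = final_percentage(example_scores)
print(total, letter_grade(total))  # 90.0 A-
```

Because every component carries the same 25% weight, the final percentage here is simply the mean of the four component scores.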
In accordance with University policy, students should notify faculty during the first week of the semester of their intention to be absent from class on their day(s) of religious observance. For details and policy, see: students.gwu.edu/accommodations-religious-holidays.
Academic dishonesty is defined as cheating of any kind, including misrepresenting one's own work, taking credit for the work of others without crediting them and without appropriate authorization, and the fabrication of information. For details and complete code, see: studentconduct.gwu.edu/code-academic-integrity
In the case of an emergency, if at all possible, the class should shelter in place. If the building that the class is in is affected, follow the evacuation procedures for the building. After evacuation, seek shelter at a predetermined rendezvous location.
DISABILITY SUPPORT SERVICES (DSS): Any student who may need an accommodation based on the potential impact of a disability should contact the Disability Support Services office at 202-994-8250 in Rome Hall, Suite 102, to establish eligibility and to coordinate reasonable accommodations. For additional information see: disabilitysupport.gwu.edu/
MENTAL HEALTH SERVICES (202-994-5300): The University's Mental Health Services offers 24/7 assistance and referral to address students' personal, social, career, and study skills problems. Services for students include crisis and emergency mental health consultations, confidential assessment, counseling services (individual and small group), and referrals. For additional information see: counselingcenter.gwu.edu/