Practical Approaches to Data Science with Text
Emory University / QTM 340 / Fall 2019
What does it mean to turn text into data? What data science techniques are commonly used to analyze text? How are they applied in the humanities and social sciences? How are they applied in the world? This course explores these questions by focusing on how existing methods of text analysis can be used in new and creative ways. These methods include text parsing, natural language processing, language models, and vector space models, as well as statistical approaches including cluster analysis and supervised and unsupervised learning. Contemporary topics including data ethics, data justice, and issues with “humans in the loop” are also discussed. Introductory courses in computer science and in probability and statistics are recommended as prerequisites for QTM 340. All class exercises and homework assignments are done in Python. Students are expected to participate in class discussion and present their final projects at the end of the semester. Some short writing assignments are also required.