Skip to content

Materials based on Jacob Eisenstein's NLP course at Georgia Tech

Notifications You must be signed in to change notification settings

xhan77/uw-cse447-a1-public

Repository files navigation

Assignment 1

This assignment is designed to be completed in 3 weeks. Please start early!

The entry point and spec for this project is pset1.ipynb. It is a Jupyter notebook and we recommend installing JupyterLab to run it. You can find a preview of the notebook by clicking the file on GitLab in a browser. To setup the computing resources and environments, see this document.

Due: April 22, 11:59 PM PST

Summary: In this problem set, you will build a system for automatically classifying song lyrics comments by era. You will:

  • Do some basic text processing, tokenizing your input and converting it into a bag-of-words representation
  • Build a machine learning classifier based on the generative model, using Naive Bayes
  • Evaluate your classifiers and examine what they have learned
  • Build a machine learning classifier based on the discriminative model, using Perceptron
  • Build a logistic regression classifier using PyTorch
  • Implement techniques to improve your classifier

Submission: To submit this assignment, please refer to submission_guide.md for further details. Please don't forget to tag your submission. Failing to do so may result in a zero or some deductions on your project.

Late Policy: Each student will be granted 5 late days to use over the duration of the semester. You can use a maximum of 3 late days on any one project. Weekends and holidays are also counted as late days. Late submissions are automatically considered as using late days. Using late days will not affect your grade. However, projects submitted late after all late days have been used will receive no credit. Be careful!

Academic honesty: Homework assignments are to be completed individually and you should not share your code. E.g., it is not allowed to fork the public GitHub repo and push your progress there since others may see it. Verbal collaboration on homework assignments is acceptable, as well as re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. Suspected violations of academic integrity rules will be handled in accordance with the UW guidelines on Student Conduct.

About

Materials based on Jacob Eisenstein's NLP course at Georgia Tech

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published