This repository contains some of the homework assignments and projects I worked on for my NLP class.
HW1 implements cleaning and pre-processing methods on movie lines to extract the dialogue, tokenize the text, and count unigrams, bigrams, and trigrams.
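
As a rough illustration of the tokenize-and-count step, here is a minimal sketch using only the Python standard library. It is not the code from HW1: the function names `tokenize` and `count_ngrams`, the regex-based tokenizer, and the sample line are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase the text and keep runs of
    # letters, digits, and apostrophes as tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def count_ngrams(tokens, n):
    # Slide a window of size n over the tokens and count each n-gram tuple.
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Made-up sample line, standing in for an already-extracted dialogue line.
line = "I'll be back. I'll be there."
tokens = tokenize(line)
unigrams = count_ngrams(tokens, 1)
bigrams = count_ngrams(tokens, 2)
trigrams = count_ngrams(tokens, 3)

print(bigrams.most_common(1))
```

Each n-gram is stored as a tuple of tokens, so the same function covers unigrams, bigrams, and trigrams by changing `n`.
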
HW3 uses BERT, a neural network language model, to classify movies into genres based on their plots.
HW4 explores the generative capabilities of GPT2 to create sentences.
In the final project, I use GPT2 as a summarizer to generate titles for research papers from their abstracts. This assumes that a title is just a condensed summary of the abstract.