alex-bogatu/guess_the_FRIEND

guess_the_FRIEND

FRIENDS quote prediction using FRIENDS embeddings (a Udacity Data Science Nanodegree project).

Installation

Apart from the common NumPy and pandas libraries, the project relies on Python 3.* and:

Data

The data comes from here and contains more than 90K FRIENDS lines from all 10 seasons.

Project Overview

Starting from this data, conveniently curated and merged into a single CSV file here, the project uses part of the available lines to create an LSTM-based language model for each FRIENDS character, starting from pre-trained FastText word embeddings. With these models in hand, a classification model is then created that is able to accurately predict the friend who said a given line.
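The two-stage idea above (one language model per character, then a classifier built on top of them) can be illustrated with a minimal sketch. This is not the notebook's code: add-one smoothed unigram models stand in for the LSTM language models, the classifier is assumed to be a simple "pick the highest-scoring model" rule, and the function names and toy lines are made up for illustration.

```python
import math
from collections import Counter


def train_unigram_model(lines):
    """Count word frequencies over one character's lines (stand-in for an LSTM LM)."""
    counts = Counter(word for line in lines for word in line.lower().split())
    return counts, sum(counts.values())


def log_likelihood(model, line, vocab_size):
    """Add-one smoothed log-probability of a line under one character's model."""
    counts, total = model
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in line.lower().split()
    )


def guess_the_friend(models, line, vocab_size):
    """Return the character whose model assigns the line the highest score."""
    return max(models, key=lambda name: log_likelihood(models[name], line, vocab_size))


# Made-up toy lines, not taken from the dataset.
toy_lines = {
    "Joey": ["how you doing", "i love pizza and sandwiches"],
    "Ross": ["we were on a break", "i love dinosaurs"],
}
models = {name: train_unigram_model(lines) for name, lines in toy_lines.items()}
vocab_size = len(
    {word for lines in toy_lines.values() for line in lines for word in line.split()}
)
prediction = guess_the_friend(models, "pizza and sandwiches", vocab_size)
```

In the actual project the per-character scores would come from the LSTM models; a trained classifier could also consume those scores as features instead of taking the maximum directly.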

A more detailed description of the project can be found here.

Available files

  • Pre-trained language models for each FRIENDS character trained with quotes from Seasons 1-8.
  • Jupyter Notebook with code for downloading the data, creating the above-mentioned models, and training and testing a quote classification model on data from Seasons 9 and 10.
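The season-based split described in the list above (language models built from Seasons 1-8, classifier data from Seasons 9 and 10) might look roughly like the sketch below. The column names `season`, `speaker`, and `line` and the sample rows are assumptions for illustration; the curated CSV may be laid out differently.

```python
import csv
import io

# Hypothetical column names and made-up placeholder rows.
sample_csv = io.StringIO(
    "season,speaker,line\n"
    "1,Monica,placeholder line a\n"
    "8,Chandler,placeholder line b\n"
    "9,Phoebe,placeholder line c\n"
    "10,Rachel,placeholder line d\n"
)


def split_by_season(rows, last_train_season=8):
    """Send rows up to last_train_season to the first list, later seasons to the second."""
    early, late = [], []
    for row in rows:
        target = early if int(row["season"]) <= last_train_season else late
        target.append(row)
    return early, late


train_rows, test_rows = split_by_season(csv.DictReader(sample_csv))
```

Splitting by season rather than at random keeps every later-season line out of the language models' training data.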

Instructions for running the project:

Pre-trained FRIENDS embeddings for each character are provided, but code is also available to train your own, possibly with a different architecture. Running the entire notebook will download the FastText vectors, create the language models, and train the quote classifier.
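For readers who want to inspect word vectors outside the notebook, here is a minimal sketch of a reader for fastText's text (`.vec`) format, which starts with a `count dim` header line followed by one word and its values per line. It assumes the vectors are in that text format; the `load_vec` helper, the `wanted` filter, and the toy data are illustrative and not part of the repository.

```python
import io


def load_vec(handle, wanted=None):
    """Parse fastText text-format vectors, optionally keeping only words in `wanted`."""
    # Header line: number of words and vector dimension.
    _count, dim = map(int, handle.readline().split())
    vectors = {}
    for raw in handle:
        parts = raw.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if wanted is not None and word not in wanted:
            continue
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {word!r}, got {len(values)}")
        vectors[word] = [float(value) for value in values]
    return vectors, dim


# Toy two-dimensional vectors, made up for illustration.
toy_vec = io.StringIO("3 2\nhello 0.1 0.2\nfriends 0.3 0.4\nbreak 0.5 0.6\n")
vectors, dim = load_vec(toy_vec, wanted={"friends", "break"})
```

Filtering to the corpus vocabulary while reading keeps memory use low, since full pre-trained vector files are large.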

Licensing, Authors, Acknowledgements

Feel free to use the code here as you would like! Thank you shilpibhattacharyya for making the curated data available.
