The following scripts explore different aspects of Natural Language Processing using the transcripts of The Office (US) series. They were created for educational purposes. The dataset used is The Office (US) - Complete Dialogue/Transcript.
- Generate a word cloud for a character using the wordcloud library.
- Get a character's n-grams from his/her lines.
- Analyse the sentiment a character expresses in The Office using the VADER (Valence Aware Dictionary and sEntiment Reasoner) library. Sentiment polarity (positive/negative) is quantified on a scale from -1 to 1.
- Analyse the sentiment a character expresses in The Office using the TextBlob library. Sentiment polarity (positive/negative) is quantified on a scale from -1 to 1, and subjectivity (objective/subjective) on a scale from 0 to 1.
- Explore the affinity between characters based on the sentiment analysis of their lines in the scenes where they appear together.
- Python 3.8.5
python nubePalabras.py Pam
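The internals of `nubePalabras.py` are not shown here. As a rough sketch of the counting step that usually sits behind a word cloud, the snippet below tallies word frequencies in a character's lines. The function name `word_frequencies`, the tiny stop-word list, and the sample lines are all assumptions made for illustration and are not taken from the script or the dataset.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller one).
STOP_WORDS = {"a", "an", "and", "i", "is", "it", "of", "the", "to", "you"}


def word_frequencies(lines, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears in a character's lines."""
    counts = Counter()
    for line in lines:
        # Lowercase and keep runs of letters/apostrophes as words.
        for word in re.findall(r"[a-z']+", line.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


# Made-up lines, not taken from the dataset.
sample_lines = ["I love the beach.", "The beach is the best, and I love it."]
frequencies = word_frequencies(sample_lines)
print(frequencies.most_common(3))
```

A mapping like `frequencies` can be handed to the wordcloud library's `WordCloud.generate_from_frequencies`; the script may equally well pass the raw text and let the library do the counting.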
python ngram_TheOffice.py Jim
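How `ngram_TheOffice.py` builds its n-grams is not shown here. A minimal sketch of the idea, assuming a simple regex tokenizer and counting n-grams within each line so that they never span two lines, could look like this. The names `tokenize` and `character_ngrams` and the sample lines are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def character_ngrams(lines, n=2):
    """Count n-grams across a character's lines, never spanning two lines."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        # Slide a window of size n over the tokens of this line.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Made-up lines, not taken from the dataset.
sample_lines = [
    "That is what she said.",
    "That is not what I said!",
]
bigrams = character_ngrams(sample_lines, n=2)
print(bigrams.most_common(3))
```

Changing `n` to 3 yields trigrams; lines shorter than `n` tokens simply contribute nothing.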
python SentimentAnalysis_TextBlob.py
python SentimentAnalysis_Vader.py
Plot the polarity and subjectivity obtained from Pam's and Jim's lines in the scenes where both appear.
python afinidadPersonajes.py
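The exact affinity measure used by `afinidadPersonajes.py` is not described here. One possible reading, sketched below under stated assumptions, is to keep only the scenes where both characters speak and average each character's polarity over those scenes. The input format (tuples of scene id, speaker, polarity), the function name `shared_scene_polarity`, and the scores are hypothetical; the polarity values are assumed to have been computed beforehand with VADER or TextBlob.

```python
from collections import defaultdict
from statistics import mean


def shared_scene_polarity(rows, char_a, char_b):
    """Return {character: mean polarity} over scenes where both characters speak.

    rows: iterable of (scene_id, speaker, polarity) tuples, polarity in [-1, 1].
    Returns None when the two characters never share a scene.
    """
    # Group polarity scores by scene, then by speaker.
    by_scene = defaultdict(lambda: defaultdict(list))
    for scene_id, speaker, polarity in rows:
        by_scene[scene_id][speaker].append(polarity)

    # Keep only the scores from scenes in which both characters have lines.
    scores = {char_a: [], char_b: []}
    for speakers in by_scene.values():
        if char_a in speakers and char_b in speakers:
            scores[char_a].extend(speakers[char_a])
            scores[char_b].extend(speakers[char_b])

    if not scores[char_a]:
        return None
    return {name: mean(values) for name, values in scores.items()}


# Made-up scores for illustration, not results from the dataset.
rows = [
    (1, "Pam", 0.5), (1, "Jim", 0.25), (1, "Pam", 0.0),
    (2, "Pam", -0.5), (2, "Dwight", 0.0),
    (3, "Jim", 0.75), (3, "Pam", 1.0),
]
result = shared_scene_polarity(rows, "Pam", "Jim")
print(result)
```

The same grouping works for TextBlob subjectivity by passing subjectivity values in place of polarity.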