
Elm: Emotional Language Model for Personalized Tweets

Elm is an optimistic language model based on your favorite Twitter accounts. We trained our language model on social contributors who exemplify humor and optimism with a willingness to be vulnerable.

Business Problem

The digital social community is said to be tailored to you - but it’s filled with hate. Try as we might, we cannot escape the outrage of the attention economy. We understand that for most people, a social media exodus is not an option.

Enter Elm.

Elm is an optimistic language model based on your favorite Twitter accounts. For our first test case, we trained our language model on social contributors who exemplify humor and optimism with a willingness to be vulnerable.

The intention is for you to build a feed to your taste. To personalize your daily dose of goodness, enter your favorite Twitter accounts and topics, and Elm will provide interesting, entertaining, and heart-warming content with your growth in mind.

The Data

Data was sourced from Twitter using Twint; a sketch of the scraping step follows the account list below. The accounts were selected for their eclectic humor, inspirational quotes, and emotional intelligence.

The following accounts were sourced to fine-tune our GPT-2 language model:

  1. Simon Sinek
  2. Brene Brown
  3. Conan O'Brien
  4. Lex Fridman
  5. Tim Siedell
  6. Dalai Lama
  7. Pourmecoffee
  8. Steve Martin
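
The repository doesn't include the scraping script itself, but a minimal Twint sketch of this step might look like the following. The handles, tweet limits, and output paths are illustrative assumptions rather than the project's exact configuration.

```python
# Minimal Twint scraping sketch; handles, limits, and paths are assumptions.
import twint

accounts = ["simonsinek", "BreneBrown", "ConanOBrien", "lexfridman",
            "badbanana", "DalaiLama", "pourmecoffee", "SteveMartinToGo"]

for handle in accounts:
    c = twint.Config()
    c.Username = handle              # account to scrape
    c.Limit = 3000                   # cap tweets per account
    c.Store_csv = True               # save results as CSV
    c.Output = f"data/{handle}.csv"
    c.Hide_output = True             # keep the console quiet
    twint.run.Search(c)
```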


Methods

Elm is a language model based on the GPT-2 Simple library by Max Woolf. Before fine-tuning, the Twitter data was cleaned by removing links, emojis, and hashtags. This data was then fed into GPT-2 Simple, which has its own tokenizer for processing text data. The 355M parameter model was selected because we felt the improved performance was worth the longer training time. The model was trained on an NVIDIA GPU in Google Colab.
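
A sketch of the cleaning and fine-tuning steps is below, using the gpt-2-simple API. The file names, regex patterns, and run settings are assumptions; the 355M model choice and the 10,000 training steps come from this write-up.

```python
# Sketch of tweet cleaning and GPT-2 fine-tuning with gpt-2-simple.
# File names, regex patterns, and run settings are assumptions.
import re
import gpt_2_simple as gpt2

def clean_tweet(text):
    text = re.sub(r"https?://\S+", "", text)         # remove links
    text = re.sub(r"#\w+", "", text)                 # remove hashtags
    text = text.encode("ascii", "ignore").decode()   # crude emoji removal
    return " ".join(text.split())

with open("tweets_raw.txt") as f_in, open("tweets_clean.txt", "w") as f_out:
    for line in f_in:
        cleaned = clean_tweet(line)
        if cleaned:
            f_out.write(cleaned + "\n")

gpt2.download_gpt2(model_name="355M")   # fetch the pretrained 355M weights
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="tweets_clean.txt",
              model_name="355M",
              steps=10000,              # training length used in this project
              run_name="elm",
              save_every=1000,
              sample_every=1000)
```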

After training for 10,000 steps, we generated 100,000 tweets with a temperature setting of 0.8. We then performed extensive EDA on the real and generated tweets to determine similarities and differences.
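
The generation step with gpt-2-simple might look like this; the batch size and sample length are assumptions, while the 0.8 temperature and 100,000-tweet total come from above.

```python
# Sketch of bulk generation from the fine-tuned checkpoint.
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="elm")     # reload the fine-tuned model

gpt2.generate_to_file(sess,
                      run_name="elm",
                      destination_path="generated_tweets.txt",
                      nsamples=100000,   # total generated tweets
                      batch_size=20,     # samples generated in parallel
                      length=70,         # roughly tweet-length output
                      temperature=0.8)
```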

Normalized Top Words - Real vs. Generated Tweets:
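
The comparison above could be reproduced with a sketch along these lines, assuming the real and generated tweets are stored one per line in the files used earlier.

```python
# Sketch of a normalized top-word comparison between real and generated tweets.
from collections import Counter

def normalized_top_words(tweets, n=20):
    words = [w.lower() for t in tweets for w in t.split()]
    counts = Counter(words)
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(n)]

real_tweets = open("tweets_clean.txt").read().splitlines()
gen_tweets = open("generated_tweets.txt").read().splitlines()

print(normalized_top_words(real_tweets))
print(normalized_top_words(gen_tweets))
```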

Validation

Following EDA, we decided to use an ad hoc classification method to validate performance. The idea is that if the classifiers perform at or near 50%, our generated tweets are good replicas of the real thing. For the classification modeling, we randomly sampled 15,000 tweets each from the real and generated datasets for a combined set of 30,000. We tested both Naive Bayes and Random Forest classifiers.
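
A sketch of this validation setup is below; the 15,000-per-class sample and the two classifier families come from above, while the vectorizer and split sizes are assumptions.

```python
# Sketch of the ad hoc real-vs-generated classification check.
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

real = random.sample(open("tweets_clean.txt").read().splitlines(), 15000)
fake = random.sample(open("generated_tweets.txt").read().splitlines(), 15000)

texts = real + fake
labels = [1] * len(real) + [0] * len(fake)        # 1 = real, 0 = generated

vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
X = vectorizer.fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=42, stratify=labels)

for clf in (MultinomialNB(), RandomForestClassifier(random_state=42)):
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{clf.__class__.__name__}: {acc:.2%}")  # ~50% would mean indistinguishable
```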

We were pleased with a classification accuracy of 62.52% for the random forest classifier after tuning via grid search. The pre-tuned classifiers overfit the training set, performing at or near perfect accuracy. After examining the feature importances, we realized this is likely because certain words appear in one of the corpora (real or generated) but not the other (e.g., emptywheel).
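
Continuing the sketch above, the tuning and feature-importance check could look like the following; the parameter grid is an assumption.

```python
# Sketch of grid-search tuning and feature-importance inspection
# (reuses X_train, y_train, and vectorizer from the previous sketch).
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [10, 30, None],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=3, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)

best_rf = search.best_estimator_
vocab = vectorizer.get_feature_names_out()
top_features = sorted(zip(best_rf.feature_importances_, vocab), reverse=True)[:20]
print(top_features)   # words unique to one corpus (e.g. "emptywheel") rank highly
```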

Random Forest Classifier: Feature Importance



We then dove into a comparison of correctly classified versus misclassified data to understand the differences.

As you can see, the correctly and incorrectly classified tweets show similar top word frequencies. Tweets about Donald Trump are overrepresented in the generated tweets. Topic modeling through LDA showed that the political (especially Trump-centric) topic was larger in the generated tweets.
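
A sketch of the LDA comparison, using sklearn's LatentDirichletAllocation; the topic count and vectorizer settings are assumptions.

```python
# Sketch of LDA topic modeling over the real and generated corpora.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def top_topic_words(tweets, n_topics=5, n_words=10):
    vec = CountVectorizer(max_features=5000, stop_words="english")
    dtm = vec.fit_transform(tweets)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=42)
    lda.fit(dtm)
    vocab = vec.get_feature_names_out()
    return [[vocab[i] for i in topic.argsort()[::-1][:n_words]]
            for topic in lda.components_]

real_tweets = open("tweets_clean.txt").read().splitlines()
gen_tweets = open("generated_tweets.txt").read().splitlines()

print(top_topic_words(real_tweets))   # per the write-up, one topic is political
print(top_topic_words(gen_tweets))    # the political/Trump topic is larger here
```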

Findings

As mentioned earlier, the classifier performed 12.52 percentage points better than random guessing (62.52% vs. 50%). Though this shows there are differences between the real and generated tweets, those differences stem from specific cases rather than from the breadth of the generated tweets.

GPT-2 is known to fall into repetitive ruts, which you can see happen by scrolling through the 100,000 tweets. There is still manual work necessary to comb through and select the best ones. To some extent, this could be automated by keeping only the "viable tweets" that pass the classification test (i.e., generated tweets the classifier mislabels as real).
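
A sketch of that filter, continuing from the validation sketches above: keep only the generated tweets the tuned classifier mislabels as real.

```python
# Sketch of the "viable tweet" filter
# (reuses vectorizer, best_rf, and fake from the sketches above).
gen_vectors = vectorizer.transform(fake)
keep_mask = best_rf.predict(gen_vectors) == 1   # 1 = classified as "real"
viable_tweets = [t for t, keep in zip(fake, keep_mask) if keep]
print(f"{len(viable_tweets)} of {len(fake)} generated tweets pass the filter")
```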

Enjoy some of the examples below!

Conclusion

Our goal was to create an optimistic and witty language model to produce tweets catered to our preferences. Though there were some goofy tweets (as GPT-2 is prone to produce), we were very happy with the overall performance. You can see when the model got into a "groove," writing wittier or more Dalai Lama-esque tweets. Though the model may not possess consciousness, it wrote some inspirational messages unique to the personalities it was trained on.

As a next step, we'll see how users react to these generated tweets. Getting access to GPT-3 would obviously improve performance as well.
