Skip to content

Zbebbington/CS585

Repository files navigation

CS585

CODE THAT WE WROTE:

analyze.py

Contains possible useful functions for analyzing tweets. i.e. to lowercase, word counts, removes empty tweets, get twitter attributes, etc.

most_freq_plt.py

Test code for getting bag of words data

perplexity.py

A modification of the IBM article ngram model. used for getting the perplexity for the ngram models. Put into a folder in the NGRAM models with tweets.txt and ngram count JSON files. Run for designated ngram in the code. (default 7)

retrieveTweets.py

Code that helped getting the tweet json file into a text file or csv.

tweet_downloader

Code we wrote using the twitter API to download tweets from users Prunes out retweets. Max 3200 tweets. stores in tweets folder. The tweets that we used are in the tweets_to_use folder in the root.


Models we used:

PytorchCharRNN source in folder

TextGenRNN source in folder

Tweet Generator Markov source in folder

bidirectional lstm - too big to upload onto github.

ngram-markov-chars-words https://developer.ibm.com/articles/cc-patterns-artificial-intelligence-part3/


notes: ignore ngram-markov (just a copy of the words folder)

some of the models worked as is with our Twitter data. others we made slight modifications so that we could feed it tweets. individual readmes will be in model folders

Python 3.6 for the ngram ones. Breaks on 3.7 because of an iteration issue.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •