- wikipedia-extractor 154 This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wikiextractor --- Extracts and cleans text from Wikipedia datab…
- causeofwhy 46 The goal of this project is to implement a Question Answering (QA) system that answers causal-type questions. We use Wikipedia as a knowledge base, extracting answers to user questions from the art…
- twitter-corpus 15 Collects all tweets from the public sample stream using Twitter's streaming API, and saves them to a file for later use as a corpus.
- infertweet 9 Infer information from Tweets. Useful for human-centered computing tasks, such as sentiment analysis, location prediction, authorship profiling and more!
- haikupy 4 An English-language haiku generator that uses the 5-7-5 syllable pattern.