MediaTakeOut Headline Generator from Robert Elwell

I love MediaTakeOut, so I decided to make a language model for it. I ain't copyrighting shit; all this open-source software is legit. I don't mind if you look around the code a little bit.

mto-scrape.py
Run this first. It grabs headlines from MediaTakeOut. It's really that easy. Takes less than an hour.

mto-analyze.py
Run this to get some statistics about what you pulled. It also generates the sentences.prepped file, which you need to run your language model.

mto-languagemodel.py
This actually builds out the fake headlines. Because I'm lazy, I'm using NLTK's Text and NGram language model classes, and I'm just redirecting stdout to a file to get my sentences. Options are provided in the script, so take a look.

generated.txt
Here's an example of generated headlines. Take a look at it in action at http://robertelwell.info/mediatakeout-headline-generator/

Requires NLTK with the punkt and stopwords data packages downloaded.
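
For a rough idea of what an n-gram headline generator like mto-languagemodel.py does, here is a minimal sketch in plain Python. It is not the code from this repo and it does not use NLTK; the function names (build_model, generate), the START/END markers, and the toy input strings are all made up for illustration. It assumes headlines are already split into whitespace-separated tokens, one headline per string, and that n is at least 2.

```python
import random
from collections import defaultdict

# Hypothetical boundary markers, not taken from the repo.
START, END = "<s>", "</s>"


def build_model(headlines, n=3):
    """Map each (n-1)-word context to the list of words seen right after it."""
    model = defaultdict(list)
    for headline in headlines:
        # Pad so the first real word has a full context and the end is marked.
        tokens = [START] * (n - 1) + headline.split() + [END]
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            model[context].append(tokens[i + n - 1])
    return model


def generate(model, n=3, max_words=30, rng=random):
    """Sample one headline by repeatedly picking a follower of the current context."""
    context = (START,) * (n - 1)
    words = []
    while len(words) < max_words:
        followers = model.get(context)
        if not followers:
            break  # unseen context (for example, an empty model)
        # Duplicates in the list make frequent followers more likely.
        word = rng.choice(followers)
        if word == END:
            break
        words.append(word)
        context = context[1:] + (word,)
    return " ".join(words)


if __name__ == "__main__":
    # Toy placeholder input, not real scraped data.
    toy_headlines = [
        "the cat sat on the mat",
        "the cat ran to the door",
        "the dog sat on the rug",
    ]
    model = build_model(toy_headlines, n=3)
    for _ in range(3):
        print(generate(model, n=3))
```

Writing the printed lines to a file is then just a shell redirect, for example `python sketch.py > generated.txt`, where sketch.py is whatever you named the file. A larger n copies longer runs from the source headlines; a smaller n mixes them more freely.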