Deep Learning Game of Thrones Subtitles
Data is coming. Game of Thrones (GOT) keeps getting more interesting, and I thought a machine should not miss out on all the fun. So here's how I taught a machine the Game of Phrases:
- Scraped the subtitle (.srt) files for every episode up to and including Season 5 from the internet.
- Lords of Text Analytics processed all the text to make some sense out of it.
- One True Algorithm was implemented to teach Mr. Machine all he should know.
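The text-processing step above can be sketched roughly as follows. This is only an illustration of how .srt files are commonly cleaned, not necessarily what this post did: the helper names `srt_to_lines` and `tokenize` are hypothetical, and the sketch assumes the usual .srt layout of a cue number, a timing line, and then one or more lines of dialogue.

```python
import re

# Matches SRT timing lines such as "00:01:15,200 --> 00:01:17,900".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)
# Matches inline markup such as <i>...</i>.
TAG = re.compile(r"<[^>]+>")


def srt_to_lines(srt_text):
    """Return the dialogue lines of an .srt file (hypothetical helper).

    Cue numbers, timing lines, blank lines, and inline tags are dropped.
    Caveat: a dialogue line consisting only of digits is dropped as well.
    """
    lines = []
    for raw in srt_text.splitlines():
        line = raw.lstrip("\ufeff").strip()  # remove a BOM and padding
        if not line or line.isdigit() or TIMING.match(line):
            continue
        line = TAG.sub("", line).strip()
        if line:
            lines.append(line)
    return lines


def tokenize(line):
    """Lowercase a line and split it into words, keeping inner apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", line.lower())


sample = (
    "1\n"
    "00:00:01,000 --> 00:00:03,000\n"
    "<i>Winter is coming.</i>\n"
    "\n"
    "2\n"
    "00:00:04,000 --> 00:00:06,000\n"
    "You know nothing,\n"
    "Jon Snow.\n"
)
dialogue = srt_to_lines(sample)
tokens = [tokenize(line) for line in dialogue]
```

Each entry in `tokens` is a list of lowercase words, which is the shape most word-embedding tools expect as input.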
I was itching to use word2vec on some dataset, and this looked like the perfect match. Here's the Wikipedia definition: "Word2vec is a group of related models that are used to produce so-called word embedding. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words: the network is shown a word, and must guess which words occurred in adjacent positions in an input text."
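To make the "shown a word, must guess its neighbours" idea concrete, here is a minimal sketch of how (word, neighbour) training pairs can be generated from a tokenized line. It covers only the pairing step of the skip-gram flavour of word2vec, not the neural network itself; the function name `skipgram_pairs` and the `window` parameter are assumptions made for illustration, and in practice a library such as gensim would handle both the pairing and the training.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for words within `window` positions.

    Hypothetical helper: each pair is one training example in which the
    model sees `center` and is asked to predict `context`.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["winter", "is", "coming"], window=1)
for center, context in pairs:
    print(center, "->", context)
```

A larger `window` tends to capture broader, more topical associations, while a smaller one keeps the pairs closer to immediate word order.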
The full post can be found at http://machinelearningblogs.com/2016/12/game-of-thrones-analytics/