Word Vector Text Inventor
The repository contains the code and data necessary to invent responses to an assertion from Baudelaire's Enivrez-vous. Each word in the invented text is based on an analogy with the word pair bien / mal and on word vectors derived from 30 volumes written by Gustave Flaubert.
The repository documents and executes a method of algorithmic rhetorical invention.
A quick explanation of what's under the hood
The code uses gensim to build a word2vec model from a corpus of French texts. It then takes a pair of words (e.g. "homme" and "femme") and a text as parameters, and generates a new text. Each word in the original text is replaced by the word the model ranks as "most similar" to it under the analogy defined by the word pair. For instance, if "roi" is a word in the original text, it is replaced by analogy:
>>> model.most_similar(positive=['femme', 'roi'], negative=['homme'], topn=1)
[(u'reine', 0.8085041046142578)]
If the word vector model is unable to complete an analogy, the word from the asserted text does not change in the invented text.
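The replacement procedure described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `invent_text` and `toy_most_similar` are hypothetical, and the toy lookup table stands in for a trained model so that the example runs without gensim or the Flaubert corpus. It assumes that a failed analogy shows up either as a `KeyError` (which gensim raises for out-of-vocabulary words) or as an empty result.

```python
def invent_text(words, pair, most_similar):
    """Replace each word with its analogue under the word pair.

    `most_similar` is assumed to behave like gensim's `most_similar`:
    it accepts `positive`, `negative` and `topn`, returns a list of
    (word, score) tuples, and raises KeyError for unknown words.
    """
    source, target = pair  # e.g. ("homme", "femme") or ("bien", "mal")
    invented = []
    for word in words:
        try:
            candidates = most_similar(
                positive=[target, word], negative=[source], topn=1
            )
        except KeyError:
            # The model cannot complete the analogy for this word.
            candidates = []
        # Keep the original word when no analogue is available.
        invented.append(candidates[0][0] if candidates else word)
    return invented


def toy_most_similar(positive, negative, topn=1):
    """Hypothetical stand-in for a trained model (illustration only)."""
    # The score 1.0 is a placeholder, not a real similarity value.
    table = {("femme", "roi", "homme"): [("reine", 1.0)]}
    key = (positive[0], positive[1], negative[0])
    if key not in table:
        raise KeyError(positive[1])
    return table[key][:topn]


# "le" has no analogue in the toy table, so it is left unchanged.
print(invent_text(["le", "roi"], ("homme", "femme"), toy_most_similar))
```

With a trained gensim model, the third argument would be the model's own lookup instead of the toy function: `model.most_similar` in older gensim releases, as in the example above, or `model.wv.most_similar` in gensim 4 and later.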