mbwolff/Word-Vector-Text-Modulator

Word Vector Text Modulator

A contribution to 2016 NaNoGenMo

This repository contains the code and data needed to generate Madame Bovary modulée, a novel based on Flaubert's text but modified with word vectors derived from more than 1,300 nineteenth-century French texts.

Run the following command to produce the novel:

./transformText.py homme femme FlaubertMadameBovary.txt > MadameBovaryModulée.txt

A quick explanation of what's under the hood

The code uses gensim to build a word2vec model from more than 1,300 nineteenth-century French texts. It takes a pair of words (e.g. "homme" and "femme") and a text as parameters and generates a modulated text: each word in the original is replaced by the word the model ranks as "most similar" to it once the word pair is applied as an analogy (the first word's vector is subtracted and the second's is added). For instance, if "roi" appears in the original text, it is replaced as follows:

>>> model.most_similar(positive=['femme', 'roi'], negative=['homme'], topn=1)
[(u'reine', 0.8085041046142578)]
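
To make the arithmetic behind that query concrete, here is a minimal sketch that does not use gensim. The three-dimensional vectors in TOY_VECTORS are invented for illustration, and the helper only approximates the idea behind most_similar (sum the unit vectors of the positive words, subtract those of the negative words, rank the remaining vocabulary by cosine similarity). It is not the repository's code.

```python
import math

# Invented toy vectors (assumption): a real word2vec model learns
# hundreds of dimensions from a corpus.
TOY_VECTORS = {
    "homme": (1.0, 0.0, 0.0),
    "femme": (0.0, 1.0, 0.0),
    "roi": (1.0, 0.0, 1.0),
    "reine": (0.0, 1.0, 1.0),
    "prince": (1.0, 0.0, 0.6),
    "jardin": (0.5, 0.5, 0.1),
}


def _unit(vector):
    """Scale a vector to length 1 so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def most_similar(vectors, positive, negative, topn=1):
    """Rank words by cosine similarity to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    for word in positive:
        for i, x in enumerate(_unit(vectors[word])):
            query[i] += x
    for word in negative:
        for i, x in enumerate(_unit(vectors[word])):
            query[i] -= x
    query = _unit(query)

    # The query words themselves are excluded from the candidates.
    exclude = set(positive) | set(negative)
    scored = [
        (word, sum(a * b for a, b in zip(query, _unit(vec))))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


result = most_similar(TOY_VECTORS, positive=["femme", "roi"], negative=["homme"])
print(result)
```

With these toy vectors the top-ranked word is "reine", mirroring the gensim output above. The similarity score differs because the vectors are made up.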

Handling verb conjugations and adjective agreements computationally in French is tricky, but the code produces a mostly readable text that needs grammatical polishing (a good exercise for students). The code can modulate any text against any pair of words.
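
The word-by-word replacement described above might be sketched as follows. This is an illustration under stated assumptions, not the contents of transformText.py. The function name modulate, the replace_word callback, and the toy_table dictionary standing in for the word2vec lookup are all hypothetical.

```python
import re

# Matches runs of letters and digits; punctuation and spaces are left alone.
WORD_RE = re.compile(r"\w+")


def modulate(text, replace_word):
    """Replace each word in text using replace_word (hypothetical callback).

    replace_word receives the lowercased word and returns a replacement,
    or None when the word should be kept (e.g. it is not in the vocabulary).
    """
    def swap(match):
        word = match.group(0)
        new_word = replace_word(word.lower())
        if new_word is None:
            return word
        # Carry over an initial capital from the original word.
        if word[0].isupper():
            new_word = new_word[0].upper() + new_word[1:]
        return new_word

    return WORD_RE.sub(swap, text)


# Stand-in for the model lookup (assumption): a fixed table, not word2vec.
toy_table = {"roi": "reine", "homme": "femme"}
print(modulate("Le roi salua un homme.", toy_table.get))
# prints: Le reine salua un femme.
```

The output shows why the result needs grammatical polishing: the nouns change gender but the articles "Le" and "un" do not. In a fuller version, replace_word would presumably wrap a model query such as the most_similar call shown earlier.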
