Generates poetry from images using convolutional and recurrent neural networks

"All the time the sun / Is wheeling out of a dark bright ground."

A close up of a red and white striped umbrella and the streets are still still and the sun is still a little and the sun is still and the sun is still and the sun is still and the sun is gone.

The sun is singing and the sun is still and the sun is looking for the sun.

The sun is still there and the sun is still and the sun is down and the sun is still and the sun is gone.

The sun is still there and the sun is gone.

The sun is still and the sun is still there.

The sun is dead.

The sun is still, and the sun is shining in the wind.

A close up of a red and white striped umbrella, and the post office was a fire of stone in the middle of the bar. The streetcar remains of the police. The bar was a little strange and sharp as the man who had the strength to stay away from the problem. The little girl was a big fish with a green shoe which had hardly a shot and the party of the storm was being considered a strange story.

A close up of a red and white striped umbrella, and the mountains are lighted and still there is no definition in the morning of a little of morning to the first thing that makes the moon sink under the space that will be gone from the stars.

I want to see it like this has been so awkward and unwanted. I shall see it all out of the sea of the past. Only the sound of the trees are coming up like a bone to be put on a strange ship.

A close up of a red and white striped umbrella, and the living and the dark and the red with the sleep of a hand, and the wind remains the sun and the high leaves.

The sun is beautiful, and there is no desire to feel the mind off the sea.

In the middle of the place, the moon is opened with the human form.

The world is all the streets are all there.

The land of the broken light of life is a procession

Of many million men in lonely arms

On the page of the barn. All the time the sun

Is wheeling out of a dark bright ground.

(Painting by Mark Rothko, poetry generated by NeuralSnap)

NeuralSnap

A new year, a new word camera: NeuralSnap uses artificial neural networks to generate poetry from images.

By Ross Goodwin, 2016

More Output Samples

Intent

This project, in many ways a follow-up to word.camera, stands on the shoulders of two spectacular open-source contributions by Andrej Karpathy: NeuralTalk2 and Char-RNN, both of which run in Torch. The code in this repository is a modest Python wrapper around a few of Karpathy's scripts, and a means to experiment with a few models that I trained on Nvidia K80 GPUs using the High Performance Computing facilities at NYU.

In my research, I am developing tools that I hope will serve to augment human creativity. These are the first neural network models to emerge from my explorations, and I've decided to make them available to others:

$ wget https://s3.amazonaws.com/rossgoodwin/models/2016-01-12_char-rnn_model_01_rg.t7
$ wget https://s3.amazonaws.com/rossgoodwin/models/2016-01-12_char-rnn_model_02_rg.t7
$ wget https://s3.amazonaws.com/rossgoodwin/models/2016-01-12_neuraltalk2_model_01_rg.t7

NOTE: These models are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. That means if you use them you must attribute me (Ross Goodwin), you cannot use these models for commercial purposes, and any derivative work you produce must be licensed under the same terms.

NeuralSnap Models by Ross Goodwin are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/rossgoodwin/neuralsnap.

How It Works

NeuralTalk2 uses convolutional and recurrent neural networks to caption images. I trained my own model on the MSCOCO dataset, using the general guidelines Karpathy outlines in his documentation, but made adjustments to increase verbosity.

I then trained a recurrent neural network using Char-RNN on about 40MB of (mostly) 20th-century poetry from a variety of writers (and a variety of cultures) around the world.

In the output examples above, the red text is the image caption (here, "A close up of a red and white striped umbrella"), and the poetry-trained net generated the rest of the text. The stanzas iterate through different RNN "temperature" values, an input that controls the riskiness of the model's predictions. Lower-temperature results tend to be more repetitive and more strictly grammatical, while higher-temperature results contain more variety but may also contain more errors.
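To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled sampling in plain Python. It is not the code Char-RNN runs (that lives in Karpathy's Lua scripts); the function names and the example scores are hypothetical, chosen only to illustrate how dividing the model's scores by the temperature sharpens or flattens the distribution before a character is drawn.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng=random):
    """Draw one index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for three candidate next characters.
    logits = [2.0, 1.0, 0.1]
    for t in (0.5, 1.0, 1.5):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

At a low temperature most of the probability mass lands on the highest-scoring character, which is why those stanzas repeat themselves; at a higher temperature the lower-scoring characters are drawn more often, which adds variety and errors alike.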

How To Run It

I have tested this software on Ubuntu 14.04 and 15.10. You can try other OS options at your own risk -- in theory, it should run on anything that can run Torch and Python, although I've heard that NeuralTalk2 does not play nice with the Raspberry Pi.

You'll need to clone the NeuralTalk2 and Char-RNN repos into the root folder of this repository, then follow Karpathy's README instructions to install the dependencies for NeuralTalk2 -- you don't need to worry about any of the GPU or training stuff, unless you want to use your own models. (The models I've provided are calibrated to run on CPUs.) Thankfully, the dependencies of Char-RNN are a subset of those required for NeuralTalk2.
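If you want to confirm the layout before running anything, a quick check along these lines may help. This is only a sketch: the directory names `neuraltalk2` and `char-rnn` are assumed from the repository names, `th` is assumed to be the Torch command on your PATH, and the helper names are hypothetical.

```python
import os
import shutil

# Assumed directory names, matching the names of Karpathy's repositories.
REQUIRED_REPOS = ("neuraltalk2", "char-rnn")


def missing_repos(root="."):
    """Return the expected repo folders that are not present under root."""
    return [name for name in REQUIRED_REPOS
            if not os.path.isdir(os.path.join(root, name))]


def torch_available():
    """Report whether a 'th' executable can be found on the PATH."""
    return shutil.which("th") is not None


if __name__ == "__main__":
    for name in missing_repos():
        print("missing folder:", name)
    if not torch_available():
        print("Torch ('th') was not found on the PATH")
```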

Usage

$ python neuralsnap.py <output_title> <ntalk_model_filepath> <rnn_model_filepath> <image_folder_filepath>

For example:

$ python neuralsnap.py testing123 /path/to/neuraltalk2/model/2016-01-12_neuraltalk2_model_01_rg.t7 /path/to/char-rnn/model/2016-01-12_char-rnn_model_02_rg.t7 /path/to/image/folder
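If you would rather drive the script from your own Python code, the same call can be assembled as an argument list and handed to `subprocess`. The sketch below assumes only the argument order shown in the usage line above; `build_neuralsnap_command` and `run_neuralsnap` are hypothetical helper names, not part of this repository.

```python
import subprocess


def build_neuralsnap_command(output_title, ntalk_model, rnn_model, image_folder,
                             script="neuralsnap.py", interpreter="python"):
    """Assemble the command line in the order given in the usage section."""
    return [interpreter, script, output_title,
            ntalk_model, rnn_model, image_folder]


def run_neuralsnap(output_title, ntalk_model, rnn_model, image_folder):
    """Run neuralsnap.py and raise CalledProcessError if it exits non-zero."""
    cmd = build_neuralsnap_command(output_title, ntalk_model,
                                   rnn_model, image_folder)
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # Placeholder paths, as in the example above; replace with your own.
    print(" ".join(build_neuralsnap_command(
        "testing123",
        "/path/to/neuraltalk2/model/2016-01-12_neuraltalk2_model_01_rg.t7",
        "/path/to/char-rnn/model/2016-01-12_char-rnn_model_02_rg.t7",
        "/path/to/image/folder",
    )))
```

Passing the arguments as a list, rather than as one shell string, means paths that contain spaces do not need extra quoting.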

Software License Information

As noted above, my trained models are available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. However, I have licensed the code in this repository under GPLv3.

NeuralSnap image-to-text poetry generator
Copyright (C) 2016  Ross Goodwin

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

You may contact Ross Goodwin via email at ross.goodwin@gmail.com or
address physical correspondence to:

Ross Goodwin c/o ITP
721 Broadway, 4th Floor
New York, NY 10003