NYU ITP 2019 Thesis. An interactive experience that shows how a machine interprets the same thing differently from a human.

Lost in Translation

NYU ITP 2019 Thesis
An interactive experience that shows how a machine interprets the same thing differently from a human.

Presentation Video in ITP Thesis Week 2019

Introduction

The project is a recursive process in which a human and a machine interpret each other's results. In each round, the human writes a sentence describing an image generated by the machine. The machine then runs that description through several machine learning translations, first into a sketch and then into a new image.
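The round structure described above can be sketched roughly as follows. This is only an illustration, not the code in app.py: the function name `run_rounds` and the three step callables are hypothetical, and each step is passed in as a plain function so the loop can be read without any model installed.

```python
from typing import Callable, List, Tuple


def run_rounds(
    first_image: str,
    describe: Callable[[str], str],         # human step: image -> sentence
    text_to_sketch: Callable[[str], str],   # machine step: sentence -> sketch
    sketch_to_image: Callable[[str], str],  # machine step: sketch -> image
    rounds: int,
) -> List[Tuple[str, str, str]]:
    """Run the describe -> sketch -> image loop and record every round."""
    history = []
    image = first_image
    for _ in range(rounds):
        sentence = describe(image)          # the human describes the current image
        sketch = text_to_sketch(sentence)   # the machine turns the sentence into a sketch
        image = sketch_to_image(sketch)     # the machine turns the sketch into a new image
        history.append((sentence, sketch, image))
    return history
```

Each round feeds the machine's latest image back to the human, so any drift in meaning accumulates from round to round, much like in the telephone game.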

Inspiration

Telephone Game

An example of multiple translations
Drawception - Picture Telephone Drawing Game

Closed Loop

A project that uses machine learning to run a feedback loop between images and text.
Jake Elwes - Closed Loop

Implementation

  • Python server with Flask
  • JavaScript client
  • Generate a sentence from an image with im2txt
  • Tag words and extract nouns with spaCy
  • Compute word vector similarity with spaCy
  • Draw doodles with SketchRNN
  • Generate new images with AttnGAN
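As a rough illustration of the word vector similarity step, the sketch below picks the sketch category whose vector is closest to a noun's vector by cosine similarity. The project does this with spaCy. The example swaps in tiny hand-written vectors so it runs without a language model, and `closest_category` is a hypothetical helper, not a function from app.py.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_category(
    word_vector: Sequence[float],
    category_vectors: Dict[str, Sequence[float]],
) -> str:
    """Return the category name whose vector is most similar to word_vector."""
    return max(
        category_vectors,
        key=lambda name: cosine_similarity(word_vector, category_vectors[name]),
    )


if __name__ == "__main__":
    # Toy 2-D vectors, made up for illustration only.
    categories = {"cat": [0.9, 0.2], "bus": [0.0, 1.0]}
    print(closest_category([1.0, 0.1], categories))
```

With spaCy, the vectors would presumably come from a model that ships word vectors, and the nouns from tokens whose part-of-speech tag is `NOUN`.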

app.py

Server code.
Coordinates and processes most of the data.
Communicates with Runway and the client over HTTP.
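A minimal sketch of how a server could send a JSON request to a model over HTTP, using only the Python standard library. The URL, the `/query` path, and the `caption` field are assumptions made for illustration. The actual endpoint and input names depend on how Runway and each model are configured, and app.py may be written differently.

```python
import json
import urllib.request

# Assumption: a model is reachable at a local URL like this one.
EXAMPLE_URL = "http://localhost:8000/query"


def build_query_request(url: str, inputs: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def query_model(url: str, inputs: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_query_request(url, inputs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_model(EXAMPLE_URL, {"caption": "a bird on a branch"})` only works while a model is actually listening at that address.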

static/client.js

Client code.
Presents the results and collects user input.

categories.json

A JSON file that stores all sketch categories.
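A minimal sketch of reading the category list, assuming the file holds a flat JSON array of category names such as `["cat", "bus"]`. The real file layout is not documented here, and both function names are hypothetical.

```python
import json
from typing import List


def parse_categories(json_text: str) -> List[str]:
    """Parse a JSON array of names into a list of lowercase strings."""
    data = json.loads(json_text)
    if not isinstance(data, list):
        # Assumption: the file is a flat array; anything else is rejected.
        raise ValueError("expected a JSON array of category names")
    return [str(name).strip().lower() for name in data]


def load_categories(path: str) -> List[str]:
    """Read the category file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_categories(handle.read())
```

Normalizing to lowercase keeps later lookups against extracted nouns case-insensitive.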

draw_strokes.py

Functions for drawing sketches.
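As an illustration of what drawing a sketch involves, the snippet below converts strokes in the stroke-3 format used by the Sketch-RNN datasets, where each point is `(dx, dy, pen_lifted)`, into SVG path data. It is a simplified stand-in, not the code in draw_strokes.py, and `strokes_to_svg_path` is a hypothetical name.

```python
from typing import Iterable, Tuple


def strokes_to_svg_path(
    strokes: Iterable[Tuple[float, float, int]],
    start: Tuple[float, float] = (0.0, 0.0),
) -> str:
    """Turn relative (dx, dy, pen_lifted) points into absolute SVG path commands."""
    x, y = start
    pen_up = True  # the first point begins a new line, so it is a move
    parts = []
    for dx, dy, lifted in strokes:
        x += dx
        y += dy
        # "M" moves without drawing; "L" draws a line from the previous point.
        parts.append(f"{'M' if pen_up else 'L'}{x:g},{y:g}")
        pen_up = bool(lifted)  # a lifted pen makes the next point a move
    return " ".join(parts)


if __name__ == "__main__":
    demo = [(10, 0, 0), (0, 10, 1), (5, 5, 0), (5, 0, 1)]
    print(strokes_to_svg_path(demo))
```

The returned string can be placed in the `d` attribute of an SVG `<path>` element.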

drawSketch.py

A test function for drawing a sketch.

im2txt

A machine learning model that generates a sentence from an image.
The model originates from models/research/im2txt. A pre-trained model is provided in Runway.

SketchRNN

A machine learning model that generates doodles in specific categories.
The doodle data comes from Quick, Draw! The dataset and model details are from Magenta - SketchRNN.
It is downloaded from Google Cloud Platform.

AttnGAN

A machine learning model that generates an image from a sentence.
The model is from GitHub - taoxugit/AttnGAN.
A pre-trained model is provided in Runway.
