Code for ACM MM'17 paper "Learning Fashion Compatibility with Bidirectional LSTMs"

Bi-LSTM model for learning fashion compatibility.

Code for ACM MM'17 paper "Learning Fashion Compatibility with Bidirectional LSTMs" [paper].

Parts of the code are adapted from an older version of TensorFlow's im2txt repository on GitHub.

The corresponding dataset can be found on GitHub or Google Drive.

Contact

Author: Xintong Han

Contact: xintong@umd.edu

Polyvore.com

Polyvore.com is a popular fashion website where users can create and upload outfit data. Here is an example.

Required Packages

I used a TensorFlow version between r0.10 and r0.11, the version used by the first commit of TensorFlow's im2txt, so you may need to install r0.11 and modify some functions to run the code. With newer versions of TensorFlow, I could not run inference with my old code or restore the models trained with this version. However, one commit (idd1e03e) supports training with TensorFlow 1.0 or greater. I will create a new repo supporting TensorFlow >= 1.0.

Recommended Setup

Execute the commands below at the repository root:

  • build image
docker build -t tensorflow:0.11 .
  • run container
docker run -it \
    --runtime=nvidia \
    -p 8888:8888 \
    -p 6006:6006 \
    -v $CURRENT:/root/workdir \
    tensorflow:0.11

Prepare the Training Data

Download the dataset and put it in the ./data folder:

  1. Decompress polyvore.tar.gz into ./data/label/
  2. Decompress polyvore-images.tar.gz into ./data/, so that all outfit image folders are in ./data/images/
  3. Run the following command to generate TFRecords in ./data/tf_records/:
python data/build_polyvore_data.py
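
If you prefer to script the unpacking steps above, a minimal Python sketch could look like the following. The function name and its arguments are hypothetical and not part of this repository; it only assumes that the image archive contains a top-level images/ folder, as step 2 implies.

```python
import tarfile
from pathlib import Path


def unpack_polyvore(label_archive, image_archive, data_dir="data"):
    """Extract both archives into the layout described in the steps above.

    Hypothetical helper: only the target folders come from the README.
    """
    data_dir = Path(data_dir)
    label_dir = data_dir / "label"
    label_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: label files go to <data_dir>/label/
    with tarfile.open(label_archive) as tar:
        tar.extractall(label_dir)

    # Step 2: the image archive is extracted to <data_dir>/ and is assumed
    # to contain a top-level images/ folder with one folder per outfit.
    with tarfile.open(image_archive) as tar:
        tar.extractall(data_dir)

    images_dir = data_dir / "images"
    if not images_dir.is_dir():
        raise FileNotFoundError(f"expected outfit folders under {images_dir}")
    return label_dir, images_dir
```

After this returns, the TFRecord generation command from step 3 can be run from the repository root as usual. Only extract archives you trust, since tar files can contain unexpected paths.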

Download the Inception v3 Checkpoint

This model requires a pretrained Inception v3 checkpoint file to initialize the network.

This checkpoint file is provided by the TensorFlow-Slim image classification library, which offers a suite of pre-trained image classification models. You can read more about the models provided by the library here.

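Run the following commands to download the Inception v3 checkpoint. INCEPTION_DIR must point to an existing destination directory (the model folder, according to the comment in the snippet).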

# Save the Inception v3 checkpoint in model folder.
wget "http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz"
tar -xvf "inception_v3_2016_08_28.tar.gz" -C ${INCEPTION_DIR}
rm "inception_v3_2016_08_28.tar.gz"

Training

./train.sh

The models will be saved in model/bi_lstm.

Inference

Trained model

Download the trained model from the final_model folder on Google Drive and place it at ./model/final_model/model.ckpt-34865.

Extract features of test data

All three tasks described in the paper require first extracting the features of the test images:

./extract_feature.sh

The image features will be saved to data/features/test_features.pkl.
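
To check the extracted features before running the tasks, you can load the pickle file and look at its top-level structure. This is only a sketch: the helper names are hypothetical, and the internal layout of the file is not documented here, so the code inspects the object rather than assuming any particular keys.

```python
import pickle


def load_features(path="data/features/test_features.pkl"):
    """Load the pickled test features and return the deserialized object."""
    with open(path, "rb") as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            # Assumption: the file may have been written by Python 2,
            # in which case Python 3 needs an explicit encoding.
            f.seek(0)
            return pickle.load(f, encoding="latin1")


def describe(obj, max_items=3):
    """Return a short summary of the container type and a few entries."""
    if isinstance(obj, dict):
        return {
            "type": "dict",
            "size": len(obj),
            "sample_keys": list(obj)[:max_items],
        }
    if isinstance(obj, (list, tuple)):
        return {"type": type(obj).__name__, "size": len(obj)}
    return {"type": type(obj).__name__}
```

Calling `describe(load_features())` from the repository root gives a quick overview. If the file stores NumPy arrays, NumPy has to be installed for unpickling to succeed, and as with any pickle, only load files you trust.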

You can also perform end-to-end inference by modifying the corresponding code, for example to take a sequence of images as input and output a compatibility score.

Fashion fill-in-the-blank

./fill_in_blank.sh

Note that we further optimized some design choices in the released model. It can achieve 73.5% accuracy, which is higher than the number reported in our paper.
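
For reference, fill-in-the-blank accuracy is simply the fraction of questions where the highest-scoring candidate is the correct one. The sketch below illustrates that computation only; the function name and the input format are assumptions, and it is not the code run by fill_in_blank.sh.

```python
def fitb_accuracy(questions):
    """Compute fill-in-the-blank accuracy.

    `questions` is a list of (scores, answer_index) pairs, where `scores`
    holds one compatibility score per candidate item (higher is better)
    and `answer_index` is the position of the correct candidate.
    """
    if not questions:
        raise ValueError("no questions given")
    correct = 0
    for scores, answer_index in questions:
        # Pick the candidate with the highest score (first one on ties).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == answer_index:
            correct += 1
    return correct / len(questions)
```

For example, `fitb_accuracy([([0.1, 0.7, 0.1, 0.1], 1), ([0.4, 0.3, 0.2, 0.1], 2)])` returns 0.5, because only the first question is answered correctly.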

Compatibility prediction

./predict_compatibility.sh

Unlike training, where the loss is calculated within each mini-batch, testing computes the loss against the whole test set. This is quite slow, and a better method could probably be used (e.g., the distance between the LSTM-predicted representation and the target image embedding).
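
To make the trade-off concrete, here is a small pure-Python sketch contrasting the two scoring options. It assumes a dot-product softmax over a pool of candidate embeddings, which is our reading of the description above rather than the exact code in predict_compatibility.sh; all function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax_nll(predicted, target_index, pool):
    """Negative log-likelihood of the target under a softmax over `pool`.

    The cost grows linearly with len(pool), which is why normalizing
    over the whole test set is slow.
    """
    logits = [dot(predicted, item) for item in pool]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_index]


def squared_distance(predicted, target):
    """Cheaper alternative: squared distance to the target embedding only."""
    return sum((a - b) ** 2 for a, b in zip(predicted, target))
```

The first function touches every item in the pool for each prediction, while the second only compares the predicted representation with its target. Whether the distance-based score ranks outfits as well as the softmax loss would need to be checked empirically.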

Outfit generation

./outfit_generation.sh

It generates an outfit given the image/text query in query.json and saves the results in the results directory. For demo purposes, query.json contains only one example. In its result image, green boxes indicate the image query, and the text query is "blue".

Some notes

We found that a late fusion of different single models (Bi-LSTM w/o VSE + VSE + Siamese) can achieve superior results on all tasks. These models are also available in the same folder on Google Drive.
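
As an illustration of what late fusion means here, the sketch below averages the scores that several models assign to the same candidates. The weighted average is an assumption made for this example, since the exact fusion rule is not specified here, and it presumes the per-model scores have already been brought to a comparable scale.

```python
def late_fusion(score_lists, weights=None):
    """Combine per-model scores for the same candidates by weighted average.

    `score_lists` has one list of scores per model; all lists must score
    the same candidates in the same order.
    """
    if not score_lists:
        raise ValueError("need at least one model")
    n = len(score_lists[0])
    if any(len(scores) != n for scores in score_lists):
        raise ValueError("all models must score the same candidates")
    if weights is None:
        weights = [1.0] * len(score_lists)  # equal weights by default
    if len(weights) != len(score_lists):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists)) / total
        for i in range(n)
    ]
```

With three models, you would pass three score lists (for example from Bi-LSTM, VSE, and Siamese) and then rank candidates by the fused score. The weights could be tuned on a validation set.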

Todo list

  • Add multiple choice inference code.
  • Add compatibility prediction inference code.
  • Add image outfit generation code. It is very similar to compatibility prediction, so you can try to do it yourself if you are in a hurry.
  • Release trained models.
  • Release Siamese/VSE models.
  • Polish the code.

Citation

If this code or the Polyvore dataset helps your research, please cite our paper:

@inproceedings{han2017learning,
  author = {Han, Xintong and Wu, Zuxuan and Jiang, Yu-Gang and Davis, Larry S},
  title = {Learning Fashion Compatibility with Bidirectional LSTMs},
  booktitle = {ACM Multimedia},
  year  = {2017},
}