MSJE: Learning TFIDF Enhanced Joint Embedding for Recipe-Image Cross-Modal Retrieval Service

This repository contains the code to train and evaluate models from the paper:
Learning TFIDF Enhanced Joint Embedding for Recipe-Image Cross-Modal Retrieval Service

Introduction

Recipe-to-Image retrieval task

Given a recipe query that contains the recipe title, a list of ingredients and a sequence of cooking instructions, the goal is to train a statistical model that retrieves the associated image. For the recipe query, we list the top 5 images retrieved by JESR, ACME and our MSJE model.

Installation

We use an environment with Python 3.7.6 and PyTorch 1.4.0. Run pip install --upgrade cython, then install the dependencies with pip install -r requirements.txt. Our work is an extension of im2recipe.

Recipe1M Dataset

The Recipe1M dataset is available for download here, where you can also find code used to construct the dataset and to obtain the structured recipe text, food images, pre-trained instruction features and so on.

Vision models

The current version of the code uses a pre-trained ResNet-50.

Out-of-the-box training

To train the model, you will need to create the following files:

data/train_lmdb: LMDB (training) containing skip-instructions vectors, ingredient ids and categories.
data/train_keys: pickle (training) file containing skip-instructions vectors, ingredient ids and categories.
data/val_lmdb: LMDB (validation) containing skip-instructions vectors, ingredient ids and categories.
data/val_keys: pickle (validation) file containing skip-instructions vectors, ingredient ids and categories.
data/test_lmdb: LMDB (testing) containing skip-instructions vectors, ingredient ids and categories.
data/test_keys: pickle (testing) file containing skip-instructions vectors, ingredient ids and categories.
data/text/vocab.txt: file containing all the vocabulary found within the recipes.

Recipe1M LMDBs and pickle files can be found in train.tar, val.tar and test.tar, available here.

Note that the code expects images to be located in a four-level folder structure, e.g. an image named 0fa8309c13.jpg is expected at ./data/images/0/f/a/8/0fa8309c13.jpg. Each of the tar files contains one first-level folder (16 in total).
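The folder convention above can be sketched as a small helper. This is a minimal illustration, not code from this repository: the function name image_path and its root parameter are hypothetical, and it assumes the four folder levels are simply the first four characters of the file name, as in the example path.

```python
def image_path(image_name, root="./data/images"):
    """Build the nested path for an image file.

    Assumption (taken from the README example): the first four characters
    of the file name give the four folder levels.
    """
    if len(image_name) < 4:
        raise ValueError("image name needs at least four characters")
    # Each of the first four characters becomes one folder level.
    levels = list(image_name[:4])
    return "/".join([root, *levels, image_name])


print(image_path("0fa8309c13.jpg"))
# ./data/images/0/f/a/8/0fa8309c13.jpg
```

If your images are stored elsewhere, pass a different root; the nesting rule stays the same.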

The pre-trained TFIDF vectors for each recipe, the image category feature for each image and the optimized category label for each image-recipe pair can be found in id2tfidf_vec.pkl, id2img_101_cls_vec.pkl and id2class_1005.pkl, respectively.
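A quick way to inspect these files is to load them with the standard pickle module. The sketch below is only an illustration: load_lookup is a hypothetical helper, and it assumes (from the id2... file names, not from the code) that each file holds a dictionary keyed by an id.

```python
import pickle


def load_lookup(path):
    """Load one pickled lookup table and check that it is a dictionary.

    Assumption: each .pkl file stores a dict keyed by id, as the
    id2... file names suggest. Adjust if the actual layout differs.
    """
    with open(path, "rb") as f:
        table = pickle.load(f)
    if not isinstance(table, dict):
        raise TypeError(f"expected a dict in {path}, got {type(table).__name__}")
    return table


# Example usage (paths are assumptions; point them at your downloaded files):
# id2tfidf = load_lookup("data/id2tfidf_vec.pkl")
# some_id = next(iter(id2tfidf))
# print(some_id, len(id2tfidf[some_id]))
```

Only unpickle files you obtained from a source you trust, since pickle can execute code on load.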

Word2Vec

Training word2vec with recipe data:

Download and compile word2vec
Train with:

./word2vec -hs 1 -negative 0 -window 10 -cbow 0 -iter 10 -size 300 -binary 1 -min-count 10 -threads 20 -train tokenized_text.txt -output vocab.bin

The pre-trained word2vec model can be found in vocab.bin.
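To sanity-check a vocab.bin produced with -binary 1, the file can be read without extra dependencies. The following is a sketch under stated assumptions: read_word2vec_binary is a hypothetical helper, and it assumes the usual word2vec binary layout (a text header with vocabulary size and dimension, then each word followed by a space and little-endian float32 values). It is not the loader used by train.py.

```python
import struct


def read_word2vec_binary(path):
    """Read a word2vec binary file into a {word: tuple_of_floats} dict.

    Assumed layout: header line "vocab_size dim", then for every word
    its bytes, one space, dim float32 values, and an optional newline.
    """
    vectors = {}
    with open(path, "rb") as f:
        vocab_size, dim = (int(x) for x in f.readline().split())
        for _ in range(vocab_size):
            chars = []
            while True:
                ch = f.read(1)
                if ch == b"":
                    raise EOFError("file ended before all words were read")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline left after the previous vector
                    chars.append(ch)
            word = b"".join(chars).decode("utf-8", errors="replace")
            raw = f.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("truncated vector for word " + word)
            vectors[word] = struct.unpack("<%df" % dim, raw)
    return vectors


# Example usage (with -size 300, each vector should have 300 entries):
# vectors = read_word2vec_binary("vocab.bin")
# print(len(vectors), len(next(iter(vectors.values()))))
```

Reading the whole file into a dictionary is fine for inspection; for training you would normally convert the vectors into an embedding matrix indexed by the vocabulary.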

Training

Train the model with:

CUDA_VISIBLE_DEVICES=0 python train.py

We ran the experiments with a batch size of 100, which takes about 11 GB of memory.

Testing

Test the trained model with:

CUDA_VISIBLE_DEVICES=0 python test.py

The results are saved in results and include the MedR (median rank) and recall scores for both recipe-to-image and image-to-recipe retrieval.
Our best model trained with Recipe1M (TSC paper) can be downloaded here.

Contact

For any questions or suggestions you can use the issues section or reach us at zhongweixie@gatech.edu.

