ArtEmis Speaker Tools B

This repo contains the following items related to [2]:

  1. User interfaces used in the MTurk human studies
  2. Evaluation tools
  3. Neural speakers (a nearest-neighbor baseline, plus basic and grounded versions of the M2 transformer [3])

Data preparation

Please prepare the annotation and detection-feature files for the ArtEmis dataset before running the code:

  1. Download the detection features and unzip them to a folder. The features are computed with the code provided by [1].
  2. Download the pickle file containing [<image_name>, <image_id>] pairs and put it in the same folder as the extracted detection features (a loading sketch follows this list).
  3. Download the ArtEmis dataset.
  4. Download the vocabulary files (1, 2).
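
A quick way to sanity-check the downloaded mapping is to load it with Python's pickle module. This is a minimal sketch: the filename and the container type here are assumptions, since only the [<image_name>, <image_id>] content is documented.

import pickle

# NOTE: placeholder filename; use the actual name of the downloaded pickle.
with open("image_name_to_id.pkl", "rb") as f:
    pairs = pickle.load(f)

# The file is documented to hold [<image_name>, <image_id>] entries;
# print a few to confirm it loaded as expected.
print(f"{len(pairs)} entries")
for entry in list(pairs)[:5]:
    print(entry)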

Some bounding-box visualizations for art images:

[Figure: BBox features on sample artworks]

Environment Setup

Clone the repository and create the artemis-m2 conda environment using the environment.yml file:

conda env create -f environment.yml
conda activate artemis-m2

Then download the spaCy English data by executing the following command:

python -m spacy download en

Training procedure

Run python train.py using the following arguments:

Argument              Description
--exp_name            Experiment name
--batch_size          Batch size (default: 10)
--workers             Number of workers (default: 0)
--m                   Number of memory vectors (default: 40)
--head                Number of heads (default: 8)
--warmup              Warmup value for learning-rate scheduling (default: 10000)
--resume_last         If set, training resumes from the last checkpoint
--resume_best         If set, training resumes from the best checkpoint
--features_path       Path to the detection-features file
--annotation_folder   Path to the folder with the ArtEmis annotations
--use_emotion_labels  If enabled, emotion labels are used (default: False)
--logs_folder         Path to the folder for TensorBoard logs (default: "tensorboard_logs")

To train the grounded version of the model, include the additional parameter --use_emotion_labels=1:

python train.py --exp_name <exp_name> --batch_size 50 --m 40 --head 8 --warmup 10000 --features_path /path/to/features --annotation_folder /path/to/annotations/artemis.csv --workers 4 --logs_folder /path/to/logs/folder [--use_emotion_labels=1]
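
If a run is interrupted, training can be resumed with the --resume_last (or --resume_best) flag described in the table above, for example:

python train.py --exp_name <exp_name> --batch_size 50 --m 40 --head 8 --warmup 10000 --features_path /path/to/features --annotation_folder /path/to/annotations/artemis.csv --workers 4 --resume_last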

Pretrained Models

Download our pretrained models and put them under the saved_models folder.
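
For example (the checkpoint filename is only illustrative; keep whatever names the downloaded files have):

mkdir -p saved_models
mv /path/to/downloads/<checkpoint>.pth saved_models/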

Run python test.py using the following arguments:

Argument              Description
--batch_size          Batch size (default: 10)
--workers             Number of workers (default: 0)
--features_path       Path to the detection-features file
--annotation_folder   Path to the folder with the ArtEmis annotations

For example:

python test.py --exp_name <exp_name> --features_path /path/to/features --annotation_folder /path/to/annotations/artemis.csv --workers 4 [--use_emotion_labels=1]

Some generations from the neural speakers:

[Figure: sample M2 speaker outputs]

References

[1] Faster R-CNN with model pretrained on Visual Genome.
[2] ArtEmis: Affective Language for Visual Art. Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas Guibas.
[3] Meshed-Memory Transformer.
