TCE-CVSE

Code for the MMM 2023 paper: Textual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching.

Requirements and Installation

Python 3.6
PyTorch 1.1.0
NumPy (>1.12.1)
TensorBoard
torchtext
pycocotools
pytorch-lightning 1.2.8
transformers 4.5.1
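
To compare an existing environment against the list above, a small check like the following may help. It is a sketch, not part of this repository: the PyPI distribution names (for example `torch` for PyTorch) are assumptions, and `importlib.metadata` needs Python 3.8+ (older interpreters would need the `importlib_metadata` backport).

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # older interpreters: assumes the importlib_metadata backport
    import importlib_metadata as metadata

# Versions copied from the requirements list; distribution names are assumptions.
PINNED = {
    "torch": "1.1.0",
    "pytorch-lightning": "1.2.8",
    "transformers": "4.5.1",
}
UNPINNED = ["numpy", "tensorboard", "torchtext", "pycocotools"]


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report():
    """Return one human-readable status line per required package."""
    lines = []
    for name, wanted in PINNED.items():
        found = installed_version(name)
        if found is None:
            status = "missing"
        elif found == wanted or found.startswith(wanted + "+"):
            # Accept local build tags such as "1.1.0+cu100".
            status = "ok"
        else:
            status = "differs from the listed version"
        lines.append(f"{name}: listed {wanted}, found {found} ({status})")
    for name in UNPINNED:
        found = installed_version(name)
        status = "missing" if found is None else "ok"
        lines.append(f"{name}: found {found} ({status})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The script only reports differences and does not install anything.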

Data

Download the dataset created by SCAN:

wget https://iudata.blob.core.windows.net/scan/data.zip
wget https://iudata.blob.core.windows.net/scan/vocab.zip
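
The two downloads are zip archives and have to be unpacked before use. One possible way to do that from Python is sketched below. The helper name `extract_archives` and the target directory are hypothetical. Pick the directory that matches the `--data_path` you pass to training.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_paths, dest_dir):
    """Extract every zip archive that exists into dest_dir.

    Returns the file names of the archives that were actually extracted,
    so a missing download is easy to spot.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in map(Path, archive_paths):
        if not archive.is_file():
            print(f"skipping {archive}: file not found")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        extracted.append(archive.name)
    return extracted
```

For example, `extract_archives(["data.zip", "vocab.zip"], "./data/scan_data")` would unpack both archives into `./data/scan_data`, where that path is only an illustration.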

First, download the Stanford POS tagger (stanford-postagger) and put it into './stanford-postagger'. The concept vocabulary and other training data are generated by generate_vocab.py:

python generate_vocab.py

The training data for the multi-label classifier is generated by ML-Classifier/data/coco_data.py.

Train the model

Train our model on SLURM: sbatch tce_run_with_slurm_das_train.sh.

DATA_PATH="./data/coco_annotations/Concept_annotations_coco_vocab"
LOG_PATH="./runs/coco_new/"

python train_coco.py --data_path "../../data/scan_data/data" \
  --batch_size 512 \
  --num_attribute 300 \
  --model_name "$LOG_PATH/CVSE_COCO_data_omp_train_top_300_five_caption/" \
  --concept_name "$DATA_PATH/data_omp_train_top_300_five_caption/category_concepts.json" \
  --inp_name "$DATA_PATH/data_omp_train_top_300_five_caption/coco_concepts_glove_word2vec.pkl" \
  --resume "none" \
  --adj_file "$DATA_PATH/data_omp_train_top_300_five_caption/coco_adj_concepts.pkl" \
  --adj_gen_mode "ReComplex" \
  --t 0.3 \
  --alpha 0.9 \
  --attribute_path "$DATA_PATH/data_omp_train_top_300_five_caption/" \
  --test_on "five" \
  --re_weight 0.2 \
  --logger_name "$LOG_PATH/CVSE_COCO/data_omp_train_top_300_five_caption_prediction/"
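
The command above reads three files from the concept annotation directory: `category_concepts.json`, `coco_concepts_glove_word2vec.pkl`, and `coco_adj_concepts.pkl`. Before submitting a job, it can be useful to confirm they exist. The following is a minimal sketch. The helper `missing_inputs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the --concept_name, --inp_name and --adj_file arguments.
REQUIRED_FILES = (
    "category_concepts.json",
    "coco_concepts_glove_word2vec.pkl",
    "coco_adj_concepts.pkl",
)


def missing_inputs(concept_dir, required=REQUIRED_FILES):
    """Return the names of required files that are absent from concept_dir."""
    root = Path(concept_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Directory composed the same way as in the training command.
    data_path = "./data/coco_annotations/Concept_annotations_coco_vocab"
    concept_dir = Path(data_path) / "data_omp_train_top_300_five_caption"
    missing = missing_inputs(concept_dir)
    print("all inputs present" if not missing else f"missing: {missing}")
```

An empty list means all three inputs were found. Otherwise, the returned names show which files still need to be generated.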

Evaluation

Test our model on SLURM. Generate the textual concept expansion with: sbatch ml_run_with_slurm_das_test.sh

We have uploaded our model here. Download it and put it into the model directory, then test the model with: sbatch tce_run_with_slurm_das_test.sh

Citation

@inproceedings{liang2023textual,
  title={Textual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching},
  author={Liang, Mingliang and Liu, Zhuoran and Larson, Martha},
  booktitle={MultiMedia Modeling: 29th International Conference, MMM 2023, Bergen, Norway, January 9--12, 2023, Proceedings, Part I},
  pages={421--433},
  year={2023},
  organization={Springer}
}

We borrow code from "CVSE" and "Multi-label Text Classification with BERT and PyTorch Lightning".
