Revisiting Document Representations for Large-Scale Zero-Shot Learning

Official implementation of the paper Revisiting Document Representations for Large-Scale Zero-Shot Learning
by Jihyung Kil and Wei-Lun Chao (NAACL 2021).

[Update 03/20/22]: Added the environment file, the visual features and labels of our split, and code for the weighted average semantic representations and DeViSE*.

Environment

Import our conda environment:

conda env create -f ZSL_fv.yaml
conda activate ZSL_fv

Dataset

Wikipedia Documents

The filtered and unfiltered Wikipedia sentences are available here. Please refer to the related README for more details.

Semantic Representations

Extract the semantic representations from the filtered or unfiltered sentences:

CUDA_VISIBLE_DEVICES=0 python3 get_sem_rep.py --wiki_set data/21k_true_wiki_sents_vis_sec_clu --pool avg_pool --flt vis_sec_clu --max_seq_len 64 --max_sent all
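As a rough illustration of what average pooling over sentences can look like, here is a minimal sketch in plain Python. It assumes that `--pool avg_pool` means a mean over the non-padding token vectors of each sentence, followed by a mean over the sentences of a class; that reading is an assumption, and the actual behavior is defined in get_sem_rep.py. The names `masked_avg_pool` and `class_representation` are hypothetical and do not come from the repository.

```python
def masked_avg_pool(token_embs, attention_mask):
    """Average the token vectors of one sentence, skipping padding positions."""
    kept = [vec for vec, keep in zip(token_embs, attention_mask) if keep]
    if not kept:
        raise ValueError("sentence has no non-padding tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def class_representation(sentences):
    """Average the pooled sentence vectors of one class into a single vector."""
    pooled = [masked_avg_pool(embs, mask) for embs, mask in sentences]
    dim = len(pooled[0])
    return [sum(vec[d] for vec in pooled) / len(pooled) for d in range(dim)]


# Toy input: two sentences with 2-dimensional token vectors.
# Positions whose mask value is 0 are padding and are ignored.
sentences = [
    ([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]),
    ([[5.0, 6.0], [0.0, 0.0], [0.0, 0.0]], [1, 0, 0]),
]
class_rep = class_representation(sentences)
print(class_rep)
```

In practice the token vectors would come from a sentence encoder and be truncated to `--max_seq_len` tokens; the toy lists above only stand in for that output.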

Visual Features

We use the ResNet visual features (He et al., 2016) provided by Xian et al. (2018a).

The visual features and labels for our 1K/2-Hop/3-Hop/ALL splits are also available.

Visual Attributes

For AwA2 and aPY, we use the visual attributes provided by Xian et al. (2018a).

Data Split

Please refer to the README here for how to split ImageNet into our settings (i.e., 2-Hop, 3-Hop, ALL).

For AwA2 and aPY, we follow the split proposed by Xian et al. (2018a).
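For readers unfamiliar with hop-based ImageNet splits, the sketch below shows the general idea on a toy hierarchy: a class belongs to the k-Hop set if it lies within k tree hops of a seed class in the label hierarchy. This is an illustration of the common definition only; the exact procedure used for our splits is described in the linked README. The function `classes_within_hops` and the toy class names are hypothetical.

```python
from collections import deque


def classes_within_hops(edges, seed_classes, max_hops):
    """Return the non-seed classes within max_hops tree hops of any seed class."""
    # Treat the hierarchy as an undirected tree so hops can go up or down.
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    # Breadth-first search from all seed classes at once.
    dist = {c: 0 for c in seed_classes}
    queue = deque(seed_classes)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {c for c, d in dist.items() if d > 0}


# Toy hierarchy given as (parent, child) pairs; "husky" plays the role of a seen class.
edges = [("animal", "dog"), ("animal", "cat"), ("dog", "husky"), ("cat", "lion")]
two_hop = classes_within_hops(edges, ["husky"], max_hops=2)
three_hop = classes_within_hops(edges, ["husky"], max_hops=3)
print(sorted(two_hop), sorted(three_hop))
```

By construction each larger hop limit yields a superset of the smaller one.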

Code

We leverage three Zero-Shot Learning models in our experiments:

  • DeViSE (Frome et al., 2013): DeViSE and DeViSE* are based on the implementation from here.
  • EXEM (Changpinyo et al., 2020): We use its official implementation.
  • HVE (Liu et al., 2020): The official implementation can be found here.

Weighted Average Semantic Representations (ac):

python3 train_b_psi_comb_fv.py --output fine_tune_b_psi --tau 0.96 --type pre_trained --train True --discri discriminate --epochs 5 --split all_data --batch_size 512 --eps 0.95 --lr 1e-4 --sent_rep vis_sec_clu_sem_rep_pre_trained_fv.pt --avg_rep vis_sec_clu_avg_sem_rep_pre_trained_fv/bert_sem.pt
  • Obtain the weighted average semantic representations (ac) after training:
python3 train_b_psi_comb_fv.py --output fine_tune_b_psi --tau 0.96 --type pre_trained --train False --discri discriminate --epochs 1 --split all_data --batch_size 768 --eps 0.95
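To make the term "weighted average" concrete, here is a generic sketch that combines the sentence representations of one class using softmax-normalized weights. It is not the training code: how the per-sentence scores are produced, and what `--tau` and `--eps` control, is defined in train_b_psi_comb_fv.py and is not reproduced here. The softmax normalization, the function names, and the toy scores are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_average(sent_reps, scores):
    """Combine sentence vectors into one class vector using softmax weights."""
    weights = softmax(scores)
    dim = len(sent_reps[0])
    return [sum(w * rep[d] for w, rep in zip(weights, sent_reps)) for d in range(dim)]


# Three toy sentence vectors for one class.
sent_reps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal scores reduce to a plain average.
uniform = weighted_average(sent_reps, [0.0, 0.0, 0.0])

# A much larger score on the first sentence makes it dominate the result.
peaked = weighted_average(sent_reps, [10.0, 0.0, 0.0])
print(uniform, peaked)
```

With equal scores the result matches the unweighted average, so a plain average is the special case in which every sentence is treated as equally informative.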

Train DeViSE*:
python3 DeVise_star.py --data_dir /local/scratch/jihyung --output devise_result --tau 0.96 --type pre_trained --eps 0.95 --bs 768 --split all_data --marg 0.2 --lr 0.0004 --num_epochs 50 --sem_type bert_p_w --sem_rep fine_tune_b_psi_eps_0.95_tau_0.96_pre_trained_fv/all_data_semantic_rep_after_train_epochs_1.pt
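For orientation, the sketch below shows the hinge rank loss from the original DeViSE formulation (Frome et al., 2013) for a single image, with the margin set to the 0.2 passed via `--marg` above. Whether DeVise_star.py uses exactly this form is an assumption; the function names are hypothetical, and the image vector is taken to be already projected into the semantic space.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hinge_rank_loss(projected_image, class_reps, true_idx, margin=0.2):
    """Sum of margin violations of every wrong class against the true class."""
    true_score = dot(class_reps[true_idx], projected_image)
    loss = 0.0
    for j, rep in enumerate(class_reps):
        if j == true_idx:
            continue
        # A wrong class is penalized if it scores within `margin` of the true class.
        loss += max(0.0, margin - true_score + dot(rep, projected_image))
    return loss


# Toy semantic representations for three classes.
class_reps = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

# Image aligned with its true class (index 0): no margin violations.
well_separated = hinge_rank_loss([1.0, 0.0], class_reps, true_idx=0)

# Image aligned with a wrong class (index 1): the loss is positive.
confused = hinge_rank_loss([0.0, 1.0], class_reps, true_idx=0)
print(well_separated, confused)
```

The loss is zero only when the true class outscores every other class by at least the margin, which is what pushes images toward the semantic representation of their own class.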

Citation

If you find the code and data useful, please cite the following paper:

@inproceedings{kil2021revisiting,
  title={Revisiting Document Representations for Large-Scale Zero-Shot Learning},
  author={Kil, Jihyung and Chao, Wei-Lun},
  booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  pages={3117--3128},
  year={2021}
}
