MICO

This is the code repository for the EMNLP 2022 paper MICO: A Multi-alternative Contrastive Learning Framework for Commonsense Knowledge Representation.

Data Preparation

Training Data

Download ATOMIC19 and CN-82k and put the dataset_only folder under ./preprocess. Then prepare the training data:

cd ./preprocess
python mapping_train_name.py

Evaluation Data

cd ./CSQA_eval

COPA: transform the original data into the format expected by MICO.

cd ./datasets/copa/
python transform.py

CommonsenseQA (CSQA) and SocialIQA (SIQA)

Training

cd ./scripts

CUDA_VISIBLE_DEVICES=0 python main.py \
    --temp 0.07 \
    --save_folder ./ckpts_atomic/k2/roberta_large \
    --batch_size 196 \
    --max_seq_length 32 \
    --learning_rate 0.000005 \
    --epochs 10 \
    --save_freq 3 \
    --model roberta-large \
    --tokenizer_name roberta-large \
    --trainfile ../preprocess/ATOMIC-Ind-train.txt \
    --valfile ../preprocess/ATOMIC-Ind-valid.txt \
    --dropout \
    --k 2
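The --temp flag above is a contrastive temperature. As a rough illustration of how such a temperature enters a contrastive objective, here is a minimal sketch of a standard single-positive InfoNCE loss in plain Python. It is not the code in main.py: the function names are hypothetical, the embeddings are toy lists, and the multi-alternative scheme controlled by --k is not reproduced.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temp=0.07):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    positives[i] is the match for anchors[i]; every other entry of
    `positives` serves as an in-batch negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Similarities are divided by the temperature before the softmax.
        logits = [cosine(anchor, p) / temp for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_denominator - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned pairs:", info_nce_loss(anchors, aligned))
    print("swapped pairs:", info_nce_loss(anchors, swapped))
```

A smaller temperature sharpens the softmax, so mismatched pairs are penalized more heavily than they would be at a temperature of 1.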

Evaluation for Zero-shot Commonsense Question Answering

Use pre-trained LMs to evaluate the CSQA tasks.

cd ./LM_baseline
sh eval_baseline.sh

This reports the accuracy on the three tasks (COPA, CSQA, and SIQA) and generates the prediction file.

Use MICO to evaluate the CSQA tasks. For example, to test SIQA with a checkpoint trained on ATOMIC:

cd ../CSQA_eval

CUDA_VISIBLE_DEVICES=0 python evaluate_socialiqa.py --save_folder ../scripts/ckpts_atomic/k2/roberta_large \
    --max_seq_length 64 \
    --temp 0.07 \
    --model roberta-large \
    --tokenizer_name roberta-large \
    --testfile ../dataset/SIQA/socialiqa-train-dev/dev.jsonl \
    --testlabel ../dataset/SIQA/socialiqa-train-dev/dev-labels.lst

For more evaluation commands, see eval.sh. Trained models with RoBERTa-large and k=2 are available for download.
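For intuition, zero-shot multiple-choice evaluation with a representation model can be sketched as follows: embed the question context and each candidate answer, pick the candidate with the highest similarity, and compare against the gold labels. This is a minimal sketch under that assumption, not the logic of evaluate_socialiqa.py. The helper names are hypothetical and the vectors stand in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_answer(context_vec, candidate_vecs):
    """Return the 0-based index of the candidate most similar to the context."""
    scores = [cosine(context_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(predictions, labels):
    """Fraction of predictions that equal the gold labels."""
    correct = sum(1 for p, g in zip(predictions, labels) if p == g)
    return correct / len(labels)


if __name__ == "__main__":
    # Toy embeddings: one context and three candidate answers.
    context = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    prediction = pick_answer(context, candidates)
    print("predicted index:", prediction)
    print("accuracy:", accuracy([prediction], [1]))
```

If the label file uses 1-based answer indices, shift the predictions by one before comparing them with the labels.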

Evaluation for Inductive CSKG Completion

Use the trained models to generate features first, then retrieve candidates and compute the MRR and top-10 rank score.

First, extract features for the training and test datasets. This step generates pickle files for CSKG completion:

cd ./scripts
sh eval_tail.sh

Then, for ATOMIC19:

python eval_mrr_atomic.py

and for ConceptNet:

python eval_mrr_cn.py
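The two metrics named in this section can be sketched as below. The sketch assumes "rank top10 score" means the fraction of queries whose gold tail ranks in the top 10 (often written Hits@10), and that ties are resolved in favor of the gold candidate. The evaluation scripts may differ in these details, and the function names are hypothetical.

```python
def rank_of_gold(scores, gold_index):
    """1-based rank of the gold candidate when sorted by descending score.

    Candidates that tie with the gold score do not push its rank down.
    """
    gold_score = scores[gold_index]
    return 1 + sum(1 for s in scores if s > gold_score)


def mrr_and_hits(ranks, k=10):
    """Mean reciprocal rank and the fraction of ranks that are <= k."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, hits


if __name__ == "__main__":
    # Toy retrieval scores for one query; the gold tail is candidate 2.
    print("rank:", rank_of_gold([0.1, 0.9, 0.5], gold_index=2))
    # Toy ranks for three queries.
    mrr, hits_at_10 = mrr_and_hits([1, 2, 20], k=10)
    print("MRR:", mrr, "Hits@10:", hits_at_10)
```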
