
Adversarial Examples for Evaluating Math Word Problem Solvers

Accepted at EMNLP Findings 2021

In this work, we evaluate the robustness of existing state-of-the-art Math Word Problem (MWP) solvers by devising adversarial attacks.

We conduct our tests on three major solvers, Graph2Tree, GTS, and Seq2Seq, over two English-language benchmark datasets, ASDiv-A and MAWPS.

On average, we observe a decline of over 40 percentage points in the accuracy of these solvers. The evaluations were performed in two different settings: using BERT embeddings and training from scratch.

TRAINING COMMANDS:

For Graph2Tree:

cd code/Graph2Tree

a) Train from Scratch:

python -m src.main -mode train -gpu 0 -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -full_cv -run_name run_mawps_fold1 -save_model

b) Using BERT Embeddings:

python -m src.main -mode train -gpu 0 -embedding -bert -bert_embedding_directory -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -full_cv -run_name run_mawps_fold1 -save_model

For GTS:

cd code/GTS

a) Train from Scratch:

python -m src.main -mode train -gpu 0 -embedding -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -full_cv -run_name run_mawps_g2tee_fold1 -save_model

b) Using BERT Embeddings:

python -m src.main -mode train -gpu 0 -embedding -bert -bert_embedding_directory -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -full_cv -run_name run_mawps_g2tee_fold0 -save_model

TEST COMMANDS:

Graph2Tree:

cd code/Graph2Tree

a) Trained from Scratch:

python -m src.main -mode test -gpu 0 -embedding -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -no-full_cv -run_name run_mawps_g2tee_fold1_fold1

b) Using BERT Embeddings:

python -m src.main -mode test -gpu 0 -bert -bert_embedding_directory -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -no-full_cv -run_name run_mawps_g2tee_fold1_fold1

GTS:

cd code/GTS

a) Trained from Scratch:

python -m src.main -mode test -gpu 0 -embedding -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -no-full_cv -run_name run_mawps_fold1_fold1

b) Using BERT Embeddings:

python -m src.main -mode test -gpu 0 -embedding -bert -bert_embedding_directory -embedding_size 300 -hidden_size 256 -depth 2 -lr 0.0002 -emb_lr 8e-6 -batch_size 4 -epochs 50 -no-full_cv -run_name run_mawps_fold1_fold1

Comment out the test_batch = final_adv_pair[0] line for simple (non-adversarial) evaluation.
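
For context, the toggle looks roughly like the sketch below; only the named line comes from this README, while the surrounding context and the assumption that final_adv_pair holds the adversarial batch in its first element are ours:

# In the solver's test code (illustrative context, not the actual source):
test_batch = final_adv_pair[0]  # keep this line to evaluate on the adversarial batch
# Comment out the line above to evaluate on the original, unperturbed test batch instead.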

The adversarial examples generated by both attacks, Question Reordering and Sentence Paraphrasing, are written to a new output folder in the model directory.
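
To make the Question Reordering attack concrete, here is a minimal sketch of the idea in Python (illustrative only, not the code used in this repository; the sentence-splitting heuristic, function name, and example problem are our own assumptions):

import random

def reorder_question(problem: str, seed: int = 0) -> str:
    # Illustrative sketch: move the final question sentence of an MWP
    # to an earlier position while keeping the body sentences in order.
    sentences = [s.strip() for s in problem.split(".") if s.strip()]
    *body, question = sentences  # assume the question is the last sentence
    random.seed(seed)
    pos = random.randrange(len(body)) if body else 0
    reordered = body[:pos] + [question] + body[pos:]
    # Restore the terminal punctuation removed by the naive split.
    return " ".join(s if s.endswith("?") else s + "." for s in reordered)

# Illustrative usage:
print(reorder_question(
    "John had 8 marbles. He gave 3 marbles to Mary. "
    "How many marbles does John have now?"
))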

We have utilized parts of the model implementations of the MWP solvers from their respective repositories:

GTS: https://github.com/ShichaoSun/math_seq2tree

Graph2Tree: https://github.com/2003pro/Graph2Tree

Seq2Seq: https://github.com/arkilpatel/SVAMP

We have also referred to the repository https://github.com/arkilpatel/SVAMP and found it immensely useful.

Feel free to open an issue or mail us at vivek269611@gmail.com for any queries.

If you find our work useful, please consider citing it using:

@misc{kumar2021adversarial,
      title={Adversarial Examples for Evaluating Math Word Problem Solvers}, 
      author={Vivek Kumar and Rishabh Maheshwary and Vikram Pudi},
      year={2021},
      eprint={2109.05925},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
