benchmark-corruptions

Public repository containing supplementary material for the work submitted to EMNLP-19.

This repository hosts the corrupted test sets that were generated to benchmark the robustness of the FastText and Bi-LSTM models, as described in the paper "Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions". These test sets were derived by applying the two corruption strategies and the different corruption methods to the test sets of the following datasets: SST2, IMDB, YELP and DBPEDIA.

The repository consists of two folders, one for each model used. The filenames follow the convention: [DATASET][CORRUPTION STRATEGY][CORRUPTION METHOD][MODEL]_n

DATASET: the base dataset that was used
CORRUPTION STRATEGY: one of Random or LIME-based corruption
CORRUPTION METHOD: how the selected word was modified (Deleted, Spelling Error Introduced, Text Noise Introduced, Replaced by Synonym)
MODEL: the model used for LIME explanations
n: the number of words changed as per the strategy

PS: For the FastText model, the files are in the format LABEL[TAB]TEXT. For the Bi-LSTM model, we use a self-explanatory JSON format to be compatible with AllenNLP.
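
As a rough illustration of the two file formats mentioned above, the following Python sketch loads a LABEL[TAB]TEXT file and a JSON file. The function names are hypothetical and not part of this repository. The JSON reader assumes one JSON object per line (the layout commonly used with AllenNLP dataset readers), which should be checked against the actual files, and it makes no assumption about the field names inside each object.

```python
import json
from typing import Any, Dict, List, Tuple


def read_label_text_file(path: str) -> List[Tuple[str, str]]:
    """Read a LABEL<TAB>TEXT file into a list of (label, text) pairs."""
    examples: List[Tuple[str, str]] = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split on the first tab only, in case the text itself contains tabs.
            label, separator, text = line.partition("\t")
            if not separator:
                raise ValueError(f"{path}:{line_number}: no tab separator found")
            examples.append((label, text))
    return examples


def read_json_lines_file(path: str) -> List[Dict[str, Any]]:
    """Read a file with one JSON object per line.

    Assumption: the Bi-LSTM files are JSON lines. If a file instead holds a
    single JSON document, json.load(handle) would be the call to use.
    """
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Splitting on the first tab only keeps any later tab characters inside the text field, so a corrupted sentence that happens to contain a tab is not truncated.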
