# Introduction

Use this notebook to evaluate the transformations and filters in NL-Augmenter.

# Setting up the repository

**Clone the repository!**

In [None]:
!git clone https://github.com/GEM-benchmark/NL-Augmenter.git

In [None]:
%cd NL-Augmenter
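
Every evaluation command later in this notebook calls `evaluate.py` by relative path, so the commands only work from the repository root. A minimal sketch of a sanity check follows; the helper name `in_repo_root` is hypothetical and not part of NL-Augmenter, and the two file names it checks are the ones this notebook itself uses.

```python
from pathlib import Path


def in_repo_root(path="."):
    """Return True if `path` looks like the NL-Augmenter repository root."""
    root = Path(path)
    # evaluate.py is the script invoked by the evaluation cells;
    # requirements.txt is the file installed in the setup cell.
    return (root / "evaluate.py").is_file() and (root / "requirements.txt").is_file()


print(in_repo_root())
```

If this prints `False`, re-run the `%cd NL-Augmenter` cell before continuing.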

## Installation of requirements

The base requirements install all the filters (both light and heavy) and the light transformations. To install the heavy transformations as well, run the cell below.

In [None]:
TRANSFORMATIONS_DIR = 'transformations'

import os

# Install tesseract for OCR transformation
!sudo apt install -y libleptonica-dev libtesseract-dev tesseract-ocr{,-eng,-osd}

# Install base project requirements
!pip install -r requirements.txt

# English Spacy model
!python -m spacy download en_core_web_sm

# Install requirements for every transformation (this is necessary to run the evaluation script)
for transformation_dir in os.listdir(TRANSFORMATIONS_DIR):
  transformation_path = os.path.join(TRANSFORMATIONS_DIR, transformation_dir)
  if os.path.isdir(transformation_path) and 'requirements.txt' in os.listdir(transformation_path):
    transformation_reqs = os.path.join(transformation_path, 'requirements.txt')
    !pip install -r "$transformation_reqs"


**Note:**
The requirements for some transformations and filters may have been disabled when their PR was merged. If your transformation or filter has disabled requirements, install them separately by uncommenting the command below and filling in the relevant folder and transformation or filter name.

In [None]:
# !pip install -r transformations_or_filters_folder/transformation_or_filter_name/requirements-disabled.txt
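
To see which operations ship disabled requirements, the folders can be scanned for the `requirements-disabled.txt` file named in the command above. This is a minimal sketch, not part of NL-Augmenter: `find_disabled_requirements` is a hypothetical helper, and it assumes that the top-level folders are `transformations` and `filters` (only the former appears in the install cell of this notebook) with one subfolder per operation.

```python
from pathlib import Path


def find_disabled_requirements(root=".", folders=("transformations", "filters")):
    """Return sorted paths to requirements-disabled.txt, one level below each folder."""
    found = []
    for folder in folders:
        base = Path(root) / folder
        if not base.is_dir():
            continue  # skip folders that do not exist in this checkout
        found.extend(base.glob("*/requirements-disabled.txt"))
    return sorted(found)


# Print a ready-to-copy install command for every match.
for path in find_disabled_requirements():
    print(f"pip install -r {path}")
```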

# Transformations

Each transformation may support multiple task types. The datasets and model settings used for evaluation depend on the task type. See the [evaluation](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation) page for the available settings.

If you want to use any dataset other than the ones shown in the notebook for your task type, you can add that to the evaluation engine by following the instructions specified [here](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation#evaluation-guideline-and-scripts).

## Sentence Operation

If your transformation is a SentenceOperation, evaluate it by running the 4 cells below and pasting the numbers into the Excel sheet. (Note that you only need to change the model name: each of the cells below uses a different combination of model and dataset. To confirm what you are testing, check the [evaluation](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation) page.)

In [None]:
!python evaluate.py -t GeoNamesTransformation -task "TEXT_CLASSIFICATION" -m "textattack/roberta-base-imdb" -d "imdb" -p 20

In [None]:
!python evaluate.py -t GeoNamesTransformation -task "TEXT_CLASSIFICATION" -m "textattack/roberta-base-SST-2" -d "sst2" -p 20

In [None]:
!python evaluate.py -t GeoNamesTransformation -task "TEXT_CLASSIFICATION" -m "textattack/bert-base-uncased-QQP" -d "qqp" -p 20

In [None]:
!python evaluate.py -t GeoNamesTransformation -task "TEXT_CLASSIFICATION" -m "roberta-large-mnli" -d "multi_nli" -p 20
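
The four cells above differ only in the model and dataset. A minimal sketch that builds the same four commands for any operation name is shown here; `build_commands` and `SENTENCE_SETTINGS` are hypothetical names introduced for illustration, while the flags and the model/dataset pairs are copied from the cells above.

```python
# (model, dataset) pairs copied from the four cells above
SENTENCE_SETTINGS = [
    ("textattack/roberta-base-imdb", "imdb"),
    ("textattack/roberta-base-SST-2", "sst2"),
    ("textattack/bert-base-uncased-QQP", "qqp"),
    ("roberta-large-mnli", "multi_nli"),
]


def build_commands(name, flag="-t", p_value=20):
    """Return one evaluate.py argument list per (model, dataset) setting."""
    return [
        ["python", "evaluate.py", flag, name,
         "-task", "TEXT_CLASSIFICATION",
         "-m", model, "-d", dataset, "-p", str(p_value)]
        for model, dataset in SENTENCE_SETTINGS
    ]


for cmd in build_commands("GeoNamesTransformation"):
    print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing `flag="-f"` with a filter name produces the filter variant of the same commands.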

## QuestionAnswer Operation

If your transformation is a question-answering one, run the command below with your transformation name.

In [None]:
!python evaluate.py -t QuestionInCaps -task "QUESTION_ANSWERING" -m "mrm8488/bert-tiny-finetuned-squadv2" -d "squad" -p 20

## Tagging Operation

If your transformation is a tagging operation, run the command below with your transformation name.

In [None]:
!python evaluate.py -t LongerLocationNer -task "TEXT_TAGGING" -m "dslim/bert-base-NER" -p 20

# Filters

Each filter may support multiple task types. The datasets and model settings used for evaluation depend on the task type. See the [evaluation](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation) page for the available settings.

If you want to use any dataset other than the ones shown in the notebook for your task type, you can add that to the evaluation engine by following the instructions specified [here](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation#evaluation-guideline-and-scripts).

## Sentence Operation

If your filter is a SentenceOperation, evaluate it by running the 4 cells below and pasting the numbers into the Excel sheet. (Note that you only need to change the model name: each of the cells below uses a different combination of model and dataset. To confirm what you are testing, check the [evaluation](https://github.com/GEM-benchmark/NL-Augmenter/tree/main/evaluation) page.)

In [None]:
!python evaluate.py -f TextContainsNumberFilter -task "TEXT_CLASSIFICATION" -m "textattack/roberta-base-imdb" -d "imdb" -p 20

In [None]:
!python evaluate.py -f TextContainsNumberFilter -task "TEXT_CLASSIFICATION" -m "textattack/roberta-base-SST-2" -d "sst2" -p 20

In [None]:
!python evaluate.py -f TextContainsNumberFilter -task "TEXT_CLASSIFICATION" -m "textattack/bert-base-uncased-QQP" -d "qqp" -p 20

In [None]:
!python evaluate.py -f TextContainsNumberFilter -task "TEXT_CLASSIFICATION" -m "roberta-large-mnli" -d "multi_nli" -p 20

## QuestionAnswer Operation

If your filter is a question-answering one, run the command below with your filter name.

In [None]:
!python evaluate.py -f NumericQuestion -task "QUESTION_ANSWERING" -m "mrm8488/bert-tiny-finetuned-squadv2" -d "squad" -p 20

**Note:**

If you run into any issues or errors while running the notebook, please raise an issue [here](https://github.com/GEM-benchmark/NL-Augmenter/issues).