In [1]:
%load_ext autoreload
%autoreload 2

import logging
logging.getLogger("allennlp").setLevel("ERROR")

# Overview

This notebook demonstrates how to create perturbations with Tailor.

By the end of the notebook, you will know how to:
- Perturb sentences with Tailor in a single line
- Detect and select the perturbations available for a sentence
- Use the additional controls that Tailor provides
- Combine Tailor with external keywords

In [57]:
# Instantiate the wrapper.
from tailor.tailor_wrapper import Tailor
tl = Tailor()

In [58]:
# The base sentence 
text = "In the operation room, the doctor comforted the athlete."

# Perturb the sentence in one line.
# On the first call, the wrapper automatically loads the models it needs,
# e.g. the generator and the perplexity filter.
perturbations = tl.perturb(text)
perturbations

2022-03-14 00:44:36,662 - INFO - cached_path - cache of https://storage.googleapis.com/allennlp-public-models/structured-prediction-srl-bert.2020.12.15.tar.gz is up-to-date


[', comforted in the operation room',
 'the athlete was comforted by the doctor .',
 '- the athlete was comforted by the doctor .',
 'Having comforted the doctor , the athlete',
 "the athlete 's comforted by the doctor",
 'In which case , the doctor comforted the athlete.',
 'Having comforted the athlete , the doctor was.',
 'Having comforted the doctor , the athlete was.',
 'Having comforted the doctor , the athlete was.',
 '- In the case , the doctor comforted the athlete.']
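The raw list above can contain repeats (for example, `'Having comforted the doctor , the athlete was.'` appears twice). A minimal sketch of an order-preserving de-duplication step follows. `dedupe_perturbations` is a hypothetical helper, not part of Tailor, and it assumes `tl.perturb` returns a plain list of strings, as the output above suggests.

```python
def dedupe_perturbations(perturbations):
    """Drop repeated perturbations, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for sentence in perturbations:
        # Compare on whitespace-normalized text so spacing differences collapse.
        key = " ".join(sentence.split())
        if key not in seen:
            seen.add(key)
            unique.append(sentence)
    return unique


# Sample strings copied from the output above.
sample = [
    "Having comforted the athlete , the doctor was.",
    "Having comforted the doctor , the athlete was.",
    "Having comforted the doctor , the athlete was.",
]
unique_sample = dedupe_perturbations(sample)
print(unique_sample)
```

In the notebook, the same helper could be applied directly to `perturbations`.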

## Perturbation with control

One advantage of Tailor is that it allows various kinds of controls based on semantic roles. Here, we provide examples for different perturbation strategies, and demonstrate how to invoke these changes in the package.

| Strategy      | Example |
| ----------- | ----------- |
| change_tense      | In the operation room, the doctor **comforts** the athlete. |
| change_voice   | In the operation room, the athlete **was comforted by** the doctor. |
| swap_core   | In the operation room, **the athlete** comforted **the doctor**. |
| add_detail   | **Under the dim light** in the operation room, the doctor comforted the athlete. |
| delete_detail   | ~~In the operation room,~~ the doctor comforted the athlete. |

In [59]:
# To perturb with more control, first detect which changes are possible.

# As shown below, once a span is selected, the system tries to return only
# perturbations related to that span.
perturb_strategies = tl.detect_possible_perturbs(
    sentence=text,
    selected_span="In the operation room",
    # Print the possible change types.
    verbalize=True
)

DETECTED POSSIBLE CHANGES

SENTENCE: In the operation room, the doctor comforted the athlete.
	| [change_content] [LOCATIVE: In the operation room]
	| [add_details] [LOCATIVE: In the operation room]
	| [change_role] [LOCATIVE: In the operation room]
	| [delete_text] [LOCATIVE: In the operation room]




In [107]:
from tailor.common.utils.detect_perturbations import get_common_keywords_by_tag
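# NOTE: `data_path` points to a local copy of the training data on the author's
# machine; replace it with the path to your own copy.
keywords = get_common_keywords_by_tag(
    data_path="/home/wtshuang/sourcetree/label-contrast-generation/label_contrast/srl/data/orig/train.json",
    nlp=tl.spacy_model)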

100%|██████████| 90856/90856 [27:55<00:00, 54.24it/s]



In [56]:
import json
with open("common_keywords_by_tag.json", "w") as f:
    json.dump(keywords, f)
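Because the keyword extraction above took roughly 28 minutes, later sessions can reload the saved file instead of recomputing it. The following is a minimal sketch of that round trip. `load_keywords` is a hypothetical helper, and the toy dictionary (including its tag names and values) is made up for illustration; the real structure returned by `get_common_keywords_by_tag` may differ.

```python
import json
import os
import tempfile


def load_keywords(path):
    """Read back a keyword file previously written with json.dump."""
    with open(path, "r") as f:
        return json.load(f)


# Toy stand-in for the real `keywords` object (its structure is an assumption).
toy_keywords = {"LOCATIVE": ["in", "at"], "TEMPORAL": ["when", "after"]}

# Use a temporary directory so the example does not touch real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "common_keywords_by_tag.json")
    with open(path, "w") as f:
        json.dump(toy_keywords, f)
    reloaded = load_keywords(path)

print(reloaded == toy_keywords)
```

In the notebook itself, `load_keywords("common_keywords_by_tag.json")` would read the file written by the cell above.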

In [71]:
# `tl.perturb` accepts the same `selected_span` argument.
perturbations = tl.perturb(
    sentence=text,
    selected_span="In the operation room",
    # Filter perturbations by change type, as printed above.
    allowed_perturbs=["change_content"],
    # Reuse the strategies detected earlier.
    candidate_inputs=perturb_strategies,
    # Filter out degenerate outputs using the GPT-2 perplexity score.
    # If None, this step is skipped.
    perplex_thred=50,
    # Maximum number of perturbations to return.
    num_perturbs=10
)
perturbations

["In case of an injury , the doctor 's comforted the athlete.",
 "In case of a fatal accident , the doctor 's comforted the athlete.",
 "In case of a bruised hand , the doctor 's comforted the athlete."]
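To make the `perplex_thred` argument more concrete, here is a small sketch of what a perplexity filter computes in general: perplexity is the exponential of the mean negative log-likelihood per token. This is not Tailor's implementation. The helper names, the hand-written log-probabilities, and the choice to keep candidates at or below the threshold are all assumptions for illustration; in practice the token log-probabilities would come from a language model such as GPT-2.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def filter_by_perplexity(scored, threshold=None):
    """Keep texts whose perplexity is at or below the threshold.

    `scored` is a list of (text, token_log_probs) pairs.
    With threshold=None, nothing is filtered out.
    """
    if threshold is None:
        return [text for text, _ in scored]
    return [text for text, log_probs in scored if perplexity(log_probs) <= threshold]


# Made-up log-probabilities: a fluent candidate and a degenerate one.
scored = [
    ("fluent candidate", [math.log(0.5)] * 4),       # perplexity of about 2
    ("degenerate candidate", [math.log(0.01)] * 4),  # perplexity of about 100
]
kept = filter_by_perplexity(scored, threshold=50)
print(kept)
```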

## Perturbation with context

Because Tailor takes semantic controls, it can also be combined with external keywords. Three arguments are available for this. For now, the library allows only one of them to be non-None per call; if you set more than one, the system won't be able to recognize the request.

- `to_content`: Keywords that should occur in the generation.
- `to_semantic_role`: Randomly select a phrase in the current sub-span as the keyword, but change the generated semantic role. Accepted values are `'PURPOSE'`, `'AGENT'`, `'DISCOURSE'`, `'MODAL'`, `'PREDICATE'`, `'ATTRIBUTE'`, `'PATIENT'`, `'GOAL'`, `'END'`, `'ARG2'`, `'DIRECTIONAL'`, `'CAUSE'`, `'EXTENT'`, `'COMITATIVE'`, `'TEMPORAL'`, `'MANNER'`, `'NEGATION'`, `'ADVERBIAL'`, `'LOCATIVE'`, and `'VERB'`.
- `to_tense`: (specific to verbs) change the tense (future, present, past).
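Since only one of the three controls may be set per call, it can help to check the arguments before invoking the model. The sketch below is a hypothetical pre-flight check, not part of Tailor; the accepted roles and tenses are copied from the list above, and the function name is made up.

```python
ACCEPTED_ROLES = {
    "PURPOSE", "AGENT", "DISCOURSE", "MODAL", "PREDICATE", "ATTRIBUTE",
    "PATIENT", "GOAL", "END", "ARG2", "DIRECTIONAL", "CAUSE", "EXTENT",
    "COMITATIVE", "TEMPORAL", "MANNER", "NEGATION", "ADVERBIAL",
    "LOCATIVE", "VERB",
}
ACCEPTED_TENSES = {"future", "present", "past"}


def check_context_controls(to_content=None, to_semantic_role=None, to_tense=None):
    """Return the name of the single control that is set, or raise ValueError."""
    controls = {
        "to_content": to_content,
        "to_semantic_role": to_semantic_role,
        "to_tense": to_tense,
    }
    chosen = [name for name, value in controls.items() if value is not None]
    if len(chosen) != 1:
        raise ValueError(f"Set exactly one control, got {len(chosen)}: {chosen}")
    if to_semantic_role is not None and to_semantic_role not in ACCEPTED_ROLES:
        raise ValueError(f"Unknown semantic role: {to_semantic_role!r}")
    if to_tense is not None and to_tense not in ACCEPTED_TENSES:
        raise ValueError(f"Unknown tense: {to_tense!r}")
    return chosen[0]


print(check_context_controls(to_semantic_role="TEMPORAL"))
```

Calling the check with two controls at once, or with an unlisted role or tense, raises a `ValueError` before any generation is attempted.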

In [102]:
perturbations = tl.perturb_with_context(
    "In the operation room, the doctor comforted the athlete.", 
    "In the operation room",
    to_content="bridge",
    verbalize=True
)
perturbations


SENTENCE: In the operation room, the doctor comforted the athlete.
	| [change_content] [LOCATIVE: In the operation room]
	| [VERB+active+past: comfort | LOCATIVE+partial: bridge] <extra_id_0> , the doctor <extra_id_1> the athlete.




["Under the bridge , the doctor 's comforted the athlete.",
 "Under a bridge , the doctor 's comforted the athlete."]

In [106]:
perturbations = tl.perturb_with_context(
    "In the operation room, the doctor comforted the athlete.", 
    "In the operation room",
    to_semantic_role="TEMPORAL",
    verbalize=True
)
perturbations


SENTENCE: In the operation room, the doctor comforted the athlete.
	| [change_role] [LOCATIVE: In the operation room]
	| [VERB+active+past: comfort | TEMPORAL+partial: the operation room] <extra_id_0> , the doctor <extra_id_1> the athlete.




['When the doctor came into the operation room , the physician comforted the athlete.',
 'When the doctor arrived in the operation room , the physician comforted the athlete.',
 "While the doctor was in the operation room , the physician 's comforted the athlete."]

In [97]:
perturbations = tl.perturb_with_context(
    "In the operation room, the doctor comforted the athlete.", 
    "comforted",
    to_tense="future",
    verbalize=True
)
perturbations


SENTENCE: In the operation room, the doctor comforted the athlete.
	| [change_tense] [VERB: comforted]
	| [VERB+active+future: comfort | MODAL: *] In the operation room, the doctor <extra_id_0> the athlete.




['In the operation room , the doctor will comfort the athlete.',
 "In the operation room , the doctor 's will comfort the athlete."]
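The generated strings are tokenized, so punctuation and the clitic `'s` are preceded by a space. For display purposes, a light detokenization pass can be applied. The regex-based helper below is a hypothetical sketch, not part of Tailor; it only fixes spacing and does not repair ungrammatical generations such as a spurious `'s`.

```python
import re


def detokenize(sentence):
    """Remove the space that tokenization leaves before punctuation and 's."""
    # " ," -> ",", " ." -> ".", and so on.
    sentence = re.sub(r"\s+([,.;:!?])", r"\1", sentence)
    # "athlete 's" -> "athlete's"
    sentence = re.sub(r"\s+('s)\b", r"\1", sentence)
    return sentence


# Sample strings copied from the outputs in this notebook.
print(detokenize("In the operation room , the doctor will comfort the athlete."))
print(detokenize("the athlete 's comforted by the doctor"))
```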