### Define the Prompt

We define a prompt that performs two tasks, given a topic and an argument: <br>
1) classify whether the argument contains a fallacy and, if so, which one <br>
2) decide whether the argument supports or refutes the topic <br>
A prompt template is a string in which each ```{}``` is a placeholder for one of a sample's inputs and ```<mask>``` marks the position where the predicted label is filled in. <br>
*Multiple masked tokens in a single prompt are currently not supported.*
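
As a rough illustration of the template mechanics described above, the sketch below fills the `{}` placeholders in order and, for a training example, swaps `<mask>` for the gold label. The helper names `fill_template` and `fill_label` are hypothetical and are not part of the `Prompt` class, whose actual implementation may differ.

```python
from typing import Sequence

MASK = "<mask>"


def fill_template(template: str, inputs: Sequence[str]) -> str:
    """Substitute each `{}` placeholder, in order, with the matching input."""
    if template.count("{}") != len(inputs):
        raise ValueError("number of inputs must match number of placeholders")
    if template.count(MASK) != 1:
        raise ValueError("exactly one <mask> token is expected per prompt")
    return template.format(*inputs)


def fill_label(prompt: str, label: str) -> str:
    """Replace the mask token with a known label (hypothetical training form)."""
    return prompt.replace(MASK, label)


template = "fallacy task. Topic: {} Text: {} This contains the fallacy: <mask>"
sample = ["Should we allow animal testing?",
          "Animal testing abuses animals and should be dis-continued"]

test_prompt = fill_template(template, sample)
train_prompt = fill_label(test_prompt, "NoFallacy")
print(test_prompt)
print(train_prompt)
```

Checking the mask count up front mirrors the single-mask restriction noted above.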

In [3]:
templates = {"fallacy": "fallacy task. Topic: {} Text: {} This contains the fallacy: <mask>", 
             "stance": "procon task. Topic: {} Text: {} Has the relation: <mask>"}

def fallacy_policy(pred):
    # Map any prediction outside the known fallacy labels to 'UNKNOWN'.
    fallacies = {'AppealtoEmotion', 'RedHerring', 'NoFallacy', 'IrrelevantAuthority', 'AdHominem', 'HastyGeneralization'}
    if pred not in fallacies:
        return 'UNKNOWN'
    return pred

def stance_policy(pred):
    # Only 'support' and 'contradict' are valid stance labels.
    if pred not in {"support", "contradict"}:
        return "UNKNOWN"
    return pred

policies = {"fallacy": fallacy_policy, "stance": stance_policy}

argument_prompt = Prompt(templates, policies)
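
Both policy functions follow one pattern: keep the predicted token if it is a known label, otherwise return `'UNKNOWN'`. A minimal sketch of that pattern as a reusable factory is shown below. `make_label_policy` is a hypothetical helper, not part of the library, and it assumes that a policy is simply a callable from a predicted string to a label.

```python
from typing import Callable, Iterable


def make_label_policy(labels: Iterable[str],
                      fallback: str = "UNKNOWN") -> Callable[[str], str]:
    """Build a policy that accepts only the given labels."""
    allowed = frozenset(labels)

    def policy(pred: str) -> str:
        # Anything outside the allowed label set collapses to the fallback.
        return pred if pred in allowed else fallback

    return policy


stance_policy_sketch = make_label_policy({"support", "contradict"})
print(stance_policy_sketch("support"))
print(stance_policy_sketch("maybe"))
```

This keeps the label sets in one place if more tasks are added later.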

`RobertaPrompt` uses this class to convert a sample into its desired prompt. For example:

In [4]:
argument_prompt.test_sample(["Should we allow animal testing?", "Animal testing abuses animals and should be dis-continued"], "fallacy")

'fallacy task. Topic: Should we allow animal testing? Text: Animal testing abuses animals and should be dis-continued This contains the fallacy: <mask>'

In [5]:
argument_prompt.train_sample(["Should we allow animal testing?", "Animal testing abuses animals and should be dis-continued"], "NoFallacy", "fallacy")

'fallacy task. Topic: Should we allow animal testing? Text: Animal testing abuses animals and should be dis-continued This contains the fallacy: NoFallacy'

In [6]:
print(argument_prompt)

=== Prompts ===
Task: fallacy, Template: fallacy task. Topic: {} Text: {} This contains the fallacy: <mask>
Task: stance, Template: procon task. Topic: {} Text: {} Has the relation: <mask>


### Load Model
We load a model that has already been trained for these tasks and display some sample predictions.

In [None]:
pmodel = RobertaPrompt(model='/content/drive/MyDrive/Laidlaw Research Project/models/prompt_combined', device=torch.device('cuda'), prompt=argument_prompt)

In [None]:
print(pmodel)

/content/drive/MyDrive/Laidlaw Research Project/models/prompt_combined

Task: fallacy, Template: fallacy task. Topic: {} Text: {} This contains the fallacy: <mask>
Task: stance, Template: procon task. Topic: {} Text: {} Has the relation: <mask>



In [None]:
fallacy = pmodel.infer(["Should we allow animal testing?", "Animal testing abuses animals and should be dis-continued"], "fallacy")
stance = pmodel.infer(["Should we allow animal testing?", "Animal testing abuses animals and should be dis-continued"], "stance")
print("Fallacy: {}\nStance: {}".format(fallacy, stance))

Fallacy: NoFallacy
Stance: contradict


In [None]:
fallacy = pmodel.infer(["Should we allow animal testing?", "Your stupid for bringing this up, animal testing is horrible"], "fallacy")
stance = pmodel.infer(["Should we allow animal testing?", "Your stupid for bringing this up, animal testing is horrible"], "stance")
print("Fallacy: {}\nStance: {}".format(fallacy, stance))

Fallacy: AdHominem
Stance: contradict


In [None]:
fallacy = pmodel.infer(["Should we allow animal testing?", "My Dad had a dog once, he says animal testing should be allowed"], "fallacy")
stance = pmodel.infer(["Should we allow animal testing?", "My Dad had a dog once, he says animal testing should be allowed"], "stance")
print("Fallacy: {}\nStance: {}".format(fallacy, stance))

Fallacy: IrrelevantAuthority
Stance: support


In [None]:
fallacy = pmodel.infer(["Should we allow animal testing?", "Everyone has a favorite animal, what is yours?"], "fallacy")
stance = pmodel.infer(["Should we allow animal testing?", "Everyone has a favorite animal, what is yours?"], "stance")
print("Fallacy: {}\nStance: {}".format(fallacy, stance))

Fallacy: RedHerring
Stance: support


In [None]:
print(pmodel.test("/content/drive/MyDrive/Laidlaw Research Project/data/test_samples.tsv"))

macro f1: 0.7863247863247864
micro f1: 0.7933884297520661
weighted f1: 0.7943516538557861
              precision    recall  f1-score   support

  contradict       0.73      0.77      0.75        48
     support       0.84      0.81      0.83        73

    accuracy                           0.79       121
   macro avg       0.78      0.79      0.79       121
weighted avg       0.80      0.79      0.79       121
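
To make the three F1 variants in this report concrete, the sketch below recomputes them from per-class counts. The counts are a reconstruction: they are one set of true positives, false positives and false negatives consistent with the rounded table and the printed scores, not values emitted by `pmodel.test`.

```python
# Per-class counts reconstructed from the report above (an inference, not
# output of pmodel.test): 37 of 48 "contradict" and 59 of 73 "support"
# samples are assumed to be classified correctly.
counts = {
    "contradict": {"tp": 37, "fp": 14, "fn": 11},
    "support": {"tp": 59, "fp": 11, "fn": 14},
}


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


per_class = {label: f1(**c) for label, c in counts.items()}
support = {label: c["tp"] + c["fn"] for label, c in counts.items()}
total = sum(support.values())

# Macro: unweighted mean of the per-class scores.
macro_f1 = sum(per_class.values()) / len(per_class)
# Weighted: per-class scores weighted by class support.
weighted_f1 = sum(per_class[label] * support[label] for label in per_class) / total
# Micro: pooled counts; for single-label classification this equals accuracy.
micro_f1 = sum(c["tp"] for c in counts.values()) / total

print("macro f1:", macro_f1)
print("micro f1:", micro_f1)
print("weighted f1:", weighted_f1)
```

Because "support" is the larger class and also scores higher, the weighted average sits slightly above the macro average.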



### Train Model

In [4]:
pmodel = RobertaPrompt(model='roberta-large', device=torch.device('cuda'), prompt=argument_prompt)
print(pmodel)

roberta-large

Task: fallacy, Template: fallacy task. Topic: {} Text: {} This contains the fallacy: <mask>
Task: stance, Template: procon task. Topic: {} Text: {} Has the relation: <mask>



In [5]:
pmodel.train(
    "/content/drive/MyDrive/Laidlaw Research Project/data/stance/training.tsv",
    "/content/drive/MyDrive/Laidlaw Research Project/data/stance/val.tsv",
    output_dir="/content/drive/MyDrive/Laidlaw Research Project/data/stance/model",
    epochs=3,
)




Training...
  Batch    10  of     62.    Elapsed: 0:00:11.
  Batch    20  of     62.    Elapsed: 0:00:19.
  Batch    30  of     62.    Elapsed: 0:00:27.
  Batch    40  of     62.    Elapsed: 0:00:35.
  Batch    50  of     62.    Elapsed: 0:00:43.
  Batch    60  of     62.    Elapsed: 0:00:51.

  Average training loss: 2.71
  Training epcoh took: 0:00:52

Running Validation...
SAVING NEW MODEL ... 
  Validation Loss: 0.01
  Validation took: 0:00:20

Training...
  Batch    10  of     62.    Elapsed: 0:00:08.
  Batch    20  of     62.    Elapsed: 0:00:16.
  Batch    30  of     62.    Elapsed: 0:00:24.
  Batch    40  of     62.    Elapsed: 0:00:32.
  Batch    50  of     62.    Elapsed: 0:00:40.
  Batch    60  of     62.    Elapsed: 0:00:48.

  Average training loss: 0.01
  Training epcoh took: 0:00:49

Running Validation...
SAVING NEW MODEL ... 
  Validation Loss: 0.00
  Validation took: 0:00:09

Training...
  Batch    10  of     62.    Elapsed: 0:00:08.
  Batch    20  of     62.    Elaps

{1: {'Training Loss': 2.710969789103875,
  'Valid. Loss': 0.00683596107410267,
  'Training Time': '0:00:52',
  'Validation Time': '0:00:20'},
 2: {'Training Loss': 0.0069421825389708244,
  'Valid. Loss': 0.004463053366634995,
  'Training Time': '0:00:49',
  'Validation Time': '0:00:09'},
 3: {'Training Loss': 0.004445496866772432,
  'Valid. Loss': 0.0037573204608634114,
  'Training Time': '0:00:49',
  'Validation Time': '0:00:09'}}
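
The per-epoch summary above is keyed by epoch number, so it can be inspected directly. As a small sketch, the snippet below copies the values shown above into a literal named `history` (a hypothetical variable name; the notebook does not assign the result of `train`) and picks the epoch with the lowest validation loss.

```python
# Values copied from the training summary printed above.
history = {
    1: {"Training Loss": 2.710969789103875,
        "Valid. Loss": 0.00683596107410267,
        "Training Time": "0:00:52",
        "Validation Time": "0:00:20"},
    2: {"Training Loss": 0.0069421825389708244,
        "Valid. Loss": 0.004463053366634995,
        "Training Time": "0:00:49",
        "Validation Time": "0:00:09"},
    3: {"Training Loss": 0.004445496866772432,
        "Valid. Loss": 0.0037573204608634114,
        "Training Time": "0:00:49",
        "Validation Time": "0:00:09"},
}

# Epoch whose validation loss is smallest.
best_epoch = min(history, key=lambda epoch: history[epoch]["Valid. Loss"])

for epoch, stats in sorted(history.items()):
    print("epoch {}: train={:.4f}  valid={:.4f}".format(
        epoch, stats["Training Loss"], stats["Valid. Loss"]))
print("lowest validation loss at epoch", best_epoch)
```

Validation loss falls in every epoch of this run, so the last epoch is also the best one by this criterion.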

In [6]:
print(pmodel.test("/content/drive/MyDrive/Laidlaw Research Project/data/stance/test_samples.tsv"))

macro f1: 0.7639353400222966
micro f1: 0.768595041322314
weighted f1: 0.7707878419340869
              precision    recall  f1-score   support

  contradict       0.68      0.79      0.73        48
     support       0.85      0.75      0.80        73

    accuracy                           0.77       121
   macro avg       0.76      0.77      0.76       121
weighted avg       0.78      0.77      0.77       121

