# **GPT-2 and Implicit Causality**
*Authors:* Hien Huynh, Tom Lentz and Emiel van Miltenburg




### ***Experiment 1:***
Quantitative evaluation of the GPT-2 model behavior by assessing its next-word surprisal computation

**DATA PRE-PROCESSING** \
\
*Step 1: Uploading data to Google Colab* \
* The following code block lets you upload the data to Google Colab:

```
from google.colab import files
uploaded = files.upload()
```
* When you run it, a new window appears and asks you to select a file.
* Select the file ```IC_mismatch.csv``` from the ```data``` folder.

\

*Step 2: Processing the dataframe*
* For each verb and each pair of gender-opposite protagonists, two stimulus sentences are created: one with a male subject (and thus a female object) and one with a female subject (and a male object).
* The pronouns 'she' and 'he' (i.e. the first word after 'because') will be removed. For every stimulus sentence, the GPT-2 model is then tasked with assigning next-word surprisal values to both 'she' and 'he'.
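
As a minimal sketch of how the two pronoun-free contexts for one verb and one noun pair could be assembled: the helper `make_contexts` is hypothetical and not part of the original notebook, and the sentence frame is assumed from the stimuli in `IC_mismatch.csv`.

```python
def make_contexts(verb, male_noun, female_noun):
    """Return the two pronoun-free contexts for one verb and one noun pair."""
    template = "The {subj} {verb} the {obj} because"
    return [
        template.format(subj=male_noun, verb=verb, obj=female_noun),  # male subject
        template.format(subj=female_noun, verb=verb, obj=male_noun),  # female subject
    ]

contexts = make_contexts("abandoned", "man", "woman")

# Each context is scored twice, once per candidate pronoun
candidates = [(context, pronoun) for context in contexts for pronoun in ("he", "she")]
for context, pronoun in candidates:
    print(context, "->", pronoun)
```

Each verb and noun pair therefore yields two contexts and four (context, pronoun) combinations to score.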



In [None]:
# Install Transformers Library
!pip install transformers


In [None]:
# Import Packages
from google.colab import files
from google.colab import drive
import pandas as pd
import numpy as np
import io


import torch
torch.set_grad_enabled(False)
import torch.nn.functional as F

from transformers import GPT2LMHeadModel
from transformers import GPT2Tokenizer

In [None]:
# Upload IC_mismatch.csv data to Google Colab
uploaded = files.upload()

Saving IC_mismatch.csv to IC_mismatch.csv


In [None]:
# Format the data uploaded as a Pandas dataframe
df = pd.read_csv(io.BytesIO(uploaded['IC_mismatch.csv']))

In [None]:
# Print the first 10 rows to see what the original dataframe looks like
pd.set_option('display.max_columns', None)
df.head(10)

| | exp | verb | item | sent | bias | isHigh | gender |
|---|---|---|---|---|---|---|---|
| 0 | ic_mismatch | abandoned | 0 | The man abandoned the woman because he | 33 | 1 | m |
| 1 | ic_mismatch | abandoned | 0 | The man abandoned the woman because she | 33 | 0 | f |
| 2 | ic_mismatch | abandoned | 0 | The woman abandoned the man because he | 33 | 0 | m |
| 3 | ic_mismatch | abandoned | 0 | The woman abandoned the man because she | 33 | 1 | f |
| 4 | ic_mismatch | acclaimed | 0 | The man acclaimed the woman because he | -58 | 1 | m |
| 5 | ic_mismatch | acclaimed | 0 | The man acclaimed the woman because she | -58 | 0 | f |
| 6 | ic_mismatch | acclaimed | 0 | The woman acclaimed the man because he | -58 | 0 | m |
| 7 | ic_mismatch | acclaimed | 0 | The woman acclaimed the man because she | -58 | 1 | f |
| 8 | ic_mismatch | accompanied | 0 | The man accompanied the woman because he | -48 | 1 | m |
| 9 | ic_mismatch | accompanied | 0 | The man accompanied the woman because she | -48 | 0 | f |


*Columns in the dataframe:*
* ```exp```: name of the experiment (```ic_mismatch```), indicating that the genders of the two protagonists in a stimulus sentence are opposite (i.e. male vs. female and vice versa). Since all of the sentences in this data are used for the same experiment, this column does not provide useful information and will be removed.
* ```verb```: interpersonal verbs (in past tense) used in the stimulus sentences.
* ```item```: numbered from 0 to 13, indicating 14 pairs of gender-mismatched protagonists in the sentences.
* ```sent```: the actual stimulus sentences as inputs for the GPT-2 model.
* ```isHigh```: whether the pronoun ('she' or 'he') following the word 'because' refers to the subject of the preceding clause (```1```) or not (```0```). Because we remove these pronouns and task the model with computing surprisal values for both 'he' and 'she' for each stimulus sentence, the column is not used in its original sense; it is renamed ```subject_gender``` during cleaning.
* ```gender```: whether the pronoun after the word 'because' is male (```m```) or female (```f```). This column will not be used either, because the first word after 'because' will be removed.
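
The relationship between ```isHigh``` and ```gender``` can be checked on the first four rows of the preview above. This is only a sketch: the helper `pronoun_matches_subject` and the `MALE_NOUNS` set are hypothetical names introduced for illustration.

```python
# Rows copied from the first verb in the dataframe preview
rows = [
    {"sent": "The man abandoned the woman because he", "isHigh": 1, "gender": "m"},
    {"sent": "The man abandoned the woman because she", "isHigh": 0, "gender": "f"},
    {"sent": "The woman abandoned the man because he", "isHigh": 0, "gender": "m"},
    {"sent": "The woman abandoned the man because she", "isHigh": 1, "gender": "f"},
]

MALE_NOUNS = {"man"}  # assumption: only the nouns needed for these four rows

def pronoun_matches_subject(row):
    """Return 1 if the pronoun's gender equals the subject's gender, else 0."""
    subject = row["sent"].split(" ")[1]  # second word is the subject noun
    subject_gender = "m" if subject in MALE_NOUNS else "f"
    return int(subject_gender == row["gender"])

checks = [pronoun_matches_subject(row) == row["isHigh"] for row in rows]
print(checks)
```

For these rows, ```isHigh``` is ```1``` exactly when the pronoun's gender matches the subject's gender.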



In [None]:
# EDA
# Print number of items (i.e. pairs of gender-mismatched nouns)
print("The dataset includes",len(df['item'].unique().tolist()),"pairs of gender-mismatched subjects and objects.")

# Print number of verbs
print("The dataset includes",len(df['verb'].unique().tolist()),"interpersonal verbs.")

# Which nouns are used in the stimulus sentences?
male_nouns = df['sent'].apply(lambda x: x.split(" ")[1]).unique()[::2].tolist()
female_nouns = df['sent'].apply(lambda x: x.split(" ")[1]).unique()[1::2].tolist()
print("List of male protagonists used in the stimuli: \n",male_nouns)
print("List of female protagonists used in the stimuli: \n",female_nouns)


The dataset includes 14 pairs of gender-mismatched subjects and objects.
The dataset includes 246 interpersonal verbs.
List of male protagonists used in the stimuli: 
 ['man', 'boy', 'father', 'uncle', 'husband', 'actor', 'prince', 'waiter', 'lord', 'king', 'son', 'nephew', 'brother', 'grandfather']
List of female protagonists used in the stimuli: 
 ['woman', 'girl', 'mother', 'aunt', 'wife', 'actress', 'princess', 'waitress', 'lady', 'queen', 'daughter', 'niece', 'sister', 'grandmother']


In [None]:
# Clean the dataframe (1)
'''
Remove duplicate first clauses. For example: in the pairs "The man abandoned the woman because he" 
and "The man abandoned the woman because she", the latter will be removed. 
'''
df = df.iloc[::2]
df = df.reset_index(drop=True)
print(df.head(10))

           exp         verb  item                                      sent  \
0  ic_mismatch    abandoned     0    The man abandoned the woman because he   
1  ic_mismatch    abandoned     0    The woman abandoned the man because he   
2  ic_mismatch    acclaimed     0    The man acclaimed the woman because he   
3  ic_mismatch    acclaimed     0    The woman acclaimed the man because he   
4  ic_mismatch  accompanied     0  The man accompanied the woman because he   
5  ic_mismatch  accompanied     0  The woman accompanied the man because he   
6  ic_mismatch      accused     0      The man accused the woman because he   
7  ic_mismatch      accused     0      The woman accused the man because he   
8  ic_mismatch      admired     0      The man admired the woman because he   
9  ic_mismatch      admired     0      The woman admired the man because he   

   bias  isHigh gender  
0    33       1      m  
1    33       0      m  
2   -58       1      m  
3   -58       0      m  
4   -
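
The slice `df.iloc[::2]` relies on the rows strictly alternating between 'he' and 'she'. As a sketch, assuming that ordering holds for the whole file, filtering on the `gender` value selects the same rows without depending on their position. The tuples below are copied from the first verb in the preview.

```python
# (sentence, gender of the final pronoun) for the first verb in the data
rows = [
    ("The man abandoned the woman because he", "m"),
    ("The man abandoned the woman because she", "f"),
    ("The woman abandoned the man because he", "m"),
    ("The woman abandoned the man because she", "f"),
]

by_position = rows[::2]                               # every second row
by_gender = [row for row in rows if row[1] == "m"]    # rows ending in 'he'

print(by_position == by_gender)
```

In pandas, the corresponding filter would be `df[df['gender'] == 'm']`, applied before the `gender` column is dropped.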

In [None]:
# Clean the dataframe (2)
'''
In the column 'sent', remove the last word (i.e. 'he') for all sentences.
'''
def remove_final_word(sentence):
  # Split the string from the right (rsplit) into a list.
  # 1 indicates the first word from the right will be separated from the rest.
  sentence = sentence.rsplit(" ",1)
  
  # Return the first element from the list
  # i.e. the first clause up until 'because'
  return sentence[0]

df['sent'] = df['sent'].apply(lambda x: remove_final_word(x))
print(df['sent'].head(10))

0      The man abandoned the woman because
1      The woman abandoned the man because
2      The man acclaimed the woman because
3      The woman acclaimed the man because
4    The man accompanied the woman because
5    The woman accompanied the man because
6        The man accused the woman because
7        The woman accused the man because
8        The man admired the woman because
9        The woman admired the man because
Name: sent, dtype: object
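
For reference, a small sketch of what `rsplit(" ", 1)` does on a single stimulus sentence, compared with `split` using the same limit:

```python
sentence = "The man abandoned the woman because he"

# rsplit with maxsplit=1 cuts once, at the LAST space
context, pronoun = sentence.rsplit(" ", 1)
print(repr(context))
print(repr(pronoun))

# split with the same limit cuts once, at the FIRST space instead
first_word, rest = sentence.split(" ", 1)
print(repr(first_word))
```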


In [None]:
# Clean the dataframe (3)
df.rename(columns={'isHigh':'subject_gender'},inplace=True)
df = df[['verb','item','sent','bias','subject_gender']]
print(df.head(10))

          verb  item                                   sent  bias  \
0    abandoned     0    The man abandoned the woman because    33   
1    abandoned     0    The woman abandoned the man because    33   
2    acclaimed     0    The man acclaimed the woman because   -58   
3    acclaimed     0    The woman acclaimed the man because   -58   
4  accompanied     0  The man accompanied the woman because   -48   
5  accompanied     0  The woman accompanied the man because   -48   
6      accused     0      The man accused the woman because     2   
7      accused     0      The woman accused the man because     2   
8      admired     0      The man admired the woman because   -92   
9      admired     0      The woman admired the man because   -92   

   subject_gender  
0               1  
1               0  
2               1  
3               0  
4               1  
5               0  
6               1  
7               0  
8               1  
9               0  


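* In the column ```subject_gender```, ```0``` indicates a female subject in the stimulus sentence while ```1``` indicates a male subject. This works because only the sentences ending in 'he' were kept, so the original ```isHigh``` value is ```1``` exactly when the subject is male.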


---



**NEXT-WORD SURPRISAL COMPUTATION**

\
*Step 1: Initialize the GPT-2 model from the Transformers library.* \
*Step 2: Create and execute a function that computes the surprisal value for the final word of a sentence.* \
*Step 3: Estimate the difference in the surprisal values between 'he' and 'she', yielding the subject-preference scores.* \
*Step 4: Append the surprisal values and the subject-preference scores as new columns to the dataframe.* \
*Step 5: Save the final dataframe as* ```experiment1_surprisals.csv``` *file. This file will be used for analyses in* ```R```.
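
In symbols, the quantities computed in Steps 2 and 3 can be sketched as follows. The notation ($S$, $c$, $p_{\text{subj}}$, $p_{\text{obj}}$) is introduced here for illustration and is derived from the code in this notebook rather than taken from it.

$$S(w \mid c) = -\log_2 P(w \mid c)$$

$$\text{subject preference}(c) = S(p_{\text{obj}} \mid c) - S(p_{\text{subj}} \mid c)$$

Here $c$ is the stimulus sentence up to and including 'because', $p_{\text{subj}}$ is the pronoun that matches the gender of the subject, and $p_{\text{obj}}$ is the pronoun that matches the gender of the object. Under this definition, a positive score means that the subject-referring pronoun is less surprising to the model than the object-referring one.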

In [None]:
# Initialize the GPT-2 model from the Transformers library
'''
Because of computational limitations, the current research uses the small
variant of the GPT-2 model ("gpt2").
'''
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)


In [None]:
# Create the function that takes a stimulus sentence and a target pronoun i.e. 'he' or 'she'
# and tasks the GPT-2 model to compute the surprisal of that pronoun
def surprisals(sequence, pronoun):
  # Append the target pronoun so that it becomes the final token of the input
  sequence = sequence + " " + pronoun
  sent_surprisals = []

  # Tokenize sentence
  encoded = tokenizer.encode(sequence, add_special_tokens=True)
  input_ids = torch.tensor(encoded).unsqueeze(0)

  # Get model outputs
  outputs = model(input_ids)

  # Get logit values (shape: batch x sequence length x vocabulary size)
  predictions = outputs.logits

  # Calculate surprisals in bits: -log2(p) = -ln(p) / ln(2).
  # This equals log2(exp(-ln p)) but avoids exponentiating large values.
  surps = -F.log_softmax(predictions, dim=-1) / np.log(2)

  # Get next-word surprisals: the logits at position y predict the token at y+1
  for y in range(len(input_ids[0])-1):
    target_id = int(input_ids[0][y+1])
    target_word = tokenizer.decode([target_id]).replace(' ','')
    surp = float(surps[0, y, target_id])
    sent_surprisals.append((target_word, surp))

  # Return the surprisal of the final token, i.e. the pronoun
  return sent_surprisals[-1][-1]
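
To make the conversion from log-probabilities to bits concrete, here is a minimal, self-contained sketch using only the standard library. The three-word vocabulary and the logit values are hypothetical; the real model produces one logit per vocabulary entry at every position. The sketch also shows that `log2(exp(-ln p))` and `-ln p / ln 2` are two forms of the same quantity.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits (natural log)."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]

# Hypothetical logits over a toy three-word vocabulary
vocab = ["he", "she", "they"]
logits = [2.0, 1.0, 0.0]

log_probs = log_softmax(logits)
for word, lp in zip(vocab, log_probs):
    bits_direct = -lp / math.log(2)           # -ln p / ln 2
    bits_via_exp = math.log2(math.exp(-lp))   # log2(exp(-ln p))
    print(word, round(bits_direct, 4), round(bits_via_exp, 4))
```

A word with a higher logit receives a higher probability and therefore a lower surprisal.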

In [None]:
# Apply the above function to all sentences in the dataframe
pronoun='he'
df['surprisal_he'] = df['sent'].apply(lambda x: surprisals(x,pronoun))
pronoun='she'
df['surprisal_she'] = df['sent'].apply(lambda x: surprisals(x,pronoun))

# Print the first 10 results
print(df.head(10))

          verb  item                                   sent  bias  \
0    abandoned     0    The man abandoned the woman because    33   
1    abandoned     0    The woman abandoned the man because    33   
2    acclaimed     0    The man acclaimed the woman because   -58   
3    acclaimed     0    The woman acclaimed the man because   -58   
4  accompanied     0  The man accompanied the woman because   -48   
5  accompanied     0  The woman accompanied the man because   -48   
6      accused     0      The man accused the woman because     2   
7      accused     0      The woman accused the man because     2   
8      admired     0      The man admired the woman because   -92   
9      admired     0      The woman admired the man because   -92   

   subject_gender  surprisal_he  surprisal_she  
0               1      1.430030       1.416414  
1               0      1.402682       1.514996  
2               1      1.639298       2.250924  
3               0      1.749075       1.7088

In [None]:
# Compute subject-preference scores
subject_preference = []

for i in range(len(df)):
  if df["subject_gender"][i] == 1:
    subject_preference += [df["surprisal_she"][i]-df["surprisal_he"][i]]
  elif df["subject_gender"][i] == 0:
    subject_preference += [df["surprisal_he"][i]-df["surprisal_she"][i]]

df["subject_preference"] = subject_preference
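
The rule applied in the loop above can be sketched as a standalone function and checked against the first two rows of the surprisal output. The function name `subject_preference` and the explicit `ValueError` are illustrative choices, not part of the original notebook, which skips rows whose `subject_gender` is neither 0 nor 1.

```python
def subject_preference(surprisal_he, surprisal_she, subject_gender):
    """Surprisal of the object pronoun minus surprisal of the subject pronoun."""
    if subject_gender == 1:   # male subject: 'he' refers to the subject
        return surprisal_she - surprisal_he
    if subject_gender == 0:   # female subject: 'she' refers to the subject
        return surprisal_he - surprisal_she
    raise ValueError("subject_gender must be 0 or 1")

# Values taken from the first two rows of the surprisal output
row_0 = subject_preference(1.430030, 1.416414, 1)
row_1 = subject_preference(1.402682, 1.514996, 0)
print(round(row_0, 6), round(row_1, 6))
```

With pandas and NumPy, the same rule could be applied column-wise, for example with `np.where(df['subject_gender'] == 1, df['surprisal_she'] - df['surprisal_he'], df['surprisal_he'] - df['surprisal_she'])`.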

In [None]:
# What does the final dataframe look like?
print(df.head(10))

          verb  item                                   sent  bias  \
0    abandoned     0    The man abandoned the woman because    33   
1    abandoned     0    The woman abandoned the man because    33   
2    acclaimed     0    The man acclaimed the woman because   -58   
3    acclaimed     0    The woman acclaimed the man because   -58   
4  accompanied     0  The man accompanied the woman because   -48   
5  accompanied     0  The woman accompanied the man because   -48   
6      accused     0      The man accused the woman because     2   
7      accused     0      The woman accused the man because     2   
8      admired     0      The man admired the woman because   -92   
9      admired     0      The woman admired the man because   -92   

   subject_gender  surprisal_he  surprisal_she  subject_preference  
0               1      1.430030       1.416414           -0.013616  
1               0      1.402682       1.514996           -0.112314  
2               1      1.639298  

In [None]:
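# Save dataframe as a csv file
# Note: the file is written to the working directory of the Colab runtime;
# it is not stored on the mounted Google Drive.
drive.mount('drive')
df.to_csv('experiment1_surprisals.csv')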

Mounted at drive


In [None]:
# Download the csv file to your local computer
# The file can be used for analyses (in R)
files.download('experiment1_surprisals.csv')


\

### ***Experiment 2:***
Qualitative evaluation of the GPT-2 model behavior by assessing the outputs of GPT-2's pipeline for text generation.

**DATA PRE-PROCESSING** 

\
*Step 1: Upload the data to Google Colab*
* Similar to Step 1 in Experiment 1.
* This time, upload the file ```experiment1_surprisals.csv``` obtained from Experiment 1, as the sentences in this file have already been pre-processed as required.

\
*Step 2: Drop columns that are irrelevant to Experiment 2*
* ```surprisal_he```, ```surprisal_she```, ```subject_preference```, and ```subject_gender``` (only ```verb```, ```item```, ```sent```, and ```bias``` are kept)

In [None]:
# Upload experiment1_surprisals.csv to Google Colab
uploaded = files.upload()

In [None]:
# Format the data uploaded as a Pandas dataframe
df_2 = pd.read_csv(io.BytesIO(uploaded['experiment1_surprisals.csv']))

In [None]:
# Print the columns that df_2 includes
print(df_2.columns.to_list())

In [None]:
# Keep columns which are relevant to Experiment 2
df_2 = df_2[['verb','item','sent','bias']]

**GPT-2'S PIPELINE FOR TEXT GENERATION**

\
* In the following code blocks, a function is created in which the model is tasked with completing sentences, given the sequences in the column ```sent``` of ```df_2```.
* We limit the length of each output to 50 tokens via ```max_length```; this limit counts the tokens of the input sequence as well as the generated ones.
* For each model output, we keep the sequence up until the first period ('.'). This means that for the analyses, only the text that the model generates to complete the stimulus sentence itself will be taken into consideration.
* Save the outputs as ```experiment2_textgeneration.csv``` for analyses.
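
The truncation at the first period can be sketched on its own with `str.partition`. The helper `keep_first_sentence` and the example text are hypothetical. Unlike the notebook's function, this sketch does not append a period when the text contains none, which makes truncated generations easy to spot.

```python
def keep_first_sentence(text):
    """Keep everything up to and including the first period, if there is one."""
    head, sep, _ = text.partition(".")
    # partition returns an empty separator when no period is found
    return head + "." if sep else head

# Hypothetical generated text, for illustration only
generated = "The man abandoned the woman because he was afraid. She left the city."
print(keep_first_sentence(generated))

# A generation that was cut off before any period is returned unchanged
print(keep_first_sentence("The man abandoned the woman because he was"))
```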

In [None]:
# Function that tasks the GPT-2 model to complete a stimulus sentence
def sentence_completion(sequence):
  input_1 = tokenizer.encode(sequence, return_tensors="pt")
  
  # Parameter max_length is set to 50. For each sequence, the returned text
  # (input sequence plus generated continuation) is at most 50 tokens long.
  output_1 = model.generate(input_1, max_length = 50)
  output_1 = tokenizer.decode(output_1[0], skip_special_tokens = True)
  
  # Split the generated text at the first period (i.e. punctuation '.')
  # Keep only the part before the full stop for further analyses.
  # Note: a period is appended even if the generated text contains none.
  output_1 = output_1.partition('.')[0] + '.'
  return output_1

In [None]:
# Apply the above function to the whole dataframe
df_2['output'] = df_2['sent'].apply(lambda x: sentence_completion(x))

In [None]:
# Print first few sentences that the model completed
df_2['output'].head(15)

In [None]:
# Save dataframe as a csv file
drive.mount('drive')
df_2.to_csv('experiment2_textgeneration.csv')

In [None]:
# Download the csv file to your local computer
# The file can be used for analyses (in R)
files.download('experiment2_textgeneration.csv')