In [None]:
import openai
import pandas as pd
import numpy as np
import math

# Are informal comments more toxic?

In this notebook we'll use Marianna Apidianaki's method of calculating interpretable dimensions in semantic vector space on the fly using seed pairs. To start, we want to look at the same dimensions: formality and complexity. But we want to look at the sentence level rather than the word level. 

## Step 1: Generating formality seed pairs

We want sevenish pairs of sentences, or really two symmetrical groups of sentences, that can be used to calculate a dimension. 

In [None]:
sentences = """Last week I got into a car accident.
She had some amazing news to share but nobody to share it with.
Sometime you just have to give up and win by cheating.
They desperately needed another drummer since the current one only knew how to play bongos.
The bread dough reminded her of Santa Clause’s belly.
He realized there had been several deaths on this road, but his concern rose when he saw the exact number.
Trash covered the landscape like sprinkles do a birthday cake."""
sentences = sentences.split("\n")
sentences

### Step 1: Load and use GPT to generate sentences

In [None]:
from openai import OpenAI
client = OpenAI() # OPENAI_API_KEY environment variable must be set. see quickstart tutorial here: https://platform.openai.com/docs/quickstart?context=python



Try an example completion

In [None]:
sentence = sentences[0]

messages=[
    {"role": "system", "content": "You are a rewording assistant, skilled in transforming a statement to express more or less of a given quality or property."},
    {"role": "user", "content": "Rephrase the following statement to use language that is more complex: \"{}\" .".format(sentence)}
  ]


In [None]:
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=messages
)

print(completion.choices[0].message)

In [None]:
completion.choices[0]

We'll feed this output back to the API as part of the conversation, so that we can ask for a further rephrasing.

In [None]:
# the model's earlier reply goes back into the conversation with the "assistant" role
messages.append({'role': 'assistant', 'content': completion.choices[0].message.content})
messages.append({"role": "user", "content": "Good. Rephrase the sentence again to use language that is even more complex."})
messages

In [None]:
def complete(messages):
    completion = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=messages,
      seed=42
    )
    return completion.choices[0].message.content

complete(messages)

### Prompt templates

In [None]:
# dictionary of the adjectives we use (property adjective and antonym) to create prompts

property_dict = {
    'complexity':   ('complex', 'simple'),
    'emotion':      ('emotional', 'emotionless')
}


We will generate sentences from a series of templates. For each sentence, we want to generate 'more x', 'even more x', as well as 'less x' and 'even less x'. Because the model often produces longer sentences for 'more' prompts, we also prompt for rephrasings using an antonymous adjective. So, for example, we ask for rephrasings that are "more complex" as well as rephrasings that are "less simple". We then use all of these rephrasings to calculate the complexity dimension.
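
The prompt scheme described above can be sketched without calling the API. The helper names `first_turn` and `follow_up` are hypothetical; the prompt wording is copied from this notebook's cells, and the model's earlier reply is passed back with the `assistant` role.

```python
SYSTEM_PROMPT = (
    "You are a rewording assistant, skilled in transforming a statement "
    "to express more or less of a given quality or property."
)

def first_turn(sentence, direction, adjective):
    """Build the messages for a 'more x' or 'less x' request."""
    prompt = 'Rephrase the following statement to use language that is {} {}: "{}" .'.format(
        direction, adjective, sentence
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]

def follow_up(messages, reply, direction, adjective):
    """Append the model's reply and ask for an 'even more x' / 'even less x' version."""
    prompt = "Good. Rephrase the sentence again to use language that is even {} {}.".format(
        direction, adjective
    )
    return messages + [
        {"role": "assistant", "content": reply},  # earlier model output
        {"role": "user", "content": prompt},
    ]

# Every (direction, adjective) combination requested for one seed sentence
for adjective in ("complex", "simple"):
    for direction in ("more", "less"):
        msgs = first_turn("Last week I got into a car accident.", direction, adjective)
        print(msgs[-1]["content"])
```

Because `follow_up` returns a new list instead of appending in place, the 'more' and 'less' conversations for the same sentence stay independent of each other.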

In [None]:
# TODO save 5 responses instead of 1


adj, antonym = property_dict['complexity']

data = []

for sent in sentences:

    for i, x in enumerate([adj, antonym]):
        print(i)
        print(x)
        messages=[
            {"role": "system", "content": "You are a rewording assistant, skilled in transforming a statement to express more or less of a given quality or property."},
        ]

        
        # more
        more_messages = messages + [{"role": "user", "content": "Rephrase the following statement to use language that is more {}: \"{}\" .".format(x,sent)}]
        more = complete(more_messages)
        print(more)
        
        # less
        less_messages = messages + [{"role": "user", "content": "Rephrase the following statement to use language that is less {}: \"{}\" .".format(x,sent)}]
        less = complete(less_messages)
        print(less)
        
        row = {
             'sentence': sent,
             'text1': more,
             'text2': less,
             'more': 1,
             'even_more': 0,
             'less': 1,
             'even_less':  0,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)
                
        # even more
        # the model's previous reply is passed back with the "assistant" role
        even_more_messages = more_messages + [{"role": "assistant", "content": more}] + [{"role": "user", "content": "Good. Rephrase the sentence again to use language that is even more {}.".format(x)}]
        even_more = complete(even_more_messages)
        print(even_more)
        
        # even less
        even_less_messages = less_messages + [{"role": "assistant", "content": less}] + [{"role": "user", "content": "Good. Rephrase the sentence again to use language that is even less {}.".format(x)}]
        even_less = complete(even_less_messages)
        print(even_less)
        
        row = {
             'sentence': sent,
             'text1': even_more,
             'text2': even_less,
             'more': 0,
             'even_more': 1,
             'less': 0,
             'even_less':  1,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)

        # TODO even even more


    
df = pd.DataFrame.from_records(data)
df

Save the results so we don't have to query the API every time.

In [None]:
# index=False avoids writing the row index as an extra unnamed column
df.to_csv('make_it_more_complexity_pilot_seed_sentences.csv', index=False)
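
One way to make the caching explicit is to regenerate only when the file is missing. This is a minimal sketch: `load_or_generate` and the `generate` callback are hypothetical names, and the demo callback returns a placeholder row instead of calling the API.

```python
import os
import tempfile
import pandas as pd

def load_or_generate(path, generate):
    """Return the cached DataFrame at `path`, or build it with `generate()` and save it."""
    if os.path.exists(path):
        return pd.read_csv(path)
    df = pd.DataFrame.from_records(generate())
    df.to_csv(path, index=False)
    return df

# Demo with a placeholder generator (no API call is made here)
calls = []

def fake_generate():
    calls.append(1)
    return [{"sentence": "s", "text1": "more", "text2": "less"}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "seed_sentences.csv")
    first = load_or_generate(path, fake_generate)   # generates and writes the file
    second = load_or_generate(path, fake_generate)  # reads the file back
    print(len(calls), list(second.columns))
```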

## Step 2: Calculating the formality dimension



In [None]:
df = pd.read_csv('make_it_more_complexity_pilot_seed_sentences.csv')

Now that we have our seed sentences for the complexity dimension, we need to get the vector differences for the seed pairs.

We generated 8 sentences for each original seed sentence, meaning we have four seed pairs.

The formulas for the four seed pairs are as follows:

- ( adjective + more ) - (adjective + less)
- ( adjective + even more ) - (adjective + even less)
- ( antonym + less ) - (antonym + more )
- ( antonym + even less ) - (antonym + even more )

First we get an embedding for each sentence. Then, for each seed sentence we calculate these four formulae to get the vector differences, storing those in a separate list. And then we average those together. 
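
As a toy illustration of the averaging step, the sketch below uses made-up 3-dimensional vectors in place of real sentence embeddings. `seed_dimension` is a hypothetical helper name, not part of the original method's code.

```python
import numpy as np

def seed_dimension(pairs):
    """Average the (positive - negative) difference vectors of the seed pairs."""
    diffs = [np.asarray(pos) - np.asarray(neg) for pos, neg in pairs]
    return np.mean(diffs, axis=0)

# Made-up vectors standing in for embeddings; each tuple is (positive, negative)
pairs = [
    ([1.0, 0.0, 2.0], [0.0, 0.0, 1.0]),  # (adjective + more, adjective + less)
    ([3.0, 1.0, 2.0], [1.0, 1.0, 0.0]),  # (antonym + less, antonym + more)
]
dimension = seed_dimension(pairs)
print(dimension)
```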

--NOPE__So now that we have our seed sentences for the complexity dimension, we need to split them into negative and positive sentences. The generated sentences should be divided as follows.

Positive
- adjective + more
- adjective + even more
- antonym + less
- antonym + even less

Negative
- adjective + less
- adjective + even less
- antonym + more
- antonym + even more

After we split them into positive and negative examples, we embed them using SBERT--

In [None]:
# positive = df[df['antonym?']==0][df['more']==1]['text'].to_list() + df[df['antonym?']==0][df['even_more']==1]['text'].to_list() + df[df['antonym?']==1][df['less']==1]['text'].to_list() + df[df['antonym?']==1][df['even_less']==1]['text'].to_list() 
# negative = df[df['antonym?']==0][df['less']==1]['text'].to_list() + df[df['antonym?']==0][df['even_less']==1]['text'].to_list() + df[df['antonym?']==1][df['more']==1]['text'].to_list() + df[df['antonym?']==1][df['even_more']==1]['text'].to_list() 

# print(positive)
# print()
# print(negative)

The original method works on word-level vectors, whereas we want sentence-level representations. The simplest thing I can think of to do here is to use SentenceBERT (SBERT), which we will download from Hugging Face.

After initializing the model, we generate vector representations for each sentence in the informal list and for each corresponding sentence in the formal list. We subtract the vectors from one another and then take the average, leaving us with a vector that represents the formality dimension. We can rate any sentence vector(s) on the formality dimension by giving them (as a list) to the function predict_scalarproj along with the dimension itself. 

In [None]:
# load sbert
!pip install -U sentence-transformers

In [None]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')

# Each row holds one seed pair: text1 is the "more" rephrasing, text2 the "less" rephrasing.
# Sentences are encoded by calling model.encode()
emb_more = model.encode(df['text1'].to_list())
emb_less = model.encode(df['text2'].to_list())

df = df.assign(embedding_more=emb_more.tolist(), embedding_less=emb_less.tolist())

df.head(5)

# #Print the embeddings
# for sentence, embedding in zip(positive[:5], pos_embeddings[:5]):
#     print("Sentence:", sentence)
#     print("Embedding:", embedding[:100])
#     print("")

In [None]:
difference_vecs = []

for sentence in df['sentence'].unique():
    seeds = df[df['sentence']==sentence]

    # there are 4 rows (8 rephrasings) with this sentence
    print(len(seeds))

    # each row yields one of the four formulae
    for _, row in seeds.iterrows():
        more_vec = np.asarray(row['embedding_more'])
        less_vec = np.asarray(row['embedding_less'])

        if row['antonym?'] == 0:
            # ( adjective + more ) - ( adjective + less ), and the "even" variant
            diff_vec = more_vec - less_vec
        else:
            # ( antonym + less ) - ( antonym + more ), and the "even" variant
            diff_vec = less_vec - more_vec

        difference_vecs.append(diff_vec)

print(len(difference_vecs))



In [None]:
print(difference_vecs[23])

In [None]:
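# average the difference vectors into a single dimension vector
# (note: only the first 8 difference vectors are used here)
dimvec = np.mean(difference_vecs[:8], axis = 0)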
dimvec

In [None]:
#### from marianna + katrin
# seed-based method
# averaging over seed pair vectors
# def dimension_seedbased(seeds_pos, seeds_neg, space, paired = False):
#     diffvectors = [ ]
    
#     for negword, posword in _make_seedpairs(seeds_pos, seeds_neg, paired = paired):
#         diffvectors.append(space[posword] - space[negword])

#     # average
#     dimvec = np.mean(diffvectors, axis = 0)
#     return dimvec


In [None]:
def dimension_seedbased():
    return dimvec

In [None]:
complexity_dimension = dimension_seedbased()

In [None]:
# vector scalar projection (from marianna + katrin)
def predict_scalarproj(veclist, dimension):
    dir_veclen = math.sqrt(np.dot(dimension, dimension))
    return [np.dot(v, dimension) / dir_veclen for v in veclist]
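
To see what the scalar projection returns, here is a small check on hand-picked 2-D vectors (illustrative values, not embeddings). The function is repeated so that the block runs on its own.

```python
import math
import numpy as np

def predict_scalarproj(veclist, dimension):
    dir_veclen = math.sqrt(np.dot(dimension, dimension))
    return [np.dot(v, dimension) / dir_veclen for v in veclist]

toy_dimension = np.array([3.0, 4.0])   # length 5
toy_vectors = [
    np.array([3.0, 4.0]),              # same direction as the dimension
    np.array([-3.0, -4.0]),            # opposite direction
    np.array([4.0, -3.0]),             # orthogonal to the dimension
]
toy_scores = predict_scalarproj(toy_vectors, toy_dimension)
print(toy_scores)
```

A vector pointing along the dimension gets a positive score, one pointing the other way gets a negative score, and an orthogonal vector scores zero.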

## Step 3: Validating the formality dimension

Does it behave the same way as a standard classifier?


We load a regular classifier

We run this prediction method and the formality classifier on the formality dataset. 

We compare. Is the dimension-based method that much worse?

We load a formality dataset - perhaps the word-based one that Marianna uses.

We order the entries by their complexity rating and look at where they fall on our complexity axis.
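
A rank correlation is one simple way to check whether the projected scores order items the same way as reference ratings. The sketch below uses `scipy.stats.spearmanr`; the numbers are placeholders, not results from any dataset.

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder values: reference ratings and projected scores for five items
gold_ratings = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
projected = np.array([-0.30, -0.10, 0.05, 0.20, 0.90])

# Items sorted by reference rating, with their projected score alongside
order = np.argsort(gold_ratings)
for idx in order:
    print(gold_ratings[idx], projected[idx])

rho, p_value = spearmanr(gold_ratings, projected)
print(rho)
```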

## Step 4: Rating Toxicity Datasets for formality

We'll start with the 1000-length parallel dataset from the text detoxification paper. 

We load it in

We SBERTize the sentences

We pass them to the prediction method. 

We observe: do toxic and nontoxic comments differ wrt formality?
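
One possible way to compare the two groups is to look at the difference in mean projected score, optionally with a significance test. The sketch assumes a toxicity threshold of 0.5 for splitting the comments, and the arrays are placeholders rather than real data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Placeholder values standing in for per-comment toxicity and projected scores
toxicity = np.array([0.0, 0.1, 0.8, 0.9, 0.2, 0.7])
projected = np.array([0.4, 0.5, -0.2, -0.1, 0.3, 0.0])

is_toxic = toxicity >= 0.5  # assumed threshold
toxic_scores = projected[is_toxic]
nontoxic_scores = projected[~is_toxic]

print(toxic_scores.mean(), nontoxic_scores.mean())

# Two-sided Mann-Whitney U test on the two groups
stat, p_value = mannwhitneyu(toxic_scores, nontoxic_scores, alternative="two-sided")
print(stat, p_value)
```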

In [None]:
!pip install datasets

In [None]:
from datasets import load_dataset

dataset = load_dataset("civil_comments")

In [None]:
dataset["train"][0]

In [None]:
dataset["train"][:10]['text']

SBERTize the comments

In [None]:
# encode the first 100 comments in one batch; the result has one row per comment
sentence_embs = model.encode(dataset["train"][:100]['text'])


Calculate complexity

In [None]:
complexity_dimension

In [None]:
complexities = predict_scalarproj(sentence_embs, complexity_dimension)

# for i, emb in enumerate(sentence_embs):
#     dataset["train"][i]['complexity_computed'] = sentence_embs[i]
#     complexities.append( sentence_embs[i] )

#dataset["train"][:5]
complexities[:5]

In [None]:
dataset["train"][:10]['text']

In [None]:
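import numpy as np
import scipy.stats

# `scores` was not defined above; this assumes the `toxicity` column of civil_comments
# for the same 100 comments that were embedded
scores = dataset["train"][:100]['toxicity']

scipy.stats.pearsonr(complexities, scores)    # Pearson's r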

In [None]:
complexities