In [39]:
import openai
import pandas as pd
import numpy as np
import math

# Are informal comments more toxic?

In this notebook we'll use Marianna Apidianaki's method of calculating interpretable dimensions in semantic vector space on the fly using seed pairs. To start, we want to look at the same dimensions: formality and complexity. But we want to look at the sentence level rather than the word level. 

## Step 1: Generating formality seed pairs

We want sevenish pairs of sentences, or really two symmetrical groups of sentences, that can be used to calculate a dimension. 

In [2]:
sentences = """Last week I got into a car accident.
She had some amazing news to share but nobody to share it with.
Sometime you just have to give up and win by cheating.
They desperately needed another drummer since the current one only knew how to play bongos.
The bread dough reminded her of Santa Clause’s belly.
He realized there had been several deaths on this road, but his concern rose when he saw the exact number.
Trash covered the landscape like sprinkles do a birthday cake."""
sentences = sentences.split("\n")
sentences

['Last week I got into a car accident.',
 'She had some amazing news to share but nobody to share it with.',
 'Sometime you just have to give up and win by cheating.',
 'They desperately needed another drummer since the current one only knew how to play bongos.',
 'The bread dough reminded her of Santa Clause’s belly.',
 'He realized there had been several deaths on this road, but his concern rose when he saw the exact number.',
 'Trash covered the landscape like sprinkles do a birthday cake.']

### Step 1: Load and use GPT to generate sentences

In [3]:
from openai import OpenAI
client = OpenAI() # OPENAI_API_KEY environment variable must be set. see quickstart tutorial here: https://platform.openai.com/docs/quickstart?context=python



Try an example completion

In [4]:
sentence = sentences[0]

messages=[
    {"role": "system", "content": "You are a rewording assistant, skilled in transforming a statement to express more or less of a given quality or property."},
    {"role": "user", "content": "Rephrase the following statement to use language that is more complex: \"{}\" .".format(sentence)}
  ]


In [5]:
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=messages
)

print(completion.choices[0].message)

ChatCompletionMessage(content='During the preceding week, I experienced involvement in a vehicular collision.', role='assistant', function_call=None, tool_calls=None)


In [6]:
completion.choices[0]

Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='During the preceding week, I experienced involvement in a vehicular collision.', role='assistant', function_call=None, tool_calls=None), logprobs=None)

We'll feed this output back to the API and ask for an even more complex rephrasing.

In [7]:
# The model's own reply goes back into the conversation with the 'assistant' role.
messages.append({'role': 'assistant', 'content': completion.choices[0].message.content})
messages.append({"role": "user", "content": "Good. Rephrase the sentence again to use language that is even more complex."})
messages

[{'role': 'system',
  'content': 'You are a rewording assistant, skilled in transforming a statement to express more or less of a given quality or property.'},
 {'role': 'user',
  'content': 'Rephrase the following statement to use language that is more complex: "Last week I got into a car accident." .'},
 {'role': 'assistant',
  'content': 'During the preceding week, I experienced involvement in a vehicular collision.'},
 {'role': 'user',
  'content': 'Good. Rephrase the sentence again to use language that is even more complex.'}]

In [8]:
def complete(messages):
    completion = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=messages,
      seed=42
    )
    return completion.choices[0].message.content

complete(messages)

'In the course of the antecedent week, I found myself implicated in an unfortunate incident of vehicular collision.'
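
The feed-back step can be wrapped in a small helper. This is a minimal sketch, not part of the original notebook: the name `extend_conversation` is hypothetical. It tags the previous model output with the `assistant` role (the role the Chat Completions API uses for model turns) and appends the follow-up request, without calling the API.

```python
from copy import deepcopy

def extend_conversation(messages, reply, follow_up):
    """Return a new message list with the model's reply and a follow-up request appended.

    Hypothetical helper: `reply` is the text returned by the previous completion,
    `follow_up` is the next user instruction.
    """
    extended = deepcopy(messages)  # leave the caller's list untouched
    extended.append({"role": "assistant", "content": reply})
    extended.append({"role": "user", "content": follow_up})
    return extended

history = [
    {"role": "system", "content": "You are a rewording assistant."},
    {"role": "user", "content": 'Rephrase the following statement to use language that is more complex: "Last week I got into a car accident." .'},
]
longer = extend_conversation(
    history,
    "During the preceding week, I experienced involvement in a vehicular collision.",
    "Good. Rephrase the sentence again to use language that is even more complex.",
)
print([m["role"] for m in longer])
```

The returned list could then be passed to `complete(...)` as in the cell above.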

### Prompt templates

In [9]:
# dictionary of the adjectives we use (property adjective and antonym) to create prompts

property_dict = {
    'complexity':   ('complex', 'simple'),
    'emotion':      ('emotional', 'emotionless')
}


We will generate sentences from a series of templates. For each sentence, we want to generate 'more x', 'even more x', as well as 'less x' and 'even less x'. Because the model often produces longer sentences for 'more' prompts, we also prompt for rephrasings using an antonymous adjective. So, for example, we ask for rephrasings that are "more complex" as well as rephrasings that are "less simple". We then use all of these rephrasings to calculate the complexity dimension.
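
To make the template scheme concrete, here is a minimal sketch that enumerates the prompts for one sentence. The helper name `build_prompts` and the dictionary keys are assumptions made for illustration. The prompt wording is copied from the cells in this notebook. Each of the four entries leads to two rephrasings (a first turn and an "even ..." follow-up), giving eight generated sentences per seed.

```python
FIRST_TURN = 'Rephrase the following statement to use language that is {direction} {adjective}: "{sentence}" .'
FOLLOW_UP = "Good. Rephrase the sentence again to use language that is even {direction} {adjective}."

def build_prompts(sentence, adjective, antonym):
    """List the four (word, direction) prompt combinations for one sentence.

    Hypothetical helper: it only builds the prompt strings, it does not call the API.
    """
    prompts = []
    for word in (adjective, antonym):
        for direction in ("more", "less"):
            prompts.append({
                "adjective": word,
                "direction": direction,
                "is_antonym": word == antonym,
                "first_turn": FIRST_TURN.format(direction=direction, adjective=word, sentence=sentence),
                "follow_up": FOLLOW_UP.format(direction=direction, adjective=word),
            })
    return prompts

for p in build_prompts("Last week I got into a car accident.", "complex", "simple"):
    print(p["first_turn"])
    print(p["follow_up"])
```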

In [10]:
# TODO save 5 responses instead of 1


adj, antonym = property_dict['complexity']

data = []

for sent in sentences:

    for i, x in enumerate([adj, antonym]):
        print(i)
        print(x)
        messages=[
            {"role": "system", "content": "You are a rewording assistant, skilled in transforming a statement to express more or less of a given quality or property."},
        ]

        
        # more
        more_messages = messages + [{"role": "user", "content": "Rephrase the following statement to use language that is more {}: \"{}\" .".format(x,sent)}]
        more = complete(more_messages)
        row = {
             'sentence': sent,
             'text': more,
             'more': 1,
             'even_more': 0,
             'less': 0,
             'even_less':  0,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)
        print(more)
                         
        # even more
        even_more_messages = more_messages + [{"role": "assistant", "content": more}] + [{"role": "user", "content": "Good. Rephrase the sentence again to use language that is even more {}.".format(x)}]
        even_more = complete(even_more_messages)
        row = {
             'sentence': sent,
             'text': even_more,
             'more': 0,
             'even_more': 1,
             'less': 0,
             'even_less':  0,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)
        print(even_more)

        # TODO even even more

        # less
        less_messages = messages + [{"role": "user", "content": "Rephrase the following statement to use language that is less {}: \"{}\" .".format(x,sent)}]
        less = complete(less_messages)
        row = {
             'sentence': sent,
             'text': less,
             'more': 0,
             'even_more': 0,
             'less': 1,
             'even_less':  0,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)
        print(less)

        # even less
        even_less_messages = less_messages + [{"role": "assistant", "content": less}] + [{"role": "user", "content": "Good. Rephrase the sentence again to use language that is even less {}.".format(x)}]
        even_less = complete(even_less_messages)
        row = {
             'sentence': sent,
             'text': even_less,
             'more': 0,
             'even_more': 0,
             'less': 0,
             'even_less':  1,
             'property': 'complexity',
             'adjective': x,
             'antonym?': 0 if i == 0 else 1 # the second in the pair is the antonym
        }
        data.append(row)
        print(even_less)

    
df = pd.DataFrame.from_records(data)
df

0
complex
The previous week, I was involved in a vehicular collision.
During the preceding week, I found myself embroiled in a motor vehicle collision.
I had a car accident last week.
I had a crash with my car last week.
1
simple
Last week, I was in a car crash.
Last week, I had a car accident.
Last week I was involved in a collision while operating a motor vehicle.
During the course of the previous week, I was engaged in a vehicular collision resulting in damage to my automobile.
0
complex
She was in possession of astounding news, yet there was a dearth of individuals with whom she could disseminate it.
She found herself in the possession of awe-inspiring tidings that yearned to be shared, however, she was met with the unfortunate circumstance of not having anyone in her proximity with whom she could partake in the act of disseminating the aforementioned news.
She had incredible news to tell, but no one to tell it to.
She had really great news, but no one to tell it to.
1
simple
She h

| | sentence | text | more | even_more | less | even_less | property | adjective | antonym? |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Last week I got into a car accident. | The previous week, I was involved in a vehicul... | 1 | 0 | 0 | 0 | complexity | complex | 0 |
| 1 | Last week I got into a car accident. | During the preceding week, I found myself embr... | 0 | 1 | 0 | 0 | complexity | complex | 0 |
| 2 | Last week I got into a car accident. | I had a car accident last week. | 0 | 0 | 1 | 0 | complexity | complex | 0 |
| 3 | Last week I got into a car accident. | I had a crash with my car last week. | 0 | 0 | 0 | 1 | complexity | complex | 0 |
| 4 | Last week I got into a car accident. | Last week, I was in a car crash. | 1 | 0 | 0 | 0 | complexity | simple | 1 |
| 5 | Last week I got into a car accident. | Last week, I had a car accident. | 0 | 1 | 0 | 0 | complexity | simple | 1 |
| 6 | Last week I got into a car accident. | Last week I was involved in a collision while ... | 0 | 0 | 1 | 0 | complexity | simple | 1 |
| 7 | Last week I got into a car accident. | During the course of the previous week, I was ... | 0 | 0 | 0 | 1 | complexity | simple | 1 |
| 8 | She had some amazing news to share but nobody ... | She was in possession of astounding news, yet ... | 1 | 0 | 0 | 0 | complexity | complex | 0 |
| 9 | She had some amazing news to share but nobody ... | She found herself in the possession of awe-ins... | 0 | 1 | 0 | 0 | complexity | complex | 0 |


## Step 2: Calculating the formality dimension



Now that we have our seed sentences for the complexity dimension, we need to get the vector differences for the seed pairs.

We generated 8 sentences for each original seed sentence, meaning we have four seed pairs.

The formulas for the four seed pairs are as follows:

- (adjective + more) - (adjective + less)
- (adjective + even more) - (adjective + even less)
- (antonym + less) - (antonym + more)
- (antonym + even less) - (antonym + even more)

First we get an embedding for each sentence. Then, for each seed sentence, we calculate these four differences and collect the resulting vectors in a separate list. Finally, we average the difference vectors to obtain the dimension.
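
The four differences can be sketched with toy vectors. This is an illustration under stated assumptions: the function name `seed_pair_differences` and the dictionary keys are hypothetical, and random 3-dimensional vectors stand in for real sentence embeddings.

```python
import numpy as np

def seed_pair_differences(emb):
    """Return the four difference vectors for one seed sentence.

    `emb` maps hypothetical keys (word_type, degree) to embedding vectors.
    Each difference points from the less complex to the more complex rephrasing.
    """
    return [
        emb[("adjective", "more")] - emb[("adjective", "less")],
        emb[("adjective", "even_more")] - emb[("adjective", "even_less")],
        emb[("antonym", "less")] - emb[("antonym", "more")],
        emb[("antonym", "even_less")] - emb[("antonym", "even_more")],
    ]

# Toy 3-dimensional "embeddings" standing in for real sentence vectors.
rng = np.random.default_rng(0)
toy = {(word, degree): rng.normal(size=3)
       for word in ("adjective", "antonym")
       for degree in ("more", "even_more", "less", "even_less")}

diffs = seed_pair_differences(toy)
dimension = np.mean(diffs, axis=0)  # average of the four difference vectors
print(dimension.shape)
```

With several seed sentences, the difference vectors from all of them would be pooled into one list before averaging.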

--So now that we have our seed sentences for the complexity dimension, we need to split them into negative and positive sentences. The generated sentences should be divided as follows.

Positive
- adjective + more
- adjective + even more
- antonym + less
- antonym + even less

Negative
- adjective + less
- adjective + even less
- antonym + more
- antonym + even more

After we split them into positive and negative examples, we embed them using SBERT--

In [11]:
def texts(is_antonym, column):
    # Combine both conditions in a single boolean mask; chaining df[mask1][mask2]
    # triggers a pandas "Boolean Series key will be reindexed" warning.
    return df.loc[(df['antonym?'] == is_antonym) & (df[column] == 1), 'text'].to_list()

positive = texts(0, 'more') + texts(0, 'even_more') + texts(1, 'less') + texts(1, 'even_less')
negative = texts(0, 'less') + texts(0, 'even_less') + texts(1, 'more') + texts(1, 'even_more')

print(positive)
print()
print(negative)

['The previous week, I was involved in a vehicular collision.', 'She was in possession of astounding news, yet there was a dearth of individuals with whom she could disseminate it.', 'Occasionally, one must relinquish and emerge victorious through resorting to dishonest tactics.', "Due to an urgent requirement, an additional percussionist was decidedly crucial, given that the current incumbent's competencies were limited to bongo proficiency alone.", "The dough of the bread invoked in her mind the image of Santa Claus's rotund midsection.", 'Upon his realization that numerous fatalities had occurred on this very road, his apprehension heightened exponentially upon witnessing the precise numerical value associated with the loss of life.', 'The landscape was engulfed by an overwhelming abundance of trash, much like the way sprinkles completely coat a birthday cake.', 'During the preceding week, I found myself embroiled in a motor vehicle collision.', 'She found herself in the possession 


The original method works with word-level vectors, but we want sentence-level representations. The simplest thing I can think of here is to use Sentence-BERT (SBERT), which we will download from Hugging Face.

After initializing the model, we generate vector representations for each sentence in the informal list and for each corresponding sentence in the formal list. We subtract the vectors from one another and then take the average, leaving us with a vector that represents the formality dimension. We can rate any sentence vector(s) on the formality dimension by giving them (as a list) to the function predict_scalarproj along with the dimension itself. 

In [12]:
# load sbert
!pip install -U sentence-transformers

Defaulting to user installation because normal site-packages is not writeable
Collecting sentence-transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
  Preparing metadata (setup.py) ... done
Collecting transformers<5.0.0,>=4.6.0
  Downloading transformers-4.36.2-py3-none-any.whl (8.2 MB)
Collecting torch>=1.6.0
  Downloading torch-2.1.2-cp39-cp39-manylinux1_x86_64.whl (670.2 MB)
Collecting torchvision
  Downloading torchvision-0.16.2-cp39-cp39-manylinux1_x86_64.whl (6.8 MB)

In [13]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')

#Sentences are encoded by calling model.encode()
pos_embeddings = model.encode(positive)

neg_embeddings = model.encode(negative)

#Print the embeddings
for sentence, embedding in zip(positive[:5], pos_embeddings[:5]):
    print("Sentence:", sentence)
    print("Embedding:", embedding[:100])
    print("")



Sentence: The previous week, I was involved in a vehicular collision.
Embedding: [ 0.05463979  0.04207244  0.09171151  0.02267598  0.04437772 -0.03329574
 -0.01103387  0.01919858 -0.0363395  -0.04133713  0.06696524  0.051202
 -0.00163336  0.06005378 -0.02745053 -0.04021568  0.04297435 -0.05806289
 -0.11466546  0.05561393 -0.08777051  0.05436975 -0.02626988  0.05462555
 -0.02784058  0.03818217  0.01682572  0.10313021 -0.01788058 -0.04452566
 -0.0277462  -0.03673949 -0.0765846   0.01879713 -0.02587956 -0.08854938
  0.00326108 -0.03312414  0.06012055 -0.08719996 -0.00367146 -0.04178658
  0.03739707 -0.02572702  0.03442287  0.02197137  0.08840093  0.00773319
  0.06312124 -0.0328535   0.01348238 -0.0161405   0.04381799  0.00384824
 -0.01891634 -0.03191254 -0.01985325  0.07942367 -0.01900075  0.01945524
 -0.02401999  0.03061789  0.00990963  0.06476162 -0.08936954  0.01882749
  0.02193848 -0.00154345  0.09530792  0.06960236  0.06733276  0.02195674
 -0.07398091  0.00656706  0.01133624  0.01481

In [14]:
#### from marianna + katrin
# seed-based method
# averaging over seed pair vectors
# def dimension_seedbased(seeds_pos, seeds_neg, space, paired = False):
#     diffvectors = [ ]
    
#     for negword, posword in _make_seedpairs(seeds_pos, seeds_neg, paired = paired):
#         diffvectors.append(space[posword] - space[negword])

#     # average
#     dimvec = np.mean(diffvectors, axis = 0)
#     return dimvec


In [15]:
def dimension_seedbased(seeds_pos, seeds_neg, model):
    pos = np.mean(model.encode(seeds_pos), axis = 0)
    neg = np.mean(model.encode(seeds_neg), axis = 0)
    dimvec = pos - neg
    return dimvec
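
This version subtracts the mean negative vector from the mean positive vector instead of averaging per-pair differences. When the two lists have the same length and are aligned pairwise, the two formulations give the same vector. A quick numerical check with random stand-in data (the array sizes are assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
pos = rng.normal(size=(28, 384))  # stand-ins for embeddings of the positive sentences
neg = rng.normal(size=(28, 384))  # stand-ins for the matching negative sentences

mean_of_differences = np.mean(pos - neg, axis=0)                   # paired formulation
difference_of_means = np.mean(pos, axis=0) - np.mean(neg, axis=0)  # as in dimension_seedbased

print(np.allclose(mean_of_differences, difference_of_means))
```

If the two lists had different lengths, only the difference-of-means form would still be defined.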

In [37]:
complexity_dimension = dimension_seedbased(positive, negative, model)

In [17]:
# vector scalar projection (from marianna + katrin)
def predict_scalarproj(veclist, dimension):
    dir_veclen = math.sqrt(np.dot(dimension, dimension))
    return [np.dot(v, dimension) / dir_veclen for v in veclist]
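
The scalar projection of a vector $v$ onto a dimension $d$ is $v \cdot d / \lVert d \rVert$, that is, the signed length of $v$ along the direction of $d$. A small worked example with toy 2-D vectors (the numbers are made up for illustration):

```python
import math
import numpy as np

def predict_scalarproj(veclist, dimension):
    # Same definition as in the notebook cell above.
    dir_veclen = math.sqrt(np.dot(dimension, dimension))
    return [np.dot(v, dimension) / dir_veclen for v in veclist]

dimension = np.array([2.0, 0.0])   # toy direction along the x-axis, length 2
vectors = [np.array([3.0, 4.0]),   # x-component is 3
           np.array([-1.0, 5.0])]  # x-component is -1

scores = predict_scalarproj(vectors, dimension)
print([float(s) for s in scores])
```

The scores are 3.0 and -1.0: the length of the dimension vector is divided out, so only its direction matters.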

## Step 3: Validating the formality dimension

Does it behave the same way as a standard classifier? The plan:

- Load a formality dataset, perhaps the word-based one that Marianna uses.
- Load a regular formality classifier.
- Run both the dimension-based prediction method and the classifier on the dataset.
- Compare the two. Is the dimension-based method that much worse?
- Order the entries by their complexity rating and look at where they fall on our complexity axis.
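
One possible way to carry out this comparison is a rank correlation between each method's scores and the gold ratings. The sketch below is an assumption about how that could look, not the notebook's actual evaluation. All numbers are made up, and the `spearman` helper is a simplified implementation that does not handle ties.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation for sequences without ties (ties are not handled here)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

# Made-up numbers for illustration only, not results from the notebook.
gold_ratings = np.array([1.0, 2.5, 3.0, 4.2, 5.0])         # hypothetical human ratings
dimension_scores = np.array([-0.3, -0.1, 0.05, 0.0, 0.4])  # hypothetical projections
classifier_scores = np.array([0.1, 0.2, 0.6, 0.7, 0.9])    # hypothetical classifier outputs

print("dimension vs gold:", spearman(dimension_scores, gold_ratings))
print("classifier vs gold:", spearman(classifier_scores, gold_ratings))
```

A rank-based measure fits here because the projection scores and the ratings are on different scales, and only their ordering is being compared.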

## Step 4: Rating Toxicity Datasets for formality

We'll start with the 1000-length parallel dataset from the text detoxification paper. 

We load it in

We SBERTize the sentences

We pass them to the prediction method. 

We observe: do toxic and nontoxic comments differ wrt formality?
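
A first look at the last question could compare the mean projection score of toxic and non-toxic comments. This is a minimal sketch under stated assumptions: the helper name `mean_score_by_toxicity` is hypothetical, the 0.5 threshold is an arbitrary choice, and the values are made up.

```python
import numpy as np

def mean_score_by_toxicity(scores, toxicity, threshold=0.5):
    """Split scores at a toxicity threshold (0.5 is an assumption) and return both group means."""
    scores = np.asarray(scores, dtype=float)
    toxic = np.asarray(toxicity, dtype=float) >= threshold
    return scores[toxic].mean(), scores[~toxic].mean()

# Made-up values for illustration only.
toy_scores = [0.10, 0.30, -0.20, -0.40]
toy_toxicity = [0.0, 0.1, 0.9, 0.8]

toxic_mean, nontoxic_mean = mean_score_by_toxicity(toy_scores, toy_toxicity)
print(toxic_mean, nontoxic_mean)
```

If either group is empty, its mean is undefined (NumPy returns `nan`), so a real analysis would need to check the group sizes first.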

In [18]:
!pip install datasets

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Defaulting to user installation because normal site-packages is not writeable
Collecting datasets
  Downloading datasets-2.16.1-py3-none-any.whl (507 kB)
Collecting pyarrow-hotfix
  Downloading pyarrow_hotfix-0.6-py3-none-any.whl (7.9 kB)
Collecting aiohttp
  Downloading aiohttp-3.9.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting xxhash
  Downloading xxhash-3.4.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (193 kB)
Collecting pyarrow>=8.0.0
  Downloading pyarrow-14.0.2-cp39-cp39-manylinux_2_28_x86_64.whl (38.0 MB)

In [27]:
from datasets import load_dataset

dataset = load_dataset("civil_comments")

In [28]:
dataset["train"][0]

{'text': "This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done!",
 'toxicity': 0.0,
 'severe_toxicity': 0.0,
 'obscene': 0.0,
 'threat': 0.0,
 'insult': 0.0,
 'identity_attack': 0.0,
 'sexual_explicit': 0.0}

In [31]:
dataset["train"][:10]['text']

["This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done!",
 "Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way!",
 'This is such an urgent design problem; kudos to you for taking it on. Very impressive!',
 "Is this something I'll be able to install on my site? When will you be releasing it?",
 'haha you guys are a bunch of losers.',
 'ur a sh*tty comment.',
 'hahahahahahahahhha suck it.',
 'FFFFUUUUUUUUUUUUUUU',
 'The ranchers seem motivated by mostly by greed; no one should have the right to allow their animals destroy public land.',
 "It was a great show. Not a combo I'd of expected to be good together but it was."]

In [34]:
###################################
#########
# predicting ratings on a dimension

# ...
# when we only have the dimension:
# vector scalar projection
def predict_scalarproj(veclist, dimension):
    dir_veclen = math.sqrt(np.dot(dimension, dimension))
    return [np.dot(v, dimension) / dir_veclen for v in veclist]

In [40]:
# Embed the first 100 comments with SBERT (one 384-dimensional vector per comment).
sentence_embs = [model.encode(row) for row in dataset["train"][:100]['text']]

complexities = []

for i, emb in enumerate(sentence_embs):
    # Note: indexing the dataset returns a copy of the row, so this assignment is not stored
    # (the output below has no 'complexity_computed' column).
    dataset["train"][i]['complexity_computed'] = emb
    # Note: this collects the raw embeddings, not scalar complexity scores.
    complexities.append(emb)

dataset["train"][:5]

{'text': ["This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done!",
  "Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way!",
  'This is such an urgent design problem; kudos to you for taking it on. Very impressive!',
  "Is this something I'll be able to install on my site? When will you be releasing it?",
  'haha you guys are a bunch of losers.'],
 'toxicity': [0.0, 0.0, 0.0, 0.0, 0.8936170339584351],
 'severe_toxicity': [0.0, 0.0, 0.0, 0.0, 0.021276595070958138],
 'obscene': [0.0, 0.0, 0.0, 0.0, 0.0],
 'threat': [0.0, 0.0, 0.0, 0.0, 0.0],
 'insult': [0.0, 0.0, 0.0, 0.0, 0.8723404407501221],
 'identity_attack': [0.0, 0.0, 0.0, 0.0, 0.021276595070958138],
 'sexual_explicit': [0.0, 0.0, 0.0, 0.0, 0.0]}

In [41]:
# Project each embedding onto the complexity dimension: one scalar score per comment.
scores = predict_scalarproj(sentence_embs, complexity_dimension)

In [42]:
import numpy as np
import scipy.stats

scipy.stats.pearsonr(complexities, scores)    # Pearson's r

ValueError: shapes (100,384) and (100,) not aligned: 384 (dim 1) != 100 (dim 0)
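
The error arises because `complexities` holds 100 embeddings of 384 dimensions each, whereas a Pearson correlation needs two 1-D sequences of equal length. Given the question in the title, the intended comparison is presumably between the per-comment projection scores and the dataset's `toxicity` column. That reading is an assumption. The sketch below uses random stand-in arrays so it runs on its own, and computes Pearson's r with `np.corrcoef`.

```python
import numpy as np

rng = np.random.default_rng(0)
sentence_embs = rng.normal(size=(100, 384))  # stand-in for the SBERT embeddings
complexity_dimension = rng.normal(size=384)  # stand-in for the learned dimension
toxicity = rng.uniform(size=100)             # stand-in for dataset["train"][:100]["toxicity"]

# One scalar score per comment: projection of each embedding onto the dimension.
scores = sentence_embs @ complexity_dimension / np.linalg.norm(complexity_dimension)

# Both inputs are now 1-D with 100 entries, so Pearson's r is well defined.
r = np.corrcoef(scores, toxicity)[0, 1]
print(scores.shape, toxicity.shape, round(float(r), 3))
```

With the real data, `scipy.stats.pearsonr(toxicity, scores)` would accept the same two 1-D inputs and additionally return a p-value.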

In [43]:
complexities

[array([-4.03161272e-02,  5.83636686e-02,  3.66919711e-02,  5.98456850e-03,
        -4.73445468e-02,  4.12251651e-02,  1.10954195e-02,  9.00886580e-03,
         1.81477866e-03,  1.45907630e-03,  1.19809210e-02,  4.12295386e-02,
         2.78073382e-02, -2.93494971e-03, -6.40874133e-02,  9.67472270e-02,
         5.00832014e-02,  2.23250892e-02, -3.28645147e-02,  6.11442477e-02,
        -6.20214734e-03,  3.44227999e-02,  1.27193481e-01, -3.93520184e-02,
        -3.92203070e-02,  1.03839025e-01, -5.78940138e-02, -1.42695718e-02,
        -7.53626451e-02,  4.02517393e-02, -1.84190143e-02,  3.24216411e-02,
        -1.22104334e-02,  2.91929394e-02,  1.98101848e-02,  4.30645421e-03,
         1.34576419e-02,  3.04973591e-02,  1.97454579e-02,  4.96010818e-02,
         5.23254042e-03,  1.37774525e-02,  5.75277954e-03, -1.10008428e-02,
         5.02623729e-02, -1.41578689e-02, -3.19371447e-02,  7.13324500e-03,
        -3.52285802e-02,  1.86864380e-02, -8.18191990e-02, -4.10282463e-02,
        -2.8