# Measuring Engagement and Satisfaction in Online Mental Health Platform Conversations

## Data preprocessing

In [991]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from math import log10, floor

In [992]:
# Display long column text
pd.options.display.max_colwidth = 500

df = pd.read_csv("mentalhealthsupport_dyadic_convs_clean_emotion.csv")
df = df.rename(columns={'conversation id': 'conversation_id', 'post title': 'post_title', 'dialog turn': 'dialog_turn', 'emotion prediction': 'emotion_prediction'})

In [993]:
### --- CLEANING OUT MONOLOGUES FROM DATASET --- ###

# Group data by conversation id and calculate count of each conversation id
df_conv = df.groupby("conversation_id").count()
df_conv = df_conv.drop(columns=["subreddit", "post_title", "author", "text", "compound", "sentiment", "emotion_prediction"])
print("Number of conversations in subreddit: ", len(df_conv))

# Separate conversation id's with a single occurrence as monologues
df_mono = df_conv[df_conv["dialog_turn"] == 1]
print("Number of monologues in subreddit: ", len(df_mono))
df_mono_ids = df_mono.reset_index()
df_mono_ids = df_mono_ids["conversation_id"]

# Separate conversation id's with multiple occurrences as dialogues
df_dia = df_conv[df_conv["dialog_turn"] > 1]
print("Number of dialogues in subreddit: ", len(df_dia))
df_dia = df_dia.reset_index()
df_dia = df_dia.drop(columns=['dialog_turn'])

# Join dialogue conversation id's with original data such that only dialogues remain in the dataset
df = df.join(df_dia.set_index('conversation_id'), on='conversation_id', how="right") 

### ---------------------------------------------- ###

Number of conversations in subreddit:  3551
Number of monologues in subreddit:  29
Number of dialogues in subreddit:  3522
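
The same monologue filter can be written more compactly. The sketch below is an alternative, not the code that produced the counts above: it uses a made-up toy DataFrame (`toy`) instead of the real CSV, and it assumes that every row of a conversation has a `dialog_turn` value.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({
    "conversation_id": [1, 1, 2, 3, 3, 3],
    "dialog_turn":     [1, 2, 1, 1, 2, 3],
    "author":          ["a", "b", "c", "d", "e", "d"],
})

# Number of turns per conversation, broadcast back onto every row.
turns_per_conv = toy.groupby("conversation_id")["dialog_turn"].transform("count")

# Keep only rows that belong to conversations with more than one turn.
dialogues = toy[turns_per_conv > 1]
monologue_ids = toy.loc[turns_per_conv == 1, "conversation_id"].unique()

print("Dialogue ids: ", dialogues["conversation_id"].unique())
print("Monologue ids: ", monologue_ids)
```

Because `transform("count")` returns a Series aligned with the original rows, no join back onto the original data is needed.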


In [994]:
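# Either choose from long or short conversations
long_conversations = df[df["dialog_turn"] >= 10]["conversation_id"].unique()
# A conversation is short only if it never reaches turn 4, so filter on the
# highest turn per conversation (assumes dialog_turn counts up from 1)
max_turn = df.groupby("conversation_id")["dialog_turn"].max()
short_conversations = max_turn[max_turn < 4].index.to_numpy()
conversation = df[df["conversation_id"] == long_conversations[6]]
conversation.head()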

Unnamed: 0,conversation_id,subreddit,post_title,author,dialog_turn,text,compound,sentiment,emotion_prediction
4191,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,1,Hello everyone. I'm a F(19) - and I've been struggling with what I call attack-like visions of torture and violence for over a year now. I've had bad intrusive thoughts before - and since I've been in therapy for close to 2 years now I have actually gotten pretty good at dealing with them. But a year ago it went from only those thoughts to actual Attacks. I'll describe the latest one; It was around 11pm - I was on the phone with my Fianceé when I suddenly felt like my entire room was sl...,-0.9868,negative,anxious
4192,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,2,Hey friend. Have you had any past traumas?,0.0,neutral,sympathizing
4193,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,3,"Well - Yes, I grew up with a very hot n cold manipulative and borderline sociopathic father. I was molested by my brothers best friend when I was 6 years old. I was bullied pretty much always from grade 1 through 10. I have slight PTSD from my Parents fighting (get shaky and stuff when people raise their voices) That's about it I think - though my childhood memories are hard to remember in their entirety - but those are the things that stuck vividly.",-0.0516,negative,hopeful
4194,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,4,Could these attacks be PTSD flashbacks?,-0.4404,negative,sentimental
4195,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,5,"As mentioned - all the people are perfect strangers , getting torn apart or tortured in diverse locations and without any link. I've never been physically hurt by anyone and haven't ever witnessed a crime at all , much less of that capacity.",-0.3867,negative,embarrassed


In [995]:
# Round numbers to a given number of significant figures (default = 2)
def round_sig(x, sig=2):
    if x == 0:
        return 0  # log10(0) is undefined, so handle zero explicitly
    return round(x, sig - int(floor(log10(abs(x)))) - 1)

## Measuring the level of engagement

### Does the speaker respond back when the listener gives a response?

In [996]:
def extract_responses(conversation):
    # The author of the first turn is the speaker (support seeker)
    speaker = conversation.author.iloc[0]
    # Dyadic conversations have exactly one other author; .item() raises a ValueError otherwise
    listener = conversation[conversation["author"] != speaker]["author"].unique().item()
    speaker_responses = conversation[conversation["author"] == speaker]
    listener_responses = conversation[conversation["author"] == listener]
    num_speaker_responses = len(speaker_responses)
    num_listener_responses = len(listener_responses)

    return speaker, listener, speaker_responses, listener_responses, num_speaker_responses, num_listener_responses

In [997]:
def calculate_speaker_listener_ratio(conversation):
    _, _, _, _, num_speaker_responses, num_listener_responses = extract_responses(conversation)
    # Ratio > 1 means the speaker (seeker) posted more often than the listener (peer supporter)
    speaker_listener_ratio = num_speaker_responses / num_listener_responses

    engagement = ""
    if len(conversation) == 2:
        engagement = "low engagement (single interaction)"
    elif len(conversation) == 3:
        if speaker_listener_ratio == 2:
            engagement = "low engagement (repeated seeker interaction)"
        else:  # speaker_listener_ratio == 0.5
            engagement = "low engagement (repeated peer supporter interaction)"
    elif len(conversation) == 4:
        if speaker_listener_ratio == 1:
            engagement = "high engagement (mutual discourse)"
        elif speaker_listener_ratio == 3 or speaker_listener_ratio == (1/3):
            engagement = "low engagement"
    elif len(conversation) > 4:
        # Longer conversations: near-balanced turn counts indicate high engagement
        if 0.75 <= speaker_listener_ratio <= 1.25:
            engagement = "high engagement"
        elif speaker_listener_ratio <= 0.5 or speaker_listener_ratio >= 1.5:
            engagement = "low engagement"
        else:
            engagement = "moderate engagement"

    return round_sig(speaker_listener_ratio), engagement
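
To make the thresholds for conversations longer than four turns concrete, here is a small standalone sketch. The helper `ratio_and_band` is hypothetical and not part of the notebook: it takes only a list of author names, treats the first author as the speaker, and assumes at least one listener turn.

```python
def ratio_and_band(authors):
    """Speaker/listener turn ratio and engagement band for conversations with > 4 turns."""
    speaker = authors[0]  # the first author is taken to be the speaker
    n_speaker = sum(a == speaker for a in authors)
    n_listener = len(authors) - n_speaker  # assumes at least one listener turn
    ratio = n_speaker / n_listener

    if 0.75 <= ratio <= 1.25:
        band = "high engagement"
    elif ratio <= 0.5 or ratio >= 1.5:
        band = "low engagement"
    else:
        band = "moderate engagement"
    return ratio, band


# Strictly alternating turns: 3 speaker turns, 3 listener turns
print(ratio_and_band(["s", "l", "s", "l", "s", "l"]))        # (1.0, 'high engagement')
# Speaker posts several times in a row: 5 speaker turns, 2 listener turns
print(ratio_and_band(["s", "l", "s", "s", "l", "s", "s"]))   # (2.5, 'low engagement')
```

A ratio between the two bands, such as 4 speaker turns to 3 listener turns (about 1.33), falls into "moderate engagement".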

### Does the listener ask any informative questions or give any suggestions? 

Unfortunately, the emotion predictions are too inaccurate to use here. Had they been reliable, listener responses tagged as questioning or suggesting could have earned bonus points in the engagement measure.

In [998]:
# TODO: separately label each & every sentence, currently only the whole text is labeled
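
As a starting point for the TODO above, the sketch below labels each sentence separately with a simple rule-based heuristic. It is a stand-in for a per-sentence emotion classifier, not the classifier itself: the sentence splitter is naive, a sentence counts as questioning only if it ends with a question mark, and the function names are hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_sentences(text):
    # Label each sentence on its own instead of labeling the whole post.
    return [(s, "questioning" if s.endswith("?") else "other")
            for s in split_sentences(text)]


def contains_question(text):
    return any(label == "questioning" for _, label in label_sentences(text))


print(label_sentences("Therapist here. Can you provide a few more details?"))
print(contains_question("thanks man. We just got everything fixed."))
```

This heuristic misses questions written without a question mark and cannot detect suggestions, so it only covers part of what the TODO asks for.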

### How the literature measures engagement

## Measuring the level of satisfaction

### Lexical details: "Thank you", "It means a lot"
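
A minimal sketch of how such lexical cues could be detected. Only "thank you" and "it means a lot" come from the heading; the other phrases in `GRATITUDE_PHRASES` and the function name `has_gratitude` are assumptions added for illustration.

```python
import re

# "thank you" and "it means a lot" are from the heading; the rest are assumed examples.
GRATITUDE_PHRASES = ["thank you", "thanks", "it means a lot", "appreciate"]

# Match phrases at a word start, case-insensitively ("appreciated" also matches).
GRATITUDE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in GRATITUDE_PHRASES) + r")",
    re.IGNORECASE,
)


def has_gratitude(text):
    return GRATITUDE_PATTERN.search(text) is not None


print(has_gratitude("thanks man. We just got everything fixed."))
print(has_gratitude("Could these attacks be PTSD flashbacks?"))
```

Listeners also use these phrases (for example "thank you for sharing your story"), so a check like this should be applied to the speaker's responses only.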

### Shift of sentiment in speaker responses (sentiment trend)

In [999]:
def plot_emotion_sentiment(responses):   
    sns.set_theme(style="white")
    sns.relplot(x="dialog_turn", y="compound", hue="sentiment", style="emotion_prediction", palette="Set1",data=responses, s=200)

In [1000]:
def calculate_sentiment_shift(responses):
    # Fit compound sentiment as a linear function of the dialog turn
    x = np.array(responses["dialog_turn"]).reshape((-1, 1))
    y = np.array(responses["compound"])
    model = LinearRegression().fit(x, y)
    # R^2 measures how well a straight line fits the trend, not its direction
    r_sq = model.score(x, y)

    grateful_bonus = check_grateful_positive(responses)
    r_sq += grateful_bonus

    satisfaction = ""
    if r_sq >= 0.5: # TODO: play with these variables (maybe even do some ML to learn these params)
        satisfaction = "positive satisfaction"
    elif r_sq >= 0.3:
        satisfaction = "moderate satisfaction"
    elif r_sq >= 0:
        satisfaction = "neutral satisfaction"
    else:
        # Not reached as written: R^2 on the fitted data is never negative
        satisfaction = "negative satisfaction"

    return round_sig(r_sq), satisfaction, grateful_bonus

### If the last speaker turn has grateful emotion and has positive sentiment

In [1001]:
# Get all emotions and the final emotion of the given responses
def get_emotion_prediction(responses):
    emotions = responses["emotion_prediction"]
    final_emotion = emotions.iloc[-1]
    
    return emotions, final_emotion

In [1002]:
# Get all sentiments and the final sentiment of the given responses
def get_sentiment(responses):
    sentiments = responses["sentiment"]
    final_sentiment = sentiments.iloc[-1]
    
    return sentiments, final_sentiment

In [1003]:
# Add bonus points if the last speaker post is grateful and has positive sentiment
def check_grateful_positive(responses):
    _, final_sentiment = get_sentiment(responses)
    _, final_emotion = get_emotion_prediction(responses)
    
    grateful_bonus=0
    
    if final_sentiment == "positive" and final_emotion == "grateful":
        grateful_bonus += 0.1
    
    return grateful_bonus

### How the literature measures satisfaction

## Final classification

In [1016]:
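def classify_conversation(conversation):
    # Satisfaction is measured on the speaker's turns only
    _, _, speaker_responses, _, _, _ = extract_responses(conversation)
    return calculate_speaker_listener_ratio(conversation)[1] + ", " + calculate_sentiment_shift(speaker_responses)[1]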

## Testing the measures

In [1025]:
def test_examples(conversation_id):
    conversation = df[df["conversation_id"] == conversation_id]
    # Compute each measure once; the sentiment trend uses only the speaker's turns
    _, _, speaker_responses, _, _, _ = extract_responses(conversation)
    ratio, engagement = calculate_speaker_listener_ratio(conversation)
    r_sq, satisfaction, grateful_bonus = calculate_sentiment_shift(speaker_responses)
    print("Conversation length: ", len(conversation))
    print("Engagement coefficient: ", ratio, " --> ", engagement)
    print("Sentiment coefficient: ", r_sq, " with grateful bonus: ", grateful_bonus, " --> ", satisfaction)
    print("Classification summary: ", classify_conversation(conversation))

In [1026]:
conversation_id = 1754

In [1029]:
#df[df["conversation_id"] == 1754]

In [1027]:
test_examples(conversation_id)

Conversation length:  32
Engagement coefficient:  1.1  -->  high engagement
Sentiment coefficient:  0.39  with grateful bonus:  0.1  -->  moderate satisfaction
Classification summary:  high engagement, moderate satisfaction


## Emotion predictions are unreliable

In [1008]:
conversation_bad_emotion_prediction_example = df[df["conversation_id"] == 1754]
conversation_bad_emotion_prediction_example

Unnamed: 0,conversation_id,subreddit,post_title,author,dialog_turn,text,compound,sentiment,emotion_prediction
4191,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,1,Hello everyone. I'm a F(19) - and I've been struggling with what I call attack-like visions of torture and violence for over a year now. I've had bad intrusive thoughts before - and since I've been in therapy for close to 2 years now I have actually gotten pretty good at dealing with them. But a year ago it went from only those thoughts to actual Attacks. I'll describe the latest one; It was around 11pm - I was on the phone with my Fianceé when I suddenly felt like my entire room was sl...,-0.9868,negative,anxious
4192,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,2,Hey friend. Have you had any past traumas?,0.0,neutral,sympathizing
4193,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,3,"Well - Yes, I grew up with a very hot n cold manipulative and borderline sociopathic father. I was molested by my brothers best friend when I was 6 years old. I was bullied pretty much always from grade 1 through 10. I have slight PTSD from my Parents fighting (get shaky and stuff when people raise their voices) That's about it I think - though my childhood memories are hard to remember in their entirety - but those are the things that stuck vividly.",-0.0516,negative,hopeful
4194,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,4,Could these attacks be PTSD flashbacks?,-0.4404,negative,sentimental
4195,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,5,"As mentioned - all the people are perfect strangers , getting torn apart or tortured in diverse locations and without any link. I've never been physically hurt by anyone and haven't ever witnessed a crime at all , much less of that capacity.",-0.3867,negative,embarrassed
4196,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,6,Is it possible you’re having psychotic episodes?,0.0,neutral,anticipating
4197,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,7,I said in another comment that ai brought it up with my Therapist - she said that Psychotic episodes usually take a lot longer to get through then a maximum of 30 minutes as I have experienced.,0.0,neutral,faithful
4198,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,8,Has your therapist ever mention schizophrenia?,0.0,neutral,guilty
4199,1754,MentalHealthSupport,Another Cry for Help,Neko_Styx,9,"Schizophrenia is mostly delusional - though there are hallucinations too - I asked and researched , any disorder that falls under that umbrella has at least one month of consistent psychotic symptoms.",-0.4019,negative,lonely
4200,1754,MentalHealthSupport,Another Cry for Help,TheBassClarinetBoy,10,Have there been any additional stressors since you’ve started to have these hallucinations?,-0.4767,negative,sentimental


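Inspecting the emotion predictions for this conversation shows that most of them are implausible. For example, the listener's question "Has your therapist ever mention schizophrenia?" is tagged guilty, and the speaker's account of childhood trauma is tagged hopeful. The sentiment tags are more accurate.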

In [1009]:
df[df["emotion_prediction"]=="questioning"]

Unnamed: 0,conversation_id,subreddit,post_title,author,dialog_turn,text,compound,sentiment,emotion_prediction
799,342,MentalHealthSupport,Mental Health in the Black Community,yadadameannn,4,thanks man. We just got everything fixed. We were working on coding this weekend. Everything is fine now.,0.5719,positive,questioning
1070,455,MentalHealthSupport,I’m really struggling,McThrowaway42069,2,"Dont. No matter how bad it seems there a people who care for you. You are loved, don't hurt the people who care about you.",0.8635,positive,questioning
1361,578,MentalHealthSupport,Let Me Introduce Myself,antwerpbanana,2,"Hi Caty, thank you for sharing your story. I hope we can be of help to you here.",0.872,positive,questioning
4239,1761,MentalHealthSupport,Feel awful,TheBassClarinetBoy,2,"Hey friend. I’m recommend talking with people you trust about this, and trying to find professional help",0.8934,positive,questioning
5636,2414,MentalHealthSupport,Need help! Mental health destroying relationship...,Rock-it1,2,Therapist here. Can you provide a few more details?,0.0,neutral,questioning


In [1010]:
print(df[(df["conversation_id"]==342) & (df["dialog_turn"]==4)]["text"])
print(df[(df["conversation_id"]==455) & (df["dialog_turn"]==2)]["text"])
print(df[(df["conversation_id"]==578) & (df["dialog_turn"]==2)]["text"])
print(df[(df["conversation_id"]==1761) & (df["dialog_turn"]==2)]["text"])
print(df[(df["conversation_id"]==2414) & (df["dialog_turn"]==2)]["text"])

799    thanks man. We just got everything fixed. We were working on coding this weekend. Everything is fine now. 
Name: text, dtype: object
1070    Dont. No matter how bad it seems there a people who care for you. You are loved, don't hurt the people who care about you.
Name: text, dtype: object
1361    Hi Caty, thank you for sharing your story. I hope we can be of help to you here.
Name: text, dtype: object
4239    Hey friend. I’m recommend talking with people you trust about this, and trying to find professional help
Name: text, dtype: object
5636    Therapist here. Can you provide a few more details?
Name: text, dtype: object


The questioning tags are not accurate either. Only one of the five posts tagged as questioning (the last one, "Can you provide a few more details?") actually asks a question.

In [1011]:
df[df["emotion_prediction"]=="suggesting"]

Unnamed: 0,conversation_id,subreddit,post_title,author,dialog_turn,text,compound,sentiment,emotion_prediction


No posts in this subreddit carry the suggesting tag, so it cannot contribute to the engagement measure.