In [1]:
from IPython.display import Image, display; display(Image(url="https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.sprinklr.com%2Fblog%2Fchatbot-examples%2F&psig=AOvVaw3GjLwPVFaNAUG6e4xKJYH2&ust=1705391165437000&source=images&cd=vfe&opi=89978449&ved=0CBMQjRxqFwoTCJDLi8yZ34MDFQAAAAAdAAAAABAI"))



## <div style="color:white;display:fill;border-radius:8px;background-color:#800080;font-size:150%; letter-spacing:1.0px"><p style="padding: 15px;color:white;"><b><b><span style='color:white'><span style='color:#F1A424'> | </span> </span></b>Defining the Question</b></p></div>

## <b><span style='color:#F1A424'>|</span> Executive Summary:</b> 

**Mental health, fundamentally a state of well-being, is crucial for individuals to realize their abilities, manage life's normal stresses, work productively, and contribute to their communities. Despite the rising global prevalence of mental health issues, including a 13% increase over the last decade noted by the WHO, access to effective treatments remains uneven, particularly among urban youths who face distinct challenges and stressors.
 Saidika, a burgeoning mental health service provider for urban youth, has encountered challenges due to the growing demand for mental health services. The volume of clients has impeded the prompt allocation of therapy resources, particularly for urgent cases, prompting the need for innovative solutions to enhance the efficiency and effectiveness of mental health care delivery. By leveraging the capabilities of AI and advancements in NLP, the project aims to bridge the gap between the growing demand for mental health services and the current limitations in supply and accessibility.**


## <b><span style='color:#F1A424'>|</span> Problem Statement:</b> 

**Saidika's platform is currently unable to efficiently handle the increasing influx of clients seeking mental health services. The inability to quickly triage and prioritize client needs is leading to potential delays in addressing urgent cases, which could have severe consequences on the well-being of individuals in need.**

## <b><span style='color:#F1A424'>|</span> Proposed Solution:</b> 

**The main objective is to integrate an advanced AI-powered mental health chatbot into Saidika's existing platform
to optimize client management processes, ensuring timely and appropriate allocation of therapy resources to those in need.**


## <b><span style='color:#F1A424'>|</span> Specific Objectives:</b> 
- **Client Categorization: To develop a chatbot that can accurately categorize clients based on their responses, distinguishing between varying levels of care requirements and scheduling clients based on their assessed needs and therapists' availability, optimizing the use of Saidika's resources.**
- **Urgency Escalation: To ensure the chatbot is capable of rapidly identifying and escalating urgent cases to therapists, facilitating prompt intervention.**
- **Service Accessibility: To broaden access to mental health care by providing a 24/7 chatbot service that will offer real-time interaction to clients who require immediate attention or a platform to express their concerns, bridging the gap until a professional is available.**
- **Resource Optimization: To aid therapists in managing their workload more effectively by allowing the chatbot to handle routine inquiries and non-urgent interactions.**
- **Data Collection and Analysis: To gather and analyze interaction data to continually improve the chatbot’s performance and the platform’s services.**
- **User Experience Enhancement: To create a user-friendly chatbot interface that provides a supportive environment for clients to express their concerns.**
- **Integration and Compatibility: To seamlessly integrate the chatbot into both web and mobile applications, ensuring functionality across various devices.**


## <b><span style='color:#F1A424'>|</span> Project Impact:</b> 

**The successful implementation of the mental health chatbot is expected to significantly improve the scalability of Saidika's services, enabling them to handle a greater volume of clients without sacrificing the quality of care. This technological solution aims to not only streamline operations but also to provide a critical early support system for individuals seeking mental health assistance. The chatbot's ability to analyze data will also furnish Saidika with valuable insights, driving policy and decision-making to better serve the community's mental health needs. Ultimately, the project endeavors to foster a more resilient urban youth population, better equipped to contribute positively to their communities.**

## DATA PERTINENCE AND ATTRIBUTION


**The business aims to gain valuable insights into mental health trends, sentiments, and urgency levels by leveraging a diverse dataset acquired from public domain resources and Saidika's private, anonymized user data with proper consent and privacy law adherence. The data primarily consists of information gathered from health forums, Reddit, a dedicated mental health forum, and Beyond Blue.**

**Data Preparation:**

**Data Sources: Public domain resources and private Saidika user data.**

**Variable Types:**

- **Categorical variables: Representing various types of mental health issues.**

- **Binary variables: Indicating urgency levels.**
- **Continuous variables: Expressing sentiment scores associated with mental health discussions.**

**Preprocessing Steps:**

- **Text data cleaning: Removal of identifiable information.**

- **Tokenization: Breaking down text into tokens.**

- **Lemmatization: Reducing words to their base or root form.**

- **Vectorization: Converting text into numerical vectors suitable for Natural Language Processing (NLP) tasks.**
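
A minimal sketch of the preprocessing steps listed above, using only the Python standard library. The regular expressions, the `[EMAIL]`/`[URL]`/`[USER]` placeholders, and the function names are illustrative assumptions, not the project's actual pipeline. Lemmatization is left out because it would normally be delegated to NLTK or spaCy.

```python
import re
from collections import Counter

# Hypothetical patterns for identifiable information; a real pipeline would need a broader set.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"\bu/\w+")

def redact(text):
    """Replace e-mail addresses, URLs and Reddit user handles with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = URL_RE.sub("[URL]", text)
    return USER_RE.sub("[USER]", text)

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def vectorize(docs):
    """Build a shared vocabulary and return one bag-of-words count vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors

# Made-up example posts
posts = ["Mail me at jane@example.com, I feel alone", "I feel tired, so tired"]
vocab, vectors = vectorize([redact(p) for p in posts])
print(vocab)
print(vectors)
```

Each vector has one entry per vocabulary word, so all documents share the same dimensionality, which is what downstream models such as logistic regression expect.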

**Libraries Used:**

- **BeautifulSoup: Utilized for parsing and extracting data from HTML content.**

- **Python Libraries (NLTK, spaCy): Applied for NLP tasks such as tokenization, lemmatization, and other text processing operations.**

**Algorithms:**

- **Logistic Regression: Employed for analyzing categorical and binary variables, predicting urgency levels based on mental health issues.**

- **LSTM (Long Short-Term Memory): Utilized for sequence modeling in NLP, capturing dependencies in sentiment scores over the course of discussions.**

- **BERT (Bidirectional Encoder Representations from Transformers): Implemented for advanced contextualized embeddings, enhancing understanding of the nuanced context within mental health discourse.**

- **GPT (Generative Pre-trained Transformer): Employed for generating human-like text responses and comprehending the context of mental health discussions.**

**Overall, the objective is to extract meaningful insights, patterns, and correlations from this rich dataset, contributing to a deeper understanding of mental health issues, sentiments, and urgency levels, ultimately informing strategies for better mental health support and intervention.**








## <div style="color:white;display:fill;border-radius:8px;background-color:#800080;font-size:150%; letter-spacing:1.0px"><p style="padding: 12px;color:white;"><b><b><span style='color:white'><span style='color:#F1A424'>1 |</span></span></b>Data Loading & Preparation</b></p></div>

## <b>1.1 <span style='color:#F1A424'>|</span> Importing Necessary Libraries</b> 

In [2]:
import re
import string
import numpy as np
import random
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns  #plotting statistical graphs
%matplotlib inline
from plotly import graph_objs as go
import plotly.express as px
import plotly.figure_factory as ff
# import squarify
from collections import Counter

# Load the Text Cleaning Package
import neattext.functions as nfx

from PIL import Image
# Word clouds visualize text data: the size of each word indicates its frequency
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn.metrics import confusion_matrix,roc_auc_score,classification_report
from sklearn.compose import ColumnTransformer

from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier,AdaBoostClassifier,GradientBoostingClassifier,ExtraTreesClassifier
from sklearn.linear_model import RidgeClassifier,SGDClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import MultinomialNB


import nltk
from nltk.corpus import stopwords

from tqdm import tqdm  # progress bars for long-running loops
import os
#import spacy #for training the NER model tokenize words
#import random
#from spacy.util import compounding
#from spacy.util import minibatch


pd.set_option('max_colwidth', 400)
pd.set_option('use_mathjax', False)


import warnings
warnings.filterwarnings("ignore")

## <b>1.2 <span style='color:#F1A424'>|</span> Loading in our Data</b> 

In [3]:
# load the dataset -> feature extraction -> data visualization -> data cleaning -> train test split
# -> model building -> model training -> model evaluation -> model saving -> streamlit application deploy

# load the aggregated dataset
df = pd.read_csv('../data/Aggregated_Data_Final.csv')

df

Unnamed: 0,Subreddit,Reddit Post,Unnamed: 2
0,CPTSD,Feeling like I was made to be unlovable,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor..."
1,CPTSD,DAE not know what to do with themselves when they have time?,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob..."
2,CPTSD,Yoga triggers me- anyone else?,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a..."
3,CPTSD,Did anyone else have a parent who said - you can make the choice - do you want ho listen to the sweet loving voice of tell it in or should I beat you?,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?
4,CPTSD,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",
...,...,...,...
27673,suicidewatch,"But I want help doing it Tired of everyone saying no don't.\n\n*Get up so I can punch you again*\n\nI'm exhausted I want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered.\n\nFuck the helpline I want real help I want help out",
27674,suicidewatch,Nothing to live for The ONLY reason I am alive right now is because of my sweet cat Pippin. Yesterday was the anniversary of adopting him 2 years ago. \nI've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. I got so mad yesterday and yelled at him. All I can think about now is how I should give him to someone healt...,
27675,suicidewatch,Iâ€™m going to fucking kill myself 18 years too long. I think Iâ€™m going to go,
27676,suicidewatch,Iâ€™m going to pieces All Iâ€™ve done for about a month has been lay in bed. I donâ€™t enjoy anything. Canâ€™t focus on anything. I am terrified of the future. I donâ€™t want to be alive. I am in so much emotional pain. The only reason I am alive is my father because I donâ€™t want to hurt him. That and I canâ€™t decide on a method. I think not wanting to hurt him keeps me from choosing a meth...,


In [4]:
# Combine the two columns,'Reddit Post','Unnamed: 2' into a new column named "reddit_post"
df['reddit_post'] = df['Unnamed: 2'].fillna(df['Reddit Post'])

# Drop the original columns if needed
df.drop(['Reddit Post', 'Unnamed: 2'], axis=1, inplace=True)


In [5]:
df

Unnamed: 0,Subreddit,reddit_post
0,CPTSD,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor..."
1,CPTSD,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob..."
2,CPTSD,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a..."
3,CPTSD,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?
4,CPTSD,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?"
...,...,...
27673,suicidewatch,"But I want help doing it Tired of everyone saying no don't.\n\n*Get up so I can punch you again*\n\nI'm exhausted I want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered.\n\nFuck the helpline I want real help I want help out"
27674,suicidewatch,Nothing to live for The ONLY reason I am alive right now is because of my sweet cat Pippin. Yesterday was the anniversary of adopting him 2 years ago. \nI've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. I got so mad yesterday and yelled at him. All I can think about now is how I should give him to someone healt...
27675,suicidewatch,Iâ€™m going to fucking kill myself 18 years too long. I think Iâ€™m going to go
27676,suicidewatch,Iâ€™m going to pieces All Iâ€™ve done for about a month has been lay in bed. I donâ€™t enjoy anything. Canâ€™t focus on anything. I am terrified of the future. I donâ€™t want to be alive. I am in so much emotional pain. The only reason I am alive is my father because I donâ€™t want to hurt him. That and I canâ€™t decide on a method. I think not wanting to hurt him keeps me from choosing a meth...


In [6]:
#summary of our DataFrame
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 27678 entries, 0 to 27677
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Subreddit    27678 non-null  object
 1   reddit_post  27678 non-null  object
dtypes: object(2)
memory usage: 432.6+ KB


Finding the number of unique classes (subreddits) in our data

In [7]:
#obtain the unique values in the 'Subreddit' column
df.Subreddit.unique()

array(['CPTSD', 'diagnosedPTSD', 'alcoholism', 'socialanxiety',
       'suicidewatch'], dtype=object)

Below, we count the number of characters in each post.

In [8]:
#character count of each reddit post
df['reddit_post'].apply(str).apply(len)

0         443
1        1789
2         522
3         212
4         103
         ... 
27673     311
27674    1147
27675      79
27676     607
27677     526
Name: reddit_post, Length: 27678, dtype: int64

We find the number of null values in each column

In [9]:
#df.isna().sum()

In [10]:
#drop all NAN values in our dataframe
#df.dropna(inplace=True)


In [11]:
#check the number of null values 
#df.isna().sum()


Next, we find the number of words in each post

In [12]:
#word count in each reddit post
df['reddit_post'].dropna().apply(lambda x: len(x.split(" ")))

0         81
1        315
2         96
3         43
4         16
        ... 
27673     56
27674    225
27675     16
27676    125
27677     98
Name: reddit_post, Length: 27678, dtype: int64
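
The count above splits on a single space character, so runs of spaces produce empty strings and newlines do not separate words. A small illustration of the difference on made-up strings; calling `split()` with no argument splits on any run of whitespace. The helper name `word_count` is an assumption for this example.

```python
def word_count(text):
    """Count words by splitting on any run of whitespace (spaces, tabs, newlines)."""
    return len(text.split())

double_space = "I am  tired"            # two spaces between "am" and "tired"
with_newlines = "See title.\n\nI used to"

# Splitting on " " keeps an empty string for the double space
print(len(double_space.split(" ")), word_count(double_space))

# Splitting on " " treats "title.\n\nI" as one word
print(len(with_newlines.split(" ")), word_count(with_newlines))
```

On real posts the two methods usually differ by only a few words, but the whitespace-aware count is the safer default when posts contain line breaks.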

## <div style="color:white;display:fill;border-radius:8px;background-color:#800080;font-size:150%; letter-spacing:1.0px"><p style="padding: 12px;color:white;"><b><b><span style='color:white'><span style='color:#F1A424'>2 |</span></span></b> Data Quality Checks</b></p></div>
   
- **Another crucial step in any project is ensuring the quality of the data. A model's performance is directly tied to the data it processes, so we take the time to remove duplicates and handle missing values appropriately.**

- **Here we check for missing values and duplicates and remove any unnecessary columns. Because the data is free text, conventional numeric outlier checks do not apply.**

## <b>2.1 <span style='color:#F1A424'>|</span> Checking for NaN Values</b> 

In [13]:
# count the NaN values in each column of our dataframe
df.isna().sum()

Subreddit      0
reddit_post    0
dtype: int64

In [14]:
# print the count of NaN values for each column
print(df.isna().sum())
print("*"*40)

Subreddit      0
reddit_post    0
dtype: int64
****************************************


**As noted, we have no missing values in our dataframe.**

## <b>2.2 <span style='color:#F1A424'>|</span> Checking for Sentence Length Consistency</b> 

In [15]:
df['reddit_post'].apply(len)

0         443
1        1789
2         522
3         212
4         103
         ... 
27673     311
27674    1147
27675      79
27676     607
27677     526
Name: reddit_post, Length: 27678, dtype: int64

**This gives an overview of the number of characters per post. Posts of five characters or fewer would carry too little text to be useful for our predictive model, so we check for them next.**

In [16]:
sum(df['reddit_post'].apply(len) > 5) , sum(df['reddit_post'].apply(len) <= 5)

(27678, 0)

**All our posts are longer than five characters.**
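
Because `len` on a string counts characters, the check above does not flag posts that contain only a few words. A hedged sketch of a word-level filter in plain Python; the threshold of five words and the helper name `is_short_post` are assumptions for illustration, and the sample posts mirror texts that appear in this dataset.

```python
MIN_WORDS = 5  # assumed threshold, for illustration only

def is_short_post(text, min_words=MIN_WORDS):
    """Return True when the post has fewer than `min_words` whitespace-separated words."""
    return len(text.split()) < min_words

posts = [
    "Help me Please",
    "Yoga triggers me- anyone else?",
    "I lost like $50K to this coronavirus. ^ Title. Good bye stocks :(",
]

# Every sample passes the character check, but one is flagged by the word check
short = [p for p in posts if is_short_post(p)]
print(short)
```

Whether such short posts should be dropped is a modeling decision; for an urgency classifier, a three-word plea may still be informative.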

## <b>2.3 <span style='color:#F1A424'>|</span> Checking for Duplicates</b> 

In [17]:
#check and print the number of duplicates
print(df.duplicated().sum())
print("*"*40)

12
****************************************


**We notice that we have 12 duplicates.**

In [18]:
# inspect a random sample of posts whose text appears more than once
df[df.duplicated(subset=['reddit_post'],keep=False)].sort_values(by='reddit_post').sample(10)

Unnamed: 0,Subreddit,reddit_post
4225,socialanxiety,Am I strange for being aroused by a man's looks? Or does it make me dirty that I like looking at men? Is it common? I ask because Ive been reading a lot on reddit about dating &amp; attraction lately. A lot of men keep telling me that most women are able to be aroused by ugly men so long as those men say &amp; do the right things. I am not this way though. I just like beauty. My question is: i...
6473,suicidewatch,"sad new year i went into 2020 completely alone in my room, crying and wanting to die. my boyfriend didnâ€™t help. heâ€™s at a friends house and doesnâ€™t wanna talk to me today."
4205,socialanxiety,Thereâ€™s a party going on in my house and iâ€™m stuck in my room. I hate that iâ€™m like this but donâ€™t know any different.
16952,suicidewatch,"I'm going to go shoot up an elementary school as soon as I run out of money. I can't work, and I can't get a job. I have some money, about $40,000. Inheritance. As soon as my money runs out, I'm going to go shoot up an elementary school and spend the rest of my life in prison.\n\nI'll do it because prison is better than being homeless, and because I want to ""give back"" to this fucked-up world ..."
26120,suicidewatch,I canâ€™t stop crying WHY WONT ANYONE HELP ME???
4592,socialanxiety,Question about benzodiazepines/social exposure therapy It's been a while since I've seen a doctor about some of my social issues and I wanted some perspective before seeking any specific medication. Have benzodiazepines ever facilitated for you any kind of exposure therapy? Does the ease benzodiazepines bring in social scenarios leave any lasting effects that go beyond the timeframe the drug i...
1754,alcoholism,How yâ€™all get so much booze Iâ€™m 17 and working towards it but I always hit a speedbump whether itâ€™s money or finding somewhere to get some I find myself getting high off household appliances whenever I canâ€™t find any and thatâ€™s not good for me so I just want some ideas on finding a steady constant source
1755,alcoholism,How yâ€™all get so much booze Iâ€™m 17 and working towards it but I always hit a speedbump whether itâ€™s money or finding somewhere to get some I find myself getting high off household appliances whenever I canâ€™t find any and thatâ€™s not good for me so I just want some ideas on finding a steady constant source
4837,socialanxiety,"I need help My grandma says that I have to go to a ""school"" but Im suspicious, I can't find anything about it online and my grandma is being weird about it. I have a feeling they are gonna drive me to a pysch ward as a punishment. They did it before and no one listened to me. I'm almost 100 percent sure that's what they are gonna do, I'm not depressed or anything. I got kicked out of school la..."
7152,suicidewatch,Help me Please
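
The two duplicate checks above use different rules: `df.duplicated()` compares whole rows and flags every occurrence after the first, while `duplicated(subset=['reddit_post'], keep=False)` compares only the post text and flags every occurrence, including the first. A plain-Python illustration of the two rules on made-up rows (it mimics the semantics and is not pandas code; the function names are hypothetical):

```python
from collections import Counter

rows = [
    ("alcoholism", "same text"),
    ("alcoholism", "same text"),      # full-row duplicate of the first row
    ("suicidewatch", "same text"),    # same post text, different subreddit
    ("CPTSD", "unique text"),
]

def flag_after_first(items):
    """Mimic keep='first': flag an item only if an equal item appeared earlier."""
    seen = set()
    flags = []
    for item in items:
        flags.append(item in seen)
        seen.add(item)
    return flags

def flag_all_repeats(items):
    """Mimic keep=False: flag every item that occurs more than once."""
    counts = Counter(items)
    return [counts[item] > 1 for item in items]

full_row_flags = flag_after_first(rows)                     # whole-row comparison
text_flags = flag_all_repeats([text for _, text in rows])   # post text only
print(full_row_flags)
print(text_flags)
```

Under these rules the text-only check can return more rows than the full-row count of 12 suggests, for example when the same text was posted to two subreddits.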


In [19]:
df = df.drop_duplicates()

print(df.duplicated().sum())
print("*"*40)

0
****************************************


In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 27666 entries, 0 to 27677
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Subreddit    27666 non-null  object
 1   reddit_post  27666 non-null  object
dtypes: object(2)
memory usage: 648.4+ KB


## <div style="color:white;display:fill;border-radius:8px;background-color:#800080;font-size:150%; letter-spacing:1.0px"><p style="padding: 12px;color:white;"><b><b><span style='color:white'><span style='color:#F1A424'>3 |</span></span></b> Data Preprocessing</b></p></div>




## <b>3.1 <span style='color:#F1A424'>|</span> Cleaning Textual Data</b> 

We will clean and preprocess the textual data in the dataset to enhance its quality and consistency:
- Remove unnecessary characters.
- Convert text to lowercase for uniformity.
- Tokenization: Tokenize the text data to break it into individual words or tokens. This step is crucial for further analysis of the textual content.
- Normalization: Apply normalization techniques, such as stemming or lemmatization, to reduce words to their base or root forms. This aids in standardizing the text.
- Stop Word Removal: Eliminate common stop words from the text to focus on meaningful content. Stop words often do not contribute significantly to the analysis.
- Entity Recognition: Identify and recognize entities within the text. This step is particularly useful when dealing with named entities or specific information entities.
- Syntax Parsing: Perform syntax parsing to analyze the grammatical structure of sentences. This can provide insights into relationships between words.
- Text Transformation: Implement additional text transformations as needed for your specific analysis or modeling requirements.

**We will utilize the NeatText Library for text cleaning, a straightforward NLP package designed for cleaning and preprocessing textual data. This library simplifies the process of cleaning unstructured text by handling tasks such as removing special characters and stopwords, thereby reducing noise in the data.**
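
For orientation, the first few cleaning steps listed above can be sketched in plain Python. This is a simplified stand-in, not NeatText's implementation: the URL pattern, the tiny stop-word set, and the function name `basic_clean` are assumptions made for the example. URLs are extracted before special characters are stripped, because removing `:` and `/` first would leave nothing recognizable as a URL.

```python
import re

URL_RE = re.compile(r"https?://\S+")  # assumed, simplified URL pattern
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "i"}  # tiny illustrative set

def basic_clean(text):
    """Return (cleaned_tokens, urls) for one post."""
    urls = URL_RE.findall(text)                 # 1. extract URLs while they are still intact
    text = URL_RE.sub(" ", text)                # 2. drop them from the text
    text = text.lower()                         # 3. lowercase for uniformity
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # 4. replace remaining special characters with spaces
    tokens = text.split()                       # 5. tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # 6. remove stop words
    return tokens, urls

tokens, urls = basic_clean("I read this: https://example.org/help and the advice is GOOD!")
print(tokens)
print(urls)
```

Step 4 is deliberately crude: it also splits contractions such as "don't" into two tokens, which a dedicated library handles more carefully.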

In [21]:
# load the text cleaning packages

import neattext as nt
import neattext.functions as nfx

# List the attributes and functions exposed by the package
dir(nt)

['AUTOMATED_READ_INDEX',
 'BTC_ADDRESS_REGEX',
 'CONTRACTIONS_DICT',
 'CURRENCY_REGEX',
 'CURRENCY_SYMB_REGEX',
 'Callable',
 'Counter',
 'CreditCard_REGEX',
 'DATE_REGEX',
 'EMAIL_REGEX',
 'EMOJI_REGEX',
 'FUNCTORS_WORDLIST',
 'HASTAG_REGEX',
 'HTML_TAGS_REGEX',
 'List',
 'MASTERCard_REGEX',
 'MD5_SHA_REGEX',
 'MOST_COMMON_PUNCT_REGEX',
 'NUMBERS_REGEX',
 'PHONE_REGEX',
 'PUNCT_REGEX',
 'PoBOX_REGEX',
 'SPECIAL_CHARACTERS_REGEX',
 'STOPWORDS',
 'STOPWORDS_de',
 'STOPWORDS_en',
 'STOPWORDS_es',
 'STOPWORDS_fr',
 'STOPWORDS_ru',
 'STOPWORDS_yo',
 'STREET_ADDRESS_REGEX',
 'TextCleaner',
 'TextExtractor',
 'TextFrame',
 'TextMetrics',
 'TextPipeline',
 'Tuple',
 'URL_PATTERN',
 'USER_HANDLES_REGEX',
 'VISACard_REGEX',
 'ZIP_REGEX',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'clean_text',
 'defaultdict',
 'digit2words',
 'emoji_explainer',
 'emojify',
 'explainer',
 'extract_btc_address',
 

### <b>3.1.1 <span style='color:#F1A424'>|</span> clean_text function</b> 

In [22]:
# Noise scan
df['reddit_post'].apply(lambda x: nt.TextFrame(x).noise_scan()['text_noise'])

0        13.769752
1        11.906093
2        12.835249
3        14.150943
4        10.679612
           ...    
27673    12.540193
27674    14.734089
27675     8.860759
27676    14.332784
27677    14.068441
Name: reddit_post, Length: 27666, dtype: float64

In [23]:
# Ensure all entries in reddit_post column are strings
df['reddit_post'] = df['reddit_post'].astype(str)

# Now apply the clean_text function
df['clean_post'] = df['reddit_post'].apply(lambda x: nfx.clean_text(x, puncts=False, stopwords=False))

In [24]:
df

Unnamed: 0,Subreddit,reddit_post,clean_post
0,CPTSD,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...","i don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the csa, whatever. i'm a mess right now. i feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade i'm still a monster. as if i was someone who was built for being unlovable and despised, physically and emotionally, since i was born. i keep working and work..."
1,CPTSD,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...","see title. i used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. now i am one of them. through years of hard depression and su.c.dal.ty thanks to cptsd i have stopped all my hobbies. i entrench myself in work and by now also meeting people and sometimes obligato..."
2,CPTSD,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...","i was doing yoga for years as a tool to help me back into my body when i was feeling rough as a form of reconnection. i even went as far as becoming trained in teaching, doing a 200hr training. as my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (in retrospect i wonder if i was in fact being dissociated the whole time.) started agai..."
3,CPTSD,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,the child me thought i made the right choice by listening to him. and he said as much. that i had finally done something right . especially when they kept blaming me for all the things i did wrong . anyone else ?
4,CPTSD,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?","women: what is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?"
...,...,...,...
27673,suicidewatch,"But I want help doing it Tired of everyone saying no don't.\n\n*Get up so I can punch you again*\n\nI'm exhausted I want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered.\n\nFuck the helpline I want real help I want help out","but i want help doing it tired of everyone saying no don't. *get up so i can punch you again* i'm exhausted i want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered. fuck the helpline i want real help i want help out"
27674,suicidewatch,Nothing to live for The ONLY reason I am alive right now is because of my sweet cat Pippin. Yesterday was the anniversary of adopting him 2 years ago. \nI've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. I got so mad yesterday and yelled at him. All I can think about now is how I should give him to someone healt...,nothing to live for the only reason i am alive right now is because of my sweet cat pippin. yesterday was the anniversary of adopting him 2 years ago. i've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. i got so mad yesterday and yelled at him. all i can think about now is how i should give him to someone healthi...
27675,suicidewatch,Iâ€™m going to fucking kill myself 18 years too long. I think Iâ€™m going to go,iâ€™m going to fucking kill myself 18 years too long. i think iâ€™m going to go
27676,suicidewatch,Iâ€™m going to pieces All Iâ€™ve done for about a month has been lay in bed. I donâ€™t enjoy anything. Canâ€™t focus on anything. I am terrified of the future. I donâ€™t want to be alive. I am in so much emotional pain. The only reason I am alive is my father because I donâ€™t want to hurt him. That and I canâ€™t decide on a method. I think not wanting to hurt him keeps me from choosing a meth...,iâ€™m going to pieces all iâ€™ve done for about a month has been lay in bed. i donâ€™t enjoy anything. canâ€™t focus on anything. i am terrified of the future. i donâ€™t want to be alive. i am in so much emotional pain. the only reason i am alive is my father because i donâ€™t want to hurt him. that and i canâ€™t decide on a method. i think not wanting to hurt him keeps me from choosing a meth...
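
Some posts in the preview above contain sequences such as `Iâ€™m`, which survive `clean_text`. This pattern is typical of UTF-8 text that was decoded as Windows-1252 somewhere upstream, although the notebook does not confirm the cause. Assuming that is what happened, a round trip through the two encodings restores the original characters; the helper name `repair_mojibake` is hypothetical.

```python
def repair_mojibake(text):
    """Undo a UTF-8-read-as-cp1252 mix-up; return the text unchanged if it does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mix-up, so leave it alone
        return text

print(repair_mojibake("Iâ€™m going to go"))
```

If the assumption holds, the helper could be applied with `df['reddit_post'].apply(repair_mojibake)` before cleaning. A post that mixes garbled sequences with characters outside Windows-1252 is returned unchanged, so this is a best-effort repair rather than a guarantee.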


In [25]:
# Extract URLs into a separate column before removing them.
# If special characters (e.g. '//') were removed first, the function would be unable to detect the URLs.
df['urls'] = df['clean_post'].apply(nfx.extract_urls)

df[['reddit_post', 'clean_post', 'urls']].sample(5)

Unnamed: 0,reddit_post,clean_post,urls
12737,"Iâ€™ve cut myself again Iâ€™m so pathetic, Iâ€™ve been cutting myself ever since I was 14 years old, it has been 5 years and I still canâ€™t stop.\n\nIâ€™m a waste of space and I wish I wasnâ€™t so coward, because then I would have ended it all already.\n\nI just want to rest.","iâ€™ve cut myself again iâ€™m so pathetic, iâ€™ve been cutting myself ever since i was 14 years old, it has been 5 years and i still canâ€™t stop. iâ€™m a waste of space and i wish i wasnâ€™t so coward, because then i would have ended it all already. i just want to rest.",[]
22889,I lost like $50K to this coronavirus. ^ Title. Good bye stocks :(,i lost like $50k to this coronavirus. ^ title. good bye stocks :(,[]
18363,if anyone has a nuke please just fucking launch it already fuck you lifeâ€™s inherently pointless,if anyone has a nuke please just fucking launch it already fuck you lifeâ€™s inherently pointless,[]
2576,My sister is having a New Years party with all her cool friends downstairs My room (and the beer) is down there so Iâ€™m trapped up here and theyâ€™re like filtering up and down to see me awkwardly sitting on the couch by myself. I thought maybe there was a chance I would join them but now that theyâ€™re here there is no chance lmao. \n\nJust figured I could share some New Years despair with y...,my sister is having a new years party with all her cool friends downstairs my room (and the beer) is down there so iâ€™m trapped up here and theyâ€™re like filtering up and down to see me awkwardly sitting on the couch by myself. i thought maybe there was a chance i would join them but now that theyâ€™re here there is no chance lmao. just figured i could share some new years despair with yâ€™a...,[]
26706,"a vent i guess i dont know why im writing this. i guess its because when it comes down to it all i really do want is for people to listen to me. and validation. i really wish people did like me. ive been on and off depressed since i was 12 probably and im 19 now. i was getting better, i had only seriously considered suicide a few times in the past year, which is way less than i used to. but my...","a vent i guess i dont know why im writing this. i guess its because when it comes down to it all i really do want is for people to listen to me. and validation. i really wish people did like me. ive been on and off depressed since i was 12 probably and im 19 now. i was getting better, i had only seriously considered suicide a few times in the past year, which is way less than i used to. but my...",[]
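The ordering noted in the cell above matters because URL detection relies on punctuation such as `://`. The sketch below illustrates the effect with two simplified regular expressions. `URL_PATTERN`, `SPECIAL_CHARS`, the helper functions, and the sample string are assumptions made for illustration; they are not neattext's actual patterns.

```python
import re

# Simplified patterns for illustration only (assumptions, not neattext's own).
URL_PATTERN = re.compile(r"https?://\S+")
SPECIAL_CHARS = re.compile(r"[^A-Za-z0-9 ]+")

def extract_urls(text):
    """Return every substring that looks like an http(s) URL."""
    return URL_PATTERN.findall(text)

def strip_special(text):
    """Drop everything except letters, digits and spaces."""
    return SPECIAL_CHARS.sub("", text)

# Hypothetical sample text
post = "see https://example.com/help for details"

# Extracting first finds the URL
urls_first = extract_urls(post)

# Stripping special characters first removes '://' and '.', so nothing matches
urls_after_strip = extract_urls(strip_special(post))

print(urls_first)
print(urls_after_strip)
```

Running the two orders side by side shows why the notebook stores the URLs in their own column before any character stripping takes place.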


### <b>3.1.4 <span style='color:#F1A424'>|</span> Special Characters</b> 

In [26]:
# Remove special characters

df['clean_post'] = df['clean_post'].apply(nfx.remove_special_characters)

df[['reddit_post', 'clean_post']].sample(5)

Unnamed: 0,reddit_post,clean_post
26441,"Handgun So things are starting to chill out here a bit. I work in the entertainment industry, and me and my closest friends were among the first to be unemployed. Before restaurants and bars, the 20,000 plus events were rightfully shut down ASAP, exactly one month ago. There were definitely a few weeks when everything felt so uncertain. All my non violent friends started buying guns. For the p...",handgun so things are starting to chill out here a bit i work in the entertainment industry and me and my closest friends were among the first to be unemployed before restaurants and bars the 20000 plus events were rightfully shut down asap exactly one month ago there were definitely a few weeks when everything felt so uncertain all my non violent friends started buying guns for the past few y...
4454,"Classmate takes pictures of me Sorry if this is worded funny or the grammar is bad, my mind is doing itâ€™s thing again and itâ€™s hard to focus on composing my thoughts. \n\nAnyway, Iâ€™m a 16 year old guy and thereâ€™s a guy in my photography class who has periodically taken pictures of me. We donâ€™t talk to each other except for the occasional few word exchange regarding an assignment. The...",classmate takes pictures of me sorry if this is worded funny or the grammar is bad my mind is doing its thing again and its hard to focus on composing my thoughts anyway im a 16 year old guy and theres a guy in my photography class who has periodically taken pictures of me we dont talk to each other except for the occasional few word exchange regarding an assignment the first time he got out o...
14898,"What am I doing wrong? Recently went to college and had the biggest spike of depression ever. The police got involved when I tried to end it made my roommates hate me. \nI worked really fucking hard as was able to get new and amazing roommates, I also have a solid group of friends who I adore, a wonderful job, and Iâ€™m doing a lot better in school. Iâ€™ve worked my ass off to make sure every ...",what am i doing wrong recently went to college and had the biggest spike of depression ever the police got involved when i tried to end it made my roommates hate me i worked really fucking hard as was able to get new and amazing roommates i also have a solid group of friends who i adore a wonderful job and im doing a lot better in school ive worked my ass off to make sure every aspect of my li...
3920,"I didn't know how to talk to people, so I wrote books to express my feelings. Hi guys, this is my first time posting on here and I'm finally ready to talk to the community. So for the past year my social anxiety has been so severe that I don't have any friends at all. Eventually I got so alone and depressed that I started writing, it helped me a lot. Writing was a way for me to create friends ...",i didnt know how to talk to people so i wrote books to express my feelings hi guys this is my first time posting on here and im finally ready to talk to the community so for the past year my social anxiety has been so severe that i dont have any friends at all eventually i got so alone and depressed that i started writing it helped me a lot writing was a way for me to create friends and kind o...
15401,"Iâ€™m sitting in a bathroom stall at lunch crying and listening to sad music Bahahahha Iâ€™m such a fucking loser and terrible person I donâ€™t deserve shit on Valentines Day Iâ€™m just gonna be a sad piece of shit forever I really just need someone else to shoot me in the head, cuz way my pussy ass can do it.",im sitting in a bathroom stall at lunch crying and listening to sad music bahahahha im such a fucking loser and terrible person i dont deserve shit on valentines day im just gonna be a sad piece of shit forever i really just need someone else to shoot me in the head cuz way my pussy ass can do it


### <b>3.1.5 <span style='color:#F1A424'>|</span> Multiple Whitespaces</b> 

In [27]:
# Remove multiple (repeated) whitespaces
df['clean_post'] = df['clean_post'].apply(nfx.remove_multiple_spaces)

df[['reddit_post', 'clean_post']].head()

Unnamed: 0,reddit_post,clean_post
0,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",i dont know if it was the emotional neglect the psychological abuse medical abuse bullying the csa whatever im a mess right now i feel like a horrible monster that somewhat a lot of people see as attractive but under the facade im still a monster as if i was someone who was built for being unlovable and despised physically and emotionally since i was born i keep working and working but i still...
1,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",see title i used to be the person full of hobbies biking drawing reading writing walking gaming who really disliked people who never knew what to do with their free time and would be clingy now i am one of them through years of hard depression and sucdalty thanks to cptsd i have stopped all my hobbies i entrench myself in work and by now also meeting people and sometimes obligatory projects li...
2,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",i was doing yoga for years as a tool to help me back into my body when i was feeling rough as a form of reconnection i even went as far as becoming trained in teaching doing a 200hr training as my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me in retrospect i wonder if i was in fact being dissociated the whole time started again recen...
3,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,the child me thought i made the right choice by listening to him and he said as much that i had finally done something right especially when they kept blaming me for all the things i did wrong anyone else
4,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women what is the real situation of misogyny patriarchy sexual abuse and harassment in your country


### <b>3.1.6 <span style='color:#F1A424'>|</span> Emojis</b> 

In [28]:
# Remove emojis
df['clean_post'] = df['clean_post'].apply(nfx.remove_emojis)

df[['reddit_post', 'clean_post']].sample(5)

Unnamed: 0,reddit_post,clean_post
22341,"I'm getting kind of scared. Apparently my depression is worse than I thought. Figured I'd post here too, you guys likely understand long-term suicidal ideation.\n\nI've had dysthymia since middle school. Passive suicidal ideation as well. Somehow managed to do well in high school and college, despite all of that and the strong urges to self harm. I had a lot of potential but kind of threw it a...",im getting kind of scared apparently my depression is worse than i thought figured id post here too you guys likely understand longterm suicidal ideation ive had dysthymia since middle school passive suicidal ideation as well somehow managed to do well in high school and college despite all of that and the strong urges to self harm i had a lot of potential but kind of threw it away im not a st...
1103,"Started the New Year with a bang. Hey, I'm new here, 22F. \nLast night started out simple enough. Having a few drinks with my on and off again boyfriend of two years, then his dad and sister wanted to go out to a bar, which I didn't really want to do, but I wanted them to spend time together. At this point I had had 3 Moscow mules...and had three double rum and cokes at the bar. \n\nFelt awkwa...",started the new year with a bang hey im new here 22f last night started out simple enough having a few drinks with my on and off again boyfriend of two years then his dad and sister wanted to go out to a bar which i didnt really want to do but i wanted them to spend time together at this point i had had 3 moscow mulesand had three double rum and cokes at the bar felt awkwardim sitting in the b...
16527,i want to fucking kill myself (surprise surprise) yeah im sure seeing a post like this in this subreddit is reaaally unusual but basically i wanna fucking die lol im so fucking stupid and irresponsible and fat and gross and useless and i just fucking ruin everything and i hate myself so much i am so fucking disgusting and i fuck everything up and can someone just please do the world a favor an...,i want to fucking kill myself surprise surprise yeah im sure seeing a post like this in this subreddit is reaaally unusual but basically i wanna fucking die lol im so fucking stupid and irresponsible and fat and gross and useless and i just fucking ruin everything and i hate myself so much i am so fucking disgusting and i fuck everything up and can someone just please do the world a favor and ...
6176,"How to not feel awkward when talking to someone? I've noticed that the majority of my SA comes from basically whether I feel 'inferior' to someone or not. If I don't, then I have no trouble holding a conversation with someone. In my mind I know that I'm equal to everyone, but it's so hard to get my brain to actually realise that. How do I change this?",how to not feel awkward when talking to someone ive noticed that the majority of my sa comes from basically whether i feel inferior to someone or not if i dont then i have no trouble holding a conversation with someone in my mind i know that im equal to everyone but its so hard to get my brain to actually realise that how do i change this
20872,Impossible to live without the sensation of love Living in essentially isolation is a lot less enjoyable than I tried to convince myself it would be. It's impossible to go through life without ever knowing the feeling of being loved and being touched by another human. I'm finished with coming back home and being left in my head for hours with the same feelings every night. Stopping myself to s...,impossible to live without the sensation of love living in essentially isolation is a lot less enjoyable than i tried to convince myself it would be its impossible to go through life without ever knowing the feeling of being loved and being touched by another human im finished with coming back home and being left in my head for hours with the same feelings every night stopping myself to save m...


### <b>3.1.7 <span style='color:#F1A424'>|</span> Contractions</b> 

In [29]:
pip install contractions

Note: you may need to restart the kernel to use updated packages.


In [30]:
import contractions

# Apply the contractions.fix function to the clean_post column
df['clean_post'] = df['clean_post'].apply(contractions.fix)

df[['reddit_post', 'clean_post']].head()

Unnamed: 0,reddit_post,clean_post
0,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",i do not know if it was the emotional neglect the psychological abuse medical abuse bullying the csa whatever i am a mess right now i feel like a horrible monster that somewhat a lot of people see as attractive but under the facade i am still a monster as if i was someone who was built for being unlovable and despised physically and emotionally since i was born i keep working and working but i...
1,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",see title i used to be the person full of hobbies biking drawing reading writing walking gaming who really disliked people who never knew what to do with their free time and would be clingy now i am one of them through years of hard depression and sucdalty thanks to cptsd i have stopped all my hobbies i entrench myself in work and by now also meeting people and sometimes obligatory projects li...
2,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",i was doing yoga for years as a tool to help me back into my body when i was feeling rough as a form of reconnection i even went as far as becoming trained in teaching doing a 200hr training as my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me in retrospect i wonder if i was in fact being dissociated the whole time started again recen...
3,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,the child me thought i made the right choice by listening to him and he said as much that i had finally done something right especially when they kept blaming me for all the things i did wrong anyone else
4,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women what is the real situation of misogyny patriarchy sexual abuse and harassment in your country


### <b>3.1.8 <span style='color:#F1A424'>|</span> Stopwords</b> 

In [31]:
# Extract stopwords
df['clean_post'].apply(lambda x: nt.TextExtractor(x).extract_stopwords())

0                                                                                                                                                               [i, do, not, if, it, was, the, the, the, whatever, i, am, a, now, i, a, that, a, of, see, as, but, under, the, i, am, still, a, as, if, i, was, someone, who, was, for, being, and, and, since, i, was, i, keep, and, but, i, still, do, not, for, this]
1        [see, i, used, to, be, the, full, of, who, really, who, never, what, to, do, with, their, and, would, be, now, i, am, one, of, them, through, of, and, to, i, have, all, my, i, myself, in, and, by, now, also, and, sometimes, to, an, that, me, without, any, when, i, have, and, am, alone, i, on, the, and, do, nothing, i, i, even, myself, also, myself, a, for, being, this, sometimes, i, for, the, ...
2                                                                                                       [i, was, doing, for, as, a, to, me, back, into, my, when, i, was, as, a, of, i

In [32]:
# Remove the stop words

df['clean_post'] = df['clean_post'].apply(nfx.remove_stopwords)

df[['reddit_post', 'clean_post']].head()

Unnamed: 0,reddit_post,clean_post
0,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world
1,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...
2,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",yoga years tool help body feeling rough form reconnection went far trained teaching 200hr training trauma symptoms peaked yoga actually start reverse effect dissociate retrospect wonder fact dissociated time started recently bad damn disconnect hard exercise tips
3,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,child thought right choice listening said finally right especially kept blaming things wrong
4,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women real situation misogyny patriarchy sexual abuse harassment country


In [33]:
# Noise Scan after cleaning text
df['clean_post'].apply(lambda x: nt.TextFrame(x).noise_scan()['text_noise'])

0        0
1        0
2        0
3        0
4        0
        ..
27673    0
27674    0
27675    0
27676    0
27677    0
Name: clean_post, Length: 27666, dtype: int64

## <b>3.2 <span style='color:#F1A424'>|</span> Linguistic Processing (Clean Text)</b> 

+ Tokenization
+ Stemming / Lemmatization
+ Parts of Speech Tagging
+ Calculating Sentiment Based on Polarity & Subjectivity

### <b>3.2.1 <span style='color:#F1A424'>|</span> Tokenization</b> 

In [34]:
test_sample = df['clean_post'].loc[12827]

test_sample

'close life think wrong lot pain losing girlfriend realizing real friends family uncaring suicidal broke suicidal pain great find alleviate pain try feel better working tell moment'

In [35]:
from nltk.tokenize import RegexpTokenizer

basic_token_pattern = r"(?u)\b\w\w+\b"

tokenizer = RegexpTokenizer(basic_token_pattern)

tokenizer.tokenize(test_sample)

['close',
 'life',
 'think',
 'wrong',
 'lot',
 'pain',
 'losing',
 'girlfriend',
 'realizing',
 'real',
 'friends',
 'family',
 'uncaring',
 'suicidal',
 'broke',
 'suicidal',
 'pain',
 'great',
 'find',
 'alleviate',
 'pain',
 'try',
 'feel',
 'better',
 'working',
 'tell',
 'moment']
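The token pattern above keeps only runs of two or more word characters, so single-character tokens (for example a lone digit) are dropped. The sketch below uses the standard-library `re` module as a stand-in for `RegexpTokenizer` to show this; the `tokenize` helper and the sample string are assumptions for illustration.

```python
import re

# Same pattern as in the cell above: a token is two or more word characters.
basic_token_pattern = r"(?u)\b\w\w+\b"

def tokenize(text):
    """Return every non-overlapping match of the token pattern."""
    return re.findall(basic_token_pattern, text)

# Hypothetical sample: '2' and 'i' are single characters and are discarded
tokens = tokenize("adopting 2 years ago i feel ok")

print(tokens)
```

This is worth keeping in mind when numbers carry meaning: single digits disappear at this step, while multi-digit numbers survive.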

In [36]:
# Tokenise the clean_post column
df['preprocessed_post'] = df['clean_post'].apply(tokenizer.tokenize)

# df.iloc[100]["preprocessed_post"][:20]

In [37]:
df[['clean_post', 'preprocessed_post']].iloc[100]

clean_post                      curious immense pressure fear paranoia strings attached thought feeling word
preprocessed_post    [curious, immense, pressure, fear, paranoia, strings, attached, thought, feeling, word]
Name: 100, dtype: object

In [38]:
df

Unnamed: 0,Subreddit,reddit_post,clean_post,urls,preprocessed_post
0,CPTSD,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world,[],"[know, emotional, neglect, psychological, abuse, medical, abuse, bullying, csa, mess, right, feel, like, horrible, monster, somewhat, lot, people, attractive, facade, monster, built, unlovable, despised, physically, emotionally, born, working, working, feel, fit, world]"
1,CPTSD,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...,[],"[title, person, hobbies, biking, drawing, reading, writing, walking, gaming, disliked, people, knew, free, time, clingy, years, hard, depression, sucdalty, thanks, cptsd, stopped, hobbies, entrench, work, meeting, people, obligatory, projects, like, drivers, license, extent, leaves, free, time, free, time, lie, couch, think, pity, hate, lot, way, watch, netflix, hours, doom, scroll, reddit, wa..."
2,CPTSD,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",yoga years tool help body feeling rough form reconnection went far trained teaching 200hr training trauma symptoms peaked yoga actually start reverse effect dissociate retrospect wonder fact dissociated time started recently bad damn disconnect hard exercise tips,[],"[yoga, years, tool, help, body, feeling, rough, form, reconnection, went, far, trained, teaching, 200hr, training, trauma, symptoms, peaked, yoga, actually, start, reverse, effect, dissociate, retrospect, wonder, fact, dissociated, time, started, recently, bad, damn, disconnect, hard, exercise, tips]"
3,CPTSD,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,child thought right choice listening said finally right especially kept blaming things wrong,[],"[child, thought, right, choice, listening, said, finally, right, especially, kept, blaming, things, wrong]"
4,CPTSD,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women real situation misogyny patriarchy sexual abuse harassment country,[],"[women, real, situation, misogyny, patriarchy, sexual, abuse, harassment, country]"
...,...,...,...,...,...
27673,suicidewatch,"But I want help doing it Tired of everyone saying no don't.\n\n*Get up so I can punch you again*\n\nI'm exhausted I want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered.\n\nFuck the helpline I want real help I want help out",want help tired saying punch exhausted want clock keeps getting worse keeps losing value including mattered fuck helpline want real help want help,[],"[want, help, tired, saying, punch, exhausted, want, clock, keeps, getting, worse, keeps, losing, value, including, mattered, fuck, helpline, want, real, help, want, help]"
27674,suicidewatch,Nothing to live for The ONLY reason I am alive right now is because of my sweet cat Pippin. Yesterday was the anniversary of adopting him 2 years ago. \nI've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. I got so mad yesterday and yelled at him. All I can think about now is how I should give him to someone healt...,live reason alive right sweet cat pippin yesterday anniversary adopting 2 years ago depressed able play hes meowing little naughty result got mad yesterday yelled think healthier stable away live boyfriend hes far away friends spoken mother 17 dad disowned political opinion accomplished wish job guess minimum wage qualifications experience customer service caregiving waste resources hate skin ...,[],"[live, reason, alive, right, sweet, cat, pippin, yesterday, anniversary, adopting, years, ago, depressed, able, play, hes, meowing, little, naughty, result, got, mad, yesterday, yelled, think, healthier, stable, away, live, boyfriend, hes, far, away, friends, spoken, mother, 17, dad, disowned, political, opinion, accomplished, wish, job, guess, minimum, wage, qualifications, experience, custom..."
27675,suicidewatch,Iâ€™m going to fucking kill myself 18 years too long. I think Iâ€™m going to go,going fucking kill 18 years long think going,[],"[going, fucking, kill, 18, years, long, think, going]"
27676,suicidewatch,Iâ€™m going to pieces All Iâ€™ve done for about a month has been lay in bed. I donâ€™t enjoy anything. Canâ€™t focus on anything. I am terrified of the future. I donâ€™t want to be alive. I am in so much emotional pain. The only reason I am alive is my father because I donâ€™t want to hurt him. That and I canâ€™t decide on a method. I think not wanting to hurt him keeps me from choosing a meth...,going pieces month lay bed enjoy focus terrified future want alive emotional pain reason alive father want hurt decide method think wanting hurt keeps choosing method future father longer think choice end life scared wish sleep wake hate alive,[],"[going, pieces, month, lay, bed, enjoy, focus, terrified, future, want, alive, emotional, pain, reason, alive, father, want, hurt, decide, method, think, wanting, hurt, keeps, choosing, method, future, father, longer, think, choice, end, life, scared, wish, sleep, wake, hate, alive]"


### <b>3.2.2 <span style='color:#F1A424'>|</span> Stemming / Lemmatization</b> 

In [39]:
import nltk
nltk.download('wordnet')


[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\DELL\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [40]:
# Create the lemmatiser once and reuse it for every row
lemmatizer = nltk.stem.WordNetLemmatizer()

# Define a function to lemmatise the tokens
def lemmatise_tokens(tokens):
    # lemmatize() defaults to pos='n', so every token is treated as a noun
    return [lemmatizer.lemmatize(token) for token in tokens]

# Lemmatise the tokens
df['lemma_preprocessed_post'] = df['preprocessed_post'].apply(lemmatise_tokens)

# df.iloc[100]["lemma_preprocessed_post"][:20]

In [41]:
df[['clean_post', 'lemma_preprocessed_post']].iloc[260]

clean_post                                                     mom cruel pos growing neglected invalidated hated mewhen got depressed child treated worse burden stand suicidal cut arms mom told ahead encouraged sucide mother loves child matter mad hurts mother like friends partner family wish god loving mother leastits wanted
lemma_preprocessed_post    [mom, cruel, po, growing, neglected, invalidated, hated, mewhen, got, depressed, child, treated, worse, burden, stand, suicidal, cut, arm, mom, told, ahead, encouraged, sucide, mother, love, child, matter, mad, hurt, mother, like, friend, partner, family, wish, god, loving, mother, leastits, wanted]
Name: 260, dtype: object

In [42]:
# Create the stemmer once and reuse it for every row
stemmer = nltk.stem.PorterStemmer()

# Define a function to stem the tokens
def stem_tokens(tokens):
    return [stemmer.stem(token) for token in tokens]

# Stem the tokens
df['stemma_preprocessed_post'] = df['preprocessed_post'].apply(stem_tokens)

In [43]:
df[['clean_post', 'stemma_preprocessed_post']].iloc[260]

clean_post                             mom cruel pos growing neglected invalidated hated mewhen got depressed child treated worse burden stand suicidal cut arms mom told ahead encouraged sucide mother loves child matter mad hurts mother like friends partner family wish god loving mother leastits wanted
stemma_preprocessed_post    [mom, cruel, po, grow, neglect, invalid, hate, mewhen, got, depress, child, treat, wors, burden, stand, suicid, cut, arm, mom, told, ahead, encourag, sucid, mother, love, child, matter, mad, hurt, mother, like, friend, partner, famili, wish, god, love, mother, leastit, want]
Name: 260, dtype: object
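
The two outputs for row 260 can be compared token by token to see how differently the lemmatizer and the stemmer behave. The sketch below is only an illustration: it hard-codes the first ten tokens copied from the outputs above, and `changed_tokens` is a hypothetical helper, not part of the notebook. The lemmatizer changes very little here, most likely because `WordNetLemmatizer.lemmatize` defaults to the noun part of speech, so verb forms such as `growing` or `hated` are left untouched, while the Porter stemmer strips suffixes aggressively.

```python
# First ten tokens of row 260, copied from the outputs shown above.
original = "mom cruel pos growing neglected invalidated hated mewhen got depressed".split()
lemmas = ["mom", "cruel", "po", "growing", "neglected", "invalidated",
          "hated", "mewhen", "got", "depressed"]
stems = ["mom", "cruel", "po", "grow", "neglect", "invalid",
         "hate", "mewhen", "got", "depress"]


def changed_tokens(before, after):
    """Return (before, after) pairs for positions where the token changed."""
    return [(b, a) for b, a in zip(before, after) if b != a]


lemma_changes = changed_tokens(original, lemmas)
stem_changes = changed_tokens(original, stems)

print("lemmatizer changed:", lemma_changes)
print("stemmer changed:   ", stem_changes)
```

Both normalisers turn `pos` into `po`, which is a reminder that neither step understands slang or abbreviations in the posts.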

### <b>3.2.3 <span style='color:#F1A424'>|</span> Calculating Sentiment Based on Polarity & Subjectivity</b>

TextBlob is a Python library for processing textual data, including sentiment analysis. It is built on top of the Natural Language Toolkit (NLTK). Accessing the `sentiment` property of a `TextBlob` object returns two scores: polarity and subjectivity. The polarity score is a float within the range [-1, 1], where -1 indicates a negative sentiment and 1 indicates a positive sentiment. The subjectivity score is a float within the range [0, 1], where 0 is very objective and 1 is very subjective.
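
To build intuition for these two scores, here is a deliberately simplified sketch of lexicon-based scoring: each known word carries a (polarity, subjectivity) pair and the text score is the average over the known words. This is not TextBlob's implementation (its default analyzer also handles things like negation and intensifiers), and `TOY_LEXICON`, its values, and `toy_sentiment` are invented for illustration only.

```python
# Hypothetical mini-lexicon: word -> (polarity, subjectivity).
# The values are made up for this example and do not come from TextBlob.
TOY_LEXICON = {
    "horrible": (-1.0, 1.0),
    "bad": (-0.5, 0.5),
    "better": (0.5, 0.5),
}


def toy_sentiment(text):
    """Average the scores of known words; unknown words are ignored."""
    scores = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    if not scores:
        # No known words: fall back to a neutral, objective score.
        return 0.0, 0.0
    polarity = sum(p for p, _ in scores) / len(scores)
    subjectivity = sum(s for _, s in scores) / len(scores)
    return polarity, subjectivity


print(toy_sentiment("feel horrible"))    # one strongly negative word
print(toy_sentiment("bad then better"))  # opposing words cancel out
print(toy_sentiment("yoga tool"))        # no known words
```

The last two calls show two different ways a text can end up with a polarity of exactly zero: opposing words cancelling out, or no scored words at all.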

In [44]:
%pip install textblob


Note: you may need to restart the kernel to use updated packages.


In [45]:
from textblob import TextBlob

# Create a function to get the subjectivity
def getSubjectivity(text):
    return TextBlob(text).sentiment.subjectivity

# Create a function to get the polarity
def getPolarity(text):
    return TextBlob(text).sentiment.polarity

# Create two new columns 'Subjectivity' & 'Polarity'
df['Subjectivity'] = df['clean_post'].apply(getSubjectivity)
df['Polarity'] = df['clean_post'].apply(getPolarity)

# Show the new dataframe with columns 'Subjectivity' & 'Polarity'
df[['clean_post','Subjectivity','Polarity']].head()

Unnamed: 0,clean_post,Subjectivity,Polarity
0,know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world,0.50119,0.034524
1,title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...,0.528098,0.00455
2,yoga years tool help body feeling rough form reconnection went far trained teaching 200hr training trauma symptoms peaked yoga actually start reverse effect dissociate retrospect wonder fact dissociated time started recently bad damn disconnect hard exercise tips,0.541667,-0.198333
3,child thought right choice listening said finally right especially kept blaming things wrong,0.742857,0.017857
4,women real situation misogyny patriarchy sexual abuse harassment country,0.566667,0.35


In [46]:
# Create a function to label a polarity score as negative, neutral or positive
def getAnalysis(score):
    if score < 0:
        return 'Negative'
    elif score == 0:
        return 'Neutral'
    else:
        return 'Positive'

df['sentiment'] = df['Polarity'].apply(getAnalysis)

# Show the dataframe
df[['reddit_post','clean_post','Subjectivity','Polarity','sentiment']].head()

Unnamed: 0,reddit_post,clean_post,Subjectivity,Polarity,sentiment
0,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world,0.50119,0.034524,Positive
1,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...,0.528098,0.00455,Positive
2,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",yoga years tool help body feeling rough form reconnection went far trained teaching 200hr training trauma symptoms peaked yoga actually start reverse effect dissociate retrospect wonder fact dissociated time started recently bad damn disconnect hard exercise tips,0.541667,-0.198333,Negative
3,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,child thought right choice listening said finally right especially kept blaming things wrong,0.742857,0.017857,Positive
4,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women real situation misogyny patriarchy sexual abuse harassment country,0.566667,0.35,Positive


In [47]:
df['sentiment'].value_counts()

Negative    14784
Positive    11533
Neutral      1349
Name: sentiment, dtype: int64
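
Because the labelling rule uses an exact zero cutoff, a post with a polarity of 0.00455 is labelled Positive even though its score is practically zero. One possible alternative, sketched below, treats a small band around zero as Neutral. The `neutral_band` value of 0.05 and the helper `label_polarity` are hypothetical choices for illustration, not part of the notebook, and a sensible band would have to be tuned on the data.

```python
def label_polarity(score, neutral_band=0.05):
    """Label a polarity score, treating |score| <= neutral_band as Neutral.

    neutral_band is an assumed threshold, not a value used in this notebook.
    """
    if score < -neutral_band:
        return "Negative"
    if score > neutral_band:
        return "Positive"
    return "Neutral"


# Polarity values taken from the first five rows shown earlier.
sample = [0.034524, 0.00455, -0.198333, 0.017857, 0.35]
labels = [label_polarity(p) for p in sample]
print(labels)
```

Setting `neutral_band=0` reproduces the behaviour of the original rule.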

In [49]:
df

Unnamed: 0,Subreddit,reddit_post,clean_post,urls,preprocessed_post,lemma_preprocessed_post,stemma_preprocessed_post,Subjectivity,Polarity,sentiment
0,CPTSD,"I don't know if it was the emotional neglect, the psychological abuse, medical abuse, bullying, the CSA, whatever. I'm a mess right now. I feel like a horrible monster that somewhat a lot of people see as attractive, but under the facade I'm still a monster. As if I was someone who was built for being unlovable and despised, physically and emotionally, since I was born. I keep working and wor...",know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world,[],"[know, emotional, neglect, psychological, abuse, medical, abuse, bullying, csa, mess, right, feel, like, horrible, monster, somewhat, lot, people, attractive, facade, monster, built, unlovable, despised, physically, emotionally, born, working, working, feel, fit, world]","[know, emotional, neglect, psychological, abuse, medical, abuse, bullying, csa, mess, right, feel, like, horrible, monster, somewhat, lot, people, attractive, facade, monster, built, unlovable, despised, physically, emotionally, born, working, working, feel, fit, world]","[know, emot, neglect, psycholog, abus, medic, abus, bulli, csa, mess, right, feel, like, horribl, monster, somewhat, lot, peopl, attract, facad, monster, built, unlov, despis, physic, emot, born, work, work, feel, fit, world]",0.501190,0.034524,Positive
1,CPTSD,"See title.\n\nI used to be the person full of hobbies (biking, drawing, reading, writing, walking, gaming) who really disliked people who never knew what to do with their free time and would be clingy. Now I am one of them.\n\nThrough years of hard depression and su.c.dal.ty thanks to cptsd I have stopped all my hobbies. I entrench myself in work and by now also meeting people and sometimes ob...",title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...,[],"[title, person, hobbies, biking, drawing, reading, writing, walking, gaming, disliked, people, knew, free, time, clingy, years, hard, depression, sucdalty, thanks, cptsd, stopped, hobbies, entrench, work, meeting, people, obligatory, projects, like, drivers, license, extent, leaves, free, time, free, time, lie, couch, think, pity, hate, lot, way, watch, netflix, hours, doom, scroll, reddit, wa...","[title, person, hobby, biking, drawing, reading, writing, walking, gaming, disliked, people, knew, free, time, clingy, year, hard, depression, sucdalty, thanks, cptsd, stopped, hobby, entrench, work, meeting, people, obligatory, project, like, driver, license, extent, leaf, free, time, free, time, lie, couch, think, pity, hate, lot, way, watch, netflix, hour, doom, scroll, reddit, waste, time,...","[titl, person, hobbi, bike, draw, read, write, walk, game, dislik, peopl, knew, free, time, clingi, year, hard, depress, sucdalti, thank, cptsd, stop, hobbi, entrench, work, meet, peopl, obligatori, project, like, driver, licens, extent, leav, free, time, free, time, lie, couch, think, piti, hate, lot, way, watch, netflix, hour, doom, scroll, reddit, wast, time, brows, internet, tri, sleep, lo...",0.528098,0.004550,Positive
2,CPTSD,"I was doing yoga for years as a tool to help me back into my body when I was feeling rough as a form of reconnection. I even went as far as becoming trained in teaching, doing a 200hr training. As my trauma symptoms peaked however yoga would actually start having the reverse effect and would dissociate me. (In retrospect I wonder if I was in fact being dissociated the whole time.)\n\nStarted a...",yoga years tool help body feeling rough form reconnection went far trained teaching 200hr training trauma symptoms peaked yoga actually start reverse effect dissociate retrospect wonder fact dissociated time started recently bad damn disconnect hard exercise tips,[],"[yoga, years, tool, help, body, feeling, rough, form, reconnection, went, far, trained, teaching, 200hr, training, trauma, symptoms, peaked, yoga, actually, start, reverse, effect, dissociate, retrospect, wonder, fact, dissociated, time, started, recently, bad, damn, disconnect, hard, exercise, tips]","[yoga, year, tool, help, body, feeling, rough, form, reconnection, went, far, trained, teaching, 200hr, training, trauma, symptom, peaked, yoga, actually, start, reverse, effect, dissociate, retrospect, wonder, fact, dissociated, time, started, recently, bad, damn, disconnect, hard, exercise, tip]","[yoga, year, tool, help, bodi, feel, rough, form, reconnect, went, far, train, teach, 200hr, train, trauma, symptom, peak, yoga, actual, start, revers, effect, dissoci, retrospect, wonder, fact, dissoci, time, start, recent, bad, damn, disconnect, hard, exercis, tip]",0.541667,-0.198333,Negative
3,CPTSD,The child me thought I made the right choice by listening to him. And he said as much. That I had finally done something right . Especially when they kept blaming me for all the things I did wrong . Anyone else ?,child thought right choice listening said finally right especially kept blaming things wrong,[],"[child, thought, right, choice, listening, said, finally, right, especially, kept, blaming, things, wrong]","[child, thought, right, choice, listening, said, finally, right, especially, kept, blaming, thing, wrong]","[child, thought, right, choic, listen, said, final, right, especi, kept, blame, thing, wrong]",0.742857,0.017857,Positive
4,CPTSD,"Women: What is the real situation of misogyny, patriarchy, sexual abuse and harassment in your country?",women real situation misogyny patriarchy sexual abuse harassment country,[],"[women, real, situation, misogyny, patriarchy, sexual, abuse, harassment, country]","[woman, real, situation, misogyny, patriarchy, sexual, abuse, harassment, country]","[women, real, situat, misogyni, patriarchi, sexual, abus, harass, countri]",0.566667,0.350000,Positive
...,...,...,...,...,...,...,...,...,...,...
27673,suicidewatch,"But I want help doing it Tired of everyone saying no don't.\n\n*Get up so I can punch you again*\n\nI'm exhausted I want to clock out and be done everything just keeps getting worse, everything keeps losing value including myself- as if any of that even mattered.\n\nFuck the helpline I want real help I want help out",want help tired saying punch exhausted want clock keeps getting worse keeps losing value including mattered fuck helpline want real help want help,[],"[want, help, tired, saying, punch, exhausted, want, clock, keeps, getting, worse, keeps, losing, value, including, mattered, fuck, helpline, want, real, help, want, help]","[want, help, tired, saying, punch, exhausted, want, clock, keep, getting, worse, keep, losing, value, including, mattered, fuck, helpline, want, real, help, want, help]","[want, help, tire, say, punch, exhaust, want, clock, keep, get, wors, keep, lose, valu, includ, matter, fuck, helplin, want, real, help, want, help]",0.580000,-0.280000,Negative
27674,suicidewatch,Nothing to live for The ONLY reason I am alive right now is because of my sweet cat Pippin. Yesterday was the anniversary of adopting him 2 years ago. \nI've been really depressed and haven't been able to play with him as much so hes been meowing and being a little naughty as a result. I got so mad yesterday and yelled at him. All I can think about now is how I should give him to someone healt...,live reason alive right sweet cat pippin yesterday anniversary adopting 2 years ago depressed able play hes meowing little naughty result got mad yesterday yelled think healthier stable away live boyfriend hes far away friends spoken mother 17 dad disowned political opinion accomplished wish job guess minimum wage qualifications experience customer service caregiving waste resources hate skin ...,[],"[live, reason, alive, right, sweet, cat, pippin, yesterday, anniversary, adopting, years, ago, depressed, able, play, hes, meowing, little, naughty, result, got, mad, yesterday, yelled, think, healthier, stable, away, live, boyfriend, hes, far, away, friends, spoken, mother, 17, dad, disowned, political, opinion, accomplished, wish, job, guess, minimum, wage, qualifications, experience, custom...","[live, reason, alive, right, sweet, cat, pippin, yesterday, anniversary, adopting, year, ago, depressed, able, play, he, meowing, little, naughty, result, got, mad, yesterday, yelled, think, healthier, stable, away, live, boyfriend, he, far, away, friend, spoken, mother, 17, dad, disowned, political, opinion, accomplished, wish, job, guess, minimum, wage, qualification, experience, customer, s...","[live, reason, aliv, right, sweet, cat, pippin, yesterday, anniversari, adopt, year, ago, depress, abl, play, he, meow, littl, naughti, result, got, mad, yesterday, yell, think, healthier, stabl, away, live, boyfriend, he, far, away, friend, spoken, mother, 17, dad, disown, polit, opinion, accomplish, wish, job, guess, minimum, wage, qualif, experi, custom, servic, caregiv, wast, resourc, hate...",0.574048,0.023063,Positive
27675,suicidewatch,I’m going to fucking kill myself 18 years too long. I think I’m going to go,going fucking kill 18 years long think going,[],"[going, fucking, kill, 18, years, long, think, going]","[going, fucking, kill, 18, year, long, think, going]","[go, fuck, kill, 18, year, long, think, go]",0.600000,-0.325000,Negative
27676,suicidewatch,I’m going to pieces All I’ve done for about a month has been lay in bed. I don’t enjoy anything. Can’t focus on anything. I am terrified of the future. I don’t want to be alive. I am in so much emotional pain. The only reason I am alive is my father because I don’t want to hurt him. That and I can’t decide on a method. I think not wanting to hurt him keeps me from choosing a meth...,going pieces month lay bed enjoy focus terrified future want alive emotional pain reason alive father want hurt decide method think wanting hurt keeps choosing method future father longer think choice end life scared wish sleep wake hate alive,[],"[going, pieces, month, lay, bed, enjoy, focus, terrified, future, want, alive, emotional, pain, reason, alive, father, want, hurt, decide, method, think, wanting, hurt, keeps, choosing, method, future, father, longer, think, choice, end, life, scared, wish, sleep, wake, hate, alive]","[going, piece, month, lay, bed, enjoy, focus, terrified, future, want, alive, emotional, pain, reason, alive, father, want, hurt, decide, method, think, wanting, hurt, keep, choosing, method, future, father, longer, think, choice, end, life, scared, wish, sleep, wake, hate, alive]","[go, piec, month, lay, bed, enjoy, focu, terrifi, futur, want, aliv, emot, pain, reason, aliv, father, want, hurt, decid, method, think, want, hurt, keep, choos, method, futur, father, longer, think, choic, end, life, scare, wish, sleep, wake, hate, aliv]",0.437500,-0.012500,Negative


In [50]:
df['preprocessed_post']

0                                                                                                                                         [know, emotional, neglect, psychological, abuse, medical, abuse, bullying, csa, mess, right, feel, like, horrible, monster, somewhat, lot, people, attractive, facade, monster, built, unlovable, despised, physically, emotionally, born, working, working, feel, fit, world]
1        [title, person, hobbies, biking, drawing, reading, writing, walking, gaming, disliked, people, knew, free, time, clingy, years, hard, depression, sucdalty, thanks, cptsd, stopped, hobbies, entrench, work, meeting, people, obligatory, projects, like, drivers, license, extent, leaves, free, time, free, time, lie, couch, think, pity, hate, lot, way, watch, netflix, hours, doom, scroll, reddit, wa...
2                                                                                                          [yoga, years, tool, help, body, feeling, rough, form, reconnection, went, f

In [51]:
df['lemma_preprocessed_post'] = df['lemma_preprocessed_post'].apply(lambda x: ' '.join(x))

In [52]:
df['stemma_preprocessed_post'] = df['stemma_preprocessed_post'].apply(lambda x: ' '.join(x))

df['preprocessed_post'] = df['preprocessed_post'].apply(lambda x: ' '.join(x))

In [53]:
df['preprocessed_post']

0                                                                                                                                                                          know emotional neglect psychological abuse medical abuse bullying csa mess right feel like horrible monster somewhat lot people attractive facade monster built unlovable despised physically emotionally born working working feel fit world
1        title person hobbies biking drawing reading writing walking gaming disliked people knew free time clingy years hard depression sucdalty thanks cptsd stopped hobbies entrench work meeting people obligatory projects like drivers license extent leaves free time free time lie couch think pity hate lot way watch netflix hours doom scroll reddit waste time browsing internet try sleep lot better inte...
2                                                                                                                                                yoga years tool help body feeling rou

In [54]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 27666 entries, 0 to 27677
Data columns (total 10 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Subreddit                 27666 non-null  object 
 1   reddit_post               27666 non-null  object 
 2   clean_post                27666 non-null  object 
 3   urls                      27666 non-null  object 
 4   preprocessed_post         27666 non-null  object 
 5   lemma_preprocessed_post   27666 non-null  object 
 6   stemma_preprocessed_post  27666 non-null  object 
 7   Subjectivity              27666 non-null  float64
 8   Polarity                  27666 non-null  float64
 9   sentiment                 27666 non-null  object 
dtypes: float64(2), object(8)
memory usage: 3.6+ MB


In [55]:
# Save the dataframe to CSV using the name 'interim_data.csv' for the data folder
df.to_csv('interim_data.csv', index=False)
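
One plausible reason for joining the token lists into strings before saving is that CSV stores only text: a Python list written to a CSV cell comes back as the string form of the list, not as a list. The sketch below illustrates this with the standard `csv` module on an in-memory buffer rather than with the notebook's dataframe, so the variable names are illustrative only. The same caveat would apply to any column that still holds lists at save time (`urls` appears to be one).

```python
import csv
import io

tokens = ["yoga", "years", "tool"]

# Write one row holding both representations to an in-memory CSV.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([str(tokens), " ".join(tokens)])

# Read the row back: every field is now a plain string.
buffer.seek(0)
as_list_repr, as_joined = next(csv.reader(buffer))

print(repr(as_list_repr))  # the text form of the list, not a list
print(as_joined.split())   # splitting on spaces recovers the tokens
```

A space-joined string can be turned back into tokens with a simple `split()`, whereas the list's text form would need to be parsed first.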