###**Importing Libraries**

In [1]:
#!pip install langdetect
#!pip install googletrans==4.0.0-rc1

In [2]:
import pandas as pd
import numpy as np
# import scikitplot as skplt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

import warnings
warnings.filterwarnings('ignore')

###**Import your dataset**

In [3]:
#reading the .xlsx dataset in to the dataframe
df = pd.read_excel("AIDA-Dataset.xlsx")

In [4]:
df.shape

(520, 107)

###**Data preprossesing**

In [5]:
short_names_mapping = {
    'Do you want to participate in this survey?': 'Participation',
    'Please state your age? (e.g. 25)': 'Age',
    'What is your gender?': 'Gender',
    'What degree are you currently pursuing?': 'Degree',
    'What is your current academic year? (e.g. 2)': 'AcademicYear',
    'Which medium of school have you come from?': 'SchoolMedium',
    'Which department do you study in?': 'Department',
    'What is your current cumulative grade points average (CGPA)?': 'CGPA',
    'Do you have any personal history of depression?': 'DepressionHistory',
    'What is your current relationship status?': 'RelationshipStatus',
    'What is your religion?': 'Religion',
    'What is your current place of residence?': 'Residence',
    'Currently I live - ': 'CurrentLocation',
    'Number of siblings (e.g. 2)': 'SiblingsCount',
    'What is your ordinal position/birth order among your siblings? (e.g. 1)': 'SiblingOrder',
    ' [I feel that I have additional responsibility to be in this position.]': 'ResponsibilityFeeling',
    'What is your father’s occupation? (If you do not see the occupation listed below or in case of a deceased father please choose the "other" option.)': 'FatherOccupation',
    'What is your mother’s occupation? (If you do not see the occupation listed below or in case of a deceased mother please choose the "other" option.)': 'MotherOccupation',
    'What is your family’s (household) average monthly income? (e.g. 2000)': 'FamilyIncome',
    'Do you have any financial hardship currently?': 'FinancialHardship',
    'Is your family financially dependent on you?': 'FinancialDependency',
    'Do you have anyone in your family who has been suffering from fatal diseases?': 'FamilyIllness',
    ' [How often do you communicate with your parents?]': 'ParentalCommunicationFrequency',
    'Give your opinion about these: [My parents are very strict about my daily life style]': 'ParentalStrictness',
    'Give your opinion about these: [My parents give pressure for my academic subject selection]': 'ParentalAcademicPressure',
    'Give your opinion about these: [My parents give pressure for my marriage.]': 'ParentalMarriagePressure',
    'Give your opinion about these: [My parents give pressure for my career choice.]': 'ParentalCareerPressure',
    'Give your opinion about these: [My parents give pressure regarding my decision on studying abroad]': 'ParentalStudyAbroadPressure',
    'Give your opinion about these statements [I have bad understanding with my parents]': 'BadParentalUnderstanding',
    'Give your opinion about these statements [I have bad bonding with my siblings]': 'BadSiblingBonding',
    'Give your opinion about these statements [My  parents have unstable relationship]': 'UnstableParentalRelationshipStability',
    'Give your opinion about these statements [I don’t get along with most of my classmates/friends]': 'BadPeerRelationship',
    'Give your opinion about these statements [My university teachers are unsupportive]': 'UnsupportiveTeacherSupport',
    'Give your opinion about these statements [An unstable relationship with my partner affects my life.]': 'PartnerRelationshipImpact',
    'Give your opinion about these statements [A discouraging working environment (colleague/boss) affects the output of my work.]': 'WorkEnvironmentImpact',
    ' [I feel society interferes with my freedom of expression (freedom of speech / movement / etc...)]': 'SocietalInterference',
    'Have you ever experienced this sort of violence before? [I have faced Physical Violence before. (someone uses his/her body part or an object to control my actions) (শারিরিকনির্যাতন)]': 'PhysicalViolenceExperience',
    'Have you ever experienced this sort of violence before? [I have faced Sexual Violence before. (i am/was forced to unwillingly take part in sexual activity) (যৌনসহিংসতা)]': 'SexualViolenceExperience',
    'Have you ever experienced this sort of violence before? [I have faced Verbal Violence before. (someone uses language, whether spoken or written as an abusing tool against me) (মৌখিকগালি)]': 'VerbalViolenceExperience',
    'Have you ever experienced this sort of violence before? [I have faced Emotional Violence before. (someone says or does something to make me feel stupid or worthless) (মানসিকহিংসা)]': 'EmotionalViolenceExperience',
    'What kind of institution do you study in?': 'InstitutionType',
    'Self-perception on Academic status: [My academic performance is very poor.]': 'PoorAcademicPerformanceSelfPerception',
    'Self-perception on Academic status: [I have been criticized for my academic result.]': 'AcademicCriticismSelfPerception',
    'Self-perception on Academic status: [I am unsatisfied with my academic workload (i.e., presentations, assignments, tutorials)]': 'UnsatisfiedAcademicWorkloadSelfPerception',
    'Give your opinion for these: [I am not studying in my subject of interest.]': 'NonInterestSubjectOpinion',
    'Give your opinion for these: [I am unhappy with the subject that I am studying.]': 'UnhappySubjectOpinion',
    'Give your opinion for these: [I am not studying in the institution of my interest.]': 'NonInterestInstitutionOpinion',
    'Give your opinion for these: [I am unhappy with the institution that I am studying.]': 'UnhappyInstitutionOpinion',
    'Did you select the subject due to career prospects only?': 'CareerProspectsSubjectSelection',
    'What is your current employment status?': 'EmploymentStatus',
    'Which of the following is true for you?': 'PersonalSituation',
    'How the pandemic is affecting your full time /part time job/tutoring?': 'PandemicImpactOnWork',
    'I am satisfied with my current job.': 'JobSatisfaction',
    'My family/partner is satisfied with my current job.': 'FamilyPartnerJobSatisfaction',
    ' [In my opinion, enough jobs are not available related to my pursued subject. (your academic subject )]': 'InsufficientJobsSubjectOpinion',
    'How many close friends (approximately) do you have? (e.g. 3)': 'CloseFriendsCount',
    ' [How often do you hang out with your friends?]': 'FriendHangoutFrequency',
    'Are you involved in any extracurricular activities?': 'ExtracurricularInvolvement',
    'How strongly do you agree with each of the following questions: [How often do you smoke?]': 'SmokingFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you take drugs?]': 'DrugUseFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you drink alcohol?]': 'AlcoholConsumptionFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you engage in physical exercise?]': 'PhysicalExerciseFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you practice religious activities?]': 'ReligiousActivitiesFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you play video games?]': 'VideoGamePlayingFrequencyPerception',
    'How strongly do you agree with each of the following questions: [How often do you play sports?]': 'SportsPlayingFrequencyPerception',
    'On average how many hours do you spend in social media? (e.g. 5)': 'SocialMediaUsageHours',
    'Give your opinion on these: [The success stories of my peers in social media sometimes pose negative impact on me]': 'PeerSuccessImpactOnSocialMediaOpinion',
    'Give your opinion on these: [Social media has negative influence on my purchase decision]': 'SocialMediaInfluenceOnPurchaseDecisionOpinion',
    "Give your opinion on these: [I find human bonding less trust-worthy because it's easy to become imposters on social media]": 'TrustInHumanBondingOnSocialMediaOpinion',
    'Give your opinion on these: [I feel numbness of feelings as I am constantly seeing variety of incidents on social media.]': 'NumbnessFeelingsDueToSocialMediaOpinion',
    'Give your opinion on these: [I like to interact with my friends in person rather than in virtual mode.]': 'PreferenceForInPersonInteractionOpinion',
    'Sad': 'Sadness',
    'Depressed mood (sadness, hopeless, helpless, worthless)': 'DepressedMood',
    'Future': 'FutureOutlook',
    'Failure': 'FearOfFailure',
    'Satisfaction': 'SatisfactionLevel',
    'Guilty ': 'FeelingsOfGuilt',
    'Feelings of guiltiness ': 'GuiltFeelings',
    'Punished': 'FeelingsOfPunishment',
    'Disappointment ': 'FeelingsOfDisappointment',
    'Blame on me': 'SelfBlame',
    'Cry ': 'TendencyToCry',
    'Irritated': 'Irritability',
    'Interest in other people': 'InterestInOthers',
    'Decisions': 'DecisionMaking',
    'Look': 'AppearanceConcern',
    'Self-Care (This question is pertaining to what you feeling right now)': 'SelfCarePerception',
    'Energy Level:': 'EnergyLevel',
    'Work and Activities': 'WorkAndActivitiesLevel',
    'Sleep ': 'SleepQuality',
    'Sleeping Too Much:': 'ExcessiveSleeping',
    'Tired': 'Fatigue',
    'Appetite': 'AppetiteLevel',
    'a)\tAccording to you:': 'SubjectiveHealthPerception',
    'b)\tAccording to weekly measurements:': 'ObjectiveHealthPerception',
    'Feelings of suicide': 'SuicidalFeelings',
    'Retardation (প্রতিবন্ধকতা) (slowness of thought and speech, impaired ability to concentrate, decreased motor activity)/ (চিন্তা , কথা , মনোযোগ , মস্তিষ্কের কর্মক্ষমতা হ্রাস)': 'CognitiveRetardation',
    'Agitation (feeling of irritability or severe restlessness) (উত্কণ্ঠা)': 'Agitation',
    'Anxiety Psychic (মানসিক দুশ্চিন্তা)': 'PsychicAnxiety',
    ' [I felt that I was rather touchy/oversensitive over the past week]': 'SensitivityPerception',
    ' [I was aware of the action of my heart in the absence of physical exertion (e.g. sense of heart rate increase, heart missing a beat)]': 'HeartAwarenessPerception',
    'ANXIETY SOMATIC (দৈহিক) (physiological concomitants of anxiety) such as\ngastro-intestinal – dry mouth, wind, indigestion (বদ হজম), diarrhea, cramps ( পেটে ব্যাথা), belching( বমি)\ncardio-vascular – palpitations ( বুক ধড়ফড়) , headaches\nrespiratory – hyperventilation ( জোরে জোরে শ্বাস নেয়া), sighing urinary frequency sweating': 'SomaticAnxietySymptoms',
    'Somatic Symptoms Gastro-Intestinal': 'GastrointestinalSymptoms',
    'General Somatic Symptoms': 'GeneralSomaticSymptoms',
    'Genital Symptoms (symptoms such as loss of libido, menstrual disturbances)': 'GenitalSymptoms',
    'Hypochondriasis (The belief and fear of serious illness) / ( মরণব্যাধি আছে এমন ভ্রান্ত ধারনা )': 'Hypochondriasis',
    'Insight (self-perception)': 'SelfPerception',
}
df.rename(columns=short_names_mapping, inplace=True)


###**Feature Engineering**

In [6]:
df.columns.tolist()

['Participation',
 'Age',
 'Gender',
 'Degree',
 'AcademicYear',
 'SchoolMedium',
 'Department',
 'CGPA',
 'DepressionHistory',
 'RelationshipStatus',
 'Religion',
 'Residence',
 'CurrentLocation',
 'SiblingsCount',
 'SiblingOrder',
 'ResponsibilityFeeling',
 'FatherOccupation',
 'MotherOccupation',
 'FamilyIncome',
 'FinancialHardship',
 'FinancialDependency',
 'FamilyIllness',
 'ParentalCommunicationFrequency',
 'ParentalStrictness',
 'ParentalAcademicPressure',
 'ParentalMarriagePressure',
 'ParentalCareerPressure',
 'ParentalStudyAbroadPressure',
 'BadParentalUnderstanding',
 'BadSiblingBonding',
 'UnstableParentalRelationshipStability',
 'BadPeerRelationship',
 'UnsupportiveTeacherSupport',
 'PartnerRelationshipImpact',
 'WorkEnvironmentImpact',
 'SocietalInterference',
 'PhysicalViolenceExperience',
 'SexualViolenceExperience',
 'VerbalViolenceExperience',
 'EmotionalViolenceExperience',
 'InstitutionType',
 'PoorAcademicPerformanceSelfPerception',
 'AcademicCriticismSelfPercepti

In [7]:
df['Residence'].head()

0    Urban
1    Urban
2    Urban
3    Urban
4    Urban
Name: Residence, dtype: object

In [8]:
# demographic_feature=['Age',
#  'Gender',
#  'Degree',
#  'SchoolMedium',
#  'CGPA']

In [9]:
# df_demography = df[demographic_feature]
# df_demography.head()

In [10]:
df.columns.tolist()

['Participation',
 'Age',
 'Gender',
 'Degree',
 'AcademicYear',
 'SchoolMedium',
 'Department',
 'CGPA',
 'DepressionHistory',
 'RelationshipStatus',
 'Religion',
 'Residence',
 'CurrentLocation',
 'SiblingsCount',
 'SiblingOrder',
 'ResponsibilityFeeling',
 'FatherOccupation',
 'MotherOccupation',
 'FamilyIncome',
 'FinancialHardship',
 'FinancialDependency',
 'FamilyIllness',
 'ParentalCommunicationFrequency',
 'ParentalStrictness',
 'ParentalAcademicPressure',
 'ParentalMarriagePressure',
 'ParentalCareerPressure',
 'ParentalStudyAbroadPressure',
 'BadParentalUnderstanding',
 'BadSiblingBonding',
 'UnstableParentalRelationshipStability',
 'BadPeerRelationship',
 'UnsupportiveTeacherSupport',
 'PartnerRelationshipImpact',
 'WorkEnvironmentImpact',
 'SocietalInterference',
 'PhysicalViolenceExperience',
 'SexualViolenceExperience',
 'VerbalViolenceExperience',
 'EmotionalViolenceExperience',
 'InstitutionType',
 'PoorAcademicPerformanceSelfPerception',
 'AcademicCriticismSelfPercepti

In [11]:
df.head()

Unnamed: 0,Participation,Age,Gender,Degree,AcademicYear,SchoolMedium,Department,CGPA,DepressionHistory,RelationshipStatus,...,Agitation,PsychicAnxiety,SensitivityPerception,HeartAwarenessPerception,SomaticAnxietySymptoms,GastrointestinalSymptoms,GeneralSomaticSymptoms,GenitalSymptoms,Hypochondriasis,SelfPerception
0,Yes,23,Female,Undergraduate,4,Bangla Medium,Mathematical/Engineering & Physical Sciences,3.38,No,Single,...,Fidgetiness/ (অস্থিরতা),Subjective(depending on patient/subject) tensi...,some of the time,not at all,Absent.,None.,None.,Absent.,Not present.,Denies being ill at all.
1,Yes,24,Female,Postgraduate,1,English Medium,Business & Economics Studies,3.08,Yes,Single,...,None.,Worrying about minor matters. (ছোটখাট বিষয় নিয়...,not at all,not at all,Mild.,None.,None.,Absent.,Not present.,Denies being ill at all.
2,Yes,24,Male,Undergraduate,4,Bangla Medium,Mathematical/Engineering & Physical Sciences,3.05,Yes,Prefer not to say,...,Fidgetiness/ (অস্থিরতা),Subjective(depending on patient/subject) tensi...,some of the time,a good part of time,Absent.,None.,"Heaviness in limbs, back or head. Backaches, h...",Absent.,Not present.,Acknowledges being depressed and ill.
3,Yes,22,Male,Postgraduate,1,Bangla Medium,Mathematical/Engineering & Physical Sciences,3.82,Yes,Single,...,None.,Worrying about minor matters. (ছোটখাট বিষয় নিয়...,a good part of time,some of the time,Mild.,None.,None.,Absent.,Not present.,Denies being ill at all.
4,Yes,25,Male,Undergraduate,4,Bangla Medium,Business & Economics Studies,3.4,Yes,Single,...,"Moving about, can’t sit still.",Subjective(depending on patient/subject) tensi...,some of the time,some of the time,Mild.,Loss of appetite but eating without staff enco...,"Heaviness in limbs, back or head. Backaches, h...",Severe.,Preoccupation (আচ্ছন্ন) with health,Acknowledges being depressed and ill.


In [12]:
df['FamilyIncome'] = pd.to_numeric(df['FamilyIncome'], errors='coerce')

In [13]:
print(df['FamilyIncome'].isnull().sum())

0


In [14]:
df2 = pd.DataFrame()

In [15]:
df2.head()

In [16]:
#Grouping of features


demographics = ['Age', 'Gender', 'Religion', 'Residence']

academic = ['Degree','AcademicYear','SchoolMedium', 'Department', 'CGPA', 'TeacherSupport', 'InstitutionType',
            'PoorAcademicPerformanceSelfPerception','AcademicCriticismSelfPerception',
            'UnsatisfiedAcademicWorkloadSelfPerception','NonInterestSubjectOpinion','UnhappySubjectOpinion',
            'NonInterestInstitutionOpinion','UnhappyInstitutionOpinion',]


lifestyle = ['RelationshipStatus','CurrentLocation',  'CloseFriendsCount',
             'FriendHangoutFrequency','ExtracurricularInvolvement', 'SmokingFrequencyPerception',
 'DrugUseFrequencyPerception','AlcoholConsumptionFrequencyPerception','PhysicalExerciseFrequencyPerception',
             'ReligiousActivitiesFrequencyPerception','VideoGamePlayingFrequencyPerception',
             'SportsPlayingFrequencyPerception','SocialMediaUsageHours','PeerSuccessImpactOnSocialMediaOpinion',
 'SocialMediaInfluenceOnPurchaseDecisionOpinion','TrustInHumanBondingOnSocialMediaOpinion',]


work = [ 'CareerProspectsSubjectSelection', 'EmploymentStatus', 'PersonalSituation', 'PandemicImpactOnWork',
 'JobSatisfaction','FamilyPartnerJobSatisfaction','InsufficientJobsSubjectOpinion']

health = ['DepressionHistory',  'Sadness','DepressedMood',  'SensitivityPerception',
 'HeartAwarenessPerception','SomaticAnxietySymptoms','GastrointestinalSymptoms','GeneralSomaticSymptoms',
 'GenitalSymptoms','Hypochondriasis']

energy = [ 'EnergyLevel','WorkAndActivitiesLevel','SleepQuality','ExcessiveSleeping','Fatigue',
 'AppetiteLevel','SubjectiveHealthPerception','ObjectiveHealthPerception', 'Agitation']

emotions = ['FutureOutlook','FearOfFailure','SatisfactionLevel','FeelingsOfGuilt','GuiltFeelings',
 'FeelingsOfPunishment','FeelingsOfDisappointment','SelfBlame','TendencyToCry','Irritability',
 'InterestInOthers','DecisionMaking','AppearanceConcern','SelfCarePerception', 'SuicidalFeelings',
 'CognitiveRetardation','PsychicAnxiety', 'SelfPerception', 'ResponsibilityFeeling']

family = ['SiblingsCount','SiblingOrder',  'FatherOccupation', 'MotherOccupation', 'FamilyIncome', 'FinancialHardship',
          'FinancialDependency', 'FamilyIllness', 'ParentalCommunicationFrequency', 'ParentalStrictness',
         'ParentalAcademicPressure', 'ParentalMarriagePressure', 'ParentalCareerPressure',
          'ParentalStudyAbroadPressure', 'ParentalUnderstanding', 'SiblingBonding','ParentalRelationshipStability'
         ,'PartnerRelationshipImpact', ]

social = ['PeerRelationship', 'WorkEnvironmentImpact', 'SocietalInterference',  'NumbnessFeelingsDueToSocialMediaOpinion',
 'PreferenceForInPersonInteractionOpinion']

violence = ['PhysicalViolenceExperience', 'SexualViolenceExperience', 'VerbalViolenceExperience', 'EmotionalViolenceExperience']

In [17]:

Academic_features = ['AcademicYear',
 'CGPA', 'PoorAcademicPerformanceSelfPerception', 'UnhappySubjectOpinion','CareerProspectsSubjectSelection']

Social_features = ['BadParentalUnderstanding', 'BadSiblingBonding', 'UnstableParentalRelationshipStability',
                   'BadPeerRelationship', 'UnsupportiveTeacherSupport', 'SocietalInterference'
                   , 'CloseFriendsCount', 'FriendHangoutFrequency', 'InterestInOthers',
                   'PhysicalViolenceExperience', 'SexualViolenceExperience', 'VerbalViolenceExperience',
                  'EmotionalViolenceExperience']

#extra for test -> hoping to increase accuracy

health = ['DepressionHistory','DepressedMood',  'SensitivityPerception',
 'HeartAwarenessPerception','SomaticAnxietySymptoms','GastrointestinalSymptoms','GeneralSomaticSymptoms',
 'GenitalSymptoms','Hypochondriasis'] #0.70 accuracy, no improvement

general = ['Age', 'FamilyIncome', 'CGPA', 'AcademicYear', 'CloseFriendsCount', 'DrugUseFrequencyPerception',
         'PhysicalViolenceExperience'] #general + phq-9 = 93% accuracy BERT


Pressure_factors = ['ParentalStrictness',
 'ParentalAcademicPressure',
 'ParentalMarriagePressure',
 'ParentalCareerPressure',
 'ParentalStudyAbroadPressure']

In [18]:
#Pressure_factors

def create_pressure_description(row):

    pressure = (
        f"I  {row.get('ParentalStrictness', 'not')} that my Parents are Strict. "
        f"I  {row.get('ParentalAcademicPressure', 'not')} that I experienced Academic Pressure from my parents, "
        f" and I  {row.get('ParentalMarriagePressure', 'not')} that I experienced Marriage Pressure from my parents. "
        f"I {row.get('ParentalCareerPressure', 'not')} that I experienced Career Pressure from my parents."
        f"I {row.get('ParentalStudyAbroadPressure', 'not')} that I experienced Pressure for studying abroad from my parents."
    )

    return pressure


In [19]:
#Academic



def create_academic_description(row):

    # Academic and career prospects
    academic_career = (
        f"I am currently in my {row.get('AcademicYear', 'unknown')} year with a CGPA of {row.get('CGPA', 'not available')}, "
        f"and I {row.get('PoorAcademicPerformanceSelfPerception', 'satisfied')} about my academic performance being poor. "
        f"I {row.get('UnhappySubjectOpinion', 'neutral')} about being unhappy with my choice of subject, "
        f"and I {row.get('CareerProspectsSubjectSelection', 'uncertain')} about choosing this subject due to career prospects."
    )

    # Combining all elements
    return academic_career



In [20]:
#social Violence

# social_violence_factors = ['PhysicalViolenceExperience', 'SexualViolenceExperience', 'VerbalViolenceExperience',
# 'EmotionalViolenceExperience']


#Violence experiences

def create_violence_description(row):

    violence = (
        f"I  {row.get('PhysicalViolenceExperience', 'not')} that I experienced physical violence, "
        f"and I  {row.get('SexualViolenceExperience', 'not')} that I experienced sexual violence. "
        f"Also I  {row.get('VerbalViolenceExperience', 'not')} that I experienced verbal violence, "
        f"and I {row.get('EmotionalViolenceExperience', 'not')} that I experienced emotional violence."
    )

    return violence



In [21]:
#social relationship

# social_relationship_factors = ['BadParentalUnderstanding', 'BadSiblingBonding', 'UnstableParentalRelationshipStability',
# 'BadPeerRelationship', 'UnsupportiveTeacherSupport', 'SocietalInterference', 'CloseFriendsCount',
# 'FriendHangoutFrequency', 'InterestInOthers']

def create_social_description(row):
    # Describing family and parental relationships
    family_relationships = (
        f" I {row.get('BadParentalUnderstanding', 'not specified')} that My understanding with my parents is bad. "
        f" I {row.get('BadSiblingBonding', 'not specified')}  that my bonding with my siblings is  bad. "
        f" I {row.get('UnstableParentalRelationshipStability', 'not specified')} that there is unstability in my parental relationship."
    )

    # Peer and societal relationships
    peer_and_societal = (
        f"I {row.get('BadPeerRelationship', 'not specified')} that my relationship with my peers is bad. "
        f" I {row.get('UnsupportiveTeacherSupport', 'not specified')} that my teachers are unsupportive. "
        f"I  {row.get('SocietalInterference', 'not specified')} that I feel social interference in my life. "
    )

    # Details about friends and social activities
    friends_details = (
        f"I have {row.get('CloseFriendsCount', 'not specified')} close friends and "
        f"I hang out with my friends {row.get('FriendHangoutFrequency', 'not specified')}. "
        f"{row.get('InterestInOthers', 'not specified')}"
    )

    # Combining all elements
    description = f"{family_relationships} {peer_and_societal} {friends_details}"


    return f"{family_relationships} {peer_and_societal} {friends_details}"




In [22]:
# PHQ9_features = ['SatisfactionLevel',
#  "Sadness",
#  'SleepQuality',
#  'Fatigue',
#  'AppetiteLevel',
#  'FearOfFailure',
#  'DecisionMaking',
#  'Agitation',
#  'SuicidalFeelings']


def create_phq_description(row):
    # Personal feelings and experiences related to PHQ9 depression screening features
    mental_health = (
        f"{row.get('SatisfactionLevel', 'not changes in my satisfaction level')}"
        f"{row.get('Sadness', 'No changes in my sadness levels')} "
        f"{row.get('SleepQuality', 'not specified')} "
        f" About my fatigue levels, {row.get('Fatigue', 'not specified')} "
        f"{row.get('AppetiteLevel', 'not specified')} "
        f"{row.get('FearOfFailure', 'not specified')} "
        f"{row.get('DecisionMaking', 'not specified')} "
        f"Due to Agitation, I experience {row.get('Agitation', 'not specified')} "
        f"About my suicidal thoughts, it is - {row.get('SuicidalFeelings', 'not specified')}"
    )

    return mental_health



In [23]:
# general = ['Age', 'FamilyIncome', 'CGPA', 'AcademicYear', 'CloseFriendsCount', 'DrugUseFrequencyPerception',
#          'DepressionHistory']


def create_general_description(row):
    # Demographics and academic details
    demographics = (
        f"I am {row.get('Age', 'not specified')} years old, studying in my {row.get('AcademicYear', 'not specified')} year with a CGPA of {row.get('CGPA', 'not specified')}. "
        f"My family's income level is {row.get('FamilyIncome', 'not specified')}. "
    )

    # Social and lifestyle information
    social_lifestyle = (
        f"I have {row.get('CloseFriendsCount', 'not specified')} close friends. "
        f"I {row.get('DrugUseFrequencyPerception', 'not specified')}. use drugs. "
    )

    # Mental health history
    mental_health_history = (
        f"I have a history of depression: {row.get('DepressionHistory', 'not specified')}."
    )

    # Combining all elements
    description = f"{demographics} {social_lifestyle} {mental_health_history}"

    return description



In [24]:

# Apply each function to the DataFrame
df2['AcademicDescription'] = df.apply(create_academic_description, axis=1)
df2['ViolenceDescription'] = df.apply(create_violence_description, axis=1)
df2['SocialDescription'] = df.apply(create_social_description, axis=1)
df2['PHQDescription'] = df.apply(create_phq_description, axis=1)
df2['GeneralDescription'] = df.apply(create_general_description, axis=1)
df2['PressureDescription'] = df.apply(create_pressure_description, axis=1)

In [25]:
# import pandas as pd

# def create_personalized_description(row):
#     # Emotions and mental state
#     mental_state = (
#         f"I often feel {row.get('Sadness', 'no specific sadness')}, "
#         f"and I have been feeling {row.get('DepressedMood', 'not depressed')}. "
#         f"My anxiety levels are often described as {row.get('PsychicAnxiety', 'stable')}, "
#         f"and my sleep quality is {row.get('SleepQuality', 'average')}. "
#         f"My appetite is {row.get('AppetiteLevel', 'normal')}, "
#         f"and I find myself {row.get('CognitiveRetardation', 'reacting normally')}. "
#         f"My satisfaction with life is {row.get('SatisfactionLevel', 'moderate')}, "
#         f"and I am usually {row.get('WorkAndActivitiesLevel', 'active')} at work or other activities."
#     )

#     # Health and well-being
#     health = f"Perceived depression history: {row['DepressionHistory']}, currently feels {row['Sadness']} and {row['DepressedMood']}. Experiences somatic anxiety {row['SomaticAnxietySymptoms']}, gastrointestinal issues {row['GastrointestinalSymptoms']}, and general somatic symptoms {row['GeneralSomaticSymptoms']}."



#     # Academic and career prospects
#     academic_career = (
#         f"I am currently in my {row.get('AcademicYear', 'unknown')} year with a CGPA of {row.get('CGPA', 'not available')}, "
#         f"and I feel {row.get('PoorAcademicPerformanceSelfPerception', 'satisfied')} about my academic performance being poor. "
#         f"I {row.get('UnhappySubjectOpinion', 'neutral')} about being unhappy with my choice of subject, "
#         f"and I {row.get('CareerProspectsSubjectSelection', 'uncertain')} about choosing this subject due to career prospects."
#     )

#     # Family and social relationships
#     family_social = (
#         f"I {row.get('BadParentalUnderstanding', 'think')} that my understanding with my parents is poor, "
#         f"and I {row.get('BadSiblingBonding', 'feel')} that my bonding with my siblings is weak. "
#         f"I {row.get('UnstableParentalRelationshipStability', 'believe')} that the relationship stability of my parents is unstable, "
#         f"and I {row.get('BadPeerRelationship', 'observe')} that my relationship with peers is problematic. "
#         f"My teachers are {row.get('UnsupportiveTeacherSupport', 'considered')} unsupportive, "
#         f"and I experience {row.get('SocietalInterference', 'significant')} societal interference."
#     )

#     #Violence experiences
#     violence = (
#         f"I  {row.get('PhysicalViolenceExperience', 'not')} that I experienced physical violence, "
#         f"I  {row.get('SexualViolenceExperience', 'not')} that I experienced sexual violence, "
#         f"I  {row.get('VerbalViolenceExperience', 'not')} that I experienced verbal violence, "
#         f"and I {row.get('EmotionalViolenceExperience', 'not')} that I experienced emotional violence."
#     )

#     # Friends and social life
#     friends_social_life = (
#         f"I have {row.get('CloseFriendsCount', 'a few')} close friends, "
#         f"and I hang out with friends {row.get('FriendHangoutFrequency', 'occasionally')}. "
#         f"My interest in others is {row.get('InterestInOthers', 'moderate')}."
#     )


# #

#     # Combine all elements
#     description = f"{mental_state} {health} {academic_career} {family_social} {violence} {friends_social_life}"

#     # Trim the description if it exceeds 400 words
#     words = description.split()
#     if len(words) > 400:
#         description = " ".join(words[:400])

#     return description

# # Assuming df is your DataFrame that you want to apply this function to
# df2['Sentences'] = df.apply(create_personalized_description, axis=1)


In [26]:
# to_be_fixed = ['PoorAcademicPerformanceSelfPerception', 'UnhappySubjectOpinion', 'CareerProspectsSubjectSelection',
#               'BadParentalUnderstanding', 'BadSiblingBonding', 'UnstableParentalRelationshipStability',
#               'BadPeerRelationship', 'UnsupportiveTeacherSupport', 'PhysicalViolenceExperience', 'SexualViolenceExperience',
#                'VerbalViolenceExperience', 'EmotionalViolenceExperience']

#---------Fixed----

In [27]:
# #Attempt n using general features and PH9


# import pandas as pd

# def create_personalized_description(row):
#     # Personal feelings and experiences
#     personal_feelings = (
#         f"I often feel a lack of interest in activities {row.get('SatisfactionLevel', 'not specified')} and "
#         f"I've been feeling down {row.get('Sadness', 'not specified')}. "
#         f"I have issues with sleeping {row.get('SleepQuality', 'not specified')} and "
#         f"often feel tired {row.get('Fatigue', 'not specified')}."
#     )

#     # Health details
#     health_details = (
#         f"My appetite is {row.get('AppetiteLevel', 'not specified')} and "
#         f"I frequently feel bad about myself {row.get('FearOfFailure', 'not specified')}. "
#         f"I have trouble concentrating {row.get('DecisionMaking', 'not specified')} and "
#         f"feel slow in my movements {row.get('Agitation', 'not specified')}."
#     )

#     # Risk behavior
#     risk_behavior = f"Thoughts of self-harm occur {row.get('SuicidalFeelings', 'not specified')}."

#     # Demographic and academic details
#     demographics = (
#         f"I am {row.get('Age', 'not specified')} years old, with a family income of {row.get('FamilyIncome', 'not specified')}. "
#         f"I maintain a CGPA of {row.get('CGPA', 'not specified')} during my {row.get('AcademicYear', 'not specified')} year in college. "
#         f"I have {row.get('CloseFriendsCount', 'not specified')} close friends."
#     )

#     # Lifestyle and perception
#     lifestyle = (
#         f"My perception of drug use frequency is {row.get('DrugUseFrequencyPerception', 'not specified')}. "
#         f"I have experienced physical violence {row.get('PhysicalViolenceExperience', 'not specified')}."
#     )

#     # Combining all elements
#     description = f"{personal_feelings} {health_details} {risk_behavior} {demographics} {lifestyle}"

#     # Ensuring the length of the tokenized sentence is within BERT's limit
#     tokens = description.split()
#     if len(tokens) > 512:
#         return " ".join(tokens[:512])
#     return description

# # Assuming df is your DataFrame that you want to apply this function to
# # Apply the function and convert to DataFrame
# df2 = df.apply(create_personalized_description, axis=1).to_frame(name='Sentences')



In [28]:
df['Residence'].head()

0    Urban
1    Urban
2    Urban
3    Urban
4    Urban
Name: Residence, dtype: object

In [29]:
df2.head()

Unnamed: 0,AcademicDescription,ViolenceDescription,SocialDescription,PHQDescription,GeneralDescription,PressureDescription
0,I am currently in my 4 year with a CGPA of 3.3...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I get as much satisfaction out of things as I ...,"I am 23 years old, studying in my 4 year with ...",I Agree that my Parents are Strict. I Agree ...
1,I am currently in my 1 year with a CGPA of 3.0...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I don't enjoy things the way I used to.I do no...,"I am 24 years old, studying in my 1 year with ...",I Disagree that my Parents are Strict. I Dis...
2,I am currently in my 4 year with a CGPA of 3.0...,I Neutral that I experienced physical violenc...,I Agree that My understanding with my parents...,I get as much satisfaction out of things as I ...,"I am 24 years old, studying in my 4 year with ...",I Neutral that my Parents are Strict. I Disa...
3,I am currently in my 1 year with a CGPA of 3.8...,"I Agree that I experienced physical violence,...",I Strongly disagree that My understanding wit...,I get as much satisfaction out of things as I ...,"I am 22 years old, studying in my 1 year with ...",I Neutral that my Parents are Strict. I Neut...
4,I am currently in my 4 year with a CGPA of 3.4...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I don't enjoy things the way I used to.I feel ...,"I am 25 years old, studying in my 4 year with ...",I Strongly disagree that my Parents are Stric...


In [30]:
print(type(df2))  # Should output <class 'pandas.core.frame.DataFrame'>


<class 'pandas.core.frame.DataFrame'>


In [31]:
#To add target variable
target_df = pd.read_excel('PHQ9_levels.xlsx')

In [32]:
target_df.head()

Unnamed: 0.1,Unnamed: 0,DepressionLevel
0,0,Minimal
1,1,Minimal
2,2,Minimal
3,3,Minimal
4,4,Moderate


In [33]:
target_df['DepressionLevel'].head()

0     Minimal
1     Minimal
2     Minimal
3     Minimal
4    Moderate
Name: DepressionLevel, dtype: object

In [34]:
df2['PHQ9_levels'] = target_df['DepressionLevel']

In [35]:
df2['PHQ9_levels'].head()

0     Minimal
1     Minimal
2     Minimal
3     Minimal
4    Moderate
Name: PHQ9_levels, dtype: object

In [36]:
df2.to_excel('PHQ9DepressionNLP_elaborated.xlsx', index=False)

In [37]:
df2.head()

Unnamed: 0,AcademicDescription,ViolenceDescription,SocialDescription,PHQDescription,GeneralDescription,PressureDescription,PHQ9_levels
0,I am currently in my 4 year with a CGPA of 3.3...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I get as much satisfaction out of things as I ...,"I am 23 years old, studying in my 4 year with ...",I Agree that my Parents are Strict. I Agree ...,Minimal
1,I am currently in my 1 year with a CGPA of 3.0...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I don't enjoy things the way I used to.I do no...,"I am 24 years old, studying in my 1 year with ...",I Disagree that my Parents are Strict. I Dis...,Minimal
2,I am currently in my 4 year with a CGPA of 3.0...,I Neutral that I experienced physical violenc...,I Agree that My understanding with my parents...,I get as much satisfaction out of things as I ...,"I am 24 years old, studying in my 4 year with ...",I Neutral that my Parents are Strict. I Disa...,Minimal
3,I am currently in my 1 year with a CGPA of 3.8...,"I Agree that I experienced physical violence,...",I Strongly disagree that My understanding wit...,I get as much satisfaction out of things as I ...,"I am 22 years old, studying in my 1 year with ...",I Neutral that my Parents are Strict. I Neut...,Minimal
4,I am currently in my 4 year with a CGPA of 3.4...,I Disagree that I experienced physical violen...,I Disagree that My understanding with my pare...,I don't enjoy things the way I used to.I feel ...,"I am 25 years old, studying in my 4 year with ...",I Strongly disagree that my Parents are Stric...,Moderate


In [38]:
df2['PHQDescription'][0]

"I get as much satisfaction out of things as I used to.I feel sad I don't sleep as well as I used to.  About my fatigue levels, I don't get more tired than usual. My appetite is no worse than usual. I feel I have failed more than the average person. I make decisions about as well as I ever could. Due to Agitation, I experience Fidgetiness/ (অস্থিরতা) About my suicidal thoughts, it is - Absent."

In [39]:
df2['GeneralDescription'][0]

"I am 23 years old, studying in my 4 year with a CGPA of 3.38. My family's income level is 100000.  I have 5 close friends. I Never. use drugs.  I have a history of depression: No."

In [40]:
df2['AcademicDescription'][0]

'I am currently in my 4 year with a CGPA of 3.38, and I Disagree about my academic performance being poor. I Strongly disagree about being unhappy with my choice of subject, and I No about choosing this subject due to career prospects.'

In [41]:
df2['ViolenceDescription'][0]

'I  Disagree that I experienced physical violence, and I  Disagree that I experienced sexual violence. Also I  Agree that I experienced verbal violence, and I Disagree that I experienced emotional violence.'

In [42]:
df2['SocialDescription'][0]

' I Disagree that My understanding with my parents is bad.  I Neutral  that my bonding with my siblings is  bad.  I Disagree that there is unstability in my parental relationship. I Disagree that my relationship with my peers is bad.  I Disagree that my teachers are unsupportive. I  Agree that I feel social interference in my life.  I have 5 close friends and I hang out with my friends Often. I have not lost interest in other people.'

In [43]:
df2['PressureDescription'][0]

'I  Agree that my Parents are Strict. I  Agree that I experienced Academic Pressure from my parents,  and I  Agree that I experienced Marriage Pressure from my parents. I Disagree that I experienced Career Pressure from my parents.I Disagree that I experienced Pressure for studying abroad from my parents.'