 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Title<a id=5></a></p>
</div>

# Predicting Stress in Bangladeshi University Students: A LIME-Interpretable Machine Learning Approach

# Introduction

Stress has become a pervasive issue among university students, exerting detrimental effects on their academic performance, mental and physical health, and overall well-being. Recognizing the urgency to address this problem, our research paper employs machine learning algorithms and explainable artificial intelligence (XAI) techniques to predict stress levels among university students. By accurately identifying stress levels and understanding the underlying factors, we can take proactive measures to minimize the negative impact on students' academic performance and well-being.

The study involved collecting data from multiple universities in Bangladesh, ensuring a diverse and representative sample. Various features, including academic performance, socio-economic background, lifestyle choices, mental health indicators, and self-reported stress levels, were considered as input variables. Machine learning algorithms were trained on this dataset to develop prediction models capable of accurately classifying stress levels in university students.

Among the machine learning algorithms evaluated, the Support Vector Classifier (SVC) emerged as the top performer, achieving an accuracy of 90%. This high level of accuracy underscores the potential of machine learning in effectively predicting stress levels. The SVC model incorporates a kernel function that separates stress levels based on different features, enabling robust classification.

To address concerns about model interpretability and biases, an explainable artificial intelligence technique called LIME (Local Interpretable Model-agnostic Explanations) was employed. LIME provides insights into the decision-making process of the machine learning model, allowing us to identify the specific observations and features that contribute most to stress predictions. By applying LIME, the research team gained a deeper understanding of the model's behavior and made the stress prediction process more transparent and understandable.

The results of this research not only contribute to accurate stress prediction but also reveal the causes and factors contributing to stress among university students. By identifying at-risk individuals early on, universities and support services can implement targeted interventions and support systems tailored to the specific needs of stressed students. Furthermore, the interpretability offered by LIME enables stakeholders to address any biases present in the prediction model, ensuring fair and equitable support for all students.

In conclusion, this research paper demonstrates the potential of machine learning algorithms and explainable artificial intelligence techniques in predicting stress levels among university students. By combining the Support Vector Classifier with LIME, we achieved a high level of accuracy and transparency in stress prediction. The findings of this study not only assist in early detection of stress but also provide insights into the underlying causes. Through proactive interventions and support, we can mitigate the negative effects of stress on students' academic performance and overall well-being, fostering a healthier and more conducive learning environment.

# Importing the Libraries

This section introduces the Python libraries used in the study and imports them into the project. The libraries and their roles are:

**NumPy** : This library is a dependency of most of the other libraries. Its main purpose is to provide mathematical operations on matrices and vectors in Python. In our project, it mainly supports the other libraries.

**Pandas** : This library handles importing and processing datasets in Python. In our project, it is used to load the CSV dataset and to perform various operations on it.

**Matplotlib** : This library is commonly used to visualize data, and it performs the same task in our project.

**Seaborn** : This library, which has features similar to Matplotlib, is another data visualization library for Python. In our project, it is used for plot types that Matplotlib does not provide directly.

**Scikit-Learn** : This library implements a wide range of machine learning algorithms. We use its functions and classes for every step from building to evaluating the classification models.

Now let's import these libraries, together with the classifiers and metrics used later, and get them ready for use:

In [None]:
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns
import warnings as wr
wr.filterwarnings('ignore')

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn import preprocessing
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, accuracy_score, precision_score
from sklearn.metrics import confusion_matrix, roc_curve, recall_score, f1_score, roc_auc_score

# DATA EXPLORATION

## Detailed Information of the Dataset

In [None]:
# Import the dataset: read the CSV file into the "dataset" DataFrame
dataset = pd.read_csv('/kaggle/input/stress-prediction/Survey-Survey.csv', header=0, encoding='unicode_escape')
dataset.info()  # Show detailed information about the dataset columns (attributes)

In [None]:
dataset.head() # Print the first 5 entries of the dataset

In [None]:
dataset.tail()  # Prints last 5 entries of the dataset

In [None]:
dataset.describe()  # Print a table of summary statistics for the numeric columns

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Initial Review<a id=5></a></p>
</div>


In [None]:
dataset.shape  # Dimensions of the dataset as (rows, columns)

In [None]:
dataset.columns  # Column (feature) names of the dataset

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Checking Missing Values<a id=5></a></p>
</div>

Checking for missing values is an important step in data preprocessing and cleaning. Missing values can degrade the performance of machine learning models and need to be handled appropriately.


In [None]:
dataset.isnull().sum() # Count the number of missing values in each column
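
To illustrate what the missing-value check reports, here is a minimal sketch on a small hypothetical frame. The column names and values are invented for illustration and are not taken from the survey file. It shows the per-column count alongside the share of rows affected, which can help when deciding whether to drop or impute.

```python
import pandas as pd

# Hypothetical toy data; the column names only mimic the survey's style.
toy = pd.DataFrame({
    "age": ["21-25", None, "< 20", "> 25"],
    "CGPA": ["> 3.50", "< 3.00", None, None],
    "gender": ["Female", "Male", "Male", "Female"],
})

missing_per_column = toy.isnull().sum()        # absolute number of missing cells
missing_share = toy.isnull().mean().round(2)   # fraction of rows that are missing

print(missing_per_column)
print(missing_share)
```

A column with a large missing share may be a candidate for removal, while a few scattered gaps are often easier to handle by dropping the affected rows.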

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;DELETE COLUMN<a id=5></a></p>
</div>

Deleting a column from a dataset is often done during the data preprocessing stage, when certain columns are deemed unnecessary for the analysis or modeling process.

In [None]:
# Drop columns by positional index
dataset1 = dataset.drop(dataset.columns[[0,1]],axis = 1) # Delete the first two columns
dataset1.head()
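
The positional drop can be sketched on a hypothetical frame as follows. The column names `Timestamp` and `Email` are assumptions chosen for illustration; the actual first two columns of the survey file are not shown here. Printing the labels before dropping them is a cheap guard against removing the wrong columns if the file layout changes.

```python
import pandas as pd

# Hypothetical frame; the first two columns stand in for metadata columns.
toy = pd.DataFrame({
    "Timestamp": ["t1", "t2"],
    "Email": ["a@x", "b@x"],
    "Major": ["CSE", "EEE"],
    "age": ["21-25", "< 20"],
})

# Look up the labels at positions 0 and 1, then drop them by name.
to_drop = toy.columns[[0, 1]]
trimmed = toy.drop(columns=to_drop)

print(list(to_drop))          # labels that were removed
print(list(trimmed.columns))  # labels that remain
```

`drop` returns a new DataFrame by default, so the original frame stays unchanged unless the result is assigned back.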

In [None]:
dataset1.shape

In [None]:
dataset1.columns

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;RENAME COLUMN<a id=5></a></p>
</div>

Renaming columns is often necessary to give them more descriptive or meaningful names, or to resolve naming conflicts.


In [None]:
# Changing columns name with index number
mapping = {dataset1.columns[0]: 'Major',
           dataset1.columns[1]: 'class',
           dataset1.columns[2]: 'gender',
           dataset1.columns[3]: 'age',
           dataset1.columns[4]: 'livingWfamily',
           dataset1.columns[5]: 'acasatisfaction',
           dataset1.columns[6]: 'CGPA',
           dataset1.columns[7]: 'drugs',
           dataset1.columns[8]: 'relationship',
           dataset1.columns[9]: 'breakup',
           dataset1.columns[10]: 'conflict',
           dataset1.columns[11]: 'financialProb',
           dataset1.columns[12]: 'Violence',
           dataset1.columns[13]: 'bullied',
           dataset1.columns[14]: 'abused',
           dataset1.columns[15]: 'smediaT',
           dataset1.columns[16]: 'Q1A', 
           dataset1.columns[17]: 'Q2A',
           dataset1.columns[18]: 'Q3A',
           dataset1.columns[19]: 'Q4A',
           dataset1.columns[20]: 'Q5A',
           dataset1.columns[21]: 'Q6A',
           dataset1.columns[22]: 'Q7A',
           dataset1.columns[23]: 'TIPI1',
           dataset1.columns[24]: 'TIPI2', 
           dataset1.columns[25]: 'TIPI3',
           dataset1.columns[26]: 'TIPI4',
           dataset1.columns[27]: 'TIPI5',
           dataset1.columns[28]: 'TIPI6', 
           dataset1.columns[29]: 'TIPI7',
           dataset1.columns[30]: 'TIPI8',
           dataset1.columns[31]: 'TIPI9',
           dataset1.columns[32]: 'TIPI10'}
dataset2 = dataset1.rename(columns=mapping)
display(dataset2)
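
A position-based rename can also be built from a plain list of new names. The sketch below uses hypothetical question-style headers (not the real survey headers) and adds a length check so that a mismatch between the name list and the frame fails loudly instead of silently mislabeling columns.

```python
import pandas as pd

# Hypothetical long survey headers, invented for illustration.
toy = pd.DataFrame({
    "What is your major?": ["CSE"],
    "What is your gender?": ["Female"],
    "What is your age?": ["21-25"],
})

short_names = ["Major", "gender", "age"]  # must follow the column order

# Guard against a silent mismatch between the name list and the frame.
assert len(short_names) == len(toy.columns)

mapping = dict(zip(toy.columns, short_names))
renamed = toy.rename(columns=mapping)
print(list(renamed.columns))
```

Because the mapping depends on column order, it is worth re-checking whenever columns are dropped or the source file changes.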

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;HANDLE NULL VALUE<a id=5></a></p>
</div>
Handling null values, also known as missing values, is an important step in machine learning preprocessing to ensure accurate and reliable analysis or modeling.

In [None]:
# if your dataset contains missing value, check which column has missing values
dataset2.isnull().sum()

In [None]:
# if your dataset contains missing value, remove those missing values
dataset2.dropna(inplace=True)
dataset2.shape

In [None]:
dataset2.duplicated().sum() # Count the number of duplicate rows

In [None]:
dataset2.drop_duplicates(inplace=True) # Remove duplicate rows

In [None]:
dataset2.shape
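
The effect of the two cleaning steps above can be traced on a small hypothetical frame (the values are invented for illustration). Recording the row count before and after makes it clear how much data each step removes.

```python
import pandas as pd

# Hypothetical data with one missing value and two duplicated rows.
toy = pd.DataFrame({
    "gender": ["Female", "Male", "Male", None, "Female"],
    "age": ["21-25", "< 20", "< 20", "> 25", "21-25"],
})

rows_before = len(toy)
after_dropna = toy.dropna()                     # removes the row with a missing gender
cleaned = after_dropna.drop_duplicates().reset_index(drop=True)

print(f"rows before: {rows_before}")
print(f"after dropna: {len(after_dropna)}")
print(f"after drop_duplicates: {len(cleaned)}")
```

Note that in anonymous survey data two respondents can legitimately give identical answers, so removing duplicates is a judgment call rather than a purely mechanical step.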

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;MAJOR FEATURE<a id=5></a></p>
</div>


In [None]:
dataset2['Major'].value_counts()

In [None]:
plt.figure(figsize=(10,5)) # Set the size of the figure
dataset2['Major'].value_counts()[:20].plot(kind='barh',color='green') # Count each major and plot the 20 most common
plt.ylabel('Majors')
plt.xlabel('Count')
plt.title('Top 20 Majors of Survey Participants')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;DROP MAJOR FEATURE<a id=5></a></p>
</div>

In [None]:
dataset2.drop('Major', inplace=True, axis=1) # Remove the 'Major' column from the DataFrame
dataset2.head()

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;GENDER FEATURE<a id=5></a></p>
</div>

In [None]:
plt.figure(figsize=(5, 5))
dataset2['gender'].value_counts()[:30].plot(kind= 'bar')

In [None]:
print('Number of participants by gender\n',dataset2['gender'].value_counts())

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
dataset2['gender'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;ENCODING TECHNIQUES<a id=5></a></p>
</div>
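
Label encoding assigns a unique integer to each category of a categorical variable. Scikit-learn provides the `LabelEncoder` class for this. Note that `LabelEncoder` numbers the categories in sorted order of their values, so the resulting codes reflect a meaningful ranking only when that sorted order happens to match the natural order of the categories. The code assigned to each category is listed at the top of the feature sections that follow.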

In [None]:
le = LabelEncoder()
for col in dataset2.columns:
    # Leave numeric columns untouched; label-encode everything else
    if pd.api.types.is_numeric_dtype(dataset2[col]):
        continue
    dataset2[col] = le.fit_transform(dataset2[col])
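
To see how the codes come about, the sketch below encodes three CGPA bands. The exact category strings are an assumption based on the code listing in the CGPA section of this notebook. `LabelEncoder` sorts the strings before numbering them, so the codes do not follow the low-to-high order of the bands; an explicit mapping is shown as one possible alternative when the order matters.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Assumed category strings, for illustration only.
cgpa = pd.Series(["< 3.00", "3.01-3.50", "> 3.50", "3.01-3.50"])

# LabelEncoder numbers the classes in sorted (lexicographic) order.
le = LabelEncoder()
alphabetical = le.fit_transform(cgpa)
print(dict(zip(le.classes_, range(len(le.classes_)))))

# An explicit mapping keeps the natural low-to-high order instead.
order = {"< 3.00": 0, "3.01-3.50": 1, "> 3.50": 2}
ordinal = cgpa.map(order)
print(ordinal.tolist())
```

With the sorted encoding the middle band receives code 0, which is harmless for tree-based models but can mislead distance-based or linear models that treat the codes as magnitudes.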

In [None]:
dataset2.head()

In [None]:
dataset2.head(20)

In [None]:
dataset2.columns

In [None]:
new_data=dataset2.iloc[::]
data_3=dataset2.filter(regex=r'Q\d{1,2}A') # Raw string so that \d reaches the regex engine unchanged
data_3.head()

In [None]:
# Sum the seven questionnaire items row-wise and double the total
dataset2['S_Count'] = dataset2[['Q1A', 'Q2A', 'Q3A', 'Q4A', 'Q5A', 'Q6A', 'Q7A']].sum(axis=1) * 2
dataset2.head()
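
The score calculation can be checked by hand on a tiny hypothetical frame. The item values below are invented and assume each item is coded 0 to 3; the `TIPI1` column is included only to show that non-questionnaire columns are left out of the sum.

```python
import pandas as pd

# Hypothetical item responses for two respondents.
toy = pd.DataFrame({
    "Q1A": [0, 3], "Q2A": [1, 3], "Q3A": [0, 2], "Q4A": [2, 3],
    "Q5A": [1, 3], "Q6A": [0, 2], "Q7A": [1, 3],
    "TIPI1": [5, 2],  # not part of the score
})

item_cols = [f"Q{i}A" for i in range(1, 8)]       # Q1A ... Q7A
toy["S_Count"] = toy[item_cols].sum(axis=1) * 2   # row-wise sum, doubled
print(toy["S_Count"].tolist())
```

Under the 0 to 3 coding assumed here, the doubled score ranges from 0 to 42, which is a quick sanity check against the minimum and maximum reported for the real data.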

In [None]:
dataset2.S_Count.max()

In [None]:
dataset2.S_Count.min()

In [None]:
stress = dataset2
stress.head()

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;STRESS LEVEL<a id=5></a></p>
</div>


In [None]:
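# Map a stress score to a severity label
# (named stress_level to avoid shadowing Python's built-in filter)
def stress_level(score):
    if 0 <= score <= 18:
        return 'Normal'
    if 19 <= score <= 25:
        return 'Moderate'
    if score >= 26:
        return 'Severe'
    return None  # negative or otherwise out-of-range scores stay unlabeled

stress['Slevel'] = stress['S_Count'].apply(stress_level)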
stress.tail(5)
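
As an alternative sketch, the same thresholds can be expressed with `pd.cut`, which bins a whole column at once and returns an ordered categorical. The scores below are hypothetical boundary values chosen to exercise each threshold; this is not the method used in the notebook, only an equivalent formulation for integer scores.

```python
import numpy as np
import pandas as pd

# Hypothetical scores sitting on each boundary.
scores = pd.Series([0, 18, 19, 25, 26, 42])

levels = pd.cut(
    scores,
    bins=[-1, 18, 25, np.inf],   # intervals (-1, 18], (18, 25], (25, inf]
    labels=["Normal", "Moderate", "Severe"],
)
print(levels.tolist())
```

Because the bins are right-closed, a score of 18 falls in the first band and 19 in the second, matching the cut-offs used above.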

In [None]:
stress.drop('S_Count', inplace=True, axis=1)
stress.head()

In [None]:
stress_order = ['Normal', 'Moderate', 'Severe']
stress['Slevel'] = pd.Categorical(stress['Slevel'], categories=stress_order, ordered=True)
plt.figure(figsize=(10,6))
sns.countplot(data=stress, x='Slevel', order=stress_order)
plt.title('People Condition for Stress Level', fontsize=15)
plt.show()

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Encoding Technique for the Target Column<a id=5></a></p>
</div>


In [None]:
stress.Slevel=stress.Slevel.replace(['Normal', 'Moderate', 'Severe'],[0,1,2])
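
For an ordered categorical target, the integer codes can also be read directly from the category order. The sketch below uses a hypothetical series of labels and shows both directions of the lookup. It is an alternative to the explicit `replace` call, not a description of what the notebook does.

```python
import pandas as pd

# Hypothetical labels in arbitrary order.
levels = pd.Series(["Moderate", "Normal", "Severe", "Normal"])
order = ["Normal", "Moderate", "Severe"]

as_cat = pd.Series(pd.Categorical(levels, categories=order, ordered=True))
codes = as_cat.cat.codes                         # position of each label in `order`
print(codes.tolist())

# Reverse lookup from code to label.
code_to_label = dict(enumerate(as_cat.cat.categories))
print(code_to_label)
```

Keeping the reverse lookup around makes it easy to turn model predictions back into readable labels later.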

In [None]:
# from sklearn.preprocessing import LabelEncoder
# le = LabelEncoder()
# stress.Slevel = le.fit_transform(stress.Slevel)

In [None]:
stress['Slevel'].value_counts()

In [None]:
# stress.Slevel=stress.Slevel.replace(['Normal', 'Moderate', 'Severe'],[0,1,2])

stress.Slevel =stress.Slevel.replace({0:'Normal', 1:'Moderate',2:'Severe'})
print("Total Records:",len(stress),
      "\nTotal Normal Data:",len(stress[stress.Slevel=='Normal']),
      "\nTotal Moderate Data:",len(stress[stress.Slevel=='Moderate']),
      "\nTotal Severe Data:",len(stress[stress.Slevel=='Severe']))

In [None]:
# seaborn and matplotlib are already imported at the top of the notebook
sns.set(font_scale=1.4)
plt.figure(figsize=(6, 2))  # single figure size instead of two conflicting ones
stress['Slevel'].value_counts().plot(kind='bar')
plt.xlabel("Stress Class", labelpad=12)
plt.ylabel("Number of Responses", labelpad=12)
plt.xticks(rotation=0)
plt.title("Dataset Distribution", y=1.02)

plt.show()

print("Total Records:",len(stress),
      "\nTotal Normal Data:",len(stress[stress.Slevel=='Normal']),
      "\nTotal Moderate Data:",len(stress[stress.Slevel=='Moderate']),
      "\nTotal Severe Data:",len(stress[stress.Slevel=='Severe']))

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['Slevel'].value_counts().plot(kind='pie', autopct='%1.0f%%')
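
The percentages drawn on the pie chart can also be obtained as numbers. This minimal sketch uses a hypothetical label series (the class sizes are invented) and `value_counts(normalize=True)` to compute the share of each class.

```python
import pandas as pd

# Hypothetical class sizes: 5 Normal, 3 Moderate, 2 Severe.
levels = pd.Series(["Normal"] * 5 + ["Moderate"] * 3 + ["Severe"] * 2)

share = levels.value_counts(normalize=True).mul(100).round(1)
print(share)
```

Having the shares as a Series is useful for reporting class imbalance, which matters when accuracy is the main evaluation metric.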

In [None]:
stress.Slevel=stress.Slevel.replace(['Normal', 'Moderate', 'Severe'],[0,1,2])
stress['Slevel'].value_counts()

In [None]:
stress.head()

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Visualization SLEVEL<a id=5></a></p>
</div>

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.Slevel, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Stress Condition',fontsize=15)
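
The manual annotation loop used for these count plots can be replaced by Matplotlib's `Axes.bar_label`, available from Matplotlib 3.4 onward. The sketch below is an assumption-laden alternative: it draws a plain Matplotlib bar chart from hypothetical counts rather than a seaborn count plot, and lets `bar_label` position the numbers.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical class sizes, for illustration only.
levels = pd.Series(["Normal"] * 5 + ["Moderate"] * 3 + ["Severe"] * 2)
counts = levels.value_counts().reindex(["Normal", "Moderate", "Severe"])

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.bar(counts.index, counts.values)

# Write each bar's height above it as an integer (requires matplotlib >= 3.4).
labels = ax.bar_label(bars, fmt="%d", padding=2)
ax.set_title("Stress Condition")
```

Since the labels are integers, this also avoids showing counts with a trailing decimal such as `5.0`.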

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['Slevel'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Visualization GENDER<a id=5></a></p>
</div>

#### Female = 0, Male = 1, Other = 2

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.gender, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Stress Condition of Different Genders',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax = sns.countplot(x='gender', data=stress, hue='Slevel')

for p in ax.patches:
    ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

plt.title('Stress Condition of Different Genders', fontsize=15)
plt.show()
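
A grouped count plot has a direct tabular counterpart in `pd.crosstab`. The sketch below uses hypothetical encoded values for `gender` and `Slevel` (invented for illustration) and shows both raw counts and row-wise shares, which makes groups of different sizes easier to compare.

```python
import pandas as pd

# Hypothetical encoded data: gender codes and stress-level codes.
toy = pd.DataFrame({
    "gender": [0, 0, 1, 1, 1, 0],
    "Slevel": [0, 2, 1, 1, 0, 2],
})

counts = pd.crosstab(toy["gender"], toy["Slevel"])                       # raw counts
row_share = pd.crosstab(toy["gender"], toy["Slevel"], normalize="index").round(2)

print(counts)
print(row_share)
```

The same two calls work for any of the other features examined below by swapping the first argument.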

In [None]:
plt.rcParams["figure.figsize"] = [5,5]  # 0 = Female
stress['gender'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Are you happy about your academic condition?<a id=5></a></p>
</div>

#### YES = 1, NO = 0

In [None]:
stress['acasatisfaction'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.acasatisfaction, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Are you happy about your academic condition? : Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax=sns.countplot(x=stress.acasatisfaction, data=stress, hue='Slevel')

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Are you happy about your academic condition? : Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['acasatisfaction'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for CGPA<a id=5></a></p>
</div>

#### 3.01-3.50 = 0 , > 3.50 = 2, < 3.00 = 1

In [None]:
stress['CGPA'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.CGPA, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Stress Condition of Different CGPA',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax=sns.countplot(x=stress.CGPA, data=stress, hue='Slevel')

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Stress Condition of Different CGPA',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['CGPA'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Age<a id=5></a></p>
</div>

#### > 25 = 2, 21-25 = 0, < 20 = 1

In [None]:
stress['age'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.age, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Stress Condition of Different Age',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax=sns.countplot(x=stress.age, data=stress, hue='Slevel')

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Stress Condition of Different Age',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['age'].value_counts().plot(kind='pie', autopct='%1.0f%%')

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(30, 10))

# Count plot for CGPA
plt.subplot(1, 2, 1)
ax3 = sns.countplot(x='CGPA', data=stress, hue='Slevel')
plt.title('Stress Condition of CGPA', fontsize=15)

for p in ax3.patches:
    ax3.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

# Count plot for class
plt.subplot(1, 2, 2)
ax2 = sns.countplot(x='class', data=stress, hue='Slevel')
plt.title('Stress Condition by Current Class Level', fontsize=15)

for p in ax2.patches:
    ax2.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

plt.tight_layout()
plt.show()

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(30, 10))

# Count plot for acasatisfaction
plt.subplot(1, 3, 1)
ax1 = sns.countplot(x='acasatisfaction', data=stress, hue='Slevel')
plt.title('Stress Condition of acasatisfaction', fontsize=15)

for p in ax1.patches:
    ax1.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

# Count plot for age
plt.subplot(1, 3, 2)
ax2 = sns.countplot(x='age', data=stress, hue='Slevel')
plt.title('Stress Condition of Age', fontsize=15)

for p in ax2.patches:
    ax2.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

# Count plot for CGPA
plt.subplot(1, 3, 3)
ax3 = sns.countplot(x='CGPA', data=stress, hue='Slevel')
plt.title('Stress Condition of CGPA', fontsize=15)

for p in ax3.patches:
    ax3.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))

plt.tight_layout()
plt.show()

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Have you ever been bullied?<a id=5></a></p>
</div>

#### No = 0, Yes = 1

In [None]:
stress['bullied'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.bullied, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Data Comparison for Have you ever been bullied?',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax=sns.countplot(x=stress.bullied, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Have you ever been bullied?:Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['bullied'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for How often do you conflict with your friend?<a id=5></a></p>
</div>


#### Sometimes = 2, Never occurs = 1, Most of the time = 0

In [None]:
stress['conflict'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.conflict, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('conflict with your friend : Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15, 5))
ax=sns.countplot(x=stress.conflict, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('conflict with your friend : Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['conflict'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Did you have a recent breakup?<a id=5></a></p>
</div>


#### Yes = 1, No = 0

In [None]:
stress['breakup'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.breakup, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Did you have a recent breakup? : Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax=sns.countplot(x=stress.breakup, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Did you have a recent breakup? : Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['breakup'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for How many hours do you spend on social media?<a id=5></a></p>
</div>

#### < 1 Hours = 1, > 3 Hours = 2 , 1-3 Hours = 0

In [None]:
stress['smediaT'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.smediaT, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('How many hours do you spend on social media? : Stress',fontsize=15)

In [None]:
plt.figure(figsize=(25,5))
ax=sns.countplot(x=stress.smediaT, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('How many hours do you spend on social media? : Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['smediaT'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Violence in family?<a id=5></a></p>
</div>


#### Most of the time = 0, Never = 1, Rarely = 3, Often = 2 

In [None]:
stress['Violence'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.Violence, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Violence in family?:Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax = sns.countplot(x=stress.Violence, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Violence in family?:Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['Violence'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Are you addicted to any drugs?<a id=5></a></p>
</div>


#### No = 0, Yes = 1

In [None]:
stress['drugs'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.drugs, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Are you addicted to any drugs?: Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax = sns.countplot(x=stress.drugs, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Are you addicted to any drugs?: Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['drugs'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Have you ever been sexually harassed or abused?<a id=5></a></p>
</div>

#### Maybe = 0, No = 1, Yes = 2

In [None]:
stress['abused'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.abused, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Have you ever been sexually harassed or abused? : Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax = sns.countplot(x=stress.abused, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Have you ever been sexually harassed or abused? : Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['abused'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Are you in a relationship?<a id=5></a></p>
</div>

#### Yes = 1, No = 0

In [None]:
stress['relationship'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.relationship, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Are you in a relationship?:Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax=sns.countplot(x=stress.relationship, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Are you in a relationship?:Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['relationship'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Data Comparison for Do you have financial problem in your family?<a id=5></a></p>
</div>

#### Yes = 1, No = 0

In [None]:
stress['financialProb'].value_counts()

In [None]:
plt.figure(figsize=(10,5))
ax=sns.countplot(x=stress.financialProb, data=stress)

for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.25, p.get_height()+0.9))
plt.title('Do you have financial problem in your family?: Stress',fontsize=15)

In [None]:
plt.figure(figsize=(15,5))
ax = sns.countplot(x=stress.financialProb, data=stress, hue='Slevel')
for p in ax.patches:
   ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+0.9))
plt.title('Do you have financial problem in your family?: Stress',fontsize=15)

In [None]:
plt.rcParams["figure.figsize"] = [5,5] 
stress['financialProb'].value_counts().plot(kind='pie', autopct='%1.0f%%')

 <div style="color:#FFFFFF          ; display  :fill; border-radius:90px;
           background-color:#13A4B4; font-size:10px; font-family  :cursive">
    
<p style="padding   : .1px;    color    :#FFFFFF; 
          text-align: Left;   font-size:28px; font-family  :cursive">       
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&nbsp;Correlation of Columns(Attributes)<a id=5></a></p>
</div>
In this section, we compute the correlation matrix of the columns and visualize it as a heatmap, which makes the relationships between the attributes easier to see.

In [None]:
stress.corr()

In [None]:
# Compute the correlation matrix once and reuse it for the heatmap
corr = stress.corr()
plt.figure(figsize=(30,30))
sns.heatmap(corr, cmap='YlGnBu', annot=True)
plt.title("Correlation Map", fontweight="bold", fontsize=16)
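
A correlation matrix is symmetric, so half of a large heatmap repeats the other half. One common way to reduce clutter is to hide the upper triangle with a boolean mask. The sketch below builds such a mask on a small made-up DataFrame; the data and the names `toy`, `mask` and `lower` are illustrative assumptions, and the seaborn call is left as a comment so that the block runs with only NumPy and pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data reusing three column names from the notebook.
toy = pd.DataFrame({
    "Slevel":        [0, 1, 2, 2, 1, 0],
    "financialProb": [0, 1, 1, 1, 0, 0],
    "relationship":  [1, 0, 1, 0, 1, 0],
})

corr = toy.corr()

# True on and above the diagonal: these are the cells to hide.
mask = np.triu(np.ones_like(corr, dtype=bool))

# Blank out the masked cells to inspect the result as a table.
lower = corr.mask(mask)
print(lower.round(2))

# With seaborn imported as sns, the same mask can be passed to the heatmap:
# sns.heatmap(corr, mask=mask, cmap='YlGnBu', annot=True)
```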

In [None]:
stress.drop('Slevel', axis=1).corrwith(stress.Slevel).plot(kind='bar', grid=True, figsize=(12, 10), title="Correlation with target",color="green");
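
The bar chart above keeps the columns in their original order. Sorting by absolute correlation makes it easier to see which attributes are most strongly associated with the target. This is a minimal sketch on made-up data; `toy`, `target_corr` and `ranked` are hypothetical names, and the values are not from the survey.

```python
import pandas as pd

# Hypothetical toy data reusing three column names from the notebook.
toy = pd.DataFrame({
    "Slevel":        [0, 1, 2, 2, 1, 0],
    "financialProb": [0, 1, 1, 1, 0, 0],
    "relationship":  [1, 0, 1, 0, 1, 0],
})

# Pearson correlation of every feature with the target column.
target_corr = toy.drop("Slevel", axis=1).corrwith(toy["Slevel"])

# Order by strength of association, keeping the original sign.
order = target_corr.abs().sort_values(ascending=False).index
ranked = target_corr.reindex(order)

print(ranked.round(3))
```

On the real data, `ranked.plot(kind='bar')` would give the same chart as above with the bars ordered by strength.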

In [None]:
p = stress.hist(figsize = (20,20))

In [None]:
# Styler.set_precision was removed in pandas 2.0; format(precision=...) is its replacement
stress.corr().style.background_gradient(cmap='coolwarm').format(precision=2)

In [None]:
stress1=stress.copy()
stress2=stress.copy()
stress3=stress.copy()
stress4=stress.copy()
stress5=stress.copy()

In [None]:
stress.to_csv('part-1.csv', index=False, encoding='utf-8')  # index=False: do not write the row index to the file