# Social Media Use Patterns and Mental Health, a Data Science Project

The relationship between social media use and mental health outcomes is not fully understood.
Because social media use is widespread across many demographics, understanding how it affects mental health is critical both for properly treating mental disorders and for improving quality of life for those who suffer from them.

As "mental health" is a far-reaching and abstractly defined concept, this project focuses on two of the most common mental health disorders, anxiety and depression, and uses several machine learning classifiers to try to isolate the relationship between these disorders and individual social media use.

### The impact of mental illness

Mental illness impacts individuals, and by extension society, in a variety of ways.

According to the World Health Organization (WHO):
- Mental health conditions can cause difficulties in all aspects of life, including relationships with family, friends and community.
- In 2019, 970 million people globally were living with a mental disorder, with anxiety and depression the most common. This means approximately **1 in 8** people suffer from some form of mental illness.
- In 2019, 301 million people were living with an anxiety disorder, including 58 million children and adolescents.
- In 2019, 280 million people were living with depression, including 23 million children and adolescents.

The high prevalence of both social media use and mental health disorders, combined with the currently limited understanding of how they interact, makes further analysis of the subject crucial to public health.

# Project goals

Despite the prevalence of these disorders, detection and diagnosis still pose a significant challenge due to several factors:
- Diagnosis of anxiety and depression is made according to the self-reported feelings of the patient.

  This means that factors such as the individual's personal feelings about mental health, their willingness to accept help, and social stigma all play a part in the detection of these disorders in a way that is not present with physical conditions.
-  Disease comorbidity

   The existence of two or more mental health disorders in an individual is common, and those with one type of mental disorder often develop other types of mental disorders.
   Moreover, many disorders share similar symptoms, making it difficult to identify the primary condition.

The primary goal of this project is to create a predictive model for detection of anxiety and depression based on individual social media use patterns.
Through this, I hope to create another tool for individuals and health care professionals to use in the difficult task of mental health diagnosis.

The secondary goal of this project is to map the relationship between mental health and social media, and to identify healthy and unhealthy social media use patterns using machine learning tools. 

# Exploratory Data Analysis
### The Dataset

The dataset used in this project consists of 482 responses to a survey of Bangladeshi citizens.
The first 8 questions are designed to understand the demographics and social media use patterns of the participants. 
The last 12 questions are designed to capture various mental health indicators for the participants. Responses use a 5-point Likert scale: a low score of 1 indicates that the participant "strongly disagrees" with the question, and a high score of 5 means the participant "strongly agrees".
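As a rough sketch of how the Likert-scale responses could be sanity-checked before analysis, the snippet below selects the columns for questions 9 to 20 by their leading question number and counts answers that fall outside 1 to 5. The helper `likert_columns` and the toy DataFrame are hypothetical and exist only for illustration; the sketch assumes that each column name starts with its question number.

```python
import pandas as pd


def likert_columns(columns, first=9, last=20):
    """Return column names whose leading question number lies in [first, last]."""
    selected = []
    for name in columns:
        prefix = name.split(".", 1)[0]
        if prefix.isdigit() and first <= int(prefix) <= last:
            selected.append(name)
    return selected


# Toy stand-in for the survey data (values are made up for illustration).
toy = pd.DataFrame({
    "Timestamp": ["t1", "t2", "t3"],
    "2. Gender": ["Male", "Female", "Male"],
    "12. On a scale of 1 to 5, how easily distracted are you?": [5, 3, 1],
    "13. On a scale of 1 to 5, how much are you bothered by worries?": [2, 7, 4],
})

cols = likert_columns(toy.columns)

# True wherever an answer is not one of the five valid Likert scores.
out_of_range = ~toy[cols].isin([1, 2, 3, 4, 5])
bad_counts = out_of_range.sum().to_dict()
print(bad_counts)
```

On the real data, the same check could be run with `likert_columns(data.columns)` once the survey file is loaded; any nonzero count would point to entries worth inspecting by hand.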

The dataset is available at: https://docs.google.com/spreadsheets/d/1lWFIL7h0F7xtmJHNPJX7ttPkO4v9j3xQ2E9Qb1wjek4/edit?usp=sharing

### Importing and loading the data

In [37]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [48]:
data = pd.read_csv('formresponses.csv')

### Data Preprocessing

We have ample information regarding the different variables, and so we will begin by performing manual dimensionality reduction to remove irrelevant data.

Since the main focus of this project is an analysis related to depression and anxiety, we will remove all mental health indicators that are unrelated to these two conditions. The symptoms we require indicators for are as follows:

<u>Depression:</u>


Let us examine our variables:

In [67]:
list(data.columns)

['Timestamp',
 '1. What is your age?',
 '2. Gender',
 '3. Relationship Status',
 '4. Occupation Status',
 '5. What type of organizations are you affiliated with?',
 '6. Do you use social media?',
 '7. What social media platforms do you commonly use?',
 '8. What is the average time you spend on social media every day?',
 '9. How often do you find yourself using Social media without a specific purpose?',
 '10. How often do you get distracted by Social media when you are busy doing something?',
 "11. Do you feel restless if you haven't used Social media in a while?",
 '12. On a scale of 1 to 5, how easily distracted are you?',
 '13. On a scale of 1 to 5, how much are you bothered by worries?',
 '14. Do you find it difficult to concentrate on things?',
 '15. On a scale of 1-5, how often do you compare yourself to other successful people through the use of social media?',
 '16. Following the previous question, how do you feel about these comparisons, generally speaking?',
 '17. How often do




In addition, the first column holds a timestamp for when the participant took the survey, which we don't need.
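Dropping the timestamp could look like the minimal sketch below. It uses a small toy DataFrame with made-up values rather than the real survey file, so the frame name `toy` and its contents are assumptions for illustration only. On the loaded data, the same `drop` call would be applied to `data`.

```python
import pandas as pd

# Toy stand-in for the survey data (values are made up for illustration).
toy = pd.DataFrame({
    "Timestamp": ["t1", "t2"],
    "1. What is your age?": [21, 34],
    "2. Gender": ["Male", "Female"],
})

# errors="ignore" keeps the cell safe to re-run after the column is already gone.
toy = toy.drop(columns=["Timestamp"], errors="ignore")

remaining_columns = list(toy.columns)
print(remaining_columns)
```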