## Social Media Usage and Emotional Wellbeing

Social media is deeply embedded in society and in our personal lives, and it shapes both our everyday activities and our emotional wellbeing. This analysis explores a dataset that records users' social media engagement alongside their dominant emotional state. The objective is to understand the relationship between social media habits and emotional wellbeing.

This notebook includes the following sections:
- Data Cleaning / Preparation
- Exploratory Data Analysis
- Model Selection and Analysis

## Section 1: Data Cleaning / Preparation
- Load libraries and data
- Understand the data with descriptive statistics
- Locate and address any missing values

In [137]:
# Import libraries for data analysis, visualization, and statistical calculations

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

In [138]:
social_df = pd.read_csv("https://raw.githubusercontent.com/gurlv/SocialMediaDataset/main/SocialMediaDataset.csv")

In [139]:
print(social_df.head())

   User_ID  Age      Gender   Platform  Daily_Usage_Minutes  Posts_Per_Day   
0        1   25      Female  Instagram                  120              3  \
1        2   30        Male    Twitter                   90              5   
2        3   22  Non-binary   Facebook                   60              2   
3        4   28      Female  Instagram                  200              8   
4        5   33        Male   LinkedIn                   45              1   

   Likes_Received_Per_Day  Comments_Received_Per_Day  Messages_Sent_Per_Day   
0                      45                         10                     12  \
1                      20                         25                     30   
2                      15                          5                     20   
3                     100                         30                     50   
4                       5                          2                     10   

  Dominant_Emotion  
0        Happiness  
1            A

In [140]:
print(social_df.count())

User_ID                      1000
Age                          1000
Gender                       1000
Platform                     1000
Daily_Usage_Minutes          1000
Posts_Per_Day                1000
Likes_Received_Per_Day       1000
Comments_Received_Per_Day    1000
Messages_Sent_Per_Day        1000
Dominant_Emotion             1000
dtype: int64
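
Equal non-null counts do not rule out repeated records, so a duplicate check is a common companion to the count above. The following is a minimal sketch, assuming `User_ID` is meant to be unique per user (the dataset does not state this). The `sample_df` frame is a hypothetical stand-in for `social_df`, with one row repeated on purpose.

```python
import pandas as pd

# Hypothetical stand-in for social_df, with one row deliberately repeated
# so that the check has something to find.
sample_df = pd.DataFrame({
    "User_ID": [1, 2, 3, 3],
    "Age": [25, 30, 22, 22],
    "Platform": ["Instagram", "Twitter", "Facebook", "Facebook"],
    "Daily_Usage_Minutes": [120, 90, 60, 60],
})

# Rows that are exact copies of an earlier row
n_duplicate_rows = int(sample_df.duplicated().sum())

# IDs that occur more than once (assumes User_ID should be unique)
n_duplicate_ids = int(sample_df["User_ID"].duplicated().sum())

print("Duplicate rows:", n_duplicate_rows)
print("Duplicate User_IDs:", n_duplicate_ids)

# Keep the first occurrence of each fully duplicated row
deduped_df = sample_df.drop_duplicates().reset_index(drop=True)
print("Rows after de-duplication:", len(deduped_df))
```

On the real data, the same calls could be applied to `social_df` directly.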


In [141]:
print(social_df.columns)

Index(['User_ID', 'Age', 'Gender', 'Platform', 'Daily_Usage_Minutes',
       'Posts_Per_Day', 'Likes_Received_Per_Day', 'Comments_Received_Per_Day',
       'Messages_Sent_Per_Day', 'Dominant_Emotion'],
      dtype='object')


In [142]:
print(social_df.dtypes)

User_ID                       int64
Age                           int64
Gender                       object
Platform                     object
Daily_Usage_Minutes           int64
Posts_Per_Day                 int64
Likes_Received_Per_Day        int64
Comments_Received_Per_Day     int64
Messages_Sent_Per_Day         int64
Dominant_Emotion             object
dtype: object
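
The section outline lists descriptive statistics as a step. One way to produce them for the numeric columns is `DataFrame.describe()`. This is a sketch only: `sample_df` is a stand-in built from the five rows printed by `head()` above, not the full dataset, so the numbers it prints describe those five rows alone.

```python
import pandas as pd

# Stand-in for social_df: numeric columns from the five rows printed by head()
sample_df = pd.DataFrame({
    "Age": [25, 30, 22, 28, 33],
    "Daily_Usage_Minutes": [120, 90, 60, 200, 45],
    "Posts_Per_Day": [3, 5, 2, 8, 1],
    "Likes_Received_Per_Day": [45, 20, 15, 100, 5],
})

# describe() reports count, mean, std, min, quartiles, and max per column
numeric_summary = sample_df.describe()
print(numeric_summary)

# Individual statistics can be read from the summary table by label
mean_usage = numeric_summary.loc["mean", "Daily_Usage_Minutes"]
median_usage = numeric_summary.loc["50%", "Daily_Usage_Minutes"]
print("Mean daily usage (minutes):", mean_usage)
print("Median daily usage (minutes):", median_usage)
```

On the full data, `social_df.drop(columns="User_ID").describe()` would leave out the identifier column, whose mean and quartiles carry no meaning.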


In [143]:
# Identify missing values
missing_values = social_df.isnull().sum()

# Print the missing values count for each column
print("Missing Values:")
print(missing_values)

Missing Values:
User_ID                      0
Age                          0
Gender                       0
Platform                     0
Daily_Usage_Minutes          0
Posts_Per_Day                0
Likes_Received_Per_Day       0
Comments_Received_Per_Day    0
Messages_Sent_Per_Day        0
Dominant_Emotion             0
dtype: int64
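
No column has missing values, so nothing needs to be addressed here. For reference, the sketch below shows one common way gaps could be handled if they did appear: the median for a numeric column and the most frequent value for a categorical column. The frame and its gaps are hypothetical and introduced purely for illustration. The choice of median and mode is an assumption, not a step taken in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps introduced on purpose for illustration
sample_df = pd.DataFrame({
    "Daily_Usage_Minutes": [120.0, np.nan, 60.0, 200.0, 45.0],
    "Platform": ["Instagram", "Twitter", None, "Instagram", "LinkedIn"],
})

cleaned_df = sample_df.copy()

# Numeric column: fill gaps with the median, which is robust to outliers
usage_median = cleaned_df["Daily_Usage_Minutes"].median()
cleaned_df["Daily_Usage_Minutes"] = cleaned_df["Daily_Usage_Minutes"].fillna(usage_median)

# Categorical column: fill gaps with the most frequent value (the mode)
platform_mode = cleaned_df["Platform"].mode().iloc[0]
cleaned_df["Platform"] = cleaned_df["Platform"].fillna(platform_mode)

# Confirm that no missing values remain
print(cleaned_df.isnull().sum())
```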


In [144]:
# List the unique values in each categorical column
emotion_list = social_df['Dominant_Emotion'].unique()
print("Emotions:", emotion_list)

gender_list = social_df['Gender'].unique()
print("Gender:", gender_list)

platform_list = social_df['Platform'].unique()
print("Platform:", platform_list)

Emotions: ['Happiness' 'Anger' 'Neutral' 'Anxiety' 'Boredom' 'Sadness']
Gender: ['Female' 'Male' 'Non-binary']
Platform: ['Instagram' 'Twitter' 'Facebook' 'LinkedIn' 'Whatsapp' 'Telegram'
 'Snapchat']
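
`unique()` shows which categories exist but not how often each one occurs. A minimal sketch of counting category frequencies with `value_counts()` follows. The `platforms` series is a hypothetical sample for illustration, not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample of the Platform column, for illustration only
platforms = pd.Series(
    ["Instagram", "Twitter", "Facebook", "Instagram", "LinkedIn", "Instagram"],
    name="Platform",
)

# Absolute frequency of each category, most common first
platform_counts = platforms.value_counts()
print(platform_counts)

# Relative frequency (share of rows) of each category
platform_shares = platforms.value_counts(normalize=True)
print(platform_shares.round(2))
```

Applied to the real data, `social_df['Dominant_Emotion'].value_counts()` would show whether the emotion classes are balanced, which matters for the later modeling section.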


In [145]:
# Group the 'Gender' column by 'Platform'
grouped_data = social_df.groupby('Platform')['Gender']

# Summarize each group: row count, number of distinct genders,
# most frequent gender (top), and how often it occurs (freq)
group_stats = grouped_data.describe()
print(group_stats)

          count unique         top freq
Platform                               
Facebook    190      3  Non-binary  140
Instagram   250      3      Female  160
LinkedIn    120      3        Male   50
Snapchat     80      2  Non-binary   50
Telegram     80      2        Male   60
Twitter     200      3        Male  110
Whatsapp     80      2      Female   60
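
The summary above reports only the most frequent gender per platform. A cross-tabulation shows the count for every platform and gender pair. The sketch below uses `pd.crosstab` on a small hypothetical frame. The values are made up for illustration and do not reproduce the counts above.

```python
import pandas as pd

# Hypothetical sample, for illustration only
sample_df = pd.DataFrame({
    "Platform": ["Instagram", "Instagram", "Twitter", "Twitter", "Facebook", "Instagram"],
    "Gender": ["Female", "Female", "Male", "Female", "Non-binary", "Male"],
})

# Rows are platforms, columns are genders, cells are user counts
platform_gender_counts = pd.crosstab(sample_df["Platform"], sample_df["Gender"])
print(platform_gender_counts)

# Row-wise shares: each platform's row sums to 1
platform_gender_shares = pd.crosstab(
    sample_df["Platform"], sample_df["Gender"], normalize="index"
)
print(platform_gender_shares.round(2))
```

With the real data, `pd.crosstab(social_df['Platform'], social_df['Gender'])` would also make it clear which gender is absent on the platforms that report only two unique values.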



## Section 2: Exploratory Data Analysis

## Section 3: Model Selection and Analysis