# Final project - Analysis of the impact of social media on academic grades

Authors - Artush Darchinyan, Alex Petrosyan

### Visualization Canvas
- #### Story
This research is based on the **Students Social Media Addiction Dataset** from Kaggle.  
The aim is to analyze how social media usage relates to academic performance, along with other factors in the dataset such as sleep, mental health, and addiction scores.  
Our initial hypothesis is that the use of different social media platforms affects students' academic performance.

- #### Audience
This study will be valuable for **universities** and **research organizations** interested in understanding the impact of social media usage on academic performance.

- #### Data
The data was sourced from **Kaggle** and is stored in a **CSV file**.

- #### Tools
We chose **Dash** for its flexibility, seamless Python integration, and ability to build highly customizable, interactive dashboards with real-time updates.  
Additionally, we used **R Shiny** for its novelty and complementary visualization capabilities.

- #### Dataset Link
*https://www.kaggle.com/datasets/adilshamim8/social-media-addiction-vs-relationships*

In [1]:
# !pip install dash pandas plotly
import pandas as pd
import numpy as np
import dash
from dash import dcc, html, Input, Output, dash_table
import plotly.express as px
import plotly.graph_objects as go

In [2]:
df = pd.read_csv('Students Social Media Addiction.csv')

In [3]:
df.head()

,Student_ID,Age,Gender,Academic_Level,Country,Avg_Daily_Usage_Hours,Most_Used_Platform,Affects_Academic_Performance,Sleep_Hours_Per_Night,Mental_Health_Score,Relationship_Status,Conflicts_Over_Social_Media,Addicted_Score
0,1,19,Female,Undergraduate,Bangladesh,5.2,Instagram,Yes,6.5,6,In Relationship,3,8
1,2,22,Male,Graduate,India,2.1,Twitter,No,7.5,8,Single,0,3
2,3,20,Female,Undergraduate,USA,6.0,TikTok,Yes,5.0,5,Complicated,4,9
3,4,18,Male,High School,UK,3.0,YouTube,No,7.0,7,Single,1,4
4,5,21,Male,Graduate,Canada,4.5,Facebook,Yes,6.0,6,In Relationship,2,7


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 705 entries, 0 to 704
Data columns (total 13 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Student_ID                    705 non-null    int64  
 1   Age                           705 non-null    int64  
 2   Gender                        705 non-null    object 
 3   Academic_Level                705 non-null    object 
 4   Country                       705 non-null    object 
 5   Avg_Daily_Usage_Hours         705 non-null    float64
 6   Most_Used_Platform            705 non-null    object 
 7   Affects_Academic_Performance  705 non-null    object 
 8   Sleep_Hours_Per_Night         705 non-null    float64
 9   Mental_Health_Score           705 non-null    int64  
 10  Relationship_Status           705 non-null    object 
 11  Conflicts_Over_Social_Media   705 non-null    int64  
 12  Addicted_Score                705 non-null    int64  
dtypes: fl

In [5]:
df.set_index('Student_ID', inplace=True)

In [6]:
df.head()

Student_ID,Age,Gender,Academic_Level,Country,Avg_Daily_Usage_Hours,Most_Used_Platform,Affects_Academic_Performance,Sleep_Hours_Per_Night,Mental_Health_Score,Relationship_Status,Conflicts_Over_Social_Media,Addicted_Score
1,19,Female,Undergraduate,Bangladesh,5.2,Instagram,Yes,6.5,6,In Relationship,3,8
2,22,Male,Graduate,India,2.1,Twitter,No,7.5,8,Single,0,3
3,20,Female,Undergraduate,USA,6.0,TikTok,Yes,5.0,5,Complicated,4,9
4,18,Male,High School,UK,3.0,YouTube,No,7.0,7,Single,1,4
5,21,Male,Graduate,Canada,4.5,Facebook,Yes,6.0,6,In Relationship,2,7


In [7]:
unique_values = {col: df[col].unique() for col in df.columns}
unique_values

{'Age': array([19, 22, 20, 18, 21, 23, 24], dtype=int64),
 'Gender': array(['Female', 'Male'], dtype=object),
 'Academic_Level': array(['Undergraduate', 'Graduate', 'High School'], dtype=object),
 'Country': array(['Bangladesh', 'India', 'USA', 'UK', 'Canada', 'Australia',
        'Germany', 'Brazil', 'Japan', 'South Korea', 'France', 'Spain',
        'Italy', 'Mexico', 'Russia', 'China', 'Sweden', 'Norway',
        'Denmark', 'Netherlands', 'Belgium', 'Switzerland', 'Austria',
        'Portugal', 'Greece', 'Ireland', 'New Zealand', 'Singapore',
        'Malaysia', 'Thailand', 'Vietnam', 'Philippines', 'Indonesia',
        'Taiwan', 'Hong Kong', 'Turkey', 'Israel', 'UAE', 'Egypt',
        'Morocco', 'South Africa', 'Nigeria', 'Kenya', 'Ghana',
        'Argentina', 'Chile', 'Colombia', 'Peru', 'Venezuela', 'Ecuador',
        'Uruguay', 'Paraguay', 'Bolivia', 'Costa Rica', 'Panama',
        'Jamaica', 'Trinidad', 'Bahamas', 'Iceland', 'Finland', 'Poland',
        'Romania', 'Hungary', 'C

In [8]:
country_name_corrections = {
    'USA': 'United States',
    'UK': 'United Kingdom',
    'Trinidad': 'Trinidad and Tobago',
    'South Korea': 'Korea, South',
    'Russia': 'Russian Federation',
    'UAE': 'United Arab Emirates',
    'Hong Kong': 'Hong Kong SAR China',
    'North Macedonia': 'North Macedonia',
    'Kosovo': 'Kosovo',
    'Bosnia': 'Bosnia and Herzegovina',
    'Czech Republic': 'Czechia',
    'Vatican City': 'Vatican',
    'Taiwan': 'Taiwan',
    'South Africa': 'South Africa',
    'Iran': 'Iran, Islamic Republic of',
    'Syria': 'Syrian Arab Republic',
    'Laos': "Lao People's Democratic Republic",
    'Moldova': 'Republic of Moldova',
    'Palestine': 'Palestine',
    'Macau': 'Macao SAR China',
    'Bolivia': 'Bolivia',
    'Venezuela': 'Venezuela (Bolivarian Republic of)',
    'Micronesia': 'Micronesia (Federated States of)',
    'Brunei': 'Brunei Darussalam',
    'Libya': 'Libya',
    'Cape Verde': 'Cabo Verde',
    'Swaziland': 'Eswatini',
    'Burma': 'Myanmar',
    'East Timor': 'Timor-Leste'
}
df['Country'] = df['Country'].replace(country_name_corrections)
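
The replacement step above relies on `Series.replace` with a dictionary, which rewrites exact matches and leaves all other values untouched. A minimal sketch of that behavior, assuming a two-entry subset of the mapping and the five country values shown by `df.head()`; the names `sample`, `corrections`, and `changed` are illustrative and not part of the notebook:

```python
import pandas as pd

# Country values taken from the five rows displayed by df.head().
sample = pd.DataFrame({'Country': ['Bangladesh', 'India', 'USA', 'UK', 'Canada']})

# Illustrative two-entry subset of country_name_corrections.
corrections = {'USA': 'United States', 'UK': 'United Kingdom'}

# Exact matches are replaced; values without a mapping stay as they are.
sample['Country_clean'] = sample['Country'].replace(corrections)

# Rows whose country name was actually rewritten.
changed = sample[sample['Country'] != sample['Country_clean']]
print(changed)
```

Comparing the column before and after the replacement, as `changed` does, is one way to confirm which names the mapping really touches on the full dataset.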

In [9]:
df.describe().T

,count,mean,std,min,25%,50%,75%,max
Age,705.0,20.659574,1.399217,18.0,19.0,21.0,22.0,24.0
Avg_Daily_Usage_Hours,705.0,4.918723,1.257395,1.5,4.1,4.8,5.8,8.5
Sleep_Hours_Per_Night,705.0,6.868936,1.126848,3.8,6.0,6.9,7.7,9.6
Mental_Health_Score,705.0,6.22695,1.105055,4.0,5.0,6.0,7.0,9.0
Conflicts_Over_Social_Media,705.0,2.849645,0.957968,0.0,2.0,3.0,4.0,5.0
Addicted_Score,705.0,6.436879,1.587165,2.0,5.0,7.0,8.0,9.0
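
Summary statistics describe each column in isolation. To see how the numeric columns move together, a pairwise correlation matrix could be computed. The sketch below is illustrative only: it uses the five rows shown by `df.head()` as stand-in data (the name `sample` is an assumption), so the resulting coefficients say nothing reliable about the full 705-row dataset.

```python
import pandas as pd

# Numeric columns from the five rows displayed by df.head(); a stand-in for df.
sample = pd.DataFrame({
    'Avg_Daily_Usage_Hours': [5.2, 2.1, 6.0, 3.0, 4.5],
    'Sleep_Hours_Per_Night': [6.5, 7.5, 5.0, 7.0, 6.0],
    'Mental_Health_Score': [6, 8, 5, 7, 6],
    'Addicted_Score': [8, 3, 9, 4, 7],
})

# Pearson correlation between every pair of numeric columns.
corr = sample.corr()
print(corr.round(2))
```

On the real data, the same call would be `df.select_dtypes('number').corr()`.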


In [10]:
df.describe(include=['object'])

,Gender,Academic_Level,Country,Most_Used_Platform,Affects_Academic_Performance,Relationship_Status
count,705,705,705,705,705,705
unique,2,3,110,12,2,3
top,Female,Undergraduate,India,Instagram,Yes,Single
freq,353,353,53,249,453,384
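
One way to start probing the initial hypothesis is to compare the share of "Yes" answers in `Affects_Academic_Performance` across platforms, and the average usage hours between the two answer groups. This is a rough sketch rather than the analysis itself: it uses only the five rows shown by `df.head()` as stand-in data, and the names `sample`, `impact_by_platform`, and `usage_by_impact` are illustrative.

```python
import pandas as pd

# The five rows displayed by df.head(); far too few to draw conclusions from.
sample = pd.DataFrame({
    'Most_Used_Platform': ['Instagram', 'Twitter', 'TikTok', 'YouTube', 'Facebook'],
    'Affects_Academic_Performance': ['Yes', 'No', 'Yes', 'No', 'Yes'],
    'Avg_Daily_Usage_Hours': [5.2, 2.1, 6.0, 3.0, 4.5],
})

# Percentage of "Yes" answers per platform.
impact_by_platform = (
    sample.groupby('Most_Used_Platform')['Affects_Academic_Performance']
    .apply(lambda s: (s == 'Yes').mean() * 100)
    .sort_values(ascending=False)
)

# Mean daily usage hours, split by self-reported academic impact.
usage_by_impact = sample.groupby('Affects_Academic_Performance')['Avg_Daily_Usage_Hours'].mean()

print(impact_by_platform)
print(usage_by_impact)
```

Replacing `sample` with the full `df` would apply the same grouping to all 705 students.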


In [11]:
app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("Social media impact analysis dashboard",
            style={'textAlign': 'center', 'marginBottom': 30, 'color': '#2c3e50'}),

    # Control panel
    html.Div([
        html.Div([
            html.Label("Select Platform:", style={'fontWeight': 'bold'}),
            dcc.Dropdown(
                id='platform-dropdown',
                options=[{'label': platform, 'value': platform} for platform in df['Most_Used_Platform'].unique()],
                value=df['Most_Used_Platform'].unique().tolist(),
                multi=True,
                style={'marginBottom': 10}
            )
        ], className='four columns'),

        html.Div([
            html.Label("Select Academic Level:", style={'fontWeight': 'bold'}),
            dcc.Dropdown(
                id='academic-dropdown',
                options=[{'label': level, 'value': level} for level in df['Academic_Level'].unique()],
                value=df['Academic_Level'].unique().tolist(),
                multi=True,
                style={'marginBottom': 10}
            )
        ], className='four columns'),

        html.Div([
            html.Label("Usage Hours Range:", style={'fontWeight': 'bold'}),
            dcc.RangeSlider(
                id='usage-slider',
                min=df['Avg_Daily_Usage_Hours'].min(),
                max=df['Avg_Daily_Usage_Hours'].max(),
                step=0.5,
                marks={i: f'{i}h' for i in range(int(df['Avg_Daily_Usage_Hours'].min()), int(df['Avg_Daily_Usage_Hours'].max()) + 1, 2)},
                value=[df['Avg_Daily_Usage_Hours'].min(), df['Avg_Daily_Usage_Hours'].max()],
                tooltip={"placement": "bottom", "always_visible": True}
            )
        ], className='four columns')
    ], className='row', style={'marginBottom': 30, 'padding': '20px', 'backgroundColor': '#f8f9fa', 'borderRadius': '10px'}),

    # Key metrics
    html.Div([
        html.Div([
            html.H3(id='avg-usage', style={'color': '#e74c3c', 'textAlign': 'center'}),
            html.P("Avg Daily Usage", style={'textAlign': 'center', 'fontWeight': 'bold'})
        ], className='three columns', style={'backgroundColor': '#fff', 'padding': '20px', 'borderRadius': '5px',
                                              'boxShadow': '0 2px 4px rgba(0,0,0,0.1)'}),

        html.Div([
            html.H3(id='academic-impact', style={'color': '#f39c12', 'textAlign': 'center'}),
            html.P("Academic Impact %", style={'textAlign': 'center', 'fontWeight': 'bold'})
        ], className='three columns', style={'backgroundColor': '#fff', 'padding': '20px', 'borderRadius': '5px',
                                              'boxShadow': '0 2px 4px rgba(0,0,0,0.1)'}),

        html.Div([
            html.H3(id='avg-mental-health', style={'color': '#27ae60', 'textAlign': 'center'}),
            html.P("Avg Mental Health", style={'textAlign': 'center', 'fontWeight': 'bold'})
        ], className='three columns', style={'backgroundColor': '#fff', 'padding': '20px', 'borderRadius': '5px',
                                              'boxShadow': '0 2px 4px rgba(0,0,0,0.1)'}),

        html.Div([
            html.H3(id='avg-addiction', style={'color': '#8e44ad', 'textAlign': 'center'}),
            html.P("Avg Addiction Score", style={'textAlign': 'center', 'fontWeight': 'bold'})
        ], className='three columns', style={'backgroundColor': '#fff', 'padding': '20px', 'borderRadius': '5px',
                                              'boxShadow': '0 2px 4px rgba(0,0,0,0.1)'})
    ], className='row', style={'marginBottom': 30}),

    # First row
    html.Div([
        html.Div([
            dcc.Graph(id='usage-vs-sleep')
        ], className='six columns'),

        html.Div([
            dcc.Graph(id='platform-usage-bar')
        ], className='six columns')
    ], className='row', style={'marginBottom': 20}),

    # Second row
    html.Div([
        html.Div([
            dcc.Graph(id='addiction-heatmap')
        ], className='six columns'),

        html.Div([
            dcc.Graph(id='conflicts-scatter')
        ], className='six columns')
    ], className='row', style={'marginBottom': 20}),

    # Third row (map)
    html.Div([
        html.Div([
            dcc.Graph(id='addiction-map')
        ], className='twelve columns')
    ], className='row', style={'marginBottom': 20}),

    # Summary table
    html.Div([
        html.H3("Detailed Data Summary", style={'color': '#2c3e50'}),
        dash_table.DataTable(
            id='summary-table',
            page_size=15,
            style_cell={'textAlign': 'left', 'padding': '10px'},
            style_header={'backgroundColor': '#3498db', 'color': 'white', 'fontWeight': 'bold'},
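            # The summary table exposes 'Academic_Impact_%' rather than
            # 'Affects_Academic_Performance', so the highlight rule targets that column.
            # The 50% threshold is an assumption and can be tuned.
            style_data_conditional=[
                {
                    'if': {'filter_query': '{Academic_Impact_%} > 50'},
                    'backgroundColor': '#ffebee',
                }
            ]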
        )
    ], style={'marginTop': 30, 'backgroundColor': '#fff', 'padding': '20px', 'borderRadius': '10px'})
])

@app.callback(
    [Output('avg-usage', 'children'),
     Output('academic-impact', 'children'),
     Output('avg-mental-health', 'children'),
     Output('avg-addiction', 'children'),
     Output('usage-vs-sleep', 'figure'),
     Output('platform-usage-bar', 'figure'),
     Output('addiction-map', 'figure'),
     Output('addiction-heatmap', 'figure'),
     Output('conflicts-scatter', 'figure'),
     Output('summary-table', 'data'),
     Output('summary-table', 'columns')],
    [Input('platform-dropdown', 'value'),
     Input('academic-dropdown', 'value'),
     Input('usage-slider', 'value')]
)
def update_dashboard(selected_platforms, selected_academic, usage_range):
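    # Guard against an empty dropdown selection (None or []).
    filtered_df = df[
        (df['Most_Used_Platform'].isin(selected_platforms or [])) &
        (df['Academic_Level'].isin(selected_academic or [])) &
        (df['Avg_Daily_Usage_Hours'] >= usage_range[0]) &
        (df['Avg_Daily_Usage_Hours'] <= usage_range[1])
    ]

    # Keep the previous view when no rows match, instead of showing "nan" metrics.
    if filtered_df.empty:
        raise dash.exceptions.PreventUpdate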

    avg_usage = f"{filtered_df['Avg_Daily_Usage_Hours'].mean():.1f}h"
    academic_impact = f"{(filtered_df['Affects_Academic_Performance'] == 'Yes').mean() * 100:.1f}%"
    avg_mental_health = f"{filtered_df['Mental_Health_Score'].mean():.1f}/10"
    avg_addiction = f"{filtered_df['Addicted_Score'].mean():.1f}/10"

    # Chart 1
    fig_usage_sleep = px.scatter(
        filtered_df,
        x='Avg_Daily_Usage_Hours',
        y='Sleep_Hours_Per_Night',
        color='Mental_Health_Score',
        size='Addicted_Score',
        hover_data=['Age', 'Most_Used_Platform'],
        title='Daily usage vs sleep hours',
        color_continuous_scale='RdYlBu_r'
    )
    fig_usage_sleep.update_layout(height=400)

    # Chart 2
    platform_data = (
        filtered_df.groupby('Most_Used_Platform')
        .agg({
            'Avg_Daily_Usage_Hours': 'mean',
            'Mental_Health_Score': 'mean'
        })
        .reset_index()
        .sort_values(by='Avg_Daily_Usage_Hours', ascending=False)
    )

    fig_platform = px.bar(
        platform_data,
        x='Most_Used_Platform',
        y='Avg_Daily_Usage_Hours',
        color='Mental_Health_Score',
        title='Average usage hours by platform',
        color_continuous_scale='viridis'
    )
    fig_platform.update_layout(height=400, xaxis_tickangle=-45)

    # Chart 3 - Map
    map_data = filtered_df.groupby('Country')['Addicted_Score'].mean().reset_index()
    fig_map = px.choropleth(
        map_data,
        locations='Country',
        locationmode='country names',
        color='Addicted_Score',
        color_continuous_scale='OrRd',
        title='Average addiction score by country'
    )
    fig_map.update_layout(height=450)

    # Chart 4
    heatmap_data = filtered_df.pivot_table(
        values='Addicted_Score',
        index='Academic_Level',
        columns='Relationship_Status',
        aggfunc='mean'
    )

    fig_heatmap = px.imshow(
        heatmap_data,
        title='Average addiction score by academic level and relationship status',
        color_continuous_scale='Reds',
        aspect='auto'
    )
    fig_heatmap.update_layout(height=400)

    # Chart 5
    fig_conflicts = px.scatter(
        filtered_df,
        x='Addicted_Score',
        y='Conflicts_Over_Social_Media',
        color='Relationship_Status',
        size='Avg_Daily_Usage_Hours',
        title='Addiction vs social media conflicts',
        hover_data=['Age', 'Mental_Health_Score']
    )
    fig_conflicts.update_layout(height=400)

    # Summary table
    summary_data = filtered_df.groupby(['Most_Used_Platform', 'Academic_Level']).agg({
        'Avg_Daily_Usage_Hours': 'mean',
        'Mental_Health_Score': 'mean',
        'Addicted_Score': 'mean',
        'Sleep_Hours_Per_Night': 'mean',
        'Affects_Academic_Performance': lambda x: (x == 'Yes').mean() * 100
    }).round(2).reset_index()

    summary_data.columns = ['Platform', 'Academic_Level', 'Avg_Usage_Hours', 'Mental_Health', 'Addiction_Score', 'Sleep_Hours', 'Academic_Impact_%']
    table_columns = [{"name": i, "id": i} for i in summary_data.columns]
    table_data = summary_data.to_dict('records')

    return (avg_usage, academic_impact, avg_mental_health, avg_addiction,
            fig_usage_sleep, fig_platform, fig_map,
            fig_heatmap, fig_conflicts, table_data, table_columns)

# CSS styling
app.index_string = '''
<!DOCTYPE html>
<html>
    <head>
        {%metas%}
        <title>{%title%}</title>
        {%favicon%}
        {%css%}
        <link rel="stylesheet" href="https://codepen.io/chriddyp/pen/bWLwgP.css">
        <style>
            body { font-family: 'Arial', sans-serif; background-color: #f5f6fa; }
            .row { margin-bottom: 20px; }
        </style>
    </head>
    <body>
        {%app_entry%}
        <footer>
            {%config%}
            {%scripts%}
            {%renderer%}
        </footer>
    </body>
</html>
'''

if __name__ == '__main__':
    app.run(debug=True)
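
The filtering and the "Academic Impact %" metric inside `update_dashboard` can be checked without starting the Dash server by pulling them out into plain functions. The sketch below is one possible way to do that, not part of the app above: `filter_students` and `academic_impact_pct` are hypothetical helper names, and the sample data is the five rows shown by `df.head()`.

```python
import pandas as pd


def filter_students(data, platforms, levels, usage_range):
    """Hypothetical helper mirroring the row filter used in update_dashboard."""
    low, high = usage_range
    mask = (
        data['Most_Used_Platform'].isin(platforms)
        & data['Academic_Level'].isin(levels)
        # between() is inclusive on both ends, like the >= / <= pair in the callback.
        & data['Avg_Daily_Usage_Hours'].between(low, high)
    )
    return data[mask]


def academic_impact_pct(data):
    """Percentage of rows answering 'Yes'; None when no rows are selected."""
    if data.empty:
        return None
    return (data['Affects_Academic_Performance'] == 'Yes').mean() * 100


# The five rows displayed by df.head(), used as stand-in data.
sample = pd.DataFrame({
    'Academic_Level': ['Undergraduate', 'Graduate', 'Undergraduate', 'High School', 'Graduate'],
    'Avg_Daily_Usage_Hours': [5.2, 2.1, 6.0, 3.0, 4.5],
    'Most_Used_Platform': ['Instagram', 'Twitter', 'TikTok', 'YouTube', 'Facebook'],
    'Affects_Academic_Performance': ['Yes', 'No', 'Yes', 'No', 'Yes'],
})

undergrads = filter_students(
    sample, ['Instagram', 'TikTok', 'Twitter'], ['Undergraduate'], (1.5, 8.5)
)
nobody = filter_students(sample, [], ['Undergraduate'], (1.5, 8.5))

print(len(undergrads), academic_impact_pct(undergrads))
print(len(nobody), academic_impact_pct(nobody))
```

Returning `None` for an empty selection makes the "no data" case explicit, so the caller can decide what to display instead of formatting a `nan`.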