# Antidepressant Use in Children and Teens: Scandinavia

Title: Antidepressant Use in Children and Teens: Scandinavia

Author: Grace Rowan 

Last Modified: March 11, 2025

Description: DATA 237 Final Project Analysis Notebook


## Motivation

Antidepressants are a first-line treatment for children and teenagers suffering from a variety of mental health problems, including anxiety disorders and depression. Given their generally low cost, some worry that antidepressants are overprescribed, or incorrectly prescribed in cases where other treatments for mental health disorders may be more effective. This is particularly relevant in Scandinavia, which has some of the highest rates of antidepressant use in the world. Overprescription poses risks to young people, as antidepressant use can cause rare but severe side effects in children and teens that are not present in adult users. The causes of these side effects in young people are not well understood, indicating the need for more rigorous research into the impact of these medications on vulnerable populations.

In this project, I seek to better understand the rates of antidepressant prescription among young people in Scandinavia. Can it be determined through existing literature and the obtained data whether the trends in antidepressant usage in young people in these countries are cause for concern, especially regarding first-line versus second- and third-line treatments? How do these trends vary by country and age group, and how do they correspond to the type of antidepressant prescribed? Given that Denmark, Norway, and Sweden have relatively comparable demographics and health care systems, why might these differences exist among them?


## Data Preparation

To start, I load in the necessary packages and read in the data. In addition to the drug use data, I obtained a json map of Scandinavia to use in my analysis. 

In [None]:
import pandas as pd
import numpy as np
import altair as alt
import geojson as gjs
import geopandas as gpd
import json

In [2]:
census = pd.read_csv("/Users/gracerowan/DATA_237/my_assignments/archive_anti/census.csv")
drug_names = pd.read_csv("/Users/gracerowan/DATA_237/my_assignments/archive_anti/drug_names.csv")
drug_use = pd.read_csv("/Users/gracerowan/DATA_237/my_assignments/archive_anti/drug_use.csv")
scandinavia = gpd.read_file("scandinavia.topo.json")
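
Because the later cells rely on specific columns (`year`, `country`, `sex`, `age`, and `cnt` in the census table; `atc`, `drug_group`, and `nusers` in the drug use table), a quick check after loading can catch a misnamed column early. The sketch below is only an illustration: `missing_columns` is a hypothetical helper that is not part of the original notebook, and the one-row table is invented.

```python
import pandas as pd


def missing_columns(df, required):
    """Hypothetical helper: return the required column names absent from df."""
    return [col for col in required if col not in df.columns]


# Invented one-row stand-in for the census table.
toy_census = pd.DataFrame(
    {"year": [2017], "country": ["A"], "sex": ["F"], "age": ["5-9"], "cnt": [1000]}
)

# An empty list means every required column is present.
print(missing_columns(toy_census, ["year", "country", "sex", "age", "cnt"]))
# A non-empty list names the columns that later cells would fail on.
print(missing_columns(toy_census, ["year", "country", "nusers"]))
```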

## Usage by Country

First, I analyzed the drug use data by country. I computed and visualized the percentage of young people in each country who use antidepressants. I chose a stacked bar chart for easy readability, and I used a standard color scheme with nameable colors and high contrast to differentiate between countries.

Steps: 
- Compute population/antidepressant usage totals
    - compute census totals by country 
    - filter antidepressant data frame by atc = 'N06A' and compute totals by country 
    - merge and compute percentage: ((ad_percentages['nusers'] / ad_percentages['cnt']) * 100)
    - visualize results
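
The merge-and-divide step above can be checked on a tiny made-up table. This sketch uses the same column names as the notebook (`year`, `country`, `nusers`, `cnt`), but the country labels `A` and `B` and all of the numbers are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy inputs: every value here is made up for illustration only.
toy_users = pd.DataFrame(
    {"year": [2017, 2017], "country": ["A", "B"], "nusers": [50, 120]}
)
toy_census = pd.DataFrame(
    {"year": [2017, 2017], "country": ["A", "B"], "cnt": [1000, 2000]}
)

# Merge on the shared keys so each row holds both users and population.
toy = toy_users.merge(toy_census, on=["year", "country"])

# Share of the population using antidepressants, expressed as a percentage.
toy["Percent Users"] = toy["nusers"] / toy["cnt"] * 100
print(toy)
```

In this toy case, 50 users out of 1,000 people is 5 percent and 120 out of 2,000 is 6 percent, so country `B` has the higher rate even though the comparison is not obvious from raw counts alone.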


In [3]:
census_totals = census.groupby(['year','country'])['cnt'].sum().reset_index()

In [4]:
antidepressants = drug_use.loc[drug_use['atc']=='N06A']
antidepressants_grouped = antidepressants.groupby(['year', 'country'])['nusers'].sum().reset_index()

In [5]:
ad_percentages = antidepressants_grouped.merge(census_totals, on=['year', 'country'])
ad_percentages['Percent Users'] = (ad_percentages['nusers'] / ad_percentages['cnt']) * 100

In [6]:
alt.Chart(ad_percentages, width=400, height=alt.Step(8)).mark_bar(size=20).encode(
    alt.Y('Percent Users').title("Percent of Country Population"),
    alt.X("year", axis=alt.Axis(
            format='d',  
            values=[2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017],  
            labelExpr="datum.value")).title("Year"),
    alt.Color("country").title("Country").legend(orient="bottom", titleOrient="left"),
    
).properties(
    title='Percentage of Young Antidepressant Users Per Year by Country'
)

I also visualized the data geographically by joining the drug use data to the JSON map by country, using the data from 2017. I chose a continuous color scheme in a green hue to represent the continuous data with an approachable, neutral color.

Steps: 
- Year filtering
    - filter data to year = 2017
- Mapping 
    - locate country centroids to apply labels
    - merge map and percentage data 
    - combine map layers to make visualization 

In [None]:
ad_percentages_2017 = ad_percentages.loc[ad_percentages['year'] == 2017]

In [8]:
scandinavia= scandinavia.rename(columns={'id':'country'})
mapping_percent = scandinavia.merge(ad_percentages_2017, on=['country'])

mapping_percent["lon"] = mapping_percent.geometry.centroid.x
mapping_percent["lat"] = mapping_percent.geometry.centroid.y
labels = mapping_percent[["name", "lon", "lat"]]

map = alt.Chart(mapping_percent).mark_geoshape(
    stroke='black',
    strokeWidth=1
).encode(
    color=alt.Color('Percent Users', scale=alt.Scale(scheme="yellowgreen", domain=[ad_percentages_2017['Percent Users'].min(), ad_percentages_2017['Percent Users'].max()], clamp=True))
).properties(
    title="Young Antidepressant Users", 
    width = 350, 
    height = 415
).project(
    type='mercator'
)

text_shadow = alt.Chart(labels).mark_text(
    font='Arial',
    fontSize=14,
    color='white',  
    fontWeight='bold',
    align='center',
    dy=-10,  
    stroke='white',  
    strokeWidth=3,  
    opacity=0.7  
).encode(
    longitude='lon:Q',
    latitude='lat:Q',
    text='name:N'
)

text_chart = alt.Chart(labels).mark_text(
    font='Arial',
    fontSize=12,
    color='black',
    fontWeight='bold',
    align='center',
    dy=-10
).encode(
    longitude='lon:Q',
    latitude='lat:Q',
    text='name:N'
)

final_chart = map + text_shadow + text_chart
final_chart

## Usage by Drug Group 

Next, I analyzed and visualized the data by drug group. The four groups are SSRI, TCA, MAOI, and Other. I performed some data cleaning to organize the antidepressants into these groups and proceeded to visualize the data in a stacked bar chart by number of users of each drug type. I thought that the number of users was an appropriate metric here since I am not comparing across countries/populations, and simply wanted a visual reference point for the usage by group. I also converted to percentages in another chart but did not include this in my report to avoid redundancy.

Steps: 

- Data cleaning 
    - keep only the rows belonging to the 4 classes, mapping drug types 'N06AF' and 'N06AG' to 'MAOI'
    - group by year and drug_group
- Visualize by year
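
The recoding step can be sketched on a small made-up frame. The example assumes, as the notebook's filter does, that MAOI drugs appear in `drug_group` under the codes `N06AF` and `N06AG`; the `Hypothetical` label and all counts are invented. Taking an explicit `.copy()` of the filtered frame before assigning to it is one way to avoid pandas' `SettingWithCopyWarning`.

```python
import pandas as pd

# Toy data: counts are invented, and "Hypothetical" stands in for any
# group that is not one of the antidepressant classes.
toy = pd.DataFrame(
    {
        "year": [2017] * 5,
        "drug_group": ["SSRI", "TCA", "N06AF", "N06AG", "Hypothetical"],
        "nusers": [100, 10, 1, 2, 50],
    }
)

# Keep only the antidepressant classes; copy so later assignment is safe.
keep = ["TCA", "SSRI", "N06AF", "N06AG", "Other"]
toy_groups = toy.loc[toy["drug_group"].isin(keep)].copy()

# Fold the two MAOI codes into a single "MAOI" label.
toy_groups["drug_group"] = toy_groups["drug_group"].replace(
    {"N06AF": "MAOI", "N06AG": "MAOI"}
)

# Total users per year and drug group.
toy_totals = toy_groups.groupby(["year", "drug_group"], as_index=False)["nusers"].sum()
print(toy_totals)
```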



In [9]:

# Copy the filtered rows so the recoding below does not trigger SettingWithCopyWarning
drug_groups = drug_use.loc[drug_use['drug_group'].isin(['TCA','SSRI', 'N06AF', 'N06AG', 'Other'])].copy()
drug_groups.loc[drug_groups['drug_group'].isin(['N06AF', 'N06AG']), 'drug_group'] = 'MAOI'

drug_groups_grouped = drug_groups.groupby(['year', 'drug_group'])['nusers'].sum().reset_index()


alt.Chart(drug_groups_grouped, width=400, height=alt.Step(8)).mark_bar(size=20).encode(
    alt.Y("nusers").title("Number of Users"),
    alt.X("year", axis=alt.Axis(
            format='d',  
            values=[2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017],  
            labelExpr="datum.value")).title("Year"),
    alt.Color("drug_group").title("Drug Group").legend(orient="bottom", titleOrient="left"),
    
).properties(
    title='Antidepressant Usage per Year by Drug Type'
)

In [10]:
census_totals_small = census.groupby('year')['cnt'].sum().reset_index()

drug_groups_grouped.head()

type_percentages = drug_groups_grouped.merge(census_totals_small, on='year')
# Multiply by 100 so the values match the "Percent of Users" axis title
type_percentages['percent_users'] = (type_percentages['nusers'] / type_percentages['cnt']) * 100

alt.Chart(type_percentages, width=400, height=alt.Step(8)).mark_bar(size=20).encode(
    alt.Y("percent_users").title("Percent of Users"),
    alt.X("year", axis=alt.Axis(
            format='d',  
            values=[2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017],  
            labelExpr="datum.value")).title("Year"),
    alt.Color("drug_group").title("Drug Group").legend(orient="bottom", titleOrient="left"),
    
).properties(
    title='Antidepressant Usage per Year by Drug Type'
)

## Usage by Drug Type and Country

As a supplemental visualization, I created 4 maps that show the usage of different drug groups by country. This summarizes the findings well, but I did not include it in my report to save space. 

In [11]:
drug_groups_countries = drug_groups.groupby(['year', 'drug_group', 'country'])['nusers'].sum().reset_index()
drug_groups_countries

drug_groups_countries_percentages = drug_groups_countries.merge(census_totals, on=['year', 'country'])
drug_groups_countries_percentages['Proportion Users'] = drug_groups_countries_percentages['nusers'] / drug_groups_countries_percentages['cnt']

drug_groups_countries_percentages_2017 = drug_groups_countries_percentages.loc[drug_groups_countries_percentages['year'] == 2017]

In [12]:
types = ["MAOI", "SSRI", "Other", "TCA"]

min_value = drug_groups_countries_percentages_2017['Proportion Users'].min()
max_value = drug_groups_countries_percentages_2017['Proportion Users'].max()

for i in types: 

    drug = drug_groups_countries_percentages_2017.loc[drug_groups_countries_percentages_2017['drug_group'] == i]
    mapping_percent = scandinavia.merge(drug, on=['country'])

    map = alt.Chart(mapping_percent).mark_geoshape(
        stroke='black',
        strokeWidth=1
    ).encode(
        color=alt.Color('Proportion Users', scale=alt.Scale(scheme="yellowgreen", domain=[min_value, max_value], clamp=True))
    ).properties(
        title=f"Proportion {i} Users", 
        width = 350, 
        height = 415
    ).project(
        type='mercator'
    )

    text_shadow = alt.Chart(labels).mark_text(
        font='Arial',
        fontSize=14,
        color='white',  
        fontWeight='bold',
        align='center',
        dy=-10,  
        stroke='white',  
        strokeWidth=3,  
        opacity=0.7  
    ).encode(
        longitude='lon:Q',
        latitude='lat:Q',
        text='name:N'
    )

    text_chart = alt.Chart(labels).mark_text(
        font='Arial',
        fontSize=12,
        color='black',
        fontWeight='bold',
        align='center',
        dy=-10
    ).encode(
        longitude='lon:Q',
        latitude='lat:Q',
        text='name:N'
    )

    final_chart = map + text_shadow + text_chart
    
    final_chart.show()


## Usage by Age Group

Lastly, I analyzed drug usage by age group, looking at both age group by country and by drug group. 

I created a facet plot of stacked bar charts that shows the trend of usage over time by country, with the segmented bars showing age group differences. 

Steps: 
- Create df_grouped 
    - Groupby year, age group, country, drug group
- Compute normalized users
    - merge with census_totals, df_grouped['norm_nusers'] = nusers / cnt (users divided by the census count for that country and year)

Next, I created a similar facet plot of stacked bar charts to show the progression over time of usage in each drug group, segmented by age group. I omitted MAOI from the final version of the visualization because the results were so small they were not visible in the plot. I normalized the results of this plot.

Steps: 
- Calculate Total Users per Year
    - group_totals aggregates the total number of users (nusers) per year across all drug groups.
- Merge Data to Compute Normalized Users
    - The df_grouped is merged with group_totals on "year," allowing for the calculation of normalized values (norm_nusers = nusers / nusers_total).
- Filter Out MAOI
    - MAOI drug group and the youngest age category ("5-9") are removed due to low representation
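
The normalization and filtering described in these steps can be sketched as follows. This is an illustration on invented numbers rather than the notebook's exact code: the column names `nusers`, `nusers_total`, and `norm_nusers` follow the text, and the age labels other than "5-9" are placeholders.

```python
import pandas as pd

# Toy data: all counts are invented; "10-14" and "15-19" are placeholder labels.
toy = pd.DataFrame(
    {
        "year": [2017, 2017, 2017, 2017],
        "age": ["5-9", "10-14", "15-19", "15-19"],
        "drug_group": ["SSRI", "SSRI", "SSRI", "MAOI"],
        "nusers": [10, 30, 55, 5],
    }
)

# Total users per year across all drug groups and ages.
totals = toy.groupby("year", as_index=False)["nusers"].sum()

# Merge the totals back; the overlapping column is renamed to nusers_total.
merged = toy.merge(totals, on="year", suffixes=("", "_total"))
merged["norm_nusers"] = merged["nusers"] / merged["nusers_total"]

# Drop MAOI and the youngest age group, building both masks from the same frame.
mask = (merged["drug_group"] != "MAOI") & (merged["age"] != "5-9")
filtered = merged.loc[mask]
print(filtered)
```

Building both conditions from `merged` itself keeps the boolean mask aligned with the rows being filtered, which avoids relying on index alignment between two different frames.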


In [13]:
df = drug_groups.merge(census, on=['year', 'sex', 'age', 'country'])

df_grouped = (
    df.groupby(["year", "age", "country", "drug_group"], as_index=False)
    .agg({"nusers": "sum"})
)

df_grouped = df_grouped.merge(census_totals, on=["year", "country"])
df_grouped['norm_nusers'] = df_grouped['nusers']/df_grouped["cnt"]

chart = (
    alt.Chart(df_grouped)
    .mark_bar()
    .encode(
        alt.X("year:O"),  
        alt.Y("norm_nusers:Q").title(" "),  
        color="age:N",  
        facet=alt.Facet("country:N", columns=3),  
        row="country:N", 
        tooltip=["year", "age", "norm_nusers", "drug_group", "country"],  
    )
    .properties(
        width=150, height=125, title="Users per Age Group by Country"
    )
)

chart

In [14]:

chart = (
    alt.Chart(df_grouped)
    .mark_bar()
    .encode(
        x="year:O",  
        y="nusers:Q",  
        color="age:N",  
        facet=alt.Facet("drug_group:N", columns=4), 
        row="drug_group:N", 
        tooltip=["year", "age", "nusers", "drug_group", "country"], 
    )
    .properties(
        width=150, height=150, title="Number of Users per Age Group Over Time by Country & Drug Type"
    )
)
chart

In [17]:
group_totals = drug_groups_grouped.groupby(['year'])['nusers'].sum().reset_index()

In [18]:

df_merged = df_grouped.merge(group_totals, on="year", suffixes=("", "_total"))
df_merged_no = df_merged.loc[df_merged['drug_group'] != "MAOI"]
df_merged_no = df_merged_no.loc[df_merged_no['age'] != "5-9"]
df_merged_no = df_merged_no.rename(columns={'drug_group': "Drug Group"})

In [22]:
chart = (
    alt.Chart(df_merged_no)
    .mark_bar()
    .encode(
        alt.X("year:O"),  
        alt.Y("norm_nusers:Q").title(" "),  
        color="age:N",  
        facet=alt.Facet("Drug Group:N", columns=3),  
        row="Drug Group:N",  
    )
    .properties(
        width=150, height=150, title="Normalized Users by Age and Drug Group"
    )
)
chart