## Investigating Relationships Between the Backdrop of Provincial Economic Volatility and Wider Mental Health in Alberta


##### Introduction
Our proposed digital exploration will analyze the link between economic factors and mental health trends within the wider Alberta population. The Mental Health Commission of Canada conducted an economic review in 2011 and estimated the impact of mental illness on lost productivity due to absenteeism, presenteeism (being present but less than fully productive at work) and turnover; in 2011 alone, the cost to the economy was 6.3 billion dollars. This value is projected to rise to 16 billion dollars in 2041. In any given year, 1 in 5 Canadians experiences a mental illness or addiction problem, and by the time Canadians reach 40 years of age, 1 in 2 have, or have had, a mental illness. This means that more than 6.7 million people in Canada are living with a mental health problem or illness today, or 19.8% of Canada’s population in any given year. (*Why Investing In Mental Health Will Contribute To Canada’s Economic Prosperity and To the Sustainability of Our Healthcare System*, 2021, pp. 2-3) Because stigma is attached to a mental health diagnosis, reported metrics are likely understated. According to the World Health Organization (WHO), the incidence of mental illness is expected to rise as economic drivers become increasingly dynamic and the “gig economy” becomes more commonplace. (*Mental health action plan*, 2013, pp. 6-7) The provincial population would benefit greatly from applied data analytics that address the compounding mental health crisis within the province. In the future, applied analytics may be able to guide policy makers to use provincially budgeted resources in a more targeted and efficient manner. (Smetanin et al, 2011, pp. 45-55)

At an aggregate level, the relationship between economic inequality and mental health is well documented. Despite this, a reductionist biomedical model that assesses mental health on an individual and physiological basis has persisted within the academic medical community. This has limited the ability of corporate entities and policy makers to address inequalities within the mental health sphere. (Macintyre et al, 2018, p. 4) Another driving factor of mental health inequality has been economic volatility. The province of Alberta has experienced significant economic hardship following the collapse of Western Canadian Select (WCS) oil prices and NOVA/AECO-C gas prices in 2014. The new commodity price environment also spurred a wider global investment shift away from the energy industry and increased pressure on corporate entities to support ESG-driven narratives.

##### Guiding Questions
Our analysis aims to capture the economic reality within municipalities and the wider province, and to overlay it with mental health data. The analysis will attempt to answer three core questions:

- Is there an identifiable relationship between the dynamic economic situation of the province and the wider mental health trend within Alberta?
- Which subpopulations have been the most adversely affected by the economic volatility in Alberta within the last decade?
- What have been the trends in more granular hospital-based mental health data, and do those findings relate to the wider economic analysis and subgroup analysis?


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

import plotly.express as px
from plotly.subplots import make_subplots
import ipywidgets as widgets
from ipywidgets import interact
import plotly.graph_objs as go

The original dataset provided by Statistics Canada was a simple table with indicators such as 'perceived mental health' split into either fair/poor or good/excellent. The dataset was likewise broken up by year, with values and year-to-year percentage changes for each age group. However, in its original structure the data needed to be wrangled so that the years run down the rows and our indicators make up the columns. This requires a transpose as well as some trimming and merging to make the table usable for the necessary analysis.

In [2]:
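main_MH = pd.read_csv("CanadaMentalHealth.csv")
# Combine age group and indicator into one label; it becomes the column name after the transpose
main_MH["Title"] = main_MH["Ag group"] + main_MH["Indicators"]
main_MH.drop(columns = ['Ag group', 'Indicators'], inplace = True)
main_MH = main_MH.transpose()
# The last row now holds the combined labels: use it as the header, then drop it
main_MH.rename(columns=dict(main_MH.iloc[-1,:]), inplace = True)
main_MH.drop(main_MH.tail(1).index,inplace=True)
# Split off rows 6-11 and relabel them by year with '%'-prefixed column names
# (.copy() avoids pandas' SettingWithCopyWarning when modifying the slice)
tempDF = main_MH.iloc[6:12,].copy()
tempDF.index = range(2015,2021)
tempDF.columns=['%Total+', '%Total-','%1217+', '%1217-','%1834+', '%1834-','%3549+', '%3549-', '%5064+', '%5064-', '%65+', '%65-']
main_MH = main_MH.drop(main_MH.tail(6).index)
# Merge the two blocks back together on an integer 'year' key
main_MH['year'] = main_MH.index
tempDF['year'] = tempDF.index
tempDF['year'] = tempDF['year'].astype('int64')
main_MH['year'] = main_MH['year'].astype('int64')
main_MH = main_MH.merge(tempDF, on='year')
main_MH.index = range(2015,2021)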
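
The reshaping described above can be illustrated on a toy table. The sketch below is not the report's data: the `wide` frame, its labels and its values are invented stand-ins, and it assumes the wide layout has one row per indicator and one column per year.

```python
import pandas as pd

# Toy stand-in for a wide layout: one row per indicator, one column per year.
# Labels and values are illustrative only, not the real StatCan figures.
wide = pd.DataFrame({
    "Title": ["Total, good", "Total, poor"],
    "2015": [70.0, 6.0],
    "2016": [69.0, 7.0],
})

# Use the indicator label as the index, then transpose so years become rows.
tidy = wide.set_index("Title").transpose()
tidy.index = tidy.index.astype("int64")  # year labels arrive as strings
tidy.index.name = "year"

print(tidy)
```

Setting the label column as the index before transposing avoids the separate rename-and-drop step, since the labels become the column names directly.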

While looking for additional data to accompany the above dataset, we came across a more accurate and more detailed dataset that was already in a usable format. It also includes indicators that give a deeper understanding of the current mental health makeup, such as 'Perceived health' and 'Perceived life stress'. With this in mind, it was decided that the dataset below would replace the previous one as the 'main' mental health dataset. The previous dataset is kept because it still contains information that is useful later in the report, such as the percent change in perceived mental health. The indicators available in the new dataset are reported below.

In [3]:
reportedMH = pd.read_csv("Alberta_StatCan_MentalHealthData.csv")


reportedMH_agg = reportedMH.groupby(['REF_DATE', 'Age group', 'Indicators']).sum().drop(['UOM_ID', 'SCALAR_ID','SYMBOL',  'TERMINATED', 'DECIMALS'], axis = 1)
reportedMH_agg.reset_index(inplace=True)

reportedMH_agg['Indicators'].unique()

array(['Contact with a medical doctor in the past 12 months',
       'Current smoker, daily', 'Current smoker, daily or occasional',
       'Heavy drinking', 'Life satisfaction, satisfied or very satisfied',
       'Mood disorder', 'Perceived health, fair or poor',
       'Perceived health, very good or excellent',
       'Perceived life stress, most days quite a bit or extremely stressful',
       'Perceived mental health, fair or poor',
       'Perceived mental health, very good or excellent',
       'Sense of belonging to local community, somewhat strong or very strong'],
      dtype=object)
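
To compare age groups for a single indicator, the long-format table can be pivoted into a year by age-group grid. The following is a minimal sketch: the column names `REF_DATE`, `Age group`, `Indicators` and `VALUE` are the ones used in this notebook, while the helper name `indicator_by_age` and the numbers in `long_df` are hypothetical.

```python
import pandas as pd

# Illustrative long-format rows using the notebook's column names; VALUE numbers are made up.
long_df = pd.DataFrame({
    "REF_DATE": [2015, 2015, 2016, 2016],
    "Age group": ["18 to 34 years", "35 to 49 years"] * 2,
    "Indicators": ["Mood disorder"] * 4,
    "VALUE": [8.0, 9.0, 8.5, 9.5],
})

def indicator_by_age(df, indicator):
    """Return a year x age-group table of VALUE for one indicator."""
    subset = df[df["Indicators"] == indicator]
    return subset.pivot_table(index="REF_DATE", columns="Age group",
                              values="VALUE", aggfunc="sum")

table = indicator_by_age(long_df, "Mood disorder")
print(table)
```

The `aggfunc="sum"` choice mirrors the `groupby(...).sum()` aggregation used above; with one row per year, age group and indicator, it simply passes the value through.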

With the goal of understanding the general economic trends within Alberta, data provided by the Government of Alberta was aggregated on a year-to-year basis so that it could be compared with the mental health data above. The Alberta Activity Index is a monthly weighted average of 9 different indicators (employment, average weekly earnings, retail trade, wholesale trade, manufacturing, new truck sales, housing starts, rigs drilling and oil production). This data provides a reference for very general economic trends within Alberta; deeper economic impacts are explored later in this report. The data is trimmed down to the years 2015 through 2020, the range kept by the code below.

https://open.alberta.ca/opendata/alberta-activity-index-data-tables

In [18]:
AA = pd.read_excel("ActivityIndex.xlsx")
AA.dropna(axis=1, inplace=True)
AA['year'] = pd.DatetimeIndex(AA['Date']).year
AA['month'] = pd.DatetimeIndex(AA['Date']).month
#AA = AA.groupby('year', as_index=False).aggregate('mean').drop(['month'], axis = 1)
AA.columns=['date', 'AA', 'year', 'month']
AA = AA[AA['year'] >= 2015]
AA =  AA[AA['year'] < 2021]
AA.groupby('year', as_index=False).aggregate('mean').drop(['month'], axis = 1)

| | year | AA |
|---|---|---|
| 0 | 2015 | 274.412729 |
| 1 | 2016 | 262.841788 |
| 2 | 2017 | 278.21563 |
| 3 | 2018 | 284.736207 |
| 4 | 2019 | 283.664514 |
| 5 | 2020 | 268.912193 |


The mean annual activity index falls from 2015 to 2016 following the collapse of oil and gas prices. The economy recovered through 2018, declined slightly between 2018 and 2019, and then dropped sharply in 2020 following the onset of the COVID-19 pandemic.
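
The size of each annual move can be made explicit by differencing the yearly means. This sketch re-enters the six annual values from the output above by hand rather than recomputing them from `ActivityIndex.xlsx`; the variable names `annual` and `change` are introduced here for illustration.

```python
import pandas as pd

# Annual mean Alberta Activity Index values copied from the output table above.
annual = pd.Series(
    [274.412729, 262.841788, 278.21563, 284.736207, 283.664514, 268.912193],
    index=range(2015, 2021),
    name="AA",
)

# Year-over-year change in index points and in percent (2015 has no prior year, so it is NaN).
change = pd.DataFrame({
    "points": annual.diff(),
    "percent": annual.pct_change() * 100,
})
print(change.round(2))
```

On these annual means, the 2020 decline is larger than the 2016 decline, which is consistent with the description above.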

To compare these trends with general mental health trends, a 2x2 grid of line graphs is created to represent overall trends in mental health and the economy. The mental health panels use 'Perceived life stress', 'Perceived health', and 'Perceived mental health'; these are shown alongside the Alberta Activity Index provided above.

In [16]:
stress = reportedMH_agg[reportedMH_agg['Indicators'] == 'Perceived life stress, most days quite a bit or extremely stressful']
poor = reportedMH_agg[reportedMH_agg['Indicators'] == 'Perceived health, fair or poor']
good = reportedMH_agg[reportedMH_agg['Indicators'] == 'Perceived mental health, very good or excellent']

# Build the 2x2 grid first so the helper below can add traces to it
Test = make_subplots(rows=2, cols=2,
                          subplot_titles=("Reported Stress Most Days", 
                                          "Reported Poor Mental Health", 
                                          "Reported Good Mental Health", 
                                          "Alberta Economy"))

def addTraceMH(df = stress, age = '12 to 17 years', r = 1, c = 1, col = 'red', sl=True):
    # Plot one age group of one indicator in subplot (r, c);
    # `sl` controls whether the trace appears in the legend
    subset = df[df["Age group"] == age]
    Test.add_trace(go.Scatter(
        x=subset['REF_DATE'],
        y=subset['VALUE'],
        name = age, legendgroup=age, marker=dict(color=col), showlegend = sl), row=r, col=c)

# Reported stress "most days quite a bit or extremely"
addTraceMH(df = stress, age = '12 to 17 years', r = 1, c = 1, col = 'red')
addTraceMH(df = stress, age = '18 to 34 years', r = 1, c = 1, col = 'blue')
addTraceMH(df = stress, age = '35 to 49 years', r = 1, c = 1, col = 'green')
addTraceMH(df = stress, age = '50 to 64 years', r = 1, c = 1, col = 'black')
addTraceMH(df = stress, age = '65 years and over', r = 1, c = 1, col = 'orange')

# reported MH fair or poor
addTraceMH(df = poor, age = '12 to 17 years', r = 1, c = 2, col = 'red', sl=False)
addTraceMH(df = poor, age = '18 to 34 years', r = 1, c = 2, col = 'blue', sl=False)
addTraceMH(df = poor, age = '35 to 49 years', r = 1, c = 2, col = 'green', sl=False)
addTraceMH(df = poor, age = '50 to 64 years', r = 1, c = 2, col = 'black', sl=False)
addTraceMH(df = poor, age = '65 years and over', r = 1, c = 2, col = 'orange', sl=False)

# reported MH very good or excellent
addTraceMH(df = good, age = '12 to 17 years', r = 2, c = 1, col = 'red', sl=False)
addTraceMH(df = good, age = '18 to 34 years', r = 2, c = 1, col = 'blue', sl=False)
addTraceMH(df = good, age = '35 to 49 years', r = 2, c = 1, col = 'green', sl=False)
addTraceMH(df = good, age = '50 to 64 years', r = 2, c = 1, col = 'black', sl=False)
addTraceMH(df = good, age = '65 years and over', r = 2, c = 1, col = 'orange', sl=False)


Test.append_trace(go.Scatter(
    x=AA['date'],
    y=AA["AA"],
    name = 'Alberta Economy',
    showlegend = True
), row=2, col=2)

Test.update_layout(height=800, width=800, title_text="Mental Health Versus Alberta Economy")
Test.show()

Alberta's economy hit a peak in 2018, after which we see a steady decrease through 2019 (with the exception of March and April 2019). The largest dip occurred not in 2016 following the collapse of oil and gas prices, but in 2020 following the beginning of the COVID-19 pandemic, with the index dropping roughly 40 points. We also see a gradual decrease in 'good' mental health reports for all those aged 18-64, with no indication that the economic troubles of 2018 through 2020 are related to this trend. However, for those aged 35-49, there are rapid changes in both 'stressful days' and 'poor' mental health reports in both 2018 and 2020. All major working-age groups likewise reported more 'stressful days' and more 'poor' mental health following the 2016 collapse of oil and gas prices and the 2020 COVID-19 pandemic.

Although correlation does not imply causation, it is hard to ignore the overall rises in stress and poor mental health reports following the two major economic collapses in Alberta within the past decade. Those most likely affected are people in the working-age range of 18-64, who are most exposed to dramatic changes in overall economic activity. Those aged 50-64 have reported 'poor' mental health more often than any other age range throughout the years, signalling that this age group specifically could be facing the largest burden of economic difficulty in the province, as this is typically the age range with the most financial responsibility: kids in college, a mortgage, and much more. Economic change thus appears more likely to push this group into a poor mental health state, and the group also shows larger increases in stressful days around these times.
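
One way to put a number on the visual comparison above is a simple Pearson correlation between the yearly activity index and a yearly mental health indicator. The sketch below is an assumption-laden illustration, not part of the original analysis: `yearly_correlation` is a hypothetical helper, the `AA` values are the annual means reported earlier, and the `VALUE` column in `mh` is a made-up placeholder series.

```python
import pandas as pd

def yearly_correlation(econ, mh, econ_col="AA", mh_col="VALUE"):
    """Pearson correlation between two yearly series joined on a shared 'year' column."""
    merged = econ.merge(mh, on="year", how="inner")
    return merged[econ_col].corr(merged[mh_col])

# Annual activity index means from the table earlier in this notebook.
econ = pd.DataFrame({
    "year": range(2015, 2021),
    "AA": [274.412729, 262.841788, 278.21563, 284.736207, 283.664514, 268.912193],
})

# PLACEHOLDER indicator values, invented purely to make the example run.
mh = pd.DataFrame({
    "year": range(2015, 2021),
    "VALUE": [20.0, 23.0, 21.0, 20.0, 21.0, 24.0],
})

r = yearly_correlation(econ, mh)
print(round(r, 3))
```

With only six annual observations, any coefficient obtained this way is weak evidence at best and should be read alongside the plots rather than in place of them.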

Although the Alberta Activity Index has been useful in identifying overall trends, more specific information is needed in order to identify the points of influence within the economy. Data taken from Alberta's economic dashboard allows us to view province-wide economic indicators over the years. We want to visualize this large amount of information in a format that is as easy to subset as possible. Below, we use the plotly library to create interactive graphs that show all of the economic indicator information we need to understand major economic trends across various sectors.

In [7]:
## Economic Indicators

econDF = pd.DataFrame(columns=['When','Alberta', 'Sector'])

for files in os.listdir("Economic"):
    df = pd.read_csv("Economic/" + files)
    df['When'] = pd.to_datetime(df['When'])
    tempDF = df[["When", "Alberta"]].copy() # Copy When and Alberta columns (avoids SettingWithCopyWarning)
    if files != 'Population.csv':
        tempDF = tempDF.join(df.iloc[:,2], lsuffix='_left', rsuffix='_right') # Add the column at position 2 as the subsector label
    else:
        tempDF['Pop'] = 'Population'
        
    tempDF.columns = ["When", "Alberta", "SubSector"]
    tempDF['Sector'] = files.split('.')[0] # Sector name is the file name without its extension
    econDF = pd.concat([econDF, tempDF]) # append to final df (DataFrame.append was removed in pandas 2.0)
    
econDF['Month'] = econDF['When'].dt.month
econDF['Day'] = econDF['When'].dt.day
econDF['Year'] = econDF['When'].dt.year

econDF['Sector2'] = econDF['SubSector'].astype(str) + " (" + econDF['Sector'].astype(str) + ")"






In Alberta's economic dashboard, data is broken up by sector as well as subsector. The following dashboard will help identify key events within each sector over the years. The hope is to draw more granular conclusions about general economic trends than the Alberta Activity Index allowed, while enabling individual comparisons of the subsectors within each sector, in order to identify the key economic indicators that pulled economic trends down in 2015, 2018, and 2020.

In [20]:
graphDF2 = econDF.groupby(['Year', 'Month', 'Sector']).mean().drop('Day', axis = 1)

# Now let's explore the data with the help of a drop-down interactor.

# Build a line chart that we'll then update with the help of a call-back function
li = go.Scatter()
fig_line = go.FigureWidget(data=li)

month_labels = {1:'Jan',2:'Feb',3:'Mar',4:'Apr',5:'May', 6:'Jun', 7:'Jul', 8:'Aug', 9:'Sep', 10:'Oct', 11:'Nov', 12:'Dec'}

# A list passed to interact() will yield a drop-down interactor
@interact(sector = list(econDF['Sector'].unique()))

def update_bar(sector):
    
    graphDF2 = econDF[econDF['Sector'] == sector]
    graphDF2 = graphDF2.groupby(['Year', 'SubSector']).mean().drop(['Day', 'Month'], axis = 1)
    graphDF2.reset_index(inplace = True)
    data = graphDF2
    fig_line.update_traces()
    fig_line.update_traces(x=[max(data['Year']),max(data['Year'])],
                       y=[max(data['Alberta']),max(data['Alberta'])])
    
    for subsector in data['SubSector'].unique():
        data1 = data[data['SubSector'] == subsector]
        fig_line.add_trace(
            go.Scatter(x=data1['Year'],
                       y=data1['Alberta'],
                      name = subsector))
        
    fig_line.update_layout(title_text="{0} results".format(sector))
    fig_line.update_layout(showlegend=False)

fig_line


Some key trends that we are able to identify are the rise in cattle prices in 2015, the rise in oil prices starting in 2008, and a slow upturn in unemployment starting in 2008, with further spikes in 2016 and 2020. All of these are in line with the hypothesized dates for economic downturns that posed significant challenges for residents of Alberta. Together, these indicators reflect key livelihood costs for city residents: food, transportation, and living arrangements. The significant spikes in these sectors suggest that we would see increases in mental health related hospitalizations during the significant years 2015, 2018, and 2020.


Secondly, visualizing each individual subsector on a month-to-month basis will show how subsectors fluctuate over time, going a level deeper than the line graphs above. This will allow us to backtrack to key dates in Alberta's economic history and track how different sectors changed from month to month. It will also let us examine any interesting points or disparities noticed in the line plot of economic indicators above.

In [19]:
graphDF = econDF.groupby(['Year', 'Month', 'Sector']).mean().drop('Day', axis = 1)

# Now let's explore the data with the help of a drop-down interactor.

# Build a bar chart that we'll then update with the help of a call-back function
bar = go.Bar()
fig_bar = go.FigureWidget(data=bar)
#fig_bar.update_yaxes(range=[0, 250])

month_labels = {1:'Jan',2:'Feb',3:'Mar',4:'Apr',5:'May', 6:'Jun', 7:'Jul', 8:'Aug', 9:'Sep', 10:'Oct', 11:'Nov', 12:'Dec'}

# A list passed to interact() will yield a drop-down interactor
@interact(year=[*range(max(econDF['Year']),min(econDF['Year']),-1)],
          sector = list(econDF['Sector2'].unique()))

def update_bar(sector, year=2019):
    graphDF =  econDF[econDF['Sector2'] == sector]
    graphDF = graphDF.groupby(['Year', 'Month']).mean().drop('Day', axis = 1)

    data = graphDF.loc[year].mean(axis=1,skipna=True)
    fig_bar.update_traces(x=pd.Series(data.index.values).values, #.map(month_labels).values,
                          y=data.values)
    fig_bar.update_layout(title_text="{0} results for {1}".format(sector, year))

fig_bar


Looking back at the key economic indicators from above, we find that although unemployment spiked significantly in 2009 and 2016, no single month drove the increase; both years saw a steady incline in unemployment throughout the year. Meanwhile, 2020 saw a significant spike in April due to the COVID-19 pandemic. Oil prices saw a dramatic rise in 2009, with the price of WTI almost doubling from \$40 to \$80 and never really dipping below \$60 after that point. Lastly, we find that the price of cattle slaughter likewise rose gradually, peaking in May at \$193.
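
Whether a single month drives a yearly spike can also be checked numerically instead of by reading the bar chart. The sketch below assumes a frame with the `Year`, `Month` and `Alberta` columns used in `econDF`; the helper `largest_monthly_jump` is hypothetical and the `demo` values are synthetic, chosen only to show the mechanics.

```python
import pandas as pd

def largest_monthly_jump(df, year, value_col="Alberta"):
    """Return (month, change) for the largest month-over-month increase within one year."""
    rows = df[df["Year"] == year]
    monthly = rows.groupby("Month")[value_col].mean()  # groupby sorts months ascending
    jumps = monthly.diff().dropna()                    # first month has no predecessor
    month = jumps.idxmax()
    return int(month), float(jumps.loc[month])

# Synthetic monthly series, for illustration only (not the dashboard data).
demo = pd.DataFrame({
    "Year": [2020] * 6,
    "Month": [1, 2, 3, 4, 5, 6],
    "Alberta": [7.0, 7.2, 8.5, 13.0, 15.0, 15.2],
})

month, jump = largest_monthly_jump(demo, 2020)
print(month, jump)
```

Because the difference is taken within a single year, January never registers a jump; comparing against the previous December would require differencing before filtering by year.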

In [6]:
## Hospital Data

MH = pd.read_excel("HMHDB_Mental_Health.xlsx", sheet_name="4 Combined LOS prov terr")
MH.rename(columns=dict(MH.loc[3,]), inplace = True)
MH = MH.iloc[4:46]
MH.dropna(axis=1, inplace = True)
MH.reset_index(drop = True, inplace = True)
MH['year'] = 2018
display(MH.head(1))





| | Hospital type | Province/territory | Median length of stay (days) | Average length of stay (days) | 0.5% trimmed average (days) | Total length of stay (days) | year |
|---|---|---|---|---|---|---|---|
| 0 | General hospitals | Newfoundland and Labrador | 6 | 14.22 | 12.93 | 26157 | 2018 |
