![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fcurriculum-notebooks&branch=master&subPath=Health/AsthmaRates/asthma-rates.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Asthma Incidence Rate in Alberta

We are going to use data from the [Alberta Interactive Health Data Application](http://www.ahw.gov.ab.ca/IHDA_Retrieval) to investigate how many people in Alberta have [asthma](https://en.wikipedia.org/wiki/Asthma).

We will look at **incidence**, the number of new cases, and **prevalence**, the number of people with the condition. Both of these will be **rates**, meaning that they are per 1000 people in the population. For more information about the data, check out the [data notes](http://www.ahw.gov.ab.ca/IHDA_Retrieval/ShowMetaDataNotesServlet?3133).
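
To make the two definitions concrete, here is a minimal sketch of how a rate per 1000 people is calculated. The function name `rate_per_1000` and all of the counts are made up for illustration; they are not values from the Alberta data.

```python
def rate_per_1000(cases, population):
    """Return the number of cases per 1000 people in the population."""
    return cases * 1000 / population

# Hypothetical town of 50,000 people (illustrative numbers only)
population = 50_000
new_cases_this_year = 250   # people newly diagnosed this year
all_current_cases = 4_000   # everyone currently living with the condition

incidence_rate = rate_per_1000(new_cases_this_year, population)
prevalence_rate = rate_per_1000(all_current_cases, population)
print(incidence_rate, prevalence_rate)
```

In this made-up town the prevalence rate is much larger than the incidence rate, because prevalence counts everyone with the condition rather than only the new cases.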

The `Asthma - Age-Sex Specific Incidence Rate` data have been downloaded from that website as a CSV file. `▶Run` the following code cell to import and describe the data that we will use to make some visualizations.

In [None]:
import pandas as pd
dfi = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/Health/AsthmaRates/asthma-alberta-2004-2019.csv')
dfi.describe()

## Animated Bar Graph

Now that we have the data imported, let's create a bar graph of `Incidence Rate` versus `Age` animated by year.

In [None]:
import plotly.express as px
fig = px.bar(dfi, x='Age', y='Incidence Rate', animation_frame='Year', color='Sex', barmode='group')
fig.update_layout(title='Asthma Incidence Rate in Alberta')
fig.update_layout(yaxis_range=[0,dfi['Incidence Rate'].max()])
fig.show()

## Line Graph

The animated bar chart was interesting, but it isn't easy to see whether the rate of new asthma diagnoses is increasing or decreasing over time. Let's create a line graph.

In [None]:
px.line(dfi[dfi['Sex']=='BOTH'], x='Year', y='Incidence Rate', color='Age', title='Asthma Incidence Rate in Alberta')

If you double-click on `ALL` in the legend on the right, you'll see just that line. It shows us that in general the incidence rate of asthma in Alberta decreased over this time period.

Can you see any age ranges or time periods over which it increased?
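
One way to check this without relying only on the graph is to calculate the year-over-year change in the rate for each age group. The sketch below uses a small made-up table (not the real data) with the same column names as `dfi`; the names `toy`, `Change`, and `increases` are illustrative.

```python
import pandas as pd

# Made-up values with the same column names as dfi (illustrative only)
toy = pd.DataFrame({
    'Year': [2004, 2005, 2006, 2004, 2005, 2006],
    'Age': ['00to39', '00to39', '00to39', '40to44', '40to44', '40to44'],
    'Incidence Rate': [8.0, 9.5, 7.0, 3.0, 2.5, 2.75],
})

# Sort so each age group's rows are in year order, then subtract the
# previous year's rate from the current one within each age group
toy = toy.sort_values(['Age', 'Year'])
toy['Change'] = toy.groupby('Age')['Incidence Rate'].diff()

# Rows with a positive change are years in which the rate increased
increases = toy[toy['Change'] > 0]
print(increases[['Age', 'Year', 'Change']])
```

The same steps should work on the real data if you replace `toy` with `dfi[dfi['Sex']=='BOTH']`.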

## Data by Zone

We can also import a data set that includes asthma prevalence rate by [AHS Zone](https://www.albertahealthservices.ca/ahs-map-ahs-zones.pdf).

|Zone|Area|
|-|-|
|Z1|South|
|Z2|Calgary|
|Z3|Central|
|Z4|Edmonton|
|Z5|North|

Let's import these data and create a line graph. We will filter the data by `Age` (`ALL`) and `Sex` (`BOTH`).

In [None]:
dfp = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/Health/AsthmaRates/asthma-alberta-zones-2004-2019.csv')
data_to_graph = dfp[(dfp['Age']=='ALL') & (dfp['Sex']=='BOTH')]
px.line(data_to_graph, x='Year', y='Prevalence Rate', color='Geography', title='Asthma Prevalence in Alberta')

We can see that the prevalence increases over this time period. However, we noted before that the incidence was decreasing. How do we explain this?

Think back to the definitions of incidence and prevalence. Since incidence is the number of new cases, it should relate to the slope of the prevalence graph.
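
Here is a simplified sketch of that relationship, using made-up numbers rather than the Alberta data. It assumes that nobody leaves the population, so each year's prevalence is last year's prevalence plus this year's incidence. Real data will not follow this exactly, and the names `toy`, `starting_prevalence`, and `Prevalence Slope` are illustrative.

```python
import pandas as pd

# Made-up incidence rates that decrease every year (illustrative only)
toy = pd.DataFrame({
    'Year': [2004, 2005, 2006, 2007],
    'Incidence Rate': [8, 6, 5, 4],
})

# Simplifying assumption: nobody leaves the population, so prevalence
# is a starting value plus the running total of new cases
starting_prevalence = 50
toy['Prevalence Rate'] = starting_prevalence + toy['Incidence Rate'].cumsum()

# The slope of prevalence (its change from one year to the next)
# recovers the incidence for every year after the first
toy['Prevalence Slope'] = toy['Prevalence Rate'].diff()
print(toy)
```

Even though incidence falls every year in this toy table, prevalence keeps rising, just more slowly each year.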

## Comparing Incidence and Prevalence

Let's compare incidence and prevalence over time by merging the data sets together. Since the `Prevalence Rate` values are much smaller, we will multiply them by a factor (the ratio of the averages of the two columns) in order to show them on the same scale.

In [None]:
age = 'ALL'  # or '00to39','40to44','45to49','50to54','55to59','60to64','65to69','70to74','75to79','80to84','85+'
sex = 'BOTH'  # or 'FEMALE' or 'MALE'
# Select the rows for the chosen age group and sex (province-wide rows for prevalence)
incidence = dfi[(dfi['Age']==age) & (dfi['Sex']==sex)]
prevalence = dfp[(dfp['Age']==age) & (dfp['Geography']=='AB') & (dfp['Sex']==sex)]
# Join the two data sets on Year so that each row has both rates
cdf = pd.merge(incidence, prevalence, on='Year')
# Rescale prevalence so that its average matches the average incidence
cdf['Multiplied Prevalence Rate'] = cdf['Prevalence Rate']*cdf['Incidence Rate'].mean()/cdf['Prevalence Rate'].mean()
px.line(cdf, x='Year', y=['Incidence Rate','Multiplied Prevalence Rate'], title='Asthma in Alberta ('+sex+' '+age+')')

We can see that the two spikes in `Incidence Rate` (2005 and 2010) correspond to steeper slopes in `Prevalence Rate`. However, the data for 2015 to 2019 may be harder to explain. What other factors do you think might affect the data about the prevalence rate of asthma in Alberta?

Try changing the values of `age` and `sex` in the code above to see what effect those have on the graph.
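
To see what the multiplication step in that code does, here is a small check on made-up numbers (the name `toy` and its values are illustrative, not taken from the data sets). Multiplying one column by the ratio of the two column averages gives it the same average as the other column, so both lines sit in the same vertical range on the graph.

```python
import pandas as pd

# Made-up values with the same column names as cdf (illustrative only)
toy = pd.DataFrame({
    'Incidence Rate': [12.0, 9.0, 9.0],
    'Prevalence Rate': [1.0, 2.0, 3.0],
})

# Ratio of the two column averages: 10.0 / 2.0
factor = toy['Incidence Rate'].mean() / toy['Prevalence Rate'].mean()
toy['Multiplied Prevalence Rate'] = toy['Prevalence Rate'] * factor

print(factor)
print(toy['Multiplied Prevalence Rate'].mean(), toy['Incidence Rate'].mean())
```

Keep in mind that the multiplied values are no longer rates per 1000 people. Only the shape of that line, not its height, can be compared with the incidence line.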

## Treemap

We can also visualize these data using an interactive [treemap](https://en.wikipedia.org/wiki/Treemapping) or [sunburst](https://en.wikipedia.org/wiki/Pie_chart#Ring) chart. Again we will use the Alberta Health Services zones.

|Zone|Area|
|-|-|
|Z1|South|
|Z2|Calgary|
|Z3|Central|
|Z4|Edmonton|
|Z5|North|

Click on different parts of the charts to see what happens. You can try changing the order of the columns in the `path` argument, and even try removing the `#` in front of `color = 'Age',` in the treemap code.

In [None]:
tdfp = dfp[(dfp['Geography']!='AB') & (dfp['Sex']!='BOTH') & (dfp['Age']!='ALL')]
px.treemap(tdfp, values='Prevalence Rate', 
           path=['Geography', 'Year', 'Age', 'Sex'], 
           #color = 'Age', 
           title='Alberta Asthma Prevalence')

In [None]:
px.sunburst(tdfp, values='Prevalence Rate', 
            path = ['Geography', 'Age', 'Year', 'Sex'], 
            color = 'Year',
            title='Alberta Asthma Prevalence')



## Conclusion

In this notebook we investigated the incidence and prevalence of asthma using Alberta Health Services data for 2004 to 2019 and visualized the data using animated bars, lines, sunbursts, and treemaps. 

As an extension of this activity, try downloading and visualizing other data sets from the [Alberta Interactive Health Data Application](http://www.ahw.gov.ab.ca/IHDA_Retrieval).

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)