![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Callysto’s Weekly Data Visualization

## Disabilities

### Recommended Grade levels: 5-9
<br>

### Instructions

Click "Cell" and select "Run All".

This will import the data and run all the code, so you can see this week's data visualization. Scroll back to the top after you’ve run the cells.

![instructions](https://github.com/callysto/data-viz-of-the-week/blob/main/images/instructions.png?raw=true)

**You don't need to do any coding to view the visualizations**.

The plots generated in this notebook are interactive. You can hover over and click on elements to see more information. 

Email contact@callysto.ca if you experience issues.

### About this Notebook

Callysto's Weekly Data Visualization is a learning resource that aims to develop data literacy skills. We provide Grade 5-12 teachers and students with a data visualization, such as a graph, to interpret. This companion resource walks learners through how a data scientist creates and interprets that visualization.

The steps of the data analysis process are listed below and applied to each weekly topic.

1. Question - What are we trying to answer?
2. Gather - Find the data source(s) you will need. 
3. Organize - Arrange the data, so that you can easily explore it. 
4. Explore - Examine the data to look for evidence to answer the question. This includes creating visualizations. 
5. Interpret - Describe what's happening in the data visualization. 
6. Communicate - Explain how the evidence answers the question. 

# Question



### Goal



# Gather

### Code: 

Run the code cells below to import the libraries we need for this project. Libraries are collections of pre-made code that make it easier to analyze our data.

In [29]:
import pandas as pd
import plotly.express as px
import re

In [30]:
by_pop = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/data-viz-of-the-week/disabilities/by_pop.csv')
by_type = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/data-viz-of-the-week/disabilities/by_type.csv')
male_female = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/data-viz-of-the-week/disabilities/male_female_disabilities.csv')
employment = pd.read_csv("https://raw.githubusercontent.com/callysto/data-files/main/data-viz-of-the-week/disabilities/employment.csv")

In [31]:
display(by_pop.head(), by_type.head(), male_female.head(), employment.head())

Unnamed: 0,Geography,Disability,Number,Percent
0,"St. John's, Newfoundland and Labrador","Total population, with and without disabilities 7",167550,100.0
1,"St. John's, Newfoundland and Labrador",Persons with disabilities,37350,22.3
2,"St. John's, Newfoundland and Labrador",Persons without disabilities,130250,77.7
3,"Halifax, Nova Scotia","Total population, with and without disabilities 7",331300,100.0
4,"Halifax, Nova Scotia",Persons with disabilities,94350,28.5


Unnamed: 0,Disability type (grouped),Number
0,Total population with disabilities 7 8,3727920
1,Sensory disability 9,1364120
2,Physical disability 10,1958570
3,Pain-related disability,2512090
4,Mental health-related disability,1421270


Unnamed: 0,Sex,Potential to work,Number,Percent
0,Both Sexes,"Total, with or without work potential 6",1648460,100.0
1,Both Sexes,With work potential 7,644640,39.1
2,Both Sexes,Without work potential 8,1003820,60.9
3,Males,"Total, with or without work potential 6",700120,100.0
4,Males,With work potential 7,294440,42.1


Unnamed: 0,Age Group,Disabilities,Milder,Severe,Gender,Employment Percent
0,25-34 years,0,0,0,Women,77.3
1,25-34 years,0,0,0,Men,65.0
2,25-34 years,0,0,0,Both,81.8
3,35-44 years,0,0,0,Women,81.7
4,35-44 years,0,0,0,Men,89.5
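Every table above starts with a column named `Unnamed: 0`. This usually means the CSV file was saved with its row numbers, so pandas reads them back as an unnamed column. The sketch below shows the effect on a small, made-up sample that only mimics the layout of `by_type.csv`. The `sample_csv` text is an assumption for illustration, not the real file. Passing `index_col=0` is one optional way to treat that first column as the row index instead.

```python
import io
import pandas as pd

# Hypothetical two-row sample that mimics the layout of by_type.csv
sample_csv = io.StringIO(
    ",Disability type (grouped),Number\n"
    "0,Sensory disability 9,1364120\n"
    "1,Physical disability 10,1958570\n"
)

# Default read: the header-less first column is named "Unnamed: 0"
default_read = pd.read_csv(sample_csv)

# Rewind the in-memory file, then read again using the first column as the index
sample_csv.seek(0)
indexed_read = pd.read_csv(sample_csv, index_col=0)

print(list(default_read.columns))
print(list(indexed_read.columns))
```

The notebook keeps the default behaviour, so the extra column simply stays in each table and is not used in the charts.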


# Organize

In [32]:
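# Split "City, Province" at the first comma only (n=1) into two new columns
by_pop[["City", "Province"]] = by_pop['Geography'].str.split(",", n=1, expand=True)
# Remove the leftover spaces around each name
by_pop['City'] = by_pop['City'].str.strip()
by_pop['Province'] = by_pop['Province'].str.strip()
by_pop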

Unnamed: 0,Geography,Disability,Number,Percent,City,Province
0,"St. John's, Newfoundland and Labrador","Total population, with and without disabilities 7",167550,100.0,St. John's,Newfoundland and Labrador
1,"St. John's, Newfoundland and Labrador",Persons with disabilities,37350,22.3,St. John's,Newfoundland and Labrador
2,"St. John's, Newfoundland and Labrador",Persons without disabilities,130250,77.7,St. John's,Newfoundland and Labrador
3,"Halifax, Nova Scotia","Total population, with and without disabilities 7",331300,100.0,Halifax,Nova Scotia
4,"Halifax, Nova Scotia",Persons with disabilities,94350,28.5,Halifax,Nova Scotia
...,...,...,...,...,...,...
100,"Vancouver, British Columbia",Persons with disabilities,410510,20.5,Vancouver,British Columbia
101,"Vancouver, British Columbia",Persons without disabilities,1591390,79.5,Vancouver,British Columbia
102,"Victoria, British Columbia","Total population, with and without disabilities 7",307700,100.0,Victoria,British Columbia
103,"Victoria, British Columbia",Persons with disabilities,89250,29.0,Victoria,British Columbia
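To see what the split does on its own, here is a minimal sketch on a tiny table. The first two names come from the output above. The third, `"City A, Region B, Province C"`, is a made-up label added only to show why `n=1` matters: the text is cut at the first comma and everything after it stays together.

```python
import pandas as pd

sample = pd.DataFrame({
    "Geography": [
        "St. John's, Newfoundland and Labrador",
        "Halifax, Nova Scotia",
        "City A, Region B, Province C",  # hypothetical label with two commas
    ]
})

# expand=True returns one column per piece; n=1 allows at most one split
parts = sample["Geography"].str.split(",", n=1, expand=True)

# Trim the space that followed the comma
sample["City"] = parts[0].str.strip()
sample["Province"] = parts[1].str.strip()

print(sample[["City", "Province"]])
```

Without `n=1`, the made-up third row would produce three pieces, and assigning them to two columns would no longer line up.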


In [33]:
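def remove_integers(string):
    # Drop every digit, e.g. the footnote marker in "Sensory disability 9"
    return ''.join(i for i in string if not i.isdigit())

def remove_commas_and_letters(value):
    # Expects a string such as "1,364,120" and returns it as an integer
    value = value.replace(',', '')  # Remove commas
    value = re.sub('[^0-9]', '', value)  # Remove any remaining non-digit characters
    return int(value)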
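
The short sketch below tries both helpers on single values, so you can see what each one returns. The functions are repeated so the block runs on its own. The raw count `"1,364,120E"` is a made-up example of a value with commas and a letter flag, not a value copied from the files.

```python
import re

def remove_integers(string):
    return ''.join(i for i in string if not i.isdigit())

def remove_commas_and_letters(value):
    value = value.replace(',', '')
    value = re.sub('[^0-9]', '', value)
    return int(value)

# A label with a footnote marker, as seen in the tables above
label = remove_integers("Sensory disability 9")

# Hypothetical raw count containing commas and a trailing letter
count = remove_commas_and_letters("1,364,120E")

# repr() makes the leftover space at the end of the label visible
print(repr(label))
print(repr(label.strip()))
print(count)
```

Removing the digit leaves the space that came before it. That space does not affect the charts, but calling `.strip()` on the result is one way to tidy the labels if they are ever compared with other text.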

In [34]:
# Strip footnote digits from the label columns and convert the counts to integers
by_pop['Disability'] = by_pop['Disability'].apply(remove_integers)
by_pop['Number'] = by_pop['Number'].apply(remove_commas_and_letters)

by_type['Disability type (grouped)'] = by_type['Disability type (grouped)'].apply(remove_integers)
by_type['Number'] = by_type['Number'].apply(remove_commas_and_letters)

male_female['Potential to work'] = male_female['Potential to work'].apply(remove_integers)
male_female['Number'] = male_female['Number'].apply(remove_commas_and_letters)

In [35]:
display(by_pop.head(), by_type.head(), male_female.head(), employment.head())

Unnamed: 0,Geography,Disability,Number,Percent,City,Province
0,"St. John's, Newfoundland and Labrador","Total population, with and without disabilities",167550,100.0,St. John's,Newfoundland and Labrador
1,"St. John's, Newfoundland and Labrador",Persons with disabilities,37350,22.3,St. John's,Newfoundland and Labrador
2,"St. John's, Newfoundland and Labrador",Persons without disabilities,130250,77.7,St. John's,Newfoundland and Labrador
3,"Halifax, Nova Scotia","Total population, with and without disabilities",331300,100.0,Halifax,Nova Scotia
4,"Halifax, Nova Scotia",Persons with disabilities,94350,28.5,Halifax,Nova Scotia


Unnamed: 0,Disability type (grouped),Number
0,Total population with disabilities,3727920
1,Sensory disability,1364120
2,Physical disability,1958570
3,Pain-related disability,2512090
4,Mental health-related disability,1421270


Unnamed: 0,Sex,Potential to work,Number,Percent
0,Both Sexes,"Total, with or without work potential",1648460,100.0
1,Both Sexes,With work potential,644640,39.1
2,Both Sexes,Without work potential,1003820,60.9
3,Males,"Total, with or without work potential",700120,100.0
4,Males,With work potential,294440,42.1


Unnamed: 0,Age Group,Disabilities,Milder,Severe,Gender,Employment Percent
0,25-34 years,0,0,0,Women,77.3
1,25-34 years,0,0,0,Men,65.0
2,25-34 years,0,0,0,Both,81.8
3,35-44 years,0,0,0,Women,81.7
4,35-44 years,0,0,0,Men,89.5
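
Once the data is organized, a quick sanity check can confirm that the columns agree with each other. The sketch below recomputes `Percent` from `Number` for the five `by_pop` rows shown above. The small `rows` table is typed in by hand from that output, and the `Recomputed` column name is just an illustrative choice.

```python
import pandas as pd

rows = pd.DataFrame({
    "Geography": ["St. John's, Newfoundland and Labrador"] * 3
                 + ["Halifax, Nova Scotia"] * 2,
    "Disability": [
        "Total population, with and without disabilities",
        "Persons with disabilities",
        "Persons without disabilities",
        "Total population, with and without disabilities",
        "Persons with disabilities",
    ],
    "Number": [167550, 37350, 130250, 331300, 94350],
    "Percent": [100.0, 22.3, 77.7, 100.0, 28.5],
})

# Look up each city's total population from its "Total population" row
is_total = rows["Disability"].str.startswith("Total population")
totals = rows[is_total].set_index("Geography")["Number"]

# Recompute the percentage of the city's total, rounded to one decimal place
rows["Recomputed"] = (rows["Number"] / rows["Geography"].map(totals) * 100).round(1)

print(rows[["Geography", "Disability", "Percent", "Recomputed"]])
```

For these rows the recomputed values match the published percentages to one decimal place, which suggests the counts were read and cleaned as intended.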


In [36]:
by_type_fig = px.histogram(by_type, x="Disability type (grouped)", y="Number", color="Number")
by_type_fig.update_traces(showlegend=False).show()
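
The chart above plots the "Total population with disabilities" row next to the individual disability types. As an optional variation, the sketch below separates that total from the types and expresses each type as a share of the total. It uses only the five `by_type` rows displayed earlier, typed in by hand, so the full file may contain more types. The `Share of total (%)` column name is an illustrative choice.

```python
import pandas as pd

by_type_sample = pd.DataFrame({
    "Disability type (grouped)": [
        "Total population with disabilities",
        "Sensory disability",
        "Physical disability",
        "Pain-related disability",
        "Mental health-related disability",
    ],
    "Number": [3727920, 1364120, 1958570, 2512090, 1421270],
})

# Separate the total row from the rows for individual types
is_total = by_type_sample["Disability type (grouped)"].str.startswith("Total")
total = by_type_sample.loc[is_total, "Number"].iloc[0]
types_only = by_type_sample[~is_total].copy()

# Each type as a percentage of everyone with a disability
types_only["Share of total (%)"] = (types_only["Number"] / total * 100).round(1)

print(types_only.sort_values("Number", ascending=False))

# True here: the types add up to more than the total
print(types_only["Number"].sum() > total)
```

The four types shown add up to more than the total, which suggests that one person can be counted under more than one type. A filtered table like `types_only` could be passed to the same `px.histogram` call if you want the bars to show the types alone.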

In [37]:
provinces = px.treemap(by_pop, path=[px.Constant("Canada"), 'Province', 'City', 'Disability'], values='Number')
provinces.update_traces(root_color="lightgrey")
provinces.update_layout(margin = dict(t=50, l=35, r=35, b=35))
provinces.show()
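
A treemap sizes each box by adding up the values beneath it. Each city in `by_pop` has a "Total population" row as well as the "with" and "without" rows, so the city boxes above likely represent about twice the city's population. The sketch below illustrates this with the St. John's rows from the earlier output, typed in by hand, and shows one possible way to drop the total rows before plotting.

```python
import pandas as pd

city_rows = pd.DataFrame({
    "City": ["St. John's"] * 3,
    "Disability": [
        "Total population, with and without disabilities",
        "Persons with disabilities",
        "Persons without disabilities",
    ],
    "Number": [167550, 37350, 130250],
})

# Flag the rows that already hold the combined total
is_total = city_rows["Disability"].str.startswith("Total population")

# Sum per city with and without those total rows
with_total_rows = city_rows.groupby("City")["Number"].sum()
without_total_rows = city_rows[~is_total].groupby("City")["Number"].sum()

print(with_total_rows["St. John's"])     # 335150
print(without_total_rows["St. John's"])  # 167600
```

Applying the same filter to `by_pop` and passing the result to `px.treemap` with the same `path` and `values` would keep each city's box close to its actual population. The small difference from the published total of 167550 appears to come from rounding in the source counts.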