![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)


# Callysto's Weekly Data Visualization


## Tornadoes


## Recommended Grade Levels 5-6


## Instructions

Click "Cell" and select "Run All".

This will import the data and run all the code, so you can see this week's data visualization. Scroll back to the top after you’ve run the cells.


You don't need to do any coding to view the visualizations.

The plots generated in this notebook are interactive. You can hover over and click on elements to see more information.

Email contact@callysto.ca if you experience issues.

## About this Notebook

Callysto's Weekly Data Visualization is a learning resource that aims to develop data literacy skills. We provide Grades 5-12 teachers and students with a data visualization, like a graph, to interpret. This companion resource walks learners through how the data visualization is created and interpreted by a data scientist.

The steps of the data analysis process are listed below and applied to each weekly topic.

1. Question - What are we trying to answer?
2. Gather - Find the data source(s) you will need.
3. Organize - Arrange the data, so that you can easily explore it.
4. Explore - Examine the data to look for evidence to answer the question. This includes creating visualizations.
5. Interpret - Describe what's happening in the data visualization.
6. Communicate - Explain how the evidence answers the question.


## Question 

Where in Canada have the most tornadoes occurred?




## Background

Tornadoes are extreme weather events that occur in Canada. The [Northern Tornadoes Project](https://uwo.ca/ntp/about.html) was founded at Western University in 2017. The project aims to better understand the occurrence of tornadoes in Canada and to improve tornado prediction, so that damage to property and danger to people can be reduced. The Northern Tornadoes Project has published its data on the occurrence of tornadoes in Canada.

For more information on how tornadoes are measured, you can [read this article](https://www.nssl.noaa.gov/education/svrwx101/tornadoes/detection/).

## Gather

### Code 

Run the code cells below to import the libraries we need for this project. Libraries are pre-made code that make it easier to analyze our data. 

Pandas is a library that helps us with data analysis, and Plotly Express is a library that helps us make visualizations.

Without importing these libraries we would have to use much more code to analyze our data and generate visualizations. We import the libraries with abbreviations, or aliases, so that we have less typing to do in each line of our code below.


### Code

In [None]:
import pandas as pd
import plotly.express as px
print('Libraries imported.')
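As a small illustration of how an alias saves typing, the sketch below builds a tiny table using the `pd` alias. The table contents are made up for this example and are not part of the tornado data.

```python
import pandas as pd  # "pd" is the alias (short name) for pandas

# A tiny made-up table, just to show the alias in use
example = pd.DataFrame({'Province': ['ON', 'SK'], 'Count': [3, 2]})

# Without the alias we would have to type pandas.DataFrame(...) each time
print(example)
```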


### Data

#### Import the Data

We are using data from the [Northern Tornadoes Project](https://www.uwo.ca/ntp). Run the cell below to import the data.

In [None]:
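# Read the tornado data from the web; dtype=str reads every column as text
dataset = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/data-viz-of-the-week/tornadoes/tornadoes.csv',dtype=str)
# Sort the rows by the _date column (compared as text)
dataset.sort_values('_date',inplace=True)
# Display the table
dataset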
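To get a feel for what sorting and inspecting a table looks like, here is a minimal sketch on a few made-up rows. The column names `_date`, `Year`, and `Province` come from this notebook, but the values are hypothetical, and writing dates as year-month-day is an assumption about how `_date` is stored.

```python
import pandas as pd

# Made-up rows that mimic a few of the columns used in this notebook
sample = pd.DataFrame({
    '_date': ['2019-07-01', '2017-06-15', '2018-08-20'],
    'Year': ['2019', '2017', '2018'],
    'Province': ['ON', 'SK', 'AB'],
})

# Sorting text works for dates when they are written year-month-day
sample = sample.sort_values('_date')

print(sample.head())         # the first rows of the sorted table
print(list(sample.columns))  # the names of the columns
```

Looking at the first few rows and the column names is a quick way to check that the data loaded as expected before making any graphs.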

## Explore

The first visualization is a line graph of the total number of tornadoes per year. Run the code below to generate it.

In [None]:
per_year = dataset.groupby('Year')['event_name'].count().reset_index(name='Number of Tornadoes')
px.line(per_year,x='Year',y='Number of Tornadoes',title='Number of Tornadoes in Canada per year')
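The counting step in the cell above can be tried on a small made-up table. This is only a sketch: the event names and years below are hypothetical, and `per_year_sample` is a name chosen for this example.

```python
import pandas as pd

# Six made-up events: two in 2017, one in 2018, three in 2019
sample = pd.DataFrame({
    'Year': ['2017', '2017', '2018', '2019', '2019', '2019'],
    'event_name': ['A', 'B', 'C', 'D', 'E', 'F'],
})

# Group the rows by year, count the events in each group,
# and turn the result back into a table with a named count column
per_year_sample = (
    sample.groupby('Year')['event_name']
    .count()
    .reset_index(name='Number of Tornadoes')
)

print(per_year_sample)
```

Each row of the result holds one year and the number of events recorded for it, which is the shape a line graph needs.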

Next, we created a bar graph of the number of tornadoes per year. Each bar is stacked by month, so the colors show how many tornadoes occurred in each month of that year. To hide a month, click it in the legend. This works like a toggle switch: click the month again to show it.

You can also double-click a month in the legend to show only that month, which makes it easier to compare that month across the years.

In [None]:
year_month = dataset.groupby(['Year','month'])['event_name'].count().reset_index(name='Number of Tornadoes')
year_month.sort_values('Year',inplace=True)
px.bar(year_month,x='Year',y='Number of Tornadoes',color='month',title='Number of Tornadoes per month in each year')
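Another way to compare months across years is to arrange the same counts as a table with one row per year and one column per month. The sketch below does this on made-up rows; writing months as names is an assumption, since the real `month` column may store them differently.

```python
import pandas as pd

# Made-up events with a year and a month
sample = pd.DataFrame({
    'Year': ['2018', '2018', '2018', '2019', '2019'],
    'month': ['June', 'June', 'July', 'June', 'July'],
    'event_name': ['A', 'B', 'C', 'D', 'E'],
})

# Count the events for each year and month pair
counts = (
    sample.groupby(['Year', 'month'])['event_name']
    .count()
    .reset_index(name='Number of Tornadoes')
)

# One row per year, one column per month; months with no events become 0
wide = counts.pivot(index='Year', columns='month',
                    values='Number of Tornadoes').fillna(0)

print(wide)
```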

We created a map of all the tornadoes in this data set; each dot represents a tornado. To hide the tornadoes for a specific year, click that year in the legend. Click it a second time to show it again. If you are only interested in one year, double-click that year in the legend.

In [None]:
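# Keep the year as text so each year gets its own color in the legend
dataset['Year'] = dataset['Year'].astype(str)
# Convert latitude (Y) and longitude (X) from text to numbers
dataset['Y'] = dataset['Y'].astype(float)
dataset['X'] = dataset['X'].astype(float)
# Plot each event as a dot on a map
fig = px.scatter_mapbox(dataset,lat='Y',lon='X',color='Year',zoom=2,height=600)
fig.update_layout(mapbox_style='open-street-map')
fig.show()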
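The cell above uses `astype(float)`, which stops with an error if a coordinate is not a number. A more forgiving alternative, shown here only as a sketch on made-up coordinates, is `pd.to_numeric` with `errors='coerce'`, which turns unreadable values into missing values that can then be dropped. The value `'unknown'` is hypothetical; the real data may not contain anything like it.

```python
import pandas as pd

# Made-up coordinates stored as text; one longitude is not a number
sample = pd.DataFrame({
    'X': ['-81.25', '-106.67', 'unknown'],  # longitude
    'Y': ['42.98', '52.13', '50.00'],       # latitude
})

# Convert text to numbers; values that cannot be converted become missing
sample['X'] = pd.to_numeric(sample['X'], errors='coerce')
sample['Y'] = pd.to_numeric(sample['Y'], errors='coerce')

# Keep only the rows that have both coordinates
plottable = sample.dropna(subset=['X', 'Y'])

print(plottable)
```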

Next, we created a map showing the events by type. This visualization works just like the others: click a type in the legend to turn it on or off, or double-click to look at one specific type.


In [None]:
dataset = dataset[dataset['event_type'].notna()]
fig = px.scatter_mapbox(dataset,lat='Y',lon='X',color='event_type',zoom=2,height=600)
fig.update_layout(mapbox_style='open-street-map')
fig.show()
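The first line of the cell above removes rows that have no event type. The sketch below shows the same idea on made-up rows, and also counts how many rows are missing a type before removing them. The names `missing` and `kept` are chosen for this example.

```python
import pandas as pd

# Made-up events; the second one has no event type recorded
sample = pd.DataFrame({
    'event_name': ['A', 'B', 'C'],
    'event_type': ['Tornado (Over Land)', None, 'Downburst'],
})

# Count the rows where the event type is missing
missing = sample['event_type'].isna().sum()

# Keep only the rows where the event type is present
kept = sample[sample['event_type'].notna()]

print('Rows missing a type:', missing)
print(kept)
```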

The final visualization is a bar graph of the number of tornadoes in each province. The bars are stacked by type, and each type is represented by a different color.

The Northern Tornadoes Project defines each type as follows.

- Tornado (Over Land) includes tornadoes that occur over land at any point during their lifetime.

- Tornado (Over Water) includes tornadoes that occur entirely over water during their lifetime.
 
- Downburst includes microbursts, defined as downbursts with a maximum diameter of 4 km or less.

- Non-Tornadic Vortex includes VFCAs (vortex-funnel clouds aloft), sub-tornadic vortices, gustnadoes, and dust devils.

- Unclassified Wind Damage is assigned if there is not enough evidence to classify the cause of wind damage as either a tornado or a downburst, and additional evidence is not expected.

- Unclassified Visual Vortex is assigned if there is not enough evidence to classify a reported vortex as either a tornado or a non-tornadic vortex, and additional evidence is not expected.

In [None]:
by_province = dataset.groupby(['Province','event_type'])['Year'].count().reset_index(name='Number of Tornadoes')
by_province.sort_values('Number of Tornadoes',ascending=False,inplace=True)
px.bar(by_province,x='Province',y='Number of Tornadoes',color='event_type',title='Number of Tornadoes in each province')
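The bar graph above counts every event type, including downbursts and other events that are not tornadoes. To answer our question about tornadoes only, one option is to keep just the two tornado types before counting. The sketch below does this on made-up rows. It assumes the `event_type` labels are spelled exactly as in the definitions above, which may not match the real data.

```python
import pandas as pd

# Made-up events in three provinces
sample = pd.DataFrame({
    'Province': ['ON', 'ON', 'SK', 'SK', 'SK', 'AB'],
    'event_type': ['Tornado (Over Land)', 'Downburst',
                   'Tornado (Over Land)', 'Tornado (Over Water)',
                   'Tornado (Over Land)', 'Downburst'],
})

# Assumed labels for the two tornado types
tornado_types = ['Tornado (Over Land)', 'Tornado (Over Water)']

# Keep only the rows that are tornadoes
tornadoes_only = sample[sample['event_type'].isin(tornado_types)]

# Count the tornadoes in each province, largest count first
totals = tornadoes_only['Province'].value_counts()

print(totals)
print('Most tornadoes:', totals.idxmax())
```

In this made-up table, AB does not appear in the result because its only event is a downburst.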

## Interpret

## Reflect on What You See

After making your visualizations, the next step is to use the data and your visualizations to answer the question. Look at and interact with the visualizations above. When you hover your mouse over the plots, you’ll notice more information appears. You can also use the legend to make parts of a plot appear and disappear.

Think about the following questions.

- What do you notice about these graphs?
- What do you wonder about the data?
- What kind of inferences can you make based on this data?
- Is there another way to visualize this data that would change your interpretation of the information?

- Did some years seem to have more tornadoes than other years?
- Did some times of year have more tornadoes than other times of year?
- Were some parts of the country more likely to have tornadoes than others?

Use the fill-in-the-blank prompts to summarize your thoughts.

- "I used to think _______"
- "Now I think _______"
- "I wish I knew more about _______"
- "These data visualizations remind me of _______"
- "I really like _______"

## Communicate 

If you have not yet done so, use the plots to answer our question about where in Canada the most tornadoes have occurred. Once we understand where tornadoes occur, how can we use that information?

How can you communicate that information? What kind of product could you create to share that information with your school community and wider community?

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)