I was interested in the how severe weather has changed in the US since record keeping began, specifically how many deaths,injuries and the amount of property damage has been done by severe weather in the US. NOAA data for storms and weather anomalies goes back to 1950. However, only tornadoes, hail and thunderstorm wind was recorded until 1996, after which 48 different weather events were recorded. Due to this, we will only look at data from 1996 to 2022. Data is Publically available on Google Cloud BigQuery, but each year has separate table so manually querying the data will be time consuming. This python code will run the same query on each years data table and combine the results in a single place.

Installing the google cloud library to access BigQuery through python


In [1]:
%%capture
%pip install --upgrade google-cloud-bigquery







Creating a google cloud bigquery client using a service account

In [2]:
from google.cloud import bigquery # to run queries on google clouds bigquery
from google.oauth2 import service_account # to access google cloud using a service account

credentials = service_account.Credentials.from_service_account_file(
    r"C:\Users\skicr\Downloads\alpine-tempo-392622-523114992507.json"
) 
google_cloud_project_id = 'alpine-tempo-392622' 
client = bigquery.Client(credentials=credentials,project=google_cloud_project_id)

Creating a list of identical SQL queries for each year of data being queried with the .format method.
1990-1995 will be included to allow for population interpolation between 1990 and 2000 and be removed after

In [3]:
years = range(1990,2023) #final argument of range function is not included
queries = []
for year in years: 
  query=('''
            SELECT
                states.state,
                storm_data.event_type,
                storm_data.deaths,
                storm_data.injuries,
                storm_data.damage
            FROM
              (
                SELECT
                  event_type,
                  SUM(deaths_direct+deaths_indirect) AS deaths, 
                  SUM(injuries_direct+injuries_indirect)  AS injuries, 
                  SUM(damage_crops+damage_property) AS damage,
                  LPAD(state_fips_code,2,'0') as fips_code
                FROM
                  `bigquery-public-data.noaa_historic_severe_storms.storms_{year}`
                
                GROUP BY
                fips_code,event_type
              ) as storm_data
            right JOIN 
              `bigquery-public-data.geo_us_boundaries.states` AS states 
            ON 
              states.state_fips_code=storm_data.fips_code                           
            ORDER by 
              states.state
            ''').format(year=year)
    
  queries.append(query)


Installing pandas to make working with the data more convenient

In [4]:
%%capture
%pip install pandas 

Performing each query and combining the data

In [5]:
import pandas as pd

# to store the results of each query
results_list=[] 


for x, query in enumerate(queries):
   query = client.query(queries[x])           #running the query
   query_res = query.result().to_dataframe()  #converting the results to a dataframe
   query_res.insert(1,'year',years[x])        #adds a column to the results to show what year its from
   results_list.append(query_res)             #adding the dataframe to a list of all the results

#combining the list of tables into a single dataframe
event_type_df=pd.concat(results_list,ignore_index=True) 
event_type_df.tail()

Unnamed: 0,state,year,event_type,deaths,injuries,damage
24788,WY,2022,winter storm,0,0,0
24789,WY,2022,winter weather,0,0,0
24790,WY,2022,cold/wind chill,0,0,0
24791,WY,2022,thunderstorm wind,0,0,29000
24792,WY,2022,extreme cold/wind chill,0,0,0


The NOAA's data for damage is not adjusted for inflation so the cpi library will be used to convert all dollar values to current values and add a row for it in the results dataframe


In [6]:
%%capture
%pip install cpi

In [7]:
import cpi

#creating an inflation factor for each years dollars to convert it to current dollars(i.e. $1950*x=$2023)
inflation_factor = {} # creating a dictionary to store the factor for each year

for year in years:
   inflation_factor[year] = cpi.inflate(1,year) 

#adding a new column for inflation adjusted damage values using the apply method and inflation factor 
event_type_df['infl_adj_damage'] = event_type_df.apply(lambda row: row['damage']*inflation_factor[row['year']],axis=1)

event_type_df.head()



Unnamed: 0,state,year,event_type,deaths,injuries,damage,infl_adj_damage
0,AK,1990,,,,,
1,AL,1990,thunderstorm wind,0.0,96.0,0.0,0.0
2,AL,1990,hail,0.0,0.0,0.0,0.0
3,AL,1990,tornado,0.0,74.0,17800000.0,39856610.558531
4,AR,1990,thunderstorm wind,1.0,6.0,0.0,0.0


Examining the different event types

In [8]:
for x in event_type_df['event_type'].drop_duplicates():
    print(x)

None
thunderstorm wind
hail
tornado
tornadoes, tstm wind, hail
hail/icy roads
thunderstorm winds/flooding
thunderstorm winds/heavy rain
thunderstorm wind/ tree
thunderstorm wind/ trees
tornado/waterspout
thunderstorm winds funnel clou
hail flooding
thunderstorm winds/flash flood
thunderstorm winds lightning
thunderstorm winds/ flood
thunderstorm winds heavy rain
heat
flood
drought
blizzard
wildfire
avalanche
high wind
ice storm
lightning
heavy rain
heavy snow
winter storm
cold/wind chill
storm surge/tide
waterspout
flash flood
rip current
funnel cloud
winter weather
dust devil
dust storm
strong wind
dense fog
high surf
marine high wind
frost/freeze
coastal flood
tropical storm
hurricane (typhoon)
freezing fog
debris flow
sleet
astronomical low tide
lake-effect snow
seiche
volcanic ash
extreme cold/wind chill
excessive heat
northern lights
tsunami
dense smoke
lakeshore flood
tropical depression
sneakerwave
hurricane
volcanic ashfall


Some of these can be combined due to how similar they are such as heavy snow, blizzard, and lake effect snow. This is easily done with a dictionary and a function.

In [9]:
#keys are original event types and values are less specific terms
weather_dict={
    'funnel cloud':'tornado',
    'tornadoes, tstm wind, hail':'tornado',
    'thunderstorm wind':'thunderstorm',
    'thunderstorm winds/flooding':'thunderstorm', 
    'thunderstorm wind/ tree':'thunderstorm',
    'thunderstorm wind/ trees':'thunderstorm',
    'thunderstorm winds lightning':'thunderstorm',
    'thunderstorm winds/ flood':'thunderstorm',
    'thunderstorm winds/heavy rain':'thunderstorm',
    'thunderstorm winds funnel clou':'thunderstorm',
    'thunderstorm winds/flash flood':'thunderstorm',
    'thunderstorm winds heavy rain':'thunderstorm',
    'hail/icy roads':'hail',
    'tornado/waterspout':'tornado',
    'hail flooding':'hail',
    'hurricane (typhoon)':'hurricane',
    'high wind':'high winds',
    'strong wind':'high winds',
    'marine high wind':'high winds',
    'blizzard':'winter storm',
    'heavy snow':'winter storm',
    'lake-effect snow':'winter storm',
    'heat':'excessive heat',
    'cold/wind chill':'extreme cold',
    'extreme cold/wind chill':'extreme cold',
    'sleet':'winter weather',
    'freezing fog':'winter weather',
    'frost/freeze':'winter weather',
    'sneakerwave':'extreme waves',
    'high surf':'extreme waves',
    'seiche':'extreme waves',
    'lakeshore flood':'flood',
    'coastal flood':'flood'
    }

Function to update the dataframe

In [10]:
def weather_categories(w_type):
    if w_type in weather_dict:
        return weather_dict[w_type]
    else:
        return w_type

Applying the function to the event type column   

In [11]:
event_type_df['event_type']=event_type_df['event_type'].apply(weather_categories)

#checking the results
event_type_df['event_type'].drop_duplicates()

0                         None
1                 thunderstorm
2                         hail
3                      tornado
896             excessive heat
897                      flood
898                    drought
899               winter storm
900                   wildfire
901                  avalanche
902                 high winds
903                  ice storm
904                  lightning
905                 heavy rain
908               extreme cold
909           storm surge/tide
917                 waterspout
918                flash flood
919                rip current
922             winter weather
943                 dust devil
944                 dust storm
959                  dense fog
960              extreme waves
1028            tropical storm
1054                 hurricane
1231               debris flow
1820     astronomical low tide
3297              volcanic ash
5698           northern lights
8934                   tsunami
9338               dense smoke
9919    

This is a better list, although it is not perfect due to some types being very specific like dust devil and others being vague such as winter weather. This consolidation will have created rows with the same state, year, and event type which can easily be consolidated using the groupby function.

In [12]:

#checking if there is any duplicate state, year, event type combinations
if True in [*event_type_df.duplicated(subset=['state','year','event_type'],keep=False)]:
    print('there is duplicate rows')
else:
    print('there are no duplicate rows')

there is duplicate rows


In [13]:
#list of columns to group by
cols_to_group_by = ['state','year','event_type']

#grouping by list above and returning the sum of matching columns
new_event_type_df=event_type_df.groupby(cols_to_group_by,as_index=False).sum()

#sorting
new_event_type_df.sort_values(cols_to_group_by,inplace=True)

#checking if there is any duplicate state, year, event type combinations
if True in [*new_event_type_df.duplicated(subset=['state','year','event_type'],keep=False)]:
    print('there is duplicate rows')
else:
    print('there are no duplicate rows')
#selecting all rows where the year is greater than or equal to 1996
new_event_type_df=new_event_type_df.loc[new_event_type_df['year']>=1996]

new_event_type_df.head()

there are no duplicate rows


Unnamed: 0,state,year,event_type,deaths,injuries,damage,infl_adj_damage
1,AK,1996,avalanche,0,0,20000,37304.652645
2,AK,1996,drought,0,0,0,0.0
3,AK,1996,excessive heat,0,0,0,0.0
4,AK,1996,extreme cold,0,0,6000,11191.395793
5,AK,1996,flood,0,0,31000,57822.2116


Almost ready, the inflation adjusted damage column will be changed to an integer data type and all rows with all zero values will be removed

In [14]:
empty_rows=new_event_type_df.loc[(new_event_type_df[['deaths','injuries','damage','infl_adj_damage']]==0).all(axis=1)].index
new_event_type_df.drop(empty_rows,inplace=True)
new_event_type_df=new_event_type_df.astype({'infl_adj_damage':'int64'})
new_event_type_df.head()

Unnamed: 0,state,year,event_type,deaths,injuries,damage,infl_adj_damage
1,AK,1996,avalanche,0,0,20000,37304
4,AK,1996,extreme cold,0,0,6000,11191
5,AK,1996,flood,0,0,31000,57822
7,AK,1996,high winds,0,0,233000,434599
8,AK,1996,ice storm,0,26,115000,214501


This dataframe is ready to export to create a dashboard

To create columns for deaths and injuries normalized by population for each state, a separate data frame will be created without the event_type column. All other columns will be totaled by state and year. Then the US Census website will be scraped to find the population of each state for each year.

In [15]:
states_df=event_type_df.loc[:,['state','year','deaths','injuries','damage','infl_adj_damage']].groupby(['state','year'],as_index=False).sum()
states_df.head()

Unnamed: 0,state,year,deaths,injuries,damage,infl_adj_damage
0,AK,1990,0,0,0,0.0
1,AK,1991,0,0,0,0.0
2,AK,1992,0,0,0,0.0
3,AK,1993,0,0,500000,1012647.058824
4,AK,1994,0,0,0,0.0


Using a table with US census data going back to 1910 to get the population for each state

In [16]:
import requests

#using the requests library to get the html from the website
census_url = 'https://www.census.gov/data/tables/time-series/dec/popchange-data-text.html'
census_html = requests.get(census_url).text

#using the read_html function to create a list of dataframes for each table on the website
pop_df_list=pd.read_html(census_html)

#selecting the first table in the list because there is only one on the webpage
pop_df=pop_df_list[0]

 #preview the data
pop_df.head(20)




Unnamed: 0,State or Region,2020 Census,2010 Census,2000 Census,1990 Census,1980 Census,1970 Census,1960 Census,1950 Census,1940 Census,1930 Census,1920 Census,1910 Census
0,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States
1,Resident Population,331449281,308745538,281421906,248709873,226545805,203211926,179323175,151325798,132165129,123202660,106021568,92228531
2,Percent Change,7.4%,9.7%,13.2%,9.8%,11.5%,13.3%,18.5%,14.5%,7.3%,16.2%,15.0%,21.0%
3,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast,Northeast
4,Resident Population,57609148,55317240,53594378,50809229,49135283,49040703,44677819,39477986,35976777,34427091,29662053,25868573
5,Percent Change,4.1%,3.2%,5.5%,3.4%,0.2%,9.8%,13.2%,9.7%,4.5%,16.1%,14.7%,22.9%
6,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest,Midwest
7,Resident Population,68985454,66927001,64392776,59668632,58865670,56571663,51619139,44460762,40143332,38594100,34019792,29888542
8,Percent Change,3.1%,3.9%,7.9%,1.4%,4.1%,9.6%,16.1%,10.8%,4.0%,13.4%,13.8%,13.5%
9,South,South,South,South,South,South,South,South,South,South,South,South,South


This table has an interesting layout and will need to be rearranged to join it with the dataframe of results and
state names need to be converted to codes to match the dataframe. This will be done with the US library and a function


Installing the US library to convert between state names and codes

In [17]:
%%capture
%pip install us 

Defining the function to convert state names into state codes

In [18]:
%env DC_STATEHOOD = 1 #define before importing the us library to treat DC as a state, otherwise it returns NA
import us

def state_code_lookup(state_name):
    
    if type(state_name)==str:  #making sure the state name is a string
        state_name=state_name.lstrip('.')   #data from a table below had state names preceded by a period, this removes that
        if us.states.lookup(state_name) is not None:  #checking that the function will return a state
            return us.states.lookup(state_name).abbr  #returns the two letter state code
    else:
        return "Unknown" #to avoid returning any NA values if the string isnt a state

env: DC_STATEHOOD=1 #define before importing the us library to treat DC as a state, otherwise it returns NA


Now the table will be converted the format of the other dataframes with rows for each state and year

In [19]:
#setting the index as the state or region column for easier referencing
pop_df.set_index('State or Region',inplace=True)

newrow=[] # list of dictionaries to add the extracted data to

for col in pop_df.columns: #iterating over each column
    for i,x in  enumerate(pop_df[col]): #iterating over each row for the column
        if x == str(us.states.lookup(x)): #checks if the row is a state
            if int(col[:4]) < 1990: #checks that the year is 1990 or later 
                break
            else:
                #adding a dict containing state, population and year to the dict
                newrow.append(
                    
                    {'state':state_code_lookup(x),'population':int((pop_df[col])[i+1]),'year':int(col[:4])}
                    
                    )

#combining all the data into a single data frame
consol_df=pd.DataFrame(newrow)


consol_df.head()


Unnamed: 0,state,population,year
0,AL,5024279,2020
1,AK,733391,2020
2,AZ,7151502,2020
3,AR,3011524,2020
4,CA,39538223,2020


Merging the two sets of data.

In [20]:
merged_df=pd.DataFrame()
merged_df = states_df.merge(consol_df,on=['state','year'],how='left')
merged_df.head()



Unnamed: 0,state,year,deaths,injuries,damage,infl_adj_damage,population
0,AK,1990,0,0,0,0.0,550043.0
1,AK,1991,0,0,0,0.0,
2,AK,1992,0,0,0,0.0,
3,AK,1993,0,0,500000,1012647.058824,
4,AK,1994,0,0,0,0.0,


Because there is only a census every 10 years, there is no population data for the years between censuses. However, an estimate can be made by interpolating the data between census years. This is why 1990 census data was included even though we are not including 1990-1995 in this analysis

In [21]:
%%capture
%pip install numpy
import numpy as np

In [22]:

#merged_df.set_index(['state'],inplace=True) # setting the index to state for easier reference
states =merged_df['state'].drop_duplicates() #list of all states

merged_df.sort_values(['state','year'],inplace=True) #sorting data by state an then year
for state in states:
    mask = merged_df['state'] == state
    merged_df.loc[mask, ['population']] = merged_df.loc[mask, ['population']].interpolate(method='linear')
merged_df.head(11),merged_df.tail()


(   state  year  deaths  injuries    damage  infl_adj_damage  population
 0     AK  1990       0         0         0                0    550043.0
 1     AK  1991       0         0         0                0    557731.9
 2     AK  1992       0         0         0                0    565420.8
 3     AK  1993       0         0    500000   1012647.058824    573109.7
 4     AK  1994       0         0         0                0    580798.6
 5     AK  1995       0         0         0                0    588487.5
 6     AK  1996       1        27  11538000  21521054.110899    596176.4
 7     AK  1997       0         0   2092000   3814543.676012    603865.3
 8     AK  1998       0         0   2791000   5011043.588957    611554.2
 9     AK  1999      18        14   2197000   3859321.938776    619243.1
 10    AK  2000      13        18  10095500  17157366.739257    626932.0,
      state  year  deaths  injuries    damage  infl_adj_damage  population
 1843    WY  2018       4        10  10382000  1

This worked well except for the last two years of data because there wasn't a final value to interpolate between. The census also publishes population estimates for each year which we can access to fix the last two years of population data. This will be done by importing the correct excel file with data from the census bureau's website using beautiful soup

In [23]:
%%capture
%pip install beautifulsoup4
from bs4 import BeautifulSoup as bs

In [24]:
#url for the webpage with the file needed
url = 'https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html#v2022' 


html=requests.get(url).text #extracting the html
soup = bs(html,'html.parser') #creating a beautiful soup object for the webpage
filetracks = soup.find_all('a',filetrack=True,href=True) #selecting all links with a filetrack attribute

#the link we need is the second one on the webpage
filetrack=filetracks[1].get('href')
print(filetrack)


//www2.census.gov/programs-surveys/popest/tables/2020-2022/state/totals/NST-EST2022-POP.xlsx


The url has no scheme so it will not work to download the excel file. The urllib.parse library will be used to add the scheme

In [25]:

from urllib.parse import urlparse, urlunparse


In [26]:
filetrack=urlunparse(urlparse(filetrack,scheme='https'))
filetrack


'https://www2.census.gov/programs-surveys/popest/tables/2020-2022/state/totals/NST-EST2022-POP.xlsx'

Now we can import the file to a dataframe with the openpyxl library

In [27]:
%%capture
%pip install openpyxl



In [28]:
import openpyxl
more_pop_data = pd.read_excel(filetrack)
more_pop_data.head(10)


Unnamed: 0,table with row headers in column A and column headers in rows 3 through 4. (leading dots indicate sub-parts),Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4
0,Annual Estimates of the Resident Population fo...,,,,
1,Geographic Area,"April 1, 2020 Estimates Base",Population Estimate (as of July 1),,
2,,,2020,2021.0,2022.0
3,United States,331449520,331511512,332031554.0,333287557.0
4,Northeast,57609156,57448898,57259257.0,57040406.0
5,Midwest,68985537,68961043,68836505.0,68787595.0
6,South,126266262,126450613,127346029.0,128716192.0
7,West,78588565,78650958,78589763.0,78743364.0
8,.Alabama,5024356,5031362,5049846.0,5074296.0
9,.Alaska,733378,732923,734182.0,733583.0


Only data from columns 1, 2, and 3 are needed so these will be sliced, renamed, and state names converted to codes using the previously defined function.

In [29]:
#trimming to only the data needed
trmd_pop_data=more_pop_data.iloc[:,[0,3,4]] 

#renaming the columns
trmd_pop_data=trmd_pop_data.rename(columns={'table with row headers in column A and column headers in rows 3 through 4. (leading dots indicate sub-parts)':'State','Unnamed: 3':'2021','Unnamed: 4':'2022'})

#changing to state code instead state name to add to other table
trmd_pop_data['State']=trmd_pop_data.iloc[:,0].apply(state_code_lookup) 


trmd_pop_data.head(10)




Unnamed: 0,State,2021,2022
0,,,
1,,,
2,Unknown,2021.0,2022.0
3,,332031554.0,333287557.0
4,,57259257.0,57040406.0
5,,68836505.0,68787595.0
6,,127346029.0,128716192.0
7,,78589763.0,78743364.0
8,AL,5049846.0,5074296.0
9,AK,734182.0,733583.0


This dataframe will need to be converted to the same format as the other dataframe to join them together

In [30]:
#setting state as the index to find population values based on state and year
trmd_pop_data.set_index('State',inplace=True)

row_update=[] # to store dictionaries with data
for state in states: #using a list of states from earlier
    if state in trmd_pop_data.index: #checking if the state is in the new dataframe
        for year in trmd_pop_data.columns: #iterating over each column excluding the first
            row_update.append({'state':state,'year':int(year),'population':trmd_pop_data.loc[state,year]}) #creating a dictionary and adding it to the list

In [31]:
#converting the list of dictionaries to a dataframe
row_update_df = pd.DataFrame(row_update)

#setting the indices of each data frame to the same columns so the update function works properly
merged_df.set_index('year',append=True,inplace=True)
row_update_df.set_index(['state','year'], inplace=True)

#updating the dataframe with the new population data
merged_df.update(row_update_df)
merged_df.tail()


Unnamed: 0_level_0,Unnamed: 1_level_0,state,deaths,injuries,damage,infl_adj_damage,population
Unnamed: 0_level_1,year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
1843,2018,WY,4,10,10382000,12099798.930336,574206.0
1844,2019,WY,6,4,3212000,3676832.083612,575528.5
1845,2020,WY,2,35,27000,30530.715464,576851.0
1846,2021,WY,5,2,5314000,5739265.1216,576851.0
1847,2022,WY,3,2,25392380,25392380.0,576851.0


Now that we have population data for each year we can calculate deaths and injuries per 100k residents

In [32]:
#all remaining NA values will be 0
merged_df.fillna(0,inplace=True)

In [33]:

#resetting the index so state and year are columns
final_df=merged_df.reset_index()

#converting all number columns to integer data types
final_df = final_df.astype({'population':np.int64,'infl_adj_damage':np.int64})

#creating columns for deaths and injuries per 100k people
final_df['deaths/100k']=final_df.apply(lambda x: x['deaths']/x['population']*100000, axis=1)  
final_df['injuries/100k']=final_df.apply(lambda x: x['injuries']/x['population']*100000, axis=1)
final_df.head()

ZeroDivisionError: division by zero

Some rows still don't have population data, these will be selected 

In [34]:
final_df.loc[final_df['population']==0,['state']].drop_duplicates()

Unnamed: 0,state
99,AS
396,GU
891,MP
1650,VI


These are small US territories where population data is hard to find so they will be removed

In [35]:
#creating a list of all row indices where the population is 0
rows_to_drop=final_df.loc[final_df['population']==0].index

#using that list to drop the rows
final_df.drop(index=rows_to_drop,axis=0,inplace=True)

In [36]:
final_df['deaths/100k']=final_df.apply(lambda x: x['deaths']/x['population']*100000, axis=1)  
final_df['injuries/100k']=final_df.apply(lambda x: x['injuries']/x['population']*100000, axis=1)
final_df.tail()

Unnamed: 0,level_0,year,state,deaths,injuries,damage,infl_adj_damage,population,deaths/100k,injuries/100k
1843,1843,2018,WY,4,10,10382000,12099798,574206,0.696614,1.741535
1844,1844,2019,WY,6,4,3212000,3676832,575528,1.042521,0.695014
1845,1845,2020,WY,2,35,27000,30530,576851,0.34671,6.067425
1846,1846,2021,WY,5,2,5314000,5739265,576851,0.866775,0.34671
1847,1847,2022,WY,3,2,25392380,25392380,576851,0.520065,0.34671


Now that there is population data for all years, years 1990-1995 can be dropped as the event types were not recorded as thoroughly according to the NOAA

In [37]:
#selecting all rows where the year is greater than or equal to 1996
final_df=final_df.loc[final_df['year']>=1996]
#checking if there is any duplicate state, year combinations
if True in [*final_df.duplicated(subset=['state','year'],keep=False)]:
    print('there is duplicate rows')
else:
    print('there are no duplicate rows')

there are no duplicate rows


This dataframe is ready to be exported to create visualizations

In [38]:
%%capture
%pip install XlsxWriter


In [39]:
import xlsxwriter
with pd.ExcelWriter(r'c:\Users\skicr\Documents\Python Scripts\NOAA Storm Data\storm_data_by_year.xlsx') as writer:
    final_df.to_excel(writer,sheet_name='state_population_data',index=False,)
    new_event_type_df.to_excel(writer,sheet_name='event_type_data',index=False)  


All Done! This data will be used to create tableau visualizations. [Heat Maps by State](https://public.tableau.com/views/NOAASevereWeatherEventDeathsInjuriesandPropertyDamage1950-2022/DeathsInjuriesandPropertyDamageCausedbySevereWeather1950-2022?:language=en-US&:display_count=n&:origin=viz_share_link), [Heat Maps Story](https://public.tableau.com/shared/KQCJ3WZDG?:display_count=n&:origin=viz_share_link), [Event type Dashboard](https://public.tableau.com/shared/996QDKRMD?:display_count=n&:origin=viz_share_link)

Citations:



NOAA. (2023, April). Storm events database. National Centers for Environmental Information. https://www.ncdc.noaa.gov/stormevents/ftp.jsp 



United States Census Bureau. (2022, August 6). Historical Population Change Data (1910-2020). https://www.census.gov/data/tables/time-series/dec/popchange-data-text.html
			


Annual Estimates of the Resident Population for the United States, Regions, States, District of Columbia, and Puerto Rico: April 1, 2020 to July 1, 2022 (NST-EST2022-POP)				
Source: U.S. Census Bureau, Population Division				
Release Date: December 2022				
