# Consecutive Hot Days, 2013-2022 — Analysis

The code identifies runs of consecutive days on which the maximum heat index was 105°F or higher, the level at which sunstroke, heat cramps, or heat exhaustion are likely and heat stroke is possible with prolonged exposure and/or physical activity. For each county, it also computes the average number of such runs per year and the highest temperatures and heat indices recorded during them.

Read more on the thresholds [here](https://web.archive.org/web/20230619070053/https://www.noaa.gov/jetstream/global/heat-index).
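
To make the definition concrete, here is a minimal sketch of what counts as a run of consecutive hot days. The function name `hot_runs`, the `min_length` parameter, and the sample values are illustrative assumptions and are not part of the notebook; only the 105°F threshold comes from the analysis.

```python
from itertools import groupby

THRESHOLD_F = 105  # heat-index threshold used throughout this analysis


def hot_runs(daily_max_index, threshold=THRESHOLD_F, min_length=2):
    """Return the lengths of runs of consecutive days at or above the threshold."""
    runs = []
    # groupby() splits the sequence into stretches of hot and not-hot days
    for is_hot, group in groupby(daily_max_index, key=lambda v: v >= threshold):
        length = len(list(group))
        # a single hot day on its own is not a run of consecutive days
        if is_hot and length >= min_length:
            runs.append(length)
    return runs


# Invented values for eight consecutive days (not taken from the dataset)
sample = [101.0, 106.2, 108.9, 99.5, 105.0, 110.3, 112.1, 104.9]
runs = hot_runs(sample)
print(runs)
```

With these invented values there are two runs, one of two days and one of three days. The isolated readings below 105°F break the runs.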

## Importing libraries

In [1]:
import pandas as pd
import numpy as np
from statistics import mode

## Importing dataset

In [2]:
data_rec = pd.read_csv("generated_data/recent_hi.csv")
data_rec = data_rec.sort_values(by = ["COUNTY", "DATE"], ascending = True).reset_index(drop = True) 

data_rec

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,VPDMIN,VPDMAX,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY
0,Alachua,-82.3576,29.6748,147.0,2013-01-01,0.0,36.8,53.1,69.3,47.9,0.16,11.53,2013-01,2013,2013-01,2013,35.4,69.6,52.5,82.5
1,Alachua,-82.3576,29.6748,147.0,2013-01-02,0.0,52.4,63.3,74.3,55.6,0.59,13.48,2013-01,2013,2013-01,2013,51.7,74.7,63.1,76.0
2,Alachua,-82.3576,29.6748,147.0,2013-01-03,0.0,58.3,66.9,75.5,62.3,0.51,8.84,2013-01,2013,2013-01,2013,58.1,76.1,67.1,85.2
3,Alachua,-82.3576,29.6748,147.0,2013-01-04,0.2,46.4,53.2,60.0,51.9,0.27,1.42,2013-01,2013,2013-01,2013,45.8,60.1,52.9,95.3
4,Alachua,-82.3576,29.6748,147.0,2013-01-05,0.0,40.9,50.1,59.2,45.7,0.38,5.09,2013-01,2013,2013-01,2013,39.8,59.0,49.4,84.8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
244679,Washington,-85.6654,30.6106,121.0,2022-12-27,0.0,20.1,36.5,53.0,22.4,0.43,9.86,2022-12,2022,2022-12,2022,17.3,51.8,34.5,56.2
244680,Washington,-85.6654,30.6106,121.0,2022-12-28,0.0,28.1,43.7,59.3,27.9,0.45,11.78,2022-12,2022,2022-12,2022,25.6,58.4,42.0,53.4
244681,Washington,-85.6654,30.6106,121.0,2022-12-29,0.0,29.0,48.0,66.9,41.4,0.57,13.38,2022-12,2022,2022-12,2022,27.1,66.9,47.1,77.8
244682,Washington,-85.6654,30.6106,121.0,2022-12-30,0.0,46.7,61.3,76.0,49.6,0.66,18.21,2022-12,2022,2022-12,2022,45.4,76.2,60.8,65.4
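
The steps that follow treat adjacent rows of the same county as adjacent calendar days. A quick check along the following lines could confirm that assumption. This is only a sketch on an invented toy frame that borrows the `COUNTY` and `DATE` column names; the names `toy`, `gap`, and `breaks` are hypothetical.

```python
import pandas as pd

# Toy frame with the same COUNTY and DATE columns as the dataset;
# the rows are invented for illustration.
toy = pd.DataFrame({
    "COUNTY": ["Alachua"] * 3 + ["Baker"] * 3,
    "DATE": ["2013-01-01", "2013-01-02", "2013-01-03",
             "2013-01-01", "2013-01-03", "2013-01-04"],
})
toy["DATE"] = pd.to_datetime(toy["DATE"])
toy = toy.sort_values(["COUNTY", "DATE"]).reset_index(drop=True)

# Gap between each row and the previous row of the same county
gap = toy.groupby("COUNTY")["DATE"].diff()

# Rows whose gap exists (not a county's first row) and is not exactly one day
breaks = toy[gap.notna() & (gap != pd.Timedelta(days=1))]
print(breaks)
```

An empty `breaks` frame would mean every county has one row per day with no missing dates. In the toy data, the Baker row for 2013-01-03 is flagged because 2013-01-02 is missing.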


## Marking consecutive days with heat index >= 105°F

### Creating copy of dataframe

In [3]:
df_recent = data_rec.copy()

### Creating list of counties

In [4]:
### stores the unique county names
counties = df_recent["COUNTY"].unique()

### prints recorded data
print("Counties: " + str(len(counties)))
print(counties)

Counties: 67
['Alachua' 'Baker' 'Bay' 'Bradford' 'Brevard' 'Broward' 'Calhoun'
 'Charlotte' 'Citrus' 'Clay' 'Collier' 'Columbia' 'DeSoto' 'Dixie' 'Duval'
 'Escambia' 'Flagler' 'Franklin' 'Gadsden' 'Gilchrist' 'Glades' 'Gulf'
 'Hamilton' 'Hardee' 'Hendry' 'Hernando' 'Highlands' 'Hillsborough'
 'Holmes' 'Indian River' 'Jackson' 'Jefferson' 'Lafayette' 'Lake' 'Lee'
 'Leon' 'Levy' 'Liberty' 'Madison' 'Manatee' 'Marion' 'Martin'
 'Miami-Dade' 'Monroe' 'Nassau' 'Okaloosa' 'Okeechobee' 'Orange' 'Osceola'
 'Palm Beach' 'Pasco' 'Pinellas' 'Polk' 'Putnam' 'Santa Rosa' 'Sarasota'
 'Seminole' 'St. Johns' 'St. Lucie' 'Sumter' 'Suwannee' 'Taylor' 'Union'
 'Volusia' 'Wakulla' 'Walton' 'Washington']


### Recording consecutive days

In [5]:
### creates column to mark consecutive days with default as "N"
df_recent["CONSECUTIVE"] = "N"

### loop runs through each county
for i in range(0, len(counties)):
    
    ### stores county 
    location = counties[i]
    
    ### loop runs through each pair of adjacent rows in the temperature dataframe
    for j in range(0, (len(df_recent)-1)):
        
        ### checks if both rows of the pair belong to the stored county
        if (df_recent["COUNTY"][j] == location) and (df_recent["COUNTY"][j+1] == location):
            
            ### checks if the maximum heat index of both rows is at or above the threshold value
            if df_recent["TMAX_INDEX"][j] >= 105 and df_recent["TMAX_INDEX"][j+1] >= 105:
                
                ### marks both rows as consecutive; .loc writes to the dataframe itself instead of using chained assignment
                df_recent.loc[j, "CONSECUTIVE"] = "Y"
                df_recent.loc[j+1, "CONSECUTIVE"] = "Y"


In [6]:
### displays dataframe
df_recent[df_recent["CONSECUTIVE"] == "Y"]

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,...,VPDMAX,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,CONSECUTIVE
153,Alachua,-82.3576,29.6748,147.0,2013-06-03,0.69,70.2,80.5,90.7,72.1,...,23.73,2013-06,2013,2013-06,2013,70.3,112.4,84.6,75.7,Y
154,Alachua,-82.3576,29.6748,147.0,2013-06-04,0.50,71.4,80.8,90.2,72.7,...,22.46,2013-06,2013,2013-06,2013,71.6,111.3,85.4,76.5,Y
159,Alachua,-82.3576,29.6748,147.0,2013-06-09,0.24,70.5,80.7,90.9,71.0,...,24.39,2013-06,2013,2013-06,2013,70.6,110.6,84.5,72.5,Y
160,Alachua,-82.3576,29.6748,147.0,2013-06-10,0.10,71.8,80.6,89.5,73.1,...,20.56,2013-06,2013,2013-06,2013,72.1,110.0,85.2,78.0,Y
161,Alachua,-82.3576,29.6748,147.0,2013-06-11,0.40,71.3,80.2,89.1,72.9,...,20.44,2013-06,2013,2013-06,2013,71.6,109.0,84.4,78.5,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
244582,Washington,-85.6654,30.6106,121.0,2022-09-21,0.00,66.4,79.3,92.2,69.5,...,27.69,2022-09,2022,2022-09,2022,66.3,114.6,79.8,72.1,Y
244583,Washington,-85.6654,30.6106,121.0,2022-09-22,0.00,65.6,79.6,93.5,69.4,...,31.82,2022-09,2022,2022-09,2022,65.4,118.3,82.4,71.1,Y
244584,Washington,-85.6654,30.6106,121.0,2022-09-23,0.00,66.6,80.8,95.0,70.9,...,31.82,2022-09,2022,2022-09,2022,66.5,124.7,84.6,72.0,Y
244587,Washington,-85.6654,30.6106,121.0,2022-09-26,0.00,61.2,76.1,91.0,68.8,...,25.67,2022-09,2022,2022-09,2022,60.9,115.6,76.6,78.2,Y
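
The marking step visits every row once per county, which is slow on roughly 245,000 rows. As a hedged alternative, the same rule (a hot day counts only if the previous or next day in the same county is also hot) can be sketched with grouped shifts. The toy frame and the names `hot`, `prev_hot`, and `next_hot` are illustrative; this is not the notebook's method.

```python
import pandas as pd

# Invented rows; only COUNTY and TMAX_INDEX matter for the marking rule.
toy = pd.DataFrame({
    "COUNTY": ["Alachua"] * 5 + ["Baker"] * 3,
    "TMAX_INDEX": [106.0, 107.5, 98.0, 99.0, 108.0, 110.0, 90.0, 111.0],
})

hot = toy["TMAX_INDEX"] >= 105
hot_int = hot.astype(int)

# Was the previous / next row of the SAME county hot?
# shift() leaves NaN at county edges, and NaN == 1 evaluates to False.
prev_hot = hot_int.groupby(toy["COUNTY"]).shift(1) == 1
next_hot = hot_int.groupby(toy["COUNTY"]).shift(-1) == 1

toy["CONSECUTIVE"] = "N"
toy.loc[hot & (prev_hot | next_hot), "CONSECUTIVE"] = "Y"
print(toy)
```

Only the first two Alachua rows are marked. The last Alachua row and the first Baker row are both hot and adjacent in the frame, but they belong to different counties, so neither is marked.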


### Filtering for only consecutive days above threshold

In [7]:
### creates new dataframe after dropping all rows which are not above threshold
df_recent_H = df_recent[(df_recent["CONSECUTIVE"] == "Y")]
df_recent_H.reset_index(drop = True, inplace = True) 

### displays dataframe
df_recent_H

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,...,VPDMAX,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,CONSECUTIVE
0,Alachua,-82.3576,29.6748,147.0,2013-06-03,0.69,70.2,80.5,90.7,72.1,...,23.73,2013-06,2013,2013-06,2013,70.3,112.4,84.6,75.7,Y
1,Alachua,-82.3576,29.6748,147.0,2013-06-04,0.50,71.4,80.8,90.2,72.7,...,22.46,2013-06,2013,2013-06,2013,71.6,111.3,85.4,76.5,Y
2,Alachua,-82.3576,29.6748,147.0,2013-06-09,0.24,70.5,80.7,90.9,71.0,...,24.39,2013-06,2013,2013-06,2013,70.6,110.6,84.5,72.5,Y
3,Alachua,-82.3576,29.6748,147.0,2013-06-10,0.10,71.8,80.6,89.5,73.1,...,20.56,2013-06,2013,2013-06,2013,72.1,110.0,85.2,78.0,Y
4,Alachua,-82.3576,29.6748,147.0,2013-06-11,0.40,71.3,80.2,89.1,72.9,...,20.44,2013-06,2013,2013-06,2013,71.6,109.0,84.4,78.5,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
73936,Washington,-85.6654,30.6106,121.0,2022-09-21,0.00,66.4,79.3,92.2,69.5,...,27.69,2022-09,2022,2022-09,2022,66.3,114.6,79.8,72.1,Y
73937,Washington,-85.6654,30.6106,121.0,2022-09-22,0.00,65.6,79.6,93.5,69.4,...,31.82,2022-09,2022,2022-09,2022,65.4,118.3,82.4,71.1,Y
73938,Washington,-85.6654,30.6106,121.0,2022-09-23,0.00,66.6,80.8,95.0,70.9,...,31.82,2022-09,2022,2022-09,2022,66.5,124.7,84.6,72.0,Y
73939,Washington,-85.6654,30.6106,121.0,2022-09-26,0.00,61.2,76.1,91.0,68.8,...,25.67,2022-09,2022,2022-09,2022,60.9,115.6,76.6,78.2,Y


## Computing number of instances of consecutive days above threshold per county

In [8]:
### adds the counter column to be written into (0 = not yet assigned) and converts "DATE" to date-time format;
### assign() returns a new dataframe, so nothing is set on a slice of df_recent
df_recent_H = df_recent_H.assign(
    SL_COUNTER = 0,
    DATE = pd.to_datetime(df_recent_H["DATE"]),
)


### Recording number of instances

In [9]:
### loop runs through each county
for i in range(0, len(counties)):
    
    ### stores county
    location = counties[i]
    
    ### initializes a variable to number the instances of consecutive hot days in that county
    count = 1
    
    ### loop runs through each pair of adjacent rows in the above-threshold dataframe
    for j in range(0, (len(df_recent_H)-1)):

        ### checks if both rows of the pair belong to the stored county
        if (df_recent_H["COUNTY"][j] == location) and (df_recent_H["COUNTY"][j+1] == location):
            
            ### calculates and stores the difference between the dates of the two rows
            result = df_recent_H["DATE"][j+1] - df_recent_H["DATE"][j]
            
            ### checks if the difference between the dates is exactly 1 day
            if result == pd.Timedelta(days = 1):
                
                ### marks both rows as part of the same instance; .loc replaces the chained assignment
                df_recent_H.loc[j, "SL_COUNTER"] = count
                df_recent_H.loc[j+1, "SL_COUNTER"] = count
            
            ### moves on to the next instance number if the dates are not adjacent
            else:
                ### increases counter
                count = count + 1


In [10]:
### displays dataframe
df_recent_H

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,...,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,CONSECUTIVE,SL_COUNTER
0,Alachua,-82.3576,29.6748,147.0,2013-06-03,0.69,70.2,80.5,90.7,72.1,...,2013-06,2013,2013-06,2013,70.3,112.4,84.6,75.7,Y,1
1,Alachua,-82.3576,29.6748,147.0,2013-06-04,0.50,71.4,80.8,90.2,72.7,...,2013-06,2013,2013-06,2013,71.6,111.3,85.4,76.5,Y,1
2,Alachua,-82.3576,29.6748,147.0,2013-06-09,0.24,70.5,80.7,90.9,71.0,...,2013-06,2013,2013-06,2013,70.6,110.6,84.5,72.5,Y,2
3,Alachua,-82.3576,29.6748,147.0,2013-06-10,0.10,71.8,80.6,89.5,73.1,...,2013-06,2013,2013-06,2013,72.1,110.0,85.2,78.0,Y,2
4,Alachua,-82.3576,29.6748,147.0,2013-06-11,0.40,71.3,80.2,89.1,72.9,...,2013-06,2013,2013-06,2013,71.6,109.0,84.4,78.5,Y,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
73936,Washington,-85.6654,30.6106,121.0,2022-09-21,0.00,66.4,79.3,92.2,69.5,...,2022-09,2022,2022-09,2022,66.3,114.6,79.8,72.1,Y,104
73937,Washington,-85.6654,30.6106,121.0,2022-09-22,0.00,65.6,79.6,93.5,69.4,...,2022-09,2022,2022-09,2022,65.4,118.3,82.4,71.1,Y,104
73938,Washington,-85.6654,30.6106,121.0,2022-09-23,0.00,66.6,80.8,95.0,70.9,...,2022-09,2022,2022-09,2022,66.5,124.7,84.6,72.0,Y,104
73939,Washington,-85.6654,30.6106,121.0,2022-09-26,0.00,61.2,76.1,91.0,68.8,...,2022-09,2022,2022-09,2022,60.9,115.6,76.6,78.2,Y,105
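
The numbering above can also be expressed without explicit loops: a new instance starts wherever the gap to the county's previous row is not one day, and a per-county cumulative sum of those starts gives the instance number. The sketch below reuses the five Alachua dates shown in the output, plus two invented Baker rows; `new_run` is a hypothetical helper name.

```python
import pandas as pd

toy = pd.DataFrame({
    "COUNTY": ["Alachua"] * 5 + ["Baker"] * 2,
    "DATE": pd.to_datetime([
        "2013-06-03", "2013-06-04", "2013-06-09", "2013-06-10", "2013-06-11",
        "2013-07-01", "2013-07-02",  # invented Baker dates
    ]),
})

# True where a row starts a new instance: the gap to the county's previous
# row is not exactly one day (a county's first row has gap NaT, so it is True).
new_run = toy.groupby("COUNTY")["DATE"].diff() != pd.Timedelta(days=1)

# Counting the starts cumulatively within each county numbers the instances 1, 2, 3, ...
toy["SL_COUNTER"] = new_run.astype(int).groupby(toy["COUNTY"]).cumsum()
print(toy)
```

For the Alachua rows this reproduces the values in the table: instance 1 for June 3 and 4, instance 2 for June 9 through 11.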


### Exporting dataframe

In [11]:
df_recent_H.to_csv("generated_data/hotdays_consecutive.csv", index = False)

## Computing county figures

### Creating spreadsheet to write into

In [12]:
with open("generated_data/hotdays_countydetails.csv", "w") as f1:
    f1.write("COUNTY" + "\t" + "MIN_TEMP_HIGHEST" + "\t" + "MAX_TEMP_HIGHEST" + "\t" + "MEAN_TEMP_HIGHEST" + "\t" + "MIN_INDEX_HIGHEST" + "\t" + "MAX_INDEX_HIGHEST" + "\t" + "MEAN_INDEX_HIGHEST" + "\t" + "SL_COUNTER" + "\t" + "HOT_AVG" + "\t" + "HOT_LONGEST" + "\t" + "HOT_LONGEST_DAYS" + "\n")

### Computing figures and recording in spreadsheet

In [13]:
### loop runs through each county
for i in range(0, len(counties)):
    
    ### stores county
    location = counties[i]
    
    ### creates empty list to later store temperatures
    temps_min = []
    temps_max = []
    temps_mean = []
    
    ### creates empty list to later store heat indices
    index_min = []
    index_max = []
    index_mean = []

    ### creates empty list to later store the serial number from the heat wave counter
    temps_hot_no = []
    
    ### loop runs through each row of the above-threshold dataframe
    for j in range(0, len(df_recent_H)):
        
        ### checks if the row belongs to the stored county
        if df_recent_H["COUNTY"][j] == location:
            
            ### fills up temperature details in the created lists
            temps_min.append(df_recent_H["TMIN"][j])
            temps_max.append(df_recent_H["TMAX"][j])
            temps_mean.append(df_recent_H["TMEAN"][j])
            
            ### fills up heat indices details in the created lists
            index_min.append(df_recent_H["TMIN_INDEX"][j])
            index_max.append(df_recent_H["TMAX_INDEX"][j])
            index_mean.append(df_recent_H["TMEAN_INDEX"][j])

            ### fills up instance number in the created list
            temps_hot_no.append(df_recent_H["SL_COUNTER"][j])
    
    ### calculates and stores the highest temperatures and heat indices recorded during the periods
    min_max = np.max(temps_min)
    max_max = np.max(temps_max)
    mean_max = np.max(temps_mean)
    indexmin_max = np.max(index_min)
    indexmax_max = np.max(index_max)
    indexmean_max = np.max(index_mean)
    
    ### stores the total number of instances of consecutive hot days per location
    hot_count = np.max(temps_hot_no)
    ### calculates and stores the average number of instances per year over the 10 years 2013-2022; int() truncates, so 11.9 is stored as 11
    hot_avg = int(hot_count/10)
    
    ### stores the instance number for the one that lasts the longest
    hot_mode = mode(temps_hot_no)
    ### calculates the number of days the longest instance lasted
    hot_max = temps_hot_no.count(hot_mode)
    
    ### writes into spreadsheet
    with open("generated_data/hotdays_countydetails.csv", "a") as f2:
        f2.write(str(location) + "\t" + str(min_max) + "\t" + str(max_max) + "\t" + str(mean_max) + "\t"
                 + str(indexmin_max) + "\t" + str(indexmax_max) + "\t" + str(indexmean_max) + "\t" + str(hot_count) + 
                 "\t" + str(hot_avg) + "\t" + str(hot_mode) + "\t" + str(hot_max) + "\n")
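
The yearly average is the total number of instances divided by the 10 years covered, and `int()` drops the fractional part. A small worked example using the Alachua total of 119 shows the difference; the names `n_years`, `truncated`, and `rounded` are illustrative only.

```python
hot_count = 119   # total instances recorded for Alachua (SL_COUNTER)
n_years = 10      # 2013 through 2022 inclusive

truncated = int(hot_count / n_years)     # what the loop above stores as HOT_AVG
rounded = round(hot_count / n_years, 1)  # keeps one decimal place instead

print(truncated, rounded)
```

Keeping one decimal place would be a possible refinement if the fractional part matters for comparing counties.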

### Importing and displaying dataset

In [14]:
df_hotdays_details = pd.read_csv("generated_data/hotdays_countydetails.csv", sep = "\t")

df_hotdays_details

Unnamed: 0,COUNTY,MIN_TEMP_HIGHEST,MAX_TEMP_HIGHEST,MEAN_TEMP_HIGHEST,MIN_INDEX_HIGHEST,MAX_INDEX_HIGHEST,MEAN_INDEX_HIGHEST,SL_COUNTER,HOT_AVG,HOT_LONGEST,HOT_LONGEST_DAYS
0,Alachua,79.2,99.6,88.2,79.7,133.2,98.3,119,11,66,62
1,Baker,77.6,100.8,88.4,78.2,137.3,98.9,112,11,109,71
2,Bay,80.2,98.6,88.5,84.1,131.3,99.3,101,10,26,40
3,Bradford,78.1,99.9,88.6,78.7,138.2,100.1,115,11,65,61
4,Brevard,81.2,96.4,86.6,86.4,126.0,98.9,94,9,91,84
...,...,...,...,...,...,...,...,...,...,...,...
62,Union,78.4,102.0,89.8,79.0,143.0,101.7,108,10,105,74
63,Volusia,78.9,98.6,87.1,79.6,130.3,98.5,113,11,38,69
64,Wakulla,77.1,101.9,88.0,77.3,136.7,96.9,107,10,28,37
65,Walton,77.5,100.4,87.6,77.9,133.1,97.4,107,10,72,31
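
As a cross-check on the figures above, the per-county maxima and the longest instance can be sketched with `groupby`. The toy frame reuses the five Alachua rows shown earlier and adds two invented Baker rows. Only a subset of the output columns is computed, and `run_lengths` is a hypothetical name.

```python
import pandas as pd

toy = pd.DataFrame({
    "COUNTY": ["Alachua"] * 5 + ["Baker"] * 2,
    "TMAX": [90.7, 90.2, 90.9, 89.5, 89.1, 95.0, 96.1],
    "TMAX_INDEX": [112.4, 111.3, 110.6, 110.0, 109.0, 120.0, 118.5],
    "SL_COUNTER": [1, 1, 2, 2, 2, 1, 1],
})

# Highest temperature and heat index per county, plus the number of instances
summary = toy.groupby("COUNTY").agg(
    MAX_TEMP_HIGHEST=("TMAX", "max"),
    MAX_INDEX_HIGHEST=("TMAX_INDEX", "max"),
    SL_COUNTER=("SL_COUNTER", "max"),
)

# Number of days in each instance, indexed by (COUNTY, SL_COUNTER)
run_lengths = toy.groupby(["COUNTY", "SL_COUNTER"]).size()

# Longest instance per county: its length in days and its instance number
summary["HOT_LONGEST_DAYS"] = run_lengths.groupby(level="COUNTY").max()
summary["HOT_LONGEST"] = (
    run_lengths.groupby(level="COUNTY").idxmax().map(lambda key: key[1])
)
print(summary)
```

On a tie, `idxmax` returns the first instance with the maximum length, which matches how `statistics.mode` picks the first mode it encounters.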
