# State Median Temperatures — Analysis

This notebook analyzes daily temperature data to compute Florida's statewide daily temperatures and heat indices, their overall and yearly medians, and the number of days per year on which the heat index reached selected thresholds.

Read more on the thresholds [here](https://web.archive.org/web/20230619070053/https://www.noaa.gov/jetstream/global/heat-index).

## Importing libraries

In [1]:
import pandas as pd
import time
from modules import relHumidity, heatIndex
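
The `modules` package that provides `relHumidity` and `heatIndex` is not shown in this notebook. As a rough illustration only, relative humidity can be estimated from air temperature and dew point (both in °F, as in the data) with the Magnus approximation. The function name `rel_humidity_magnus` and the constants are assumptions and may differ from what `relHumidity` actually does.

```python
import math

def rel_humidity_magnus(temp_f, dew_f):
    """Approximate relative humidity (%) from temperature and dew point in °F.

    Hypothetical stand-in for modules.relHumidity, using the Magnus
    approximation with constants b = 17.625 and c = 243.04 °C.
    """
    b, c = 17.625, 243.04
    temp_c = (temp_f - 32) / 1.8
    dew_c = (dew_f - 32) / 1.8
    # Ratio of actual to saturation vapour pressure
    ratio = math.exp(b * dew_c / (c + dew_c) - b * temp_c / (c + temp_c))
    return 100 * ratio

rh_example = rel_humidity_magnus(61.346269, 57.285075)
print(round(rh_example, 1))
```

When the dew point equals the air temperature the function returns 100, and the value falls as the gap between the two widens. Results from the notebook's own function need not agree with this sketch to the last decimal.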

## Importing datasets and creating master dataframe

In [2]:
data_hist = pd.read_csv("generated_data/historic_hi.csv")
data_rec = pd.read_csv("generated_data/recent_hi.csv")

Note: Data from 2023 is excluded because the year is incomplete.

In [3]:
### creating master dataframe
master = pd.concat([data_hist, data_rec], ignore_index = True)

### displays dataframe
master

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,VPDMIN,VPDMAX,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY
0,Alachua,-82.3576,29.6748,147.0,1990-01-01,0.24,51.7,65.8,79.8,60.9,0.16,18.00,1990-01,1990,1990-01,1990,51.1,84.3,65.9,84.2
1,Alachua,-82.3576,29.6748,147.0,1990-01-02,0.00,35.3,47.1,58.9,34.9,0.93,10.33,1990-01,1990,1990-01,1990,33.4,58.2,45.8,62.3
2,Alachua,-82.3576,29.6748,147.0,1990-01-03,0.00,40.0,52.6,65.3,44.0,1.15,11.13,1990-01,1990,1990-01,1990,38.6,65.1,51.8,72.5
3,Alachua,-82.3576,29.6748,147.0,1990-01-04,0.00,47.6,60.4,73.1,51.4,0.57,14.42,1990-01,1990,1990-01,1990,46.5,73.3,60.0,72.2
4,Alachua,-82.3576,29.6748,147.0,1990-01-05,0.00,52.0,64.4,76.8,60.3,0.47,13.53,1990-01,1990,1990-01,1990,51.5,77.5,64.5,86.6
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
978865,Washington,-85.6654,30.6106,121.0,2013-12-27,0.00,37.1,47.9,58.7,39.1,0.55,7.94,2013-12,2013,2013-12,2013,35.5,58.2,46.8,71.4
978866,Washington,-85.6654,30.6106,121.0,2013-12-28,0.00,44.1,51.7,59.2,38.4,1.55,9.30,2013-12,2013,2013-12,2013,42.6,58.4,50.6,60.3
978867,Washington,-85.6654,30.6106,121.0,2013-12-29,2.09,47.3,54.1,61.0,49.4,0.29,4.85,2013-12,2013,2013-12,2013,46.5,60.9,53.6,84.1
978868,Washington,-85.6654,30.6106,121.0,2013-12-30,0.02,37.0,46.8,56.6,43.7,0.21,4.11,2013-12,2013,2013-12,2013,35.8,56.4,46.1,88.9


## Computing average daily state figures

### Creates dataframe

In [4]:
### selects columns
columns = ["TMIN", "TMAX", "TMEAN", "TDMEAN"]

### computes the mean across counties for each date
state_daily_med = master.groupby("DATE", as_index = False)[columns].mean()

### displays dataframe
state_daily_med

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN
0,1983-01-01,55.591045,67.102985,61.346269,57.285075
1,1983-01-02,57.086567,68.802985,62.944776,59.405970
2,1983-01-03,56.510448,73.928358,65.225373,60.722388
3,1983-01-04,46.028358,63.697015,54.868657,49.065672
4,1983-01-05,45.962687,63.850746,54.904478,46.338806
...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164
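
The aggregation step above can be illustrated on a toy table. This is a minimal sketch with made-up values and hypothetical county names `A` and `B`. It shows that grouping by `DATE` and taking the mean gives one statewide row per day.

```python
import pandas as pd

# Toy county-level data (values are made up for illustration)
toy = pd.DataFrame({
    "COUNTY": ["A", "B", "A", "B"],
    "DATE": ["2000-01-01", "2000-01-01", "2000-01-02", "2000-01-02"],
    "TMEAN": [60.0, 70.0, 50.0, 54.0],
    "TDMEAN": [50.0, 56.0, 40.0, 42.0],
})

# One row per date: the mean of each column across counties
toy_state = toy.groupby("DATE", as_index=False)[["TMEAN", "TDMEAN"]].mean()
print(toy_state)
```

Because the mean is unweighted, every county contributes equally to the statewide figure regardless of its area or population.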


### Computing heat indices

In [5]:
### computes length of dataframe
len_st_daily_med = len(state_daily_med)

In [6]:
### creates columns to write into
state_daily_med["TMIN_INDEX"] = ""
state_daily_med["TMAX_INDEX"] = ""
state_daily_med["TMEAN_INDEX"] = ""
state_daily_med["REL_HUMIDITY"] = ""

### loop runs through dataframe
for i in range(0, len_st_daily_med):
    
    ### stores mean temperature and mean dewpoint
    temp_mean = state_daily_med.at[i, "TMEAN"]
    dew = state_daily_med.at[i, "TDMEAN"]
    
    ### calls predefined function and computes relative humidity
    rh = round(relHumidity(temp_mean, dew), 1)
    state_daily_med.at[i, "REL_HUMIDITY"] = rh
    
    ### calls predefined function and computes heat indices
    ### (.at writes to the dataframe itself and avoids chained assignment)
    state_daily_med.at[i, "TMIN_INDEX"] = heatIndex(state_daily_med.at[i, "TMIN"], rh)
    state_daily_med.at[i, "TMAX_INDEX"] = heatIndex(state_daily_med.at[i, "TMAX"], rh)
    state_daily_med.at[i, "TMEAN_INDEX"] = heatIndex(temp_mean, rh)
    
    ### prints out a progress report every 500 rows
    if i % 500 == 0:
        print("Completed: ", str(int(i/len_st_daily_med * 100)), "%")
        print("Rows completed: ", str(i))
        print("_______")
    elif i == (len_st_daily_med - 1):
        print("Completed: 100%")
        print("_______")
        print("_______")

Completed:  0 %
Rows completed:  0
_______
Completed:  3 %
Rows completed:  500
_______
Completed:  6 %
Rows completed:  1000
_______
Completed:  10 %
Rows completed:  1500
_______
Completed:  13 %
Rows completed:  2000
_______
Completed:  17 %
Rows completed:  2500
_______
Completed:  20 %
Rows completed:  3000
_______
Completed:  23 %
Rows completed:  3500
_______
Completed:  27 %
Rows completed:  4000
_______
Completed:  30 %
Rows completed:  4500
_______
Completed:  34 %
Rows completed:  5000
_______
Completed:  37 %
Rows completed:  5500
_______
Completed:  41 %
Rows completed:  6000
_______
Completed:  44 %
Rows completed:  6500
_______
Completed:  47 %
Rows completed:  7000
_______
Completed:  51 %
Rows completed:  7500
_______
Completed:  54 %
Rows completed:  8000
_______
Completed:  58 %
Rows completed:  8500
_______
Completed:  61 %
Rows completed:  9000
_______
Completed:  65 %
Rows completed:  9500
_______
Completed:  68 %
Rows completed:  10000
_______
Completed:  71 %
...

In [7]:
### displays dataframe
state_daily_med

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5
1,1983-01-02,57.086567,68.802985,62.944776,59.405970,56.9,69.2,63.0,88.2
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8
...,...,...,...,...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791,29.0,52.0,40.5,59.0
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418,34.6,62.4,48.5,58.4
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896,37.6,70.7,54.1,75.3
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164,49.5,76.4,62.9,76.3
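
The loop above fills the new columns one cell per iteration. A loop-free variant builds each column in a single pass, sketched below on a toy dataframe. `toy_rh` and `toy_hi` are placeholder functions with made-up formulas that stand in for `relHumidity` and `heatIndex`, so the numbers carry no physical meaning.

```python
import pandas as pd

def toy_rh(temp, dew):
    # Placeholder formula, NOT a real humidity calculation
    return 100 - 2 * (temp - dew)

def toy_hi(temp, rh):
    # Placeholder formula, NOT a real heat index calculation
    return temp + 0.01 * rh

df = pd.DataFrame({
    "TMIN": [50.0, 60.0],
    "TMAX": [70.0, 80.0],
    "TMEAN": [60.0, 70.0],
    "TDMEAN": [55.0, 60.0],
})

# Build the whole humidity column at once, then reuse it for each index column
df["REL_HUMIDITY"] = [
    round(toy_rh(t, d), 1) for t, d in zip(df["TMEAN"], df["TDMEAN"])
]
for col in ["TMIN", "TMAX", "TMEAN"]:
    df[col + "_INDEX"] = [
        toy_hi(t, rh) for t, rh in zip(df[col], df["REL_HUMIDITY"])
    ]
print(df)
```

Assigning a full list to a column also gives the new columns a numeric dtype from the start, rather than the object dtype that results from initializing them with empty strings.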


### Finalizing dataframe

In [8]:
### converts date to date-time format
state_daily_med["DATE"] = pd.to_datetime(state_daily_med["DATE"])

### stores monthly (YYYY-MM) and yearly (YYYY) periods
state_daily_med["MONTH_YEAR"] = state_daily_med["DATE"].dt.to_period("M")
state_daily_med["YEAR"] = state_daily_med["DATE"].dt.to_period("Y")

### converts and stores the dates as strings
state_daily_med["MONTH_YEAR_STR"] = state_daily_med["MONTH_YEAR"].astype(str)
state_daily_med["YEAR_STR"] = state_daily_med["YEAR"].astype(str)

### displays dataframe
state_daily_med

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5,1983-01,1983,1983-01,1983
1,1983-01-02,57.086567,68.802985,62.944776,59.405970,56.9,69.2,63.0,88.2,1983-01,1983,1983-01,1983
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4,1983-01,1983,1983-01,1983
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8,1983-01,1983,1983-01,1983
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8,1983-01,1983,1983-01,1983
...,...,...,...,...,...,...,...,...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791,29.0,52.0,40.5,59.0,2022-12,2022,2022-12,2022
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418,34.6,62.4,48.5,58.4,2022-12,2022,2022-12,2022
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896,37.6,70.7,54.1,75.3,2022-12,2022,2022-12,2022
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164,49.5,76.4,62.9,76.3,2022-12,2022,2022-12,2022
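
As a small illustration of the period conversion used above, the sketch below applies `dt.to_period` to two dates taken from the table and prints the resulting strings.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["1983-01-05", "2022-12-30"]))

# Monthly and yearly periods, plus their string forms
month_year = dates.dt.to_period("M")
year = dates.dt.to_period("Y")
print(month_year.astype(str).tolist())  # ['1983-01', '2022-12']
print(year.astype(str).tolist())        # ['1983', '2022']
```

The string forms sort chronologically, which is convenient when they are later used as grouping keys.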


## Computing overall state figures for 1983 to 2022

### Median temperature and dew point temperature

In [9]:
### computes median temperatures
med_temp = round(state_daily_med["TMEAN"].median(), 3)
med_dp = round(state_daily_med["TDMEAN"].median(), 3)

print("Median temperature (1983-2022):", med_temp)
print("Median dew point temperature (1983-2022):", med_dp)

Median temperature (1983-2022): 72.429
Median dew point temperature (1983-2022): 63.721


### Median relative humidity

In [10]:
### computes relative humidity 
rel_humidity = round(relHumidity(med_temp, med_dp), 1)

print("Median relative humidity (1983-2022):", rel_humidity)

Median relative humidity (1983-2022): 74.2


### Median heat index

In [11]:
### computes heat index
hi_med = heatIndex(med_temp, rel_humidity)

print("Median heat index (1983-2022):", hi_med)

Median heat index (1983-2022): 72.6
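
For readers without access to `modules.heatIndex`, the sketch below follows the heat index algorithm published by the US National Weather Service: a simple formula for mild conditions, and the Rothfusz regression with two adjustments for warmer ones. The function name `heat_index_nws` is hypothetical, and whether `heatIndex` implements exactly this algorithm is an assumption.

```python
import math

def heat_index_nws(temp_f, rh):
    """Heat index (°F) from air temperature (°F) and relative humidity (%).

    Hypothetical stand-in for modules.heatIndex; the module's actual
    implementation is not shown in this notebook.
    """
    # Simple formula, averaged with the air temperature
    simple = 0.5 * (temp_f + 61.0 + (temp_f - 68.0) * 1.2 + rh * 0.094)
    hi = (simple + temp_f) / 2
    if hi < 80:
        return hi

    # Rothfusz regression for warmer conditions
    t, r = temp_f, rh
    hi = (-42.379 + 2.04901523 * t + 10.14333127 * r
          - 0.22475541 * t * r - 0.00683783 * t * t
          - 0.05481717 * r * r + 0.00122874 * t * t * r
          + 0.00085282 * t * r * r - 0.00000199 * t * t * r * r)

    # Adjustments for very dry or very humid conditions
    if r < 13 and 80 <= t <= 112:
        hi -= ((13 - r) / 4) * math.sqrt((17 - abs(t - 95)) / 17)
    elif r > 85 and 80 <= t <= 87:
        hi += ((r - 85) / 10) * ((87 - t) / 5)
    return hi

print(round(heat_index_nws(72.429, 74.2), 1))  # 72.6
```

With the median temperature and relative humidity computed above, the sketch reproduces the printed median heat index of 72.6.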


## Computing median yearly figures

### Creating dataframe

In [12]:
### displays dataframe
state_daily_med.head()

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5,1983-01,1983,1983-01,1983
1,1983-01-02,57.086567,68.802985,62.944776,59.40597,56.9,69.2,63.0,88.2,1983-01,1983,1983-01,1983
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4,1983-01,1983,1983-01,1983
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8,1983-01,1983,1983-01,1983
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8,1983-01,1983,1983-01,1983


In [13]:
### selects columns 
columns = ["TMEAN", "TDMEAN"]

### computes yearly medians
state_yearly_med = state_daily_med.groupby("YEAR_STR", as_index = False)[columns].median()

### displays dataframe
state_yearly_med

Unnamed: 0,YEAR_STR,TMEAN,TDMEAN
0,1983,71.141791,61.568657
1,1984,71.402239,62.556716
2,1985,73.859701,64.768657
3,1986,72.861194,64.076119
4,1987,70.235821,61.68209
5,1988,70.301493,61.035821
6,1989,72.155224,62.595522
7,1990,72.473134,63.99403
8,1991,73.877612,65.377612
9,1992,70.945522,61.955224


### Computing heat indices

In [14]:
### creates column to write into
state_yearly_med["TMEAN_INDEX"] = " "

### loop runs through dataframe
for i in range(0, len(state_yearly_med)):
    
    ### stores median temperature and median dewpoint
    temp_med = state_yearly_med.at[i, "TMEAN"]
    dew = state_yearly_med.at[i, "TDMEAN"]
    
    ### calls predefined function and computes relative humidity
    rh = round(relHumidity(temp_med, dew), 1)
    
    ### calls predefined function and computes the heat index
    ### (.at writes to the dataframe itself and avoids chained assignment)
    state_yearly_med.at[i, "TMEAN_INDEX"] = heatIndex(temp_med, rh)

state_yearly_med.head()


Unnamed: 0,YEAR_STR,TMEAN,TDMEAN,TMEAN_INDEX
0,1983,71.141791,61.568657,71.2
1,1984,71.402239,62.556716,71.6
2,1985,73.859701,64.768657,74.1
3,1986,72.861194,64.076119,73.1
4,1987,70.235821,61.68209,70.3


### Computing deviations

In [15]:
### stores the overall medians for reference
state_yearly_med["OVERALL_MED_TEMP"] = med_temp
state_yearly_med["OVERALL_MED_INDEX"] = hi_med

### computes the difference in temperature between each year and the overall state median
state_yearly_med["DEVIATION_TEMP"] = state_yearly_med["TMEAN"] - med_temp

### computes the difference in heat index between each year and the overall state median
state_yearly_med["DEVIATION_INDEX"] = state_yearly_med["TMEAN_INDEX"].astype(float) - hi_med

### displays dataframe
state_yearly_med


Unnamed: 0,YEAR_STR,TMEAN,TDMEAN,TMEAN_INDEX,OVERALL_MED_TEMP,OVERALL_MED_INDEX,DEVIATION_TEMP,DEVIATION_INDEX
0,1983,71.141791,61.568657,71.2,72.429,72.6,-1.287209,-1.4
1,1984,71.402239,62.556716,71.6,72.429,72.6,-1.026761,-1.0
2,1985,73.859701,64.768657,74.1,72.429,72.6,1.430701,1.5
3,1986,72.861194,64.076119,73.1,72.429,72.6,0.432194,0.5
4,1987,70.235821,61.68209,70.3,72.429,72.6,-2.193179,-2.3
5,1988,70.301493,61.035821,70.4,72.429,72.6,-2.127507,-2.2
6,1989,72.155224,62.595522,72.3,72.429,72.6,-0.273776,-0.3
7,1990,72.473134,63.99403,72.7,72.429,72.6,0.044134,0.1
8,1991,73.877612,65.377612,74.2,72.429,72.6,1.448612,1.6
9,1992,70.945522,61.955224,71.1,72.429,72.6,-1.483478,-1.5


## Computing day counts above certain thresholds

In [16]:
### displays dataframe of daily averages
state_daily_med.head()

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5,1983-01,1983,1983-01,1983
1,1983-01-02,57.086567,68.802985,62.944776,59.40597,56.9,69.2,63.0,88.2,1983-01,1983,1983-01,1983
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4,1983-01,1983,1983-01,1983
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8,1983-01,1983,1983-01,1983
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8,1983-01,1983,1983-01,1983


### Mean heat index >= 90°F

#### Filtering data

In [17]:
### filters dataframe
mean90up = state_daily_med[state_daily_med["TMEAN_INDEX"] >= 90].reset_index(drop = True)

### displays dataframe
mean90up

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-07-17,73.471642,94.268657,83.868657,73.219403,73.7,120.4,90.4,70.5,1983-07,1983,1983-07,1983
1,1983-07-18,73.870149,95.753731,84.805970,72.994030,74.0,123.2,91.5,67.9,1983-07,1983,1983-07,1983
2,1983-07-19,73.020896,96.050746,84.537313,72.188060,73.1,122.9,90.5,66.6,1983-07,1983,1983-07,1983
3,1983-07-23,73.704478,94.937313,84.323881,74.262687,73.9,124.4,91.8,71.9,1983-07,1983,1983-07,1983
4,1983-07-24,73.886567,92.150746,83.019403,75.200000,74.2,119.2,90.6,77.4,1983-07,1983,1983-07,1983
...,...,...,...,...,...,...,...,...,...,...,...,...,...
689,2022-08-23,73.955224,91.825373,82.895522,74.879104,74.3,117.5,90.1,76.9,2022-08,2022,2022-08,2022
690,2022-08-25,74.220896,91.404478,82.808955,75.417910,74.6,117.3,90.4,78.5,2022-08,2022,2022-08,2022
691,2022-08-31,74.694030,91.956716,83.329851,75.883582,75.1,119.4,91.6,78.4,2022-08,2022,2022-08,2022
692,2022-09-06,74.459701,92.546269,83.505970,75.720896,74.9,120.8,91.8,77.5,2022-09,2022,2022-09,2022


#### Creating dataframe

In [18]:
### computes yearly figures
days_90up = mean90up["YEAR_STR"].value_counts().sort_index(ascending = True).to_frame().reset_index()

### renames columns
days_90up.columns = ["YEAR", "MEANINDEX_90UP"]

### displays dataframe
days_90up.head()

Unnamed: 0,YEAR,MEANINDEX_90UP
0,1983,18
1,1985,7
2,1986,6
3,1987,23
4,1988,5
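
`value_counts` only lists years with at least one qualifying day, which is why 1984 is absent from the table above. The notebook fills such gaps later, when the yearly tables are merged. As an alternative sketch on made-up data, the counts can be reindexed against the full list of years so that missing years appear as 0. The column name `DAYS` is a placeholder.

```python
import pandas as pd

# Toy list of the year for each qualifying day (made-up values)
hot_days = pd.Series(["1983", "1983", "1985"], name="YEAR_STR")
all_years = ["1983", "1984", "1985"]

# Count days per year, then add back years with no qualifying days as 0
counts = (
    hot_days.value_counts()
    .reindex(all_years, fill_value=0)
    .rename_axis("YEAR")
    .reset_index(name="DAYS")
)
print(counts)
```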


### Minimum heat index >= 105°F

In [19]:
### filters dataframe
state_daily_med[state_daily_med["TMIN_INDEX"] >= 105].reset_index(drop = True)

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR


No days met this criterion.

### Mean heat index >= 105°F

In [20]:
### filters dataframe
state_daily_med[state_daily_med["TMEAN_INDEX"] >= 105].reset_index(drop = True)

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR


No days met this criterion.

### Maximum heat index >= 105°F

#### Filtering data

In [21]:
### filters dataframe
days105up = state_daily_med[state_daily_med["TMAX_INDEX"] >= 105].reset_index(drop = True)

### displays dataframe
days105up

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-06-04,67.949254,90.682090,79.322388,70.428358,67.9,111.2,79.9,74.3,1983-06,1983,1983-06,1983
1,1983-06-05,68.294030,90.371642,79.328358,69.791045,68.3,109.0,79.9,72.7,1983-06,1983,1983-06,1983
2,1983-06-06,69.350746,88.801493,79.070149,70.674627,69.4,106.0,79.7,75.6,1983-06,1983,1983-06,1983
3,1983-06-07,70.449254,89.114925,79.783582,71.973134,70.6,108.1,83.4,77.2,1983-06,1983,1983-06,1983
4,1983-06-23,71.695522,89.928358,80.811940,73.768657,72.0,112.5,85.9,79.3,1983-06,1983,1983-06,1983
...,...,...,...,...,...,...,...,...,...,...,...,...,...
4240,2022-09-22,70.926866,90.279104,80.605970,72.165672,71.1,110.9,84.8,75.6,2022-09,2022,2022-09,2022
4241,2022-09-23,70.950746,92.165672,81.558209,72.341791,71.1,115.9,86.4,73.7,2022-09,2022,2022-09,2022
4242,2022-09-24,68.874627,88.822388,78.846269,69.895522,68.9,105.1,79.4,74.1,2022-09,2022,2022-09,2022
4243,2022-09-26,69.395522,89.532836,79.464179,70.940299,69.5,108.2,82.6,75.3,2022-09,2022,2022-09,2022


#### Creating dataframe

In [22]:
### computes yearly figures
days_105up = days105up["YEAR_STR"].value_counts().sort_index(ascending = True).to_frame().reset_index()

### renames columns
days_105up.columns = ["YEAR", "MAXINDEX_105UP"]

### displays dataframe
days_105up.head()

Unnamed: 0,YEAR,MAXINDEX_105UP
0,1983,86
1,1984,78
2,1985,111
3,1986,130
4,1987,98


### Maximum heat index >= 130°F

#### Filtering data

In [23]:
### filters dataframe
days130up = state_daily_med[state_daily_med["TMAX_INDEX"] >= 130].reset_index(drop = True)

### displays dataframe
days130up

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1985-06-04,73.143284,97.744776,85.446269,73.543284,73.2,130.6,92.9,67.8,1985-06,1985,1985-06,1985
1,1985-06-05,72.146269,99.643284,85.88806,73.38806,72.2,136.4,93.3,66.5,1985-06,1985,1985-06,1985
2,1995-08-16,73.268657,97.504478,85.389552,74.780597,73.4,133.1,93.9,70.7,1995-08,1995,1995-08,1995
3,1997-07-05,72.910448,96.031343,84.479104,74.950746,73.1,130.1,92.7,73.2,1997-07,1997,1997-07,1997
4,1998-06-18,73.514925,98.192537,85.858209,73.313433,73.6,130.6,93.2,66.4,1998-06,1998,1998-06,1998
5,1998-06-19,72.885075,99.147761,86.014925,72.758209,72.9,132.3,92.9,64.8,1998-06,1998,1998-06,1998
6,1998-06-20,70.7,98.753731,84.722388,72.138806,70.6,132.4,90.7,66.1,1998-06,1998,1998-06,1998
7,1999-08-01,76.41194,97.877612,87.144776,76.51791,76.7,134.9,98.4,70.9,1999-08,1999,1999-08,1999
8,1999-08-02,75.401493,97.708955,86.555224,75.783582,75.7,133.7,96.7,70.5,1999-08,1999,1999-08,1999
9,1999-08-03,73.201493,97.450746,85.328358,74.523881,73.4,132.4,93.6,70.3,1999-08,1999,1999-08,1999


#### Creating dataframe

In [24]:
### computes yearly figures
days_130up = days130up["YEAR_STR"].value_counts().sort_index(ascending = True).to_frame().reset_index()

### renames columns
days_130up.columns = ["YEAR", "MAXINDEX_130UP"]

### displays dataframe
days_130up.head()

Unnamed: 0,YEAR,MAXINDEX_130UP
0,1985,2
1,1995,1
2,1997,1
3,1998,3
4,1999,4


### Merging dataframes

In [25]:
### merges dataframes
yearly_dayscount = pd.merge(days_90up, days_105up, on = "YEAR", how = "outer")
yearly_dayscount = pd.merge(yearly_dayscount, days_130up, on = "YEAR", how = "outer")

### resets index
yearly_dayscount = yearly_dayscount.sort_values("YEAR").reset_index(drop = True)

### adds columns for filters with no results
yearly_dayscount["MININDEX_105UP"] = 0
yearly_dayscount["MEANINDEX_105UP"] = 0

### controls for null values
yearly_dayscount = yearly_dayscount.fillna(0)
yearly_dayscount = yearly_dayscount.round(0).astype(int)
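
The merge logic above can be traced on a cut-down example. The sketch below uses the 1983 to 1985 counts from the tables in this section. The outer merge keeps 1984 even though it has no row in the 90°F table, and `fillna(0)` turns the resulting gap into a zero.

```python
import pandas as pd

a = pd.DataFrame({"YEAR": ["1983", "1985"], "MEANINDEX_90UP": [18, 7]})
b = pd.DataFrame({"YEAR": ["1983", "1984", "1985"],
                  "MAXINDEX_105UP": [86, 78, 111]})

# Outer merge keeps years found in either table; gaps appear as NaN
merged = pd.merge(a, b, on="YEAR", how="outer")
merged = merged.sort_values("YEAR").reset_index(drop=True)

# Treat a missing count as zero days and restore integer types
merged = merged.fillna(0).astype(int)
print(merged)
```

The final `astype(int)` also converts the `YEAR` strings to integers, which is what allows the later comparison against a list of integer leap years.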

In [26]:
### sets the number of days in each year, accounting for leap years
leap = [1984, 1988, 1992, 1996, 2000, 2004, 2008, 2012, 2016, 2020]
yearly_dayscount["TOTAL_DAYS"] = 365
yearly_dayscount.loc[yearly_dayscount["YEAR"].isin(leap), "TOTAL_DAYS"] = 366

### displays dataframe
yearly_dayscount

Unnamed: 0,YEAR,MEANINDEX_90UP,MAXINDEX_105UP,MAXINDEX_130UP,MININDEX_105UP,MEANINDEX_105UP,TOTAL_DAYS
0,1983,18,86,0,0,0,365
1,1984,0,78,0,0,0,366
2,1985,7,111,2,0,0,365
3,1986,6,130,0,0,0,365
4,1987,23,98,0,0,0,365
5,1988,5,95,0,0,0,366
6,1989,8,114,0,0,0,365
7,1990,14,129,0,0,0,365
8,1991,5,119,0,0,0,365
9,1992,9,97,0,0,0,366
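
The hard-coded leap years can be cross-checked against the standard library. The sketch below derives the leap years between 1983 and 2022 with `calendar.isleap` and compares them with the list used for `TOTAL_DAYS`. The helper `days_in_year` is a hypothetical convenience function.

```python
import calendar

# Leap years hard-coded in the notebook
leap = [1984, 1988, 1992, 1996, 2000, 2004, 2008, 2012, 2016, 2020]

# The same list derived from the calendar rules for 1983-2022
derived = [y for y in range(1983, 2023) if calendar.isleap(y)]
print(derived == leap)  # True

def days_in_year(year):
    # Number of days in a given year
    return 366 if calendar.isleap(year) else 365
```

Deriving the day counts this way would avoid having to extend the list by hand if later years are added.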


## Exporting dataframes

In [27]:
state_daily_med.to_csv("generated_data/state_daily_med.csv", index = False)
state_yearly_med.to_csv("generated_data/state_yearly_med.csv", index = False)
yearly_dayscount.to_csv("generated_data/state_dayscount.csv", index = False)
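
CSV files store dates as plain text, so anyone reading these exports back in has to parse the `DATE` column again. The sketch below shows a round trip through an in-memory buffer with made-up values. Using `io.StringIO` in place of a file on disk is an illustrative choice.

```python
import io
import pandas as pd

df = pd.DataFrame({
    "DATE": pd.to_datetime(["1983-01-01", "1983-01-02"]),
    "TMEAN": [61.3, 62.9],
})

# Write without the index, as in the export cell, then read it back
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer, parse_dates=["DATE"])

print(restored.columns.tolist())
print(restored.dtypes)
```

Because the files are written with `index = False`, no extra index column appears when they are read back.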