# State Average Temperatures — Analysis

This notebook averages county-level daily temperature data to compute Florida's statewide daily average temperatures and heat indices.

Read more on the heat index thresholds [here](https://web.archive.org/web/20230619070053/https://www.noaa.gov/jetstream/global/heat-index).

## Importing libraries

In [1]:
import pandas as pd
import time
from modules import relHumidity, heatIndex
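
The `modules` file that provides `relHumidity` and `heatIndex` is not shown in this notebook. As a rough sketch of what such helpers could compute, the block below uses the Magnus approximation for relative humidity and the National Weather Service heat index procedure (a simple formula, replaced by the Rothfusz regression and its adjustments in hotter conditions). The function names `rel_humidity_sketch` and `heat_index_sketch`, the Magnus constants, the rounding to one decimal, and the Fahrenheit inputs are assumptions made for illustration, not the project's actual code.

```python
import math


def rel_humidity_sketch(temp_f, dew_f):
    """Relative humidity (%) from air temperature and dew point in deg F.

    Hypothetical stand-in for relHumidity, using the Magnus approximation.
    """
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    dew_c = (dew_f - 32.0) * 5.0 / 9.0
    a, b = 17.625, 243.04  # assumed Magnus constants
    return 100.0 * math.exp(a * dew_c / (b + dew_c) - a * temp_c / (b + temp_c))


def heat_index_sketch(temp_f, rh):
    """Heat index (deg F) following the NWS procedure.

    Hypothetical stand-in for heatIndex.
    """
    # Simple formula, averaged with the air temperature
    simple = 0.5 * (temp_f + 61.0 + (temp_f - 68.0) * 1.2 + rh * 0.094)
    hi = (simple + temp_f) / 2.0
    if hi < 80.0:
        return round(hi, 1)

    # Rothfusz regression for hotter conditions
    hi = (-42.379 + 2.04901523 * temp_f + 10.14333127 * rh
          - 0.22475541 * temp_f * rh - 0.00683783 * temp_f ** 2
          - 0.05481717 * rh ** 2 + 0.00122874 * temp_f ** 2 * rh
          + 0.00085282 * temp_f * rh ** 2 - 0.00000199 * temp_f ** 2 * rh ** 2)

    # Adjustments for very dry or very humid air
    if rh < 13.0 and 80.0 <= temp_f <= 112.0:
        hi -= ((13.0 - rh) / 4.0) * math.sqrt((17.0 - abs(temp_f - 95.0)) / 17.0)
    elif rh > 85.0 and 80.0 <= temp_f <= 87.0:
        hi += ((rh - 85.0) / 10.0) * ((87.0 - temp_f) / 5.0)
    return round(hi, 1)


# First row of the master dataframe: TMEAN 65.8, TDMEAN 60.9
rh = rel_humidity_sketch(65.8, 60.9)
print(round(rh, 1), heat_index_sketch(65.8, rh))
```

For that row, the sketch should land close to the `REL_HUMIDITY` and `TMEAN_INDEX` values listed in the master dataframe, which suggests (but does not confirm) that the imported helpers follow a similar approach.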

## Importing datasets and creating master dataframe

In [2]:
data_hist = pd.read_csv("generated_data/historic_hi.csv")
data_rec = pd.read_csv("generated_data/recent_hi.csv")

Note: Data from 2023 is excluded because it is incomplete.

In [3]:
### creating master dataframe
master = pd.concat([data_hist, data_rec], ignore_index = True)

### displays dataframe
master

Unnamed: 0,COUNTY,LONG,LAT,ELEV,DATE,RAINFALL,TMIN,TMEAN,TMAX,TDMEAN,VPDMIN,VPDMAX,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY
0,Alachua,-82.3576,29.6748,147.0,1990-01-01,0.24,51.7,65.8,79.8,60.9,0.16,18.00,1990-01,1990,1990-01,1990,51.1,84.3,65.9,84.2
1,Alachua,-82.3576,29.6748,147.0,1990-01-02,0.00,35.3,47.1,58.9,34.9,0.93,10.33,1990-01,1990,1990-01,1990,33.4,58.2,45.8,62.3
2,Alachua,-82.3576,29.6748,147.0,1990-01-03,0.00,40.0,52.6,65.3,44.0,1.15,11.13,1990-01,1990,1990-01,1990,38.6,65.1,51.8,72.5
3,Alachua,-82.3576,29.6748,147.0,1990-01-04,0.00,47.6,60.4,73.1,51.4,0.57,14.42,1990-01,1990,1990-01,1990,46.5,73.3,60.0,72.2
4,Alachua,-82.3576,29.6748,147.0,1990-01-05,0.00,52.0,64.4,76.8,60.3,0.47,13.53,1990-01,1990,1990-01,1990,51.5,77.5,64.5,86.6
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
978865,Washington,-85.6654,30.6106,121.0,2013-12-27,0.00,37.1,47.9,58.7,39.1,0.55,7.94,2013-12,2013,2013-12,2013,35.5,58.2,46.8,71.4
978866,Washington,-85.6654,30.6106,121.0,2013-12-28,0.00,44.1,51.7,59.2,38.4,1.55,9.30,2013-12,2013,2013-12,2013,42.6,58.4,50.6,60.3
978867,Washington,-85.6654,30.6106,121.0,2013-12-29,2.09,47.3,54.1,61.0,49.4,0.29,4.85,2013-12,2013,2013-12,2013,46.5,60.9,53.6,84.1
978868,Washington,-85.6654,30.6106,121.0,2013-12-30,0.02,37.0,46.8,56.6,43.7,0.21,4.11,2013-12,2013,2013-12,2013,35.8,56.4,46.1,88.9
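
The effect of `ignore_index = True` in the concatenation above can be seen on a toy example. The two small frames below are illustrative stand-ins for the historic and recent files, not real data from them.

```python
import pandas as pd

# Toy stand-ins for the historic and recent files (illustrative values)
hist = pd.DataFrame({"COUNTY": ["Alachua", "Alachua"], "TMEAN": [65.8, 47.1]})
rec = pd.DataFrame({"COUNTY": ["Washington", "Washington"], "TMEAN": [54.1, 46.8]})

# With ignore_index=True the result gets a fresh 0..n-1 index
stacked = pd.concat([hist, rec], ignore_index=True)
print(stacked.index.tolist())  # [0, 1, 2, 3]

# Without it, each frame keeps its own labels, so labels repeat
kept = pd.concat([hist, rec])
print(kept.index.tolist())  # [0, 1, 0, 1]
```

A unique index matters later on, because row lookups by label would otherwise match more than one row.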


## Computing average daily state figures

### Creating dataframe

In [4]:
### selects columns
columns = ["TMIN", "TMAX", "TMEAN", "TDMEAN"]

### computes mean values for that date
state_daily_avg = master.groupby("DATE", as_index = False)[columns].mean()

### displays dataframe
state_daily_avg

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN
0,1983-01-01,55.591045,67.102985,61.346269,57.285075
1,1983-01-02,57.086567,68.802985,62.944776,59.405970
2,1983-01-03,56.510448,73.928358,65.225373,60.722388
3,1983-01-04,46.028358,63.697015,54.868657,49.065672
4,1983-01-05,45.962687,63.850746,54.904478,46.338806
...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164
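
To make the grouping step concrete, here is a minimal sketch on made-up data with two counties and two dates. The county names and temperatures are invented for illustration.

```python
import pandas as pd

# Two hypothetical counties observed on two dates
toy = pd.DataFrame({
    "COUNTY": ["A", "B", "A", "B"],
    "DATE": ["1983-01-01", "1983-01-01", "1983-01-02", "1983-01-02"],
    "TMIN": [50.0, 60.0, 40.0, 44.0],
    "TMAX": [70.0, 80.0, 60.0, 64.0],
})

# One row per date; each value is the plain mean across counties
daily = toy.groupby("DATE", as_index=False)[["TMIN", "TMAX"]].mean()
print(daily)
```

Because this is an unweighted mean, every county contributes equally to the state figure regardless of its area or population. With `as_index = False`, `DATE` stays an ordinary column and the result keeps a 0-based index.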


### Computing heat indices

In [5]:
### computes length of dataframe
len_st_daily_avg = len(state_daily_avg)

In [6]:
### creates float columns to write into
for col in ["TMIN_INDEX", "TMAX_INDEX", "TMEAN_INDEX", "REL_HUMIDITY"]:
    state_daily_avg[col] = float("nan")

### loop runs through dataframe
for i in range(len_st_daily_avg):

    ### stores mean temperature and mean dewpoint
    temp_mean = state_daily_avg.at[i, "TMEAN"]
    dew = state_daily_avg.at[i, "TDMEAN"]

    ### calls imported helper and computes relative humidity
    rh = round(relHumidity(temp_mean, dew), 1)
    state_daily_avg.at[i, "REL_HUMIDITY"] = rh

    ### calls imported helper and computes heat indices
    ### .at writes to the dataframe itself, which avoids chained assignment
    state_daily_avg.at[i, "TMIN_INDEX"] = heatIndex(state_daily_avg.at[i, "TMIN"], rh)
    state_daily_avg.at[i, "TMAX_INDEX"] = heatIndex(state_daily_avg.at[i, "TMAX"], rh)
    state_daily_avg.at[i, "TMEAN_INDEX"] = heatIndex(temp_mean, rh)

    ### prints out a progress report showing number of rows processed
    if i % 500 == 0:
        print("Completed: ", str(int(i/len_st_daily_avg * 100)), "%")
        print("Rows completed: ", str(i))
        print("_______")
    elif i == (len_st_daily_avg - 1):
        print("Completed: 100%")
        print("_______")
        print("_______")

Completed:  0 %
Rows completed:  0
_______
Completed:  3 %
Rows completed:  500
_______
Completed:  6 %
Rows completed:  1000
_______
Completed:  10 %
Rows completed:  1500
_______
Completed:  13 %
Rows completed:  2000
_______
Completed:  17 %
Rows completed:  2500
_______
Completed:  20 %
Rows completed:  3000
_______
Completed:  23 %
Rows completed:  3500
_______
Completed:  27 %
Rows completed:  4000
_______
Completed:  30 %
Rows completed:  4500
_______
Completed:  34 %
Rows completed:  5000
_______
Completed:  37 %
Rows completed:  5500
_______
Completed:  41 %
Rows completed:  6000
_______
Completed:  44 %
Rows completed:  6500
_______
Completed:  47 %
Rows completed:  7000
_______
Completed:  51 %
Rows completed:  7500
_______
Completed:  54 %
Rows completed:  8000
_______
Completed:  58 %
Rows completed:  8500
_______
Completed:  61 %
Rows completed:  9000
_______
Completed:  65 %
Rows completed:  9500
_______
Completed:  68 %
Rows completed:  10000
_______
Completed:  71 %
...

In [7]:
### displays dataframe
state_daily_avg

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5
1,1983-01-02,57.086567,68.802985,62.944776,59.405970,56.9,69.2,63.0,88.2
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8
...,...,...,...,...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791,29.0,52.0,40.5,59.0
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418,34.6,62.4,48.5,58.4
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896,37.6,70.7,54.1,75.3
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164,49.5,76.4,62.9,76.3
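
The loop in this section fills the new columns one row at a time. Assigning through two separate indexing steps, as in `df[col][i] = value`, is chained assignment, and pandas cannot guarantee that it writes to the original dataframe rather than a temporary copy. The sketch below, on invented values, shows the single-step `.at` form next to a column-wise equivalent. The `RANGE_LOOP` and `RANGE_VEC` columns are hypothetical names used only here.

```python
import pandas as pd

toy = pd.DataFrame({"TMIN": [55.6, 57.1, 56.5], "TMAX": [67.1, 68.8, 73.9]})

# Row-by-row write: .at takes the row label and column name in one step
toy["RANGE_LOOP"] = float("nan")
for i in range(len(toy)):
    toy.at[i, "RANGE_LOOP"] = toy.at[i, "TMAX"] - toy.at[i, "TMIN"]

# Column-wise equivalent, possible whenever the computation is plain arithmetic
toy["RANGE_VEC"] = toy["TMAX"] - toy["TMIN"]

print(toy)
```

The row-by-row form remains useful when the per-row computation is a scalar function with branches, which may be the case for the heat index helper used here.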


### Finalizing dataframe

In [8]:
### converts date to date-time format
state_daily_avg["DATE"] = pd.to_datetime(state_daily_avg["DATE"])

### stores dates as monthly (YYYY-MM) and yearly (YYYY) periods
state_daily_avg["MONTH_YEAR"] = state_daily_avg["DATE"].dt.to_period("M")
state_daily_avg["YEAR"] = state_daily_avg["DATE"].dt.to_period("Y")

### converts and stores the dates as strings
state_daily_avg["MONTH_YEAR_STR"] = state_daily_avg["MONTH_YEAR"].astype(str)
state_daily_avg["YEAR_STR"] = state_daily_avg["YEAR"].astype(str)

### displays dataframe
state_daily_avg

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5,1983-01,1983,1983-01,1983
1,1983-01-02,57.086567,68.802985,62.944776,59.405970,56.9,69.2,63.0,88.2,1983-01,1983,1983-01,1983
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4,1983-01,1983,1983-01,1983
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8,1983-01,1983,1983-01,1983
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8,1983-01,1983,1983-01,1983
...,...,...,...,...,...,...,...,...,...,...,...,...,...
14605,2022-12-27,31.180597,53.091045,42.131343,28.841791,29.0,52.0,40.5,59.0,2022-12,2022,2022-12,2022
14606,2022-12-28,36.585075,63.014925,49.802985,35.816418,34.6,62.4,48.5,58.4,2022-12,2022,2022-12,2022
14607,2022-12-29,39.023881,70.553731,54.779104,47.120896,37.6,70.7,54.1,75.3,2022-12,2022,2022-12,2022
14608,2022-12-30,50.322388,75.943284,63.134328,55.567164,49.5,76.4,62.9,76.3,2022-12,2022,2022-12,2022
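
As a small illustration of the period conversion used above, the sketch below converts two dates taken from the table and shows the string form that ends up in `MONTH_YEAR_STR` and `YEAR_STR`.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["1983-01-05", "2022-12-30"]))

# Monthly and yearly periods drop the finer-grained part of each date
month_year = dates.dt.to_period("M")
year = dates.dt.to_period("Y")

print(month_year.astype(str).tolist())  # ['1983-01', '2022-12']
print(year.astype(str).tolist())        # ['1983', '2022']
```

The string columns are convenient as grouping keys and survive a round trip through CSV unchanged, whereas period columns would need to be parsed again after loading.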


### Exporting dataframe

In [9]:
state_daily_avg.to_csv("generated_data/state_daily_avg.csv", index = False)

## Computing overall state figures for 1983 to 2022

### Average temperature and dew point temperature

In [10]:
### computes average temperatures
avg_temp = round(state_daily_avg["TMEAN"].mean(), 3)
avg_dp = round(state_daily_avg["TDMEAN"].mean(), 3)

print("Average temperature (1983-2022):", avg_temp)
print("Average dew point temperature (1983-2022):", avg_dp)

Average temperature (1983-2022): 70.536
Average dew point temperature (1983-2022): 61.156


### Average relative humidity

In [11]:
### computes relative humidity from the overall average temperature and dew point
rel_humidity = round(relHumidity(avg_temp, avg_dp), 1)

print("Average relative humidity (1983-2022):", rel_humidity)

Average relative humidity (1983-2022): 72.2
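
The figure above is the relative humidity at the average temperature and average dew point. Because the humidity formula is nonlinear, this can differ slightly from the mean of the daily relative humidity values. The sketch below illustrates the gap for two days taken from the daily table. The helper `rh_magnus` and its Magnus constants are assumptions, since the `relHumidity` implementation is not shown.

```python
import math


def rh_magnus(temp_f, dew_f):
    """Relative humidity (%) via the Magnus approximation (assumed formula)."""
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    dew_c = (dew_f - 32.0) * 5.0 / 9.0
    a, b = 17.625, 243.04  # assumed Magnus constants
    return 100.0 * math.exp(a * dew_c / (b + dew_c) - a * temp_c / (b + temp_c))


# (TMEAN, TDMEAN) for 1983-01-01 and 1983-01-05 from the daily table
days = [(61.346269, 57.285075), (54.904478, 46.338806)]

# Option 1: average the daily relative humidity values
mean_of_rh = sum(rh_magnus(t, d) for t, d in days) / len(days)

# Option 2: relative humidity at the averaged temperature and dew point
mean_t = sum(t for t, _ in days) / len(days)
mean_d = sum(d for _, d in days) / len(days)
rh_of_means = rh_magnus(mean_t, mean_d)

print(round(mean_of_rh, 2), round(rh_of_means, 2))
```

The notebook uses the second option for the overall figure, so the result should be read as the humidity of an average day rather than the average of daily humidities.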


### Average heat index

In [12]:
### computes heat index
hi_avg = heatIndex(avg_temp, rel_humidity)

print("Average heat index (1983-2022):", hi_avg)

Average heat index (1983-2022): 70.6


## Computing average yearly figures

### Creating dataframe

In [13]:
### displays dataframe
state_daily_avg.head()

Unnamed: 0,DATE,TMIN,TMAX,TMEAN,TDMEAN,TMIN_INDEX,TMAX_INDEX,TMEAN_INDEX,REL_HUMIDITY,MONTH_YEAR,YEAR,MONTH_YEAR_STR,YEAR_STR
0,1983-01-01,55.591045,67.102985,61.346269,57.285075,55.3,67.3,61.3,86.5,1983-01,1983,1983-01,1983
1,1983-01-02,57.086567,68.802985,62.944776,59.40597,56.9,69.2,63.0,88.2,1983-01,1983,1983-01,1983
2,1983-01-03,56.510448,73.928358,65.225373,60.722388,56.2,74.5,65.3,85.4,1983-01,1983,1983-01,1983
3,1983-01-04,46.028358,63.697015,54.868657,49.065672,45.1,63.6,54.4,80.8,1983-01,1983,1983-01,1983
4,1983-01-05,45.962687,63.850746,54.904478,46.338806,44.8,63.6,54.2,72.8,1983-01,1983,1983-01,1983


In [14]:
### selects columns 
columns = ["TMEAN", "TDMEAN"]

### computes yearly averages
state_yearly_avg = state_daily_avg.groupby("YEAR_STR", as_index = False)[columns].mean()

### displays dataframe
state_yearly_avg

Unnamed: 0,YEAR_STR,TMEAN,TDMEAN
0,1983,68.879648,59.571675
1,1984,69.561365,59.768799
2,1985,70.474083,60.424277
3,1986,71.124412,61.022838
4,1987,69.624944,59.520609
5,1988,69.116499,59.065292
6,1989,70.345982,60.824314
7,1990,72.111135,62.093838
8,1991,71.305124,62.232349
9,1992,69.669998,60.23472


### Computing heat indices

In [15]:
### creates float column to write into
state_yearly_avg["TMEAN_INDEX"] = float("nan")

### loop runs through dataframe
for i in range(len(state_yearly_avg)):

    ### stores mean temperature and mean dewpoint
    temp_mean = state_yearly_avg.at[i, "TMEAN"]
    dew = state_yearly_avg.at[i, "TDMEAN"]

    ### calls imported helper and computes relative humidity
    rh = round(relHumidity(temp_mean, dew), 1)

    ### calls imported helper and computes heat index
    ### .at writes to the dataframe itself, which avoids chained assignment
    state_yearly_avg.at[i, "TMEAN_INDEX"] = heatIndex(temp_mean, rh)

state_yearly_avg.head()


Unnamed: 0,YEAR_STR,TMEAN,TDMEAN,TMEAN_INDEX
0,1983,68.879648,59.571675,68.9
1,1984,69.561365,59.768799,69.6
2,1985,70.474083,60.424277,70.5
3,1986,71.124412,61.022838,71.2
4,1987,69.624944,59.520609,69.6


### Computing deviations

In [16]:
### stores the overall averages alongside each year
state_yearly_avg["OVERALL_AVG_TEMP"] = avg_temp
state_yearly_avg["OVERALL_AVG_INDEX"] = hi_avg

### computes difference in temperature between each year and the overall state average
state_yearly_avg["DEVIATION_TEMP"] = state_yearly_avg["TMEAN"] - avg_temp

### computes difference in heat index between each year and the overall state average
state_yearly_avg["DEVIATION_INDEX"] = state_yearly_avg["TMEAN_INDEX"].astype(float) - hi_avg

### displays dataframe
state_yearly_avg


Unnamed: 0,YEAR_STR,TMEAN,TDMEAN,TMEAN_INDEX,OVERALL_AVG_TEMP,OVERALL_AVG_INDEX,DEVIATION_TEMP,DEVIATION_INDEX
0,1983,68.879648,59.571675,68.9,70.536,70.6,-1.656352,-1.7
1,1984,69.561365,59.768799,69.6,70.536,70.6,-0.974635,-1.0
2,1985,70.474083,60.424277,70.5,70.536,70.6,-0.061917,-0.1
3,1986,71.124412,61.022838,71.2,70.536,70.6,0.588412,0.6
4,1987,69.624944,59.520609,69.6,70.536,70.6,-0.911056,-1.0
5,1988,69.116499,59.065292,69.1,70.536,70.6,-1.419501,-1.5
6,1989,70.345982,60.824314,70.4,70.536,70.6,-0.190018,-0.2
7,1990,72.111135,62.093838,72.2,70.536,70.6,1.575135,1.6
8,1991,71.305124,62.232349,71.4,70.536,70.6,0.769124,0.8
9,1992,69.669998,60.23472,69.7,70.536,70.6,-0.866002,-0.9
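
A deviation is the yearly mean minus the overall 1983-2022 mean, so a negative value marks a cooler-than-average year. As a quick sketch of how the column might be read, the block below takes four yearly means from the table and picks out the warmest and coolest of them. The variable names are illustrative.

```python
import pandas as pd

# Four yearly means copied from the table above
yearly = pd.DataFrame({
    "YEAR_STR": ["1983", "1986", "1988", "1990"],
    "TMEAN": [68.879648, 71.124412, 69.116499, 72.111135],
})
overall_avg_temp = 70.536  # overall average printed earlier in the notebook

yearly["DEVIATION_TEMP"] = yearly["TMEAN"] - overall_avg_temp

# idxmax / idxmin return the row label of the largest / smallest deviation
warmest = yearly.loc[yearly["DEVIATION_TEMP"].idxmax(), "YEAR_STR"]
coolest = yearly.loc[yearly["DEVIATION_TEMP"].idxmin(), "YEAR_STR"]
print(warmest, coolest)  # 1990 1983
```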


## Exporting dataframes

In [17]:
state_daily_avg.to_csv("generated_data/state_daily_avg.csv", index = False)
state_yearly_avg.to_csv("generated_data/state_yearly_avg.csv", index = False)
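
The `index = False` argument in the exports keeps the row index out of the file. A minimal sketch of the difference, writing to an in-memory buffer instead of `generated_data/` so that it runs anywhere:

```python
import io

import pandas as pd

toy = pd.DataFrame({"YEAR_STR": ["1983", "1984"], "TMEAN": [68.879648, 69.561365]})

# Written with the index: reading back yields an extra unnamed column
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
print(list(pd.read_csv(with_index).columns))  # ['Unnamed: 0', 'YEAR_STR', 'TMEAN']

# Written without the index: only the data columns come back
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
print(list(pd.read_csv(without_index).columns))  # ['YEAR_STR', 'TMEAN']
```

Note that `read_csv` infers types on load, so string columns such as `YEAR_STR` come back as integers and `DATE` comes back as text unless it is parsed again.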