# start_pakistan_correlations
## HW_calcTriggers.ipynb
This notebook calculates the triggers for the Pakistan heatwave locations using the new GFS 2-meter temperature (2mT) thresholds, which correspond to a 1-in-2-year return period. The data cover 2004-2022 and were provided by Ross Maidment (University of Reading); they were aggregated from 3-hourly to daily values at each site. Here, we create a record of triggers (activations) based on exceedance of the 2mT thresholds: a trigger is registered when the threshold is met or exceeded on two consecutive days.

In [3]:
from pathlib import Path
import os
import sys
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
from scipy import stats

In [None]:
# Set the root path
rootPath = Path('C:/Users/alexa/Documents/02_work/02_start/02_deliv/05_pk_correlation/hw/data')

## Load the thresholds

In [15]:
thresholds = pd.read_csv(rootPath/'city_thresholds.csv')

In [16]:
thresholds

Unnamed: 0,City,Threshold_2mT
0,Lahore,40.5
1,Multan,42.0
2,Sibi,35.75
3,Jacobabad,42.75
4,Nawabshah,39.75
5,Karachi_Jinnah_Airport,34.0
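
The per-city threshold is looked up by name later in the notebook. One way to make that lookup explicit is to index the table by `City`. The sketch below rebuilds the table inline from the values shown above (instead of reading `city_thresholds.csv`) so that it runs on its own; the name `thresholdByCity` is an illustrative assumption and not part of the original notebook.

```python
import pandas as pd

# Threshold table rebuilt from the output shown above
# (units presumably degrees Celsius, going by the t2m_cel column name)
thresholds = pd.DataFrame({
    'City': ['Lahore', 'Multan', 'Sibi', 'Jacobabad', 'Nawabshah', 'Karachi_Jinnah_Airport'],
    'Threshold_2mT': [40.5, 42.0, 35.75, 42.75, 39.75, 34.0],
})

# Hypothetical helper: a Series keyed by city name for direct lookups
thresholdByCity = thresholds.set_index('City')['Threshold_2mT']

print(thresholdByCity['Lahore'])
```

A lookup on a missing city then raises a `KeyError` naming the city, which is usually easier to diagnose than an `IndexError` from an empty list.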


## Define the monitoring window
The monitoring window runs from April to July inclusive.

In [25]:
months = list(range(4,8))
months

[4, 5, 6, 7]
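
To illustrate how this window is applied, the sketch below filters a daily, datetime-indexed frame to the monitoring months. The data are synthetic placeholders, and the names `demo` and `demoSub` are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

months = list(range(4, 8))  # April to July inclusive

# Synthetic daily series for one year (placeholder values, not GFS data)
idx = pd.date_range('2004-01-01', '2004-12-31', freq='D')
demo = pd.DataFrame({'t2m_cel': np.linspace(20.0, 30.0, len(idx))}, index=idx)

# Keep only the days that fall inside the monitoring window
demoSub = demo[demo.index.month.isin(months)]

print(demoSub.index.min().date(), demoSub.index.max().date(), len(demoSub))
```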

## Load the daily GFS data

In [8]:
dataPath = rootPath/'city_extracts_2023'
sites = [item.stem for item in list(dataPath.iterdir())]
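
Because the site names come from folder names while the thresholds are keyed by `City`, a quick consistency check can catch mismatches before the main loop runs. The sketch below uses hard-coded lists in place of the directory listing and the CSV column; `missingThresholds` is a hypothetical helper name, and exact string matching between the two sources is an assumption.

```python
def missingThresholds(sites, cities):
    """Return the site names that have no matching entry in the threshold table."""
    return sorted(set(sites) - set(cities))

# Stand-ins for the folder names and the City column
sites = ['Jacobabad', 'Karachi_Jinnah_Airport', 'Lahore', 'Multan', 'Nawabshah', 'Sibi']
cities = ['Lahore', 'Multan', 'Sibi', 'Jacobabad', 'Nawabshah', 'Karachi_Jinnah_Airport']

print(missingThresholds(sites, cities))
```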

In [130]:
# Function to identify triggers
def getTriggers(siteDataSub, siteThresh):
    excs=[]
    triggers=[]
    excBefore=0
    switch=1
    for j, dayTemp in enumerate(siteDataSub.t2m_cel):

        # First check if threshold is exceeded on the day
        if dayTemp>=siteThresh:
            exc=1
        else:
            exc=0

        # If threshold exceeded on two consecutive days, mark a trigger
        if exc==1 and excBefore==1 and switch==1:
            trig=1 
        else:
            trig=0

        # Once a trigger has been registered, suppress further triggers until the temperature drops below the threshold again
        if exc==1 and excBefore==1:
            switch=0
        else:
            switch=1

        # Record the result
        excs.append(exc)
        triggers.append(trig)

        # Remember the day before
        excBefore = exc
        
    return triggers
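
The loop above registers a trigger on the second day of each run of consecutive exceedances and then stays silent until the temperature drops below the threshold. Under that reading, the same flags can be computed without an explicit loop: a day is a trigger when it and the previous day exceed the threshold but the day before that did not. The sketch below is an alternative formulation, not the notebook's own code, and `getTriggersVectorized` is a hypothetical name.

```python
import pandas as pd

def getTriggersVectorized(temps, siteThresh):
    """Flag the second day of each run of consecutive threshold exceedances."""
    # True where the daily temperature meets or exceeds the threshold
    exc = pd.Series(list(temps)) >= siteThresh
    # Exceedance state one and two rows earlier (False before the series starts)
    prev1 = exc.shift(1, fill_value=False)
    prev2 = exc.shift(2, fill_value=False)
    # Trigger: today and yesterday exceed, the day before yesterday did not
    return (exc & prev1 & ~prev2).astype(int).tolist()

demoTemps = [39.0, 41.0, 42.0, 43.0, 38.0, 41.0, 41.5, 40.0]
print(getTriggersVectorized(demoTemps, 40.5))
```

Both versions treat consecutive rows as consecutive days. When several April-July seasons are stacked in one frame, 31 July of one year is followed directly by 1 April of the next, so applying the function per year (for example via `groupby`) would be one way to keep seasons separate.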

In [131]:
# Create folder for saving outputs
keepCols=['t2m_cel','trigger']
outPath=rootPath/'city_triggers'
outPath.mkdir(exist_ok=True)

In [136]:
# Loop through sites and compute triggers

# Record triggers for all sites in df
siteTriggers = pd.DataFrame(columns=['site','threshold','time','t2m_cel','trigger'])

for i, site in enumerate(sites):
    
    # Read the daily data
    siteFile = dataPath/'{0}/gfsanl_daily_{1}_allyears.csv'.format(site, site)
    siteData = pd.read_csv(siteFile)
    
    # Filter for monitoring months
    siteData.index = pd.to_datetime(siteData.time)
    siteData['month'] = siteData.index.month
    siteDataSub = siteData[siteData.month.isin(months)]
    
    # Get the threshold
    siteThresh = thresholds.Threshold_2mT[thresholds.City==site].to_list()[0]
    
    # Flag trigger days - the threshold must be exceeded on two consecutive days
    triggers = getTriggers(siteDataSub, siteThresh)
    # assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
    siteDataSub = siteDataSub.assign(trigger=triggers)
    siteDataSub=siteDataSub[keepCols]
    
    # Write out as csv for each site
    siteDataSub.to_csv(outPath/(site+'.csv'), index=True)
    
    # Record the trigger days
    triggerData=siteDataSub[siteDataSub.trigger==1].reset_index()
    triggerData['site']=site
    triggerData['threshold']=siteThresh
    siteTriggers = pd.concat(objs=[siteTriggers, triggerData])

In [139]:
siteTriggers['month'] = [item.month for item in siteTriggers.time]
siteTriggers['year'] = [item.year for item in siteTriggers.time]

In [141]:
# Save summary as csv file
siteTriggers.to_csv(outPath/('triggersSummary.csv'), index=False)

### Reshape into a year-by-site table covering all years
Count the triggers per site and year, and fill years without a trigger with 0.

In [179]:
# Create df of all years
allYears = pd.DataFrame(index=pd.Index(range(2004, 2023), name='year'))

# Loop through sites and extract trigger years and add to summary dataframe
for i, site in enumerate(sites):
    siteYears = pd.DataFrame(data=[siteTriggers[siteTriggers.site==site].groupby(by=['year']).count().trigger], index=[site]).T
    
    if i==0:
        triggerYears = allYears.merge(siteYears, on='year', how='outer')
    else:
        triggerYears = triggerYears.merge(siteYears, on='year', how='outer')
    
triggerYears = triggerYears.fillna(0)
triggerYears.to_csv(outPath/('triggerYears.csv'), index=True)
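
The merge loop above produces one row per year and one column per site. As a cross-check, roughly the same table can be built in one step by counting rows per year and site and reindexing onto the full year range. The sketch below uses a small made-up trigger record, so the counts are illustrative only; it assumes a frame with `site` and `year` columns like `siteTriggers`, and the names `demoTriggers` and `demoYears` are hypothetical.

```python
import pandas as pd

# Made-up trigger record standing in for siteTriggers (illustrative values only)
demoTriggers = pd.DataFrame({
    'site': ['Lahore', 'Lahore', 'Multan', 'Multan', 'Multan'],
    'year': [2005, 2013, 2005, 2010, 2010],
})
sites = ['Lahore', 'Multan', 'Sibi']

# Count triggers per year and site, then spread the sites into columns
counts = demoTriggers.groupby(['year', 'site']).size().unstack(fill_value=0)

# Add years and sites without any trigger as zero-filled rows and columns
years = pd.Index(range(2004, 2023), name='year')
demoYears = counts.reindex(index=years, columns=sites, fill_value=0)

print(demoYears.loc[[2005, 2010, 2013]])
```

Unlike the merge-and-`fillna` approach, this keeps the counts as integers, so the two tables can differ in dtype while agreeing in value.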

In [180]:
triggerYears

year,Jacobabad,Karachi_Jinnah_Airport,Lahore,Multan,Nawabshah,Sibi
2004,0.0,0.0,0.0,0.0,0.0,0.0
2005,0.0,1.0,1.0,1.0,0.0,0.0
2006,1.0,0.0,0.0,0.0,0.0,0.0
2007,0.0,1.0,1.0,1.0,0.0,0.0
2008,0.0,0.0,0.0,0.0,0.0,0.0
2009,1.0,0.0,1.0,0.0,0.0,0.0
2010,0.0,0.0,0.0,2.0,0.0,0.0
2011,0.0,0.0,0.0,1.0,0.0,0.0
2012,0.0,0.0,1.0,1.0,0.0,0.0
2013,0.0,1.0,2.0,0.0,0.0,0.0
