# Seasonal Flow Forecasts (SFFs) simulation



Now we run the hydrological forecasts using the Tank model and the Seasonal Meteorological Forecasts (SMFs) datasets. This code incorporates the following functions:


1. <b> [SMFs data generation] </b> Create csv files that combine the data of each ensemble member (precipitation, temperature and evapotranspiration), then convert each csv file to a text format that the Tank model can read.


2. <b> [Tank input data generation] </b> Generate the Tank model input file from the already estimated parameters and other details, then combine it with the SMFs (.txt) data created in process #1.


3. <b> [Run Tank model] </b> Create a batch file for the multiple simulations and run it.


4. <b> [Output data management] </b> Collect the simulated flow data and save it in csv format, then integrate the ensemble data for each month.

### 1. SMFs data preparation via SEAFORM

Before starting the SFFs (Seasonal Flow Forecasts) simulation, note that seasonal forecasts of precipitation (tp; total precipitation), temperature (t2m; 2 m temperature) and evapotranspiration (ET; evaporation) are required as input to the hydrological model (Tank). SMFs data management, including data download, time series generation and bias correction, is readily available via our SEAFORM package (SEAsonal FORcast Management tool, https://github.com/uobwatergroup/seaform.git).

1. Download seasonal forecast data (p, t, ET) from the Copernicus CDS for the chosen forecasting centre. (SEAFORM module A)
2. Generate daily time series data for each month. (SEAFORM module B)
3. Apply linear scaling to generate bias-corrected time series. (SEAFORM module C)
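
As a rough illustration of step 3, linear scaling in its common form multiplies precipitation by the ratio of observed to forecast climatological means and shifts temperature by their difference. The sketch below is not SEAFORM's implementation: the function names, the choice of which variable is scaled and which is shifted, and the example means are all assumptions made for illustration.

```python
def scale_precipitation(values, obs_mean, fcst_mean):
    """Multiplicative linear scaling (hypothetical helper, not the SEAFORM API)."""
    if fcst_mean == 0:
        # no scaling factor can be formed, so return the values unchanged
        return list(values)
    factor = obs_mean / fcst_mean
    return [round(v * factor, 2) for v in values]


def shift_temperature(values, obs_mean, fcst_mean):
    """Additive linear scaling (hypothetical helper, not the SEAFORM API)."""
    shift = obs_mean - fcst_mean
    return [round(v + shift, 2) for v in values]


# made-up numbers: the forecast is too dry (mean 2.0 vs 3.0) and too warm (11.0 vs 9.0)
tp_bc = scale_precipitation([0.0, 2.0, 4.0], obs_mean=3.0, fcst_mean=2.0)
t2m_bc = shift_temperature([10.0, 12.0], obs_mean=9.0, fcst_mean=11.0)
print(tp_bc, t2m_bc)
```

In practice the means would typically be computed per calendar month from the hindcast period, which is why the bias-corrected files are stored separately from the original ones.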

Place the processed SMFs datasets (original and bias-corrected) in the relevant folders; refer to the folder paths used in the code from '3. Simulation settings' onwards.

### 2. Import libraries
Now, we need to import the necessary libraries and tools (🚨 in order to run the code like in the box below, place the mouse pointer in the cell, then click on “run cell” button above or press shift + enter).

In [1]:
import os, re
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import datetime
from datetime import date

### 3. Simulation settings

In [2]:
forecast_center = 'ECMWF'

# Assign working directory and time series data
path = os.getcwd()

# Input simulation information
catchment_name = 'A'
ratio = {'A':0.23, 'B':0.605}         # Loss/PET ratio of each catchment, set manually (see hydrologic_data.xlsx)
area = {'A':2073.0, 'B':1584.0}       # Catchment area (square kilometers)
sim_title = catchment_name + '_SFFs'
loss_et_ratio = ratio[catchment_name]
catch_area =  area[catchment_name]

warmup_year = 2009  # model warm-up start year (the model starts to run from 1 January of this year)

start_year = 2011
start_month = 1
start_day = 1
start_date = str(start_month).zfill(2) + '/' + str(start_day).zfill(2) + '/' + str(start_year)
end_year = 2020
end_month = 12
end_day = 31
end_date = str(end_month).zfill(2) + '/' + str(end_day).zfill(2) + '/' + str(end_year)
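
The cells in this notebook write into several folders under `analysis/4.SFFs/` without creating them, and `open(..., 'w')` fails if the target folder is missing. The following is a minimal sketch, assuming the folder layout used by the code in this notebook, that creates the folders up front. The helper names `sffs_folders` and `ensure_folders` are hypothetical, and the demonstration runs in a temporary directory rather than in the real project folder.

```python
import os
import tempfile


def sffs_folders(base_path):
    """Folders read from or written to by the SFFs cells (layout taken from this notebook)."""
    folders = [os.path.join(base_path, 'analysis', '4.SFFs', '1_input')]
    for step in ['2_ensemble', '3_run']:
        for run_type in ['original', 'biascorrected']:
            folders.append(os.path.join(base_path, 'analysis', '4.SFFs', step, run_type))
    return folders


def ensure_folders(base_path):
    """Create every SFFs folder that does not exist yet."""
    for folder_path in sffs_folders(base_path):
        os.makedirs(folder_path, exist_ok=True)


# demonstration in a throw-away directory; pass os.getcwd() to use the project folder
demo_root = tempfile.mkdtemp()
ensure_folders(demo_root)
created = [os.path.isdir(p) for p in sffs_folders(demo_root)]
print(created)
```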

### 4. Tank model input file generation

#### 4.1. Read basic input file from ESP

The basic input files (without hydrological data) are the same as those used for ESP, so we only need to copy them into the SFFs folder.

In [6]:
import shutil

for year in range(start_year, end_year+1):
    for month in range(start_month, end_month+1):
        # copy the ESP basic input file of each forecast month into the SFFs input folder
        esp = path + '/analysis/3.ESP/1_input/' + catchment_name + '_' + str(year) + '_' + str(month).zfill(2) + '.txt'
        sffs = path + '/analysis/4.SFFs/1_input/'
        shutil.copy2(esp, sffs)
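
A missing ESP file makes `shutil.copy2` raise `FileNotFoundError` partway through the loop, so it can help to check the expected file names first. The sketch below follows the `<catchment>_<year>_<month>.txt` naming pattern used in the copy loop; the helper names are hypothetical and the demonstration uses a temporary folder with a single dummy file.

```python
import os
import tempfile


def expected_input_names(catchment, first_year, last_year):
    """File names of the basic input files, one per forecast year and month."""
    return [f"{catchment}_{year}_{str(month).zfill(2)}.txt"
            for year in range(first_year, last_year + 1)
            for month in range(1, 13)]


def missing_inputs(folder_path, names):
    """Names from the list that are not present in folder_path."""
    return [n for n in names if not os.path.isfile(os.path.join(folder_path, n))]


names = expected_input_names('A', 2011, 2020)

# demonstration: a temporary folder that holds only the first expected file
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, names[0]), 'w').close()
missing = missing_inputs(demo_dir, names)
print(len(names), len(missing))
```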

#### 4.2. Generation of Seasonal Meteorological Forecasts ensemble

Create the ensemble of precipitation, temperature and ET forcing data from the SMFs files stored under the /data/SMFs/ folder. An ensemble scenario is generated for every forecast month, with one time series file per ensemble member.

In [10]:
pd.set_option('mode.chained_assignment', None)
folder = {1:'original',2:'biascorrected'}

for bc_type in range(1,3):                        # bias correction type {1:'before_bc',2:'after_bc'}
    for year in range(start_year, end_year+1):
        for month in range(start_month, end_month+1):
            # specify the number of ensemble of each forecasting centre
            # refer to the ensemble size and data availability
            if year > 2016 :
                ens_num = 51
            else :
                ens_num = 25
            # add '_bc' if the data is bias corrected
            if bc_type == 2:
                tail = '_bc'
            else :
                tail = ''
            # read seasonal forecasts data
            tp = pd.read_csv(path + '/data/SMFs/' + forecast_center + '/' + folder[bc_type] + '/tp/' + catchment_name + '_' 
                             + str(year) + '_' + str(month).zfill(2) + '_' + forecast_center.lower() + '_tp' + tail + '.csv')
            t2m = pd.read_csv(path + '/data/SMFs/' + forecast_center + '/' + folder[bc_type] + '/t2m/' + catchment_name + '_' 
                              + str(year) + '_' + str(month).zfill(2) + '_' + forecast_center.lower() + '_t2m' + tail + '.csv')
            et = pd.read_csv(path + '/data/SMFs/' + forecast_center + '/' + folder[bc_type] + '/et/' + catchment_name + '_' 
                             + str(year) + '_' + str(month).zfill(2) + '_' + forecast_center.lower() + '_et' + tail + '.csv')
            # Date format change
            tp['time2'] = pd.to_datetime(tp['time']).dt.strftime("%Y-%m-%d")
            # Save the date data as merged
            merged = pd.DataFrame(data=tp['time2']).rename(columns = {'time2':'time'})

            for i in range(0,ens_num):         # iteration for all ensemble members
            
                name = 'sc_' + str(i)
                merged['prec'] = tp[name].round(2)                # allocate precipitation column
                merged['prec'] = merged['prec'].clip(lower=0)     # correct negative precipitation data
                merged['ET'] = round(et[name],2)                  # allocate ET column
                merged['temp'] = round(t2m[name],2)               # allocate temperature column
                
                date = list(merged['time'])
                rain = list(merged['prec'])
                ET = list(merged['ET'])
                temp = list(merged['temp'])

                out_name = (path + '/analysis/4.SFFs/2_ensemble/' + folder[bc_type] + '/' + catchment_name + '_'
                            + str(year) + '_' + str(month).zfill(2) + '_' + str(i) + '.txt')
                # write one fixed-width row per day: date, precipitation, ET, temperature
                with open(out_name, "w") as file:
                    for index in range(len(merged)):
                        file.write(str(date[index]).rjust(10) + str(rain[index]).rjust(10) + str(ET[index]).rjust(20)
                                   + str(temp[index]).rjust(10) + "\n")
print('Ensemble datasets for SFFs Monthly ESP are created!')

Ensemble datasets for SFFs Monthly ESP are created!
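
To make the text layout of the ensemble files easier to see, here is a small sketch that formats one daily row with the same column widths as the `rjust` calls above (10, 10, 20 and 10 characters) and wraps the ensemble-size rule in a function. The helper names `ensemble_size` and `format_row` are hypothetical and the input values are made up.

```python
def ensemble_size(year):
    """Ensemble members per forecast year, using the same rule as the loop above."""
    return 51 if year > 2016 else 25


def format_row(date_str, prec, et, temp):
    """One fixed-width line: date, precipitation, ET and temperature."""
    prec = round(max(prec, 0.0), 2)          # negative precipitation is set to zero
    return (str(date_str).rjust(10) + str(prec).rjust(10)
            + str(round(et, 2)).rjust(20) + str(round(temp, 2)).rjust(10))


row = format_row('2011-01-01', -0.3, 1.234, 5.5)
print(repr(row))
print(ensemble_size(2016), ensemble_size(2017))
```

Each row is 50 characters wide before the newline, so any change to the widths has to agree with what the Tank model expects to read.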


#### 4.3. Combine the basic input (from 4.1) and ensemble data (from 4.2)

This code combines the basic simulation input file from process 4.1 with the ensemble data from process 4.2. Running it produces the actual input files for the Tank model.

In [11]:
folder = {1:'original',2:'biascorrected'}

for bc_type in range(1,3):
    for month in range(start_month, end_month+1):
        for year in range(start_year, end_year+1):
    
            data = data2 = ""

            if year > 2016 : 
                lim = 51
            else :
                lim = 25
        
            for scenario in range(0, lim):
  
                # read the basic input file (from 4.1), then the ensemble file (from 4.2)
                with open(path + '/analysis/4.SFFs/1_input/' + catchment_name + '_' + str(year) + '_' + str(month).zfill(2) + '.txt') as fp:
                    data = fp.read()
  
                with open(path + '/analysis/4.SFFs/2_ensemble/' + folder[bc_type] +'/' + catchment_name + '_' + str(year) + '_' + str(month).zfill(2) + '_' + str(scenario) + '.txt') as fp:
                    data2 = fp.read()
  
                data += data2
  
                with open (path + '/analysis/4.SFFs/3_run/' + folder[bc_type] +'/' + catchment_name + '_' + str(year) + '_' + str(month).zfill(2) + '_' + str(scenario) + '.txt', 'w') as fp:
                    fp.write(data)
print('Now, Tank model input files are generated.')

Now, Tank model input files are generated.
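
The combination step is a plain text concatenation: the basic input file comes first and the ensemble rows are appended to it. A minimal sketch on two dummy files follows; the helper name `build_tank_input` is hypothetical and the file contents are placeholders, not the real Tank input format.

```python
import os
import tempfile


def build_tank_input(base_file, ensemble_file, out_file):
    """Write the basic input text followed by the ensemble forcing rows."""
    with open(base_file) as fp:
        header = fp.read()
    with open(ensemble_file) as fp:
        forcing = fp.read()
    with open(out_file, 'w') as fp:
        fp.write(header + forcing)


demo_dir = tempfile.mkdtemp()
base_file = os.path.join(demo_dir, 'A_2011_01.txt')
ensemble_file = os.path.join(demo_dir, 'A_2011_01_0_ens.txt')
out_file = os.path.join(demo_dir, 'A_2011_01_0.txt')

with open(base_file, 'w') as fp:
    fp.write('placeholder header line\n')       # stands in for the basic input
with open(ensemble_file, 'w') as fp:
    fp.write('2011-01-01       1.2\n')          # stands in for one forcing row

build_tank_input(base_file, ensemble_file, out_file)
with open(out_file) as fp:
    combined = fp.read()
print(combined)
```

Because the two texts are joined directly, the basic input file should end with a newline; otherwise the first forcing row is glued to its last line.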


### 5. Run Tank model

The input data for the SFFs simulation have been created, and now we run the Tank model for each scenario. For the simulation, the 'Sim_SMTank.exe' file must exist in the '4.SFFs/3_run' folder.

#### 5.1 Batch file generation for the multiple simulation

To run the Tank model for multiple scenarios, this code generates a batch file. Each line gives the model name, the input file name and the output file name, in that order.

In [12]:
folder = {1:'original',2:'biascorrected'}

# allocate the input and output file names
for bc_type in range(1,3):     #bias correction type : original (non-bias corrected), biascorrected
    
    bat_file = open(path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/run_sffs.bat', "w")
    
    for year in range(start_year, end_year +1):
        for month in range(1, 13):
            if year > 2016 : 
                lim = 51    # ensemble size from 2017 onwards
            else :
                lim = 25    # ensemble size up to 2016
        
            for scenario in range(0, lim):
                bat_file.write(str('Sim_SMTank') + " " + catchment_name + '_' + str(year) + '_' + str(month).zfill(2) 
                               + '_' + str(scenario) + '.txt' + " " + catchment_name + '_' + str(year) + '_' 
                               + str(month).zfill(2) + '_' + str(scenario) + '.out'+ "\n")
    bat_file.close()
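
As a quick check on what the batch file should contain, the sketch below rebuilds the command lines in memory and counts them. It follows the same line format and ensemble-size rule as the cell above; the function name `batch_lines` is hypothetical.

```python
def batch_lines(catchment, first_year, last_year, model='Sim_SMTank'):
    """Command lines of the batch file: model name, input file, output file."""
    lines = []
    for year in range(first_year, last_year + 1):
        members = 51 if year > 2016 else 25
        for month in range(1, 13):
            for scenario in range(members):
                stem = f"{catchment}_{year}_{str(month).zfill(2)}_{scenario}"
                lines.append(f"{model} {stem}.txt {stem}.out")
    return lines


lines = batch_lines('A', 2011, 2020)
print(lines[0])
print(len(lines))
```

For 2011 to 2020 this gives 6 x 12 x 25 + 4 x 12 x 51 = 4248 model runs per bias correction type, which is why the execution step takes a long time.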

#### 5.2 Execute batch file to run Tank model

Now we run the Tank model using the batch file we have just made. If you are working with multiple years and months, the whole simulation will take quite a long time to finish.

In [13]:
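# set working directory and batch file

def run(working_dir, batch_file):
    # the batch file refers to its input and output files by name only,
    # so it has to be executed from the folder that contains them
    os.chdir(working_dir)
    os.system(batch_file)

for bc_type in range(1,3):
    workingDir = (path + '/analysis/4.SFFs/3_run/' + folder[bc_type])
    executeFile = (path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/run_sffs.bat')
    run(workingDir, executeFile)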

print('Tank model simulation has completed!')

Tank model simulation has completed!
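
An alternative worth considering is `subprocess.run` with the `cwd` argument, which runs a command in a given folder without changing the notebook's own working directory and returns an exit code that can be checked. This is a sketch of that swapped-in technique, not the method used above. The helper name `run_in_folder` is hypothetical, and a trivial Python command stands in for the batch file so that the example runs on any system.

```python
import subprocess
import sys
import tempfile


def run_in_folder(command, working_dir):
    """Run a command with working_dir as its current directory; return the exit code."""
    completed = subprocess.run(command, cwd=working_dir)
    return completed.returncode


# stand-in command; for the real run on Windows the command could be
# something like ['cmd', '/c', 'run_sffs.bat'] (an assumption, not tested here)
demo_dir = tempfile.mkdtemp()
exit_code = run_in_folder([sys.executable, '-c', 'pass'], demo_dir)
print(exit_code)
```

A non-zero exit code signals that the command failed, which can be checked before the completion message is printed.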


### 6. Result data management

The SFFs simulation has finished, and now it is time to manage the output data.

#### 6.1 Select simulated flow data from output files

The output file contains several kinds of information, such as precipitation, temperature and simulated flow. We only need the date and the simulated flow. This code selects the simulated flow data, adds the date information, and saves the result in csv format.

Note that the unit of simulated flow is cubic meters per second (CMS).

In [14]:
for bc_type in range(1,3):
    for year in range(start_year, end_year+1):
        for month in range(start_month,end_month+1):
        
            if year > 2016 : 
                lim = 51
            else :
                lim = 25
                
            for scenario in range(0, lim):
                # calculate data gap between the warm up start date and the start of simulation
                gap = datetime.datetime(year, month, 1) - datetime.datetime(int(warmup_year), 1, 1)
                # select the column range (horizontal) for simulated flow data (No need to change this)
                colspecs = [(55,64)]
                # select the simulated flow data excluding the unnecessary rows (No need to change this)
                data_sel = pd.read_fwf(path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/' + catchment_name + '_' 
                                       + str(year) +  '_' + str(month).zfill(2) +  '_'  + str(scenario) + '.out', 
                                       skiprows=48+gap.days, skipfooter=41, colspecs=colspecs, 
                                       names=['Q_sim_' + str(scenario)])
                # allocate date
                index=pd.date_range(datetime.datetime(year, month, 1), periods=len(data_sel)) 
                data_sel['date'] = index
                data_sel = data_sel.set_index('date').reset_index()
                # save the selected data as csv format
                data_sel.to_csv(path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/' + catchment_name + '_' 
                                + str(year) +  '_' + str(month).zfill(2) +  '_'  + str(scenario) + '.csv')
print('Data selection is completed!')

Data selection is completed!
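
The `skiprows` value deserves a short worked example. The model runs from 1 January of the warm-up year, so the rows before the forecast start date have to be skipped in addition to the 48 header rows given in the cell above. The sketch below assumes one output row per day from the warm-up start; the function name `rows_to_skip` is hypothetical.

```python
import datetime


def rows_to_skip(year, month, warmup_year, header_rows=48):
    """Header rows plus the days between the warm-up start and the forecast start."""
    gap = datetime.datetime(year, month, 1) - datetime.datetime(warmup_year, 1, 1)
    return header_rows + gap.days


skip_jan_2011 = rows_to_skip(2011, 1, 2009)   # 730 warm-up days (2009 and 2010)
skip_mar_2013 = rows_to_skip(2013, 3, 2009)   # 1520 days, including leap year 2012
print(skip_jan_2011, skip_mar_2013)
```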


#### 6.2 Monthly ensemble integration

The previous process generated one simulated flow file per ensemble member for each forecast month. That leaves too many files to manage efficiently, so we integrate all ensemble members of each forecast month into a single dataset.

In [None]:
for bc_type in range(1,3):
    for year in range(start_year, end_year+1):
        for month in range(start_month, end_month+1):
            # read the observed flow data to insert in the dataset we are going to generate
            obs_flow = pd.read_csv(path + '/data/' + catchment_name + '_hydrologic_data.csv')
            obs_flow['date'] = obs_flow['date'].astype('datetime64[ns]')    # date type
        
            # read the first ensemble member (scenario 0) to use it as the first data column
            df_head = pd.read_csv(path + '/analysis/4.SFFs/3_run/'  + folder[bc_type] + '/' + catchment_name + '_' + str(year) 
                                  +  '_' + str(month).zfill(2) + '_' + str(0) + '.csv')
            df_head['date'] = df_head['date'].astype('datetime64[ns]')
            # insert lead time column
            df_head['leadtime'] = df_head['date'].dt.month - month + 1 + 12 * (df_head['date'].dt.year - year)
            df_head = df_head.iloc[:, [0,1,3,2]]

            if year > 2016 : 
                lim = 51
            else :
                lim = 25
            # add each remaining ensemble member as a new column next to the previous one
            for scenario in range(1, lim):
                # read simulated flow data one by one 
                df = pd.read_csv(path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/' + catchment_name + '_' + str(year) 
                                 +  '_' + str(month).zfill(2) +  '_'  + str(scenario) + '.csv')
                df['date'] = pd.to_datetime(df['date'])    # dates were saved in ISO format by to_csv in 6.1
                df_head[scenario] = df.iloc[:,2]
                   
            del df_head['Unnamed: 0'] # delete unnecessary column 
            df_head.rename(columns = {'Q_sim_0':0},inplace=True)               # simplify the column name
            # insert forecasted mean column
            df_head['mean'] = round(df_head.loc[:,0:].mean(axis=1),2)
            # insert observed flow column by looking up each date in the observed record
            df_head['obs'] = df_head['date'].map(obs_flow.set_index('date')['obs_flow'])
            df_head.set_index('date', inplace=True)
            # save the integrated SFFs ensemble datasets to csv format
            df_head.to_csv(path + '/analysis/4.SFFs/3_run/' + folder[bc_type] + '/[out]' + catchment_name +  '_' + str(year) 
                           +  '_' + str(month).zfill(2) + '.csv')
print('Monthly SFFs simulation ensemble datasets are created!')
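
The lead time formula and the ensemble mean can be checked by hand on a few values. The sketch below applies the same lead time expression as the cell above to single dates and averages a made-up set of ensemble flows; the function names are hypothetical.

```python
import datetime


def lead_time(valid_date, start_year, start_month):
    """Lead time in months; the forecast start month itself has lead time 1."""
    return valid_date.month - start_month + 1 + 12 * (valid_date.year - start_year)


def ensemble_mean(members):
    """Mean of the ensemble flows for one day, rounded to two decimals."""
    return round(sum(members) / len(members), 2)


# forecast issued for November 2011
lt_first = lead_time(datetime.date(2011, 11, 15), 2011, 11)   # same month
lt_cross = lead_time(datetime.date(2012, 1, 15), 2011, 11)    # crosses the year end
mean_q = ensemble_mean([10.0, 12.5, 11.0])                    # made-up flows in CMS
print(lt_first, lt_cross, mean_q)
```

The `12 * (year difference)` term is what keeps the lead time increasing when a forecast runs past December.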