# c) Ngonye Falls Flow Analysis

Load the synthetic historic daily flow series for Ngonye and produce various summary statistics for later presentation.

## Inputs

| Data                       | Source                                        | Description                                 |
|----------------------------|-----------------------------------------------|---------------------------------------------|
| ngonye_synthetic.csv | Notebook: b_synthetic_flow_ngonye | Synthetic daily flow series for Ngonye Falls 1924/25 - 2016/17 |
| selected_years.csv | Mott MacDonald - Ngonye Falls Hydropower Project - 2018 Feasibility Study Update - Final Report Version D | List of representative selected years |


## Outputs
| File                           | Description                                 |
|--------------------------------|---------------------------------------------|
| ngonye_flow_daily.csv          | Daily flow data  |
| ngonye_flow_monthly.csv        | Flow summaries by month  |
| ngonye_flow_yearly.csv         | Flow summaries by year  |
| ngonye_flow_calmonthly.csv     | Flow summaries by calendar month |
| ngonye_flow_selected_years.csv | Flow summaries for selected representative years  |



## Parameters

In [1]:
year = "2024"

In [2]:
input_data='./input_data/'
output_data='./output_data/'
if year != '':
    input_data+=year + '/'
    output_data+=year + '/'

## Libraries

In [3]:
import numpy as np
import pandas as pd
import datetime

## Load the Daily Data

In [23]:

#daily = pd.read_csv(output_data + "ngonye_synthetic.csv")
daily_file = "ngonye_synthetic"
if year != '':
    daily_file += '_' + year
daily_file += '.csv'

daily = pd.read_csv(output_data + daily_file)
daily.tail(4)

Unnamed: 0,Date,LaggedDate,VicFalls,Conversion,Flow,Exceedance
36510,2024-09-16,2024-09-27,238.4,0.961365,229.189501,0.91
36511,2024-09-17,2024-09-28,235.4,0.963129,226.720641,0.915
36512,2024-09-18,2024-09-29,228.5,0.973153,222.365504,0.921
36513,2024-09-19,2024-09-30,226.5,0.965849,218.764823,0.929


Index by date and add some other columns for later use. 

Add a column for *WaterYear*, which starts on 1st October and runs to 30th September of the following year. Each water year is labelled by the calendar year in which it starts.

In [24]:
daily['Date']=pd.to_datetime(daily['Date'],format="%Y-%m-%d")#"%d/%m/%Y")
daily=daily.set_index(pd.DatetimeIndex(daily['Date']))


In [25]:
daily['Year']=daily.index.year
daily['Month']=daily.index.month
daily['Day']=daily.index.day
daily['MonthId']=daily['Year']+daily['Month']/100
daily['WaterYear']=daily.apply((lambda x: (x['Year'] if x['Month']>=10 else x['Year']-1)),axis=1)
daily['WaterMonth']=daily.apply((lambda x: (x['Month']-9 if x['Month']>=10 else x['Month']+3)),axis=1)
daily['WaterDay']=daily.apply(lambda x: (x['Date']-pd.Timestamp(x['WaterYear'], 10, 1)).days+1,axis=1)
daily['WaterWeek']=np.floor((daily['WaterDay']-1)/7)+1
daily['Volume']=daily['Flow']*60*60*24/(1000*1000*1000)
daily=daily.astype({'WaterWeek': 'int32'})
daily=daily.drop('Date',axis=1)
daily

Date,LaggedDate,VicFalls,Conversion,Flow,Exceedance,Year,Month,Day,MonthId,WaterYear,WaterMonth,WaterDay,WaterWeek,Volume
1924-10-01,1924-10-12,100.0,1.364952,136.495189,0.998,1924,10,1,1924.10,1924,1,1,1,0.011793
1924-10-02,1924-10-13,100.0,1.364952,136.495189,0.998,1924,10,2,1924.10,1924,1,2,1,0.011793
1924-10-03,1924-10-14,100.0,1.364952,136.495189,0.998,1924,10,3,1924.10,1924,1,3,1,0.011793
1924-10-04,1924-10-15,100.0,1.364952,136.495189,0.998,1924,10,4,1924.10,1924,1,4,1,0.011793
1924-10-05,1924-10-16,100.0,1.364952,136.495189,0.998,1924,10,5,1924.10,1924,1,5,1,0.011793
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2024-09-15,2024-09-26,238.4,0.961365,229.189501,0.910,2024,9,15,2024.09,2023,12,351,51,0.019802
2024-09-16,2024-09-27,238.4,0.961365,229.189501,0.910,2024,9,16,2024.09,2023,12,352,51,0.019802
2024-09-17,2024-09-28,235.4,0.963129,226.720641,0.915,2024,9,17,2024.09,2023,12,353,51,0.019589
2024-09-18,2024-09-29,228.5,0.973153,222.365504,0.921,2024,9,18,2024.09,2023,12,354,51,0.019212
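
The water-year columns can also be derived without row-wise `apply`. The following is a minimal sketch, not the notebook's own method; the helper name `add_water_calendar` and the toy dates are illustrative assumptions. It uses the same convention as the cell above: the water year starts on 1 October and is labelled by its starting calendar year.

```python
import numpy as np
import pandas as pd


def add_water_calendar(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df (indexed by date) with water-year columns added.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    idx = pd.DatetimeIndex(out.index)

    # October to December belong to the water year starting that year;
    # January to September belong to the one that started the year before.
    water_year = np.where(idx.month >= 10, idx.year, idx.year - 1)

    # October -> 1, November -> 2, ..., September -> 12.
    water_month = (np.asarray(idx.month) - 10) % 12 + 1

    # Day count from 1 October of the water year (1-based).
    start = pd.to_datetime(pd.Series(water_year).astype(str) + "-10-01")
    water_day = np.asarray((idx - pd.DatetimeIndex(start)).days) + 1

    out["WaterYear"] = water_year
    out["WaterMonth"] = water_month
    out["WaterDay"] = water_day
    # Seven-day blocks counted from the start of the water year (1-based).
    out["WaterWeek"] = (water_day - 1) // 7 + 1
    return out


# Toy dates chosen to straddle a water-year boundary.
dates = pd.to_datetime(["1924-10-01", "1925-01-01", "1925-09-30", "1925-10-01"])
demo = pd.DataFrame({"Flow": [136.5, 422.2, 150.0, 158.2]}, index=dates)
result = add_water_calendar(demo)
print(result[["WaterYear", "WaterMonth", "WaterDay", "WaterWeek"]])
```

On a series of this length the vectorised form avoids calling a Python lambda once per row, which is usually noticeably faster, while producing the same column values.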


In [26]:
daily['Flow_difference']=np.abs(daily['Flow']-daily['Flow'].shift(1))
daily['Flow_difference_pct']=daily['Flow_difference']/daily['Flow']

## Setup the Monthly Data

Build the monthly table by grouping the daily data by month and counting the days in each month.

In [27]:

monthly=daily.groupby(['MonthId','Year','Month']).size().to_frame(name="Days").reset_index(['Month','Year'])
monthly

MonthId,Year,Month,Days
1924.10,1924,10,31
1924.11,1924,11,30
1924.12,1924,12,31
1925.01,1925,1,31
1925.02,1925,2,28
...,...,...,...
2024.05,2024,5,31
2024.06,2024,6,30
2024.07,2024,7,31
2024.08,2024,8,31


Set the index and add additional columns for later use.

In [28]:

monthly['Day']=1
monthly['DateStart']=pd.to_datetime(monthly[['Year','Month','Day']])

monthly['WaterYear']=monthly.apply((lambda x: (x['Year'] if x['Month']>=10 else x['Year']-1)),axis=1)
monthly['WaterMonth']=monthly.apply((lambda x: (x['Month']-9 if x['Month']>=10 else x['Month']+3)),axis=1)

monthly = monthly.drop('Day',axis=1)

monthly

MonthId,Year,Month,Days,DateStart,WaterYear,WaterMonth
1924.10,1924,10,31,1924-10-01,1924,1
1924.11,1924,11,30,1924-11-01,1924,2
1924.12,1924,12,31,1924-12-01,1924,3
1925.01,1925,1,31,1925-01-01,1924,4
1925.02,1925,2,28,1925-02-01,1924,5
...,...,...,...,...,...,...
2024.05,2024,5,31,2024-05-01,2023,8
2024.06,2024,6,30,2024-06-01,2023,9
2024.07,2024,7,31,2024-07-01,2023,10
2024.08,2024,8,31,2024-08-01,2023,11


## Monthly flow summaries

Add flow summaries (minimum, mean, median, maximum, range and total volume) to the monthly data.

In [29]:
monthly['Flow_min']=daily[['MonthId','Flow']].groupby('MonthId').min()
monthly['Flow_mean']=daily[['MonthId','Flow']].groupby('MonthId').mean()
monthly['Flow_median']=daily[['MonthId','Flow']].groupby('MonthId').median()
monthly['Flow_max']=daily[['MonthId','Flow']].groupby('MonthId').max()
monthly['Volume']=daily[['MonthId','Volume']].groupby('MonthId').sum()
monthly['Flow_range']=monthly['Flow_max']-monthly['Flow_min']
monthly[['Flow_min','Flow_mean','Flow_median','Flow_max','Flow_range']]
monthly

MonthId,Year,Month,Days,DateStart,WaterYear,WaterMonth,Flow_min,Flow_mean,Flow_median,Flow_max,Volume,Flow_range
1924.10,1924,10,31,1924-10-01,1924,1,127.731760,134.493603,136.495189,138.801881,0.360228,11.070120
1924.11,1924,11,30,1924-11-01,1924,2,130.058184,153.220935,147.271324,191.577672,0.397149,61.519488
1924.12,1924,12,31,1924-12-01,1924,3,195.246123,276.286727,269.886545,446.597147,0.740006,251.351024
1925.01,1925,1,31,1925-01-01,1924,4,422.232167,717.515348,631.830989,1204.278450,1.921793,782.046283
1925.02,1925,2,28,1925-02-01,1924,5,1126.839328,1217.500493,1187.050220,1434.746873,2.945377,307.907545
...,...,...,...,...,...,...,...,...,...,...,...,...
2024.05,2024,5,31,2024-05-01,2023,8,748.927190,827.433756,837.552351,875.472103,2.216199,126.544913
2024.06,2024,6,30,2024-06-01,2023,9,474.763172,661.708649,645.683617,864.179115,1.715149,389.415942
2024.07,2024,7,31,2024-07-01,2023,10,345.793207,399.628460,399.001860,471.077692,1.070365,125.284485
2024.08,2024,8,31,2024-08-01,2023,11,261.927202,300.203488,300.619108,343.698669,0.804065,81.771467
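
The monthly summaries above repeat one `groupby` per statistic. As a hedged alternative, the same columns could be produced in a single pass with pandas named aggregation. This is a sketch only: the function name `summarise_flow` and the toy data are assumptions, and the volume conversion simply mirrors the factor used earlier in this notebook.

```python
import pandas as pd


def summarise_flow(daily: pd.DataFrame, by: str) -> pd.DataFrame:
    """Summarise Flow and Volume per group in one groupby pass.

    Hypothetical helper; expects 'Flow' and 'Volume' columns plus the
    grouping column named by `by`.
    """
    summary = daily.groupby(by).agg(
        Days=("Flow", "size"),
        Flow_min=("Flow", "min"),
        Flow_mean=("Flow", "mean"),
        Flow_median=("Flow", "median"),
        Flow_max=("Flow", "max"),
        Volume=("Volume", "sum"),
    )
    summary["Flow_range"] = summary["Flow_max"] - summary["Flow_min"]
    return summary


# Toy data: three days in one month, two in the next.
toy = pd.DataFrame(
    {
        "MonthId": [1924.10, 1924.10, 1924.10, 1924.11, 1924.11],
        "Flow": [100.0, 120.0, 110.0, 200.0, 260.0],
    }
)
# Same conversion factor as the daily 'Volume' column in this notebook.
toy["Volume"] = toy["Flow"] * 60 * 60 * 24 / (1000 * 1000 * 1000)

monthly_summary = summarise_flow(toy, "MonthId")
print(monthly_summary)
```

The same helper could be called with `by="WaterYear"` or `by="WaterMonth"` to obtain the annual and calendar-month tables, provided those columns exist on the daily frame.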


## Annual Flow

Create a data table for annual (water year) summaries and populate.

In [30]:
yearly=monthly[['WaterYear']].groupby('WaterYear').count()

yearly['Flow_min']=daily[['WaterYear','Flow']].groupby('WaterYear').min()
yearly['Flow_median']=daily[['WaterYear','Flow']].groupby('WaterYear').median()
yearly['Flow_mean']=daily[['WaterYear','Flow']].groupby('WaterYear').mean()
yearly['Flow_max']=daily[['WaterYear','Flow']].groupby('WaterYear').max()
yearly['Flow_range']=yearly['Flow_max']-yearly['Flow_min']
yearly['Volume']=monthly[['WaterYear','Volume']].groupby('WaterYear').sum()

yearly

WaterYear,Flow_min,Flow_median,Flow_mean,Flow_max,Flow_range,Volume
1924,127.731760,588.281801,998.207580,3456.336763,3328.605002,31.479474
1925,158.194815,512.185296,1124.989810,4581.589710,4423.394894,35.477679
1926,195.246123,588.281801,998.244554,3284.325999,3089.079876,31.480640
1927,178.257008,525.137271,858.865621,2291.304753,2113.047745,27.159392
1928,199.871920,394.012113,572.575542,1571.142409,1371.270489,18.056742
...,...,...,...,...,...,...
2019,128.064107,631.488477,1381.475806,4669.790989,4541.726882,43.685581
2020,260.019625,797.960628,1356.290955,3731.765568,3471.745943,42.771992
2021,264.119331,727.839240,1087.201593,3102.277326,2838.157995,34.285989
2022,228.804955,726.416382,1151.787847,2768.350384,2539.545429,36.322782


In [31]:
Flow_mean_mean=yearly['Flow_mean'].describe()['mean']
Flow_max_mean=yearly['Flow_max'].describe()['mean']
Flow_min_mean=yearly['Flow_min'].describe()['mean']
Volume_mean=yearly['Volume'].describe()['mean']


yearly['Flow_mean_pct_var']=(yearly['Flow_mean']-Flow_mean_mean)/Flow_mean_mean*100
yearly['Flow_max_pct_var']=(yearly['Flow_max']-Flow_max_mean)/Flow_max_mean*100
yearly['Flow_min_pct_var']=(yearly['Flow_min']-Flow_min_mean)/Flow_min_mean*100
yearly['Volume_pct_var']=(yearly['Volume']-Volume_mean)/Volume_mean*100


Flow_mean_mean

np.float64(1106.1911591895055)

In [32]:
yearly['Flow_mean_5yr_mvCoefVar']=yearly['Flow_mean'].rolling(5,center=True).std()/Flow_mean_mean*100
yearly.loc[:,['Flow_mean_pct_var','Volume_pct_var']]

WaterYear,Flow_mean_pct_var,Volume_pct_var
1924,-9.761747,-9.811237
1925,1.699403,1.643627
1926,-9.758404,-9.807897
1927,-22.358300,-22.188282
1928,-48.239006,-48.267394
...,...,...
2019,24.885812,25.159284
2020,22.609094,22.541850
2021,-1.716662,-1.770565
2022,4.121954,4.064849


In [33]:
mins=daily[['Year','Flow']].groupby('Year').idxmin()
mins=mins.reset_index()
mins['DaysToStart']=mins.apply(lambda x: x['Flow']-pd.Timestamp(datetime.date(x['Year'], 10, 1)),axis=1)
mins=mins.set_index('Year')
yearly['DaysToStart']=mins['DaysToStart']
yearly['SeasonStart']=mins['Flow']
yearly

WaterYear,Flow_min,Flow_median,Flow_mean,Flow_max,Flow_range,Volume,Flow_mean_pct_var,Flow_max_pct_var,Flow_min_pct_var,Volume_pct_var,Flow_mean_5yr_mvCoefVar,DaysToStart,SeasonStart
1924,127.731760,588.281801,998.207580,3456.336763,3328.605002,31.479474,-9.761747,-3.851637,-40.607539,-9.811237,,24 days,1924-10-25
1925,158.194815,512.185296,1124.989810,4581.589710,4423.394894,35.477679,1.699403,27.450645,-26.442888,1.643627,,29 days,1925-10-30
1926,195.246123,588.281801,998.244554,3284.325999,3089.079876,31.480640,-9.758404,-8.636632,-9.214844,-9.807897,19.083703,23 days,1926-10-24
1927,178.257008,525.137271,858.865621,2291.304753,2113.047745,27.159392,-22.358300,-36.260493,-17.114409,-22.188282,19.636906,8 days,1927-10-09
1928,199.871920,394.012113,572.575542,1571.142409,1371.270489,18.056742,-48.239006,-56.293967,-7.063950,-48.267394,15.642644,28 days,1928-10-29
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019,128.064107,631.488477,1381.475806,4669.790989,4541.726882,43.685581,24.885812,29.904227,-40.453005,25.159284,38.025318,12 days,2019-10-13
2020,260.019625,797.960628,1356.290955,3731.765568,3471.745943,42.771992,22.609094,3.810240,20.903410,22.541850,31.568952,32 days,2020-11-02
2021,264.119331,727.839240,1087.201593,3102.277326,2838.157995,34.285989,-1.716662,-13.700861,22.809684,-1.770565,31.585380,8 days,2021-10-09
2022,228.804955,726.416382,1151.787847,2768.350384,2539.545429,36.322782,4.121954,-22.990040,6.389275,4.064849,,20 days,2022-10-21


In [34]:
annual_fdcs=pd.DataFrame(index=np.arange(0,1.01,0.01),columns=np.arange(yearly.index.min(),yearly.index.max()+1,1))
for col in annual_fdcs.columns:
    annual_fdcs[col]=np.percentile(daily.loc[daily['WaterYear']==col]['Flow'],((1-annual_fdcs.index)*100))

annual_fdcs

Unnamed: 0,1924,1925,1926,1927,1928,1929,1930,1931,1932,1933,...,2014,2015,2016,2017,2018,2019,2020,2021,2022,2023
0.00,3456.336763,4581.589710,3284.325999,2291.304753,1571.142409,2219.934107,3334.213303,3817.568446,1596.347813,5660.610649,...,1465.774290,2994.142736,3912.594729,4829.072451,1091.854150,4669.790989,3731.765568,3102.277326,2768.350384,875.472103
0.01,3405.036349,4519.664793,3219.481954,2291.304753,1553.819042,2172.623673,3284.325999,3758.426849,1571.142409,5538.904482,...,1465.774290,2994.142736,3891.864195,4718.167941,1063.069916,4594.424741,3688.125186,3084.696924,2758.675760,872.479036
0.02,3320.244858,4463.518874,3096.788220,2280.481232,1541.139182,2090.394064,3240.164096,3684.024713,1521.470905,5363.472195,...,1455.726450,2989.193143,3849.329267,4669.790989,1051.432177,4535.382989,3643.908773,3057.497404,2717.777839,867.169354
0.03,3284.325999,4446.860929,3009.493617,2247.724603,1505.868304,2032.360447,3201.694407,3660.955789,1505.868304,5130.056056,...,1437.296274,2977.644095,3756.783210,4612.543699,1041.571998,4468.287721,3595.346156,3028.106650,2685.026401,862.240415
0.04,3229.186199,4390.867393,2881.601495,2247.724603,1475.275875,1935.634031,3118.294753,3634.878694,1477.100832,4926.284777,...,1426.443296,2977.644095,3611.195148,4549.916867,1029.568238,4396.589687,3546.523941,3002.083733,2620.474848,850.842092
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
0.96,136.495189,164.862071,201.421090,182.720686,217.176184,195.986388,217.176184,237.165433,199.871920,170.141269,...,232.482472,183.476067,174.878285,182.944118,174.989825,143.183453,283.388689,270.470616,236.800331,241.259167
0.97,134.689211,164.862071,199.871920,182.720686,203.806146,188.837117,216.751818,237.165433,199.871920,170.141269,...,228.324272,182.770526,165.493726,179.472266,174.641448,138.030734,281.682757,270.069602,235.099385,238.468745
0.98,130.709583,163.756830,199.871920,182.720686,203.806146,187.970760,203.870454,237.165433,197.815960,163.327014,...,228.324272,175.597317,165.493726,179.472266,171.917830,136.720831,277.116092,269.953933,234.763575,231.282681
0.99,127.731760,160.702102,197.815960,181.080511,202.198697,187.633843,203.806146,237.165433,196.092140,160.702102,...,222.518639,174.950786,160.914110,174.989825,164.866332,133.520748,267.013188,266.022033,232.862656,229.189501


In [35]:
monthly_fdcs=pd.DataFrame(index=np.arange(0,1.01,0.01),columns=[1,2,3,4,5,6,7,8,9,10,11,12])
for col in monthly_fdcs.columns:
    monthly_fdcs[col]=np.percentile(daily.loc[daily['WaterMonth']==col]['Flow'],((1-monthly_fdcs.index)*100))

monthly_fdcs

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,11,12
0.00,485.288747,684.511435,1155.303842,3908.572185,9912.215609,9499.656578,8557.992907,5744.292056,3973.851494,1846.584163,919.143599,570.163454
0.01,429.777322,561.471933,892.979994,2229.223020,5289.846312,7933.788119,6294.130796,5252.893027,2814.254193,1370.778567,735.526914,492.553133
0.02,394.053370,507.012910,839.102820,1753.491314,4642.471485,6892.322933,5967.374667,4640.451452,2539.648771,1238.903747,674.100374,469.078740
0.03,379.329461,477.815611,797.866684,1546.596285,4433.210320,6402.329044,5679.900165,4378.110444,2403.516274,1186.565039,636.760477,446.597147
0.04,364.579802,446.597147,763.089293,1396.699854,3880.952146,5949.608950,5537.410608,4154.639828,2321.114050,1124.341932,619.754519,436.683421
...,...,...,...,...,...,...,...,...,...,...,...,...
0.96,151.856495,167.108926,246.534838,406.719180,576.195919,772.481241,902.497860,834.563399,412.270891,306.136978,234.625394,189.816276
0.97,143.592939,160.914110,239.958962,396.074959,542.648122,756.680204,871.715491,782.319421,393.887175,290.383657,225.803194,179.472266
0.98,136.495189,155.560336,230.054228,382.509869,497.205949,674.047516,837.323076,710.954664,364.495188,265.983804,213.049318,166.272859
0.99,136.347567,141.586964,208.026015,331.992245,469.078740,629.733576,733.859193,576.195919,321.158520,250.254844,203.803160,156.633133


In [36]:
yearly['MeanQ3070']=annual_fdcs.loc[(annual_fdcs.index>=0.3) & (annual_fdcs.index<=0.7)].mean()
yearly

WaterYear,Flow_min,Flow_median,Flow_mean,Flow_max,Flow_range,Volume,Flow_mean_pct_var,Flow_max_pct_var,Flow_min_pct_var,Volume_pct_var,Flow_mean_5yr_mvCoefVar,DaysToStart,SeasonStart,MeanQ3070
1924,127.731760,588.281801,998.207580,3456.336763,3328.605002,31.479474,-9.761747,-3.851637,-40.607539,-9.811237,,24 days,1924-10-25,680.119802
1925,158.194815,512.185296,1124.989810,4581.589710,4423.394894,35.477679,1.699403,27.450645,-26.442888,1.643627,,29 days,1925-10-30,576.118215
1926,195.246123,588.281801,998.244554,3284.325999,3089.079876,31.480640,-9.758404,-8.636632,-9.214844,-9.807897,19.083703,23 days,1926-10-24,699.026049
1927,178.257008,525.137271,858.865621,2291.304753,2113.047745,27.159392,-22.358300,-36.260493,-17.114409,-22.188282,19.636906,8 days,1927-10-09,571.820717
1928,199.871920,394.012113,572.575542,1571.142409,1371.270489,18.056742,-48.239006,-56.293967,-7.063950,-48.267394,15.642644,28 days,1928-10-29,437.303656
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019,128.064107,631.488477,1381.475806,4669.790989,4541.726882,43.685581,24.885812,29.904227,-40.453005,25.159284,38.025318,12 days,2019-10-13,716.042048
2020,260.019625,797.960628,1356.290955,3731.765568,3471.745943,42.771992,22.609094,3.810240,20.903410,22.541850,31.568952,32 days,2020-11-02,926.635406
2021,264.119331,727.839240,1087.201593,3102.277326,2838.157995,34.285989,-1.716662,-13.700861,22.809684,-1.770565,31.585380,8 days,2021-10-09,731.695869
2022,228.804955,726.416382,1151.787847,2768.350384,2539.545429,36.322782,4.121954,-22.990040,6.389275,4.064849,,20 days,2022-10-21,893.468925


In [37]:
fdc=pd.DataFrame({'Exceedance': np.arange(0,1.001,0.001)}).set_index('Exceedance')
fdc['Mean']=np.percentile(yearly['Flow_mean'],((1-fdc.index)*100))
fdc['Max']=np.percentile(yearly['Flow_max'],((1-fdc.index)*100))
fdc['Min']=np.percentile(yearly['Flow_min'],((1-fdc.index)*100))
fdc['Median']=np.percentile(yearly['Flow_median'],((1-fdc.index)*100))
fdc['MeanQ3070']=np.percentile(yearly['MeanQ3070'],((1-fdc.index)*100))

fdc

Exceedance,Mean,Max,Min,Median,MeanQ3070
0.000,2342.622725,9912.215609,341.556844,1105.667804,1273.272149
0.001,2324.551317,9834.228701,341.125269,1099.540534,1265.611172
0.002,2306.479908,9756.241792,340.693693,1093.413265,1257.950196
0.003,2288.408500,9678.254883,340.262118,1087.285996,1250.289220
0.004,2270.337092,9600.267974,339.830542,1081.158727,1242.628243
...,...,...,...,...,...
0.996,389.042310,860.910146,127.731760,269.419001,297.035460
0.997,384.471170,858.523336,127.731760,268.562868,295.342461
0.998,379.900030,856.136525,127.731760,267.706735,293.649463
0.999,375.328889,853.749714,127.731760,266.850602,291.956464


In [38]:
yearly['ExceedanceMean']=pd.merge_asof(yearly.reset_index().sort_values('Flow_mean'),fdc.reset_index().sort_values('Mean'),left_on='Flow_mean',right_on='Mean').set_index('WaterYear')['Exceedance']
yearly['ExceedanceMedian']=pd.merge_asof(yearly.reset_index().sort_values('Flow_median'),fdc.reset_index().sort_values('Median'),left_on='Flow_median',right_on='Median').set_index('WaterYear')['Exceedance']
yearly['ExceedanceMeanQ3070']=pd.merge_asof(yearly.reset_index().sort_values('MeanQ3070'),fdc.reset_index().sort_values('MeanQ3070'),left_on='MeanQ3070',right_on='MeanQ3070').set_index('WaterYear')['Exceedance']
yearly

WaterYear,Flow_min,Flow_median,Flow_mean,Flow_max,Flow_range,Volume,Flow_mean_pct_var,Flow_max_pct_var,Flow_min_pct_var,Volume_pct_var,Flow_mean_5yr_mvCoefVar,DaysToStart,SeasonStart,MeanQ3070,ExceedanceMean,ExceedanceMedian,ExceedanceMeanQ3070
1924,127.731760,588.281801,998.207580,3456.336763,3328.605002,31.479474,-9.761747,-3.851637,-40.607539,-9.811237,,24 days,1924-10-25,680.119802,0.586,0.588,0.516
1925,158.194815,512.185296,1124.989810,4581.589710,4423.394894,35.477679,1.699403,27.450645,-26.442888,1.643627,,29 days,1925-10-30,576.118215,0.445,0.728,0.738
1926,195.246123,588.281801,998.244554,3284.325999,3089.079876,31.480640,-9.758404,-8.636632,-9.214844,-9.807897,19.083703,23 days,1926-10-24,699.026049,0.576,0.588,0.475
1927,178.257008,525.137271,858.865621,2291.304753,2113.047745,27.159392,-22.358300,-36.260493,-17.114409,-22.188282,19.636906,8 days,1927-10-09,571.820717,0.697,0.708,0.758
1928,199.871920,394.012113,572.575542,1571.142409,1371.270489,18.056742,-48.239006,-56.293967,-7.063950,-48.267394,15.642644,28 days,1928-10-29,437.303656,0.920,0.960,0.950
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019,128.064107,631.488477,1381.475806,4669.790989,4541.726882,43.685581,24.885812,29.904227,-40.453005,25.159284,38.025318,12 days,2019-10-13,716.042048,0.223,0.455,0.435
2020,260.019625,797.960628,1356.290955,3731.765568,3471.745943,42.771992,22.609094,3.810240,20.903410,22.541850,31.568952,32 days,2020-11-02,926.635406,0.243,0.182,0.172
2021,264.119331,727.839240,1087.201593,3102.277326,2838.157995,34.285989,-1.716662,-13.700861,22.809684,-1.770565,31.585380,8 days,2021-10-09,731.695869,0.485,0.314,0.425
2022,228.804955,726.416382,1151.787847,2768.350384,2539.545429,36.322782,4.121954,-22.990040,6.389275,4.064849,,20 days,2022-10-21,893.468925,0.405,0.324,0.192


## Calendar months

Produce summaries of daily flow by calendar month, ordered by water month (October first).

In [39]:
calmonthly=pd.DataFrame({'WaterMonth': [1,2,3,4,5,6,7,8,9,10,11,12],'MonthName': ['Oct','Nov','Dec','Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep'], 'Month':[10,11,12,1,2,3,4,5,6,7,8,9]})
calmonthly=calmonthly.set_index('WaterMonth')
calmonthly['Flow_min']=daily[['WaterMonth','Flow']].groupby('WaterMonth').min()
calmonthly['Flow_mean']=daily[['WaterMonth','Flow']].groupby('WaterMonth').mean()
calmonthly['Flow_median']=daily[['WaterMonth','Flow']].groupby('WaterMonth').median()
calmonthly['Flow_max']=daily[['WaterMonth','Flow']].groupby('WaterMonth').max()
calmonthly['Flow_std']=daily[['WaterMonth','Flow']].groupby('WaterMonth').std()
calmonthly['Flow_coefvar']=(calmonthly['Flow_std']/calmonthly['Flow_mean']*100).round(1)
calmonthly['Flow_difference_median']=daily[['WaterMonth','Flow_difference']].groupby('WaterMonth').median()
calmonthly['Flow_difference_mean']=daily[['WaterMonth','Flow_difference']].groupby('WaterMonth').mean()
calmonthly['Flow_difference_pct_mean']=daily[['WaterMonth','Flow_difference_pct']].groupby('WaterMonth').mean()
calmonthly

WaterMonth,MonthName,Month,Flow_min,Flow_mean,Flow_median,Flow_max,Flow_std,Flow_coefvar,Flow_difference_median,Flow_difference_mean,Flow_difference_pct_mean
1,Oct,10,127.73176,239.052329,229.631097,485.288747,59.316979,24.8,0.500476,2.075379,0.008884
2,Nov,11,127.73176,276.050585,260.604337,684.511435,82.135951,29.8,2.555949,4.354045,0.015329
3,Dec,12,188.837117,438.63772,412.270891,1155.303842,145.725938,33.2,5.842028,9.065609,0.020227
4,Jan,1,269.886545,737.524624,667.641871,3908.572185,334.278918,45.3,9.060193,16.3,0.01954
5,Feb,2,394.012113,1368.533829,982.050312,9912.215609,1143.996274,83.6,13.497258,34.149638,0.020672
6,Mar,3,525.137271,2527.2707,2229.22302,9499.656578,1635.751481,64.7,25.343473,54.384356,0.020652
7,Apr,4,646.284527,2953.606288,2949.258926,8557.992907,1374.956015,46.6,22.388346,35.157539,0.011477
8,May,5,424.994262,2244.176063,2197.52496,5744.292056,957.016844,42.6,30.87265,35.523461,0.016017
9,Jun,6,269.886545,1225.975808,1170.398795,3973.851494,551.390073,45.0,26.446092,29.530189,0.024175
10,Jul,7,221.684297,613.185315,555.769442,1846.584163,241.950723,39.5,9.490327,11.909224,0.017738


## Calendar Month Flow exceedance

Flow exceedance values by calendar month.

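P90 is the flow that is exceeded 90% of the time, so it corresponds to the 0.10 quantile of the flow distribution. Here the quantiles are taken over the monthly mean flows for each calendar month.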
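
The link between an exceedance probability and a quantile can be sketched as follows. This is an illustrative example under stated assumptions, not part of the original analysis: the function name `exceedance_flow` and the toy series of flows 1 to 101 are hypothetical. It uses the same `(1 - exceedance) * 100` mapping that the flow duration curve cells in this notebook pass to `np.percentile`.

```python
import numpy as np
import pandas as pd


def exceedance_flow(flows, exceedance: float) -> float:
    """Flow equalled or exceeded for the given fraction of time (0 to 1).

    Hypothetical helper: an exceedance of 0.90 maps to the 10th percentile.
    """
    values = np.asarray(flows, dtype=float)
    return float(np.percentile(values, (1.0 - exceedance) * 100.0))


# Toy series: 101 evenly spaced flows, so percentiles are easy to check.
flows = pd.Series(np.arange(1.0, 102.0))

p90 = exceedance_flow(flows, 0.90)  # low flow, exceeded most of the time
p50 = exceedance_flow(flows, 0.50)  # median flow
p10 = exceedance_flow(flows, 0.10)  # high flow, rarely exceeded

# The pandas quantile uses the complementary probability.
q10 = float(flows.quantile(0.10))

print(p90, p50, p10, q10)
```

Note the inversion: a high exceedance value such as P95 is a low flow, which is why the cell below pairs `Flow_P95` with `quantile(0.05)` and `Flow_P05` with `quantile(0.95)`.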

In [40]:

calmonthly['Flow_P95']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.05)
calmonthly['Flow_P90']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.1)
#calmonthly['Flow_P80']=monthly[['Month','Flow_mean']].groupby('Month').quantile(0.2)
calmonthly['Flow_P75']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.25)
calmonthly['Flow_P50']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.5)
calmonthly['Flow_P25']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.75)
#calmonthly['Flow_P20']=monthly[['Month','Flow_mean']].groupby('Month').quantile(0.8)
calmonthly['Flow_P10']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.9)
calmonthly['Flow_P05']=monthly[['WaterMonth','Flow_mean']].groupby('WaterMonth').quantile(0.95)

calmonthly

WaterMonth,MonthName,Month,Flow_min,Flow_mean,Flow_median,Flow_max,Flow_std,Flow_coefvar,Flow_difference_median,Flow_difference_mean,Flow_difference_pct_mean,Flow_P95,Flow_P90,Flow_P75,Flow_P50,Flow_P25,Flow_P10,Flow_P05
1,Oct,10,127.73176,239.052329,229.631097,485.288747,59.316979,24.8,0.500476,2.075379,0.008884,161.246785,171.923612,201.400175,228.373734,273.350007,307.057407,342.787804
2,Nov,11,127.73176,276.050585,260.604337,684.511435,82.135951,29.8,2.555949,4.354045,0.015329,179.499103,201.231447,219.909697,266.061641,305.921608,379.317907,414.844339
3,Dec,12,188.837117,438.63772,412.270891,1155.303842,145.725938,33.2,5.842028,9.065609,0.020227,289.062446,307.981238,348.853748,408.3713,516.709212,579.90117,709.440588
4,Jan,1,269.886545,737.524624,667.641871,3908.572185,334.278918,45.3,9.060193,16.3,0.01954,454.654975,489.832653,565.926335,656.340481,836.532168,995.187282,1284.897794
5,Feb,2,394.012113,1368.533829,982.050312,9912.215609,1143.996274,83.6,13.497258,34.149638,0.020672,659.927156,718.628502,837.422393,1006.522647,1465.90186,2475.486309,3148.486942
6,Mar,3,525.137271,2527.2707,2229.22302,9499.656578,1635.751481,64.7,25.343473,54.384356,0.020652,839.063044,915.084829,1193.07584,2356.074068,3180.161859,4409.373161,5621.473162
7,Apr,4,646.284527,2953.606288,2949.258926,8557.992907,1374.956015,46.6,22.388346,35.157539,0.011477,989.142958,1251.885534,1897.167845,2947.472854,3794.054358,4721.505048,5110.173251
8,May,5,424.994262,2244.176063,2197.52496,5744.292056,957.016844,42.6,30.87265,35.523461,0.016017,825.5892,1116.053637,1667.805961,2277.598336,2690.289672,3325.23944,3780.492504
9,Jun,6,269.886545,1225.975808,1170.398795,3973.851494,551.390073,45.0,26.446092,29.530189,0.024175,463.520946,659.95357,875.671631,1200.796896,1498.981914,1818.024852,2076.555367
10,Jul,7,221.684297,613.185315,555.769442,1846.584163,241.950723,39.5,9.490327,11.909224,0.017738,332.016557,377.428966,459.575585,573.56422,745.111119,878.665515,1029.306775


## Prepare the Representative Years Summaries

In [41]:

selected = pd.read_csv("./input_data/" + "selected_years.csv").rename(columns={"Year": "WaterYear"}).set_index('WaterYear')
selected['Flow_min']=yearly['Flow_min']
selected['Flow_mean']=yearly['Flow_mean']
selected['Flow_max']=yearly['Flow_max']
selected['Volume']=yearly['Volume']
selected['ExceedanceMean']=yearly['ExceedanceMean']
selected['ExceedanceMedian']=yearly['ExceedanceMedian']
selected['ExceedanceMeanQ3070']=yearly['ExceedanceMeanQ3070']

selected

WaterYear,Class,Flow_Exceedance,Flow_min,Flow_mean,Flow_max,Volume,ExceedanceMean,ExceedanceMedian,ExceedanceMeanQ3070
1967,Very Wet,Q3,255.575177,1871.649977,5594.412902,59.186064,0.041,0.052,0.021
2013,Wet,Q12,198.121232,1347.437994,3614.128647,42.492805,0.253,0.133,0.112
2002,Median,Q50,197.422366,1081.333566,3891.772691,34.100935,0.516,0.475,0.506
1990,Dry,Q90,201.42109,775.178142,2321.11405,24.446018,0.768,0.92,0.889
1996,Very Dry,Q97,127.73176,556.906125,1387.532392,17.562592,0.93,0.97,0.96


In [42]:
flow_fdc=pd.DataFrame({'Exceedance': np.arange(0,1.001,0.001)}).set_index('Exceedance')
flow_fdc['Flow']=np.percentile(daily['Flow'],((1-flow_fdc.index)*100))
flow_fdc

Exceedance,Flow
0.000,9912.215609
0.001,8709.985085
0.002,7668.954087
0.003,6835.562714
0.004,6342.131411
...,...
0.996,143.592939
0.997,138.801881
0.998,136.495189
0.999,134.711033


In [45]:
floods = pd.read_csv("./input_data/" + "flood_return.csv").set_index('ReturnYears')
floods[['LastDate','WaterYear']]=pd.merge_asof(daily.reset_index().sort_values('Flow'),floods.reset_index(),left_on='Flow',right_on='Flow')[['Date','WaterYear','ReturnYears']].groupby('ReturnYears').max()
floods['YearsSince']=daily['WaterYear'].max()-floods['WaterYear']  # latest water year in the series (2023 here)
floods


ReturnYears,Flow,LastDate,WaterYear,YearsSince
2,3418.0,2021-03-19,2020.0,3.0
5,5124.0,2010-04-27,2009.0,14.0
10,6218.0,1978-04-22,1977.0,46.0
15,6724.0,1969-04-21,1968.0,55.0
20,7231.0,1969-04-18,1968.0,55.0
50,8489.0,1969-04-05,1968.0,55.0
100,9395.0,1958-03-01,1957.0,66.0
200,10272.0,NaT,,
500,11391.0,NaT,,
1000,12212.0,NaT,,


In [46]:
for flood in floods.reset_index().itertuples():
    if flood.ReturnYears<=100:     
        floods.at[flood.ReturnYears,'MeanDays']=daily.loc[daily['Flow']>=flood.Flow].groupby('WaterYear').count().mean()['Flow']
floods

ReturnYears,Flow,LastDate,WaterYear,YearsSince,MeanDays
2,3418.0,2021-03-19,2020.0,3.0,44.56
5,5124.0,2010-04-27,2009.0,14.0,26.428571
10,6218.0,1978-04-22,1977.0,46.0,33.2
15,6724.0,1969-04-21,1968.0,55.0,29.0
20,7231.0,1969-04-18,1968.0,55.0,30.0
50,8489.0,1969-04-05,1968.0,55.0,15.666667
100,9395.0,1958-03-01,1957.0,66.0,12.0
200,10272.0,NaT,,,
500,11391.0,NaT,,,
1000,12212.0,NaT,,,


## Weekly

In [47]:
daily.head(2)

Date,LaggedDate,VicFalls,Conversion,Flow,Exceedance,Year,Month,Day,MonthId,WaterYear,WaterMonth,WaterDay,WaterWeek,Volume,Flow_difference,Flow_difference_pct
1924-10-01,1924-10-12,100.0,1.364952,136.495189,0.998,1924,10,1,1924.1,1924,1,1,1,0.011793,,
1924-10-02,1924-10-13,100.0,1.364952,136.495189,0.998,1924,10,2,1924.1,1924,1,2,1,0.011793,0.0,0.0


In [54]:
# Weekly means of Flow and Exceedance for each water year and water week
weekly=daily.drop(['LaggedDate','VicFalls','Conversion','Volume','Flow_difference','Flow_difference_pct','Month','WaterMonth','Year','MonthId','Day','WaterDay'],axis=1).groupby(["WaterYear","WaterWeek"]).mean()
# Add the weekly extremes, the total volume, and the calendar year and date on which each week starts
weekly=weekly.join(daily.reset_index().groupby(["WaterYear","WaterWeek"]).agg(
   Flow_max=('Flow', 'max'),
   Flow_min=('Flow', 'min'),
   Year=('Year','min'),
   Volume=('Volume', 'sum'),
   Date=('Date','min')
))
# Week-on-week change in mean flow, signed and absolute; the _pct columns are fractions of the current week's flow
weekly['Flow_difference']=weekly['Flow']-weekly['Flow'].shift(1)
weekly['Flow_difference_abs']=np.abs(weekly['Flow']-weekly['Flow'].shift(1))
weekly['Flow_difference_pct']=weekly['Flow_difference']/weekly['Flow']
weekly['Flow_difference_abs_pct']=weekly['Flow_difference_abs']/weekly['Flow']
weekly

WaterYear,WaterWeek,Flow,Exceedance,Flow_max,Flow_min,Year,Volume,Date,Flow_difference,Flow_difference_abs,Flow_difference_pct,Flow_difference_abs_pct
1924,1,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-01,,,,
1924,2,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-08,0.000000,0.000000,0.000000,0.000000
1924,3,136.106534,0.998286,136.495189,133.774604,1924,0.082317,1924-10-15,-0.388655,0.388655,-0.002856,0.002856
1924,4,131.775432,0.999286,138.801881,127.731760,1924,0.079698,1924-10-22,-4.331102,4.331102,-0.032867,0.032867
1924,5,130.683910,0.999857,134.711033,127.731760,1924,0.079038,1924-10-29,-1.091522,1.091522,-0.008352,0.008352
...,...,...,...,...,...,...,...,...,...,...,...,...
2023,47,284.970132,0.811143,295.974804,274.404478,2024,0.172350,2024-08-18,-21.340113,21.340113,-0.074885,0.074885
2023,48,265.584372,0.843429,270.817188,261.927202,2024,0.160625,2024-08-25,-19.385760,19.385760,-0.072993,0.072993
2023,49,250.395517,0.876286,257.195153,239.260029,2024,0.151439,2024-09-01,-15.188855,15.188855,-0.060659,0.060659
2023,50,233.723249,0.902286,239.260029,229.189501,2024,0.141356,2024-09-08,-16.672268,16.672268,-0.071333,0.071333


In [55]:
# A peak is a week whose mean flow exceeds that of both neighbouring weeks
weekly['IsPeak']=(weekly['Flow']>weekly['Flow'].shift(1)) & (weekly['Flow']>weekly['Flow'].shift(-1))
weekly=weekly.reset_index()
weekly['YearWeek']=weekly['WaterYear']*1.0+weekly['WaterWeek']/100
weekly=weekly.set_index('YearWeek')
weekly['Yearly_max']=weekly.reset_index().merge(yearly,left_on='WaterYear',right_on='WaterYear')[['YearWeek','Flow_max_y']].set_index('YearWeek')
# Keep only peaks above half of the water year's maximum flow; assign the result instead of using inplace on a column
weekly['IsPeak']=weekly['IsPeak'].where(weekly['Flow']>weekly['Yearly_max']/2,other=False)
weekly


YearWeek,WaterYear,WaterWeek,Flow,Exceedance,Flow_max,Flow_min,Year,Volume,Date,Flow_difference,Flow_difference_abs,Flow_difference_pct,Flow_difference_abs_pct,IsPeak,Yearly_max
1924.01,1924,1,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-01,,,,,False,3456.336763
1924.02,1924,2,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-08,0.000000,0.000000,0.000000,0.000000,False,3456.336763
1924.03,1924,3,136.106534,0.998286,136.495189,133.774604,1924,0.082317,1924-10-15,-0.388655,0.388655,-0.002856,0.002856,False,3456.336763
1924.04,1924,4,131.775432,0.999286,138.801881,127.731760,1924,0.079698,1924-10-22,-4.331102,4.331102,-0.032867,0.032867,False,3456.336763
1924.05,1924,5,130.683910,0.999857,134.711033,127.731760,1924,0.079038,1924-10-29,-1.091522,1.091522,-0.008352,0.008352,False,3456.336763
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023.47,2023,47,284.970132,0.811143,295.974804,274.404478,2024,0.172350,2024-08-18,-21.340113,21.340113,-0.074885,0.074885,False,875.472103
2023.48,2023,48,265.584372,0.843429,270.817188,261.927202,2024,0.160625,2024-08-25,-19.385760,19.385760,-0.072993,0.072993,False,875.472103
2023.49,2023,49,250.395517,0.876286,257.195153,239.260029,2024,0.151439,2024-09-01,-15.188855,15.188855,-0.060659,0.060659,False,875.472103
2023.50,2023,50,233.723249,0.902286,239.260029,229.189501,2024,0.141356,2024-09-08,-16.672268,16.672268,-0.071333,0.071333,False,875.472103
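
The peak flag combines two tests: the week must be a local maximum of the weekly mean flow, and it must exceed half of the yearly maximum. A minimal sketch on a hypothetical seven-week series (values made up, and the series maximum standing in for `Yearly_max`) shows how the second test filters out a minor early rise:

```python
import pandas as pd

# Hypothetical weekly mean flows for one water year
flow = pd.Series([100.0, 400.0, 300.0, 900.0, 1000.0, 700.0, 200.0])
yearly_max = flow.max()  # stand-in for the Yearly_max column

# Local maximum: higher than both the previous and the next week.
# Comparisons against the NaN produced by shift() are False, so the
# first and last weeks can never be flagged.
is_local_max = (flow > flow.shift(1)) & (flow > flow.shift(-1))

# Keep only local maxima above half of the yearly maximum
is_peak = is_local_max & (flow > yearly_max / 2)
print(is_peak.tolist())
```

Weeks 1 and 4 (zero-based) are both local maxima, but only week 4 survives the half-of-maximum filter.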


In [56]:
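# Flag the week with the highest mean flow in each water year
weekly.loc[weekly[['WaterYear','Flow']].groupby('WaterYear').idxmax().set_index('Flow').index,'IsMax']=True
# Flag the week with the lowest mean flow in each calendar year (grouped on Year, not WaterYear)
weekly.loc[weekly[['Year','Flow']].groupby('Year').idxmin().set_index('Flow').index,'IsMin']=True
weekly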

YearWeek,WaterYear,WaterWeek,Flow,Exceedance,Flow_max,Flow_min,Year,Volume,Date,Flow_difference,Flow_difference_abs,Flow_difference_pct,Flow_difference_abs_pct,IsPeak,Yearly_max,IsMax,IsMin
1924.01,1924,1,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-01,,,,,False,3456.336763,,
1924.02,1924,2,136.495189,0.998000,136.495189,136.495189,1924,0.082552,1924-10-08,0.000000,0.000000,0.000000,0.000000,False,3456.336763,,
1924.03,1924,3,136.106534,0.998286,136.495189,133.774604,1924,0.082317,1924-10-15,-0.388655,0.388655,-0.002856,0.002856,False,3456.336763,,
1924.04,1924,4,131.775432,0.999286,138.801881,127.731760,1924,0.079698,1924-10-22,-4.331102,4.331102,-0.032867,0.032867,False,3456.336763,,
1924.05,1924,5,130.683910,0.999857,134.711033,127.731760,1924,0.079038,1924-10-29,-1.091522,1.091522,-0.008352,0.008352,False,3456.336763,,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023.47,2023,47,284.970132,0.811143,295.974804,274.404478,2024,0.172350,2024-08-18,-21.340113,21.340113,-0.074885,0.074885,False,875.472103,,
2023.48,2023,48,265.584372,0.843429,270.817188,261.927202,2024,0.160625,2024-08-25,-19.385760,19.385760,-0.072993,0.072993,False,875.472103,,
2023.49,2023,49,250.395517,0.876286,257.195153,239.260029,2024,0.151439,2024-09-01,-15.188855,15.188855,-0.060659,0.060659,False,875.472103,,
2023.50,2023,50,233.723249,0.902286,239.260029,229.189501,2024,0.141356,2024-09-08,-16.672268,16.672268,-0.071333,0.071333,False,875.472103,,
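
The `IsMax` and `IsMin` flags depend on `idxmax` and `idxmin` returning index labels (here `YearWeek` values) rather than positions. A short sketch on a hypothetical frame illustrates the pattern; the data are made up and only the column names follow the notebook.

```python
import pandas as pd

# Hypothetical weekly frame indexed by YearWeek
toy = pd.DataFrame(
    {"WaterYear": [2000, 2000, 2000, 2001, 2001],
     "Flow": [10.0, 50.0, 30.0, 70.0, 20.0]},
    index=[2000.01, 2000.02, 2000.03, 2001.01, 2001.02],
)

# Per group, idxmax returns the index label of the row holding the maximum
max_labels = toy.groupby("WaterYear")["Flow"].idxmax()

# Assigning through .loc creates the column; unflagged rows stay NaN
toy.loc[list(max_labels), "IsMax"] = True
print(toy)
```

Rows that are not flagged hold NaN rather than False, which is why later cells filter with `== True`.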


In [57]:
weekly_slim=weekly[['WaterYear','WaterWeek','Flow']]
weekly_slim

YearWeek,WaterYear,WaterWeek,Flow
1924.01,1924,1,136.495189
1924.02,1924,2,136.495189
1924.03,1924,3,136.106534
1924.04,1924,4,131.775432
1924.05,1924,5,130.683910
...,...,...,...
2023.47,2023,47,284.970132
2023.48,2023,48,265.584372
2023.49,2023,49,250.395517
2023.50,2023,50,233.723249


In [58]:
yearly['Max_week']=weekly.loc[weekly[['WaterYear','Flow']].groupby('WaterYear').idxmax()['Flow']].reset_index()[['WaterYear','WaterWeek']].set_index('WaterYear')['WaterWeek']
yearly.head(2)

WaterYear,Flow_min,Flow_median,Flow_mean,Flow_max,Flow_range,Volume,Flow_mean_pct_var,Flow_max_pct_var,Flow_min_pct_var,Volume_pct_var,Flow_mean_5yr_mvCoefVar,DaysToStart,SeasonStart,MeanQ3070,ExceedanceMean,ExceedanceMedian,ExceedanceMeanQ3070,Max_week
1924,127.73176,588.281801,998.20758,3456.336763,3328.605002,31.479474,-9.761747,-3.851637,-40.607539,-9.811237,,24 days,1924-10-25,680.119802,0.586,0.588,0.516,27
1925,158.194815,512.185296,1124.98981,4581.58971,4423.394894,35.477679,1.699403,27.450645,-26.442888,1.643627,,29 days,1925-10-30,576.118215,0.445,0.728,0.738,26


In [59]:
yearly['Peak_count']=weekly.loc[weekly['IsPeak']==True].groupby(['WaterYear'])['IsPeak'].count()
yearly['Peak_weeks']=weekly.loc[weekly['IsPeak']==True].groupby(['WaterYear'])['WaterWeek'].apply(list)
yearly['Max_week']=weekly.loc[weekly['IsMax']==True].groupby(['WaterYear'])['WaterWeek'].max()
yearly['Min_weeks']=weekly.loc[weekly['IsMin']==True].groupby(['WaterYear'])['WaterWeek'].apply(list)

In [60]:
# Mean flow and mean week-on-week change for each water week across all years
waterweeks=weekly[['WaterWeek','Flow','Flow_difference','Flow_difference_abs','Flow_difference_pct','Flow_difference_abs_pct']].groupby(["WaterWeek"]).mean()
# Recompute the fractional changes from the mean differences instead of averaging the weekly fractions
waterweeks['Flow_difference_pct']=waterweeks['Flow_difference']/waterweeks['Flow']
waterweeks['Flow_difference_abs_pct']=waterweeks['Flow_difference_abs']/waterweeks['Flow']
# Percentile columns are named by exceedance: Flow_Pxx is the (1 - xx/100) quantile, i.e. the weekly flow exceeded in xx% of years
waterweeks['Flow_P50']=weekly.reset_index()[['WaterWeek','Flow']].groupby(["WaterWeek"]).quantile(0.5)
waterweeks['Flow_P25']=weekly.reset_index()[['WaterWeek','Flow']].groupby(["WaterWeek"]).quantile(0.75)
waterweeks['Flow_P75']=weekly.reset_index()[['WaterWeek','Flow']].groupby(["WaterWeek"]).quantile(0.25)
waterweeks['Flow_P90']=weekly.reset_index()[['WaterWeek','Flow']].groupby(["WaterWeek"]).quantile(0.10)
waterweeks['Flow_P10']=weekly.reset_index()[['WaterWeek','Flow']].groupby(["WaterWeek"]).quantile(0.90)
waterweeks['YearlyMax_count']=weekly.loc[weekly['IsMax']==True].reset_index()[['WaterWeek','IsMax']].groupby(["WaterWeek"]).count()
waterweeks['YearlyPeak_count']=weekly.loc[weekly['IsPeak']==True].reset_index()[['WaterWeek','IsPeak']].groupby(["WaterWeek"]).count()
waterweeks['YearlyMin_count']=weekly.loc[weekly['IsMin']==True].reset_index()[['WaterWeek','IsMin']].groupby(["WaterWeek"]).count()

waterweeks

WaterWeek,Flow,Flow_difference,Flow_difference_abs,Flow_difference_pct,Flow_difference_abs_pct,Flow_P50,Flow_P25,Flow_P75,Flow_P90,Flow_P10,YearlyMax_count,YearlyPeak_count,YearlyMin_count
1,247.99812,-7.211981,7.737754,-0.029081,0.031201,242.845946,282.472882,206.478701,184.33191,332.910087,,,7.0
2,239.338871,-8.659249,10.486991,-0.03618,0.043816,230.751831,267.961269,202.445555,173.02621,315.852327,,,10.0
3,233.750209,-5.588663,8.506254,-0.023909,0.03639,221.135838,265.971866,199.199607,165.997395,296.790864,,,31.0
4,235.215193,1.464984,8.560987,0.006228,0.036396,224.13637,264.322861,196.924578,169.647717,309.700192,,,22.0
5,241.210533,5.99534,9.525029,0.024855,0.039488,227.797042,272.462407,200.302394,171.491554,320.383175,,,18.0
6,251.656367,10.445834,12.460359,0.041508,0.049513,240.502689,279.442238,204.512929,175.812117,337.800633,,,6.0
7,269.485381,17.829014,18.285628,0.066159,0.067854,258.196945,301.272068,212.242826,197.451776,374.912279,,,2.0
8,293.029265,23.543884,23.764411,0.080347,0.081099,284.752286,324.882982,230.228913,206.482982,398.269171,,,1.0
9,327.347781,34.318516,34.318516,0.104838,0.104838,311.499678,370.775185,260.196571,233.167612,440.026348,,,
10,370.376973,43.029191,43.803168,0.116177,0.118266,353.836018,423.567427,293.015963,256.892401,495.766548,,,
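
In the table above `Flow_P25` is larger than `Flow_P75` because the percentile columns appear to follow the exceedance convention: the P25 flow is the one exceeded 25% of the time, which is the 0.75 quantile. The sketch below spells out that mapping; `exceedance_flow` is a hypothetical helper, not a function used in the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical flows 1, 2, ..., 101 so the quantiles land on whole numbers
flows = pd.Series(np.arange(1.0, 102.0))

def exceedance_flow(series, pct):
    """Flow exceeded `pct` percent of the time (hypothetical helper)."""
    return series.quantile(1 - pct / 100)

p25 = exceedance_flow(flows, 25)  # 0.75 quantile
p75 = exceedance_flow(flows, 75)  # 0.25 quantile
print(p25, p75)
```

A higher exceedance percentage therefore always corresponds to a lower flow.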


## Save the Data

In [61]:
daily.to_csv(output_data + 'ngonye_flow_daily.csv')  # file name as listed in the Outputs table
monthly.to_csv(output_data + 'ngonye_flow_monthly.csv')
yearly.to_csv(output_data + 'ngonye_flow_yearly.csv')
calmonthly.to_csv(output_data + 'ngonye_flow_calmonthly.csv')
selected.to_csv(output_data + 'ngonye_flow_selected_years.csv')
fdc.to_csv(output_data + 'ngonye_flow_annual_exceedance.csv')
annual_fdcs.to_csv(output_data + 'ngonye_flow_annual_fdcs.csv')
flow_fdc.to_csv(output_data + 'ngonye_flow_fdc.csv')
monthly_fdcs.to_csv(output_data + 'ngonye_monthly_fdc.csv')
floods.to_csv(output_data + 'ngonye_floods.csv')
weekly.to_csv(output_data + 'ngonye_weekly.csv')
weekly_slim.to_csv(output_data + 'ngonye_weekly_slim.csv')
waterweeks.to_csv(output_data + 'ngonye_waterweekly.csv')


In [62]:
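years=daily['WaterYear'].unique().tolist()

# Use a loop variable other than `year`, which holds the run parameter set at the top of the notebook
for water_year in years:
    days=daily.loc[daily.WaterYear==water_year]
    days.to_csv(output_data + 'years/daily_' + str(water_year) + '.csv')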
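
The per-year export assumes that the `years` folder already exists, since `to_csv` does not create missing directories. One possible variant, sketched below on hypothetical data and writing to a temporary folder, creates the folder first and lets `groupby` do the per-year split. The folder location and the toy frame are assumptions made for the example.

```python
import os
import tempfile
import pandas as pd

# Hypothetical daily data covering two water years
toy = pd.DataFrame({"WaterYear": [2000, 2000, 2001], "Flow": [1.0, 2.0, 3.0]})

# Temporary stand-in for output_data + 'years/'
out_dir = os.path.join(tempfile.mkdtemp(), "years")
os.makedirs(out_dir, exist_ok=True)  # to_csv would fail if the folder were missing

# groupby yields one (key, sub-frame) pair per water year
for water_year, days in toy.groupby("WaterYear"):
    days.to_csv(os.path.join(out_dir, f"daily_{water_year}.csv"))

written = sorted(os.listdir(out_dir))
print(written)
```

Grouping once avoids re-filtering the full daily frame for every year, which matters little for a series of this length but keeps the loop short.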