# Flow to Terminal Wetlands
This metric assesses a number of terminal wetlands using the ratio of inflow to the wetland (the flow past the wetland gauge, labelled `Outflow` in the code) to inflow to the catchment. Terminal wetlands are defined here as end-of-system wetlands that can be influenced by environmental water.

An interactive dashboard displaying the results of all flow regime element analyses was developed to convey the final assessment result category together with the analytical inputs to that assessment.


## Inputs:

[AWRA-L inflow data](https://data.gov.au/data/dataset/e65078cd-808d-4514-ab60-17e597b9a883/resource/7442a111-2894-4572-aa41-1f488bf06636)

[Gauges of interest for Terminal Wetlands](https://data.gov.au/data/dataset/7c44535b-4a6a-432d-acff-00ec578ce7b9/resource/ec0fcf5e-ebc1-4602-a2bf-fd3b71b25611)

[Modeled flows Baseline 845](https://data.gov.au/data/dataset/9e3d2d32-33e7-4270-a8af-c655d6eb7710/resource/64cc37eb-19a0-4a80-b99b-6ddb4da71e49)

## Outputs:

[Results](https://data.gov.au/data/dataset/hydrologic-indicator-results-for-the-basin-plan-evaluation-2020)

In [0]:
import pandas as pd
import numpy as np
import scipy.stats 
import itertools
import warnings 
warnings.filterwarnings('ignore')

## Load Model data

Loading in the 845 Model Baseline. This scenario represents baseline conditions as specified in the Basin Plan (conditions as at 2009).

*Please note while the 845 scenario was part of the information base used to develop the Basin Plan run 871 has subsequently become the baseline scenario for legislative purposes.*

In [0]:
allsites_845_daily = \
    pd.read_csv('https://data.gov.au/data/dataset/9e3d2d32-33e7-4270-a8af-c655d6eb7710/resource/64cc37eb-19a0-4a80-b99b-6ddb4da71e49/download/modelledflows_modelrun845.csv',
    encoding='latin-1')

In [0]:
def removeHeader(PandasDataframe):
  """ Extracts a clean dataframe from a model run CSV
  Takes a pandas dataframe and removes the header information by looking for EOH
  Renames the columns and produces a date data type to use as the index
  """

  # find the end of header (EOH) row

  idx = \
      PandasDataframe.index[PandasDataframe[PandasDataframe.columns[0]]
                            == 'EOH'].tolist()

  # extract the data below the header

  data = PandasDataframe[idx[0] + 1:].copy()

  # extract the column names from the row above EOH

  columns = PandasDataframe.loc[idx[0] - 1].tolist()

  # rename the dataframe's columns

  data.columns = columns

  # Check date format

  if data.columns[0:3].tolist() == ['Dy', 'Mn', 'Year']:
      data['date'] = pd.to_datetime(data.Year.astype(int) * 10000
              + data.Mn.astype(int) * 100 + data.Dy.astype(int),
              format='%Y%m%d')

  if data.columns[0:3].tolist() == ['YYYY', 'MM', 'DD']:
      data['date'] = pd.to_datetime(data.YYYY.astype(int) * 10000
              + data.MM.astype(int) * 100 + data.DD.astype(int),
              format='%Y%m%d')

  return data

In [0]:
def waterYear(date):
  '''Takes in date,
  changes year to water year
  returns water year'''
  
  if date.month <= 6:
    waterYear = date.year - 1
  else:
    waterYear = date.year
    
  return int(waterYear)


In [0]:
# cleaning and converting model 845 to pandas dataframe

allsites_845 = removeHeader(allsites_845_daily)

## Load catchment inflow data

Load daily inflow data for each catchment from the Australian Water Resources Assessment Landscape model (AWRA-L).

The data is loaded, cleaned, and grouped by water year.
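
The water-year grouping can also be written without a row-wise `apply`. The following is a minimal sketch on hypothetical values (the dates and inflows are made up for illustration); it uses the same July to June convention as the `waterYear` function in this notebook.

```python
import pandas as pd

# Hypothetical daily series spanning the 30 June / 1 July boundary.
daily = pd.DataFrame({
    "Date": pd.to_datetime(["2018-06-29", "2018-06-30", "2018-07-01", "2018-07-02"]),
    "Inflow": [10.0, 20.0, 30.0, 40.0],
})

# Dates in January to June belong to the water year that started the previous July.
daily["Water Year"] = daily["Date"].dt.year - (daily["Date"].dt.month <= 6).astype(int)

# Sum the daily values within each water year.
annual = daily.groupby("Water Year")["Inflow"].sum()
print(annual)
```

Here the two June dates fall in water year 2017 and the two July dates in water year 2018.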

This dataset is stored in https://data.gov.au/data/dataset/e65078cd-808d-4514-ab60-17e597b9a883/resource/7442a111-2894-4572-aa41-1f488bf06636

Inflows are given by:  
 $$ \text{Inflows} = \text{Runoff} \times \text{Surface Area} $$

where runoff for 1911 to 2018/19 was provided by the Bureau of Meteorology (BoM) AWRA Modelling Team, from the Australian Water Resources Assessment Landscape model (AWRA-L) version 6.0.

Surface area was calculated from a shapefile of catchments (available [here](https://services8.arcgis.com/5xxEi7I2m6ml97fE/arcgis/rest/services/BASIN_PLAN_REGIONS/FeatureServer)).
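
As a rough illustration of the runoff-times-area calculation, the sketch below converts a runoff depth into a volume. The units (mm/day and km²) and all values are assumptions made for the example; the published dataset already contains the resulting inflows, and its units are not restated here.

```python
import pandas as pd

# Hypothetical values, for illustration only: runoff depth in mm/day
# and catchment surface area in km^2.
runoff_mm = pd.Series([0.5, 1.2, 0.0], name="runoff_mm")
area_km2 = 2500.0

# 1 mm of runoff over 1 km^2 is 1,000 m^3, which is 1 megalitre (ML).
ML_PER_MM_KM2 = 1.0

# Inflows = Runoff x Surface Area
inflow_ml = runoff_mm * area_km2 * ML_PER_MM_KM2
print(inflow_ml.tolist())
```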

In [0]:
RawBOMData = pd.read_csv('https://data.gov.au/data/dataset/e65078cd-808d-4514-ab60-17e597b9a883/resource/7442a111-2894-4572-aa41-1f488bf06636/download/catchmentinflows_modelledrunoffdata_awralv6.csv')
RawBOMData.head()

Unnamed: 0,Column1,Barwon-Darling,Border Rivers,Campaspe,Condamine-Balonne,Eastern Mt Lofty Ranges,Goulburn-Broken,Gwydir,Lachlan,Loddon,Lower Darling,Macquarie-Castlereagh,Moonie,Murray,Murrumbidgee,Namoi,Ovens,Paroo,Warrego,Wimmera-Avoca
0,1/01/1911,280.945743,889.116774,448.18278,464.969232,731.547092,5352.450565,842.971632,3708.877278,244.310367,361.576414,1989.390134,10.016151,32348.09683,12736.19431,1018.08705,7820.42964,197.437945,311.490847,318.456924
1,2/01/1911,245.939033,931.281696,354.872346,551.652614,561.807131,4426.229832,869.601585,2982.273356,217.188593,267.220044,1854.414908,7.661954,25171.03952,10177.40817,1024.492008,6337.477974,175.114516,365.595952,284.131808
2,3/01/1911,221.136955,996.579011,287.529469,656.670069,443.33388,3747.495391,1056.964622,2521.92074,197.458266,207.665376,1802.773964,6.639534,19788.21003,8278.185418,1030.521136,5224.703577,161.117734,380.395537,259.23294
3,4/01/1911,228.037716,906.414042,238.892884,565.38187,360.467282,3252.306652,1344.195169,2588.127451,183.177053,172.462025,1928.902206,5.369864,16204.2339,7446.809863,1009.258218,4470.30592,152.364875,332.566101,240.716578
4,5/01/1911,282.894836,780.040817,211.475708,494.168382,302.34578,3000.776435,1054.972176,3047.736346,185.881273,172.42676,2834.161997,4.33717,15689.32201,7848.837431,1056.232902,4109.854376,156.725326,300.506739,231.350913


In [0]:
def transformPipline(RawDataframe):
  """
  Single function to transform raw dataframe from blob into a pandas dataframe ready for analysis
  """

  # Turn Column1 into Date

  DailyRunoffDataframe = RawDataframe.rename({'Column1':'Date'}, axis =1 )

  # total up northern basin catchments

  NorthernBasinCatchments = [
      'Barwon-Darling',
      'Border Rivers',
      'Condamine-Balonne',
      'Gwydir',
      'Macquarie-Castlereagh',
      'Moonie',
      'Namoi',
      'Paroo',
      'Warrego']
  DailyRunoffDataframe['Northern Basin'] = \
    DailyRunoffDataframe.apply(lambda row: \
                               row[NorthernBasinCatchments].sum(), 
                               axis=1)

  # total up southern basin catchments

  SouthernBasinCatchments = [
      'Campaspe',
      'Eastern Mt Lofty Ranges',
      'Goulburn-Broken',
      'Lachlan',
      'Loddon',
      'Lower Darling',
      'Murray',
      'Murrumbidgee',
      'Ovens',
      'Wimmera-Avoca',
      ]

  DailyRunoffDataframe['Southern Basin'] = \
      DailyRunoffDataframe.apply(lambda row: \
                                 row[SouthernBasinCatchments].sum(),
                                 axis=1)

  # total up all catchments

  AllCatchments = NorthernBasinCatchments + SouthernBasinCatchments

  DailyRunoffDataframe['Total MDB'] = \
      DailyRunoffDataframe.apply(lambda row: \
                                   row[AllCatchments].sum(), axis=1)

  # convert to a datetime data type

  DailyRunoffDataframe['Date'] = \
      pd.to_datetime(DailyRunoffDataframe['Date'], format='%d/%m/%Y')

  # drop Nulls

  DailyRunoffDataframe = DailyRunoffDataframe.dropna()

  return DailyRunoffDataframe

In [0]:
DailyRunoffDataframe = transformPipline(RawBOMData)

# apply water year function to populate the water year column

DailyRunoffDataframe['Water Year'] = \
    DailyRunoffDataframe.apply(lambda row: waterYear(row['Date']),
                               axis=1)

In [0]:
# summing annual inflow by water year

# numeric_only=True leaves the non-numeric Date column out of the sum
AnnualisedInflow = DailyRunoffDataframe.groupby('Water Year').sum(numeric_only=True)
AnnualisedInflow = \
    AnnualisedInflow.rename(columns={'Northern Basin': 'Overall North '
                            , 'Southern Basin': 'Overall South ',
                            'Total MDB': 'Overall MDBA System '})


## Load gauges of interest for terminal wetlands metric
Gets gauges of interest and their associated gauge data.

Terminal wetland gauge information can be found at: https://data.gov.au/data/dataset/7c44535b-4a6a-432d-acff-00ec578ce7b9/resource/ec0fcf5e-ebc1-4602-a2bf-fd3b71b25611

In [0]:
Data = pd.read_csv('https://data.gov.au/data/dataset/7c44535b-4a6a-432d-acff-00ec578ce7b9/resource/ec0fcf5e-ebc1-4602-a2bf-fd3b71b25611/download/observedflows_terminalwetlands.csv', header=None)
Data.head()

Unnamed: 0,0,1,2,3,4,5,6,7
0,,Condamine-Balonne,Gwydir,Macquarie-Castlereagh,Macquarie-Castlereagh,Lachlan,Murrumbidgee,Wimmera-Avoca
1,,Narran Lakes - Narran River at Wilby Wilby,Gwydir Wetland - Yarraman Bridge,Macquarie Marshes - Marebone Brk@D/S Reg,Macquarie Marshes - Macq @D/S Marebone W,Booligal Wetlands - Booligal Weir,Murrumbidge Valley National Park - Maude Weir,Wimmera River Wetlands - Wimmera River at Loch...
2,Date,422016,418004,421088,421090,412005,410040,415246
3,1/07/1949,,,,,72.36,,
4,2/07/1949,,,,,72.36,,


In [0]:
# Loading annual flows past gauge 415246 (Wimmera-Avoca)
wa415246 = pd.read_csv('https://data.gov.au/data/dataset/9e3d2d32-33e7-4270-a8af-c655d6eb7710/resource/8a661092-325c-4b6e-93c1-942a869a01df/download/modelledflows_wimmeraavoca_modelrun845.csv')
wa415246.head()

Unnamed: 0,Start of water year,415246
0,1911,135315
1,1912,111654
2,1913,36353
3,1914,45150
4,1915,257260


## Transform gauge data 

Organising dataframe to get it ready for analysis:
- Putting gauge numbers as column headings
- stripping header information and using this data to filter to only the locations of interest
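
The two steps above can be sketched on a small in-memory file. The CSV content below is hypothetical and only mimics the layout of the terminal wetlands file (catchment row, site name row, gauge number row, then daily data); the variable names are illustrative and not part of the notebook.

```python
import io
import pandas as pd

# Hypothetical three-row header (catchment, site name, gauge number) followed by data.
raw_csv = io.StringIO(
    ",Lachlan,Gwydir\n"
    ",Site A,Site B\n"
    "Date,412005,418004\n"
    "1/07/1949,72.36,\n"
    "2/07/1949,70.10,5.5\n"
)
raw = pd.read_csv(raw_csv, header=None, dtype=str)

# Row 2 holds the gauge numbers, so use it for the column names.
gauge_ids = raw.iloc[2].str.strip().tolist()

# Everything below the header is daily flow data.
flows = raw.iloc[3:].copy()
flows.columns = gauge_ids
flows["Date"] = pd.to_datetime(flows["Date"], format="%d/%m/%Y")
flows = flows.set_index("Date").astype(float)

# Rows 0-2, transposed, give one row of metadata per gauge.
meta = raw.iloc[:3, 1:].T
meta.columns = ["catchment", "name", "gauge"]
meta = meta.set_index("gauge")

print(flows)
print(meta)
```

Keeping the metadata in its own table makes it straightforward to look up the catchment for a gauge, for example `meta.loc["412005", "catchment"]`.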

In [0]:
# .copy() so that the new columns are added to an independent dataframe
DataFrame = Data.loc[3:].copy()
DataFrame.columns = list(map(str.strip,
                           Data.loc[2].astype(str).tolist()))
DataFrame['Date'] = pd.to_datetime(DataFrame['Date'],
        format='%d/%m/%Y')

DataFrame.head()

Unnamed: 0,Date,422016,418004,421088,421090,412005,410040,415246
3,1949-07-01,,,,,72.36,,
4,1949-07-02,,,,,72.36,,
5,1949-07-03,,,,,72.36,,
6,1949-07-04,,,,,72.36,,
7,1949-07-05,,,,,69.356,,


In [0]:
CatchmentGaugeMapping = Data.loc[0:2].copy()
CatchmentGaugeMapping.columns = Data.loc[2].str.strip().tolist()
CatchmentGaugeMapping = CatchmentGaugeMapping.drop('Date', axis=1)
CatchmentGaugeMapping = CatchmentGaugeMapping.transpose()
CatchmentGaugeMapping.columns = ["catchment",'Name', 'Gauge']

CatchmentGaugeMapping.head()

Unnamed: 0,catchment,Name,Gauge
422016,Condamine-Balonne,Narran Lakes - Narran River at Wilby Wilby,422016
418004,Gwydir,Gwydir Wetland - Yarraman Bridge,418004
421088,Macquarie-Castlereagh,Macquarie Marshes - Marebone Brk@D/S Reg,421088
421090,Macquarie-Castlereagh,Macquarie Marshes - Macq @D/S Marebone W,421090
412005,Lachlan,Booligal Wetlands - Booligal Weir,412005


In [0]:
DataFrame['water year'] = DataFrame.apply(lambda row: \
        waterYear(row['Date']), axis=1)
DataFrame.head()


Unnamed: 0,Date,422016,418004,421088,421090,412005,410040,415246,water year
3,1949-07-01,,,,,72.36,,,1949
4,1949-07-02,,,,,72.36,,,1949
5,1949-07-03,,,,,72.36,,,1949
6,1949-07-04,,,,,72.36,,,1949
7,1949-07-05,,,,,69.356,,,1949


In [0]:
def gaugetocatchment(gaugeID):
  """ Ingests a gauge number
  Returns the catchment it falls in"""
  return CatchmentGaugeMapping.loc[gaugeID]["catchment"]

gaugetocatchment("412005")

In [0]:
# Melting/transposing the dataframe and adding the catchment:

meltedDataFrame = pd.melt(DataFrame, id_vars=['Date', 'water year'
                             ], var_name='ID', value_name='Outflow')
meltedDataFrame['Catchment'] = meltedDataFrame.apply(lambda x: \
        gaugetocatchment(x['ID']), axis=1)

meltedDataFrame['Outflow'] = meltedDataFrame['Outflow'
        ].astype('float64')
meltedDataFrame['ID'] = meltedDataFrame['ID'].astype('int64')
meltedDataFrame = meltedDataFrame.groupby(['Date', 'water year',
        'Catchment'], as_index=False).sum()

meltedDataFrame.loc[meltedDataFrame['Catchment']
                       == 'Macquarie-Castlereagh', 'ID'] = 421090
meltedDataFrame.head()

Unnamed: 0,Date,water year,Catchment,ID,Outflow
0,1949-07-01,1949,Condamine-Balonne,422016,0.0
1,1949-07-01,1949,Gwydir,418004,0.0
2,1949-07-01,1949,Lachlan,412005,72.36
3,1949-07-01,1949,Macquarie-Castlereagh,421090,0.0
4,1949-07-01,1949,Murrumbidgee,410040,0.0


In [0]:
#Grouping by water year:
AnnualisedDataFrame = meltedDataFrame[['Outflow', 'water year',
        'ID', 'Catchment']].groupby(['Catchment', 'water year', 'ID'],
                                    as_index=False).sum()[['Outflow',
        'water year', 'Catchment', 'ID']]
AnnualisedDataFrame.head()


Unnamed: 0,Outflow,water year,Catchment,ID
0,0.0,1949,Condamine-Balonne,422016
1,0.0,1950,Condamine-Balonne,422016
2,0.0,1951,Condamine-Balonne,422016
3,0.0,1952,Condamine-Balonne,422016
4,0.0,1953,Condamine-Balonne,422016


In [0]:
AnnualisedInflow['Water Year'] = AnnualisedInflow.index
meltedDataFrameinflows = pd.melt(AnnualisedInflow,
                                    id_vars=['Water Year'],
                                    var_name='Catchment',
                                    value_name='inflow')
meltedDataFrameinflows.head()


Unnamed: 0,Water Year,Catchment,inflow
0,1910,Barwon-Darling,126450.682659
1,1911,Barwon-Darling,227134.33317
2,1912,Barwon-Darling,164450.268088
3,1913,Barwon-Darling,81373.743705
4,1914,Barwon-Darling,64467.114776


In [0]:
# Filtering the data to only include observed flow data after the cap on diversions was introduced (1994) and calculating the outflow/inflow ratio:
MergedResults = pd.merge(meltedDataFrameinflows,
                         AnnualisedDataFrame, left_on=['Water Year',
                         'Catchment'], right_on=['water year',
                         'Catchment'])

MergedResults = MergedResults.drop(['water year'], axis=1)

MergedResults = MergedResults[MergedResults['Water Year'] >= 1994]

MergedResults['Ratio'] = MergedResults.apply(lambda row: row['Outflow'] \
        / row['inflow'], axis=1)

Results = MergedResults
observedPostDF = Results

Results.head()

Unnamed: 0,Water Year,Catchment,inflow,Outflow,ID,Ratio
45,1994,Condamine-Balonne,775150.6,3795.641,422016,0.004897
46,1995,Condamine-Balonne,3627252.0,364884.828,422016,0.100595
47,1996,Condamine-Balonne,2022364.0,129468.549,422016,0.064018
48,1997,Condamine-Balonne,1014814.0,39471.112,422016,0.038895
49,1998,Condamine-Balonne,2790745.0,319225.726,422016,0.114387


## Node mapping

Map the gauges to the model nodes and filter out the nodes not used in the analysis.
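
A minimal sketch of this mapping step follows. The mapping table and model output are hypothetical stand-ins; the column names `gauge_number`, `node` and `catchment` mirror those used in this notebook, while the values and the `UNUSED` and `OTHER` entries are made up.

```python
import pandas as pd

# Hypothetical mapping table and model output, for illustration only.
gauge_map = pd.DataFrame({
    "gauge_number": ["412005", "418004", "999999"],
    "node": ["10BOOLG", "6YARMAN", "UNUSED"],
    "catchment": ["Lachlan", "Gwydir", "Elsewhere"],
})
model = pd.DataFrame({"10BOOLG": [200, 201], "6YARMAN": [84, 80], "OTHER": [1, 2]})
gauges_of_interest = ["412005", "418004"]

# Nodes linked to the gauges of interest.
nodes = gauge_map.loc[gauge_map["gauge_number"].isin(gauges_of_interest), "node"]
node_set = set(nodes)

# Keep only the model columns that correspond to those nodes,
# preserving the column order of the model output.
keep = [col for col in model.columns if col in node_set]
selected = model[keep]
print(selected)
```

Filtering with a list comprehension rather than a set intersection keeps the column order stable between runs.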

In [0]:
gauge_mapping = pd.read_csv("https://data.gov.au/data/dataset/7c44535b-4a6a-432d-acff-00ec578ce7b9/resource/265dd4f0-08e1-4485-be9e-0feac8deb5f0/download/gaugemapping.csv", usecols = [0,1,2,3])
gauge_mapping

gauge_mapping.loc[gauge_mapping.catchment=="Overall South", "catchment"]= "Overall South "

In [0]:
# Filtering to locations of interest

allsites_845['water year'] = allsites_845.apply(lambda row: \
        waterYear(row['date']), axis=1)

justnodes_845 = allsites_845.drop(['Dy', 'Mn', 'Year', 'date'], axis=1)

listofcol = justnodes_845.columns.tolist()

Gauges = CatchmentGaugeMapping.index.tolist()


def GaugeToNode(gauge):
  '''Takes in gauge
  returns the matching model node'''
  
  node = gauge_mapping[gauge_mapping['gauge_number'] == gauge]['node']
  return node


gauge_mapping

match = []
for gauge in Gauges:
    mapping = GaugeToNode(gauge)
    if len(mapping) > 0:

        match.append(mapping.tolist())

merged = list(itertools.chain.from_iterable(match))

joinmerged = list(set(listofcol) & set(merged))

justkwnodes_845 = justnodes_845[joinmerged].copy()

justkwnodes_845['Water Year'] = allsites_845['water year']
justkwnodes_845.head()

Unnamed: 0,8MARBON,11FGMAU,10BOOLG,6YARMAN,422016_,Water Year
292,103,2351,200,84,0,1895
293,80,2286,201,80,0,1895
294,74,2647,163,81,0,1895
295,205,3335,105,96,0,1895
296,277,5303,97,107,0,1895


In [0]:
meltedDataFrameinflows = \
    meltedDataFrameinflows.replace({'Overall North ': 'Overall North'
        , 'Overall MDBA System ': 'Overall MDBA System'})
meltedDataFrameinflows.Catchment.unique()


In [0]:
wa415246["Catchment"] = "Wimmera-Avoca"

wa415246 = wa415246.rename(columns={"415246": "Outflow"})
wa_concat_415246 = wa415246.set_index('Start of water year')
wa_concat_415246.head()

Start of water year,Outflow,Catchment
1911,135315,Wimmera-Avoca
1912,111654,Wimmera-Avoca
1913,36353,Wimmera-Avoca
1914,45150,Wimmera-Avoca
1915,257260,Wimmera-Avoca


## Calculating outflow/inflow ratio for each water year
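
The ratio calculation amounts to joining annual outflow and annual inflow on water year and catchment, then dividing. The sketch below uses hypothetical annual totals. Replacing a zero inflow with a missing value is an assumption added for the example and is not something the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical annual totals, for illustration only.
outflow = pd.DataFrame({
    "Water Year": [2012, 2013, 2014],
    "Catchment": ["Gwydir"] * 3,
    "Outflow": [100.0, 50.0, 10.0],
})
inflow = pd.DataFrame({
    "Water Year": [2012, 2013, 2014],
    "Catchment": ["Gwydir"] * 3,
    "inflow": [1000.0, 200.0, 0.0],
})

# Join outflow and inflow on water year and catchment.
merged = outflow.merge(inflow, on=["Water Year", "Catchment"], how="left")

# Column-wise division; a zero inflow is treated as missing so the ratio
# is NaN rather than infinite (an assumption made for this sketch).
merged["Ratio"] = merged["Outflow"] / merged["inflow"].replace(0, np.nan)
print(merged)
```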

In [0]:
def nodetoCatchment(nodeID):
  """ Ingests a node ID label
  Returns the catchment the node is in"""
  
  return gauge_mapping[gauge_mapping['node'] == nodeID]['catchment'].values[0]


In [0]:
# Grouping by water year and calculating the outflow/inflow ratio
melt845 = pd.melt(justkwnodes_845, id_vars=['Water Year'],
                  var_name='Node', value_name='Outflow')
melt845['Outflow'] = melt845['Outflow'].astype(int)
melt845 = melt845.groupby(['Node', 'Water Year'], as_index=False).sum()

melt845['Catchment'] = melt845.apply(lambda row: \
        nodetoCatchment(row['Node']), axis=1)

melt845 = melt845.groupby(['Water Year', 'Catchment'],
                          as_index=False).sum()

melt845 = melt845[melt845['Water Year'] > 1910]
melt845 = melt845.set_index('Water Year')

melt845 = pd.concat([melt845, wa_concat_415246])
melt845.index.set_names('Water Year', inplace=True)

modelmerge = pd.merge(melt845, meltedDataFrameinflows, how='left',
                      left_on=['Water Year', 'Catchment'],
                      right_on=['Water Year', 'Catchment'])

modelmerge['Ratio'] = modelmerge.apply(lambda row: row['Outflow'] \
        / row['inflow'], axis=1)

In [0]:
def Catchmenttogauge(CatchmentID):
  """ Ingests a catchment name
  Returns the gauge number associated with this catchment"""
  return CatchmentGaugeMapping[CatchmentGaugeMapping["catchment"]==CatchmentID].index.values[0]

Catchmenttogauge("Gwydir")

In [0]:
modelmerge = modelmerge.replace({'Overall North': 'Overall North ',
                                'Overall MDBA System': 'Overall MDBA System '
                                })

modelmerge['ID'] = modelmerge.apply(lambda x: \
                                    Catchmenttogauge(x['Catchment']),
                                    axis=1)
modelmerge.head()

Unnamed: 0,Water Year,Catchment,Outflow,inflow,Ratio,ID
0,1911,Condamine-Balonne,38852,635247.3,0.06116,422016
1,1911,Gwydir,117980,349458.5,0.337608,418004
2,1911,Lachlan,89373,992129.1,0.090082,412005
3,1911,Macquarie-Castlereagh,194566,1154193.0,0.168573,421088
4,1911,Murrumbidgee,584600,2851666.0,0.205003,410040


## Statistical analysis on observed and model data

Compare the pre and post Basin Plan Terminal Wetlands using:
- Welch's t-test (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html)
- the KS two-sample test (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ks_2samp.html)
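
For orientation, the sketch below runs both tests on synthetic data. The samples are randomly generated and carry no hydrological meaning; the sample sizes (18 and 7) are an assumption chosen to resemble the number of pre and post water years in the observed comparison.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)

# Synthetic ratios, for illustration only.
pre = rng.normal(loc=0.10, scale=0.03, size=18)
post = rng.normal(loc=0.10, scale=0.03, size=7)

# Welch's t-test: equal_var=False drops the equal-variance assumption.
t_stat, t_p = scipy.stats.ttest_ind(pre, post, equal_var=False)

# Two-sample Kolmogorov-Smirnov test compares the two empirical distributions.
ks_stat, ks_p = scipy.stats.ks_2samp(pre, post)

print(f"Welch t = {t_stat:.3f}, p = {t_p:.3f}")
print(f"KS D = {ks_stat:.3f}, p = {ks_p:.3f}")
```

With the samples passed in the order `(pre, post)`, a negative t statistic means the pre-period mean is lower than the post-period mean. The KS statistic has no sign, so the direction of change comes from the t statistic alone.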

## Selecting an \\(\alpha\\)
With alpha set at 0.1 in each of the two tests, the probability of observing a false statistically significant result in both tests is 1% (0.1 × 0.1), assuming the two tests are independent.  

Typically, methods for dealing with multiple tests call for adjusting alpha in some way, however, these methods are designed for statistical investigations looking for a single significant result, ‘a discovery’. This is not the case in the application of two statistical tests looking for concurrent significant results.  

Setting alpha to 0.1 in both tests so that the chance of a false positive ‘increased’ or ‘decreased’ result is 1% is suitably rigorous and decidedly reasonable for the task at hand.
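
The 1% figure is the product of the two individual rates. Because both tests are applied to the same samples, a rough way to sanity-check the joint false positive rate is to simulate data with no real difference and count how often both tests fall below alpha. The sketch below does this with assumed sample sizes (18 and 7) and normally distributed data; it is an illustration under those assumptions, not part of the original analysis.

```python
import numpy as np
import scipy.stats

alpha = 0.1

# If the two tests were independent, the chance that both give a false
# positive would be the product of the individual rates.
joint_if_independent = alpha * alpha

# Simulate under the null hypothesis (both samples from the same distribution).
rng = np.random.default_rng(42)
n_sim = 1000
t_hits = ks_hits = both_hits = 0
for _ in range(n_sim):
    pre = rng.normal(size=18)   # hypothetical sample sizes
    post = rng.normal(size=7)
    t_p = scipy.stats.ttest_ind(pre, post, equal_var=False).pvalue
    ks_p = scipy.stats.ks_2samp(pre, post).pvalue
    t_hits += int(t_p < alpha)
    ks_hits += int(ks_p < alpha)
    both_hits += int(t_p < alpha and ks_p < alpha)

print("product under independence:", round(joint_if_independent, 4))
print("empirical rate, t-test only:", t_hits / n_sim)
print("empirical rate, KS test only:", ks_hits / n_sim)
print("empirical rate, both tests:", both_hits / n_sim)
```

The joint rate can never exceed either individual rate, so requiring both tests to agree is at least as strict as using either test alone at the same alpha.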

## Observed data: comparing pre and post Basin Plan
- Pre Basin Plan period: 1994 to 2011 water years
- Post Basin Plan period: 2012 to 2018 water years

In [0]:
def siteloop(ResultsDataFrame, Catchment, quiet=True):
  '''Takes in dataframe with Terminal Wetland ratio and date,
  filters the dataframe to pre and post basin plan periods,
  runs Welch's t test and ks two sample test on both periods,
  returns the results dataframe'''
  pre = np.array(ResultsDataFrame[(ResultsDataFrame['Water Year']
               < 2012) & (ResultsDataFrame['Catchment']
               == Catchment)]['Ratio'])
  post = np.array(ResultsDataFrame[(ResultsDataFrame['Water Year']
                  >= 2012) & (ResultsDataFrame['Catchment']
                  == Catchment)]['Ratio'])

  (ksStat, KsP) = scipy.stats.ks_2samp(pre, post)
  (tStat, tP) = scipy.stats.ttest_ind(pre, post, equal_var=False)
  ID = ResultsDataFrame[ResultsDataFrame['Catchment']
                        == Catchment]['ID'].iloc[0]
  
  Outcome = Significant(KsP, tStat, tP, alpha)

  if not quiet:
      print (Catchment, scipy.stats.ks_2samp(pre, post))
      print (Catchment, scipy.stats.ttest_ind(pre, post,
             equal_var=False))
    
  StepDataFrame = pd.DataFrame({
    "Catchment":[Catchment], 
    "ID":[ID], 
    "Metric":["Terminal Wetlands"], 
    "Source":["Observed"],
    "Ks_2sampResult statistic":[ksStat], 
    "Ks_2sampResult pvalue":[KsP], 
    "Welch’s t-test statistic":[tStat], 
    "Welch’s t-test pvalue":[tP], 
    "Outcome":[Outcome]
  }) 
  
  return StepDataFrame


def Significant (Ksp, tStat, tP, alpha):
  '''Takes in results of statistical tests,
  compares the results of the two tests to an alpha value defined by the operator,
  returns the significance'''
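  # tStat < 0: the pre-period mean ratio is below the post-period mean (an increase)
  if ((Ksp < alpha) and (tStat <0) and (tP < alpha)):
    outcome = "Improved" 
  # tStat > 0: the pre-period mean ratio is above the post-period mean (a decrease)
  elif (tStat >0 and Ksp <alpha and tP < alpha):
    outcome = "Degraded" 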
  elif (Ksp >alpha and tP > alpha):
    outcome = "Maintained" 
  elif (Ksp <alpha and tP > alpha):
    outcome = "Unsure - t-test failed" 
  else:
    outcome = "Unsure - ks-test failed"
  return outcome

In [0]:
alpha = 0.1

StatsResults = pd.DataFrame(data=[],columns = [
  "Catchment", 
  "ID", 
  "Metric", 
  "Source", 
  "Ks_2sampResult statistic", 
  "Ks_2sampResult pvalue", 
  "Welch’s t-test statistic", 
  "Welch’s t-test pvalue", 
  "Outcome"
])

for Catchment in Results["Catchment"].unique():
  
  StepDataFrame = siteloop(Results, Catchment) 
  # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
  StatsResults = pd.concat([StatsResults, StepDataFrame])

StatsResults


Unnamed: 0,Catchment,ID,Metric,Source,Ks_2sampResult statistic,Ks_2sampResult pvalue,Welch’s t-test statistic,Welch’s t-test pvalue,Outcome
0,Condamine-Balonne,422016,Terminal Wetlands,Observed,0.325397,0.567895,1.681442,0.108658,Maintained
0,Gwydir,418004,Terminal Wetlands,Observed,0.603175,0.028809,-1.889838,0.08669,Improved
0,Lachlan,412005,Terminal Wetlands,Observed,0.444444,0.199883,-1.030214,0.332676,Maintained
0,Macquarie-Castlereagh,421090,Terminal Wetlands,Observed,0.52381,0.0817,-1.947328,0.088565,Improved
0,Murrumbidgee,410040,Terminal Wetlands,Observed,0.555556,0.054803,-2.060096,0.071984,Improved
0,Wimmera-Avoca,415246,Terminal Wetlands,Observed,0.492063,0.118959,-0.320611,0.751781,Maintained


## Model data: comparing pre and post Basin Plan
- Pre Basin Plan period: 1911 to mid 2009 water years
- Post Basin Plan period: 2012 to mid 2019 water years

In [0]:
# modelmerge

def modelloop(
    modelmergeDataFrame,
    observedPostDF,
    Catchment,
    quiet=True,
    ):
  '''Takes in the modelled ratio dataframe and the observed ratio dataframe,
  uses the modelled ratios before 2012 as the pre basin plan sample
  and the observed ratios from 2012 onwards as the post basin plan sample,
  runs Welch's t test and ks two sample test on the two samples,
  returns the results dataframe'''
  modelpre = \
        np.array(modelmergeDataFrame[(modelmergeDataFrame['Water Year']
                 < 2012) & (modelmergeDataFrame['Catchment']
                 == Catchment)]['Ratio'])
  post = np.array(observedPostDF[(observedPostDF['Water Year']
                    >= 2012) & (observedPostDF['Catchment']
                    == Catchment)]['Ratio'])

  (mksStat, mKsP) = scipy.stats.ks_2samp(modelpre, post)
  (mtStat, mtP) = scipy.stats.ttest_ind(modelpre, post,
            equal_var=False)
  ID = modelmergeDataFrame[modelmergeDataFrame['Catchment']
                             == Catchment]['ID'].iloc[0]

  Outcome = Significant(mKsP, mtStat, mtP, alpha)

  if not quiet:
        print (Catchment, scipy.stats.ks_2samp(modelpre, post))
        print (Catchment, scipy.stats.ttest_ind(modelpre, post,
               equal_var=False))


      
  ModelStepDataFrame = pd.DataFrame({
      "Catchment":[Catchment],
      "ID":[ID],
      "Metric":["Terminal Wetlands"],
      "Source":["Model"],
      "Ks_2sampResult statistic":[mksStat],
      "Ks_2sampResult pvalue":[mKsP],
      "Welch’s t-test statistic":[mtStat],
      "Welch’s t-test pvalue":[mtP],
      "Outcome":[Outcome]
       }) 
  
  return ModelStepDataFrame


def Significant (mKsp, mtStat, mtP, alpha):
  '''Takes in results of statistical tests,
  compares the results of the two tests to an alpha value defined by the operator,
  returns the significance'''
  if ((mKsp < alpha) and (mtStat <0) and (mtP < alpha)):
    outcome = "Improved" 
  elif (mtStat >0 and mKsp <alpha and mtP < alpha):
    outcome = "Degraded" 
  elif (mKsp >alpha and mtP >alpha):
    outcome = "Maintained" 
  elif (mKsp <alpha and mtP > alpha):
    outcome = "Unsure - t-test failed"
  else:
    outcome = "Unsure - ks-test failed"
  return outcome


In [0]:
ModelStatsResults = pd.DataFrame(data=[],columns = [
  "Catchment", 
  "ID", 
  "Metric", 
  "Source", 
  "Ks_2sampResult statistic", 
  "Ks_2sampResult pvalue", 
  "Welch’s t-test statistic", 
  "Welch’s t-test pvalue", 
  "Outcome"
   ])

for Catchment in modelmerge['Catchment'].unique():

    ModelStepDataFrame = modelloop(modelmerge, observedPostDF,
                                   Catchment)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    ModelStatsResults = pd.concat([ModelStatsResults, ModelStepDataFrame])

ModelStatsResults

Unnamed: 0,Catchment,ID,Metric,Source,Ks_2sampResult statistic,Ks_2sampResult pvalue,Welch’s t-test statistic,Welch’s t-test pvalue,Outcome
0,Condamine-Balonne,422016,Terminal Wetlands,Model,0.479592,0.066673,2.067305,0.073998,Degraded
0,Gwydir,418004,Terminal Wetlands,Model,0.234694,0.810114,1.007794,0.341642,Maintained
0,Lachlan,412005,Terminal Wetlands,Model,0.255102,0.721793,0.288777,0.781572,Maintained
0,Macquarie-Castlereagh,421088,Terminal Wetlands,Model,0.255102,0.721793,0.008543,0.993429,Maintained
0,Murrumbidgee,410040,Terminal Wetlands,Model,0.346939,0.335716,1.561498,0.157928,Maintained
0,Wimmera-Avoca,415246,Terminal Wetlands,Model,0.77551,0.000275,6.885808,1.4e-05,Degraded


In [0]:
frames = [ModelStatsResults, StatsResults]

finaloutflowinflowresult = pd.concat(frames)
finaloutflowinflowresult

Unnamed: 0,Catchment,ID,Metric,Source,Ks_2sampResult statistic,Ks_2sampResult pvalue,Welch’s t-test statistic,Welch’s t-test pvalue,Outcome
0,Condamine-Balonne,422016,Terminal Wetlands,Model,0.479592,0.066673,2.067305,0.073998,Degraded
0,Gwydir,418004,Terminal Wetlands,Model,0.234694,0.810114,1.007794,0.341642,Maintained
0,Lachlan,412005,Terminal Wetlands,Model,0.255102,0.721793,0.288777,0.781572,Maintained
0,Macquarie-Castlereagh,421088,Terminal Wetlands,Model,0.255102,0.721793,0.008543,0.993429,Maintained
0,Murrumbidgee,410040,Terminal Wetlands,Model,0.346939,0.335716,1.561498,0.157928,Maintained
0,Wimmera-Avoca,415246,Terminal Wetlands,Model,0.77551,0.000275,6.885808,1.4e-05,Degraded
0,Condamine-Balonne,422016,Terminal Wetlands,Observed,0.325397,0.567895,1.681442,0.108658,Maintained
0,Gwydir,418004,Terminal Wetlands,Observed,0.603175,0.028809,-1.889838,0.08669,Improved
0,Lachlan,412005,Terminal Wetlands,Observed,0.444444,0.199883,-1.030214,0.332676,Maintained
0,Macquarie-Castlereagh,421090,Terminal Wetlands,Observed,0.52381,0.0817,-1.947328,0.088565,Improved
