# Homework 7

Adaptation of Question 5.1 from Loucks and van Beek / question 2 from HW 5. A reservoir serves multiple purposes. It is a recreation site for swimmers, wind surfers and boaters; a flood storage reservoir; and an irrigation supply structure. Each season, there is a required minimum release for irrigation to meet water rights downstream. Reservoir releases in excess of the irrigation requirement flow to a wetland downstream. The wetland also receives return flows from irrigation, equal to 30\% of the water diverted for irrigation. These return flows are highly saline, which can damage the ecosystem. Different target water allocations are favorable for each of these objectives. Based on mean inflows across 4 seasons, your task is to determine how much water should be released from the reservoir in each season to minimize the sum of percent deviation across all targets given the following parameters and targets:

* Reservoir storage capacity, $K$: 30 million m$^3$ (mcm)
* Maximum reservoir release, $Rmax$: 50 mcm
* Reservoir seasonal inflows distribution, $Y=\ln(Qin)$: $N(1.8, 0.1)$, $N(3.6, 0.3)$, $N(2.9, 0.3)$, $N(2.3, 0.2)$ ($Qin$ in mcm)
* Correlation between consecutive seasons' log-space flows: $\rho=0.7$
* Minimum releases for irrigation each season, $Qirr$: 5, 20, 10, 5 mcm
* Salinity concentration of reservoir water, $Cres$: 1 part per thousand (ppt)
* Salinity concentration of irrigation return flow water, $Cirr$: 20 ppt
* Target salinity concentration in the wetland, $Cwet\_maxTarget$: 3 ppt
* Target minimum flow in the wetland each season, $Qwet\_minTarget$: 10, 20, 15, 15 mcm
* Target reservoir storages each period, $S$: 20, 5, 20, 20 mcm
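The wetland flow and salinity follow from a simple mass balance: the wetland receives the release in excess of the irrigation diversion plus 30\% of the diverted water as return flow, and its salinity is the flow-weighted average of the two sources. The sketch below assumes complete mixing. The function name `wetland_mix` and the release of 15 mcm are illustrative choices, not part of the assignment.

```python
def wetland_mix(R, Qirr, Cres=1.0, Cirr=20.0, return_frac=0.3):
    """Return (flow, salinity) in the wetland for release R and irrigation diversion Qirr."""
    excess = R - Qirr                  # release that bypasses irrigation
    return_flow = return_frac * Qirr   # saline return flow from irrigation
    Qwet = excess + return_flow
    Cwet = (Cres * excess + Cirr * return_flow) / Qwet  # flow-weighted mean salinity
    return Qwet, Cwet

# illustrative season-1 case: release 15 mcm with a 5 mcm irrigation diversion
Qwet, Cwet = wetland_mix(R=15, Qirr=5)
print(round(Qwet, 2), round(Cwet, 2))  # 11.5 3.48
```

In this case a release three times the irrigation requirement still leaves the wetland salinity near 3.5 ppt, which shows how strongly the return flow drives the salinity objective.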

## Part a

Re-run your dynamic program from HW 5 to find the optimal releases from the reservoir each season for discrete storage states between 0 and 30 in increments of 5, assuming the mean inflow each season. Use the same cost function as HW 5: the sum of percent deviations from all targets, where storages above and below the target are penalized, but only wetland salinity concentrations above the maximum target and only wetland flows below the minimum target are penalized. Report a table with the prescribed release each season from each storage state.
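Because $Y=\ln(Qin)$ is normal, $Qin$ is lognormal and its mean exceeds its median $e^{\mu_Y}$: $E[Qin] = \exp(\mu_Y + \sigma_Y^2/2)$. The check below uses made-up parameters, not the assignment's values. It assumes the second argument of $N(\cdot,\cdot)$ is the standard deviation of $Y$; if it is a variance, use it in place of $\sigma_Y^2$.

```python
import numpy as np

def lognormal_mean(mu_y, sigma_y):
    """Mean of Q = exp(Y) when Y ~ N(mu_y, sigma_y**2)."""
    return np.exp(mu_y + 0.5 * sigma_y**2)

# hypothetical log-space parameters for one season
mu_y, sigma_y = 2.0, 0.5
mean_q = lognormal_mean(mu_y, sigma_y)
median_q = np.exp(mu_y)

# Monte Carlo check of the analytical mean, with a fixed seed for repeatability
rng = np.random.default_rng(0)
sample_mean = np.exp(rng.normal(mu_y, sigma_y, size=200_000)).mean()
print(mean_q, median_q, sample_mean)
```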

$\color{red}{\text{Complete the code below with the inflow parameters}}$

In [None]:
import numpy as np
import pandas as pd
import scipy.stats as ss
import matplotlib.pyplot as plt

# inflow parameters
muY =
sigmaY =
nSeasons = len(muY)
meanQ =
rho =

$\color{red}{\text{The code below is complete - run it to get the optimal DDP policy.}}$

In [None]:
# define model parameters
# reservoir parameters
K = 30
maxR = 50
S_target = np.array([20, 5, 20, 20])

# irrigation parameters
Qirr = np.array([5,20,10,5])
Qirr_RF =  0.3

# salinity parameters
Cres = 1
Cirr = 20
Cwet_maxTarget = 5

# wetland parameters
Qwet_minTarget = np.array([10,20,15,15])

######################## DP Optimization ########################

def calcCostDP(S, Q, Qirr, S_target, Cwet_maxTarget, Qwet_minTarget, bounds, FutureCost):
  '''
  Function to calculate the optimal release (Rbest) from each storage state (S)
  and associated present and future cost (Cbest), defined as the total percent
  deviation from the storage, wetland flow and wetland salinity targets for that stage.

  Inputs:
    S: 1-D array of discrete storage values representing the states
    Q: inflow received at this stage (scalar)
    Qirr: irrigation flow this stage (scalar)
    S_target: target storage for next stage (scalar)
    Cwet_maxTarget: target maximum salinity in the wetland for this stage (scalar)
    Qwet_minTarget: target minimum flow in the wetland for this stage (scalar)
    bounds: bounds on possible releases (1-D array of length 2)
    FutureCost: 1-D array of future costs at each state that will be added to
      cost of the optimal state transition at this stage to compute present + future cost

  Outputs:
    Rbest: 1-D array of optimal releases from each state in S
    Cbest: 1-D array of present + future costs associated with each state S
  '''

  # initialize current cost at infinity and releases at 0
  Cbest = np.full(len(S), np.inf)
  Rbest = np.zeros([len(S)])
  for i, s in enumerate(S): # storage at stage t
    # find optimal storage to move to at stage t+1
    for j, sNext in enumerate(S): # storage at stage t+1
      R = s + Q - sNext # release to get to sNext
      # find cost of this release if it's feasible
      if R >= bounds[0] and R <= bounds[1]:
        # compute flow and salinity in the wetland
        Qwet = R - Qirr + Qirr_RF*Qirr
        Cwet = (Cres*(R - Qirr) + Cirr*Qirr_RF*Qirr) / Qwet

        # compute total cost C (total deviation from targets + future cost at sNext)
        S_deviation = np.abs(sNext-S_target)*100/S_target
        Qwet_deviation = max(Qwet_minTarget - Qwet,0)*100 / Qwet_minTarget
        Cwet_deviation = max(Cwet - Cwet_maxTarget,0)*100 / Cwet_maxTarget
        C = S_deviation + Qwet_deviation + Cwet_deviation + FutureCost[j]
        # update optimal value (Cbest) and decision (Rbest) if better than current best
        if C < Cbest[i]:
          Cbest[i] = C
          Rbest[i] = R

  return Rbest, Cbest

# get indices of stages
nStages = len(meanQ)
forward_indices = np.arange(nStages)
backward_indices = forward_indices[::-1]
backward_indices = np.insert(backward_indices,0,0)

# discretize states
states = np.arange(0,31,5)
nStates = len(states)

# bounds on decision variables (releases)
# R is between Qirr and 50, S is positive and can't exceed capacity K
bounds = []
for i in range(len(Qirr)):
    bounds.append([Qirr[i],maxR])

# initialize matrices with costs of each state at each stage
# and optimal releases to make from each state at each stage
DDP_costs = np.empty([nStates,nStages])
DDP_release_policy = np.empty([nStates,nStages])

# initialize FutureCost at 0 for all states; will update as we move backwards
FutureCost = np.zeros([nStates])

# begin backward-moving DP
loop = True
while loop:
  count = 0
  for index in backward_indices[0:-1]:
    # find optimal release and value of each state in this stage
    R, FutureCost = calcCostDP(states, meanQ[index-1], Qirr[index-1], S_target[index], Cwet_maxTarget,
                               Qwet_minTarget[index-1], bounds[index-1], FutureCost)

    # count iterations with no change in optimal release
    if np.all(R == DDP_release_policy[:,index-1]):
      count += 1

    # update best releases and value of each state if not yet in steady state
    DDP_costs[:,index] = FutureCost
    DDP_release_policy[:,index-1] = R

  # stop loop once the optimal release is unchanged at every stage of a full backward pass
  if count == len(backward_indices[0:-1]):
    break

release_policy_df = pd.DataFrame(DDP_release_policy,
                                 columns=["Season 1","Season 2","Season 3","Season 4"],
                                 index=states)
release_policy_df.index.rename("Storage",inplace=True)
release_policy_df

## Part b

Now explicitly consider uncertainty in the optimization using stochastic dynamic programming. First, compute the transition probabilities from 15 discrete log-space flow levels $Y_{t-1}$ in season $t-1$ to 15 discrete log-space flow levels $Y_t$ in season $t$. Print the transition probabilities each season.
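For jointly normal log flows with lag-1 correlation $\rho$, the conditional distribution is $Y_t \mid Y_{t-1}=y \sim N\left(\mu_t + \rho\frac{\sigma_t}{\sigma_{t-1}}(y-\mu_{t-1}),\ \sigma_t^2(1-\rho^2)\right)$. One way to discretize it is sketched below. The equal-probability bins, the choice of each bin's probability midpoint as its representative level, the function name `transition_matrix`, and the parameter values are all assumptions for illustration. The scheme used in class may differ.

```python
import numpy as np
import scipy.stats as ss

def transition_matrix(mu_prev, sigma_prev, mu_cur, sigma_cur, rho, n_levels):
    """Discretize Y_{t-1} and Y_t into n_levels bins and return P[i, j] = P(Y_t in bin j | Y_{t-1} = level i)."""
    # equal-probability bin edges in standard-normal space (first/last edges are -inf/+inf)
    p_edges = np.linspace(0, 1, n_levels + 1)
    z_edges = ss.norm.ppf(p_edges)
    # representative level of each bin: the quantile at the bin's probability midpoint
    z_mid = ss.norm.ppf((p_edges[:-1] + p_edges[1:]) / 2)
    y_prev = mu_prev + sigma_prev * z_mid
    y_cur = mu_cur + sigma_cur * z_mid
    # standardized conditional distribution: Z_t | Z_{t-1}=z ~ N(rho*z, 1 - rho^2)
    cond_sd = np.sqrt(1 - rho**2)
    upper = ss.norm.cdf((z_edges[1:][None, :] - rho * z_mid[:, None]) / cond_sd)
    lower = ss.norm.cdf((z_edges[:-1][None, :] - rho * z_mid[:, None]) / cond_sd)
    return y_prev, y_cur, upper - lower

# hypothetical parameters and only 5 levels to keep the printout small
y_prev, y_cur, P = transition_matrix(2.0, 0.2, 3.0, 0.3, rho=0.7, n_levels=5)
print(np.round(P, 3))
print(P.sum(axis=1))  # each row should sum to 1
```

With positive $\rho$, the probability mass in each row concentrates near the diagonal: a low flow last season makes a low flow this season more likely.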

$\color{red}{\text{Complete the code below to calculate and print the transition probabilities}}$

## Part c

Modify the functions $\texttt{calcCostSDP}$ and $\texttt{findBestR}$ from the [SDPexample.ipynb](https://colab.research.google.com/github/EnvSystemsUVA/CodingExamples/blob/main/10_SDPexample.ipynb) notebook shown in class to calculate the sum of present and expected future costs, where costs are quantified the same as in part (a). Then optimize the SDP policy, iterating through the years until the average percent difference in the release policy is $<$ 0.1\% across all seasons. Constrain the SDP policy to not violate the storage constraints under the mean inflow. Print the release policy as a function of the storage states and flow levels each season.
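The expected future cost of a candidate release is the probability-weighted average, over next season's flow levels, of the cost-to-go at the resulting storage. The sketch below shows that single step on a toy problem. The helper name `expected_future_cost`, the use of linear interpolation between storage states, and all numbers are assumptions for illustration, not the class notebook's implementation.

```python
import numpy as np

def expected_future_cost(s_next, trans_row, S, FutureExpCost):
    """Expected cost-to-go from storage s_next, averaged over next-season flow levels.

    trans_row: P(next flow level k | current flow level), 1-D array summing to 1
    FutureExpCost: array of shape (len(S), n_levels) of costs-to-go
    """
    # linearly interpolate the cost-to-go at s_next for each flow level
    cost_by_level = np.array([np.interp(s_next, S, FutureExpCost[:, k])
                              for k in range(FutureExpCost.shape[1])])
    return float(trans_row @ cost_by_level)

# toy example: 3 storage states, 2 flow levels (all numbers hypothetical)
S = np.array([0.0, 5.0, 10.0])
FutureExpCost = np.array([[40.0, 20.0],
                          [20.0, 10.0],
                          [10.0,  0.0]])
trans_row = np.array([0.25, 0.75])
efc = expected_future_cost(7.5, trans_row, S, FutureExpCost)
print(efc)  # 7.5
```

The total cost of a release is then the present-stage percent deviations plus this expected value.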

$\color{red}{\text{Complete the functions below to calculate present and expected future costs.}}$

In [None]:
######################## SDP Optimization ########################
from scipy.optimize import minimize

def findBestR(R, s, y1, j, S, Ylevels2, transprob, Qirr, S_target, Cwet_maxTarget, Qwet_minTarget, FutureExpCost):


  return C

def calcCostSDP(S, Ylevels1, Ylevels2, Qguess, transprob, Qirr, S_target, Cwet_maxTarget, Qwet_minTarget, bounds, FutureExpCost):


  return Rbest, Cbest

$\color{red}{\text{Complete the code below to find the optimal release policy with SDP}}$

In [None]:
# use same stages, states, indices and bounds as for DDP
# initialize matrices with costs of each state at each stage
# and optimal releases to make from each state at each stage
SDP_costs = np.empty([nStates, nLevels, nStages])
SDP_release_policy = np.empty([nStates, nLevels, nStages])

# initialize FutureExpCost at 0 for all states and flow levels; will update as we move backwards
FutureExpCost = np.zeros([nStates, nLevels])

# begin backward-moving SDP


# print the release policy of each stage


## Part d

Simulate 50 years of operations in which the release each season is determined by the DDP policy found in part (a). Repeat this using the SDP policy found in part (c). In both cases, start the simulation at the target reservoir storage for the first season. If there is insufficient water to meet the release prescribed by the policy, release only as much water as is available. Likewise, if the prescribed release would cause storage to exceed the reservoir capacity, release as much as needed to prevent that. The adjusted release may fall short of the irrigation requirement or exceed the maximum release of 50 mcm, but that's okay for the purpose of this simulation.
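The two physical adjustments described above amount to clamping the prescribed release between the spill needed to stay within capacity and the water available. A minimal sketch, where the function name `feasible_release` and the example numbers are hypothetical:

```python
def feasible_release(prescribed_R, S, Q, K=30):
    """Adjust a prescribed release so storage stays between 0 and capacity K."""
    available = S + Q                  # water on hand this season
    R = min(prescribed_R, available)   # cannot release more than is available
    R = max(available - K, R)          # must spill anything above capacity
    return R

print(feasible_release(20, S=5, Q=6))    # 11: only 11 mcm is available
print(feasible_release(20, S=28, Q=40))  # 38: spill to avoid exceeding K = 30
print(feasible_release(20, S=15, Q=10))  # 20: prescribed release is feasible
```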

$\color{red}{\text{The code below is complete.}}$

In [None]:
######################## Simulation ########################

# initialize storages and releases for simulation of 50 years of 4 seasons with DDP and SDP policies
nYears = 50
nSeasons = 4

class Solution():
  # initialize Solution class with the attributes shared by the DDP and SDP solutions
  def __init__(self):
    self.simS = np.zeros([nYears,nSeasons])
    self.simR = np.zeros([nYears,nSeasons])
    self.simQwet = np.zeros([nYears,nSeasons])
    self.simCwet = np.zeros([nYears,nSeasons])
    self.S_costs = np.zeros([nYears])
    self.Qwet_costs = np.zeros([nYears])
    self.Cwet_costs = np.zeros([nYears])
    self.Total_costs = np.zeros([nYears])
    self.prescribedR = None
    self.Rmin_violations = 0
    self.Rmax_violations = 0

  # method of Solution class to calculate simulated R and S
  def getSimStates(self, Q, year, season):
    # adjust prescribed release if not physically possible
    # R = min(prescribedR, simS + Q) prevents it from releasing more water than is available
    # max(simS + Q - K, R) prevents storage capacity from being exceeded
    self.simR[year,season] = max(self.simS[year,season] + Q - K,
                                 min(self.prescribedR, self.simS[year,season] + Q))

    # find number of violations of maxR and Qirr
    if self.simR[year,season] > maxR:
      self.Rmax_violations += 1
    elif self.simR[year,season] < Qirr[season]:
      self.Rmin_violations += 1

    # calculate new storage
    if season != (nSeasons-1): # storage in next season of same year
      self.simS[year,season+1] = self.simS[year,season] + Q - self.simR[year,season]
    elif year != (nYears-1): # storage in season 1 of next year
      self.simS[year+1,0] = self.simS[year,season] + Q - self.simR[year,season]

    # calculate Qwet and Cwet from mass balance
    self.simQwet[year,season] = self.simR[year,season] - Qirr[season] + Qirr_RF*Qirr[season]
    self.simCwet[year,season] = (Cres*(self.simR[year,season] - Qirr[season]) +
                                     Cirr*Qirr_RF*Qirr[season]) / self.simQwet[year,season]

  # method of Solution class to calculate cost (total deviation from targets) over simulation
  def getSimCost(self, year):
    self.S_costs[year] = np.sum(np.abs(self.simS[year,:] - S_target)*100 / S_target)
    for season in range(nSeasons):
      self.Qwet_costs[year] += max(0,Qwet_minTarget[season] - self.simQwet[year,season])*100 / Qwet_minTarget[season]
      self.Cwet_costs[year] += max(0,self.simCwet[year,season] - Cwet_maxTarget)*100 / Cwet_maxTarget

    self.Total_costs[year] = self.S_costs[year] + self.Qwet_costs[year] + self.Cwet_costs[year]

$\color{red}{\text{Complete the code below to run the simulation with DDP and SDP}}$

In [None]:
from scipy.interpolate import RegularGridInterpolator as interp2d

# create objects of Solution class for DDP and SDP solutions
DDP = Solution()
SDP = Solution()

# start at target storage
DDP.simS[0,0] = S_target[0]
SDP.simS[0,0] = S_target[0]

# vector of standard normal random variables for flow simulation
Z = np.zeros([nYears*nSeasons+1])

# generate prior season's random normal inflow
seed = 0
Z[seed] = ss.norm.rvs(0,1,1)[0]
Qpast = np.exp(Z[seed]*sigmaY[-1] + muY[-1])

# simulate operations over 50 years of 4 seasons


## Part e

Based on your simulation from part (d), make a 2x2 panel figure of the empirical cumulative distribution function of percent deviations from the storage, salinity, and wetland flow targets, as well as the sum across all three. Do this for the policies from parts (a) and (c) using a different color for each. Discuss the differences you see in performance between the operating policies found using DDP vs. SDP and why.
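An empirical CDF pairs each sorted observation with the fraction of observations at or below it. The sketch below computes those points for a made-up list of annual costs. The function name `ecdf` and the plotting position $i/n$ are illustrative choices; other plotting positions such as $i/(n+1)$ are also common.

```python
import numpy as np

def ecdf(values):
    """Return sorted values x and cumulative probabilities p = i/n for i = 1..n."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

# hypothetical annual percent deviations
costs = [30.0, 10.0, 20.0, 10.0]
x, p = ecdf(costs)
print(x, p)
```

Each panel of the figure can then draw one curve per policy, for example with `plt.step(x, p, where="post")` on axes created by `plt.subplots(2, 2)`.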

$\color{red}{\text{Use the code chunk below to make your plot.}}$

## Part f

Make a bar chart of release violations below Qirr and above Rmax for each solution method. How do the two methods compare?

$\color{red}{\text{Use the code chunk below to make your plot.}}$