# Resource Allocation Code Demo

The Food Bank of the Southern Tier (FBST) is a member of Feeding America focused on providing food security for people with limited financial resources. It serves six counties and nearly 4,000 square miles in New York. Under normal (non-COVID) operations, the Mobile Food Pantry program is among the main activities of the FBST. The goal of the service is to make nutritious and healthy food more accessible to people in underserved communities. Even in areas where other agencies provide assistance, clients may not always have access to food, either because public transportation options are limited or because those agencies are open only certain hours or days per week.

Here we run a sample experiment that compares several existing and newly developed algorithms against a randomized heuristic.

In [1]:
import or_suite
import numpy as np

import copy

import os
from stable_baselines3.common.monitor import Monitor
from stable_baselines3 import PPO
from stable_baselines3.ppo import MlpPolicy
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.evaluation import evaluate_policy
import pandas as pd


import gym

In [2]:
# Get the default configuration parameters for the environment
CONFIG =  or_suite.envs.env_configs.resource_allocation_default_config


# Specify the episode length (epLen), number of episodes (nEps), and number of iterations (numIters)
epLen = CONFIG['num_rounds']
nEps = 1
numIters = 500

# Configuration parameters for running the experiment
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/resource/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, # save trajectory for calculating additional metrics
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False # indicator for pickling final information
                    }

resource_env = gym.make('Resource-v0', config=CONFIG)
mon_env = Monitor(resource_env)


In [13]:
agents = {
    # 'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
    # 'Random': or_suite.agents.rl.random.randomAgent(),
    # 'Equal': or_suite.agents.resource_allocation.equal_allocation.equalAllocationAgent(epLen, CONFIG),
    'FixedThreshold': or_suite.agents.resource_allocation.fixed_threshold.fixedThresholdAgent(epLen, CONFIG),
    'Guardrail-0.5': or_suite.agents.resource_allocation.hope_guardrail.hopeguardrailAgent(epLen, CONFIG, 0.5),
    'Guardrail-0.3': or_suite.agents.resource_allocation.hope_guardrail.hopeguardrailAgent(epLen, CONFIG, 0.3),
    'Guardrail-0.25': or_suite.agents.resource_allocation.hope_guardrail.hopeguardrailAgent(epLen, CONFIG, 0.25),
}

Mean and variance endomwnets:
[[1.971 1.972 1.975 2.03  1.99  2.026 2.023 1.994 1.958 1.986]
 [2.97  2.915 3.12  2.965 3.012 3.047 3.011 2.98  2.971 3.069]
 [3.891 3.974 4.069 3.891 3.986 3.962 3.977 3.962 4.037 3.955]] [[0.942159 0.989216 0.912375 0.9651   0.9799   0.985324 0.982471 1.035964
  0.926236 1.013804]
 [1.9611   1.977775 2.1476   1.907775 1.939856 2.042791 1.950879 1.9056
  2.000159 2.098239]
 [2.763119 2.673324 3.078239 2.905119 3.045804 3.018556 3.062471 3.072556
  3.223631 2.924975]]
Mean and variance endomwnets:
[[2.001 2.    1.986 1.999 2.047 2.007 1.99  1.948 1.978 2.03 ]
 [2.975 2.934 2.968 3.07  3.037 3.104 2.996 2.964 2.996 2.996]
 [4.044 3.952 4.053 3.924 4.061 4.05  3.96  3.951 3.986 4.076]] [[1.020999 1.006    0.901804 1.034999 0.940791 1.056951 0.8879   0.879296
  0.979516 1.0151  ]
 [1.984375 2.109644 1.934976 2.0351   1.859631 2.195184 1.801984 2.142704
  1.929984 1.927984]
 [3.120064 2.773696 3.104191 2.828224 3.153279 3.0895   2.8504   2.922599
  3.065804 3
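
The printed arrays pair a mean with a variance for each entry. As a small consistency check, and only as a reading of the output rather than a description of what `or_suite` does internally, the sketch below recovers the second moment from the first column of the first pair of arrays using the identity E[x^2] = Var(x) + mean^2. The interpretation that these are population statistics over 1,000 integer-valued draws is an assumption; the check only shows that the printed numbers are consistent with it.

```python
# (mean, variance) pairs copied from the first column of the first
# printed pair of arrays above.
pairs = [(1.971, 0.942159), (2.97, 1.9611), (3.891, 2.763119)]

for mean, var in pairs:
    # For a population variance, Var(x) = E[x^2] - mean^2,
    # so the second moment is E[x^2] = Var(x) + mean^2.
    second_moment = var + mean ** 2
    # Assumption being checked: if x is integer-valued and the statistics
    # are averages over 1,000 draws, 1000 * E[x^2] is a whole number.
    scaled = 1000 * second_moment
    print(f"mean={mean:.3f}  E[x^2]={second_moment:.6f}  1000*E[x^2]={scaled:.3f}")
```

If the scaled second moments come out as whole numbers, the printed variances are consistent with the divide-by-n (population) convention rather than the divide-by-(n-1) sample convention.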

# Step 5: Run Simulations

Run each heuristic in the environment, then plot the results.

In [14]:
import warnings
warnings.simplefilter('ignore')

In [15]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar = []
for agent in agents:
    print(agent)
    # Write each agent's results to its own directory
    DEFAULT_SETTINGS['dirPath'] = '../data/resource_'+str(agent)+'/'
    if agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    elif agent in ('AdaQL', 'Unif QL', 'AdaMB', 'Unif MB'):
        # Not reached with the agents defined above; scaling_list must be
        # defined before any of these agents is added to the dictionary
        or_suite.utils.run_single_algo_tune(resource_env, agents[agent], scaling_list, DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(resource_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append('../data/resource_'+str(agent))
    algo_list_line.append(str(agent))
    # SB PPO is excluded from the radar plot
    if agent != 'SB PPO':
        path_list_radar.append('../data/resource_'+str(agent)+'/')
        algo_list_radar.append(str(agent))
        
fig_path = '../figures/'
fig_name = 'resource'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)        
        
additional_metric = {
    'Efficiency': lambda traj: or_suite.utils.delta_EFFICIENCY(traj, CONFIG),
    'Hindsight Envy': lambda traj: or_suite.utils.delta_HINDSIGHT_ENVY(traj, CONFIG),
    'Counterfactual Envy': lambda traj: or_suite.utils.delta_COUNTERFACTUAL_ENVY(traj, CONFIG),
    # 'Prop': lambda traj: or_suite.utils.delta_PROP(traj, CONFIG),
    # 'Exante Envy': lambda traj: or_suite.utils.delta_EXANTE_ENVY(traj, CONFIG),
}
fig_name = 'resource'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
                                fig_path, fig_name,
                                additional_metric)

FixedThreshold
Lower Solutions:
[[0.      0.08651]
 [0.      0.08651]
 [0.10607 0.     ]]
Writing to file data.csv
Guardrail-0.5
Lower and Upper Solutions:
[[0.      0.08684]
 [0.      0.08684]
 [0.10698 0.     ]]
[[0.      0.12701]
 [0.      0.12701]
 [0.15645 0.     ]]
Writing to file data.csv
Guardrail-0.3
Lower and Upper Solutions:
[[0.      0.08658]
 [0.      0.08658]
 [0.10613 0.     ]]
[[0.      0.17357]
 [0.      0.17357]
 [0.21277 0.     ]]
Writing to file data.csv
Guardrail-0.25
Lower and Upper Solutions:
[[0.      0.08693]
 [0.      0.08693]
 [0.10577 0.     ]]
[[0.      0.19862]
 [0.      0.19862]
 [0.24168 0.     ]]
Writing to file data.csv
        Algorithm     Reward      Time      Space    Efficiency  \
0  FixedThreshold -14.635384  5.413936 -12498.342 -11571.418670   
1   Guardrail-0.5 -10.774767  5.219331 -12290.717  -7596.929250   
2   Guardrail-0.3  -7.809539  5.217422 -12280.866  -3293.347751   
3  Guardrail-0.25  -7.200009  5.222994 -12279.810  -1991.251960   
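
To make the printed results easier to compare, here is a minimal sketch that ranks the four agents by reward and reports the gap between the upper and lower solutions for the guardrail agents. All numbers are copied by hand from the output above; the variable names (`rewards`, `bounds`, `ranked`) are illustrative and not part of `or_suite`, and the sketch assumes that a higher (less negative) reward is better.

```python
# Rewards copied from the "Reward" column of the printed results table.
rewards = {
    "FixedThreshold": -14.635384,
    "Guardrail-0.5": -10.774767,
    "Guardrail-0.3": -7.809539,
    "Guardrail-0.25": -7.200009,
}

# (lower, upper) values of entry [0, 1] in the printed
# "Lower and Upper Solutions" matrices for each guardrail agent.
bounds = {
    "Guardrail-0.5": (0.08684, 0.12701),
    "Guardrail-0.3": (0.08658, 0.17357),
    "Guardrail-0.25": (0.08693, 0.19862),
}

baseline = rewards["FixedThreshold"]

# Assumption: a higher (less negative) reward is better.
ranked = sorted(rewards, key=rewards.get, reverse=True)

for name in ranked:
    gain = rewards[name] - baseline
    if name in bounds:
        lower, upper = bounds[name]
        width = f"{upper - lower:.5f}"
    else:
        width = "n/a"  # FixedThreshold prints only a lower solution
    print(f"{name:<15} reward={rewards[name]:8.3f}  gain={gain:6.3f}  width={width}")
```

In this run, the gap between the upper and lower solutions widens as the guardrail parameter decreases from 0.5 to 0.25, and the reward improves in the same order. A single seed is not enough to draw a general conclusion from that pattern.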

  

In [16]:
from IPython.display import IFrame
IFrame("../figures/resource_line_plot.pdf", width=600, height=280)

In [17]:
IFrame("../figures/resource_radar_plot.pdf", width=600, height=500)