jamie01713/EGT

$\epsilon$-Greedy Thresholding algorithm for RMAB

Setup

The following setup should suffice for development or for reproducing the experiments:

# ensure micromamba
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)

# set up the development environment for EGT
# XXX '=' fuzzy prefix version match, '==' exact version match
# XXX to start over: micromamba deactivate && micromamba env remove -n EGT
micromamba create -n EGT                       \
  "python>=3.11"                               \
  numpy                                        \
  scipy                                        \
  jax                                          \
  chex                                         \
  scikit-learn                                 \
  pandas                                       \
  gitpython                                    \
  matplotlib                                   \
  gymnasium                                    \
  "gurobi::gurobi=12"                          \
  jupyter                                      \
  tqdm                                         \
  "black[jupyter]"                             \
  nbdime                                       \
  && micromamba clean --all --yes
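
Once the environment exists, a quick import check can confirm that the packages resolved. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the list uses the import names that the conda packages above normally provide (for example `sklearn` for scikit-learn, `git` for gitpython, `gurobipy` for gurobi).

```python
import importlib.util

# Import names normally provided by the packages installed above.
# ASSUMPTION: this list is illustrative; the code in the repository may
# import a different subset.
EXPECTED_MODULES = [
    "numpy", "scipy", "jax", "chex", "sklearn", "pandas",
    "git", "matplotlib", "gymnasium", "gurobipy", "tqdm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("missing modules:", ", ".join(missing))
    else:
        print("all expected modules were found")
```

Saved under any file name, for example `check_env.py`, it can be run inside the environment with `micromamba run -n EGT python check_env.py`.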

Running experiments

The following command runs the experiments declared in run_xp2.sh, which sweeps over several settings of the budget (the number of arms the policy is allowed to act on at every step) and of the number of states.

micromamba run -n EGT sh ./run_xp2.sh

Running your own experiments

To run your own experiments, you can modify the following parameters in run_xp2.sh:

  • Change the size of the state space by setting n_state
  • Change the budget size by setting n_budgets
  • Change the number of arms (agents) by setting n_arms

You can also use your own problem instance by replacing the source path in run_xp2.sh.

The input file should be an .npz file containing:

  • A transition matrix stored in an array named kernel with shape (m, a, s, x)
  • A reward tensor stored in an array named reward with shape (m, a, s, x)
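
A file in this format could be produced with NumPy as sketched below. This is an illustration under stated assumptions, not code from the repository: the helper names are hypothetical, and the axes are assumed to be (arm, action, current state, next state), so that each `kernel[m, a, s, :]` is a probability distribution over next states. The README only names the arrays and gives their shape, so check the axis order against the code before relying on it.

```python
import os
import tempfile

import numpy as np


def make_random_instance(n_arms, n_actions, n_states, seed=0):
    """Draw a random instance with the array shapes listed above.

    ASSUMPTION: axes are (arm, action, current state, next state).
    """
    rng = np.random.default_rng(seed)
    shape = (n_arms, n_actions, n_states, n_states)
    kernel = rng.random(shape)
    # Normalize over the last axis so every row is a distribution.
    kernel /= kernel.sum(axis=-1, keepdims=True)
    reward = rng.random(shape)
    return kernel, reward


def check_instance(path):
    """Load an .npz file and run basic sanity checks; return the shape."""
    with np.load(path) as data:
        kernel, reward = data["kernel"], data["reward"]
    if kernel.ndim != 4 or kernel.shape != reward.shape:
        raise ValueError("kernel and reward must share a 4-D shape")
    # ASSUMPTION: the last two axes index the same state space.
    if kernel.shape[2] != kernel.shape[3]:
        raise ValueError("the last two axes of kernel must have equal size")
    if (kernel < 0).any() or not np.allclose(kernel.sum(axis=-1), 1.0):
        raise ValueError("kernel rows must be non-negative and sum to 1")
    return kernel.shape


if __name__ == "__main__":
    kernel, reward = make_random_instance(n_arms=5, n_actions=2, n_states=3)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my_instance.npz")
        # The keyword names become the array names inside the file.
        np.savez(path, kernel=kernel, reward=reward)
        print(check_instance(path))
```

The sketch writes to a temporary directory only to stay self-contained. For a real experiment, save the file to a permanent location and point the source path in run_xp2.sh at it.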
