Welcome to the dice_rl_TU_Vienna wiki! 🥳 Here we provide documentation for our policy evaluation API.
Note that the policy value of a target policy $\pi$ is defined as

$$\rho^\pi \doteq (1 - \gamma) \, \mathbb{E} \left[ \sum_{t=0}^{\infty} \gamma^t R_t \right].$$

Here, $\gamma \in (0, 1)$ is the discount factor, $R_t$ is the reward at time $t$, and the expectation is taken over trajectories generated by $\pi$ from the initial state distribution.
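DICE estimators approximate this value from logged data by re-weighting observed rewards with estimated stationary distribution correction ratios. The snippet below is only a minimal sketch of that weighted average; the arrays `rewards` and `sdc_ratios` are hypothetical inputs, not part of the library's API:

```python
import numpy as np

def weighted_policy_value(rewards: np.ndarray, sdc_ratios: np.ndarray) -> float:
    """Estimate the policy value as a self-normalized, ratio-weighted
    average of logged rewards.

    rewards    -- rewards observed in the off-policy dataset
    sdc_ratios -- estimated stationary distribution correction weights w(s, a)
                  (hypothetical input, e.g. produced by a DICE estimator)
    """
    return float(np.sum(sdc_ratios * rewards) / np.sum(sdc_ratios))

# Dummy data: with uniform weights this reduces to the plain average reward.
rewards = np.array([1.0, 0.0, 1.0, 1.0])
ratios = np.ones_like(rewards)
print(weighted_policy_value(rewards, ratios))  # 0.75
```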
DICE-based methods are designed for infinite-horizon settings. If your environment terminates after a finite horizon, consider looping it or modeling termination with absorbing states to better reflect infinite-horizon assumptions.
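As an illustration of the absorbing-state approach, a finite-horizon, Gymnasium-style environment could be wrapped so that every terminal transition is redirected into a zero-reward state that loops onto itself. This is only a sketch under that assumption, not functionality provided by this library:

```python
class AbsorbingStateWrapper:
    """Wrap a finite-horizon, Gymnasium-style environment so that terminal
    transitions loop forever in a zero-reward absorbing state.
    Illustrative sketch only; `absorbing_obs` is a user-chosen observation,
    e.g. an extra one-hot index appended to the state space."""

    def __init__(self, env, absorbing_obs):
        self.env = env
        self.absorbing_obs = absorbing_obs
        self._absorbed = False

    def reset(self, **kwargs):
        self._absorbed = False
        return self.env.reset(**kwargs)

    def step(self, action):
        if self._absorbed:
            # Stay in the absorbing state with zero reward, never terminate.
            return self.absorbing_obs, 0.0, False, False, {}
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # Redirect the terminal transition into the absorbing state.
            self._absorbed = True
            return self.absorbing_obs, reward, False, False, info
        return obs, reward, terminated, truncated, info
```

With such a wrapper, rollouts can be cut off at any fixed length while the collected transitions still behave as if the process ran forever.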
Before using the library in depth, we strongly recommend reading the documentation carefully — especially the Background section — to understand key assumptions and concepts. You may also benefit from reviewing the example project linked below for a concrete application.
Jump directly to:
- Background — Key assumptions, estimators, and Bellman equations.
- Dataset and Policies — Required dataset structure and policy representation.
- Hyperparameters — Configuration details for DICE estimators.
- Algorithms — List of implemented algorithms and their expected input formats.
For a practical application of these estimators in the healthcare domain, see our related repository:
👉 dice_rl_sepsis
— Code and experiments for the publication *Evaluating Reinforcement-Learning-based Sepsis Treatments via Tabular and Continuous Stationary Distribution Correction Estimation*.