Algorithms

The implemented X-Armed Bandit algorithms can be classified along three design features: whether they handle stochastic (noisy) rewards, whether they target the cumulative regret, and whether they are anytime.

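=========  ===============  ==========  ==========  =======
Algorithm  Research         Stochastic  Cumulative  Anytime
=========  ===============  ==========  ==========  =======
DiRect     paper
DOO        DOO paper
SOO        SOO paper
Zooming    Zooming paper
T-HOO      T-HOO paper
StoSOO     StoSOO paper
HCT        HCT paper
POO\*      POO paper
GPO\*      GPO paper
PCT        GPO paper
SequOOL    SequOOL paper
StroquOOL  StroquOOL paper
VROOM      VROOM paper
VHCT       VHCT paper
VPCT       N.A.
=========  ===============  ==========  ==========  =======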

  • (Stochastic) Algorithms such as T-HOO and HCT are designed for the stochastic X-Armed Bandit setting, where the observed rewards are noisy. Others, e.g., DOO, only work in the noiseless (deterministic) setting.
  • (Cumulative) Algorithms such as T-HOO and HCT are designed to minimize the cumulative regret, i.e., the regret accumulated over the whole learning process. Others, such as StoSOO and StroquOOL, minimize the simple regret, i.e., the regret of the final output point.
  • (Anytime) Algorithms such as T-HOO and HCT are anytime: they do not need to know the total number of rounds (the budget) in advance. Others, such as SequOOL and StroquOOL, require the budget as an input.
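
The three features above can be made concrete with a small, self-contained sketch. It does not use the library's API: the toy objective, its optimum ``F_STAR``, and the helpers ``noisy_reward``, ``fixed_budget_search``, and ``anytime_search`` are all hypothetical names introduced only for illustration, and the two search routines are simple stand-ins rather than any of the algorithms in the table.

```python
import random

F_STAR = 1.0  # optimal value of the toy objective below (assumption for illustration)


def objective(x):
    """Toy objective on [0, 1]; its maximum value 1.0 is attained at x = 0.5."""
    return 1.0 - (x - 0.5) ** 2


def noisy_reward(x, rng, noise_std=0.1):
    """Stochastic setting: the learner only observes f(x) plus Gaussian noise."""
    return objective(x) + rng.gauss(0.0, noise_std)


def cumulative_regret(pulled_points):
    """Sum of the gaps f* - f(x_t) over every round of the learning process."""
    return sum(F_STAR - objective(x) for x in pulled_points)


def simple_regret(final_point):
    """Gap f* - f(x) of the single point returned at the end."""
    return F_STAR - objective(final_point)


def fixed_budget_search(budget):
    """Not anytime: needs the budget up front to lay out an even grid of pulls."""
    return [(i + 0.5) / budget for i in range(budget)]


def anytime_search(rng):
    """Anytime: yields one point per round and can be stopped after any round."""
    while True:
        yield rng.random()


pulled = [0.0, 1.0, 0.25, 0.75, 0.5]
print(cumulative_regret(pulled))   # 0.625
print(simple_regret(pulled[-1]))   # 0.0
print(fixed_budget_search(4))      # [0.125, 0.375, 0.625, 0.875]
```

In this run the early exploratory pulls are costly under the cumulative regret, while the simple regret only looks at the last point and is zero. This is why an algorithm tuned for one criterion can look poor under the other.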

.. note::
    Please refer to the pages listed below for details on each algorithm.

.. toctree::
    :maxdepth: 1

    algorithms/Zooming/Zooming
    algorithms/DOO/DOO
    algorithms/SOO/SOO
    algorithms/StoSOO/StoSOO
    algorithms/T-HOO/T-HOO
    algorithms/HCT/HCT
    algorithms/POO/POO
    algorithms/GPO/GPO
    algorithms/PCT/PCT
    algorithms/SequOOL/SequOOL
    algorithms/StroquOOL/StroquOOL
    algorithms/VROOM/VROOM
    algorithms/VHCT/VHCT
    algorithms/VPCT/VPCT