ucb

Implementation of the Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of reinforcement learning problems in which an agent learns to choose among a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are two popular algorithms for solving MAB problems.

reinforcement-learning-algorithms ucb bandits mab e-greedy

Updated Mar 26, 2023
Python
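
The description above names two arm-selection rules. As a rough illustration only (not taken from the listed repository), the sketch below shows one common form of each: UCB1 with the bonus sqrt(2 ln t / n), and epsilon-greedy with a fixed exploration rate. The function names, the Bernoulli reward model, and the `run` helper are assumptions made for this example.

```python
import math
import random


def epsilon_greedy(counts, values, epsilon, rng):
    """With probability epsilon pick a uniformly random arm, else the best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: values[a])


def ucb1(counts, values, t):
    """Play every arm once, then pick the arm maximizing mean + sqrt(2 ln t / n)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run(policy, probs, steps, seed=0, epsilon=0.1):
    """Simulate Bernoulli arms (hypothetical test bed) and return (pull counts, total reward)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)    # number of pulls per arm
    values = [0.0] * len(probs)  # running mean reward per arm
    total = 0.0
    for t in range(1, steps + 1):
        if policy == "ucb1":
            arm = ucb1(counts, values, t)
        else:
            arm = epsilon_greedy(counts, values, epsilon, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, total
```

For example, `run("ucb1", [0.1, 0.9], 2000)` returns the per-arm pull counts and the total reward collected; with a gap this large, the second arm should receive the large majority of pulls.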

salimandre / Monte-Carlo-Tree-Search-for-checkers-game

We compare different policies for the checkers game using reinforcement learning algorithms.

python reinforcement-learning turtle-graphics ucb monte-carlo-tree-search checkers-game upper-confidence-bound mcts-algorithm

Updated Aug 24, 2020
Python

LittleWat / hyper-parameter-optimization-by-GMRF-GPUCB

R.I.T project

python3 ucb gaussian-processes gmrf markov-random-field gp

Updated Jul 29, 2019
Python

JoelJa835 / Least-Loaded-Server

reinforcement-learning-algorithms ucb multiplicative-weights

Updated Apr 26, 2023
Python

Suchetaaa / CS747-Assignments

Foundations Of Intelligent Learning Agents (FILA) Assignments

reinforcement-learning monte-carlo linear-programming thompson-sampling ucb bootstrapping multi-armed-bandits bellman-equation temporal-differencing-learning howards-pi sarsa-learning kl-ucb windy-gridworld intelligent-learning-agents

Updated Nov 8, 2019
Python

paramrathour / Intelligent-and-Learning-Agents

My programs during CS747 (Foundations of Intelligent and Learning Agents) Autumn 2021-22

linear-programming thompson-sampling epsilon-greedy mountain-car sarsa ucb markov-decision-processes multi-armed-bandit policy-iteration value-iteration tile-coding kl-ucb policy-control

Updated Apr 17, 2022
Python

SarCode / ML-Code-Tutorials-Udemy

Complete Tutorial Guide with Code for learning ML

natural-language-processing random-forest svm scikit-learn artificial-neural-networks logistic-regression ucb polynomial-regression kmeans-clustering knearest-neighbor-algorithm apriori-algorithm classification-methods svr kernel-svm kernel-pca heirarchical-clustering decison-trees

Updated Apr 21, 2023
Python

csfive / CS61A

🚧

python cs61a sicp cs ucb

Updated Jul 15, 2024
Python

amait41 / Hex-Game

Python implementation of the Hex game with AI based on MC and MCTS methods. Interactive mode with pygame.

game python hex reinforcement-learning ai ucb

Updated Mar 11, 2023
Python

ishank-juneja / Correlated-AoI-Bandits

Author's implementation of the paper Correlated Age-of-Information Bandits.

thompson-sampling ucb multi-armed-bandit aoi age-of-information correlated-multi-armed-bandits correlated-arms aoi-regret

Updated Jun 19, 2021
Python

MaxenceGiraud / ucb-nonstationary

On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems

ucb multi-armed-bandits non-stationary-bandit discounted-ucb sliding-ucb

Updated Oct 7, 2022
Python

idanmoradarthas / MutiArmedBandit-DeepLearning

Multi-armed bandit algorithm with tensorflow and 11 policies

tensorflow deep-reinforcement-learning python3 ucb multi-armed-bandit epsilon softmax

Updated Dec 27, 2022
Python

woctezuma / puissance4

AI for the game "Connect Four". Available on PyPI.

Updated Mar 14, 2024
Python

erdogant / thompson

Thompson is a Python package to evaluate the multi-armed bandit problem. In addition to Thompson sampling, the Upper Confidence Bound (UCB) algorithm and randomized results are also implemented.

python machine-learning reinforcement-learning genetic-algorithm bayesian ucb multi-armed-bandit thompson thompson-algorithm