ucb

Implementations of basic concepts under the Reinforcement Learning umbrella. This project is a collection of assignments from CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay.

reinforcement-learning linear-programming thompson-sampling epsilon-greedy ucb policy-evaluation mdps multi-armed-bandits policy-iteration randomised-algorithms reinforcement-learning-excercises kl-divergence markovian-epidemic-processes reinforcement-learning-analysis multiarm-bandit ucb1 howards-pi batch-switching randomized-policy-iteration

Updated May 21, 2018
Python

ishank-juneja / Correlated-AoI-Bandits

Author's implementation of the paper Correlated Age-of-Information Bandits.

thompson-sampling ucb multi-armed-bandit aoi age-of-information correlated-multi-armed-bandits correlated-arms aoi-regret

Updated Jun 19, 2021
Python

idanmoradarthas / MutiArmedBandit-DeepLearning

Multi-armed bandit algorithm with TensorFlow and 11 policies

tensorflow deep-reinforcement-learning python3 ucb multi-armed-bandit epsilon softmax

Updated Dec 27, 2022
Python

csfive / CS61A

🚧

python cs61a sicp cs ucb

Updated Apr 28, 2024
Python

annieyan / Bandits-using-UCB-algorithm

Thompson Sampling for Bandits using UCB policy

reinforcement-learning thompson-sampling ucb bandits

Updated Jul 29, 2017
Python

rudrajit1729 / Machine-Learning-Codes-And-Templates

Code and templates for ML algorithms, created, modified, and optimized in Python and R.

feature-selection datascience feature-extraction thompson-sampling dimensionality-reduction ucb ann regression-models nlp-machine-learning kmeans-clustering apriori-algorithm hierarchical-clustering classification-algorithims parameter-tuning regression-algorithms xgboost-model kfold-cross-validation cnn-classification eclat-algorithm

Updated Mar 28, 2020
Python

Suchetaaa / CS747-Assignments

Foundations Of Intelligent Learning Agents (FILA) Assignments

reinforcement-learning monte-carlo linear-programming thompson-sampling ucb bootstrapping multi-armed-bandits bellman-equation temporal-differencing-learning howards-pi sarsa-learning kl-ucb windy-gridworld intelligent-learning-agents

Updated Nov 8, 2019
Python

woctezuma / puissance4

AI for the game "Connect Four". Available on PyPI.

Updated Mar 14, 2024
Python

MaxenceGiraud / ucb-nonstationary

On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems

ucb multi-armed-bandits non-stationary-bandit discounted-ucb sliding-ucb

Updated Oct 7, 2022
Python

amait41 / Hex-Game

Python implementation of the Hex game with AI based on MC and MCTS methods. Interactive mode with pygame.

game python hex reinforcement-learning ai ucb

Updated Mar 11, 2023
Python

salimandre / Monte-Carlo-Tree-Search

We implemented Monte Carlo Tree Search (MCTS) from scratch and successfully applied it to the game of Tic-Tac-Toe.

reinforcement-learning graphics mcts ucb monte-carlo-tree-search tic-tac-toe-game upper-confidence-bound

Updated Jul 9, 2020
Python

sarthakmittal92 / multi-armed-bandits

Repository for the course project done as part of the CS-747 (Foundations of Intelligent & Learning Agents) course at IIT Bombay in Autumn 2022.

python thompson-sampling reinforcement-learning-algorithms ucb multi-armed-bandits bandits kl-ucb

Updated Oct 14, 2022
Python

erdogant / thompson

Thompson is a Python package to evaluate the multi-armed bandit problem. In addition to Thompson sampling, the Upper Confidence Bound (UCB) algorithm and randomized results are also implemented.

python machine-learning reinforcement-learning genetic-algorithm bayesian ucb multi-armed-bandit thompson thompson-algorithm