Python utilities to compute a lower bound of the expected sample complexity to identify the best arm in a bandit model (Python, updated Sep 8, 2021)
A collection of interesting papers that I have read or plan to read. Note that the list is not up to date. Topics: reinforcement learning, deep learning, mathematics, statistics, bandit algorithms, optimization.
Randomized Greedy Learning Under Full-bandit Feedback
This repository contains code for the course CS780: Deep Reinforcement Learning
Several multi-armed bandit strategies with additional holding option for smoother exploration.
Reinforcement Learning (COMP 579) Project
💫 Fast Julia implementation of various Kullback-Leibler divergences for 1D parametric distributions. 🏋 Also provides optimized code for kl-UCB indexes
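The kl-UCB index mentioned above is a standard construction: for an arm with empirical mean `mean` pulled `pulls` times at round `t`, the index is the largest plausible mean `q` such that `pulls * kl(mean, q) <= log(t)`. A minimal Python sketch for the Bernoulli case, found by bisection (this is an illustration of the general technique, not the linked Julia repository's code; the function names are my own):

```python
import math

def kl_bernoulli(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, precision=1e-6):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= log(t), via bisection.

    kl_bernoulli(mean, q) is increasing in q for q >= mean, so bisection applies.
    """
    level = math.log(t) / pulls
    lo, hi = mean, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= level:
            lo = mid  # mid is still plausible; push the index up
        else:
            hi = mid
    return lo
```

The index shrinks toward the empirical mean as `pulls` grows relative to `log(t)`, which is what drives exploration in kl-UCB.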
A collection of Google Colab notebooks with educational material about bandits and their variations
An illustrative project including some multi-armed bandit algorithms and contextual bandit algorithms
Bandit and Evolutionary Algorithms using Python
🎩🤠Some Bandit Algorithms in Typescript
Official code for an ICML 2024 paper
Ads click-through rate optimization using Thompson Sampling
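For the click-through-rate setting above, Thompson Sampling with a Beta-Bernoulli model is the standard approach: keep a Beta posterior per ad, sample a plausible CTR from each, and show the ad with the highest sample. A minimal sketch (my own illustrative code with hypothetical click probabilities, not the linked repository's implementation):

```python
import random

def thompson_sampling_ctr(click_probs, n_rounds=10000, seed=0):
    """Beta-Bernoulli Thompson Sampling over simulated ads.

    click_probs: hypothetical true click probabilities, one per ad.
    Returns how many times each ad was shown.
    """
    rng = random.Random(seed)
    n = len(click_probs)
    successes = [1] * n  # Beta(1, 1) uniform prior per ad
    failures = [1] * n
    pulls = [0] * n
    for _ in range(n_rounds):
        # Draw one plausible CTR per ad from its posterior
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        arm = max(range(n), key=lambda i: samples[i])
        # Simulate whether the shown ad was clicked
        if rng.random() < click_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls
```

As the posteriors concentrate, the sampled CTRs for clearly worse ads rarely exceed the best ad's, so exploration tapers off automatically.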
Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms
An implementation of the N-Tuple Bandits Evolutionary Algorithm.
Programming assignments completed for my Reinforcement Learning course. Topics include bandit algorithms, dynamic programming, policy iteration, Monte Carlo methods, SARSA, Q-learning, Dyna-Q/Dyna-Q+, gradient control methods, state aggregation methods, and Deep Q-Networks (DQNs).
Pricing and advertising strategy for the e-commerce of an airline company, based on multi-armed bandit (MAB) algorithms and Gaussian Processes. Simulations include non-stationary environments.
🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"
Implementation for NeurIPS 2020 paper "Locally Differentially Private (Contextual) Bandits Learning" (https://arxiv.org/abs/2006.00701)