This is a purely client-side React Single Page Application (SPA) designed to mathematically simulate and visually analyze various Multi-Armed Bandit (MAB) reinforcement learning algorithms.
Try it live here: 👉 Live Dashboard on GitHub Pages
This application performs a Monte-Carlo simulation involving 10,000 steps averaged over 100 independent trials. The environment consists of 3 Bandits with the following True Means:
- Bandit A: 0.80 (Optimal)
- Bandit B: 0.70
- Bandit C: 0.50
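The environment above can be sketched as follows. This is a minimal illustration assuming Bernoulli rewards (the README only lists the true means, so the app's actual reward model may differ); `pull` and `estimateMean` are hypothetical names, not the app's API.

```javascript
// Assumed Bernoulli reward model over the three arms listed above.
const TRUE_MEANS = { A: 0.8, B: 0.7, C: 0.5 };

// Pull an arm: returns 1 with probability equal to its true mean, else 0.
function pull(arm) {
  return Math.random() < TRUE_MEANS[arm] ? 1 : 0;
}

// Average many pulls to estimate an arm's mean, Monte-Carlo style.
function estimateMean(arm, pulls) {
  let total = 0;
  for (let i = 0; i < pulls; i++) total += pull(arm);
  return total / pulls;
}
```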
The heavy numerical simulation is offloaded to a background JavaScript Web Worker, keeping the UI thread responsive while millions of randomized calculations run in parallel with rendering.
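The worker hand-off can be sketched like this. The file names, message shape, and `runSimulation` body below are purely illustrative (the real simulation loop is far more involved), not the app's actual contract:

```javascript
// main.js — hypothetical sketch: spawn the worker and receive results
// asynchronously so the UI thread never blocks:
//   const worker = new Worker(new URL('./sim.worker.js', import.meta.url));
//   worker.postMessage({ steps: 10000, trials: 100 });
//   worker.onmessage = (e) => renderCharts(e.data);

// sim.worker.js — run the heavy loop off the main thread.
function runSimulation({ steps, trials }) {
  const avgReward = new Array(steps).fill(0);
  for (let t = 0; t < trials; t++) {
    for (let s = 0; s < steps; s++) {
      avgReward[s] += Math.random() / trials; // placeholder reward draw
    }
  }
  return avgReward;
}

// Guarded so the same module also loads outside a worker context.
if (typeof self !== 'undefined' && typeof self.postMessage === 'function') {
  self.onmessage = (e) => self.postMessage(runSimulation(e.data));
}
```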
- A/B Testing (Explore-then-Exploit)
- Optimistic Initial Values
- $\epsilon$-Greedy
- Softmax (Boltzmann Distribution)
- Upper Confidence Bound (UCB)
- Thompson Sampling (Bayesian)
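To make one of the listed strategies concrete, here is an illustrative UCB1 selection-and-update step; the app's exact exploration constant and tie-breaking may differ, and the function names are mine:

```javascript
// counts[i] = times arm i was pulled, values[i] = running mean reward.
// Pick the arm maximizing mean + confidence bonus (UCB1 rule).
function ucbSelect(counts, values, totalPulls) {
  let best = 0;
  let bestScore = -Infinity;
  for (let i = 0; i < counts.length; i++) {
    if (counts[i] === 0) return i; // pull every arm at least once first
    const bonus = Math.sqrt((2 * Math.log(totalPulls)) / counts[i]);
    const score = values[i] + bonus;
    if (score > bestScore) {
      bestScore = score;
      best = i;
    }
  }
  return best;
}

// Incremental update of the running mean after observing a reward.
function update(counts, values, arm, reward) {
  counts[arm] += 1;
  values[arm] += (reward - values[arm]) / counts[arm];
}
```

The incremental mean update avoids storing reward histories, which matters when 100 trials of 10,000 steps each are simulated in the worker.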
Compares all 6 algorithms across the full 10,000-step budget, tracing their cumulative average return and their % optimal action selection to demonstrate how adaptive exploration methods (like UCB and Thompson Sampling) substantially outperform static A/B testing.
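The "% optimal action" metric can be tracked with a small running counter like the hypothetical sketch below (the app's internal bookkeeping is not shown in this README, so this is an assumption about how such a metric is typically computed):

```javascript
// Records, at each step, the running fraction of pulls that chose
// the optimal arm (Bandit A in this environment).
function makeOptimalTracker(optimalArm) {
  let pulls = 0;
  let optimal = 0;
  return {
    record(arm) {
      pulls += 1;
      if (arm === optimalArm) optimal += 1;
      return optimal / pulls; // running % optimal action selection
    },
  };
}
```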
A dedicated dashboard analyzing the classic explore-then-exploit strategy (Explore: 2,000 steps / Exploit: 8,000 steps):
- Chart 1: A line chart visualizing the volatility and stabilization of the estimated empirical means before exploitation begins.
- Chart 2: A cumulative running average curve with $\pm 1 \text{ SD}$ bounds, clearly demarcating the Phase 1 / Phase 2 border.
- Chart 3: A grouped bar chart comparing the estimated means at step 2,000 against the true means.
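The 2,000 / 8,000 split can be sketched end to end as follows, again assuming Bernoulli arms with the README's true means; the uniform-random exploration policy and function names here are illustrative assumptions:

```javascript
// Assumed Bernoulli arms with the true means listed in this README.
const MEANS = [0.8, 0.7, 0.5];
const pull = (i) => (Math.random() < MEANS[i] ? 1 : 0);

function exploreThenExploit(exploreSteps, exploitSteps) {
  const counts = new Array(MEANS.length).fill(0);
  const sums = new Array(MEANS.length).fill(0);
  // Phase 1: pull arms uniformly at random for the exploration budget.
  for (let s = 0; s < exploreSteps; s++) {
    const i = Math.floor(Math.random() * MEANS.length);
    counts[i] += 1;
    sums[i] += pull(i);
  }
  // At the phase border, commit to the best empirical mean (Chart 3's
  // "belief at step 2000").
  const est = sums.map((sum, i) => (counts[i] ? sum / counts[i] : 0));
  const best = est.indexOf(Math.max(...est));
  // Phase 2: exploit the chosen arm for the remaining budget.
  let reward = sums.reduce((a, b) => a + b, 0);
  for (let s = 0; s < exploitSteps; s++) reward += pull(best);
  return { best, avgReturn: reward / (exploreSteps + exploitSteps) };
}
```

With 2,000 exploration steps the empirical means are tight enough that the committed arm is almost always Bandit A, but the average return still pays a fixed exploration tax that the adaptive methods avoid.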
To run this project locally:
- Clone this repository.
- Install dependencies: `npm install`
- Start the Vite dev server: `npm run dev`
- Open http://localhost:5173 in your browser.
Built with React, Vite, standard Chart.js / react-chartjs-2, and modern CSS.