rl-algorithms

This package contains implementations of reinforcement learning algorithms.

Installation

pip install git+https://github.com/ahadjawaid/rl-algorithms

or

git clone https://github.com/ahadjawaid/rl-algorithms.git

cd rl-algorithms

pip install .

Usage

Dynamic Programming Algorithms

Currently we have the following dynamic programming algorithms:

policy_iteration, value_iteration

Here's an example of using the algorithms:

from rl_algorithms.dp import value_iteration, policy_iteration
import gym

env = gym.make("FrozenLake-v1")

P = env.unwrapped.P

V, policy, _ = value_iteration(P)

V, policy, _ = policy_iteration(P)
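For readers who want to see what value iteration does with a transition table like `P`, here is a minimal sketch. It is not the package's implementation: the function name `value_iteration_sketch`, the parameters `gamma` and `theta`, and the two-value return are assumptions made for illustration. It only assumes that `P[s][a]` is a list of `(prob, next_state, reward, done)` tuples, which is the layout FrozenLake exposes through `env.unwrapped.P`.

```python
import numpy as np


def value_iteration_sketch(P, gamma=0.99, theta=1e-10):
    """Illustrative value iteration (hypothetical helper, not the package API).

    P[s][a] is a list of (prob, next_state, reward, done) tuples.
    """
    n_states = len(P)
    n_actions = len(P[0])
    V = np.zeros(n_states)
    while True:
        # One-step lookahead: expected return of each action in each state.
        Q = np.zeros((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                for prob, next_state, reward, done in P[s][a]:
                    # Terminal transitions contribute no future value.
                    Q[s, a] += prob * (reward + gamma * V[next_state] * (not done))
        new_V = Q.max(axis=1)
        converged = np.max(np.abs(new_V - V)) < theta
        V = new_V
        if converged:
            break
    # Greedy policy with respect to the converged action values.
    policy = Q.argmax(axis=1)
    return V, policy


# A toy two-state MDP: in state 0, action 1 reaches the terminal state 1
# with reward 1; action 0 stays in state 0 with no reward.
toy_P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, True)]},
    1: {0: [(1.0, 1, 0.0, True)], 1: [(1.0, 1, 0.0, True)]},
}
toy_V, toy_policy = value_iteration_sketch(toy_P)
```

On the toy problem the greedy policy picks action 1 in state 0, since moving to the terminal state pays the only reward available.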

Multi-armed Bandits Algorithms

Currently we have the following exploration strategies:

PureExploitation, PureExploration, EpsilonGreedy, EpsilonGreedyLinearDecay, EpsilonGreedyExponentialDecay, OptimisticInitialization, Softmax, UpperConfidenceBound, ThompsonSampling

You can run a strategy over a number of episodes by doing the following:

from rl_algorithms.mab import PureExploitation
import multi_armed_bandits
import gym

env = gym.make("2-Armed-Bandit")

strategy = PureExploitation(env, n_episodes=1000)

action_value, rewards = strategy.run()

PureExploitation episodes: 100% 1000/1000 [00:00<00:00, 124364.11it/s]

If you want a summary of the results generated for you, you can use the following method:

from rl_algorithms.mab import PureExploitation
import multi_armed_bandits

PureExploitation.test_strategy(n_arms=2, seed=42, n_episodes=1000)

PureExploitation episodes: 100% 1000/1000 [00:00<00:00, 85948.85it/s]
True action-value function: [-0.75275929  1.39196365]
Predicted Action-value function: [-1.36461784  1.39664592]
Average episode reward: 1.39
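To give a feel for what a strategy does across episodes, here is a minimal sketch of epsilon-greedy action selection with incremental sample-average estimates. It is not the package's `EpsilonGreedy` class: the function name `epsilon_greedy_sketch`, its parameters, and the unit-variance Gaussian reward model are assumptions made for illustration, and it uses NumPy directly instead of a gym environment.

```python
import numpy as np


def epsilon_greedy_sketch(true_means, n_episodes=1000, epsilon=0.1, seed=42):
    """Illustrative epsilon-greedy bandit loop (hypothetical helper).

    Rewards are assumed to be Gaussian with the given means and unit variance.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    Q = np.zeros(n_arms)             # estimated action values
    N = np.zeros(n_arms, dtype=int)  # pull counts per arm
    rewards = np.empty(n_episodes)
    for episode in range(n_episodes):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            action = int(rng.integers(n_arms))
        else:
            # Exploit: pick the arm with the highest current estimate.
            action = int(np.argmax(Q))
        reward = rng.normal(true_means[action], 1.0)
        # Incremental update of the sample-average estimate.
        N[action] += 1
        Q[action] += (reward - Q[action]) / N[action]
        rewards[episode] = reward
    return Q, rewards


Q, rewards = epsilon_greedy_sketch([-0.75, 1.39])
```

Because the greedy arm is pulled far more often than the other one, its estimate tends to be much closer to the true value, which matches the pattern visible in the `test_strategy` output above.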

