Value Iteration (exact RL method) implemented in basic Python

piyush2896/ValueIteration-RL


Value Iteration Method - Reinforcement Learning

Value Iteration is an exact method for solving a Reinforcement Learning problem. The goal is to find, for each state s, the expected discounted reward obtainable from s under the best possible policy.

In mathematical notation, we compute the quantity below for every state s in the state set S of a given MDP.

[Image: equation for V*(s)]
Source: UC Berkeley 2017 Deep RL Bootcamp, Lecture 1 slides
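
As a rough illustration of how V*(s) can be computed, here is a minimal sketch of tabular value iteration. It is not the code from this repository. It assumes a hypothetical MDP encoding in which `transitions[state][action]` is a list of `(probability, next_state, reward)` tuples, and it repeatedly applies the Bellman backup V(s) ← max over a of Σ P(s'|s,a) · (R + γ · V(s')) until the values stop changing. The names `value_iteration`, `transitions`, `gamma`, `tol` and `max_iters` are all assumptions made for this example.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical MDP encoding (not taken from the repository):
# transitions[state][action] -> list of (probability, next_state, reward).
# A state with no actions is treated as terminal with value 0.
Transitions = Dict[Hashable, Dict[Hashable, List[Tuple[float, Hashable, float]]]]


def value_iteration(
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iters: int = 10_000,
) -> Dict[Hashable, float]:
    """Return an estimate of V*(s) for every state in `transitions`."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        delta = 0.0
        new_values = {}
        for s, actions in transitions.items():
            if not actions:  # terminal state: nothing more to collect
                new_values[s] = 0.0
                continue
            # Bellman backup: best expected one-step reward plus
            # discounted value of the successor state.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            new_values[s] = best
            delta = max(delta, abs(best - values[s]))
        values = new_values
        if delta < tol:  # values have converged
            break
    return values


# Toy two-state example: from "A", "go" reaches terminal "B" with reward 1,
# while "stay" loops on "A" with reward 0.
toy = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {},
}
toy_values = value_iteration(toy, gamma=0.9)
```

In the toy example, going to "B" immediately is worth 1, while staying first can only yield a discounted version of that reward, so the optimal value of "A" is 1.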

Task at Hand

The task is to maximize reward in a world where an agent can move in 4 directions: North, South, East and West. Actions are noisy: with 20% probability the agent deviates from the requested action, veering left or right of it with equal likelihood (10% each).

[Image: the world]
Source: UC Berkeley 2017 Deep RL Bootcamp, Lecture 1 slides
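
The noisy movement described above can be sketched as a small function that maps a requested action to a distribution over the directions actually taken. This is only an illustration under the reading that the 20% deviation is split evenly between left and right. The names `CLOCKWISE`, `action_distribution` and `noise` are hypothetical and do not come from the repository.

```python
# Compass directions in clockwise order, so "left" of a direction is the
# previous entry and "right" is the next one (wrapping around).
CLOCKWISE = ["N", "E", "S", "W"]


def action_distribution(action: str, noise: float = 0.2) -> dict:
    """Map a requested action to {actual_direction: probability}.

    Assumes the agent goes the requested way with probability 1 - noise
    and veers left or right with probability noise / 2 each.
    """
    i = CLOCKWISE.index(action)
    left = CLOCKWISE[(i - 1) % 4]
    right = CLOCKWISE[(i + 1) % 4]
    return {action: 1.0 - noise, left: noise / 2, right: noise / 2}


# Requesting North: mostly North, sometimes West (left) or East (right).
dist = action_distribution("N")
```

A distribution like this is what a transition model would combine with the grid layout (walls, terminal cells) to produce the next-state probabilities used in the value update.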

Usage

Modify main.json to suit your needs; the key names are self-explanatory. Then run python main.py.

You can also create your own <user-defined>.json file with every parameter defined, and then run python main.py --json_path <user-defined>.json
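
For readers curious how a script might pick up such a file, here is a minimal sketch of reading a JSON config whose path is given by a --json_path option that defaults to main.json. It is an assumption about the general pattern, not the repository's actual main.py. The function name `load_config` is hypothetical, and no particular key names are assumed.

```python
import argparse
import json


def load_config(argv=None) -> dict:
    """Parse --json_path (default: main.json) and return the parsed JSON."""
    parser = argparse.ArgumentParser(
        description="Load settings from a JSON file (illustrative sketch)."
    )
    parser.add_argument(
        "--json_path",
        default="main.json",
        help="path to the JSON file holding the parameters",
    )
    args = parser.parse_args(argv)
    with open(args.json_path, encoding="utf-8") as f:
        return json.load(f)


# Example call with an explicit path (the file name is a placeholder):
# config = load_config(["--json_path", "my_settings.json"])
```

Passing `argv=None` makes argparse read the real command line, so the same function works both from a script and from an interactive session.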
