Game-Theory-Final-Project

Overview

algos.py

Contains the implementation of all the RL algorithms. Each algorithm comes in two versions:

  • One version using tables for the value functions
  • One version using neural networks to approximate the value functions
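As a minimal sketch of what a tabular variant looks like, here is a self-contained Q-learning agent. The class and method names (`TabularQLearning`, `act`, `update`) are illustrative assumptions, not the actual API of algos.py:

```python
import random
from collections import defaultdict

class TabularQLearning:
    """Minimal tabular Q-learning sketch (illustrative; algos.py may differ)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # maps (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else pick the greedy action.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])
```

The neural-network versions replace the `(state, action)` table with a function approximator trained on the same temporal-difference target.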

agent.py

Contains the implementation of the agent used in the 5 experiments.

experimentConf.py

Contains all the parameters used for the algorithms and ensemble methods in each experiment.

mazes.py

Contains the implementations of all the environments (mazes) needed for the 5 experiments, as well as functions to generate such environments.

clusterRunner.py

Script used to run the experiments on a cluster and distribute the trials across multiple cores.

Tasks

Neural nets

  • Q-learning
  • SARSA
  • Actor-Critic
  • QV-learning
  • ACLA

Belief State

  • Belief State
  • Maze observations

Experiments

  • Exp 1 (Simple maze + base algo)
  • Exp 2 (Partially observable maze + neural net)
  • Exp 3 (Dynamic obstacles maze + neural net)
  • Exp 4 (Dynamic Goal maze + neural net)
  • Exp 5 (Generalized maze + neural net)

Base Algorithms

  • Q-learning
  • SARSA
  • Actor-Critic
  • QV-learning
  • ACLA

Ensemble methods

  • Majority voting
  • Rank voting
  • Boltzmann multiplication
  • Boltzmann addition
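Two of the ensemble methods above can be sketched as follows. This is a hedged illustration of the general techniques (majority voting and Boltzmann multiplication), not the project's actual implementation; the function names and the `temperature` parameter are assumptions:

```python
import math
from collections import Counter

def majority_vote(greedy_actions):
    """Each base algorithm votes for its greedy action; the most-voted action wins."""
    return Counter(greedy_actions).most_common(1)[0][0]

def boltzmann_multiplication(value_lists, temperature=1.0):
    """Multiply each algorithm's Boltzmann (softmax) action probabilities, then renormalize.

    value_lists: one list of action values per base algorithm, all the same length.
    Returns a combined probability distribution over actions.
    """
    n_actions = len(value_lists[0])
    probs = [1.0] * n_actions
    for values in value_lists:
        exps = [math.exp(v / temperature) for v in values]
        z = sum(exps)
        for a in range(n_actions):
            probs[a] *= exps[a] / z
    z = sum(probs)
    return [p / z for p in probs]
```

Boltzmann addition works the same way but sums the per-algorithm probabilities instead of multiplying them; rank voting replaces raw values with per-algorithm action ranks before voting.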

Environments

  • Simple Dyna maze (9x6)
  • Dyna maze with Dynamic Goal (9x6)
  • Dyna maze with dynamic obstacles (9x6)
  • Generalized maze (9x6)
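A 9x6 grid-world of this kind can be sketched in a few lines. The layout, reward scheme, and interface below are illustrative assumptions (the actual mazes.py may place walls, start, and goal differently):

```python
class SimpleMaze:
    """Minimal 9x6 grid-world sketch (illustrative; mazes.py may differ)."""

    MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, width=9, height=6, start=(0, 3), goal=(8, 0), walls=frozenset()):
        self.width, self.height = width, height
        self.start, self.goal, self.walls = start, goal, walls
        self.pos = start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        dx, dy = self.MOVES[action]
        x, y = self.pos[0] + dx, self.pos[1] + dy
        # Moves into walls or off the grid leave the agent in place.
        if 0 <= x < self.width and 0 <= y < self.height and (x, y) not in self.walls:
            self.pos = (x, y)
        done = self.pos == self.goal
        return self.pos, (1.0 if done else 0.0), done
```

The dynamic-goal and dynamic-obstacle variants change `goal` or `walls` between (or during) episodes, which is what makes the ensemble comparison across the 5 experiments interesting.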