
V1.0.0 #8

Open · wants to merge 5 commits into base: old
86 changes: 70 additions & 16 deletions README.md
@@ -1,54 +1,108 @@
# Decision Genetic Programming

In this project we applied genetic programming to solve OpenAI Gym Environments and compared its performance to RL
models.

# Paper

The paper with the complete evaluations, results, and limitations of this project can be found [here]().


## Installation

```bash
git clone git@github.com:AlekseyKorshuk/YaES.git
cd YaES
pip install -r requirements.txt
```

## Dash application

You can easily evaluate any Gym environment with our dash application. Just run the following command and open the link
in your browser.

```bash
python3 dash_app.py
```

## Demo gym environment

Evaluate the PPO, MultiTree, and Modi agents on the CartPole-v1 environment.

```bash
python3 evaluate.py
```

# Examples

<p float="left">
<img src="https://user-images.githubusercontent.com/70323559/205954264-ef4c999c-1770-4277-98fb-5af888e5f0a0.gif" alt="mountain_car" height="250"/>
<img src="https://user-images.githubusercontent.com/70323559/205955271-b68d18e5-4def-42b2-82d9-51c0fb76e853.gif" alt="cart_pole" height="250"/>
<img src="https://user-images.githubusercontent.com/70323559/205971663-8e056a50-0044-4f7b-b7c1-dbec6ced8809.gif" alt="cart_pole" height="250"/>
</p>

# Explanations

> Why even try?

In most simple games the mapping from a state to an action can be expressed as a closed-form function. This is a
natural application of genetic programming, and we leverage the technique to find the exact formula.

## Single Action Space

Genetic Programming is naturally applicable here. A mathematical formula can be expressed as a tree where the root is
the result of the calculation, internal nodes are operations, and terminal nodes are either input variables (the state
of the game in our case) or functions without variables, such as constants and random number generators.

![image](https://user-images.githubusercontent.com/70323559/205684823-2c7acccd-88ed-4b20-978d-82051a9b15c9.png)

Picture source: [Wikipedia](https://upload.wikimedia.org/wikipedia/commons/7/77/Genetic_Program_Tree.png)
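As an illustration, such a tree can be represented with nested tuples and evaluated recursively against a game state. This is a minimal sketch for the picture above, not the project's actual DEAP-based representation:

```python
import operator

# Internal nodes are operations; terminals are either state-variable
# names or zero-argument callables (constants, random generators).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(node, state):
    if isinstance(node, tuple):        # internal node: (op, left, right)
        op, left, right = node
        return OPS[op](evaluate(left, state), evaluate(right, state))
    if callable(node):                 # terminal without variables
        return node()
    return state[node]                 # input variable from the game state

# The tree for (x * 2.0) + y:
tree = ("add", ("mul", "x", lambda: 2.0), "y")
print(evaluate(tree, {"x": 3.0, "y": 1.0}))  # 7.0
```

Evolution then mutates and recombines such trees, searching for a formula that plays the game well.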

### Decision Making

For binary actions (do or don't) we make a decision by checking whether the output is greater than zero (do) or less
than zero (don't). For continuous actions, such as the speed of a car, we return the output as is.
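In code, this decision rule is just a threshold at zero. A sketch with illustrative function names, not the repository's API:

```python
def binary_action(tree_output: float) -> int:
    # "do" (1) if the tree's output is greater than zero, "don't" (0) otherwise
    return 1 if tree_output > 0 else 0

def continuous_action(tree_output: float) -> float:
    # continuous actions, such as a car's speed, pass through unchanged
    return tree_output

print(binary_action(0.7), binary_action(-2.3))  # 1 0
```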

### Fitness Function

We obtain the fitness by taking the total reward collected while running the agent in a Gym environment.
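Concretely, an individual's fitness is the reward accumulated over one episode. A hedged sketch with a toy stand-in environment (Gym's real `step` also returns an info dict, omitted here for brevity):

```python
class ToyEnv:
    """Stand-in environment: reward 1 per step, episode ends after 3 steps."""
    def reset(self):
        self.t = 0
        return 0.0
    def step(self, action):
        self.t += 1
        return float(self.t), 1.0, self.t >= 3  # state, reward, done

def fitness(agent, env, max_steps=500):
    # Run one episode; the accumulated reward is the individual's fitness.
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(agent(state))
        total_reward += reward
        if done:
            break
    return total_reward

print(fitness(lambda s: 0, ToyEnv()))  # 3.0
```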

## Multiple Action Space

Evolution of the usual tree doesn't scale to games with multiple outputs because a tree returns only a single number.
For that reason, we implemented modified individuals which return a vector of outputs. For discrete games we apply the
argmax function and return the result as the action. In games with continuous actions we return the result unaltered.
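For a discrete game, turning the output vector into an action is a plain argmax (sketch):

```python
def discrete_action(outputs):
    # The index of the largest output becomes the chosen action.
    return max(range(len(outputs)), key=outputs.__getitem__)

print(discrete_action([0.1, 2.5, -0.3]))  # 1
```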

### Modi
[Source of idea](https://www.researchgate.net/publication/228824043_A_multiple-output_program_tree_structure_in_genetic_programming)

Files with implementation:

* `agent/base.py`
* `agent/modi.py`

We implemented this idea with a slight modification. The authors of the above-mentioned paper suggest adding a special
node which passes the result of its calculations to the parent (as usual) but also appends this result to the output
vector. Each such node has an assigned number which specifies the index at which it adds the result.

Instead, we decided to separate these two functions. We add a special node called 'modi{index}' which passes its input
to the parent without changes and adds this input to the output vector. This approach allowed us to simplify the
implementation.
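A minimal sketch of such a `modi{index}` node (illustrative, not the repository's DEAP implementation): it forwards its input unchanged and accumulates it into a shared output vector:

```python
def make_modi(index, output_vector):
    # Returns a unary node: passes x to the parent unchanged and
    # adds x to the output vector at the node's assigned index.
    def modi(x):
        output_vector[index] += x
        return x
    return modi

outputs = [0.0, 0.0]
modi0 = make_modi(0, outputs)
result = modi0(1.5)      # the parent receives 1.5 unchanged
print(result, outputs)   # 1.5 [1.5, 0.0]
```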


### Multi-Tree
[Source of idea](https://github.com/DEAP/deap/issues/491)

Files with implementation:

* `agent/base.py`
* `agent/multi_tree.py`

The idea is to create a bag of trees where each tree is responsible for a specific output index. Thus, for an output
vector of size N we have N populations. To obtain an action, we take the i-th individual from each population, feed it
the state of the game, and collect the outputs.
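In sketch form (hypothetical names, with one callable standing in for each evolved tree):

```python
def multi_tree_action(trees, state):
    # One tree per output index: feed the same state to every tree
    # and collect the outputs into the action vector.
    return [tree(state) for tree in trees]

# Two toy "trees" for a 2-dimensional action space:
trees = [lambda s: s[0] - s[1], lambda s: s[0] + s[1]]
print(multi_tree_action(trees, [2.0, 3.0]))  # [-1.0, 5.0]
```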




2 changes: 1 addition & 1 deletion dash_app.py
@@ -13,7 +13,7 @@
from os.path import isfile, join
import dash_daq as daq
from dash.exceptions import PreventUpdate
from yaes.utils import train_dash
from dgp.utils import train_dash

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
last_run_file = ".last_run"
File renamed without changes.
2 changes: 1 addition & 1 deletion yaes/agent/base.py → dgp/agent/base.py
@@ -1,4 +1,4 @@
from yaes.environment import Environment
from dgp.environment import Environment
from deap import gp
from deap import creator, base, tools, algorithms
import operator
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion yaes/agent/multi_tree.py → dgp/agent/multi_tree.py
@@ -1,6 +1,6 @@
import uuid

from yaes.environment import Environment
from dgp.environment import Environment
from deap import tools
from .deap_primitives import basic_primitive_set
from .base import Agent
@@ -4,7 +4,7 @@
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.monitor import Monitor

from yaes.environment import Environment
from dgp.environment import Environment


class RLAgent:
File renamed without changes.
File renamed without changes.
@@ -1,6 +1,6 @@
import numpy as np

from yaes.environment import Environment
from dgp.environment import Environment


class ContinuousEnvironment(Environment):
@@ -1,6 +1,6 @@
import numpy as np

from yaes.environment import Environment
from dgp.environment import Environment


class DiscreteEnvironment(Environment):
4 changes: 2 additions & 2 deletions yaes/environment/utils.py → dgp/environment/utils.py
@@ -1,7 +1,7 @@
import gym

from yaes.environment.continuous import ContinuousEnvironment
from yaes.environment.discrete import DiscreteEnvironment
from dgp.environment.continuous import ContinuousEnvironment
from dgp.environment.discrete import DiscreteEnvironment


def wrap_env(env: gym.Env):
File renamed without changes.
4 changes: 2 additions & 2 deletions yaes/evaluate/base.py → dgp/evaluate/base.py
@@ -1,6 +1,6 @@
import os
from yaes.environment import Environment
from yaes.train import Trainer
from dgp.environment import Environment
from dgp.train import Trainer
import pandas as pd


File renamed without changes.
4 changes: 2 additions & 2 deletions yaes/train/base.py → dgp/train/base.py
@@ -1,5 +1,5 @@
from yaes.environment import Environment
from yaes.agent.stable_baselines import RLAgent
from dgp.environment import Environment
from dgp.agent.stable_baselines import RLAgent
from stable_baselines3.common.monitor import Monitor


8 changes: 4 additions & 4 deletions yaes/utils.py → dgp/utils.py
@@ -3,10 +3,10 @@
from stable_baselines3 import PPO
from stable_baselines3.common.base_class import BaseAlgorithm

from yaes.agent import RLAgent, multi_tree
from yaes.agent.modi import ModiAgent
from yaes.environment import wrap_env
from yaes.evaluate import Evaluator
from dgp.agent import RLAgent, multi_tree
from dgp.agent.modi import ModiAgent
from dgp.environment import wrap_env
from dgp.evaluate import Evaluator


def dump_results(stats, agent_names=None):
8 changes: 4 additions & 4 deletions evaluate.py
@@ -3,10 +3,10 @@
import gym
from stable_baselines3 import PPO

from yaes.agent import multi_tree, RLAgent
from yaes.agent.modi import ModiAgent
from yaes.environment import wrap_env
from yaes.evaluate import Evaluator
from dgp.agent import multi_tree, RLAgent
from dgp.agent.modi import ModiAgent
from dgp.environment import wrap_env
from dgp.evaluate import Evaluator


def set_seed(seed):
3 changes: 2 additions & 1 deletion requirements.txt
@@ -4,4 +4,5 @@ numpy==1.23.4
dill==0.3.4
dash==2.7.0
dash-daq==0.5.0
git+https://github.com/SyrexMinus/deap_MultiOutputTree.git@MultiOutputTree
pyglet==1.5.27
2 changes: 1 addition & 1 deletion visualize_results.py
@@ -1,6 +1,6 @@
import dill
import gym
from yaes.environment import wrap_env
from dgp.environment import wrap_env
import sys


Binary file removed yaes/agent/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/agent/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/agent/__pycache__/base.cpython-310.pyc
Binary file removed yaes/agent/__pycache__/base.cpython-39.pyc
Binary file removed yaes/environment/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/environment/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/environment/__pycache__/base.cpython-310.pyc
Binary file removed yaes/environment/__pycache__/base.cpython-39.pyc
Binary file removed yaes/evaluate/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/evaluate/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/evaluate/__pycache__/base.cpython-310.pyc
Binary file removed yaes/evaluate/__pycache__/base.cpython-39.pyc
38 changes: 0 additions & 38 deletions yaes/gyms/cart_pole.py

This file was deleted.

Binary file removed yaes/train/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/train/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/train/__pycache__/base.cpython-310.pyc
Binary file removed yaes/train/__pycache__/base.cpython-39.pyc