
V1.0.0 #8

Open · wants to merge 5 commits into base: old
86 changes: 70 additions & 16 deletions README.md
@@ -1,54 +1,108 @@
# Decision Genetic Programming

In this project we applied genetic programming to solve OpenAI Gym Environments and compared its performance to RL
models.

# Paper

The paper with the complete evaluations, results, and limitations of this project can be found [here]().


## Installation

```bash
git clone git@github.com:AlekseyKorshuk/YaES.git
cd YaES
pip install -r requirements.txt
```

## Dash application

You can easily evaluate any Gym environment with our dash application. Just run the following command and open the link
in your browser.

```bash
python3 dash_app.py
```

## Demo gym environment

Evaluate the PPO, MultiTree, and Modi agents on the CartPole-v1 environment.

```bash
python3 evaluate.py
```

# Examples

<p float="left">
<img src="https://user-images.githubusercontent.com/70323559/205954264-ef4c999c-1770-4277-98fb-5af888e5f0a0.gif" alt="mountain_car" height="250"/>
<img src="https://user-images.githubusercontent.com/70323559/205955271-b68d18e5-4def-42b2-82d9-51c0fb76e853.gif" alt="cart_pole" height="250"/>
<img src="https://user-images.githubusercontent.com/70323559/205971663-8e056a50-0044-4f7b-b7c1-dbec6ced8809.gif" alt="cart_pole" height="250"/>
</p>

# Explanations

> Why even try?

In most simple games the mapping from a state to an action can be expressed as a closed-form function. This is a
natural application of genetic programming, and we leverage the technique to find the exact formula.

## Single Action Space

Genetic Programming is naturally applicable here. A mathematical formula can be expressed as a tree where the root is
the result of the calculation, internal nodes are operations, and terminal nodes are either input variables (the state
of the game in our case) or functions without variables, such as constants and random number generators.

![image](https://user-images.githubusercontent.com/70323559/205684823-2c7acccd-88ed-4b20-978d-82051a9b15c9.png)

Picture source: [Wikipedia](https://upload.wikimedia.org/wikipedia/commons/7/77/Genetic_Program_Tree.png)
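As an illustration, such a tree can be represented with nested tuples and evaluated recursively against a game state. This is a minimal sketch for the picture above, not the project's actual DEAP-based representation:

```python
import operator

# Internal nodes are operations; terminals are either state-variable
# names or zero-argument callables (constants, random generators).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(node, state):
    if isinstance(node, tuple):        # internal node: (op, left, right)
        op, left, right = node
        return OPS[op](evaluate(left, state), evaluate(right, state))
    if callable(node):                 # terminal without variables
        return node()
    return state[node]                 # input variable from the game state

# The tree for (x * 2.0) + y:
tree = ("add", ("mul", "x", lambda: 2.0), "y")
print(evaluate(tree, {"x": 3.0, "y": 1.0}))  # 7.0
```

Evolution then mutates and recombines such trees, searching for a formula that plays the game well.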

### Decision Making

For binary actions (do or don't) we make a decision by checking whether the output is greater than zero (do) or less
than zero (don't). For continuous actions, such as the speed of a car, we return the output as is.
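In code, this decision rule is just a threshold at zero. A sketch with illustrative function names, not the repository's API:

```python
def binary_action(tree_output: float) -> int:
    # "do" (1) if the tree's output is greater than zero, "don't" (0) otherwise
    return 1 if tree_output > 0 else 0

def continuous_action(tree_output: float) -> float:
    # continuous actions, such as a car's speed, pass through unchanged
    return tree_output

print(binary_action(0.7), binary_action(-2.3))  # 1 0
```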

### Fitness Function

We obtain the fitness by taking the total reward collected while running the agent in a Gym environment.
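Concretely, an individual's fitness is the reward accumulated over one episode. A hedged sketch with a toy stand-in environment (Gym's real `step` also returns an info dict, omitted here for brevity):

```python
class ToyEnv:
    """Stand-in environment: reward 1 per step, episode ends after 3 steps."""
    def reset(self):
        self.t = 0
        return 0.0
    def step(self, action):
        self.t += 1
        return float(self.t), 1.0, self.t >= 3  # state, reward, done

def fitness(agent, env, max_steps=500):
    # Run one episode; the accumulated reward is the individual's fitness.
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(agent(state))
        total_reward += reward
        if done:
            break
    return total_reward

print(fitness(lambda s: 0, ToyEnv()))  # 3.0
```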

## Multiple Action Space

Evolution of the usual tree doesn't scale to games with multiple outputs because a tree returns only a single number.
For that reason, we implemented modified individuals which return a vector of outputs. For discrete games we apply the
argmax function and return the result as the action. In games with continuous actions we return the result unaltered.
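For a discrete game, turning the output vector into an action is a plain argmax (sketch):

```python
def discrete_action(outputs):
    # The index of the largest output becomes the chosen action.
    return max(range(len(outputs)), key=outputs.__getitem__)

print(discrete_action([0.1, 2.5, -0.3]))  # 1
```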

### Modi
[Source of idea](https://www.researchgate.net/publication/228824043_A_multiple-output_program_tree_structure_in_genetic_programming)

Files with implementation:

* `agent/base.py`
* `agent/modi.py`

We implemented this idea with a slight modification. The authors of the above-mentioned paper suggest adding a special
node which passes the result of its calculations to the parent (as usual) but also appends this result to the output
vector. Each such node has an assigned number which specifies the index at which it adds the result.

Instead, we decided to separate these two functions. We add a special node called 'modi{index}' which passes its input
to the parent without changes and adds this input to the output vector. This approach allowed us to simplify the
implementation.
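A minimal sketch of such a `modi{index}` node (illustrative, not the repository's DEAP implementation): it forwards its input unchanged and accumulates it into a shared output vector:

```python
def make_modi(index, output_vector):
    # Returns a unary node: passes x to the parent unchanged and
    # adds x to the output vector at the node's assigned index.
    def modi(x):
        output_vector[index] += x
        return x
    return modi

outputs = [0.0, 0.0]
modi0 = make_modi(0, outputs)
result = modi0(1.5)      # the parent receives 1.5 unchanged
print(result, outputs)   # 1.5 [1.5, 0.0]
```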


### Multi-Tree
[Source of idea](https://github.com/DEAP/deap/issues/491)

Files with implementation:

* `agent/base.py`
* `agent/multi_tree.py`

The idea is to create a bag of trees where each tree is responsible for a specific output index. Thus, for an output
vector of size N we have N populations. To obtain an action, we take the i-th individual from each population, feed it
the state of the game, and collect the outputs.
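In sketch form (hypothetical names, with one callable standing in for each evolved tree):

```python
def multi_tree_action(trees, state):
    # One tree per output index: feed the same state to every tree
    # and collect the outputs into the action vector.
    return [tree(state) for tree in trees]

# Two toy "trees" for a 2-dimensional action space:
trees = [lambda s: s[0] - s[1], lambda s: s[0] + s[1]]
print(multi_tree_action(trees, [2.0, 3.0]))  # [-1.0, 5.0]
```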




2 changes: 1 addition & 1 deletion dash_app.py
@@ -13,7 +13,7 @@
from os.path import isfile, join
import dash_daq as daq
from dash.exceptions import PreventUpdate
from yaes.utils import train_dash
from dgp.utils import train_dash

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
last_run_file = ".last_run"
File renamed without changes.
2 changes: 1 addition & 1 deletion yaes/agent/base.py → dgp/agent/base.py
@@ -1,4 +1,4 @@
from yaes.environment import Environment
from dgp.environment import Environment
from deap import gp
from deap import creator, base, tools, algorithms
import operator
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion yaes/agent/multi_tree.py → dgp/agent/multi_tree.py
@@ -1,6 +1,6 @@
import uuid

from yaes.environment import Environment
from dgp.environment import Environment
from deap import tools
from .deap_primitives import basic_primitive_set
from .base import Agent
@@ -4,7 +4,7 @@
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.monitor import Monitor

from yaes.environment import Environment
from dgp.environment import Environment


class RLAgent:
File renamed without changes.
File renamed without changes.
@@ -1,6 +1,6 @@
import numpy as np

from yaes.environment import Environment
from dgp.environment import Environment


class ContinuousEnvironment(Environment):
@@ -1,6 +1,6 @@
import numpy as np

from yaes.environment import Environment
from dgp.environment import Environment


class DiscreteEnvironment(Environment):
4 changes: 2 additions & 2 deletions yaes/environment/utils.py → dgp/environment/utils.py
@@ -1,7 +1,7 @@
import gym

from yaes.environment.continuous import ContinuousEnvironment
from yaes.environment.discrete import DiscreteEnvironment
from dgp.environment.continuous import ContinuousEnvironment
from dgp.environment.discrete import DiscreteEnvironment


def wrap_env(env: gym.Env):
File renamed without changes.
4 changes: 2 additions & 2 deletions yaes/evaluate/base.py → dgp/evaluate/base.py
@@ -1,6 +1,6 @@
import os
from yaes.environment import Environment
from yaes.train import Trainer
from dgp.environment import Environment
from dgp.train import Trainer
import pandas as pd


File renamed without changes.
4 changes: 2 additions & 2 deletions yaes/train/base.py → dgp/train/base.py
@@ -1,5 +1,5 @@
from yaes.environment import Environment
from yaes.agent.stable_baselines import RLAgent
from dgp.environment import Environment
from dgp.agent.stable_baselines import RLAgent
from stable_baselines3.common.monitor import Monitor


8 changes: 4 additions & 4 deletions yaes/utils.py → dgp/utils.py
@@ -3,10 +3,10 @@
from stable_baselines3 import PPO
from stable_baselines3.common.base_class import BaseAlgorithm

from yaes.agent import RLAgent, multi_tree
from yaes.agent.modi import ModiAgent
from yaes.environment import wrap_env
from yaes.evaluate import Evaluator
from dgp.agent import RLAgent, multi_tree
from dgp.agent.modi import ModiAgent
from dgp.environment import wrap_env
from dgp.evaluate import Evaluator


def dump_results(stats, agent_names=None):
8 changes: 4 additions & 4 deletions evaluate.py
@@ -3,10 +3,10 @@
import gym
from stable_baselines3 import PPO

from yaes.agent import multi_tree, RLAgent
from yaes.agent.modi import ModiAgent
from yaes.environment import wrap_env
from yaes.evaluate import Evaluator
from dgp.agent import multi_tree, RLAgent
from dgp.agent.modi import ModiAgent
from dgp.environment import wrap_env
from dgp.evaluate import Evaluator


def set_seed(seed):
3 changes: 2 additions & 1 deletion requirements.txt
@@ -4,4 +4,5 @@ numpy==1.23.4
dill==0.3.4
dash==2.7.0
dash-daq==0.5.0
git+https://github.com/SyrexMinus/deap_MultiOutputTree.git@MultiOutputTree
pyglet==1.5.27
2 changes: 1 addition & 1 deletion visualize_results.py
@@ -1,6 +1,6 @@
import dill
import gym
from yaes.environment import wrap_env
from dgp.environment import wrap_env
import sys


Binary file removed yaes/agent/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/agent/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/agent/__pycache__/base.cpython-310.pyc
Binary file removed yaes/agent/__pycache__/base.cpython-39.pyc
Binary file removed yaes/environment/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/environment/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/environment/__pycache__/base.cpython-310.pyc
Binary file removed yaes/environment/__pycache__/base.cpython-39.pyc
Binary file removed yaes/evaluate/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/evaluate/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/evaluate/__pycache__/base.cpython-310.pyc
Binary file removed yaes/evaluate/__pycache__/base.cpython-39.pyc
38 changes: 0 additions & 38 deletions yaes/gyms/cart_pole.py

This file was deleted.

Binary file removed yaes/train/__pycache__/__init__.cpython-310.pyc
Binary file removed yaes/train/__pycache__/__init__.cpython-39.pyc
Binary file removed yaes/train/__pycache__/base.cpython-310.pyc
Binary file removed yaes/train/__pycache__/base.cpython-39.pyc