- Pittsburgh vs Michigan approach
- Standard Evolutionary Algorithm
- Structure of the Genome
- Code borrowed from...
- Structure of a Rule
- Flow Chart for Generating Random Logic Formulas
- Scoring of Rules
- Score Updating from the Reinforcement-Learning Perspective
- Running the Code
- Why Does It Fail to Converge?
- Understanding Rete
- Implementation Details
- For Each Game Move (play_1_move function)
- Trying the Rete Demos
- Graphical Interface for Tic Tac Toe
GIRL = Genetic Induction of Relational Rules.
This is my attempt to use genetic programming to learn first-order logic rules to solve the game of Tic Tac Toe.
It also makes use of the Rete production system for logic inference.
So far it has not been successful in solving Tic Tac Toe, but I think it's getting close 🙂
My algorithm is special in that it evolves an entire set of logic rules to play a game, where each rule has its own fitness value. This is the "Michigan" approach (in the "Pittsburgh" approach, by contrast, each individual in the population is a complete rule set). See the excerpt below:
- Initialize population
- Repeat until success:
- Select parents
- Recombine, mutate
- Evaluate
- Select survivors
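The standard evolutionary loop above can be sketched in Python. This is a toy illustration on bitstring genomes with a OneMax fitness, not GIRL's actual code; in GIRL the genomes are rules and the fitness comes from game rewards.

```python
import random

random.seed(0)

def fitness(genome):
    # Toy fitness: count of 1-bits (OneMax); GIRL's real fitness comes from game play
    return sum(genome)

def evolve(pop_size=20, length=16, generations=60):
    # Initialize population
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Select parents (binary tournament)
        def pick():
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            # Recombine (one-point crossover)
            cut = random.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            # Mutate (single bit flip)
            i = random.randrange(length)
            child[i] ^= 1
            children.append(child)
        # Evaluate and select survivors (elitist merge of parents + children)
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```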
- The genome is a set of rules, which evolve co-operatively.
- Each candidate = just one rule.
- Each rule = [ head => tail ]
- Heads and tails are composed from "var" symbols and "const" symbols.
Is it OK for rules to have variable length? Yes, as long as their lengths can decrease during learning.
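A minimal sketch of this genome layout, assuming a simple tuple encoding for atoms; the class and field names here are illustrative, not GIRL's actual code:

```python
import random

random.seed(1)

class Rule:
    # One candidate = one rule (the Michigan approach);
    # the whole population together forms the rule set (genome).
    def __init__(self, head, tail):
        self.head = head          # pre-condition: list of atoms
        self.tail = tail          # post-condition: a single atom
        self.fitness = 0.0        # each rule carries its own fitness value

def random_atom(arity=2):
    # An atom = predicate symbol + arguments, each a "var" or "const" symbol
    pred = random.choice(["P", "Q", "R"])
    args = [random.choice([("var", random.randrange(3)),
                           ("const", random.randrange(3))])
            for _ in range(arity)]
    return (pred, args)

def random_rule(max_head_len=4):
    # Variable-length head; a deletion mutation could later shrink it
    head = [random_atom() for _ in range(random.randint(1, max_head_len))]
    return Rule(head, random_atom())

kb = [random_rule() for _ in range(10)]   # the genome = a set of rules
print(len(kb))
```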
This very simple genetic programming demo is translated from Ruby to Python from the book Clever Algorithms by Jason Brownlee:
Run via (note: always use Python3):
python genetic_programming_[original-demo].py
This code is the predecessor of my code.
- pre-condition => post-condition
- pre-condition = list of positive/negative atoms, followed by an NC part
- NC = NC[ list of atoms... ]
- post-condition = just one positive atom
- literal = atomic proposition optionally preceded by a negation sign
In this version we use rules that are compatible with Rete, consisting only of conjunctions, negations, and negated conjunctions (NCs). NCs can be nested to any depth.
So the general form of a rule is: a conjunction, followed by some negated atoms, followed by a possibly nested NC.
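This rule shape can be captured by a small recursive evaluator. The sketch below is propositional (no variables) for simplicity, and the tuple encoding is an assumption for illustration, not GIRL's actual representation:

```python
def holds(cond, facts):
    """cond is one of:
       ("atom", name)      -- positive atom
       ("neg", name)       -- negated atom
       ("nc", [conds...])  -- negated conjunction; may contain further NCs
    """
    kind = cond[0]
    if kind == "atom":
        return cond[1] in facts
    if kind == "neg":
        return cond[1] not in facts
    if kind == "nc":
        # An NC holds when NOT all of its sub-conditions hold
        return not all(holds(c, facts) for c in cond[1])
    raise ValueError(kind)

def rule_fires(pre_conds, facts):
    # The pre-condition is a conjunction of the parts above
    return all(holds(c, facts) for c in pre_conds)

# Example rule: a AND (NOT b) AND NC[ c, NC[d] ]
rule = [("atom", "a"), ("neg", "b"),
        ("nc", [("atom", "c"), ("nc", [("atom", "d")])])]
print(rule_fires(rule, {"a", "c", "d"}))   # True: the inner NC[d] fails, so the outer NC holds
```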
This flow chart helps to understand the code in GIRL.py:
- For each generation, rules should be allowed to fire plentifully
- Some facts lead to rewards
Once generated, a KB (knowledge base, i.e. a set of rules) would be run over many games:
- For each game, a positive/negative reward would be obtained
- That reward would be assigned to the entire inference chain (with time-discount)
- Over many runs, each candidate rule would accumulate some scores
How the fitness score is calculated for each KB:
- moves are saved during a game
- at game's end, scores are added to or subtracted from those moves (i.e. the logic rules that fired)
- the KB's fitness is simply the average fitness over the entire population of rules
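The scoring scheme above can be sketched as follows. The discount factor and the function names are assumptions for illustration, not GIRL's actual code:

```python
GAMMA = 0.9   # time-discount factor (assumed value)

def credit_moves(fired_rules, reward, scores):
    # fired_rules: the rules that fired, in order; the final move is discounted least
    for steps_back, rule in enumerate(reversed(fired_rules)):
        scores[rule] = scores.get(rule, 0.0) + reward * GAMMA ** steps_back
    return scores

def kb_fitness(scores):
    # The KB's fitness is the average over the whole population of rules
    return sum(scores.values()) / len(scores)

scores = credit_moves(["r1", "r2", "r3"], reward=1.0, scores={})
print(scores["r3"], round(scores["r1"], 2))   # 1.0 0.81
```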
Note: In the inference engine Clara Rules (not used here), chains of inference can be inspected.
- For each inferred post-condition, increment the rule's fire value: rule.fire += ε
- Then, at each time step, the "fire" value of every rule decays.
- At the time of reward, we reward all rules that have recently fired.
- One question: what if a rule recently fired but had no influence on the rewarded conclusion?
- The point is: at least the antecedents can be detected more easily during backward chaining.
- Another problem: what about instantiations? The "fire" values should be recorded per instantiated post-condition of each rule.
- Recording all instantiations of post-conditions may be costly, but there seems to be no alternative.
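This fire-value mechanism resembles eligibility traces in reinforcement learning. A hypothetical sketch, where ε, the decay rate, and the learning rate η are assumed values:

```python
EPS, DECAY, ETA = 1.0, 0.8, 0.1   # assumed constants, not GIRL's actual values

fire = {}    # rule -> "fire" (eligibility) value
score = {}   # rule -> accumulated score

def step(fired_rules, reward=0.0):
    # At each time step, every rule's fire value decays
    for r in fire:
        fire[r] *= DECAY
    # Each rule that just inferred a post-condition gets fire += EPS
    for r in fired_rules:
        fire[r] = fire.get(r, 0.0) + EPS
    # At reward time, credit every recently-fired rule in proportion to its trace
    if reward:
        for r, e in fire.items():
            score[r] = score.get(r, 0.0) + ETA * reward * e

step(["r1"])
step(["r2"])
step([], reward=1.0)
print(round(score["r1"], 3), round(score["r2"], 3))   # 0.064 0.08
```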
Another question is how to express the Bellman Condition or update formula.
- The "state" would be the WM for each inference step.
- The "action" would be the inference post-cond.
- So the Bellman condition says: V(x) = E[ R + γ V(x') ]
- which means we have to establish a value function over the states x = WM contents.
- But this is different from value functions over rules.
- The rules are more like actions taking a state to a new state.
- So why am I evaluating actions instead of states?
- Perhaps it is a kind of Q-learning? Q(a|x).
- Bellman update formula: V(x) += η[ R + γ V(x') - V(x) ]
- for Q-learning: Q(x,a) += η[ R + γ max Q(x',a') - Q(x,a) ]
- for SARSA: Q(x,a) += η[ R + γ Q(x',a') - Q(x,a) ]
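The two update formulas above, written out as tabular code. This is a generic sketch (the state/action encoding, η, and γ are assumptions), not GIRL's implementation:

```python
ETA, GAMMA = 0.5, 0.9   # learning rate and discount (assumed values)
Q = {}                  # (state, action) -> value

def q_update(x, a, r, x_next, actions_next):
    # Q-learning: bootstrap from the best next action
    best_next = max((Q.get((x_next, a2), 0.0) for a2 in actions_next), default=0.0)
    Q[(x, a)] = Q.get((x, a), 0.0) + ETA * (r + GAMMA * best_next - Q.get((x, a), 0.0))

def sarsa_update(x, a, r, x_next, a_next):
    # SARSA: bootstrap from the action actually taken next
    q_next = Q.get((x_next, a_next), 0.0)
    Q[(x, a)] = Q.get((x, a), 0.0) + ETA * (r + GAMMA * q_next - Q.get((x, a), 0.0))

q_update("s0", "move1", 1.0, "s1", ["move1", "move2"])
sarsa_update("s_prev", "a", 1.0, "s0", "move1")
print(Q[("s0", "move1")], round(Q[("s_prev", "a")], 3))   # 0.5 0.725
```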
You can try the current version:
python GIRL.py
The randomly generated logic rules look like this, for example:
where
- grey = conjunction
- green = negated conjunction
- bright green = conclusion
- bright red = conclusion that is also action
The current algorithm fails to converge for Tic-Tac-Toe:
The failure is probably because the current algorithm performs only one inference step per game move. I predict that Tic-Tac-Toe can be solved once we have multi-step inference.
Rete is a notoriously complicated algorithm, although its basic idea is simple: compile the logic rules into a decision-tree-like network, so that rule matching can be performed efficiently.
The PhD thesis [Doorenbos 1995].pdf is also included in this repository. It explains the basic Rete algorithm very clearly and provides pseudo-code. NaiveRete is based on the pseudo-code in this paper, in particular Appendix A.
There is also a paper, originally in French, which explains Rete in more abstract terms and which I have partly translated into English: [Fages and Lissajoux 1992].pdf.
This is an example of a Rete network (with only 1 logic rule):
Rete is like a minimalist logic engine. The version we use here is called NaiveRete, from Github: https://github.com/GNaive/naive-rete.
The original NaiveRete code has a few bugs that I fixed with great pain, and with the help of Doorenbos' thesis.
For our purpose, any inference engine will do. Rete is not necessary; it merely provides faster inference. For example, Genifer 3 is another simple rule engine, while Genifer 6 is based on Rete.
- First, evolve a set of rules, import into Rete
- Run the rules for N iterations, record scores
- Repeat
Rete-related ideas:
- If Rete is used, we may want to learn the Rete network directly
- How to genetically encode a Rete net?
- Perhaps differentiable Rete is a better approach?
- It may be efficient enough to compile to Rete on each GA iteration
- REPEAT: apply rules and collect all results
- Update the Rete Working Memory (WM)
- Select 1 playable result and play it
- Each rule candidate could have multiple instances:
  - Every rule may infer a (non-action) proposition Pᵢ
  - Should we add all the Pᵢ's to the WM?
  - Every rule has its instantiations that should be assumed
    - Why are instantiations different? Because of substitutions into rules.
    - But are these substitutions mutually compatible or exclusive?
    - They seem compatible, e.g. "all men are mortal" => Socrates and Plato are mortal.
  - Can we simply accept all such propositions in the same Working-Memory state?
    - In other words, if head[0] == P then we always add the post-condition to the WM.
- TO-DO: we can iterate the "inference" step multiple times before making an action.
NOTE: When a variable is unbound, we simply assign random values to it; this seems reasonable, as we regard unbound predicates as stochastic.
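The per-move loop above can be sketched as follows. The body of this `play_1_move` is illustrative (the `Rule` class here is a propositional stand-in, not GIRL's or NaiveRete's actual API); it also shows why multi-step inference matters for chained rules:

```python
import random

random.seed(2)

class Rule:
    # Toy propositional rule: if all pre-conditions are in WM, infer the post-condition
    def __init__(self, pre, post):
        self.pre, self.post = pre, post
    def fire_all(self, wm):
        return [self.post] if all(p in wm for p in self.pre) else []

def play_1_move(rules, wm, legal_moves, infer_steps=3):
    # REPEAT: apply rules, collect all results, update the Working Memory
    for _ in range(infer_steps):
        new_facts = set()
        for rule in rules:
            new_facts.update(rule.fire_all(wm))
        wm |= new_facts
    # Select 1 playable result and play it; with nothing playable,
    # the unbound choice is made at random (treated as stochastic)
    playable = sorted(f for f in wm if f in legal_moves)
    return random.choice(playable) if playable else random.choice(sorted(legal_moves))

rules = [Rule(["x_center"], "threat"), Rule(["threat"], "play_corner")]
move = play_1_move(rules, {"x_center"}, {"play_corner", "play_edge"})
print(move)   # play_corner (the two-step chain needs more than one inference step)
```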
Here are some demos:
python genifer.py
python genifer_lover.py
You can also look into the tests/ directory for examples.
To run the Rete tests, first install PyTest via:
pip3 install pytest
And then:
python -m pytest test/*_test.py
The GUI is like this:
It requires PyGame:
sudo apt install python3-pygame
I will prepare a version that does not use a graphical interface.