Comparing changes

base: d756feabd3
...
compare: 6b0b273cba
  • 6 commits
  • 4 files changed
  • 0 commit comments
  • 1 contributor
Showing with 92 additions and 67 deletions.
  1. +19 −27 README.rdoc
  2. +18 −14 bootcamp.rb
  3. +36 −3 darwin.rb
  4. +19 −23 player.rb
46 README.rdoc
@@ -1,30 +1,22 @@
= Evolved Neural Network solution to RubyWarrior
-This is an attempt at solving Ryan Bates' RubyWarrior using an evolved Neural Network. My aim was to build a solution that was not a 'state machine' which used designed 'if then else' logic to respond to situations but rather an Artificial Intelligence which had figured out the correct responses to situations on its own.
+This is an attempt at solving Ryan Bates' RubyWarrior using an evolved Neural Network. My aim was to build a solution that was not a 'state machine' using designed 'if then else' logic to respond to situations, but rather an Artificial Intelligence which had figured out the correct responses to situations on its own (with some guided prodding).
-I used Darwin's LAW* of selection to evolve the genomes of a population of Neural Networks (NN's) from initially random genomes and eventually arrived at a genome able to get a grade A score on the beginner tower in epic mode. Evolving the solution took many thousands of generations and I did a number of things to speed this up, see 'Evolving the Solution' for more details. I have not attempted to use this approach on the intermediate tower, one day....
+I used Darwin's LAW* of selection to evolve the genomes of a population of Neural Networks (NN's) from initially random genomes and eventually arrived at a genome able to get a grade A score on the beginner tower in epic mode. I have not attempted to use this approach on the intermediate tower, one day....
Thanks to Ryan for writing RubyWarrior, it was great fun writing this solution.
==The Solution
-
-The solution's code does not contain any human designed responses. The human design is in parsing the rubywarrior world on each turn and presenting that to the network and in interpreting the network's output into a command which can be passed to the warrior. The decision of what to do on each turn is entirely up to the neural network. I have no idea what the NN's are thinking!
-
-I wanted to produce a genome which could do well in epic mode and also be able to work up the levels in normal mode from the start. So far I have one genome which can run epic mode and one which can steadily climb up the tower from the start in normal mode up to level 7. Levels 8 and 9 in normal mode where evolved separately and I've not been able to cross breed them with the 1-7genomes yet.
-
-The first epic solution I evolved was a trigger happy nut which made it to the top but shot all the captives. By cross breeding it with a more captive friendly genome from level 5 I got one which does rescue captives, but only if it is also injured! Further evolution has improved on this but some captives still get shot. This genome gets a grade A score on epic but it could do better so I'm continuing to evolve it.
+The solution uses a 'genome' file to define the 'weights' of a neural network which responds to inputs on each turn. The behaviour of the warrior is entirely dependent on the genome and the solution's code does not contain any human designed responses. The code for the solution (player.rb and brains.rb) takes inputs from the level, passes them to the NN and then translates its output into a command for the warrior.
===Running the solution
To run the solution copy the files 'brains.rb' and 'genome' along with my player.rb into the warrior directory.
-The genome in the genome file was evolved to solve rubywarrior in normal mode and will get as far as level 7 (and has some nice behaviours).
-Once at level 8 copy the genome_epic file into the folder and rename it to genome. This genome was evolved in epic mode and will complete normal mode levels 8(badly) and 9(Sgrade) and then all of the levels on epic. It's the current best, I hope to get it a bit better.
-
+The genome in that file will solve rubywarrior in normal mode but can't pass level 8, although it has some nice behaviours. Once at level 8, use the genome in the file 'genome_epic' (renaming it to 'genome'). That genome will complete normal mode levels 8 (badly) and 9 (S grade) and then all of the levels on epic. It's the current best; I hope to get it a bit better.
-The other files 'darwin.rb' and 'bootcamp.rb' are not needed to run the solution. They where used to evolve the solution, see 'Evolving the Solution' for more. To get everything in one go run this inside the folder for a rubywarrior;
- git clone /home/sujimichi/coding/lab/ruby_warrior_NN_solution && sh ruby_warrior_NN_solution/init.sh
+The code in brains.rb has extra code for using NN's with different numbers of layers. There is a cut down version without all the extra code/comments, just the bare essentials to run the solution (~100 lines) and the two genomes (https://gist.github.com/2795962). The other files 'darwin.rb' and 'bootcamp.rb' are not needed to run the solution. They were used to evolve the solution, see 'Evolving the Solution' for more.
@@ -32,13 +24,13 @@ The other files 'darwin.rb' and 'bootcamp.rb' are not needed to run the solution
Player#initialize loads the 'genome' file and uses this to initialize its Brain (neural network).
* In each turn :play_turn first 'senses' the world as an 'input array'.
-* The input array is passed to the :act_on method of the brain which calls the neural network to calculate its response to the inputs. The networks response is interpreted and returned as an 'action' and an 'impulse', ie: [:walk, :forward]
+* The input array is passed to the :act_on method of the brain which calls the neural network to calculate its response to the inputs. The network's response is interpreted and returned as an 'action' and an 'impulse', ie: [:walk, :forward] or [:attack, :backward]
* Finally the action and impulse are passed to the warrior.
====Inputs
-The input array has a mapping of warror.feel and warror.look (when available) in all directions as an Array of values. The first four values are for 'walls' in each direction, left, forward, right, backward respectivly. The value is 1 if the warrior can feel a wall in the given direction. The next four values represent 'enemies' of all kinds (with the same order for directions) and the next for are for 'captives'.
+The input array has a mapping of warrior.feel and warrior.look (when available) in all directions as an Array of values. The first four values are for 'walls' in each direction: left, forward, right, backward respectively. The value is 1 if the warrior can feel a wall in the given direction, otherwise 0. The next four values represent 'enemies' of all kinds and the next four are for 'captives' (all with the same order of directions).
-If the warrior can just :feel the first 12 values might be;
+For example, if the warrior can just :feel, the first 12 values might be;
-----------
| @a >| => [1,0,1,0, 0,1,0,0, 0,0,0,0]
-----------
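The mapping above can be sketched in Ruby. This is only an illustration of the described layout, not the actual player.rb code; the `feelings` hash is a made-up stand-in for the results of warrior.feel:

```ruby
# Hypothetical sketch of the feel-only input mapping; the real code lives in
# player.rb. feelings[thing][dir] stands in for warrior.feel(dir).thing?
DIRS   = [:left, :forward, :right, :backward]
THINGS = [:wall, :enemy, :captive]

def feel_inputs(feelings)
  THINGS.flat_map do |thing|
    DIRS.map { |dir| feelings.fetch(thing, {})[dir] ? 1 : 0 }
  end
end

# The "| @a >|" situation: walls to the left and right, an enemy ahead.
feel_inputs(:wall => {:left => true, :right => true}, :enemy => {:forward => true})
#=> [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```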
@@ -50,11 +42,11 @@ For example;
| @ a >| => [1,0,1,0.3, 0,0.6,0,0, 0,0,0,0]
-----------
-The input array also has values for armed, health and taking_fire.
+The input array also has values for armed, health and health reducing.
* armed gives the neural net a sense of whether it can shoot. 1 for can shoot, otherwise 0.
* health is the current health, scaled between 0 and 1 with 0 for full, 1 for dead. If the warrior does not respond to health it returns 0.
-* taking_fire returns 1 if the previous health is less than current.
+* 'health reducing' returns 1 if the current health is less than the previous turn's health, otherwise 0.
There is one final input which is a constant 1 in all cases (representational bias)
The full set of 16 inputs might be something like;
@@ -66,29 +58,29 @@ The full set of 16 inputs might be something like;
-----------
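The four non-directional inputs just listed can be sketched like this (the method name is made up for this example; the real logic lives in player.rb):

```ruby
# Illustrative sketch of the four non-directional inputs, mirroring the
# logic described above (and implemented in player.rb).
def status_inputs(armed, health, previous_health)
  [
    armed ? 1 : 0,                     # can the warrior shoot?
    (1 - 1.0 / 20 * health).round(2),  # 0.0 = full health (20), 1.0 = dead (0)
    previous_health > health ? 1 : 0,  # did health drop since last turn?
    1                                  # constant bias input
  ]
end

status_inputs(true, 15, 20) #=> [1, 0.25, 1, 1]
```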
====The Brain
-The Brain class has a method :act_on which takes the input array and passes them to an instance an instance of NeuralNetwork. The network calculates its response which is returned as an array of 8 values. These eight values or nodes represent different actions on the warrior. The first two nodes select an 'impulse' while the others select an 'action'. The first node defines the forward and backward impulse; positive for forward and negative for backward, the second node defines the impulse for left and right, +ive for left, -ive for right. If the first nodes absolute value is greater than the seconds it takes the forward/backward impulse, otherwise it takes the left/right impulse.
+The Brain class has a method :act_on which takes the input array and passes it to an instance of NeuralNetwork. The network calculates its response which is returned as an array of 8 values. These eight values or nodes represent different actions on the warrior.
+
+The first 2 nodes select an 'impulse' while the other 6 select an 'action'. The first node defines the forward and backward impulse: positive for forward, negative for backward; the second node defines the impulse for left and right: positive for left, negative for right. If the first node's absolute value is greater than the second's, it takes the forward/backward impulse, otherwise it takes the left/right impulse.
The other 6 nodes each represent an action; :walk, :attack, :rest, :rescue, :pivot, :shoot respectively (the order was not important). Whichever of these nodes has the strongest stimulation (highest value) will be the action taken.
-The evolution of the neural network will have to work out which node results in what outcome and 'realise' which actions it can use as all nodes are present throughout all the levels. It's therefore possible for the brain to request actions which are not yet available, it will have to evolve not to as this would result in no activity.
+I defined which node is 'connected' to which action; the evolution of the neural network will have to work out which node results in what outcome. It will also have to 'realise' which actions it can use, as all nodes are present throughout all the levels. It's therefore possible for the brain to request actions which are not yet available; it will have to evolve not to, as this results in no activity.
The resultant output is an array of an action and an impulse ie; [:walk, :forward] or [:attack, :backward].
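The selection rules above can be sketched in a few lines of Ruby (an illustration of the described node ordering, not the actual brains.rb code):

```ruby
# Sketch of interpreting the 8 output nodes: nodes 0-1 pick the impulse,
# nodes 2-7 pick the action with the strongest stimulation.
ACTIONS = [:walk, :attack, :rest, :rescue, :pivot, :shoot]

def interpret(out)
  fwd_bkwd, left_right = out[0], out[1]
  impulse = if fwd_bkwd.abs > left_right.abs
    fwd_bkwd > 0 ? :forward : :backward
  else
    left_right > 0 ? :left : :right
  end
  action_nodes = out[2..7]
  action = ACTIONS[action_nodes.index(action_nodes.max)]  # highest value wins
  [action, impulse]
end

interpret([0.9, -0.2, 0.1, 0.8, -0.3, 0.0, 0.2, 0.1]) #=> [:attack, :forward]
```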
====The NeuralNetwork
-The neural network used (after trying several configs) was a 2 layer network with 6 inner nodes.
-
- [w<] [w/\] [w>] [w\/] [we] [e/\] [e>] [e\/] [c<] [c/\] [c>] [c\/] [Armed?] [cur_health] [health_reducing?] [1]
-
+The neural network used (after trying several configs) was a 2 layer network with 6 inner nodes (16 in, 8 out). It might look something like this;
+ [w<] [w/\] [w>] [w\/] [we] [e/\] [e>] [e\/] [c<] [c/\] [c>] [c\/] [Armed?] [cur_health] [health_reducing?] [1]
- [?] [?] [?] [?] [?] [?]
+ [?] [?] [?] [?] [?] [?]
- [fwd/bkwd] [<-/->] [walk] [attack] [rest] [rescue] [pivot] [shoot]
+ [fwd/bkwd] [<-/->] [walk] [attack] [rest] [rescue] [pivot] [shoot]
-The resultant network would look something like this, with each 'node' connected to each 'node' in the subsequent layer. The connections (which I could not draw in text!) are weighted according to the genome used. The value on an inner node is the value given to each input node multipled by the weight of the connection to that inner node and all summed together and passed through a sin function (Math.sin(summed_value)). The same thing happens for the output nodes; the value is the weighted sum (with sin func) of the values from the inner layer.
+Each 'node' is connected to each 'node' in the subsequent layer. The connections (which I could not draw in text!) are weighted according to the genome used. The value on an inner node is the value of each input node multiplied by the weight of its connection to that inner node, all summed together and passed through a sin function (Math.sin(summed_value)). The same thing happens for the output nodes; the value is the weighted sum (with sin func) of the values from the inner layer.
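The weighted-sum-plus-sin computation can be sketched as a minimal feed-forward pass. This is a sketch under the stated 16-6-8 topology, not the real implementation (which is in brains.rb):

```ruby
# Minimal sketch of the described 2-layer feed-forward pass.
def layer(inputs, weight_rows)
  weight_rows.map do |node_weights|    # one weight row per node in the next layer
    summed = inputs.zip(node_weights).sum { |i, w| i * w }
    Math.sin(summed)                   # activation function
  end
end

def feed_forward(inputs, inner_weights, output_weights)
  layer(layer(inputs, inner_weights), output_weights)  # 16 in -> 6 inner -> 8 out
end
```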
====Instructing the Warrior
32 bootcamp.rb
@@ -245,27 +245,26 @@ def initialize n_layers = 2
@target_score = 842
set_config_for n_layers
reset_high_score
-
- @ga =MGA.new(:generations => 5000, :mutation_rate => 2, :gene_length => @gene_length, :fitness => Proc.new{|genome, gen|
+ @ga =MGA.new(:generations => 5000, :mutation_rate => 2, :gene_length => @gene_length, :cache_fitness => true, :fitness => Proc.new{|genome, gen|
puts "#{gen}\n"
-
genome_file = "./genome"
- File.open(genome_file,'w'){|f| f.write( genome.join(",") )}
-
+ File.open(genome_file,'w'){|f| f.write( genome.join(",") )}
score_sum = 0
threads = []
levels = [1,2,3,4,5,6,7,8,9]
- levels.each do |i|
+
+ levels.each do |lvl|
threads << Thread.new{
- results = `rubywarrior -t 0 -s -l #{i}`
+ results = `rubywarrior -t 0 -s -l #{lvl}`
invigilator = Invigilator.new
score, level_score, level_total, n_turns, turn_score, time_bonus, clear_bonus = invigilator.score_results results
- puts "Level#{i} | levelscore: #{level_score} | turnscore: #{turn_score.round(2)} | bonus(t:c): #{time_bonus}:#{clear_bonus} | turns: #{n_turns} | Total: #{level_total} | fitnes: #{score.round(2)}"
- instance_variable_set("@ans#{i}", score)
+ puts "Level#{lvl} | levelscore: #{level_score} | turnscore: #{turn_score.round(2)} | bonus(t:c): #{time_bonus}:#{clear_bonus} | turns: #{n_turns} | Total: #{level_total} | fitness: #{score.round(2)}"
+ instance_variable_set("@ans#{lvl}", score)
}
end
threads.each{|t| t.join}
- score_sum = levels.map{|i| instance_variable_get("@ans#{i}")}.compact.sum
+ score_sum = levels.map{|lvl| instance_variable_get("@ans#{lvl}")}.compact.sum
+
puts "\n\t==Summed Score #{score_sum}"
remark_on score_sum
puts "."
@@ -290,16 +289,21 @@ def initialize n_layers =2
rootdir = "/home/sujimichi/coding/lab/rubywarrior"
- @ga =MGA.new(:generations => 5000, :mutation_rate => 2, :gene_length => @gene_length, :fitness => Proc.new{|genome, gen|
+ @ga =MGA.new(:generations => 5000, :mutation_rate => 2, :gene_length => @gene_length, :cache_fitness => true, :fitness => Proc.new{|genome, gen|
puts "#{gen}\n"
Dir.chdir(rootdir)
- level_factor = [0.8, 1.0, 0.8, 0.6, 0.4, 0.9, 1.0, 1.0, 1.0]
+ #weight level scores to account for some levels having more potential points than others. Try to prevent breeding with preference for levels
+ #level_weight = [0.8, 1.0, 0.8, 0.6, 0.4, 0.9, 1.0, 1.0, 1.0] #stab in dark values
+ #ace_scores = [15, 26, 71, 90, 123, 105, 50, 46, 100]#level ace scores.
+ #level_weight = ace_scores.map{|i| (15/i.to_f).round(1)} #=> [1.0, 0.6, 0.2, 0.2, 0.1, 0.1, 0.3, 0.3, 0.2]
+ level_weight = [1.0, 0.5, 0.2, 0.1, 0.1, 0.1, 0.4, 0.4, 0.3]
+
puts "\n\n"
threads = []
- levels.sort_by{rand}.each do |lvl|
+ levels.each do |lvl|
Dir.chdir("#{rootdir}/level#{lvl}bot-beginner")
File.open("./genome", 'w'){|f| f.write( genome.join(",") )} #write the genome to file which Player will use
@@ -310,7 +314,7 @@ def initialize n_layers =2
#use invigilator to get the final score. Also returns the break down of points for displaying.
score, level_score, level_total, n_turns, turn_score, time_bonus, clear_bonus = invigilator.score_results(results)
- score = score * level_factor[lvl-1]
+ score = score * level_weight[lvl-1]
puts "level-#{lvl}|levelscore: #{level_score} | turnscore: #{turn_score.round(2)} | bonus(t:c): #{time_bonus}:#{clear_bonus} | turns: #{n_turns} | Total: #{level_total} | fitness: #{score.round(2)}"
39 darwin.rb
@@ -1,6 +1,7 @@
#Micro Genetic Algorithm - slight variation on https://github.com/Sujimichi/micro_ga
class MGA
- attr_accessor :population, :generations, :mutation_rate, :cross_over_rate, :current_generation, :popsize
+
+ attr_accessor :population, :generations, :mutation_rate, :cross_over_rate, :current_generation, :popsize, :scores, :cache_fitness
def initialize args = {}
@popsize = args[:popsize] || 30 #Number of members (genomes) in the population
@gene_length = args[:gene_length] || 10 #Number of bit (genes) in a genome
@@ -11,6 +12,7 @@ def initialize args = {}
@fitness_function = args[:fitness] || Proc.new{|genome| genome.inject{|i,j| i+j} } #init fitness function or use simple max ones
@current_generation = 0
@scores = {}
+ @cache_fitness = args[:cache_fitness] || false
end
def evolve generations = @generations
@@ -29,14 +31,45 @@ def pos_mutate n
n + (rand - 0.5) #plus or minus small value. || (n-1).abs #for binary mutation; 1 -> 0, 0 -> 1
end
def fitness genome
- @fitness_function.call(genome, @current_generation)
+ return @fitness_function.call(genome, @current_generation) unless @cache_fitness #return fitness as norm if caching is off
+ @scores[genome] = @fitness_function.call(genome, @current_generation) unless @scores[genome] #update cache if value not present
+ puts "cached fitness #{@scores[genome]}"
+ @scores[genome] #return cached value
end
+
def ordered_population
population.sort_by{|member| fitness(member)}.reverse
end
+
def best
ordered_population.first
end
-
end
+=begin
+def cache_test
+
+ f = Proc.new{|genome| print'.';sleep(0.05); genome.inject{|i,j| i+j} }
+ pop = Array.new(30){ Array.new(10){ 0 } }
+ g1 = MGA.new(:cache_fitness => false, :generations => 5000, :fitness => f)
+ g2 = MGA.new(:cache_fitness => true, :generations => 5000, :fitness => f)
+ g1.population = pop
+ g2.population = pop
+
+ ave1 = g1.population.map{|g| g1.fitness g}.inject{|i,j| i+j} / g1.population.size
+ ave2 = g2.population.map{|g| g2.fitness g}.inject{|i,j| i+j} / g2.population.size
+ puts [ave1, ave2].inspect
+
+ t1_1 = Time.now;g1.evolve; t1_2 = Time.now;
+ t2_1 = Time.now;g2.evolve; t2_2 = Time.now;
+ t1 = t1_2 - t1_1
+ t2 = t2_2 - t2_1
+
+ ave1 = g1.population.map{|g| g1.fitness g}.inject{|i,j| i+j} / g1.population.size
+ ave2 = g2.population.map{|g| g2.fitness g}.inject{|i,j| i+j} / g2.population.size
+ puts [ave1, ave2].inspect
+ puts [t1, t2].inspect
+
+
+end
+=end
42 player.rb
@@ -3,21 +3,17 @@
class Player
def initialize
genome = File.open("./genome", "r"){|f| f.readlines}.join.split(",").map{|s| s.to_f} #Read genome from file.
- nodes = {:in => 16, :inner => 6, :out => 8} #nodes = {:in => 15, :inner => 8, :inner2 => 8, :out => 5} || #nodes = {:in => 15, :out => 5}
- @brain = Brains::R2D2.new(nodes, genome)
+ nodes = {:in => 16, :inner => 6, :out => 8} #3layernodes = {:in => 15, :inner => 8, :inner2 => 8, :out => 5} || #1layernodes = {:in => 15, :out => 5}
+ @brain = Brains::R2D2.new(nodes, genome) #Initialize warriors brain (neural net)
end
def play_turn(warrior)
@previous_health ||= 20
-
- #Sense world and present as an array of inputs for NN
- inputs = input_array_for(warrior)
-
- #send inputs to neural network and interpret its output as :action and :impulse
- action, impulse = @brain.act_on(inputs)
- puts [inputs, action, impulse].inspect
+ inputs = input_array_for(warrior) #Sense world and present as an array of inputs for NN
+ action, impulse = @brain.act_on(inputs) #send inputs to neural network and interpret its output as :action and :impulse
+ puts [inputs, action, impulse].inspect #what's on its mind?
- #send 'impulse' and 'action' from brain to warrior. done inside rescue as brain may request actions the body can't yet do, like rest! in the eariler levels.
+ #send 'action' and 'impulse' from brain to warrior. done inside rescue as brain may request actions the body can't yet do, like rest! in the earlier levels.
#no need to program which actions are allowed, evolution will work it out for itself. Yes creationists, this shit actually works!
#Once evolved the brain will 'know' what its body is capable of and the rescue should not be needed.
begin
@@ -30,16 +26,18 @@ def play_turn(warrior)
#sense the world and return info as an array of inputs for the NN
def input_array_for warrior
- dirs = [:left, :forward, :right, :backward]
- things = [:wall, :enemy, :captive]#, :stairs, :ticking, :golem]
+ dirs = [:left, :forward, :right, :backward] #directions in which things can be
+ things = [:wall, :enemy, :captive] #type of things there can be
vis_scale = [0, 0.6, 0.3] #used to scale the values returned by :look.
- if warrior.respond_to?(:feel)
- inputs = things.map do |thing|
- dirs.map do |dir|
- v = (warrior.feel(dir).send("#{thing}?").eql?(true) ? 1 : 0)
- if warrior.respond_to?(:look)
- look = warrior.look(dir)
+ if warrior.respond_to?(:feel)
+ can_look = warrior.respond_to?(:look)
+ inputs = things.map do |thing| #for each of the things
+ dirs.map do |dir| #in each of the directions
+ v = (warrior.feel(dir).send("#{thing}?").eql?(true) ? 1 : 0) #test if that thing is there, returning 1 for true else 0
+ if can_look #if warrior can also look
+ look = warrior.look(dir) #look in direction
+ #reduce to a single val from given 3 ie [0,1,1] => [0, 0.6, 0.3] => [0.6]
v = v + look.map{|l| (l.send("#{thing}?").eql?(true) ? 1 : 0) * vis_scale[look.index(l)] }.max
end
v
@@ -48,9 +46,8 @@ def input_array_for warrior
else
#in the first level the warrior has less sensory input than a sea sponge. No sensory input means no neural activity.
#So when warrior does not respond to :feel it 'imagines' that it's in an empty corridor!
- inputs = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
+ inputs = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0] #inputs for empty corridor.
end
-
#give the NN sense of whether it is armed or not.
inputs << (warrior.respond_to?(:shoot!) ? 1 : 0)
@@ -59,8 +56,7 @@ def input_array_for warrior
w_health = warrior.respond_to?(:health) ? warrior.health : 20
inputs << (1 - 1.0/20 * w_health).round(2)
inputs << ((@previous_health > w_health) ? 1 : 0) #sense of health dropping
- inputs << 1 #representational bias. yeah, I should prob explain this! its REALLY important!
-
- inputs.flatten
+ inputs << 1 #representational bias. yeah, I should prob explain this! it's REALLY important!
+ inputs.flatten #return array of values.
end
end
