Drhuffman12/add team utils part 7 redo (#63)
* update command line for example re 'spec_bench/ai4cr/neural_network/rnn/rnn_simple_manager_spec.cr'

* drhuffman12/add_team_utils_part_7_redo 'src' changes re clone vs to/from json

* drhuffman12/add_team_utils_part_7_redo 'spec' and 'spec_bench' changes re clone vs to/from json, etc

* drhuffman12/add_team_utils_part_7_redo formatting

* drhuffman12/add_team_utils_part_7_redo log less often

* drhuffman12/add_team_utils_part_7_redo log more often

* drhuffman12/add_team_utils_part_7_redo add name suffix re breed type

* drhuffman12/add_team_utils_part_7_redo formatting

* drhuffman12/add_team_utils_part_7_redo round the comparative 'when_delta' times

* drhuffman12/add_team_utils_part_7_redo pend a 'chain' spec (for now)

* drhuffman12/add_team_utils_part_7_redo TODO: Refactor and enable breeding (similar to RnnSimple)! (and adjust specs)

* drhuffman12/add_team_utils_part_7_redo enable tests w/ more time cols and additional io file combos

* drhuffman12/add_team_utils_part_7_redo adjust propagation_function to handle NaN and Infinity; TODO: review and likewise for other prop func cases

* drhuffman12/add_team_utils_part_7_redo formatting

* drhuffman12/add_team_utils_part_7_redo more NaN handling for 'Cmn::MiniNetConcerns::CalcGuess#propagation_function'; ameba-ize

* drhuffman12/add_team_utils_part_7_redo misc tweaks

* drhuffman12/add_team_utils_part_7_redo add 'purge_replace' helper

* drhuffman12/add_team_utils_part_7_redo formatting

* drhuffman12/add_team_utils_part_7_redo only show purge comment if members were purged

* drhuffman12/add_team_utils_part_7_redo update test data file path

* drhuffman12/add_team_utils_part_7_redo bench tests: revert from tiny hidden size to auto-calc'd, skip team sizes of 1 and 2 members for more complex RNN nets/data; re-org 'train_team_using_sequence'; start RELU net weights tiny; step_update_weights as serial instead of parallel; comment out 'purge_error_limit' code; formatting

* drhuffman12/add_team_utils_part_7_redo re-enable purging/replacing members; re-enable 'train_qty'

* drhuffman12/add_team_utils_part_7_redo formatting

* drhuffman12/add_team_utils_part_7_redo re-enable 'purge_error_limit' params for 'train_team_using_sequence(..)'

* drhuffman12/add_team_utils_part_7_redo rename to 'infinite?'

* drhuffman12/add_team_utils_part_7_redo members added during purge get named 'P'

* drhuffman12/add_team_utils_part_7_redo use 'score' in the purge

* move 'train_qty' so as to affect all tests under "when using a text file as io data"

* drhuffman12/add_team_utils_part_7_redo move 'train_qty' so as to affect all tests under "when using a text file as io data"

* drhuffman12/add_team_utils_part_7_redo adjust the purge process

* drhuffman12/add_team_utils_part_7_redo switch back to using 'score' instead of 'distance' for purging

* drhuffman12/add_team_utils_part_7_redo re-enable team of 8 in benches

* drhuffman12/add_team_utils_part_7_redo add a hidden layer for the rnn text spec

* drhuffman12/add_team_utils_part_7_redo bump up to 16 tc's and 4 hl's

* drhuffman12/add_team_utils_part_7_redo bump down to 8 tc's and 4 hl's

* drhuffman12/add_team_utils_part_7_redo in 'purge_replace', breed purged member with rand_excluding new member so we retain some of the training

* drhuffman12/add_team_utils_part_7_redo move expect's next to each other

* drhuffman12/add_team_utils_part_7_redo note 'inputs_sequence' index 'i' in purge output

* drhuffman12/add_team_utils_part_7_redo add 'weight_init_scale*' and auto-scale it; convert some 'TextFile' methods to class methods; add example unicode text file; large vs small RNN size test params; add text output of expected vs actual guesses

* drhuffman12/add_team_utils_part_7_redo output formatting

* drhuffman12/add_team_utils_part_7_redo adjust outputs and for a 'mid-sized rnn'

* drhuffman12/add_team_utils_part_7_redo output should handle less than 32 bits

* drhuffman12/add_team_utils_part_7_redo misc tweaks; bump 1st textfile team size up to 10

* drhuffman12/add_team_utils_part_7_redo adjust rnn params; add and revert auto_shrink_weights

* drhuffman12/add_team_utils_part_7_redo add timestamp

* drhuffman12/add_team_utils_part_7_redo update shards

* drhuffman12/add_team_utils_part_7_redo update shards, ameba formatting

* drhuffman12/add_team_utils_part_7_redo ameba formatting

* drhuffman12 misc cleanup re Crystal v1.0 and other updated shards

* drhuffman12 add 'default_to_bit_size' e.g.: to allow forcing from 32bit utf down to 8bit ascii (ignoring higher bits)

* drhuffman12 adjust bench params and update README.md

* drhuffman12 try 4 hidden layers

* drhuffman12 spec_bench w/ 3 hidden layers; TODO: Replace(?)/Supplement(?) below with actual UTF-to-ASCII conversion! (And ASCII-to-UTF reversion)

* drhuffman12 tweak bench params

* drhuffman12 attempts to get better and quicker RNN results and improved visuals of result progress

* drhuffman12/add_team_utils_part_7_redo add system monitoring data to the 'tmp/log.txt' file; formatting

* drhuffman12 adjust mem logging; formatting

* drhuffman12 add histogram of correct chars

* drhuffman12 formatting

* drhuffman12 trim logging

* drhuffman12 round percentages

* drhuffman12 adjust histograms

* drhuffman12 adjust histograms

* drhuffman12 adjust histograms

* drhuffman12 adjust histograms

* drhuffman12 reset histograms

* drhuffman12 rename 'all_hists' to 'recent_hists'

* drhuffman12 set default_to_bit_size to 8 for 'examples/rnn_simple_manager_example.cr'

* drhuffman12 add second round of training (hopefully will all be 6/6 correct) for whole set of io's in 'examples/rnn_simple_manager_example.cr'

* drhuffman12 add current time to logging

* drhuffman12 add auto-save option to manager

* drhuffman12 code cleanup

* drhuffman12 formatting

* drhuffman12 formatting

* drhuffman12 fix 'validate!' re hidden_size_given; note re 'implement bi-directional RNN in the next phase'

* drhuffman12 Drop 'PURGE_ERROR_LIMIT_SCALE' from 1e12 to 1e4 (to keep weights and guesses from going extreme).

* drhuffman12 revert 'INPUT_SIZE_MIN' to '2' with TODO comment

* drhuffman12 better error handling for auto-saving the trained nets

* drhuffman12 adjust naming for auto-save of nets to show which guesses are correct
drhuffman12 committed Apr 10, 2021
1 parent 54cf06e commit 9f41353
Showing 41 changed files with 2,022 additions and 917 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -56,7 +56,10 @@ e.g.: `time crystal spec --release`

Use the `-Dpreview_mt` (for `crystal build` or `-D preview_mt` for `crystal spec`) flag for multithreading.

e.g.: `CRYSTAL_WORKERS=14 crystal spec spec/ai4cr/neural_network/rnn/rnn_simple_manager_spec.cr --release -D preview_mt`
e.g.: `time CRYSTAL_WORKERS=14 crystal spec spec_bench/ai4cr/neural_network/rnn/rnn_simple_manager_spec.cr --release -D preview_mt`
e.g.: `time CRYSTAL_WORKERS=24 crystal spec spec_bench/ai4cr/neural_network/rnn/rnn_simple_manager_spec.cr --release -D preview_mt > tmp/log.txt 2>&1`

(Personally, as for how many `CRYSTAL_WORKERS` to use, I'd recommend keeping it below the number of cores in your CPU, so that you leave at least one or two cores free for the OS and other apps.)
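For example, one way to derive a worker count that follows this advice (a sketch; `nproc` is Linux-specific, and the helper name is illustrative):

```shell
# Suggest a CRYSTAL_WORKERS value that leaves two cores free for the OS.
suggested_workers() {
  cores=$1
  if [ "$cores" -gt 3 ]; then
    echo $(( cores - 2 ))
  else
    echo 1
  fi
}

CRYSTAL_WORKERS=$(suggested_workers "$(nproc)")
echo "CRYSTAL_WORKERS=$CRYSTAL_WORKERS"
```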

See also:
* https://crystal-lang.org/2019/09/23/crystal-0.31.0-released.html
171 changes: 171 additions & 0 deletions examples/rnn_simple_manager_example.cr
@@ -0,0 +1,171 @@
# Run via: `time CRYSTAL_WORKERS=24 crystal run examples/rnn_simple_manager_example.cr -Dpreview_mt --release > tmp/log.txt`
# (Adjust the 'CRYSTAL_WORKERS=24' as desired.)
# Follow `tmp/log.txt` in your IDE or in a console (e.g.: `tail -f tmp/log.txt`).
# Be on the lookout for high `percent_correct: x of x` values in the 'tmp/log.txt' file.
# Monitor your RAM and CPU usage!
# (This seems to stabilize at around 4 GB and 1/3 of my system's AMD Ryzen 7 1700X CPU.)
# NOTE: Training results look promising, but tend to be more successful towards the 'more future' side of the outputs.
# So, implement a bi-directional RNN in the next phase, in hopes of balancing out the success of the
# 'less future' vs 'more future' guesses.

require "./../src/ai4cr"

class Runner
getter file_path : String

def initialize(@file_path)
end

def compare_successive_training_rounds(
io_offset, time_col_qty,
inputs_sequence, outputs_sequence,
hidden_layer_qty, hidden_size_given,
qty_new_members,
my_breed_manager, max_members,
train_qty,
io_set_text_file
)
puts
puts "v"*40
puts "successive generations (should) score better (?) .. max_members: #{max_members} .. start"
when_before = Time.local
puts "when_before: #{when_before}"
puts "file_path: #{file_path}"
puts

params = Ai4cr::NeuralNetwork::Rnn::RnnSimple.new(
io_offset: io_offset,
time_col_qty: time_col_qty,
input_size: inputs_sequence.first.first.size,
output_size: outputs_sequence.first.first.size,
hidden_layer_qty: hidden_layer_qty,
hidden_size_given: hidden_size_given
).config

puts "inputs_sequence.size: #{inputs_sequence.size}"
puts "inputs_sequence.first.size: #{inputs_sequence.first.size}"
puts "inputs_sequence.first.first.size: #{inputs_sequence.first.first.size}"
puts "inputs_sequence.class: #{inputs_sequence.class}"
puts "outputs_sequence.class: #{outputs_sequence.class}"
puts "params: #{params}"

puts "* build/train teams"
puts "\n * first_gen_members (building)..."
first_gen_members = my_breed_manager.build_team(qty_new_members, **params)
puts "\n * second_gen_members (breeding and training; after training first_gen_members)..."
second_gen_members = my_breed_manager.train_team_using_sequence(inputs_sequence, outputs_sequence, first_gen_members, io_set_text_file, max_members, train_qty) # , block_logger: train_team_using_sequence_logger)
puts "\n * third_gen_members (breeding and training; after training second_gen_members) ..."
third_gen_members = my_breed_manager.train_team_using_sequence(inputs_sequence, outputs_sequence, second_gen_members, io_set_text_file, max_members, train_qty) # , block_logger: train_team_using_sequence_logger)

puts "* score and stats ..."
p "."
first_gen_members_scored = first_gen_members.map { |member| member.error_stats.score }.sum / qty_new_members
first_gen_members_stats = first_gen_members.map { |member| member.error_hist_stats }

p "."
second_gen_members_scored = second_gen_members.map { |member| member.error_stats.score }.sum / qty_new_members
second_gen_members_stats = second_gen_members.map { |member| member.error_hist_stats }

p "."
third_gen_members_scored = third_gen_members.map { |member| member.error_stats.score }.sum / qty_new_members
third_gen_members_stats = third_gen_members.map { |member| member.error_hist_stats }

puts
puts "#train_team_using_sequence (text from Bible):"
puts
puts "first_gen_members_scored: #{first_gen_members_scored}"
first_gen_members_stats.each { |m| puts m }

puts
puts "second_gen_members_scored: #{second_gen_members_scored}"
second_gen_members_stats.each { |m| puts m }

puts
puts "third_gen_members_scored: #{third_gen_members_scored}"
third_gen_members_stats.each { |m| puts m }

when_after = Time.local
puts "when_after: #{when_after}"
when_delta = when_after - when_before
puts "when_delta: #{(when_delta.total_seconds / 60.0).round(1)} minutes
"
puts
puts "successive generations score better (?) .. max_members: #{max_members} .. end"
puts "-"*40
puts
end
end

####

my_breed_manager = Ai4cr::NeuralNetwork::Rnn::RnnSimpleManager.new

file_path = "./spec_bench/support/neural_network/data/bible_utf/eng-web_002_GEN_01_read.txt"
file_type_raw = Ai4cr::Utils::IoData::FileType::Raw
prefix_raw_qty = 0
prefix_raw_char = " "
default_to_bit_size = 8

io_set_text_file = Ai4cr::Utils::IoData::TextFileIodBits.new(
file_path, file_type_raw,
prefix_raw_qty, prefix_raw_char,
default_to_bit_size
)

# re 'compare_successive_training_rounds'
time_col_qty = 6 # 25
io_offset = time_col_qty
ios = io_set_text_file.iod_to_io_set_with_offset_time_cols(time_col_qty, io_offset)

inputs_sequence = ios[:input_set]
outputs_sequence = ios[:output_set]

hidden_layer_qty = 3
hidden_size_given = 100 # 100 # 200

max_members = 10
qty_new_members = max_members

train_qty = 2

puts
puts "*"*40
puts "my_breed_manager: #{my_breed_manager}"
puts "io_set_text_file: #{io_set_text_file}"
puts "v"*40
puts "io_set_text_file.raw: #{io_set_text_file.raw}"
puts "^"*40
puts
puts "io_set_text_file.raw.size: #{io_set_text_file.raw.size}"
puts "io_set_text_file.raw.class: #{io_set_text_file.raw.class}"
puts
puts "io_set_text_file.iod.size: #{io_set_text_file.iod.size}"
puts "io_set_text_file.iod.class: #{io_set_text_file.iod.class}"
puts "io_set_text_file.iod.first.size: #{io_set_text_file.iod.first.size}"
puts "io_set_text_file.iod.first.class: #{io_set_text_file.iod.first.class}"
puts "io_set_text_file.iod.first.first.class: #{io_set_text_file.iod.first.first.class}"

puts "-"*40
puts

r = Runner.new(file_path)

r.compare_successive_training_rounds(
io_offset, time_col_qty,
inputs_sequence, outputs_sequence,
hidden_layer_qty, hidden_size_given,
qty_new_members,
my_breed_manager, max_members,
train_qty,
io_set_text_file
)

r.compare_successive_training_rounds(
io_offset, time_col_qty,
inputs_sequence, outputs_sequence,
hidden_layer_qty, hidden_size_given,
qty_new_members,
my_breed_manager, max_members,
train_qty,
io_set_text_file
)
35 changes: 13 additions & 22 deletions shard.yml
@@ -1,11 +1,10 @@
name: ai4cr
version: 0.1.23
version: 0.1.24

authors:
- Daniel Huffman <drhuffman12@yahoo.com>

crystal: ">= 0.36.0"
# crystal: 1.0
# crystal: ">= 1.0.0"

license: MIT

@@ -20,32 +19,19 @@ dependencies:

development_dependencies:

## Un-comment/edit after Crystal 0.36.0 or 1.0 compatible:
ameba:
github: crystal-ameba/ameba
version: ~> 0.13.4
# branch: master
# # version: ~> 0.13.3
# ## REVERT to crystal-ameba/ameba after it is Crystal 1.0 compatible.
# ## See also: https://github.com/crystal-ameba/ameba/pull/173
# github: drhuffman12/ameba
# # branch: drhuffman12/bump_crystal_version_to_1
version: ~> 0.14.1

## Un-comment/edit after Crystal 0.36.0 or 1.0 compatible:
# icr:
# github: crystal-community/icr
# branch: master

## Un-comment/edit after Crystal 0.36.0 or 1.0 compatible:
spectator:
# gitlab: arctic-fox/spectator
# branch: master
# version: ">= 0.9.31"

github: drhuffman12/spectator
# branch: drhuffman12/bump_crystal_version_to_1
# branch: drhuffman12/master
branch: drhuffman12/bump_crystal_version_to_0_36_0b
gitlab: arctic-fox/spectator
branch: master
version: ">= 0.9.33"

## Un-comment after Crystal 1.0 compatible:
# aasm:
@@ -59,18 +45,23 @@ development_dependencies:

faker:
github: askn/faker
# github: osacar/faker
# version: "0.6.0"

# # TODO: Fix ameba dependencies or wait for newer version of crystal-coverage (aka https://github.com/anykeyh/crystal-coverage)
# # Error is:
# # ```
# # 12 | new(YAML::ParseContext.new, parse_yaml(string_or_io))
# # ^--
# # Error: wrong number of arguments for 'Ameba::Rule::Lint::Syntax.new' (given 2, expected 0..1)

# # ...
# # Overloads are:
# # - Ameba::Rule::Lint::Syntax.new(config = nil)

# # ...
# # make[1]: *** [Makefile:8: bin/ameba] Error 1
# # ```
# coverage:
# github: anykeyh/crystal-coverage

hardware:
github: crystal-community/hardware
4 changes: 4 additions & 0 deletions spec/ai4cr/breed/manager_spec.cr
@@ -25,6 +25,10 @@ class MyBreed
def initialize(@name, @some_value, @history_size = 2)
@error_stats = Ai4cr::ErrorStats.new(history_size)
end

def clone
MyBreed.new(self.name, self.some_value, self.history_size)
end
end

class MyBreedManager < Ai4cr::Breed::Manager(MyBreed)
2 changes: 1 addition & 1 deletion spec/ai4cr/neural_network/cmn/chain_spec.cr
@@ -156,7 +156,7 @@ describe Ai4cr::NeuralNetwork::Cmn::Chain do
(cns.errors.empty?).should be_true
end

it "updates last net's outputs when guessing" do
pending "updates last net's outputs when guessing" do
last_net_output_before = cns.net_set.last.outputs_guessed.clone
(cns.guesses_best).should eq(expected_inital_outputs)
(cns.guesses_best.size).should eq(expected_inital_outputs.size)
8 changes: 4 additions & 4 deletions spec/ai4cr/neural_network/rnn/rnn_simple_manager_spec.cr
@@ -334,7 +334,7 @@ Spectator.describe Ai4cr::NeuralNetwork::Rnn::RnnSimpleManager do
:name, :history_size, :io_offset, :time_col_qty,
:input_size, :output_size, :hidden_layer_qty, :hidden_size_given,
:learning_style, :bias_disabled, :bias_default, :learning_rate,
:momentum, :deriv_scale,
:momentum, :deriv_scale, :weight_init_scale_given,
]
}
it "which include expected keys" do
@@ -355,7 +355,7 @@ Spectator.describe Ai4cr::NeuralNetwork::Rnn::RnnSimpleManager do
end

context "creates members with expected values for" do
it ":name" do
pending ":name" do
key = :name
key_string = key.to_s
team_members.each do |member|
@@ -408,7 +408,7 @@ Spectator.describe Ai4cr::NeuralNetwork::Rnn::RnnSimpleManager do
it "creates members of specified class" do
qty_new_members = 4
next_gen_members = my_breed_manager.build_team(qty_new_members)
member_classes = next_gen_members.map { |member| member.class.name }.sort.uniq
member_classes = next_gen_members.map(&.class.name).sort!.uniq!
expect(member_classes.size).to eq(1)
expect(member_classes.first).to eq(Ai4cr::NeuralNetwork::Rnn::RnnSimple.name)
end
@@ -426,7 +426,7 @@ Spectator.describe Ai4cr::NeuralNetwork::Rnn::RnnSimpleManager do
qty_new_members = 4
params = Ai4cr::NeuralNetwork::Rnn::RnnSimple.new.config
next_gen_members = my_breed_manager.build_team(qty_new_members, **params)
member_classes = next_gen_members.map { |member| member.class.name }.sort.uniq
member_classes = next_gen_members.map(&.class.name).sort!.uniq!
expect(member_classes.size).to eq(1)
expect(member_classes.first).to eq(Ai4cr::NeuralNetwork::Rnn::RnnSimple.name)
end
10 changes: 0 additions & 10 deletions spec/ai4cr/team_spec.cr

This file was deleted.
