# Segmented Representations

One common representation in evolutionary algorithms (EAs) is the "segmented representation."  That is, each individual is a sequence of segments, each of which is itself a fixed-length sequence that is usually binary, but needn't be.  Each segment represents a salient feature, such as a rule in a Pitt Approach system, or a convolutional layer and its hyperparameters, as is the case for Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL).

There are two broad categories of these systems: those with a fixed number of segments, as is currently the case for MENNDL, and those with a dynamic number of segments, as is the case for Pitt Approach classifiers.

In this notebook we look at LEAP support for segmented representations, starting with initializers and decoders, and then looking at the mutation pipeline operator.  We then plug all that into a simple EA example.


In [1]:
import sys
import random
import functools
from pprint import pprint, pformat
import numpy as np
from toolz import pipe

from leap_ec.individual import Individual
from leap_ec.ops import pool, cyclic_selection, clone

from leap_ec.segmented_rep.initializers import create_segmented_sequence
from leap_ec.segmented_rep.decoders import SegmentedDecoder
from leap_ec.segmented_rep.ops import apply_mutation, add_segment, remove_segment, copy_segment

from leap_ec.binary_rep.initializers import create_binary_sequence
from leap_ec.binary_rep.ops import genome_mutate_bitflip
from leap_ec.binary_rep.decoders import BinaryToIntDecoder

from leap_ec.real_rep.initializers import create_real_vector
from leap_ec.real_rep.ops import genome_mutate_gaussian

## Binary genomes

We first look at segmented representations whose segments use a binary representation.

In [2]:
# Create a genome of four segments of five binary digits.
seg = create_segmented_sequence(4, create_binary_sequence(5))
print(seg())

[array([ True,  True,  True,  True,  True]), array([ True, False,  True,  True, False]), array([ True, False, False,  True, False]), array([ True, False, False, False, False])]


In [3]:
# Now create five genomes of varying length by passing in a function for `length` that provides an
# integer drawn from a distribution.
seqs = [] # Save sequences for next step
for i in range(5):
    seq = create_segmented_sequence(functools.partial(random.randint, a=1,b=5), create_binary_sequence(5))()
    print(i, seq)
    seqs.append(seq)

0 [array([False, False, False,  True,  True]), array([False,  True, False, False, False])]
1 [array([ True, False, False,  True, False]), array([False,  True,  True,  True,  True]), array([False, False,  True, False,  True]), array([ True, False,  True, False, False]), array([False,  True, False, False, False])]
2 [array([False, False, False, False, False]), array([False,  True, False, False, False]), array([ True, False, False,  True, False]), array([ True, False, False,  True,  True])]
3 [array([ True, False, False,  True, False]), array([False, False,  True, False, False]), array([False, False, False,  True, False]), array([False, False, False, False,  True])]
4 [array([False,  True, False,  True,  True]), array([ True, False, False, False, False]), array([ True,  True,  True, False,  True]), array([ True,  True, False,  True, False])]
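
To make the fixed versus variable length distinction concrete, here is a minimal pure-Python sketch of how such an initializer could work. The names `make_segmented_initializer` and `random_bits` are hypothetical helpers invented for illustration; this is not LEAP's implementation, only an assumption about the general pattern of accepting either an integer or a callable for the length.

```python
import random

def make_segmented_initializer(length, seg_initializer):
    """Return a function that builds a list of segments.

    `length` is either a fixed int or a zero-argument callable returning an int.
    (Hypothetical helper for illustration; not LEAP's implementation.)
    """
    def segmented():
        # Resolve the number of segments each time a genome is requested.
        num_segments = length() if callable(length) else length
        return [seg_initializer() for _ in range(num_segments)]
    return segmented

def random_bits(num_bits):
    """Return an initializer producing a list of `num_bits` random booleans."""
    return lambda: [random.random() < 0.5 for _ in range(num_bits)]

# Fixed number of segments versus a length drawn from a distribution.
fixed = make_segmented_initializer(4, random_bits(5))
variable = make_segmented_initializer(lambda: random.randint(1, 5), random_bits(5))

print(len(fixed()))     # always 4 segments
print(len(variable()))  # between 1 and 5 segments
```

Both initializers have to be called to obtain a genome, which mirrors the trailing `()` used in the cells above.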


Now let's see about decoding those segments.  The segmented representation relies on a secondary decoder that's applied to each segment.  In this case, we'll just use a simple binary to int decoder on the segments we created in the previous step.

In [4]:
# We want each five-bit segment decoded into two integers: the first two bits
# form one integer and the remaining three bits form the other.
decoder = SegmentedDecoder(BinaryToIntDecoder(2,3))

for i, seq in enumerate(seqs):
    vals = decoder.decode(seq)
    print(i, vals)

0 [array([0, 3]), array([1, 0])]
1 [array([2, 2]), array([1, 7]), array([0, 5]), array([2, 4]), array([1, 0])]
2 [array([0, 0]), array([1, 0]), array([2, 2]), array([2, 3])]
3 [array([2, 2]), array([0, 4]), array([0, 2]), array([0, 1])]
4 [array([1, 3]), array([2, 0]), array([3, 5]), array([3, 2])]
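
The decoded values above are consistent with reading each field as an unsigned integer with the most significant bit first. The following pure-Python sketch reproduces the result for individual 0 under that assumption. The function names `bits_to_ints` and `decode_segments` are hypothetical and only illustrate the idea of applying a secondary decoder to each segment.

```python
def bits_to_ints(bits, widths):
    """Split `bits` into consecutive fields of the given widths and read
    each field as an unsigned integer, most significant bit first."""
    values = []
    start = 0
    for width in widths:
        value = 0
        for bit in bits[start:start + width]:
            value = (value << 1) | int(bit)
        values.append(value)
        start += width
    return values

def decode_segments(genome, widths):
    """Apply the same per-segment decoding to every segment of a genome."""
    return [bits_to_ints(segment, widths) for segment in genome]

# Individual 0 from the cell above: two segments of five bits each.
genome = [[False, False, False, True, True],
          [False, True, False, False, False]]
decoded = decode_segments(genome, (2, 3))
print(decoded)  # [[0, 3], [1, 0]]
```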


In [5]:
# Now for mutation.  With expected_num_mutations=1, on average a single value is changed in the example
# individual.  The takeaway is that segmented mutation simply takes a mutator from another representation
# and naively applies it.

original = Individual(np.array([[0,0],[1,1]]))
print('original:', original)
mutated = next(apply_mutation(iter([original]),mutator=genome_mutate_bitflip(expected_num_mutations=1)))
print('mutated:', mutated)

original: Individual<23db28e1-4994-4bc0-aeee-243873b24d3e> with fitness None
mutated: Individual<23db28e1-4994-4bc0-aeee-243873b24d3e> with fitness None
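
Because the printed individuals show only an ID and fitness, the effect of the mutation is not visible in the output. As a rough illustration of what "naively applying" a mutator to each segment means, here is a pure-Python sketch. The helpers `bitflip` and `mutate_segments` are hypothetical, and the explicit per-bit `probability` parameter is an assumption made for clarity; it is not the signature of `genome_mutate_bitflip`.

```python
import random

def bitflip(segment, probability, rng=random):
    """Return a copy of `segment` with each bit flipped with the given probability."""
    return [(not bit) if rng.random() < probability else bit for bit in segment]

def mutate_segments(genome, mutator):
    """Apply `mutator` independently to every segment, returning a new genome."""
    return [mutator(segment) for segment in genome]

genome = [[False, False], [True, True]]

# With probability 1.0 every bit in every segment is flipped.
all_flipped = mutate_segments(genome, lambda seg: bitflip(seg, probability=1.0))
print(all_flipped)  # [[True, True], [False, False]]
```

The segmented operator itself knows nothing about bits; all representation-specific logic lives in the mutator that is passed in.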


## Real-valued genomes

Now we demonstrate the same process using a real-valued representation.

In [6]:
# Create five segmented sequences that vary from 1 to 3 segments
bounds = ((-5.12,5.12), (-1,1), (-10,10)) # three reals and their respective bounds for sampling
seqs = []
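for i in range(5):
    # NOTE: create_segmented_sequence returns an initializer function.  It is not called here,
    # so `seq` is that function rather than a genome; call it (`seq()`) to sample a genome.
    seq = create_segmented_sequence(functools.partial(random.randint, a=1,b=3),
                                    create_real_vector(bounds))
    seqs.append(seq)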

# Just for fun, now add a genome that has exactly 5 segments
seqs.append(create_segmented_sequence(5, create_real_vector(bounds)))

for i, s in enumerate(seqs):
    print(i, pformat(s, indent=2))

0 <function create_segmented_sequence.<locals>.segmented at 0x10e3a5ee0>
1 <function create_segmented_sequence.<locals>.segmented at 0x10e3a5300>
2 <function create_segmented_sequence.<locals>.segmented at 0x10e3a5a80>
3 <function create_segmented_sequence.<locals>.segmented at 0x10e3a6340>
4 <function create_segmented_sequence.<locals>.segmented at 0x10e3a6480>
5 <function create_segmented_sequence.<locals>.segmented at 0x10e3a65c0>


Now we repeat the application of the segmented mutation operator, but this time to real-valued genomes.

In [7]:
original = Individual(np.array([[0.0,0.0],[1.0,1.0],[-1.0,0.0]]))
print('original:', original)
mutated = next(apply_mutation(iter([original]),
                              mutator=genome_mutate_gaussian(std=1.0, expected_num_mutations=1.5)
                             )
              )
print('mutated:', mutated)

original: Individual<32dbcc92-461a-4a59-ade2-d987494ec3aa> with fitness None
mutated: Individual<32dbcc92-461a-4a59-ade2-d987494ec3aa> with fitness None
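
The same pattern can be sketched for real-valued segments. The example below assumes that the per-gene mutation probability is the expected number of mutations divided by the segment length, capped at 1; that is an assumption for illustration, not a statement of how `genome_mutate_gaussian` is implemented. The helper name `gaussian_perturb` is hypothetical.

```python
import random

def gaussian_perturb(segment, std, expected_num_mutations, rng=random):
    """Add zero-mean Gaussian noise to each gene with a probability chosen so
    that, on average, `expected_num_mutations` genes in the segment change."""
    probability = min(1.0, expected_num_mutations / len(segment))
    return [x + rng.gauss(0.0, std) if rng.random() < probability else x
            for x in segment]

rng = random.Random(42)  # seeded only to make the sketch repeatable
genome = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]

# Apply the mutator independently to each segment.
mutated = [gaussian_perturb(seg, std=1.0, expected_num_mutations=1.5, rng=rng)
           for seg in genome]
print(mutated)
```

Under this assumption, each two-gene segment uses a per-gene probability of 0.75.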


## Other pipeline operators

Besides the aforementioned `apply_mutation`, segmented representations have three other pipeline operators:

* `add_segment()`, possibly add a new segment
* `remove_segment()`, possibly remove a segment
* `copy_segment()`, possibly select and copy an existing segment
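
As a rough mental model, the three operators can be sketched on plain Python lists. The `sketch_*` functions below are hypothetical stand-ins, not LEAP's implementations: they work on a bare genome rather than on a stream of individuals, and both the random insertion position and the rule of never removing the last remaining segment are assumptions.

```python
import copy
import random

def sketch_remove_segment(genome, probability, rng=random):
    """Possibly remove one randomly chosen segment (keeping at least one)."""
    genome = list(genome)
    if len(genome) > 1 and rng.random() < probability:
        del genome[rng.randrange(len(genome))]
    return genome

def sketch_copy_segment(genome, probability, rng=random):
    """Possibly duplicate one randomly chosen segment at a random position."""
    genome = list(genome)
    if rng.random() < probability:
        duplicate = copy.deepcopy(rng.choice(genome))
        genome.insert(rng.randint(0, len(genome)), duplicate)
    return genome

def sketch_add_segment(genome, seg_initializer, probability, rng=random):
    """Possibly insert a newly generated segment at a random position."""
    genome = list(genome)
    if rng.random() < probability:
        genome.insert(rng.randint(0, len(genome)), seg_initializer())
    return genome

genome = [[0, 0], [1, 1]]
print(sketch_remove_segment(genome, probability=1.0))            # one segment left
print(sketch_copy_segment(genome, probability=1.0))              # three segments
print(sketch_add_segment(genome, lambda: [12345], probability=1.0))  # three segments
```

Each function returns a new list, so the input genome is left untouched; in the cells below, the `clone` step in the pipeline plays a similar role for the parent population.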


In [8]:
# demonstrate remove_segment by running an existing population through a pipeline of operators
pop = [Individual([[0,0],[1,1]]) for x in range(5)]
print('pop:', pformat(pop))
new_pop = pipe(pop, 
               cyclic_selection,
               clone,
               remove_segment(probability=1.0), 
               pool(size=len(pop)))
print('new_pop:', pformat(new_pop))

pop: [Individual<2ce0b1c1-e782-45af-a317-a94285d6019a>([[0, 0], [1, 1]], IdentityDecoder(), None),
 Individual<17f2663a-1a98-48e4-bd9d-21121aaea898>([[0, 0], [1, 1]], IdentityDecoder(), None),
 Individual<e1dcf197-51a2-49ab-8e2f-03dcf00d8015>([[0, 0], [1, 1]], IdentityDecoder(), None),
 Individual<1dc5816a-cb2d-4c99-8f73-332081806c57>([[0, 0], [1, 1]], IdentityDecoder(), None),
 Individual<8ec48855-3ebe-4fbc-873b-939c812d903c>([[0, 0], [1, 1]], IdentityDecoder(), None)]
new_pop: [Individual<5ae7b9e6-0357-4481-a4e4-ee07b7e20358>([[1, 1]], IdentityDecoder(), None),
 Individual<c45f9416-a4fa-45be-ad46-ccb5000339d4>([[0, 0]], IdentityDecoder(), None),
 Individual<39e91c07-2a2d-4872-a8a1-56fe7b71138e>([[1, 1]], IdentityDecoder(), None),
 Individual<b22b7087-d905-4ed5-9bf1-798c2079444b>([[0, 0]], IdentityDecoder(), None),
 Individual<aa87d303-2b77-41a5-9552-8863d70df9d8>([[0, 0]], IdentityDecoder(), None)]


In [9]:
# demonstrate copy_segment by running an existing population through a pipeline of operators
pop = [Individual([[0,0],[1,1]]) for x in range(5)]
print('pop:', pformat(pop, indent=5))
new_pop = pipe(pop, 
               cyclic_selection,
               clone,
               copy_segment(probability=1.0),
               pool(size=len(pop)))
print('new_pop:', pformat(new_pop, indent=9))

pop: [    Individual<03be41de-a40b-45bf-82f5-097c04227d74>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<f60beb10-840d-4e9c-a8ad-00a0ae955b40>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<4609f196-eecd-498b-a470-1f7e29ffad7c>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<4df1ea4a-fb84-40b6-8d96-c05f2fc7ee56>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<ddd6ebe2-d648-417b-a2bb-3af7eedc9feb>([[0, 0], [1, 1]], IdentityDecoder(), None)]
new_pop: [        Individual<aea77ad9-886a-4965-b024-889456baf046>([[0, 0], [0, 0], [1, 1]], IdentityDecoder(), None),
         Individual<d68b4321-dbbe-4482-9bab-ab7200250171>([[0, 0], [1, 1], [1, 1]], IdentityDecoder(), None),
         Individual<75540233-4f84-464d-8b57-7e085a34daf2>([[0, 0], [0, 0], [1, 1]], IdentityDecoder(), None),
         Individual<d1c6e692-ce7e-4ba1-97c3-bcd598a9c782>([[0, 0], [1, 1], [1, 1]], IdentityDecoder(), None),
         Individual<a1cd2799-f813-4a7e-b47e-fafedc8e7903

In [10]:
# lastly, demonstrate add_segment, which generates an entirely new segment
test_sequence = [12345]  # just an arbitrary sequence for testing

def gen_sequence():
    """ return an arbitrary static test_sequence """
    return test_sequence

pop = [Individual([[0,0],[1,1]]) for x in range(5)]
print('pop:', pformat(pop, indent=5))

new_pop = pipe(pop, 
               cyclic_selection,
               clone,
               add_segment(seq_initializer=gen_sequence, probability=1.0),
               pool(size=len(pop)))
print('new_pop:', pformat(new_pop, indent=9))

pop: [    Individual<75229661-8ab3-490e-91be-64f6ebfea6a0>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<2abd478f-41fe-40c6-b5f3-d70e41c3500d>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<13af2543-f6f9-438c-9cdb-0bd9663bd7e9>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<55e84425-5654-4244-8aba-737131bf8ee7>([[0, 0], [1, 1]], IdentityDecoder(), None),
     Individual<6b42b394-6202-4e77-ae26-4ff6cff65443>([[0, 0], [1, 1]], IdentityDecoder(), None)]
new_pop: [        Individual<08cce79f-4cdf-4cf0-a352-ba92d8dcf88b>([[0, 0], [12345], [1, 1]], IdentityDecoder(), None),
         Individual<5d26653b-5e91-4730-95f9-09b5eb8637fa>([[0, 0], [12345], [1, 1]], IdentityDecoder(), None),
         Individual<1f9c8c4f-f6e9-4a8b-ae30-82f96ffae896>([[0, 0], [12345], [1, 1]], IdentityDecoder(), None),
         Individual<b2974a3b-79d0-4deb-a28f-d48b14c03ace>([[0, 0], [12345], [1, 1]], IdentityDecoder(), None),
         Individual<f13a84d9-cb51-4886-933a-c73852cd