# Learning Classifiers from Scratch

## Model Source

The initial, basic learning classifier system (LCS) model is taken from Dr. Ryan Urbanowicz's "Learning Classifier Systems in a Nutshell" video on YouTube, found here: https://youtu.be/CRge_cZ2cJc?si=1CM2osKW7CptJ-DM

This video's description of an LCS is the simplest and most digestible one found so far that still covers LCS operation completely. Additionally, some pseudocode snippets have been taken from Dr. Martin Butz's book "Rule-Based Evolutionary Online Learning Systems" and his algorithmic description of XCS.

### Step 0: Knowledge Priming

A small note on the structure of the LCS below, to help readability. The population is a list of dictionaries, where each dictionary is a classifier. Each classifier has many key:value pairs that represent its state, action, accuracy, etc. Thus, the population as a whole can be modified with list methods, and classifier information can be retrieved by dictionary key.
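As a minimal sketch of this structure, the snippet below builds a two-classifier population by hand. The classifiers and their values are made up for illustration; only the key names follow the ones used in the rest of this notebook.

```python
# Hypothetical two-classifier population; the values are illustrative only
example_population = [
    {'state': [(0, 1), (3, 0)], 'action': 1, 'numerosity': 1,
     'match count': 4, 'correct count': 3, 'accuracy': 0.75, 'fitness': 0.75 ** 5},
    {'state': [(2, 0)], 'action': 0, 'numerosity': 2,
     'match count': 10, 'correct count': 10, 'accuracy': 1.0, 'fitness': 1.0},
]

# List methods act on the population as a whole
example_population.sort(key=lambda c: c['fitness'], reverse=True)

# Dictionary keys retrieve information about a single classifier
best = example_population[0]
print(best['state'], best['action'], best['accuracy'])

# Population-level statistics are simple sums over the list
total_numerosity = sum(c['numerosity'] for c in example_population)
print(total_numerosity)
```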

### Step 1: Initialize Setup

Initialize the population and create the functions for creating empty match sets and action sets:

In [4]:
# Initialize the empty population. This is only called once at the beginning of the cycle.
def initialize_population():
    population = []
    return population

population = initialize_population()
print(population)

[]


In [6]:
# List editing test for sanity check

x = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
y = x
for i in x:
    i['a'] += 1

print(y)

[{'a': 2, 'b': 2}, {'a': 4, 'b': 4}]


### Step 2: Feeding Data to LCS

An LCS is an online learning mechanism, but it is normally trained from a dataset. Data from the dataset (during training) or from the environment (during testing) needs to be fed to the LCS one instance at a time.

In [7]:

data = './11Multiplexer_Data_Complete.csv'

# Get the length of the file so that the get_instance function doesn't return anything if requested line is not present
def get_data_length(data):
    with open(data, 'r') as file:
        return sum(1 for row in file)

# Convert the instance into a list of integers
def convert_int(instance):
    int_instance = []
    for i in instance:
        int_instance.append(int(i))
    return int_instance

# Create a function that gets the data from a file and returns a specified instance of the dataset to the LCS
# This returns a single training instance from the data and does not load the entire data file into memory
def get_instance(data, line_num):
    import csv
    lines = get_data_length(data)
    if line_num < 0 or line_num >= lines:
        return None # The file has rows 0 to lines - 1, so there is nothing to return for any other row number
    with open(data, 'r') as source:
        reader = csv.reader(source)
        for _ in range(line_num):
            next(reader) # Skip the rows before the requested one (row 0 is assumed to be the header)
        return convert_int(next(reader))

instance = get_instance(data, 1)
print(instance)

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
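Because `get_instance` re-opens the file and counts its lines on every call, a full pass over the data re-reads the file once per instance. As a hedged alternative, the sketch below streams the rows in a single pass with a generator. The names `iter_instances` and `has_header` are hypothetical and not part of the original notebook, and the sketch assumes every non-header row contains only integers.

```python
import csv

def iter_instances(path, has_header=True):
    """Yield each data row of a CSV file as a list of integers, one row at a time."""
    with open(path, 'r', newline='') as source:
        reader = csv.reader(source)
        if has_header:
            next(reader, None)  # Skip the header row if there is one
        for row in reader:
            if row:  # Ignore blank lines
                yield [int(value) for value in row]

# Possible usage (assumes `data` holds the path to the training file):
# for instance in iter_instances(data):
#     ...
```

Only one row is held in memory at a time, so the memory behaviour stays the same as with `get_instance`; the trade-off is that rows can only be visited in file order.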


### Step 3: Determine if classifiers in population match the current instance

Compare each classifier in the population to the current instance. If classifiers in the population match, they are each added to the match set.

In [8]:
# Create a does_match function that checks every specified attribute of a classifier state against an instance
# Each state is a list of (index, value) tuples. Only some indices are specified; any index that is not listed is equivalent to the "#" ("don't care") symbol
def does_match(state, instance):
    for index, value in state:
        if value != instance[index]:
            return False
    return True

# Create the match set by comparing the attributes of each classifier in the population with the current instance
def create_match_set(population, instance):
    match_set = []
    if len(population) == 0:
        return match_set # Returns an empty match set if the population length is zero; this is important for covering
    else:
        for classifier in population: # No slice is needed: the list is not resized in this loop, and each classifier dictionary is updated in place
            state = classifier['state']
            if does_match(state, instance):
                match_set.append(classifier)
                classifier['match count'] += 1
                classifier['accuracy'] = classifier['correct count'] / classifier['match count']
                classifier['fitness'] = classifier['accuracy'] ** 5
        return match_set

match_set = create_match_set(population, instance)
print(population)
print(match_set)
x = [(0,0), (1,0), (5,1), (4, 1)]
y = [0, 0, 1, 1, 1, 1]
print(does_match(x, y))

[]
[]
True
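To relate the index-value representation to the ternary strings used in most LCS descriptions, the sketch below converts between the two. The helper names `to_ternary_string` and `from_ternary_string` are hypothetical and are not used elsewhere in this notebook.

```python
def to_ternary_string(state, n_attributes):
    """Render a list of (index, value) tuples as a string over 0, 1 and '#'."""
    symbols = ['#'] * n_attributes  # Start with every attribute as "don't care"
    for index, value in state:
        symbols[index] = str(value)
    return ''.join(symbols)

def from_ternary_string(condition):
    """Convert a ternary string back into a sorted list of (index, value) tuples."""
    return [(index, int(symbol)) for index, symbol in enumerate(condition) if symbol != '#']

print(to_ternary_string([(0, 0), (1, 0), (5, 1), (4, 1)], 6))  # 00##11
print(from_ternary_string('00##11'))  # [(0, 0), (1, 0), (4, 1), (5, 1)]
```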


### Step 4: Generate the correct set

From the match set, create a correct set by comparing the action or class of each classifier with the action or class of the current instance.

In [9]:
# Create the correct set by comparing the class or action of each classifier in the match set with the current instance

def create_correct_set(match_set, instance):
    correct_set = []
    if len(match_set) == 0: # Similar to the match set, return an empty correct set if the match set is empty
        return correct_set
    else:
        for classifier in match_set: # No slice is needed here either: the classifiers are updated in place, so the population sees the same changes
            if classifier['action'] == instance[-1]: # This assumes that the classification is the last item in the training instance list, which is usually the case
                correct_set.append(classifier)
                classifier['correct count'] += 1
                classifier['accuracy'] = classifier['correct count'] / classifier['match count']
                classifier['fitness'] = classifier['accuracy'] ** 5
        return correct_set
        
correct_set = create_correct_set(match_set, instance)
print(population)
print(correct_set)

[]
[]


### Step 5: Covering

In most LCS, the population is initialized as being empty. Covering adds classifiers to the population using the current instance if the correct set is empty. This is also the step that turns the simple instance data into the classifier dictionary.

In [10]:
# Create a dictionary item to represent the current instance if the correct set is empty.

def covering(instance, iteration, specificity):
    import random
    state = []
    action = instance[-1]
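    for x in range(len(instance) - 1): # The last item of the instance is the class, so it is skipped
        if random.random() < specificity: # Specify this attribute with probability equal to the specificity
            state.append((x, instance[x]))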
    classifier = {'state': state, 
                  'action': action, 
                  'numerosity': 1, 
                  'match count': 1, 
                  'correct count': 1, 
                  'accuracy': 1, 
                  'fitness': 1,
                  'deletion vote': 1, 
                  'birth iteration': iteration}
    return classifier

classifier = covering(instance, 1, specificity=.5)

def update_population(classifier, population):
    population.append(classifier)
    return population
update_population(classifier, population)
print(population)

[{'state': [(0, 0), (6, 0)], 'action': 0, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}]


### Step 5.1: Population Filling

Below is an example of a loop that goes through the training data and creates a population from the training instances. The loop takes a dataset to train on and a specificity parameter between 0 and 1. A specificity of 1 means that the classifier states will be 100% specific, so the population will essentially be filled with one classifier for each distinct training instance. With anything less than 1, each bit in the state of a classifier has a chance of being left unspecified. An unspecified bit is equivalent to the # symbol in most LCS algorithms: the classifier "doesn't care" about its value.

The states are coded as a list of tuples that represent index-value pairs. For large datasets and complicated problems, this can have a significant speed and memory advantage over representing "don't care" bits with #.

In [11]:
def testing(data, specificity):
    population = initialize_population()
    length = get_data_length(data)
    for i in range(1, length):
        instance = get_instance(data, i)
        match_set = create_match_set(population, instance)
        correct_set = create_correct_set(match_set, instance)
        if len(correct_set) == 0:
            classifier = covering(instance, i, specificity=specificity)
            update_population(classifier, population)
    return population
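To get a feel for the specificity parameter, the sketch below repeats the covering step many times on one made-up 11-bit instance and reports the average number of specified attributes, which should come out close to specificity times the number of attributes. The helper `cover_state` is a hypothetical stand-in that only builds the state list, and the instance is illustrative rather than taken from the data file.

```python
import random

def cover_state(instance, specificity, rng):
    """Build a state from an instance, keeping each attribute with probability `specificity`."""
    # The last item of the instance is the class, so it is not part of the state
    return [(i, instance[i]) for i in range(len(instance) - 1) if rng.random() < specificity]

rng = random.Random(0)  # Seeded so the sketch is repeatable
example_instance = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]  # 11 attribute bits followed by the class

for specificity in (0.0, 0.5, 1.0):
    sizes = [len(cover_state(example_instance, specificity, rng)) for _ in range(1000)]
    print(specificity, sum(sizes) / len(sizes))
```

At a specificity of 0 every state is empty (it matches everything), and at 1 every state lists all 11 attributes.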

### Step 6: Genetic Algorithm

The genetic algorithm is the heart of learning for the LCS. It introduces new rules to the population and evolves accurate, general rules that apply to the training data. The three main portions of the GA are selection, crossover, and mutation, applied in that order.

Selection picks two parent classifiers from the correct set. It is most often done in one of two ways: proportionate selection or tournament selection. Proportionate selection makes the most logical sense at first, but it can significantly hinder learning performance. In proportionate selection, parents are selected with a probability directly proportional to their fitness. However, it is often the case during training that many classifiers have similar, low accuracy and only a few have high accuracy, so the chance of picking a highly accurate classifier is small. This is normally visualized with a roulette wheel. If the slices of the wheel were sized by classifier accuracy, one highly accurate classifier might take up 25% of the wheel while thousands of classifiers with poor accuracy take up the other 75%. Spinning the wheel to choose a classifier means that you'll pick an inaccurate classifier 75% of the time. There are ways around this, like fitness sharing for proportionate selection, but tournament selection is simpler and will be used here. Tournament selection randomly selects a number of classifiers from the correct set, and the one with the highest fitness (here, a power of its accuracy) is chosen as a parent. This is repeated for the second parent.

Crossover exchanges attributes between the parent classifier states to create potentially new classifiers. The three main crossover mechanisms are uniform, single-point, and two-point crossover. Uniform crossover goes one attribute at a time and randomly exchanges the values between the two parents. It introduces the most diversity into the population but has two major drawbacks: it is more difficult to perform than the other two (in terms of computation and even of coding it), and it can disrupt learning significantly. For example, if two very accurate classifiers are chosen as parents, uniform crossover can scramble their attributes into new classifiers that look nothing like either parent. Thus, single-point or two-point crossover is traditionally used. Single-point crossover chooses a random index in the parent classifiers and swaps everything after that point. In this way, at least 50% of one parent's attribute positions stay together in each offspring while attributes from the other parent are introduced. Two-point crossover does the same thing but chooses two indices and swaps the portion between them. With this method, about 66% of each parent is preserved on average.

Mutation is applied to the offspring of the two parent classifiers. With a small probability, it converts a generalized attribute into a specified one or vice versa. When a generalized attribute is converted to a specified one, the specified value is set to match the current training instance.

Lastly, subsumption checks whether the parents are more general than their offspring. If so, the offspring are not added to the population and the numerosity of the subsuming parent is increased.

In [12]:
# Create a function that takes in a set, like the correct set, and draws a random tournament from it

def tournament_selection(correct_set, tournament_size):
    import random
    tournament = random.choices(correct_set, k=tournament_size) # Samples with replacement, so a classifier can appear more than once
    return tournament

# Create a function that selects the parent with the highest fitness from the tournament

def parent_selection(tournament):
    return max(tournament, key=lambda classifier: classifier['fitness']) # Ties go to the first classifier with the highest fitness
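For comparison with the tournament functions, proportionate (roulette wheel) selection could be sketched as follows. This is not used anywhere in this notebook; the function name `proportionate_selection` and the toy classifiers are assumptions made for illustration. The toy set mirrors the roulette wheel example: one classifier with fitness 1.0 and 300 classifiers with fitness 0.01, so the accurate classifier owns roughly a quarter of the wheel.

```python
import random

def proportionate_selection(candidates, rng=random):
    """Pick one classifier with probability proportional to its fitness."""
    weights = [c['fitness'] for c in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)  # Fall back to a uniform pick if every fitness is zero
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical correct set: one accurate classifier among many inaccurate ones
toy_set = [{'name': 'accurate', 'fitness': 1.0}]
toy_set += [{'name': 'inaccurate', 'fitness': 0.01} for _ in range(300)]

rng = random.Random(1)  # Seeded so the sketch is repeatable
picks = [proportionate_selection(toy_set, rng)['name'] for _ in range(2000)]
share = picks.count('accurate') / len(picks)
print(share)  # Should land near 0.25
```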

### Step 6.1: Test tournament selection and parent selection

Create a test for the tournament selection and parent selection functions. Normally, selection only takes place in the correct set. For testing's sake, it is applied to the whole population here, which is why the two parents below can have different actions.

In [13]:
population = testing(data, .5)
print(population)
print(len(population))

[{'state': [(1, 0), (4, 0), (5, 0), (6, 0), (7, 0)], 'action': 0, 'numerosity': 1, 'match count': 64, 'correct count': 48, 'accuracy': 0.75, 'fitness': 0.2373046875, 'deletion vote': 1, 'birth iteration': 1}, {'state': [(0, 0), (1, 0), (2, 0), (3, 0), (5, 0), (6, 0), (8, 0), (9, 0)], 'action': 0, 'numerosity': 1, 'match count': 6, 'correct count': 6, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 9}, {'state': [(1, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 1), (8, 0), (9, 1), (10, 0)], 'action': 0, 'numerosity': 1, 'match count': 4, 'correct count': 3, 'accuracy': 0.75, 'fitness': 0.2373046875, 'deletion vote': 1, 'birth iteration': 11}, {'state': [(2, 0), (5, 0), (6, 0), (7, 1), (8, 0), (9, 1), (10, 1)], 'action': 0, 'numerosity': 1, 'match count': 16, 'correct count': 6, 'accuracy': 0.375, 'fitness': 0.007415771484375, 'deletion vote': 1, 'birth iteration': 12}, {'state': [(0, 0), (1, 0), (2, 0), (7, 1), (8, 1), (9, 0), (10, 0)], 'action': 0, 'numerosity': 1, '

In [14]:
tournament1 = tournament_selection(population, round(len(population) / 5))
print(tournament1)
print(len(tournament1))

[{'state': [(2, 1), (3, 0), (6, 0), (7, 0), (8, 0), (9, 1), (10, 1)], 'action': 0, 'numerosity': 1, 'match count': 7, 'correct count': 3, 'accuracy': 0.42857142857142855, 'fitness': 0.014458261438686257, 'deletion vote': 1, 'birth iteration': 1316}, {'state': [(0, 0), (3, 0), (4, 0), (6, 0), (7, 1), (8, 0), (9, 1), (10, 1)], 'action': 0, 'numerosity': 1, 'match count': 6, 'correct count': 5, 'accuracy': 0.8333333333333334, 'fitness': 0.401877572016461, 'deletion vote': 1, 'birth iteration': 268}, {'state': [(0, 0), (1, 0)], 'action': 1, 'numerosity': 1, 'match count': 316, 'correct count': 188, 'accuracy': 0.5949367088607594, 'fitness': 0.07453389774543417, 'deletion vote': 1, 'birth iteration': 197}, {'state': [(1, 0), (2, 1), (3, 0)], 'action': 1, 'numerosity': 1, 'match count': 42, 'correct count': 22, 'accuracy': 0.5238095238095238, 'fitness': 0.03943364769872244, 'deletion vote': 1, 'birth iteration': 1367}, {'state': [(6, 1), (8, 1), (9, 0), (10, 1)], 'action': 1, 'numerosity': 1

In [15]:
parent1 = parent_selection(tournament1)
print(parent1)

{'state': [(0, 0), (1, 1), (2, 0), (3, 1), (5, 1), (7, 1), (8, 0)], 'action': 1, 'numerosity': 1, 'match count': 6, 'correct count': 6, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 747}


In [16]:
tournament2 = tournament_selection(population, round(len(population) / 5))
print(tournament2)
print(len(tournament2))

[{'state': [(0, 0), (1, 0)], 'action': 1, 'numerosity': 1, 'match count': 316, 'correct count': 188, 'accuracy': 0.5949367088607594, 'fitness': 0.07453389774543417, 'deletion vote': 1, 'birth iteration': 197}, {'state': [(1, 0), (2, 1), (3, 0), (4, 0), (5, 0), (6, 0), (8, 1), (10, 1)], 'action': 0, 'numerosity': 1, 'match count': 6, 'correct count': 2, 'accuracy': 0.3333333333333333, 'fitness': 0.004115226337448558, 'deletion vote': 1, 'birth iteration': 270}, {'state': [(0, 0), (3, 0), (5, 1), (8, 1), (10, 0)], 'action': 1, 'numerosity': 1, 'match count': 31, 'correct count': 23, 'accuracy': 0.7419354838709677, 'fitness': 0.22481780895283973, 'deletion vote': 1, 'birth iteration': 551}, {'state': [(1, 0), (2, 0), (3, 1), (4, 0), (7, 0), (8, 0), (9, 1)], 'action': 0, 'numerosity': 1, 'match count': 4, 'correct count': 4, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 1187}, {'state': [(3, 0), (5, 0), (6, 0)], 'action': 0, 'numerosity': 1, 'match count': 18, 'co

In [17]:
parent2 = parent_selection(tournament2)
print(parent1)
print(parent2)

{'state': [(0, 0), (1, 1), (2, 0), (3, 1), (5, 1), (7, 1), (8, 0)], 'action': 1, 'numerosity': 1, 'match count': 6, 'correct count': 6, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 747}
{'state': [(1, 0), (2, 0), (3, 1), (4, 0), (7, 0), (8, 0), (9, 1)], 'action': 0, 'numerosity': 1, 'match count': 4, 'correct count': 4, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 1187}


### Step 6.2: Perform Crossover

Create a function that takes two parent classifiers and crosses over their attributes using single-point crossover.

In [18]:
def crossover(parent1, parent2, birth_iteration):
    import random
    parent1_attributes = parent1['state']
    parent2_attributes = parent2['state']
    action = parent1['action'] # Assumes that crossover only takes place on the correct set, thus both parents have the same action
    offspring1_attributes = []
    offspring2_attributes = []
    if len(parent1_attributes) == 0 and len(parent2_attributes) == 0:
        largest_index = 0
    elif len(parent1_attributes) == 0:
        largest_index = parent2_attributes[-1][0]
    elif len(parent2_attributes) == 0:
        largest_index = parent1_attributes[-1][0]
    elif parent1_attributes[-1][0] >= parent2_attributes[-1][0]:
        largest_index = parent1_attributes[-1][0]
    else:
        largest_index = parent2_attributes[-1][0]
    crossover_point = random.randint(0, (largest_index - 1)) if largest_index > 0 else 0 # Use largest index minus 1 or else there is no crossover if the point is equal to largest index
    # The 4 for loops seem excessive, but it keeps the attributes in order by index value
    for i in parent1_attributes:
        if i[0] <= crossover_point:
            offspring1_attributes.append(i)
    for i in parent2_attributes:
        if i[0] <= crossover_point:
            offspring2_attributes.append(i)
    for i in parent1_attributes:
        if i[0] > crossover_point:
            offspring2_attributes.append(i)
    for i in parent2_attributes:
        if i[0] > crossover_point:
            offspring1_attributes.append(i)
    offspring1 = {'state': offspring1_attributes, 
                    'action': action, 
                    'numerosity': 1, 
                    'match count': 1, 
                    'correct count': 1, 
                    'accuracy': 1, 
                    'fitness': 1,
                    'deletion vote': 1, 
                    'birth iteration': birth_iteration}
    offspring2 = {'state': offspring2_attributes, 
                'action': action, 
                'numerosity': 1, 
                'match count': 1, 
                'correct count': 1, 
                'accuracy': 1, 
                'fitness': 1,
                'deletion vote': 1, 
                'birth iteration': birth_iteration}


    return offspring1, offspring2

son, daughter = crossover(parent1, parent2, 1)

print(parent1)
print(parent2)
print(son)
print(daughter)


{'state': [(0, 0), (1, 1), (2, 0), (3, 1), (5, 1), (7, 1), (8, 0)], 'action': 1, 'numerosity': 1, 'match count': 6, 'correct count': 6, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 747}
{'state': [(1, 0), (2, 0), (3, 1), (4, 0), (7, 0), (8, 0), (9, 1)], 'action': 0, 'numerosity': 1, 'match count': 4, 'correct count': 4, 'accuracy': 1.0, 'fitness': 1.0, 'deletion vote': 1, 'birth iteration': 1187}
{'state': [(0, 0), (1, 1), (2, 0), (3, 1), (5, 1), (7, 1), (8, 0), (9, 1)], 'action': 1, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}
{'state': [(1, 0), (2, 0), (3, 1), (4, 0), (7, 0), (8, 0)], 'action': 1, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}
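Two-point crossover is described in the text but not implemented in this notebook. A minimal sketch on the same index-value representation might look like the following. The function name `two_point_crossover`, its `n_attributes` parameter, and the example states are assumptions for illustration; the function works on state lists only, not on full classifier dictionaries.

```python
import random

def two_point_crossover(state1, state2, n_attributes, rng=random):
    """Swap the attributes whose index falls between two random cut points."""
    # Choose two distinct cut points; indices in [low, high) are exchanged
    low, high = sorted(rng.sample(range(n_attributes + 1), 2))

    def inside(attribute):
        return low <= attribute[0] < high

    # Each child keeps its own parent's attributes outside the window
    # and takes the other parent's attributes inside it
    child1 = sorted([a for a in state1 if not inside(a)] + [a for a in state2 if inside(a)])
    child2 = sorted([a for a in state2 if not inside(a)] + [a for a in state1 if inside(a)])
    return child1, child2

# Illustrative parent states over 11 attributes
mother = [(0, 0), (1, 1), (2, 0), (3, 1), (5, 1), (7, 1), (8, 0)]
father = [(1, 1), (2, 0), (4, 0), (7, 1), (8, 0), (9, 1)]
child_a, child_b = two_point_crossover(mother, father, 11, random.Random(3))
print(child_a)
print(child_b)
```

Because each index comes from exactly one parent in each child, a child can never end up with two attributes for the same index.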


### Step 6.3: Mutation

Mutation randomly switches a defined attribute to undefined, or an undefined attribute to a defined one that matches the current training instance.

In [19]:
def mutation(classifier, instance, rate):
    import random
    half_rate = rate / 2 # Split the mutation rate in half: one half for specifying an attribute, the other half for deleting one
    specified = {index for index, _ in classifier['state']} # Indices that already have a value in the state
    for j in range(len(instance) - 1):
        rand = random.random()
        if half_rate <= rand <= rate and j not in specified: # Only specify an index that is currently "don't care", which prevents duplicate or contradictory attributes
            classifier['state'].append((j, instance[j]))
            specified.add(j)
    for i in classifier['state'][:]: # Iterate over a copy (the slice) because attributes are removed from the original list inside the loop
        rand = random.random()
        if rand < half_rate: # Rate divided by two to differentiate between deletion and specialization of attributes
            classifier['state'].remove(i) # Randomly delete the specified attribute from the classifier state
    classifier['state'].sort() # Sorts the tuples by index, making sure that they are always in order
    return classifier

mutated = mutation(daughter, instance, .5)

print(mutated)
            


{'state': [(0, 0), (1, 0), (2, 0), (3, 1), (4, 0), (9, 0)], 'action': 1, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}


### Step 6.4: Subsumption

Subsumption can either take place before the genetic algorithm, acting on the correct set, or it can take place only on the children of the genetic algorithm. XCS has a toggle parameter for each of these: one switches correct set (action set) subsumption on or off, and another does the same for subsumption of the children of the genetic algorithm. For simplicity's sake, only the GA subsumption will be used here. Correct set subsumption is relatively computationally intensive, as it checks all classifiers against all other classifiers. For large populations and correct sets, this can take a while.

Correct set subsumption checks whether a classifier is more general than, and at least as accurate as, another classifier. If it is, the numerosity of the more general classifier is increased and the less general classifier is deleted from the population.

GA subsumption simply checks whether the parent classifiers are more general than their children. If they are, the parent's numerosity is increased and the children are simply not added to the population. The numerosity is increased directly on the classifier in the population.

**Note: Total population subsumption needs to be checked with the offspring, because a child may already exist in the population without having been part of the match set or correct set. The genetic algorithm can then produce an offspring equivalent to a classifier that is already in the population, which creates multiple classifiers with the same state and action.

***Note: Although other researchers and their algorithms suggest that population subsumption is not necessary, it seems to be required here to eliminate more specific rules. An alternative could be to alter the fitness function so that it takes specificity into account.

In [20]:
import copy
def more_general(parent, offspring):
    for i in parent['state']:
        if i not in offspring['state']: # Every attribute the parent specifies must also appear in the offspring with the same value
            return False
    return True # Also True when the two states are identical

def subsumption(classifier, population):
    for i in population: # Do you need to loop through the whole population here? Or can you just update the parent parameters
        if i['state'] == classifier['state'] and i['action'] == classifier['action']:
            i['numerosity'] += 1
            #i['deletion vote'] = i['numerosity'] / i['fitness']
            i['deletion vote'] = 1 / i['fitness']
    return
# Check to see if the offspring is already in the population
def already_in(offspring, population):
    for i in population:
        if i['state'] == offspring['state'] and i['action'] == offspring['action']:
            return True
    return False

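# Create a function to do correct set subsumption
def set_subsumption(population):
    snapshot = list(population) # Shallow copy: the loops see the real classifier dictionaries, so numerosity updates are kept
    removed = set() # ids of the classifiers that have already been subsumed
    for i in snapshot:
        if id(i) in removed:
            continue
        for j in snapshot:
            if i is j or id(j) in removed:
                continue
            # A classifier can only subsume another one that has the same action
            if i['action'] == j['action'] and i['state'] != j['state'] and more_general(i, j) and i['accuracy'] >= j['accuracy']:
                i['numerosity'] += j['numerosity']
                removed.add(id(j))
    population[:] = [c for c in population if id(c) not in removed] # Remove the subsumed classifiers from the list in place
    return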



x = {'state': [(0, 0), (1, 1), (4, 1),], 'action': 1, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}
y = {'state': [(0, 0), (1, 1), (4, 1)], 'action': 1, 'numerosity': 1, 'match count': 1, 'correct count': 1, 'accuracy': 1, 'fitness': 1, 'deletion vote': 1, 'birth iteration': 1}
print(more_general(x, y))

True
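The check above returns True for two identical states, so "more general" here really means "at least as general". As a small sketch, the same test can be written with set operations, which also makes a strict variant easy to express. The function names `is_more_general` and `is_strictly_more_general` and the example states are hypothetical, and both functions take state lists rather than full classifier dictionaries.

```python
def is_more_general(general_state, specific_state):
    """True if every attribute in general_state also appears in specific_state."""
    return set(general_state) <= set(specific_state)

def is_strictly_more_general(general_state, specific_state):
    """Same check, but identical states do not count."""
    return set(general_state) < set(specific_state)

general = [(0, 0), (1, 1)]
specific = [(0, 0), (1, 1), (4, 1)]

print(is_more_general(general, specific))            # True
print(is_more_general(specific, general))            # False
print(is_more_general(specific, specific))           # True
print(is_strictly_more_general(specific, specific))  # False
```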



### Step 6.5: Combine GA Parts

Lastly, we can combine all the GA parts into one function.

In [21]:
def genetic_algorithm(population, correct_set, tournament_size_fraction, mutation_rate, training_instance, birth_iteration):
    import math
    # Create tournaments from correct set with size equal to a percentage of the correct set size
    tournament1 = tournament_selection(correct_set, math.ceil(tournament_size_fraction * len(correct_set))) # Rounds the tournament size up to a whole number if the correct set is small
    tournament2 = tournament_selection(correct_set, math.ceil(tournament_size_fraction * len(correct_set)))
    # Select parents from the two tournaments
    parent1 = parent_selection(tournament1)
    parent2 = parent_selection(tournament2)
    # Cross over the parents and produce two offspring
    offspring1, offspring2 = crossover(parent1, parent2, birth_iteration)
    # Mutate the offspring based off the mutation rate
    offspring1 = mutation(offspring1, training_instance, mutation_rate)
    offspring2 = mutation(offspring2, training_instance, mutation_rate)
    # Check if each parent is more general than each child, if so, subsume the child
    if more_general(parent1, offspring1):
        subsumption(parent1, population)
    elif more_general(parent2, offspring1):
        subsumption(parent2, population)
    elif already_in(offspring1,population):
        subsumption(offspring1, population)
    else: # If the child is not subsumed by either parent, and not already in the population add it to the population
        population.append(offspring1)
    if more_general(parent1, offspring2):
        subsumption(parent1, population)
    elif more_general(parent2, offspring2):
        subsumption(parent2, population)
    elif already_in(offspring2, population):
        subsumption(offspring2, population)
    else:
        population.append(offspring2)
    return
    

### Step 7: Deletion

The final step in the LCS algorithm is to delete classifiers from the population if the total numerosity is greater than some specified number. Classifiers are deleted with a preference that is inversely proportional to their fitness and directly proportional to their numerosity, which prevents a small number of high-numerosity classifiers from taking over the population. (The implementation below currently computes the deletion vote as 1 / fitness only, with the numerosity-weighted version commented out, and always removes the classifier with the highest vote.) When a classifier is "deleted", its numerosity is reduced by 1. If a classifier's numerosity drops to zero, it is removed entirely from the population. Lastly, some LCS implementations add protection mechanisms for young classifiers so that they can't be deleted right after they are born. However, since classifiers here are initialized with a numerosity and fitness of 1, they start out among the least likely to be deleted.

**Note: For most problems, a deletion vote based on numerosity/fitness makes sense since it keeps a few rules from taking over the population. However, sometimes the "correct" answer to a problem is only a few rules. For example, the 6-bit multiplexer problem has 1,458 potential rules (3^6 × 2), but only 8 optimal rules. In this case, the LCS should try to find only those 8 optimal rules, each with high numerosity. Perhaps add a parameter that switches the deletion vote calculation based on problem type?
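The rule counts in the note can be checked with a short sketch. It assumes the common 6-bit multiplexer layout in which the first two bits are the address and the remaining four are data bits; that layout, the function `multiplexer6`, and the list `optimal_rules` are assumptions for illustration and are not taken from the data file used in this notebook.

```python
from itertools import product

def multiplexer6(bits):
    """Return the data bit selected by the two address bits (assumed layout: address first)."""
    address = bits[0] * 2 + bits[1]
    return bits[2 + address]

# Each attribute is 0, 1 or "don't care" (3 options), and there are 2 possible actions
print(3 ** 6 * 2)  # 1458

# One rule per (address, data value) pair: 4 addresses x 2 values = 8 rules
optimal_rules = []
for a0, a1, value in product((0, 1), repeat=3):
    address = a0 * 2 + a1
    optimal_rules.append({'state': [(0, a0), (1, a1), (2 + address, value)], 'action': value})
print(len(optimal_rules))  # 8

# Every one of the 64 possible inputs is matched by exactly one rule, and that rule is correct
for bits in product((0, 1), repeat=6):
    matching = [r for r in optimal_rules if all(bits[i] == v for i, v in r['state'])]
    assert len(matching) == 1
    assert matching[0]['action'] == multiplexer6(bits)
```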

In [22]:
def deletion(population, max_size):
    cumulative_numerosity = sum(i['numerosity'] for i in population) # Sum all the numerosities in the population
    while cumulative_numerosity > max_size: # Continue deletion until the cumulative numerosity is less than or equal to the max size; no deletion occurs if it already is
        population.sort(key=lambda d: d['deletion vote'], reverse=True) # Sort the population by deletion vote so that the highest is at the front
        if population[0]['numerosity'] > 1: # If the numerosity of the highest voted classifier is greater than 1, decrease its numerosity by 1
            population[0]['numerosity'] -= 1
            #population[0]['deletion vote'] = population[0]['numerosity'] / population[0]['fitness'] #Update the deletion vote of the first classifier
            population[0]['deletion vote'] = 1 / population[0]['fitness']
        else:
            population.pop(0) # If the numerosity is 1, simply remove the classifier from the population
        cumulative_numerosity = sum(i['numerosity'] for i in population) # Recalculate the cumulative numerosity before the next pass of the while loop
    return
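The text describes deletion as proportional to numerosity and inversely proportional to fitness, while the function above always removes the classifier with the highest stored vote. A roulette wheel version of the described behaviour could be sketched as below. This is an alternative under stated assumptions, not the notebook's method: the name `roulette_deletion`, the small fitness floor that guards against division by zero, and the toy population are all hypothetical.

```python
import random

def roulette_deletion(population, max_size, rng=random):
    """Reduce total numerosity to max_size, picking victims in proportion to numerosity / fitness."""
    total = sum(c['numerosity'] for c in population)
    while total > max_size:
        # Recompute the votes on every pass so they reflect the current numerosities
        votes = [c['numerosity'] / max(c['fitness'], 1e-9) for c in population]
        victim = rng.choices(population, weights=votes, k=1)[0]
        victim['numerosity'] -= 1
        if victim['numerosity'] == 0:
            # Remove by identity so an equal-looking classifier is never removed by mistake
            population[:] = [c for c in population if c is not victim]
        total -= 1

# Illustrative population with a total numerosity of 9
toy_population = [
    {'state': [(0, 1)], 'action': 1, 'numerosity': 5, 'fitness': 1.0},
    {'state': [(1, 0)], 'action': 0, 'numerosity': 3, 'fitness': 0.2},
    {'state': [(2, 1)], 'action': 1, 'numerosity': 1, 'fitness': 0.05},
]
roulette_deletion(toy_population, 6, random.Random(0))
print(sum(c['numerosity'] for c in toy_population))  # 6
```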


### Step 8: Compaction

The final step of an LCS is to compact the population into a set of rules that can be used for testing. Compaction simply deletes the inaccurate rules that will inevitably be generated during the last few training cycles. The deletion criterion can be highly elitist, removing every rule with an accuracy lower than a specified value. For simple problems, this cutoff might even be set to 1.

In [23]:
# This should be a simple function, but for some reason, seems to do nothing

def compaction(population, accuracy_cutoff):
    for i in population[:]: # The slice makes a copy of the population so that items can be removed from the original while iterating
        if i['accuracy'] < accuracy_cutoff:
            population.remove(i)
    set_subsumption(population)
    return

### Step 9: Combine all Parts of the LCS Algorithm

Finally, we can combine all steps of the LCS algorithm together.

In [24]:
def binaryLCS(data, covering_specificity, tournament_size_fraction, mutation_rate, max_pop_size, accuracy_cutoff, learning_epochs):
    population = initialize_population() # Create an empty population
    length = get_data_length(data) # Get the length of the training data file
    birth_iteration = 1 # Initialize the birth iteration
    for learning_epoch in range(learning_epochs): # Run exactly as many epochs as requested
        for i in range(1, length): # Starting at line 1 assumes that the first line is the header
            instance = get_instance(data, i) # Get the current training instance from the data
            match_set = create_match_set(population, instance) # Create the match set from the population and the training instance
            correct_set = create_correct_set(match_set, instance) # Create the correct set from the match set and the training instance
            if len(correct_set) == 0: # Activate covering if the correct set is empty
                classifier = covering(instance, birth_iteration, covering_specificity) # Create a new classifier that matches the current training instance
                population.append(classifier) # Add the new classifier to the population
                birth_iteration += 1
            else:
                set_subsumption(correct_set) # This only removes subsumed classifiers from the correct set list, not from the population
                # If the correct set is not empty, activate the genetic algorithm
                genetic_algorithm(population, correct_set, tournament_size_fraction, mutation_rate, instance, birth_iteration)
                birth_iteration += 1
        #population_subsumption(population)
        deletion(population, max_pop_size) # Deletion runs once at the end of each epoch
    compaction(population, accuracy_cutoff)
    return population

In [25]:
import pprint
rule_set = binaryLCS(data, .5, .5, .3, 1000, 1.0, 10)
pprint.pprint(rule_set)

[{'accuracy': 1.0,
  'action': 0,
  'birth iteration': 17523,
  'correct count': 6,
  'deletion vote': 1,
  'fitness': 1.0,
  'match count': 6,
  'numerosity': 1,
  'state': [(1, 0), (2, 0), (3, 0), (7, 0)]},
 {'accuracy': 1.0,
  'action': 1,
  'birth iteration': 17533,
  'correct count': 100,
  'deletion vote': 1.0,
  'fitness': 1.0,
  'match count': 100,
  'numerosity': 6,
  'state': [(1, 0), (7, 1), (8, 1)]},
 {'accuracy': 1.0,
  'action': 0,
  'birth iteration': 17585,
  'correct count': 84,
  'deletion vote': 1.0,
  'fitness': 1.0,
  'match count': 84,
  'numerosity': 10,
  'state': [(2, 0), (7, 0), (9, 0)]},
 {'accuracy': 1.0,
  'action': 1,
  'birth iteration': 17632,
  'correct count': 17,
  'deletion vote': 1,
  'fitness': 1.0,
  'match count': 17,
  'numerosity': 1,
  'state': [(0, 1), (1, 0), (2, 0), (7, 1)]},
 {'accuracy': 1.0,
  'action': 1,
  'birth iteration': 17669,
  'correct count': 128,
  'deletion vote': 1.0,
  'fitness': 1.0,
  'match count': 128,
  'numerosity': 3

### Step 10: Testing

Use the rule set to classify the data and measure how well its predictions match the recorded classes. Each matching rule votes for its action with a weight equal to its numerosity.

In [26]:
def testing(rule_set, data):
    if len(rule_set) == 0:
        return None
    length = get_data_length(data)
    number_correct = 0
    for i in range(1, length):
        instance = get_instance(data, i)
        match_set = create_match_set(rule_set, instance)
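        vote1 = sum(j['numerosity'] for j in match_set if j['action'] == 0) # Numerosity-weighted votes for action 0
        vote2 = sum(k['numerosity'] for k in match_set if k['action'] == 1) # Numerosity-weighted votes for action 1
        if vote1 > vote2:
            vote = 0
        elif vote2 > vote1:
            vote = 1
        else:
            vote = None # Tie or empty match set: make no prediction, which counts as incorrect
        if vote == instance[-1]:
            number_correct += 1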
    percent_correct = number_correct / (length - 1)
    return percent_correct

print(testing(rule_set, data))

0.57763671875
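
The voting step can also be looked at in isolation. The sketch below is one possible way to write a numerosity-weighted vote that works for any number of actions and returns `None` on a tie or an empty match set. The name `weighted_vote` and the toy match set are assumptions for illustration, not part of the notebook's code.

```python
def weighted_vote(match_set):
    """Return the action with the largest summed numerosity, or None on a tie."""
    votes = {}
    for classifier in match_set:
        action = classifier['action']
        votes[action] = votes.get(action, 0) + classifier['numerosity']
    if not votes:
        return None  # nothing matched, so there is no prediction
    best = max(votes.values())
    winners = [action for action, total in votes.items() if total == best]
    return winners[0] if len(winners) == 1 else None


# Hypothetical match set, reduced to the two keys the vote uses.
toy_match_set = [
    {'action': 0, 'numerosity': 10},
    {'action': 1, 'numerosity': 6},
    {'action': 1, 'numerosity': 3},
]
print(weighted_vote(toy_match_set))  # 0 (10 votes against 9)
print(weighted_vote([]))  # None
```

Returning `None` for ties keeps an undecided instance from silently reusing the prediction made for the previous instance.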


### Step 11: Compare to UrbsLab eLCS and XCS

Compare the core LCS above to similar LCS models. At first glance, the UrbsLab models are many thousands of lines of code but run considerably faster. Compared to the eLCS Jupyter notebook by UrbsLab, the from-scratch LCS seems to be more accurate with smaller rule populations and fewer learning cycles.

In [27]:
#Import Necessary Packages/Modules
from skeLCS import eLCS
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score

#Load Data Using Pandas
data = pd.read_csv('./11Multiplexer_Data_Complete.csv') #REPLACE with your own dataset .csv filename
classLabel = 'Class' #DEFINE classLabel variable as the Str at the top of your dataset's class column
dataFeatures = data.drop(classLabel,axis=1).values
dataPhenotypes = data[classLabel].values

#Shuffle Data Before CV
formatted = np.insert(dataFeatures,dataFeatures.shape[1],dataPhenotypes,1)
np.random.shuffle(formatted)
dataFeatures = np.delete(formatted,-1,axis=1)
dataPhenotypes = formatted[:,-1]

#Initialize eLCS Model
model = eLCS(learning_iterations = 5000)

#3-fold CV
print(np.mean(cross_val_score(model,dataFeatures,dataPhenotypes,cv=3)))


0.9740972257950187


In [28]:
#Import Necessary Packages/Modules
from skeLCS import eLCS
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score

#Load Data Using Pandas
data = pd.read_csv('./11Multiplexer_Data_Complete.csv') #REPLACE with your own dataset .csv filename
classLabel = 'Class' #DEFINE classLabel variable as the Str at the top of your dataset's class column
dataFeatures = data.drop(classLabel,axis=1).values
dataPhenotypes = data[classLabel].values

#Shuffle Data Before CV
formatted = np.insert(dataFeatures,dataFeatures.shape[1],dataPhenotypes,1)
np.random.shuffle(formatted)
dataFeatures = np.delete(formatted,-1,axis=1)
dataPhenotypes = formatted[:,-1]

#Initialize eLCS Model
model = eLCS(learning_iterations = 5000)

#3-fold CV
print(np.mean(cross_val_score(model,dataFeatures,dataPhenotypes,cv=3)))

0.9965829774828077
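
The from-scratch score in Step 10 is measured on the same data that was used for training, whereas `cross_val_score` scores held-out folds. A like-for-like comparison would need a similar split for the from-scratch LCS. The sketch below shows one way to build k-fold index splits with only the standard library. `k_fold_indices` is a hypothetical helper, the row count of 2048 assumes the complete 11-bit multiplexer table, and wiring the indices into `binaryLCS` and `testing` (which read from a file path) is left open.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # deal the rows out round-robin
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Assumption: 2**11 = 2048 rows, the full truth table of the 11-bit multiplexer.
for train, test in k_fold_indices(2048, 3):
    print(len(train), len(test))
```

Each row index appears in exactly one test fold, so averaging the per-fold accuracy gives a figure comparable in spirit to the 3-fold mean printed above.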
