# Chapter 1

## Learning Frequent Patterns with Type I Feedback

In [1]:
from random import random
from random import choice

### Car Dataset

The following dataset contains three vehicles of class _Car_. They are characterized by five Boolean features. The first vehicle in the dataset, for instance, has _Four Wheels_, _Transports People_, does not have _Wings_, is _Blue_, but not _Yellow_.

In [2]:
cars = [
    {'Four Wheels':True, 'Transports People':True, 'Wings':False, 'Yellow':False, 'Blue':True},
    {'Four Wheels':True, 'Transports People':True, 'Wings':False, 'Yellow':True, 'Blue':False},
    {'Four Wheels':True, 'Transports People':True, 'Wings':False, 'Yellow':True, 'Blue':False}
]

### Rule Memory

You are now going to learn a single rule. To this end, the rule has its own memory for storing literals. A memory value in the range 1 to 10 measures how deeply the memory stores each literal.

* Values 1 to 5 stand for _Forgotten_. Value 1 means maximally forgotten while value 5 means almost memorized. Literals in this range _do not_ take part in the rule's condition.
* Values 6 to 10 mean _Memorized_. Value 6 stands for lightly retained, and value 10 represents maximally memorized. Literals in this range take part in the condition of the rule.

The rule starts up with all the literals in memory position 5. That is, the literals are about to be _Memorized_ but currently _Forgotten_.

In [3]:
memory = {'Four Wheels':5, 'NOT Four Wheels':5, 'Transports People':5, 'NOT Transports People':5, 'Wings':5, 'NOT Wings':5, 'Blue':5, 'NOT Blue':5, 'Yellow':5, 'NOT Yellow':5}
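
The same starting memory can also be built from the feature names instead of being typed out by hand. The following is a minimal sketch, not part of the original notebook: the names `FEATURES`, `initial_memory`, `is_memorized`, and `sketch_memory` are hypothetical and chosen so they do not overwrite the notebook's own `memory`.

```python
# Hypothetical helpers; the notebook itself writes the dictionary by hand.
FEATURES = ['Four Wheels', 'Transports People', 'Wings', 'Blue', 'Yellow']

def initial_memory(features, start=5):
    """Return a memory holding each feature and its negation at position `start`."""
    new_memory = {}
    for feature in features:
        new_memory[feature] = start
        new_memory['NOT ' + feature] = start
    return new_memory

def is_memorized(position):
    """Positions 6 to 10 count as Memorized, positions 1 to 5 as Forgotten."""
    return position >= 6

sketch_memory = initial_memory(FEATURES)
print(len(sketch_memory))                    # 10 literals: 5 features and 5 negations
print(is_memorized(sketch_memory['Wings']))  # False, position 5 is still Forgotten
```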

### Memorization and Forgetting

The _memorize value_ 0.1 decides how quickly the rule memorizes literals: each memorization step takes effect with probability 0.1. Conversely, the _forget value_ 0.9 decides how quickly the rule forgets literals: each forgetting step takes effect with probability 0.9.

In [4]:
memorize_value = 0.1
forget_value = 0.9

You memorize a literal by incrementing its position in memory unless already _Maximally Memorized_ in position 10. The memorization thus pushes the literals deeper into the rule's memory. The `memorize()` method implements memorization:

In [5]:
def memorize(literal):
    if random() <= memorize_value and memory[literal] < 10:
        memory[literal] += 1

You forget a literal by decrementing its position in memory unless already _Maximally Forgotten_ in position 1. The `forget()` method implements forgetting:

In [6]:
def forget(literal):
    if random() <= forget_value and memory[literal] > 1:
        memory[literal] -= 1
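
To get a feeling for the two values, one can count how many update attempts a single literal needs to travel from position 5 to either end of the memory. The sketch below is an illustration under assumptions: the helper `updates_until_limit` is hypothetical, it uses its own seeded `Random` instance rather than the notebook's `random()`, and it follows one literal in isolation. In expectation, memorizing from 5 to 10 takes 5 / 0.1 = 50 attempts, while forgetting from 5 to 1 takes 4 / 0.9 ≈ 4.4 attempts.

```python
from random import Random

def updates_until_limit(start, limit, step, probability, rng):
    """Count update attempts until a position moving by `step` reaches `limit`."""
    position = start
    attempts = 0
    while position != limit:
        attempts += 1
        # Same test as in memorize() and forget(): the update only sometimes takes effect.
        if rng.random() <= probability:
            position += step
    return attempts

rng = Random(0)  # fixed seed, only to make the sketch repeatable
trials = 1000
avg_memorize = sum(updates_until_limit(5, 10, +1, 0.1, rng) for _ in range(trials)) / trials
avg_forget = sum(updates_until_limit(5, 1, -1, 0.9, rng) for _ in range(trials)) / trials
print(avg_memorize, avg_forget)  # averages close to 50 and 4.4
```

With these settings the rule forgets roughly ten times faster than it memorizes, so only literals that are observed consistently stay in memory.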

The `get_condition()` method collects the literals in memory positions 6 to 10, which are the ones that take part in the condition of the rule.

In [7]:
def get_condition():
    condition = []
    for literal in memory:
        if memory[literal] >= 6:
            condition.append(literal)
    return condition
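
As a small illustration of the threshold, the sketch below applies the same test to a hand-made memory snapshot. The function `condition_from` and the values in `snapshot` are hypothetical; the function takes the memory as a parameter so that it leaves the notebook's `memory` untouched.

```python
def condition_from(memory_snapshot):
    """Return the literals stored in positions 6 to 10 of the given memory."""
    return [literal for literal, position in memory_snapshot.items() if position >= 6]

# Made-up positions, chosen to sit on both sides of the 5/6 boundary.
snapshot = {'Four Wheels': 8, 'NOT Four Wheels': 2, 'Wings': 5, 'NOT Wings': 6}
print(condition_from(snapshot))  # ['Four Wheels', 'NOT Wings']
```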

### Creating Patterns with AND and NOT

The Tsetlin machine solves pattern recognition problems by building _if-then rules_ from object observations. Each rule has the form:

<p><center>$\mathbf{if~} \mathit{condition} \mathbf{~then~}\mathit{class}.$</center></p>

The _condition_ part is a placeholder for a Boolean expression that describes a pattern in the data, to be learnt by the Tsetlin machine. The condition

<p><center>$\mathit{Four~Wheels} \mathbf{~and~} \mathit{Transports~People},$</center></p>

for instance, characterises the three cars. 

The `evaluate_condition()` method implements an __and__ of a list of literals.

In [8]:
def evaluate_condition(observation, condition):
    # The condition is an AND of literals, so a single violated literal makes it False.
    for feature in observation:
        # A plain literal is violated when the feature is False.
        if feature in condition and not observation[feature]:
            return False
        # A negated literal is violated when the feature is True.
        if 'NOT ' + feature in condition and observation[feature]:
            return False
    return True
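
The sketch below shows how such an __and__ behaves on a few inputs. It is a compact restatement written for this illustration, not the notebook's function: the name `holds` and the `plane` observation are hypothetical, while `blue_car` is the first vehicle of the dataset.

```python
def holds(observation, condition):
    """Return True unless some literal in the condition is violated by the observation."""
    return not any(
        (feature in condition and not value) or ('NOT ' + feature in condition and value)
        for feature, value in observation.items()
    )

blue_car = {'Four Wheels': True, 'Transports People': True, 'Wings': False, 'Yellow': False, 'Blue': True}
# Hypothetical non-car observation, used only for contrast.
plane = {'Four Wheels': False, 'Transports People': True, 'Wings': True, 'Yellow': False, 'Blue': True}

print(holds(blue_car, ['Four Wheels', 'Transports People']))  # True
print(holds(plane, ['Four Wheels', 'Transports People']))     # False
print(holds(blue_car, ['NOT Wings']))                         # True
print(holds(plane, []))                                       # True
```

Note that an empty condition is _True_ for every observation. This matters at the start of learning, when all literals sit in position 5 and the condition is still empty.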

### Type I Feedback

The `type_i_feedback()` method implements the two learning steps:

1. Check if the condition part of the rule is _True_ by assessing the object's literals. If the condition part is _True_, then memorize all the literals that are _True_ for the object: the feature itself when the feature is _True_, and its negation when the feature is _False_.

2. Forget all remaining literals by pushing them towards being _Maximally Forgotten_. If the condition part is _False_, nothing is memorized in step 1, so every literal is forgotten.

In [9]:
def type_i_feedback(observation):
    # Every literal is a candidate for forgetting until it is memorized below.
    remaining_literals = list(memory.keys())
    if evaluate_condition(observation, get_condition()):
        for feature in observation:
            # Pick the literal that is True for this observation.
            literal = feature if observation[feature] else 'NOT ' + feature
            memorize(literal)
            remaining_literals.remove(literal)
    # Forget whatever was not memorized (all literals if the condition was False).
    for literal in remaining_literals:
        forget(literal)
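
To make the two steps concrete, the sketch below lists which literals a single round of feedback would try to memorize and which it would try to forget, without touching any memory. The helper `split_literals` is hypothetical and leaves out the random part of `memorize()` and `forget()`; `first_car` is the first vehicle of the dataset.

```python
def split_literals(observation, condition_is_true):
    """Return (literals to memorize, literals to forget) for one feedback round."""
    all_literals = []
    for feature in observation:
        all_literals += [feature, 'NOT ' + feature]
    if not condition_is_true:
        # Nothing is memorized, so every literal is a candidate for forgetting.
        return [], all_literals
    to_memorize = [f if value else 'NOT ' + f for f, value in observation.items()]
    to_forget = [literal for literal in all_literals if literal not in to_memorize]
    return to_memorize, to_forget

first_car = {'Four Wheels': True, 'Transports People': True, 'Wings': False, 'Yellow': False, 'Blue': True}
to_memorize, to_forget = split_literals(first_car, condition_is_true=True)
print(to_memorize)  # ['Four Wheels', 'Transports People', 'NOT Wings', 'NOT Yellow', 'Blue']
print(to_forget)    # ['NOT Four Wheels', 'NOT Transports People', 'Wings', 'Yellow', 'NOT Blue']
```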

### Example Run

The following code randomly selects a car and then updates the rule. This procedure is repeated 100 times.

In [10]:
for i in range(100):
    observation_id = choice([0,1,2])
    type_i_feedback(cars[observation_id])
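
Because the updates are random, repeated runs of this loop can end in different memories. The sketch below is one possible way to make a run repeatable, assuming a seeded `Random` instance and a local memory instead of the notebook's global state. The function `train_rule` and the name `cars_copy` are hypothetical; the update logic is meant to mirror the cells above.

```python
from random import Random

def train_rule(observations, steps=100, memorize_value=0.1, forget_value=0.9, seed=None):
    """Run Type I feedback `steps` times and return the final memory (a sketch)."""
    rng = Random(seed)
    memory = {}
    for feature in observations[0]:
        memory[feature] = 5
        memory['NOT ' + feature] = 5
    for _ in range(steps):
        observation = rng.choice(observations)
        condition = [literal for literal in memory if memory[literal] >= 6]
        # The literals that are True for this observation.
        true_literals = [f if value else 'NOT ' + f for f, value in observation.items()]
        condition_holds = all(literal in true_literals for literal in condition)
        for literal in memory:
            if condition_holds and literal in true_literals:
                # Memorize with probability memorize_value, capped at position 10.
                if rng.random() <= memorize_value and memory[literal] < 10:
                    memory[literal] += 1
            elif rng.random() <= forget_value and memory[literal] > 1:
                # Forget with probability forget_value, floored at position 1.
                memory[literal] -= 1
    return memory

# Copy of the car dataset so that the sketch is self-contained.
cars_copy = [
    {'Four Wheels': True, 'Transports People': True, 'Wings': False, 'Yellow': False, 'Blue': True},
    {'Four Wheels': True, 'Transports People': True, 'Wings': False, 'Yellow': True, 'Blue': False},
    {'Four Wheels': True, 'Transports People': True, 'Wings': False, 'Yellow': True, 'Blue': False},
]
final_memory = train_rule(cars_copy, seed=1)
print(final_memory)
```

Calling `train_rule` twice with the same seed gives the same memory. Whatever the seed, a literal that is _False_ for every car, such as `Wings` or `NOT Four Wheels`, can never rise above its starting position 5, since only literals that are _True_ for the observed car are memorized.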

The memory now looks like this:

In [11]:
print(memory)

{'Four Wheels': 10, 'NOT Four Wheels': 1, 'Transports People': 10, 'NOT Transports People': 1, 'Wings': 1, 'NOT Wings': 10, 'Blue': 1, 'NOT Blue': 1, 'Yellow': 1, 'NOT Yellow': 1}


Observe that some of the literals are now either deeply memorized or forgotten. You can print out the resulting rule:

In [12]:
print("IF " + " AND ".join(get_condition()) + " THEN Car")

IF Four Wheels AND Transports People AND NOT Wings THEN Car
