# XCS Tutorial

This is the official tutorial for the [xcs package](https://pypi.python.org/pypi/xcs) for Python 3. You can find the latest release and get updates on the project's status at the [project home page](https://github.com/hosford42/xcs) on GitHub.com.

## Installation

If you have pip, installation is straightforward: run `pip install xcs`.

If this goes as planned, pip should report that the package was successfully installed.

It is recommended that you also install numpy, though the xcs package will still work, albeit more slowly, without it.

If you are unable to use pip, you can still install xcs manually. The latest release can be found on [GitHub](https://github.com/hosford42/xcs/releases) or [PyPI](https://pypi.python.org/pypi/xcs). Download the zip file, unpack it, and cd into the directory. Then run `python setup.py install`.

You should see a message indicating that the package was successfully installed. For instructions on how to manually install numpy, visit the [numpy installation instructions page](http://docs.scipy.org/doc/numpy/user/install.html) at SciPy.org.

## Testing the Newly Installed Package

Let's start things off with a quick test, to verify that everything has been installed properly. First, we set up logging so we can see the test's progress.

In [1]:
import logging
logging.root.setLevel(logging.INFO)

Then we import the xcs module and run the built-in test() function. By default, the test() function runs the canonical XCS algorithm on the 11-bit (3-bit address) MUX problem for 10,000 steps.

In [2]:
import xcs
xcs.test()

INFO:xcs.problems:Possible actions:
INFO:xcs.problems:    False
INFO:xcs.problems:    True
INFO:xcs.problems:Steps completed: 0
INFO:xcs.problems:Average reward per step: 0.00000
INFO:xcs.problems:Steps completed: 100
INFO:xcs.problems:Average reward per step: 0.46000
INFO:xcs.problems:Steps completed: 200
INFO:xcs.problems:Average reward per step: 0.52000
INFO:xcs.problems:Steps completed: 300
INFO:xcs.problems:Average reward per step: 0.52333
INFO:xcs.problems:Steps completed: 400
INFO:xcs.problems:Average reward per step: 0.52500
INFO:xcs.problems:Steps completed: 500
INFO:xcs.problems:Average reward per step: 0.53600
INFO:xcs.problems:Steps completed: 600
INFO:xcs.problems:Average reward per step: 0.54667
INFO:xcs.problems:Steps completed: 700
INFO:xcs.problems:Average reward per step: 0.54571
INFO:xcs.problems:Steps completed: 800
INFO:xcs.problems:Average reward per step: 0.54875
INFO:xcs.problems:Steps completed: 900
INFO:xcs.problems:Average reward per step: 0.55444
INFO:xcs.pr

## Usage

Now we'll run through a quick demo of how to use existing algorithms and problems. This is essentially the same code that appears in the test() function we called above.

First, we're going to need to import a few things:

In [1]:
from xcs import XCSAlgorithm, LCS
from xcs.problems import MUXProblem, OnLineObserver

The XCSAlgorithm class contains the actual XCS algorithm implementation. The LCS class combines the selected algorithm with its state (a Population instance) to form a learning classifier system. MUXProblem is the classic [multiplexer](http://en.wikipedia.org/wiki/Multiplexer) problem, which defaults to 3 address bits (11 bits total). OnLineObserver is a wrapper for problems which logs the inputs, actions, and rewards as the algorithm attempts to solve the problem.
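To make the multiplexer problem concrete, here is a minimal standalone sketch of the function the classifier has to learn. It does not use the xcs package. The function name `multiplexer` and the bit layout (address bits first, most significant bit first) are assumptions made for illustration, though this layout agrees with the rules printed later in this tutorial.

```python
def multiplexer(bits, address_size=3):
    """Return the data bit selected by the leading address bits.

    Illustrative only: the name and bit layout are assumptions,
    not part of the xcs API.
    """
    expected_length = address_size + 2 ** address_size
    if len(bits) != expected_length:
        raise ValueError(
            "expected %d bits, got %d" % (expected_length, len(bits))
        )
    # Interpret the first address_size bits as a binary number.
    address = 0
    for bit in bits[:address_size]:
        address = address * 2 + int(bit)
    # The remaining bits are data lines; the address picks one of them.
    return bool(bits[address_size + address])


# Address 010 selects data line 2, which holds a 1 in this situation.
situation = [0, 1, 0,  0, 0, 1, 0, 0, 0, 0, 0]
print(multiplexer(situation))  # True
```

With 3 address bits there are 8 data lines, giving the 11-bit input mentioned above. The correct action for any input is simply the value of the addressed data bit.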

Now that we've imported the necessary tools, we can define the actual problem, telling it to give the algorithm 50,000 reward cycles to attempt to learn the appropriate input/output mapping, and wrapping it with an observer so we can see the algorithm's progress.

In [2]:
problem = OnLineObserver(MUXProblem(50000))

Next, we'll create the algorithm which will be used to manage the classifier population and learn the mapping defined by the problem we have selected:

In [3]:
algorithm = XCSAlgorithm(problem.get_possible_actions())

We ask the problem for the possible actions that can be taken, and pass them to the XCS algorithm so they can be used in covering operations. (Covering is the generation of a random classifier rule when too few match the current situation.) The algorithm's parameters are set to appropriate defaults for most problems, but it is straightforward to modify them if it becomes necessary.

In [4]:
algorithm.exploration_probability = .1
algorithm.discount_factor = 0
algorithm.do_GA_subsumption = True
algorithm.do_action_set_subsumption = True

Here we have selected an exploration probability of .1, which will sacrifice most learning opportunities in favor of taking advantage of what has already been learned so far. This makes sense in a real-time learning environment; a higher value is more appropriate in cases where the classifier is being trained in advance or is being used simply to learn a minimal rule set. The discount factor is set to 0, since future rewards are not affected at all by the currently selected action. We have also elected to turn on GA and action set subsumption, which help the system to converge to the minimal effective rule set more quickly in some types of problems.
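The trade-off controlled by the exploration probability can be illustrated with a small standalone sketch. This is a generic epsilon-greedy selection rule, not code taken from the xcs package; the function `select_action` and the `predictions` dictionary are hypothetical, and whether xcs chooses actions in exactly this way is an assumption.

```python
import random


def select_action(predictions, exploration_probability, rng=random):
    """Pick an action from a dict mapping action -> predicted reward.

    Hypothetical helper for illustration; not part of the xcs API.
    """
    if rng.random() < exploration_probability:
        # Explore: ignore the predictions and try a random action.
        return rng.choice(sorted(predictions))
    # Exploit: take the action with the highest predicted reward.
    return max(predictions, key=predictions.get)


rng = random.Random(0)
predictions = {True: 0.9, False: 0.4}
choices = [select_action(predictions, 0.1, rng) for _ in range(10000)]
# Fraction of steps on which the currently preferred action was taken.
print(choices.count(True) / len(choices))
```

With a probability of .1, roughly one step in ten is spent exploring, and about half of those random picks still land on the preferred action. Raising the value shifts the balance toward trying alternatives, at the cost of short-term reward.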

Next, we create the classifier itself:

In [5]:
classifier = LCS(algorithm)

The LCS will create an empty population for us if we do not provide one.

And finally, this is where all the magic happens:

In [6]:
classifier.learn(problem)

We pass the problem to the classifier and ask it to learn the appropriate input/output mapping. It executes training cycles until the problem dictates that training should stop. Note that if you wish to see the progress as the algorithm learns the problem, you will need to set the logging level to INFO, as described in the previous section, before calling the learn() method.
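The idea that the problem, rather than the learner, decides when training ends can be sketched with a toy loop. Everything here is hypothetical and standalone: `ToyMuxProblem`, its `more`, `sense`, and `execute` methods, and the `run` helper are illustrative names, and the real xcs classes may be organized differently.

```python
import random


class ToyMuxProblem:
    """Hypothetical stand-in for a problem; not the xcs MUXProblem class."""

    def __init__(self, training_cycles, address_size=3, seed=0):
        self.remaining = training_cycles
        self.address_size = address_size
        self.rng = random.Random(seed)
        self.current = None

    def more(self):
        # The problem dictates when training should stop.
        return self.remaining > 0

    def sense(self):
        # Present a fresh random bit string as the current situation.
        length = self.address_size + 2 ** self.address_size
        self.current = [self.rng.randrange(2) for _ in range(length)]
        return self.current

    def execute(self, action):
        # Reward 1.0 for the correct multiplexer output, 0.0 otherwise.
        self.remaining -= 1
        address_bits = self.current[:self.address_size]
        address = int("".join(str(bit) for bit in address_bits), 2)
        correct = bool(self.current[self.address_size + address])
        return 1.0 if action == correct else 0.0


def run(problem, choose_action):
    """Loop until the problem is exhausted; return (steps, average reward)."""
    total_reward = 0.0
    steps = 0
    while problem.more():
        situation = problem.sense()
        total_reward += problem.execute(choose_action(situation))
        steps += 1
    average = total_reward / steps if steps else 0.0
    return steps, average


# A policy that guesses at random, for comparison with a trained classifier.
guesser = random.Random(1)
steps, average = run(ToyMuxProblem(1000), lambda s: guesser.random() < 0.5)
print(steps, average)
```

A random guesser should average close to 0.5 reward per step, which is roughly where the logged averages in the test run above begin before learning takes effect.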

Now we can look at the fruits of our labors.

In [7]:
print(classifier.population)

00##00####1 => True
    Time Stamp: 49977
    Average Reward: 0.87821679153
    Error: 0.202410212397
    Fitness: 0.00194208652311
    Experience: 0
    Action Set Size: 1
    Numerosity: 1
#001######1 => True
    Time Stamp: 49979
    Average Reward: 0.858333333333
    Error: 0.179444444444
    Fitness: 0.00747704267322
    Experience: 0
    Action Set Size: 1
    Numerosity: 1
#00##1##### => True
    Time Stamp: 49979
    Average Reward: 0.635650994586
    Error: 0.368471665952
    Fitness: 0.00813172254357
    Experience: 26
    Action Set Size: 16.559358046536477
    Numerosity: 1
0#######0## => True
    Time Stamp: 49952
    Average Reward: 0.844032029758
    Error: 0.283404312016
    Fitness: 0.00863571013407
    Experience: 22
    Action Set Size: 14.53429379211057
    Numerosity: 1
#0#0###0### => True
    Time Stamp: 49872
    Average Reward: 0.670690583534
    Error: 0.40450684357
    Fitness: 0.0108553463159
    Experience: 37
    Action Set Size: 13.680180299432106
    Nume

This gives us a printout of each rule, in the form *condition => action*, followed by various stats about the rule pertaining to the algorithm we selected. The population can also be accessed as an iterable container:

In [8]:
print(len(classifier.population))

104


In [9]:
for condition, action in classifier.population:
    metadata = classifier.population.get_metadata(condition, action)
    if metadata.fitness > .5:
        print(condition, '=>', action, ' [%.5f]' % metadata.fitness)

101#####0## => False  [0.96003]
001#0###### => False  [0.93244]
011###0#### => False  [0.85302]
100####0### => False  [0.67857]
111#######0 => False  [0.93936]
111#######1 => True  [0.89453]
110######1# => True  [0.88251]
#001###1### => True  [0.82224]
01###00#0## => True  [0.78859]
1#0####0#0# => False  [0.63669]
011###1#### => True  [0.90256]
#011###11#1 => True  [0.51872]
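The conditions printed above can be read as bit patterns in which `#` stands for "don't care". The following standalone sketch shows how such a condition might be matched against a situation, and how a covering step could generate a new condition that is guaranteed to match. It does not use the xcs package: `matches`, `cover`, and the `wildcard_probability` parameter are hypothetical names, and the wildcard reading of `#` is the conventional XCS interpretation rather than something confirmed from the library's source.

```python
import random

WILDCARD = "#"


def matches(condition, situation):
    """True if every non-wildcard position of condition equals the situation."""
    return len(condition) == len(situation) and all(
        c == WILDCARD or c == s for c, s in zip(condition, situation)
    )


def cover(situation, wildcard_probability, rng=random):
    """Build a random condition that is guaranteed to match situation.

    Hypothetical helper: each bit is either copied from the situation
    or replaced with a wildcard.
    """
    return "".join(
        WILDCARD if rng.random() < wildcard_probability else bit
        for bit in situation
    )


situation = "11100000001"
print(matches("111#######1", situation))  # True
print(matches("101#####0##", situation))  # False

# Covering: no rule matched, so generalize the situation into a new one.
new_condition = cover(situation, 0.33, random.Random(0))
print(new_condition, matches(new_condition, situation))
```

Because a covered condition only ever copies bits from the situation or replaces them with wildcards, it always matches the situation that triggered it.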
