Python Library Usage
Documentation can be displayed within Python:
import xcsf
help(xcsf.xcsf)
Example:
import xcsf
xcs = xcsf.XCS(
x_dim=8, # number of input feature variables
y_dim=1, # number of predicted target variables (1 for reinforcement learning)
n_actions=2 # number of actions or classes (1 for supervised learning)
)
Library Stub:
def __init__(self, x_dim: int, y_dim: int, n_actions: int) -> None ...
Default parameter values are hard-coded within XCSF. At run-time, the values may be overridden within Python by using the following properties:
# General XCSF
xcs.OMP_NUM_THREADS = 8 # number of CPU cores to use
xcs.POP_INIT = True # whether to seed the population with random rules
xcs.POP_SIZE = 200 # maximum population size
xcs.MAX_TRIALS = 1000 # number of trials to execute for each xcs.fit()
xcs.PERF_TRIALS = 1000 # number of trials to average performance output
xcs.LOSS_FUNC = "mae" # mean absolute error
xcs.LOSS_FUNC = "mse" # mean squared error
xcs.LOSS_FUNC = "rmse" # root mean squared error
xcs.LOSS_FUNC = "log" # log loss (cross-entropy)
xcs.LOSS_FUNC = "binary_log" # binary log loss
xcs.LOSS_FUNC = "onehot" # one-hot encoding classification error
xcs.LOSS_FUNC = "huber" # Huber error
xcs.HUBER_DELTA = 1 # delta parameter for Huber error calculation
xcs.seed(seed) # sets the random number seed; uses the current time if not set
# General Classifier
xcs.E0 = 0.01 # target error, under which accuracy is set to 1
xcs.ALPHA = 0.1 # accuracy offset for rules above E0 (1=disabled)
xcs.NU = 5 # accuracy slope for rules with error above E0
xcs.BETA = 0.1 # learning rate for updating error, fitness, and set size
xcs.DELTA = 0.1 # fraction of least fit classifiers to increase deletion vote
xcs.THETA_DEL = 20 # min experience before fitness used in probability of deletion
xcs.INIT_FITNESS = 0.01 # initial classifier fitness
xcs.INIT_ERROR = 0 # initial classifier error
xcs.M_PROBATION = 10000 # trials since creation a rule must match at least 1 input or be deleted
xcs.STATEFUL = True # whether classifiers should retain state across trials
xcs.SET_SUBSUMPTION = False # whether to perform set subsumption
xcs.THETA_SUB = 100 # minimum experience of a classifier to become a subsumer
xcs.COMPACTION = False # if enabled and sys err < E0, the largest of 2 roulette spins is deleted
# Multi-step Problems
xcs.TELETRANSPORTATION = 50 # num steps to reset a multistep problem if goal not found
xcs.GAMMA = 0.95 # discount factor in calculating the reward for multistep problems
xcs.P_EXPLORE = 0.9 # probability of exploring vs. exploiting in a multistep trial
# Evolutionary Algorithm
xcs.EA_SELECT_TYPE = "roulette" # roulette wheel parental selection
xcs.EA_SELECT_TYPE = "tournament" # tournament parental selection
xcs.EA_SELECT_SIZE = 0.4 # fraction of set size for tournament parental selection
xcs.THETA_EA = 50 # average set time between EA invocations
xcs.LAMBDA = 2 # number of offspring to create each EA invocation (use multiples of 2)
xcs.P_CROSSOVER = 0.8 # probability of applying crossover
xcs.ERR_REDUC = 1.0 # amount to reduce an offspring error (1=disabled)
xcs.FIT_REDUC = 0.1 # amount to reduce an offspring fitness (1=disabled)
xcs.EA_SUBSUMPTION = False # whether to try and subsume offspring classifiers
xcs.EA_PRED_RESET = False # whether to reset offspring predictions instead of copying
Please note that the default parameters are not intended as general values suitable for all problems and must be set appropriately for the specific learning task.
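To illustrate how E0, ALPHA, and NU interact, the sketch below uses the accuracy function commonly given in the XCS literature. It is an assumed form rather than code taken from XCSF, and the function name accuracy is hypothetical.

```python
def accuracy(error: float, e0: float = 0.01, alpha: float = 0.1, nu: float = 5) -> float:
    """XCS-style classifier accuracy (assumed standard form, not copied from XCSF)."""
    if error < e0:
        return 1.0  # below the target error, accuracy is set to 1
    # above the target error, accuracy is offset by alpha and falls with slope nu
    return alpha * (error / e0) ** (-nu)


print(accuracy(0.005))  # 1.0
print(accuracy(0.02))   # roughly 0.1 / 2**5
```

With alpha = 1 the offset disappears and accuracy simply decays from 1 as the error grows beyond E0.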
The use of always matching conditions results in the match set being equal to the population set, i.e., [M] = [P]. The evolutionary algorithm and classifier updates are thus performed within [P], and global models are designed (e.g., neural networks) that cover the entire state-space. This configuration operates as a more traditional evolutionary algorithm, which can be useful for debugging and benchmarking.
Additionally, a single global model (e.g., a linear regression) can be fit by also setting POP_SIZE = 1 and disabling the evolutionary algorithm by setting the invocation frequency to a larger number than will ever be executed, e.g., THETA_EA = 5000000. This can also be useful for debugging and benchmarking.
xcs.condition("dummy")
With ternary bitstrings, each classifier's condition is represented by a string whose length is x_dim multiplied by the number of encoding bits. For binary problems, the number of encoding bits is simply bits = 1. For real-valued inputs, the values are binarised to the specified number of bits with the assumption that the inputs are in the range [0,1]. For example, with bits = 2, an input vector [0.23,0.76,0.45,0.5] will be converted to [0,0,1,1,0,1,0,1] before being tested for matching with the ternary bitstring using the alphabet {0,1,#}, where the don't care symbol # matches either bit.
Uniform crossover is applied with probability P_CROSSOVER and a single self-adaptive mutation rate (log normal) is used.
args = {
"bits": 2, # number of bits per float to binarise inputs
"p_dontcare": 0.5, # don't care probability during covering
}
xcs.condition("ternary", args)
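The matching rule for the {0,1,#} alphabet, and the role of p_dontcare during covering, can be sketched as follows. The helper names matches and cover are hypothetical and not part of the XCSF API, and the binarisation step itself is not reproduced here; the already binarised example vector from the text is used directly.

```python
import random


def matches(condition: str, bits: list[int]) -> bool:
    """Return True if every position is '#' or equals the corresponding input bit."""
    return len(condition) == len(bits) and all(
        c == "#" or int(c) == b for c, b in zip(condition, bits)
    )


def cover(bits: list[int], p_dontcare: float, rng: random.Random) -> str:
    """Build a condition matching `bits`; each position becomes '#' with probability p_dontcare."""
    return "".join("#" if rng.random() < p_dontcare else str(b) for b in bits)


binarised = [0, 0, 1, 1, 0, 1, 0, 1]  # the binarised example input from the text
print(matches("0#11#1#1", binarised))  # True
print(matches("1#11#1#1", binarised))  # False: the first bit differs
cond = cover(binarised, p_dontcare=0.5, rng=random.Random(1))
print(matches(cond, binarised))  # True: a covered condition always matches its input
```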
Related Literature:
- S. W. Wilson (1995) Classifier fitness based on accuracy
Hyperellipsoids currently use the center-spread representation (axis-rotation is not yet implemented).
Hyperrectangles currently implement the center-spread and unordered-bound representations.
With the hyperrectangle center-spread representation, each classifier condition is represented as a concatenation of x_dim interval predicates, each defined by a center and a spread.
With the hyperrectangle unordered-bound representation, each classifier condition is represented as a concatenation of x_dim interval predicates, each defined by a pair of bounds.
Uniform crossover is applied with probability P_CROSSOVER. A single self-adaptive mutation rate (log normal) specifies the standard deviation used to sample a random Gaussian (with zero mean) which is added to each center and spread value (or bound for unordered-bounds).
For center-spread representations, if eta > 0, each classifier's centers are adjusted at rate eta towards the mean of the inputs matched.
args = {
"min": 0, # minimum value of a center/bound
"max": 1, # maximum value of a center/bound
"min_spread": 0.1, # minimum initial spread
"eta": 0, # gradient descent rate for moving centers to mean inputs matched
}
xcs.condition("hyperrectangle_csr", args) # center-spread
xcs.condition("hyperrectangle_ubr", args) # unordered-bound
xcs.condition("hyperellipsoid", args) # center-spread
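A minimal sketch of interval matching for the two hyperrectangle representations follows, based on the descriptions in the cited literature. It assumes the spread is a half-width around the center and that, for unordered bounds, the smaller value of each pair acts as the lower limit. The function names are hypothetical.

```python
def match_csr(x, center, spread):
    """Center-spread: match if each input lies within [c - s, c + s]."""
    return all(c - s <= xi <= c + s for xi, c, s in zip(x, center, spread))


def match_ubr(x, bound1, bound2):
    """Unordered-bound: either bound may be the lower limit of the interval."""
    return all(min(a, b) <= xi <= max(a, b) for xi, a, b in zip(x, bound1, bound2))


x = [0.2, 0.7]
print(match_csr(x, center=[0.25, 0.5], spread=[0.1, 0.25]))  # True
print(match_ubr(x, bound1=[0.4, 0.6], bound2=[0.1, 0.9]))    # True
print(match_ubr(x, bound1=[0.3, 0.6], bound2=[0.9, 0.9]))    # False: 0.2 is outside [0.3, 0.9]
```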
Related Literature:
- S. W. Wilson (2000) Get real! XCS with continuous-valued inputs
- C. Stone and L. Bull (2003) For real! XCS with continuous-valued inputs
- M. V. Butz (2005) Kernel-based, ellipsoidal conditions in the real-valued XCS classifier system
- M. V. Butz, P.-L. Lanzi, and S. W. Wilson (2006) Hyper-ellipsoidal conditions in XCS: rotation, linear approximation, and solution structure
- K. Tamee, L. Bull, and O. Pinngern (2007) Towards clustering with XCS
GP trees currently use arithmetic operators from the set {+,-,/,*}. Return values from each node are clamped to [-1000,1000]. The rule matches if the output node is greater than 0.5. Subsumption is not implemented.
Sub-tree crossover is applied with probability P_CROSSOVER. A single self-adaptive mutation rate (rate selection) is used to specify the per-allele probability of performing mutation, where terminals are randomly replaced with other terminals and functions randomly replaced with other functions.
args = {
"min_constant": 0, # minimum value of a constant
"max_constant": 1, # maximum value of a constant
"n_constants": 100, # number of (global) constants available
"init_depth": 5, # initial depth of a tree
"max_len": 10000, # maximum initial length of a tree
}
xcs.condition("tree_gp", args)
See also: Visualising GP Trees.
Related Literature:
- M. Ahluwalia and L. Bull (1999) A genetic programming-based classifier system.
- P.-L. Lanzi (1999) Extending the representation of classifier conditions Part II: From messy coding to S-expressions
- P.-L. Lanzi (2003) XCS with stack-based genetic programming
- C. Ioannides and W. Browne (2007) Investigating scaling of an abstracted LCS utilising ternary and S-expression alphabets
- S. W. Wilson (2008) Classifier conditions using gene expression programming
- M. Iqbal, W. N. Browne, and M. Zhang (2014) Reusing building blocks of extracted knowledge to solve complex, large-scale Boolean problems
DGP conditions are temporally dynamic graphs with fuzzy symbolic functions selected from the CFMQVS set: {fuzzy NOT, fuzzy AND, fuzzy OR}. Each graph is initialised with a randomly selected function assigned to each node and random connectivity (including recurrent connections) and is synchronously updated in parallel for T cycles before sampling the output node(s). These graphs can exhibit inherent memory by retaining state across inputs. Inputs must be in the range [0,1].
Currently implements a fixed number of nodes with the connectivity and update cycles evolved along with the function for each node. Log normal self-adaptive mutation is used for node function and connectivity and uniform self-adaptive mutation for the number of update cycles.
When used as conditions, the number of nodes n must be at least 1 and the rule matches a given input if the state of that node is greater than 0.5 after updating the graph T times. When used as condition + action rules, the action is encoded as binary (discretising the node outputs with threshold 0.5); for example, with 8 actions, a minimum of 3 additional nodes are required. Subsumption is not implemented.
args = {
"max_k": 2, # number of connections per node
"max_t": 10, # maximum number of cycles to update graphs
"n": 20, # number of nodes in the graph
"evolve_cycles": True, # whether to evolve the number of update cycles
}
xcs.condition("dgp", args)
xcs.condition("rule_dgp", args) # conditions + actions in single DGP graphs
See also: Visualising DGP Graphs.
Related Literature:
- R. J. Preen and L. Bull (2013) Dynamical genetic programming in XCSF
- R. J. Preen and L. Bull (2014) Discrete and fuzzy dynamical genetic programming in the XCSF learning classifier system
- M. Iqbal, W. N. Browne, and M. Zhang (2017) Extending XCS with cyclic graphs for scalability on complex Boolean problems
Condition output layers should be set to a single neuron, i.e., "n_init": 1. A classifier matches an input if this output neuron is greater than 0.5.
When used to represent conditions and actions within a single network ("rules"), the output layers should be "n_init": 1 + binary, where binary is the number of outputs required to output binary actions. For example, for 8 actions, 3 binary outputs are required and the output layer should contain 4 neurons. Again, the neuron states of the action outputs are discretised with threshold 0.5. Subsumption is not implemented.
See Neural Network Initialisation.
xcs.condition("neural", layer_args)
xcs.condition("rule_neural", layer_args) # conditions + actions in single neural nets
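The sizing of a "rule" output layer and the thresholding of its outputs can be sketched as below. The ordering used here (match neuron first, then action bits with the most significant bit first) is an assumption made for illustration only, and the function names are hypothetical.

```python
def n_action_outputs(n_actions: int) -> int:
    """Minimum number of binary outputs needed to encode n_actions distinct actions."""
    return max(1, (n_actions - 1).bit_length())


def decode_rule_outputs(outputs: list[float]) -> tuple[bool, int]:
    """Threshold outputs at 0.5: first output is the match signal, the rest are action bits."""
    match = outputs[0] > 0.5
    action = 0
    for out in outputs[1:]:  # ASSUMPTION: most significant bit first
        action = (action << 1) | (1 if out > 0.5 else 0)
    return match, action


print(n_action_outputs(8))      # 3
print(1 + n_action_outputs(8))  # 4 output neurons in total
print(decode_rule_outputs([0.9, 0.7, 0.2, 0.8]))  # (True, 5)
```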
Related Literature:
- L. Bull (2002) On using constructivism in neural classifier systems
- R. J. Preen, S. W. Wilson, and L. Bull (2021) Autoencoding with a classifier system
- R. J. Preen and L. Bull (2021) Deep learning with a classifier system: Initial results
A constant integer value. A single self-adaptive mutation rate (log normal) specifies the probability of randomly reselecting the value.
xcs.action("integer")
Related Literature:
- S. W. Wilson (1995) Classifier fitness based on accuracy
Output layer should be a softmax. See Neural Network Initialisation.
xcs.action("neural", layer_args)
Related Literature:
- T. O'Hara and L. Bull (2005) A memetic accuracy-based neural learning classifier system
- P.-L. Lanzi and D. Loiacono (2007) Classifier systems that compute action mappings
- D. Howard, L. Bull, and P.-L. Lanzi (2015) A cognitive architecture based on a learning classifier system with spiking classifiers
Original XCS behaviour can be specified with piece-wise constant predictions. These are updated with the (reward or payoff) target $y$ as follows:
- if $exp_j < 1 / \beta$: $p_j \leftarrow (p_j \times (exp_j - 1) + y) / exp_j$
- otherwise: $p_j \leftarrow p_j + \beta (y - p_j)$
xcs.BETA = 0.1 # classifier update rate includes constant predictions
xcs.prediction("constant")
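The two-case update rule above can be written directly in Python. This is a sketch of the stated formulas only; the function name update_constant is hypothetical, and exp is taken to be the classifier's experience after it has been incremented for the current update.

```python
def update_constant(p: float, exp: int, y: float, beta: float = 0.1) -> float:
    """Update a constant prediction p towards target y."""
    if exp < 1 / beta:
        # early updates: running average of the targets seen so far
        return (p * (exp - 1) + y) / exp
    # later updates: move a fraction beta towards the target
    return p + beta * (y - p)


p = 0.0
for exp, y in enumerate([10.0, 20.0, 30.0], start=1):
    p = update_constant(p, exp, y)
print(p)  # 20.0, the mean of the three targets
```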
Related Literature:
- S. W. Wilson (1995) Classifier fitness based on accuracy
If eta is evolved, the rate is initialised uniformly at random in [eta_min, eta]. Offspring inherit the rate and a single (log normal) self-adaptive mutation rate specifies the standard deviation used to sample a random Gaussian (with zero mean) which is added to eta (similar to evolution strategies).
args = {
"x0": 1, # offset value
"eta": 0.1, # gradient descent update rate (maximum value, if evolved)
"eta_min": 0.0001, # minimum gradient descent update rate (if evolved)
"evolve_eta": True, # whether to evolve the gradient descent rate
}
xcs.prediction("nlms_linear", args)
xcs.prediction("nlms_quadratic", args)
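For orientation, the following sketch shows a textbook normalised least mean squares update for a linear prediction, with x0 as the offset input and eta as the update rate. It assumes the standard form of the rule from the literature and is not taken from the XCSF source; the function names are hypothetical.

```python
def nlms_update(weights, x, y, eta=0.1, x0=1.0):
    """One NLMS step: correct the weights in proportion to the normalised input."""
    inputs = [x0] + list(x)  # prepend the offset value
    prediction = sum(w * v for w, v in zip(weights, inputs))
    norm = sum(v * v for v in inputs)
    step = eta * (y - prediction) / norm
    return [w + step * v for w, v in zip(weights, inputs)]


def nlms_predict(weights, x, x0=1.0):
    return sum(w * v for w, v in zip(weights, [x0] + list(x)))


weights = [0.0, 0.0]
for _ in range(400):  # repeated passes over samples of y = 2x + 1
    for xi in (0.0, 0.25, 0.5, 0.75, 1.0):
        weights = nlms_update(weights, [xi], 2 * xi + 1)
print(nlms_predict(weights, [0.5]))  # close to 2.0
```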
Related Literature:
- S. W. Wilson (2001) Function approximation with a classifier system
- S. W. Wilson (2002) Classifiers that approximate functions
- P.-L. Lanzi, D. Loiacono, S. W. Wilson, and D. E. Goldberg (2005) XCS with computed prediction for the learning of Boolean functions
- P.-L. Lanzi, D. Loiacono, S. W. Wilson, and D. E. Goldberg (2005) XCS with computed prediction in multistep environments
- P.-L. Lanzi, D. Loiacono, S. W. Wilson, and D. E. Goldberg (2005) Extending XCSF beyond linear approximation
args = {
"x0": 1, # offset value
"scale_factor": 1000, # initial diagonal values of the gain-matrix
"lambda": 1, # forget rate (small values may be unstable)
}
xcs.prediction("rls_linear", args)
xcs.prediction("rls_quadratic", args)
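The roles of scale_factor and lambda can be seen in a textbook recursive least squares update, sketched below in pure Python. This assumes the standard RLS form with a forgetting factor and a gain matrix initialised to scale_factor on the diagonal; it is not copied from XCSF, and the function names are hypothetical.

```python
def rls_init(n_inputs, scale_factor=1000.0):
    """Zero weights and a gain matrix with scale_factor on the diagonal."""
    n = n_inputs + 1  # +1 for the offset input x0
    weights = [0.0] * n
    gain = [[scale_factor if i == j else 0.0 for j in range(n)] for i in range(n)]
    return weights, gain


def rls_update(weights, gain, x, y, lam=1.0, x0=1.0):
    """One RLS step with forget rate lam (lam = 1 means no forgetting)."""
    v = [x0] + list(x)
    n = len(v)
    gv = [sum(gain[i][j] * v[j] for j in range(n)) for i in range(n)]  # P v
    denom = lam + sum(v[i] * gv[i] for i in range(n))
    k = [g / denom for g in gv]  # gain vector
    err = y - sum(w * vi for w, vi in zip(weights, v))
    weights = [w + ki * err for w, ki in zip(weights, k)]
    # P <- (P - k (P v)^T) / lam; P is symmetric, so v^T P equals (P v)^T
    gain = [[(gain[i][j] - k[i] * gv[j]) / lam for j in range(n)] for i in range(n)]
    return weights, gain


weights, gain = rls_init(n_inputs=1)
for _ in range(4):  # a few passes over samples of y = 2x + 1
    for xi in (0.0, 0.25, 0.5, 0.75, 1.0):
        weights, gain = rls_update(weights, gain, [xi], 2 * xi + 1)
print([round(w, 2) for w in weights])  # [1.0, 2.0]
```

A larger scale_factor weakens the implicit pull towards zero weights, which is why the fit is already close after a handful of samples.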
Related Literature:
- P.-L. Lanzi, D. Loiacono, S. W. Wilson, and D. E. Goldberg (2006) Prediction update algorithms for XCSF: RLS, Kalman filter, and gain adaptation
- D. Loiacono and P.-L. Lanzi (2007) Recursive least squares and quadratic prediction in continuous multistep problems
- M. V. Butz, P.-L. Lanzi, and S. W. Wilson (2008) Function approximation with XCS: Hyperellipsoidal conditions, recursive least squares, and compaction
- D. Loiacono and P.-L. Lanzi (2008) Computed prediction in binary multistep problems
- D. Loiacono and P.-L. Lanzi (2009) Recursive least squares and quadratic prediction in continuous multistep problems
Output layer should be "n_init": y_dim.
See Neural Network Initialisation.
xcs.prediction("neural", layer_args)
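As an illustration, a two-layer prediction network might be configured as below, using only keys and activation names listed under Neural Network Initialisation. Y_DIM is a hypothetical placeholder for y_dim, and whether omitted keys fall back to defaults is not stated here, so the abbreviated dictionaries should be treated as illustrative.

```python
Y_DIM = 1  # hypothetical: number of predicted target variables

layer_args = {
    "layer_0": {  # hidden layer
        "type": "connected",
        "activation": "relu",
        "n_init": 10,
        "evolve_weights": True,
        "sgd_weights": True,
        "eta": 0.1,
    },
    "layer_1": {  # output layer: one neuron per target variable
        "type": "connected",
        "activation": "linear",
        "n_init": Y_DIM,
        "evolve_weights": True,
        "sgd_weights": True,
        "eta": 0.1,
    },
}
# xcs.prediction("neural", layer_args)  # requires an initialised xcs instance
```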
Related Literature:
- P.-L. Lanzi and D. Loiacono (2006) XCSF with neural prediction
- T. O'Hara and L. Bull (2007) Backpropagation in accuracy-based neural learning classifier systems
- R. J. Preen, S. W. Wilson, and L. Bull (2021) Autoencoding with a classifier system
- R. J. Preen and L. Bull (2021) Deep learning with a classifier system: Initial results
layer_args = {
"layer_0": { # first hidden layer
"type": "connected", # layer type
..., # layer specific parameters
},
..., # as many layers as desired
"layer_n": { # output layer
"type": "connected", # layer type
..., # layer specific parameters
},
}
Note: Neuron states are clamped [-100,100] before activations are applied. Weights are clamped [-10,10].
"logistic", # logistic [0,1]
"relu", # rectified linear unit [0,inf]
"tanh", # tanh [-1,1]
"linear", # linear [-inf,inf]
"gaussian", # Gaussian (0,1]
"sin", # sine [-1,1]
"cos", # cosine [-1,1]
"softplus", # soft plus [0,inf]
"leaky", # leaky rectified linear unit [-inf,inf]
"selu", # scaled exponential linear unit [-1.7581,inf]
"loggy", # logistic [-1,1]
layer_args = {
"layer_0": {
"type": "connected", # layer type
"activation": "relu", # activation function
"evolve_weights": True, # whether to evolve weights
"evolve_connect": True, # whether to evolve connectivity
"evolve_functions": True, # whether to evolve activation function
"evolve_neurons": True, # whether to evolve the number of neurons
"max_neuron_grow": 5, # maximum number of neurons to add or remove per mut
"n_init": 10, # initial number of neurons
"n_max": 100, # maximum number of neurons (if evolved)
"sgd_weights": True, # whether to use gradient descent (only for predictions)
"evolve_eta": True, # whether to evolve the gradient descent rate
"eta": 0.1, # gradient descent update rate (maximum value, if evolved)
"eta_min": 0.0001, # minimum gradient descent update rate (if evolved)
"momentum": 0.9, # momentum for gradient descent update
"decay": 0, # weight decay during gradient descent update
},
}
layer_args = {
"layer_0": {
"type": "recurrent",
..., # other parameters same as for connected layers
}
}
layer_args = {
"layer_0": {
"type": "lstm",
"activation": "tanh", # activation function
"recurrent_activation": "logistic", # recurrent activation function
..., # other parameters same as for connected layers
}
}
Softmax layers can be composed of a linear connected layer and softmax:
layer_args = {
"layer_0": {
"type": "connected",
"activation": "linear",
"n_init": N_ACTIONS, # number of (softmax) outputs
..., # other parameters same as for connected layers
},
"layer_1": {
"type": "softmax",
"scale": 1, # softmax temperature
},
}
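The effect of the "scale" (temperature) parameter can be sketched with a plain softmax. This assumes the usual definition in which inputs are divided by the temperature before exponentiation; how XCSF applies the parameter internally is not confirmed here.

```python
import math


def softmax(values, scale=1.0):
    """Softmax with temperature `scale`; larger values give a flatter distribution."""
    scaled = [v / scale for v in values]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(round(sum(probs), 6))     # 1.0
print(probs.index(max(probs)))  # 2: the largest input receives the highest probability
```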
layer_args = {
"layer_0": {
"type": "dropout",
"probability": 0.2, # probability of dropping an input
}
}
Gaussian noise adding layers.
layer_args = {
"layer_0": {
"type": "noise",
"probability": 0.2, # probability of adding noise to an input
"scale": 1.0, # standard deviation of Gaussian noise added
}
}
Convolutional layers require image inputs and produce image outputs. If used as the first layer, the width, height, and number of channels must be specified. If "evolve_neurons": True, the number of filters will be evolved using an initial number of filters "n_init" and maximum number "n_max".
layer_args = {
"layer_0": {
"type": "convolutional",
"activation": "relu", # activation function
"height": 16, # input height
"width": 16, # input width
"channels": 1, # number of input channels
"n_init": 6, # number of convolutional kernel filters
"size": 3, # the size of the convolution window
"stride": 1, # the stride of the convolution window
"pad": 1, # the padding of the convolution window
..., # other parameters same as for connected layers
},
"layer_1": {
"type": "convolutional",
..., # parameters same as above; height, width, channels not needed
},
}
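When stacking image layers it helps to know the output size of each one. The sketch below uses the standard convolution arithmetic with symmetric padding; this is an assumption, since the exact rounding and padding rules used by XCSF are not stated here. The function name is hypothetical.

```python
def conv_output_size(size_in: int, kernel: int, stride: int, pad: int) -> int:
    """Output width or height of a convolution under the standard formula."""
    return (size_in + 2 * pad - kernel) // stride + 1


# the first-layer example above: 16x16 input, size=3, stride=1, pad=1
print(conv_output_size(16, kernel=3, stride=1, pad=1))  # 16
```

Under this assumption the example layer preserves the 16x16 spatial size, with the number of output channels given by the number of filters.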
Max-pooling layers require image inputs and produce image outputs. If used as the first layer, the width, height, and number of channels must be specified.
layer_args = {
"layer_0": {
"type": "maxpool",
"height": 16, # input height
"width": 16, # input width
"channels": 1, # number of input channels
"size": 2, # the size of the maxpooling operation
"stride": 2, # the stride of the maxpooling operation
"pad": 0, # the padding of the maxpooling operation
},
"layer_1": {
"type": "maxpool",
"size": 2,
"stride": 2,
"pad": 0,
},
}
Average-pooling layers require image inputs. If used as the first layer, the width, height, and number of channels must be specified. Outputs an average for each input channel.
layer_args = {
"layer_0": {
"type": "avgpool",
"height": 16, # input height
"width": 16, # input width
"channels": 1, # number of input channels
},
"layer_1": {
"type": "avgpool",
},
}
Upsampling layers require image inputs and produce image outputs. If used as the first layer, the width, height, and number of channels must be specified.
layer_args = {
"layer_0": {
"type": "upsample",
"height": 16, # input height
"width": 16, # input width
"channels": 1, # number of input channels
"stride": 2, # the stride of the upsampling operation
},
"layer_1": {
"type": "upsample",
"stride": 2,
},
}
XCSF provides support for pickle and also provides the following functions for serializing to a binary file.
Example saving the entire current state of XCSF to a binary file:
xcs.save("saved_name.bin")
Example loading the entire state of XCSF from a binary file:
xcs.load("saved_name.bin")
Functions return the total number of elements written or read.
Library Stub:
def save(self, filename: str) -> int: ...
def load(self, filename: str) -> int: ...
Example storing the current XCSF population in memory for later retrieval, overwriting any previously stored population:
xcs.store()
Example retrieving the previously stored XCSF population from memory:
xcs.retrieve()
Library Stub:
def store(self) -> None: ...
def retrieve(self) -> None: ...
Example printing the current XCSF parameters:
xcs.print_params()
Example printing the current XCSF population:
xcs.print_pset()
Library Stub:
def print_params(self) -> None: ...
def print_pset(self, condition: bool = True, action: bool = True, prediction: bool = True) -> None: ...
Values for all general parameters are directly accessible via the property. Specific getter functions:
# General
xcs.pset_size() # returns the mean population size
xcs.pset_num() # returns the mean population numerosity
xcs.mset_size() # returns the mean match set size
xcs.aset_size() # returns the mean action set size
xcs.mfrac() # returns the mean fraction of inputs matched by the best rule
xcs.time() # returns the current EA time
xcs.version_major() # returns the XCSF major version number
xcs.version_minor() # returns the XCSF minor version number
xcs.version_build() # returns the XCSF build version number
xcs.pset_mean_cond_size() # returns the mean condition size
xcs.pset_mean_pred_size() # returns the mean prediction size
# Neural network specific - population set averages
# "layer" argument is an integer specifying the location of a layer: first layer=0
xcs.pset_mean_pred_eta(layer) # returns the mean eta for a prediction layer
xcs.pset_mean_pred_neurons(layer) # returns the mean number of neurons for a prediction layer
xcs.pset_mean_pred_layers() # returns the mean number of layers in the prediction networks
xcs.pset_mean_pred_connections(layer) # returns the number of active connections for a prediction layer
xcs.pset_mean_cond_neurons(layer) # returns the mean number of neurons for a condition layer
xcs.pset_mean_cond_layers() # returns the mean number of layers in the condition networks
xcs.pset_mean_cond_connections(layer) # returns the number of active connections for a condition layer
Library Stub:
def aset_size(self) -> float: ...
def mfrac(self) -> float: ...
def mset_size(self) -> float: ...
def pset_mean_cond_connections(self, layer: int) -> float: ...
def pset_mean_cond_layers(self) -> float: ...
def pset_mean_cond_neurons(self, layer: int) -> float: ...
def pset_mean_cond_size(self) -> float: ...
def pset_mean_pred_connections(self, layer: int) -> float: ...
def pset_mean_pred_eta(self, layer: int) -> float: ...
def pset_mean_pred_layers(self) -> float: ...
def pset_mean_pred_neurons(self, layer: int) -> float: ...
def pset_mean_pred_size(self) -> float: ...
def pset_num(self) -> int: ...
def pset_size(self) -> int: ...
def time(self) -> int: ...
def version_build(self) -> int: ...
def version_major(self) -> int: ...
def version_minor(self) -> int: ...
import json
json_string = xcs.json()
parsed = json.loads(json_string)
Then to print the current population:
print(json.dumps(parsed, indent=4))
Example printing ternary conditions, integer actions, and fitnesses:
fitness = [cl["fitness"] for cl in parsed["classifiers"]]
ternary = [cl["condition"]["string"] for cl in parsed["classifiers"]]
actions = [cl["action"]["action"] for cl in parsed["classifiers"]]
for i in range(len(fitness)):
    print("%s %d %.5f" % (ternary[i], actions[i], fitness[i]))
Printing and returning the individual weights from neural networks is disabled by default. To enable, change the flags in the neural_json_export() functions in cond_neural.c, pred_neural.c, etc.
Library Stub:
def json(self, condition: bool = True, action: bool = True, prediction: bool = True) -> str: ...
Example getting and printing the current parameters:
import json
json_params = xcs.json_parameters()
parsed_args = json.loads(json_params)
print(json.dumps(parsed_args, indent=4))
Library Stub:
def json_parameters(self) -> str: ...
Classifiers can be inserted into the population in a number of ways.
The json_insert_cl() function can be used to insert a single new classifier into the population. The new classifier is initialised with a random condition, action, and prediction, and then any supplied properties overwrite these values. This means that all properties are optional. If the population set numerosity exceeds xcs.POP_SIZE after inserting the rule, the standard roulette wheel deletion mechanism will be invoked to maintain the population limit.
GP trees and neural networks are not yet implemented.
Example inserting a rule with specified hyperrectangle condition and integer action, while the prediction is initialised as normal. See notebook example.
import json
import xcsf
xcs = xcsf.XCS(x_dim=8, y_dim=1, n_actions=2)
xcs.condition("hyperrectangle_ubr")
xcs.action("integer")
xcs.prediction("nlms_linear")
cl_dict = {
"error": 10, # each of these properties are optional
"fitness": 1.01,
"accuracy": 2,
"set_size": 100,
"numerosity": 2,
"experience": 3,
"time": 3,
"samples_seen": 2,
"samples_matched": 1,
"condition": {
"type": "hyperrectangle_ubr",
"bound1": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
"bound2": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
"mutation": [0.2] # this parameter still self-adapts
},
"action": {
"type": "integer",
"action": 1,
"mutation": [0.28]
}
}
json_str = json.dumps(cl_dict) # dictionary to JSON
xcs.json_insert_cl(json_str)
xcs.print_pset()
Note: when manually adding classifiers, be careful that the keys are correct because if an exact match is not found it will be ignored silently.
Multiple classifiers can be added through the same mechanism as a single JSON string with json_insert().
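A JSON string holding several classifiers might be assembled as sketched below. The top-level "classifiers" list is an assumption that mirrors the structure returned by xcs.json(); the helper make_rule is hypothetical, and the call to json_insert() is left commented out because it needs an initialised xcs instance.

```python
import json


def make_rule(action: int, lower: float, upper: float, x_dim: int = 8) -> dict:
    """Build one classifier dictionary using keys from the single-rule example."""
    return {
        "condition": {
            "type": "hyperrectangle_ubr",
            "bound1": [lower] * x_dim,
            "bound2": [upper] * x_dim,
        },
        "action": {"type": "integer", "action": action},
    }


# ASSUMPTION: the set is wrapped in a "classifiers" list, as in the output of xcs.json()
clset = {"classifiers": [make_rule(0, 0.0, 0.5), make_rule(1, 0.5, 1.0)]}
clset_json = json.dumps(clset)
# xcs.json_insert(clset_json)
```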
Additionally, the entire population set can be written in JSON format to a plain text file:
xcs.json_write("pset.json")
And read into the population with:
xcs.json_read("pset.json")
Note that this is not the recommended way to backup the system to persistent storage since temporary memory buffers (e.g., update matrices) and parameters are not saved and reloaded. For this purpose, see Saving and Loading XCSF.
Library Stub:
def json_insert(self, clset_json: str) -> None: ...
def json_insert_cl(self, cl_json: str) -> None: ...
def json_read(self, filename: str) -> None: ...
def json_write(self, filename: str) -> None: ...
The TreeViz class from viz.py will generate a tree with graphviz. The first argument must be the tree array; and the second, the filename to save the output as a pdf. Optionally accepts a list of strings representing the feature_names. Optionally accepts a string note, which will add a note/caption at the bottom.
Example plotting the first classifier condition:
import json
from xcsf.utils.viz import TreeViz
parsed = json.loads(xcs.json())
trees = [cl["condition"]["tree"]["array"] for cl in parsed["classifiers"]]
TreeViz(trees[0], "test")
Note this will require the graphviz package installed with:
$ pip install graphviz
TreeViz Stub:
def __init__(self,
tree: list[str],
filename: str,
note: str | None = None,
feature_names: list[str] | None = None,
) -> None: ...
The DGPViz class from viz.py will generate a graph with graphviz. The first argument must be the graph; and the second, the filename to save the output as a pdf. Optionally accepts a list of strings representing the feature_names. Optionally accepts a string note, which will add a note/caption at the bottom.
Example plotting the first classifier condition and passing the error as a note:
import json
from xcsf.utils.viz import DGPViz
parsed = json.loads(xcs.json())
errors = [cl["error"] for cl in parsed["classifiers"]]
graphs = [cl["condition"]["graph"] for cl in parsed["classifiers"]]
note = "Error = %.5f" % errors[0]
DGPViz(graphs[0], "test", note=note)
DGPViz Stub:
def __init__(self,
graph: dict,
filename: str,
note: str | None = None,
feature_names: list[str] | None = None,
) -> None: ...
Initialise XCSF with y_dim = 1 for predictions to estimate the scalar reward.
import xcsf
xcs = xcsf.XCS(x_dim=X_DIM, y_dim=1, n_actions=N_ACTIONS)
The standard method involves the basic loop as shown below. state must be a 1-D numpy array representing the feature values of a single instance; reward must be a scalar value representing the current environmental reward for having performed the action; and done must be a boolean value representing whether the environment is currently in a terminal state.
err = 0  # accumulated system prediction error for this trial
state = env.reset()
xcs.init_trial()
for cnt in range(xcs.TELETRANSPORTATION):
    xcs.init_step()
    action = xcs.decision(state, explore)  # explore specifies whether to explore/exploit
    next_state, reward, done = env.step(action)
    xcs.update(reward, done)  # update the current action set and/or previous action set
    err += xcs.error(reward, done, env.max_payoff())  # system prediction error
    xcs.end_step()
    if done:
        break
    state = next_state
    cnt += 1
xcs.end_trial()
See notebook example.
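The role of GAMMA in multistep problems can be illustrated with the discounted payoff target described in the XCS literature. The sketch below is an assumption about the general form of that target and does not reproduce what xcs.update() does internally; the function name is hypothetical.

```python
def payoff_target(reward: float, done: bool, next_predictions: list[float],
                  gamma: float = 0.95) -> float:
    """Discounted target: reward plus gamma times the best next-state prediction."""
    if done:
        return reward  # terminal step: no future payoff to discount
    return reward + gamma * max(next_predictions)


print(round(payoff_target(0.0, False, [10.0, 100.0]), 2))  # 95.0
print(payoff_target(1000.0, True, [10.0, 100.0]))          # 1000.0
```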
Library Stub:
def init_step(self) -> None: ...
def init_trial(self) -> None: ...
def end_step(self) -> None: ...
def end_trial(self) -> None: ...
def error(self) -> float: ...
def update(self, reward: float, done: bool) -> None: ...
def decision(
self,
state: np.ndarray[Any, np.dtype[np.float64]], # shape = (x_dim, )
explore: bool,
) -> int: ...
The fit() function may be used as below to execute one single-step learning trial, i.e., creation of the match and action sets, updating the action set and running the EA as appropriate. The vector state must be a 1-D numpy array representing the feature values of a single instance; action must be an integer representing the selected action (and therefore the action set to update); and reward must be a scalar value representing the current environmental reward for having performed the action.
xcs.fit(state, action, reward)
The entire prediction array for a given state can be returned using the supervised predict() function, which must receive a 2-D numpy array. For example:
prediction_array = xcs.predict(state.reshape(1,-1))[0]
See notebook example.
Library Stub:
@typing.overload
def fit(
self,
state: np.ndarray[Any, np.dtype[np.float64]], # shape = (x_dim, )
action: int,
reward: float,
) -> float: ...
def predict(
self,
X_predict: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, x_dim)
) -> np.ndarray[Any, np.dtype[np.float64]]: ... # shape = (n_samples, y_dim)
The supervised fit() and predict() functions can be used for reinforcement learning without action sets, i.e., [A] = [M].
See notebook example using experience replay.
Related Literature:
- A. Stein, R. Maier, L. Rosenbauer, and J. Hähner (2020) XCS classifier system with experience replay
Initialise XCSF with a single (dummy) integer action. Set conditions and predictions as desired.
import xcsf
xcs = xcsf.XCS(x_dim, y_dim, 1) # single action
xcs.action("integer") # dummy integer actions
The fit() function may be used as below to execute xcs.MAX_TRIALS number of learning iterations (i.e., single-step trials) using a supplied training set. The input arrays X_train and y_train must be 2-D numpy arrays of the shape (n_samples, x_dim) and (n_samples, y_dim). The third parameter specifies whether to randomly shuffle the training data. The function will return a scalar representing the training prediction error using the loss function as specified by xcs.LOSS_FUNC.
Note that while the training data is supplied as a batch, learning proceeds in the usual online way: one sample at a time. To execute a single trial, simply pass a batch size of one by reshaping the data and set xcs.MAX_TRIALS = 1.
train_error = xcs.fit(X_train, y_train, shuffle=True)
Library Stub:
@typing.overload
def fit(
self,
X_train: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, x_dim)
y_train: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, y_dim)
shuffle: bool = True,
) -> float: ...
The score() function may be used as below to calculate the prediction error over a single pass of a supplied data set without updates or the EA being invoked (e.g., for scoring a validation set). An argument N may be supplied that specifies the maximum number of iterations performed; if this value is less than the number of instances supplied, samples will be drawn randomly. Returns a scalar representing the error. 2-D numpy arrays are expected as inputs.
Note that if the match set is empty for a given sample then covering will be invoked and this may alter the population set. If this behaviour is undesirable, an optional argument cover can be used to specify the values to use as system output instead of invoking covering. cover must be an array of length y_dim.
val_error = xcs.score(X_val, y_val)
val_error = xcs.score(X_val, y_val, N=1000, cover=[0.1])
Library Stub:
def score(
self,
X_val: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, x_dim)
y_val: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, y_dim)
N: int = 0, # max number of samples to use
cover: Optional[np.ndarray[Any, np.dtype[np.float64]]] = None, # shape = (1, y_dim)
) -> float: ...
The predict() function may be used as below to calculate the XCSF predictions for a supplied data set. No updates or EA invocations are performed. The input must be a 2-D numpy array of the shape (n_samples, x_dim). Returns a 2-D numpy array of shape (n_samples, y_dim).
Note that similar to score(), if the match set is empty for a given sample then covering will be invoked and this may alter the population set. If this behaviour is undesirable, an optional argument cover can be used to specify the values to use as system output instead of invoking covering. cover must be an array of length y_dim.
predictions = xcs.predict(X_test)
predictions = xcs.predict(X_test, cover=[0.1])
Library Stub:
def predict(
self,
X_test: np.ndarray[Any, np.dtype[np.float64]], # shape = (n_samples, x_dim)
cover: Optional[np.ndarray[Any, np.dtype[np.float64]]] = None, # shape = (1, y_dim)
) -> np.ndarray[Any, np.dtype[np.float64]]: ... # shape = (n_samples, y_dim)
Currently 3 self-adaptive mutation methods are implemented and their use is defined within the various implementations of conditions, actions, and predictions. The smallest allowable mutation rate is MU_EPSILON = 0.0005.
- Uniform adaptation: selects rates from a uniform random distribution. Initially the rate is drawn at random ~U[MU_EPSILON,1]. Offspring inherit the parent's rate, but with 10% probability the rate is randomly redrawn.
- Log normal adaptation: selects rates using a log normal method (similar to evolution strategies). Initially the rate is selected at random from a uniform distribution ~U[MU_EPSILON,1]. Offspring inherit the parent's rate, before applying log normal adaptation: $\mu \leftarrow \mu e^{\mathcal{N}(0,1)}$.
- Rate selection adaptation: selects rates from the following set of 10 values: {0.0005, 0.001, 0.002, 0.003, 0.005, 0.01, 0.015, 0.02, 0.05, 0.1}. Initially the rate is selected at random. Offspring inherit the parent's rate, but with 10% probability the rate is randomly reselected.
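The three adaptation schemes listed above can be sketched as follows. The function names are hypothetical, and clamping the log normal result to [MU_EPSILON, 1] is an assumption based on the stated minimum rate; the XCSF source may bound rates differently.

```python
import math
import random

MU_EPSILON = 0.0005  # smallest allowable mutation rate (from the text)
RATES = [0.0005, 0.001, 0.002, 0.003, 0.005, 0.01, 0.015, 0.02, 0.05, 0.1]


def adapt_uniform(mu: float, rng: random.Random) -> float:
    """Inherit the parent's rate; with 10% probability redraw it uniformly."""
    return rng.uniform(MU_EPSILON, 1.0) if rng.random() < 0.1 else mu


def adapt_log_normal(mu: float, rng: random.Random) -> float:
    """Multiply the parent's rate by e^N(0,1)."""
    mu = mu * math.exp(rng.gauss(0.0, 1.0))
    return min(1.0, max(MU_EPSILON, mu))  # ASSUMPTION: clamp to [MU_EPSILON, 1]


def adapt_rate_select(mu: float, rng: random.Random) -> float:
    """Inherit the parent's rate; with 10% probability reselect from the fixed set."""
    return rng.choice(RATES) if rng.random() < 0.1 else mu


rng = random.Random(0)
mu = rng.uniform(MU_EPSILON, 1.0)  # initial rate ~U[MU_EPSILON, 1]
for _ in range(5):  # five generations of inheritance with log normal adaptation
    mu = adapt_log_normal(mu, rng)
print(MU_EPSILON <= mu <= 1.0)  # True
```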
Related Literature:
- L. Bull and J. Hurst (2003) A neural learning classifier system with self-adaptive constructivism
- G. D. Howard, L. Bull, and P.-L. Lanzi (2008) Self-adaptive constructivism in neural XCS and XCSF
- M. V. Butz, P. O. Stalph, and P.-L. Lanzi (2008) Self-adaptive mutation in XCSF
- M. Serpell and J. E. Smith (2010) Self-adaptation of mutation operator and probability for permutation representations in genetic algorithms
This project is released under the terms of the GNU General Public License v3.0 (GPLv3).