cgp-vec is a small library implementing Cartesian Genetic Programming
(CGP) parallelized over multiple independent populations, using PyTorch
as a backend for basic vectorized operations.
Since this library is still in a very early development stage, the API may vary from commit to commit, together with the accuracy and completeness of the documentation.
More detailed benchmarks will be available soon. Testing the examples on a Google Colab GPU runtime gave execution times on the order of minutes for 1000 runs of 500 evolution steps of CGP on Koza symbolic regression problems, with configurations similar to those in this GECCO '07 paper, which uses larger populations than most of the CGP literature.
The library is still in a very early development stage. The provided functionality works as intended, and is sufficient to implement common CGP experiments; however, for now it only supports CGP configurations with 1-row genotypes and no "levels back" parameter. This is the most common CGP configuration, since it can encode any (bounded) directed acyclic graph. Some minor unit tests are also missing.
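To make the 1-row configuration concrete, here is a minimal plain-Python sketch of how such a genotype can be decoded and evaluated. It is illustrative only: the gene layout, the `PRIMITIVES` table and the `evaluate` helper are assumptions made for this example, not the encoding or API used by the library, and nothing here is vectorized.

```python
import operator

# Hypothetical primitive set; the library's own primitives may differ.
PRIMITIVES = [operator.add, operator.sub, operator.mul]


def evaluate(genotype, output_gene, inputs):
    """Evaluate a 1-row CGP genotype on a list of input values.

    genotype: list of (function_index, connection_a, connection_b) triples,
    one per node. Connections index into the inputs followed by the outputs
    of earlier nodes, so the encoded graph is acyclic by construction.
    """
    values = list(inputs)
    for func, a, b in genotype:
        # With no "levels back" limit, a node may read any earlier value.
        assert a < len(values) and b < len(values)
        values.append(PRIMITIVES[func](values[a], values[b]))
    return values[output_gene]


# Encodes x*x + x for a single input x (index 0):
#   index 1 = x * x, index 2 = (x * x) + x, index 3 = x - x (never read)
genotype = [(2, 0, 0), (0, 1, 0), (1, 0, 0)]
result = evaluate(genotype, output_gene=2, inputs=[3.0])
```

For simplicity this sketch computes every node, including the last one, which the output never reads; skipping such inactive nodes is a common optimization.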
Planned additions:
- support for all classical CGP configurations (multiple rows, levels back parameter);
- explicit generation of efficient (vectorized) phenotypes, in the form of either Python callables or PyTorch modules;
- notable CGP variants from J. Miller's book "Cartesian Genetic Programming";
- convenience functions (vectorized) for common CGP problems: symbolic regression losses, etc.
While this repository contains a proper Python package, it is not yet
registered on PyPI, so for now it must be installed by cloning the
repository and running pip install . from its root; the only external
dependency is PyTorch. The library has only been tested on Python 3.7
and 3.10.
Unit tests are in the tests directory; to run all of them, run the
following from the top-level directory:
python -m unittest -v "tests.test_cgpv"
An overview of the API is given below (see also Documentation). Full
symbolic regression examples are also available in the examples
directory, both as Python scripts and as notebooks.
The whole API is contained in the cgpv package. The vectorized CGP
operations are available either as simple functions or as methods of the
Populations class; most users may find the latter more convenient.
The following (vectorized) operations are available:
- counting the number of (valid) alleles for each locus: count_alleles (performed automatically when creating new Populations objects, if n_alleles is not provided);
- generation of multiple random populations: random_populations or Populations.random (static method);
- random mutation: mutate and Populations.mutate;
- evaluating populations on a tensor input: eval_populations, or simply calling a Populations object (since it implements __call__);
- roulette-wheel selection: roulette_wheel or Populations.roulette_wheel;
- plus-selection: plus_selection or Populations.plus_selection.
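As an illustration of what "vectorized over multiple populations" can look like for the two selection schemes, here is a minimal PyTorch sketch. The function names, argument names and tensor layouts are assumptions made for this example; they are not the signatures of roulette_wheel or plus_selection in cgpv. The sketch also assumes that higher fitness is better and that fitnesses are non-negative.

```python
import torch


def roulette_wheel_sketch(genomes, fitnesses, n_selected, generator=None):
    """Fitness-proportionate selection, independently for each population.

    genomes:   (n_populations, n_individuals, n_genes) integer tensor
    fitnesses: (n_populations, n_individuals) non-negative float tensor,
               higher is better (an assumption of this sketch)
    """
    # Each row is treated as unnormalized sampling weights for one population.
    idx = torch.multinomial(fitnesses, n_selected, replacement=True,
                            generator=generator)
    # Broadcast the chosen indices over the gene dimension, then gather.
    gene_idx = idx.unsqueeze(-1).expand(-1, -1, genomes.shape[-1])
    return torch.gather(genomes, 1, gene_idx)


def plus_selection_sketch(parents, parent_fit, offspring, offspring_fit, n_kept):
    """Keep the n_kept fittest individuals among parents and offspring."""
    genomes = torch.cat([parents, offspring], dim=1)
    fitnesses = torch.cat([parent_fit, offspring_fit], dim=1)
    kept_fit, idx = torch.topk(fitnesses, n_kept, dim=1)
    gene_idx = idx.unsqueeze(-1).expand(-1, -1, genomes.shape[-1])
    return torch.gather(genomes, 1, gene_idx), kept_fit


# Two populations of four individuals with three genes each.
genomes = torch.arange(24).reshape(2, 4, 3)
# All the weight on one individual per population makes the draw predictable.
fitnesses = torch.tensor([[0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]])
selected = roulette_wheel_sketch(genomes, fitnesses, n_selected=5)

kept, kept_fit = plus_selection_sketch(
    parents=genomes[:, :1],
    parent_fit=torch.tensor([[1.0], [4.0]]),
    offspring=genomes[:, 1:3],
    offspring_fit=torch.tensor([[3.0, 2.0], [0.5, 0.1]]),
    n_kept=1,
)
```

Both helpers avoid Python loops over populations: the population axis is just the first tensor dimension. The plus-selection sketch does not implement any tie-breaking rule between parents and offspring, which CGP experiments often care about.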
Populations objects also provide a fitnesses tensor attribute for
convenience, used to store fitness matrices for the populations. If set,
this attribute is then used by the selection methods.
To avoid duplicated work and unnecessary tensor copies, most methods
that return new populations (including the __init__ method) make no
attempt to deep-copy the objects they are given, so the configuration
attributes of different populations may point to the same
objects/tensors. This is not a problem for the intended use cases, but
users should be aware of it; it may also change in the future.
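The practical consequence of this sharing can be shown with a small plain-Python sketch. The `Pops` class below is a hypothetical stand-in written for this example, not the library's Populations class; it only mimics the "store by reference, do not deep-copy" behavior described above.

```python
import copy


class Pops:
    """Hypothetical stand-in for a populations container."""

    def __init__(self, genomes, n_alleles):
        self.genomes = genomes
        self.n_alleles = n_alleles  # stored by reference, no deep copy

    def mutated(self):
        # New genomes, but the configuration object is reused as-is.
        new_genomes = [list(g) for g in self.genomes]
        return Pops(new_genomes, self.n_alleles)


parents = Pops([[0, 1, 2]], n_alleles=[3, 3, 3])
children = parents.mutated()

# Both objects point to the same configuration list.
shared = children.n_alleles is parents.n_alleles

# An in-place edit through one object is visible through the other.
children.n_alleles[0] = 5

# An explicit deep copy gives a fully independent object when one is needed.
independent = copy.deepcopy(parents)
independent.n_alleles[0] = 7
```

In short, configuration objects are best treated as read-only once they have been passed to a population, and copied explicitly when an independent version is needed.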
Seed parity between the methods of Populations and the corresponding
functions can be explicitly tested on CPU devices with the unit tests;
note, however, that running on CUDA devices may break reproducibility
anyway (see the PyTorch documentation on reproducibility).
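As a minimal sketch of what seed-based reproducibility means on CPU, the snippet below draws the same random genes twice from generators seeded identically. It uses only standard PyTorch calls; the `random_genes` helper and its parameters are hypothetical and are not part of cgpv.

```python
import torch


def random_genes(seed, shape=(2, 4), n_alleles=10, device="cpu"):
    # A dedicated generator keeps the draw independent of global RNG state.
    generator = torch.Generator(device=device).manual_seed(seed)
    return torch.randint(0, n_alleles, shape, generator=generator, device=device)


first = random_genes(seed=42)
second = random_genes(seed=42)

# On CPU, identical seeds give identical draws.
same_draw = torch.equal(first, second)
```

On CUDA devices some operations are nondeterministic even with fixed seeds; the PyTorch reproducibility notes describe options such as torch.use_deterministic_algorithms(True), which are not shown here.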
Documentation can be generated in various formats with Sphinx, using
docs/Makefile or docs/make.bat. For example, to generate the
documentation in HTML format, one can run:
cd docs
make html
docs/requirements.txt contains the packages needed to build the
documentation.
CONTRIBUTING.md is the starting point for all information concerning
the development of this library, both for code and documentation.
@misc{cgpvec-2022-git,
author = {Fanti, Andrea and Gallotta, Roberto},
title = {{cgp-vec}: Vectorized Cartesian Genetic Programming},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository}
}