tinyga
A Tiny GA written in C as a submission to The Murphy Cup
Usage
./tinyga length popsize generations xo_percentage mutation_percentage random_seed
E.g.,
./tinyga 10 10 100 50 5 1
Also
./tinytinyga 10 10 100 50 5 1
which is a more concisely written version of tinyga.
Description
The goal is to create as a short as possible (but still comprehensible) a fixed length binary GA to solve the One-Max problem. The GA is required to use crossover (xo), reproduction and mutation as genetic operators and fitness proportionate selection for... selection.
Fitness/Error is minimising.
rand
from stdlib.h
is used throughout. The params[6]
is used as a seed using srand
.
Parameters
There are six command line arguments/parameters. params
holds a pointer to them:
- Problem or individual length
- Population Size
- Number of generations/iterations
- Crossover probability/1-Reproduction probability
- Mutation probability
- Random number generator seed
atoi
is used throughout to convert the parameter strings to integer values.
Data Structures
The data structures are as follows:
- A 2*PopSize*IndiviualSize _Bool array to store the current generation's population and the next generation's population. These alternate using
mod 2
- A PopSize int array to store the fitness values
- A PopSize double array to store the selection probabilities
Allocation
The data structures are then allocated on the heap using the recursive allocateNdArrays
function.
- The memory allocated at each recursive call depends on two parameters,
dimensions
, how many dimensions out of N are left to be allocated, andparamLevel
, which index fromparams
should read for the size of this current dimension- There are two cases for
dimensions
:> 1
the data type size being allocated is that of a pointer- Otherwise, allocate the actual data type size, stored in
baseSize
- There are also two cases for
paramLevel
paramLevel == 3
i.e., generational level, only use size 2 for this dimension- Otherwise use size of
params[paramLevel]
for this dimension
- There are two cases for
Initialisation
Initialisation is as follows:
- Normally two for-loops but since size is an issue, the product of the two ranges are used as a new range for a single loop:
params[2] * params[1]
- To extract the first variable from this range, divide by the second value
- To extract the second variable from this range, mod by the second
Generation Loop
Loop from generation 0 while less than the total number of generations and the best individual evaluated has a fitness > 0.
- Evaluate/Call fitness function
- Print stats
- Perform genetic operations
After exiting the loop, only perform the last evaluation if the final generation was reached. Otherwise Success!.
Fitness Evaluation
Iterate over each individual of the population to assign fitness values
- Calculate the fitness by iterating over the individual and counting the number of bits set to 1
- The fitness value used is actually the error, which is minimised, so subtract this number from the length of individual/problem size/
params[1]
- Also keep track of index of the most fit individual in
best
- Over the loop a number of variables are accumulated, the
sum
of fitness and theavg
. fitness of the population
- The fitness value used is actually the error, which is minimised, so subtract this number from the length of individual/problem size/
Next, the fitness values are used to assign selection probabilities, selprob
- Iterating over each individual dividing the individual's fitness by the sum of all fitness values in the population
- The number of bits set to 1 is used here, as opposed to the error, so some additional calculation is required.
Print Stats
Outputs the current generation, average fitness in the population, best fitness in the population and a string representation of the most fit individual to the terminal.
Selection
During fitness evaluation each individual is assigned a selection probability. A random number in the range [0, 1] is generated. Iterating through the population, the selection probability of each individual is subtracted from the random number. This continues until the random number is < zero. The last individual iterated over is selected.
Genetic Operations
- An individual is selected from the population using
selecti
and is copied into the next generation's population - Another individual is selected
- A number from the range [0, 100) is generated and compared against the crossover probability in
params[4]
:- If < this probability, another random number is chosen in the range [0, IndividualSize) and over the start of the previously copied individual (crossover)
- Otherwise, nothing else is copied (reproduction)
- A number from the range [0, 100) is generated and compared against the crossover probability in
- This new individual is iterated over, checking each bit to see if mutation events should occur,
rand() % 100 < params[5]