# Genetic Programming for Symbolic Regression

In this question, your task is to build a GP system to automatically evolve a number of genetic programs for the following regression problem:

You can use a GP library. You should:

- Determine and describe the terminal set and the function set.

- Design the fitness cases and fitness function.

- Set the necessary parameters, such as population size, max tree depth, termination criteria, crossover and mutation rates.

- Run the GP system 3 times with different random seeds. Report the best genetic program (its structure and performance) from each of the 3 runs. Present your observations and discussion, and draw your conclusions.
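
The fitness cases and fitness function asked for in the list above can be sketched in plain Python, independent of any GP library. The target function `target` is a hypothetical placeholder, because the regression problem itself is not reproduced here; the 20 evenly spaced sample points on [-1, 1] and the mean squared error measure are also assumptions chosen for illustration.

```python
# Hypothetical target; replace it with the function given in the assignment.
def target(x):
    return x**2 + x


def make_fitness_cases(lo=-1.0, hi=1.0, n=20):
    """Return n evenly spaced sample points on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def mse(candidate, cases, reference=target):
    """Mean squared error of candidate against reference over the cases (lower is better)."""
    return sum((candidate(x) - reference(x)) ** 2 for x in cases) / len(cases)


cases = make_fitness_cases()
perfect = mse(lambda x: x * (x + 1), cases)   # algebraically identical to the target
offset = mse(lambda x: x**2 + x + 1, cases)   # off by a constant 1 at every point
```

Because lower error is better, a measure like this pairs naturally with a minimising fitness, i.e. a single negative weight.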

Code is inspired by
> https://deap.readthedocs.io/en/master/examples/gp_symbreg.html
>
> https://blog.csdn.net/ocd_with_naming/article/details/99585140

In [1]:

import array
import json
import math
# Function forms of Python's built-in operators for object comparisons,
# logical comparisons, arithmetic operations, and sequence operations
import operator
import random
from collections import deque
from copy import deepcopy

import numpy as np
import pandas as pd
from deap import creator, base, gp, tools, algorithms  # core functionality of DEAP
from deap.benchmarks.tools import diversity, convergence, hypervolume
from matplotlib import pyplot

# Primitives are internal nodes; terminals are leaf nodes that can be either arguments or constants.

# GP uses a tree-based encoding to represent programs and evolves them with evolutionary operators (i.e. selection, crossover, mutation).
# Solution trees are composed of primitive functions (e.g., arithmetic operations, mathematical functions, logical operations) and terminals (variables or constants linked to the problem).

# A population of heuristics is evolved in order to improve their performance.
# To reduce the complexity, only the part which has a direct impact on the TSP heuristic results is evolved, i.e., the scoring function.
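
To make the tree encoding described in the comments above concrete, here is a minimal sketch in plain Python that does not depend on DEAP. The nested-tuple representation and the helper name `evaluate` are illustrative assumptions; they are not how DEAP stores or compiles its trees.

```python
import operator


# Internal nodes are tuples: (function, child, child, ...).
# Leaves are either the variable name "x" or a numeric constant.
def evaluate(node, x):
    """Recursively evaluate a nested-tuple expression tree at the input value x."""
    if isinstance(node, tuple):      # primitive (internal node)
        func, *children = node
        return func(*(evaluate(child, x) for child in children))
    if node == "x":                  # terminal: the input argument
        return x
    return node                      # terminal: a constant


# Tree for add(mul(x, x), neg(1)), i.e. x*x - 1
tree = (operator.add, (operator.mul, "x", "x"), (operator.neg, 1))
value = evaluate(tree, 3)
```

Crossover and mutation then operate on subtrees of such a structure, for example by swapping one tuple for another.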



In [None]:


# Step 1: set up the primitive set used to initialise the population.
# In DEAP, a PrimitiveSet holds both the function set and the terminal set.
# Note (from blogs): "primitive" on its own refers only to functions (internal nodes), not to functions + terminals.
def protectDiv(left, right):
    """Protected division: return 1 instead of raising when the divisor is zero."""
    if right == 0:
        return 1
    return left / right


def protectMod(left, right):
    """Protected modulo: operator.mod raises ZeroDivisionError when the divisor is zero,
    so return 1 in that case (an assumed convention that mirrors protectDiv)."""
    if right == 0:
        return 1
    return left % right


pset = gp.PrimitiveSet("MAIN", 1)  # "MAIN" is the name of the set, 1 is the number of inputs
pset.addPrimitive(operator.add, 2)
pset.addPrimitive(operator.sub, 2)
pset.addPrimitive(operator.mul, 2)
pset.addPrimitive(operator.neg, 1)
pset.addPrimitive(protectMod, 2, name="mod")  # modulo operator %, protected against a zero divisor
pset.addPrimitive(protectDiv, 2)

# Then add the terminals.
# Terminals should be (see https://ieeexplore.ieee.org/document/8778254):
# the number of nodes in the graph
# It is important to add the terminals to the primitive set; otherwise the program will not be able to evolve.
# Ephemeral constants are not fixed: when one is selected into a tree,
# its generating function is executed and the output is used as the value of the constant added to the tree.
# The name is randomised, presumably so that re-running this cell does not clash with an ephemeral already registered under the same name.
pset.addEphemeralConstant(f"{random.randint(-1, 99999999999)}", lambda: random.randint(-1, 1))


pset.renameArguments(ARG0='x') 

# Step 2: use the creator to construct the fitness and Individual classes.
# The cost is minimised, so a weight of -1.0 is used.
creator.create("FitnessMin", base.Fitness, weights=(-1.0,))  # TODO: follow the requirements to develop and implement the fitness evaluation of a GP Individual

creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMin)

# Step 3: register the user-defined functions in the toolbox
# so that the algorithms can call them via toolbox.<name>.
# "Algorithms" refers to the implementation of the evolutionary loop;
# you can write your own or use those provided in the algorithms module.
# The algorithms module expects functions registered under fixed names:
# fitness evaluation must be registered as "evaluate",
# crossover as "mate",
# mutation as "mutate", and selection as "select".
toolbox = base.Toolbox()
toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=1, max_=2)
toolbox.register("Individual", tools.initIterate, creator.Individual, toolbox.expr)
toolbox.register("population", tools.initRepeat, list, toolbox.Individual)
toolbox.register("compile", gp.compile, pset=pset)
# Attribute generator
# toolbox.register("indices", random.sample, range(IND_SIZE), IND_SIZE)
