# Portfolio Optimization with Genetic Programming

# 1. Defining the terminals
In this first step we define the primitive set (`pset`). The `pset` contains the functions, constants and input parameters that our trees will be built from.

First define the input parameters for the generated trees.

In [None]:
from deap import gp
from stock import Stock
import pandas as pd

pset = gp.PrimitiveSetTyped("main", [pd.Series], pd.Series)

Define the primitive functions for the nodes. The first argument is the function itself, the second is the list of input types, and the third is the output type.

In [None]:
from stock_functions import *

pset.addPrimitive(buy_random, [pd.Series], pd.Series)
pset.addPrimitive(sell, [pd.Series], pd.Series)

Rename the input parameter so that it appears as `stocks` instead of `ARG0` in the generated trees.

In [None]:
pset.renameArguments(ARG0="stocks")

# 2. Defining Object types
Every evolutionary program needs some basic object types. Here we need two: a fitness type and a type for individuals. The problem combines maximization and minimization: we want to maximize the portfolio value and minimize the tree size, so one fitness weight would be positive and the other negative. Note that the code below currently registers only a single positive weight. The individual is based on a tree, to which we attach the defined fitness.

In [None]:
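from deap import base, creator, tools

# A single positive weight means this one objective is maximized
creator.create("Fitness", base.Fitness, weights=(1.0, ))
creator.create("Individual", gp.PrimitiveTree, fitness=creator.Fitness, pset=pset)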
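
To make the role of the weights more concrete, here is a minimal sketch in plain Python, without DEAP. It assumes, as DEAP's `base.Fitness` roughly does, that each objective value is multiplied by its weight and that the resulting tuples are compared element by element. The helper `weighted` and the sample numbers are made up for illustration.

```python
def weighted(values, weights):
    """Multiply each objective value by its weight."""
    return tuple(v * w for v, w in zip(values, weights))

# Hypothetical two-objective setup: maximize value (+1.0), minimize tree size (-1.0)
weights = (1.0, -1.0)

# Two made-up individuals with the same portfolio value but different tree sizes
a = weighted((120.0, 7), weights)    # (120.0, -7.0)
b = weighted((120.0, 15), weights)   # (120.0, -15.0)

# Tuples compare lexicographically, so the smaller tree wins the tie
a_is_better = a > b
```

With a negative weight, a smaller raw value becomes a larger weighted value, which is why minimization objectives can be handled by the same "bigger is better" comparison.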

# 3. Define helper functions
Register the functions that we use throughout the whole algorithm (generate, evaluate, mutate, ...). Any structure with access to the toolbox also has access to all of these registered functions.

## 3.1 Generating individuals

In [None]:
toolbox = base.Toolbox()
# Defines what a tree expression looks like (depth between 1 and 3)
toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=1, max_=3)
# How an individual should be generated (in this case as a tree)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)
# What the population of individuals should look like (a list of individuals)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
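
The idea behind `gp.genHalfAndHalf` can be sketched without DEAP. The code below is a simplified illustration, not DEAP's implementation: it ignores types, represents trees as nested tuples, and uses a fixed stop probability of 0.5 for the "grow" method. All names (`PRIMITIVES`, `TERMINALS`, `gen_tree`, `gen_half_and_half`, `depth_of`) are hypothetical.

```python
import random

# Hypothetical primitives (name -> number of children) and terminals
PRIMITIVES = {"buy_random": 1, "sell": 1}
TERMINALS = ["stocks"]

def gen_tree(min_depth, max_depth, full, rng):
    """Build a nested-tuple tree; `full` forces every branch to the chosen height."""
    height = rng.randint(min_depth, max_depth)

    def build(depth):
        if full:
            stop = depth == height
        else:
            # "Grow": a branch may stop early once the minimum depth is reached
            stop = depth == height or (depth >= min_depth and rng.random() < 0.5)
        if stop:
            return rng.choice(TERMINALS)
        name = rng.choice(sorted(PRIMITIVES))
        return (name,) + tuple(build(depth + 1) for _ in range(PRIMITIVES[name]))

    return build(0)

def gen_half_and_half(min_depth, max_depth, rng):
    """Pick the "full" or the "grow" method with equal probability."""
    return gen_tree(min_depth, max_depth, full=rng.choice([True, False]), rng=rng)

def depth_of(tree):
    """Depth of a nested-tuple tree; a bare terminal has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth_of(child) for child in tree[1:])

rng = random.Random(0)
trees = [gen_half_and_half(1, 3, rng) for _ in range(50)]
```

Mixing both methods gives an initial population with more varied tree shapes than either method alone.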

## 3.2 Evaluation of trees
Define functions that help us evaluate an individual, which includes calculating its fitness. First, however, we need to generate executable Python code from our tree individuals.

To get working Python code out of a generated tree we can use the `gp.compile` function.

In [None]:
# Generates Python code out of trees
toolbox.register("compile", gp.compile, pset=pset)

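def evaluate(tree):
    # Compile the tree into a callable using the previously registered function
    function = toolbox.compile(tree)

    # Accumulate the absolute error over 100 test inputs.
    # NOTE: `y` (the target function) is not defined in this notebook and must be supplied.
    total_error = 0.0
    for i in range(100):
        result = function(i)
        total_error += abs(y(i) - result)

    # DEAP expects the fitness as a tuple, hence the trailing comma.
    # NOTE: the fitness above uses a positive weight, so DEAP would maximize
    # this value; an error measure normally needs a negative weight.
    return total_error / 100,

# Now add the evaluation function to our toolbox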
toolbox.register("evaluate", evaluate)
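
As a rough illustration of what compiling a tree means, the sketch below turns the string form of an expression into a callable by wrapping it in a `lambda` and evaluating it against a dictionary of known functions. This mirrors the general idea behind `gp.compile`, but it is a simplified stand-in. The two functions here are hypothetical placeholders operating on a plain list, not the real `buy_random` and `sell` from `stock_functions`.

```python
def buy_random(stocks):
    # Hypothetical stand-in: record a "buy" action
    return stocks + ["buy"]

def sell(stocks):
    # Hypothetical stand-in: record a "sell" action
    return stocks + ["sell"]

# The names a tree is allowed to reference, similar in spirit to a primitive set
context = {"buy_random": buy_random, "sell": sell}

# String form of a small tree; the argument name matches the renamed input
expression = "sell(buy_random(stocks))"

# Wrap the expression in a lambda and evaluate it to obtain a callable
source = "lambda stocks: " + expression
strategy = eval(source, context, {})

result = strategy([])
```

The resulting `strategy` applies the inner node first and the root node last, which is the order in which a tree is evaluated.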

In [10]:
import pandas as pd

class Stock:
    def __init__(self, sym: str):
        # Load only the date and closing-price columns
        self.data = pd.read_csv('./data/stocks/' + sym + '.csv', usecols=lambda x: x.lower() in ["date", "close"])
        self.data['Date'] = pd.to_datetime(self.data['Date'])

    def _close_on_or_after(self, day: pd.Timestamp) -> float:
        # January 1 is not a trading day, so use the first available close on or after `day`.
        # Assumes the price column is named 'Close', matching the capitalization of 'Date'.
        rows = self.data[self.data['Date'] >= day].sort_values('Date')
        return float(rows['Close'].iloc[0])

    def get_avg_profit(self, start: pd.Timestamp, to: pd.Timestamp):
        # Approximate number of whole years between the two dates
        diff_years = round((to - start).days / 365.25)
        print('Diff years: ' + str(diff_years))
        if diff_years < 1:
            return None
        avg_profit = 0.0
        for year in range(diff_years):
            first = pd.Timestamp(start.year + year, 1, 1)
            last = pd.Timestamp(start.year + year + 1, 1, 1)
            # Compare scalar closing prices; dividing whole rows would also
            # divide the Date column, which raises a TypeError
            close_first = self._close_on_or_after(first)
            close_last = self._close_on_or_after(last)
            avg_profit += (close_last - close_first) / close_first
        return avg_profit / diff_years


In [11]:
stock = Stock('AAPL')

avg_profit = stock.get_avg_profit(start=pd.Timestamp(2000, 3, 1), to=pd.Timestamp(2019, 7, 12))

Diff years: 19
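
The calculation that `get_avg_profit` aims for can be sketched without pandas. The example below is a minimal illustration under stated assumptions: prices are given as `(date, close)` pairs, and because January 1 is not a trading day, each year uses the first close on or after that date. The function names and the price data are made up for illustration.

```python
from datetime import date

def first_close_on_or_after(prices, day):
    """Return the close of the earliest entry dated `day` or later."""
    for d, close in sorted(prices):
        if d >= day:
            return close
    raise ValueError(f"no price on or after {day}")

def average_yearly_profit(prices, start_year, n_years):
    """Average the relative year-over-year change of the closing price."""
    total = 0.0
    for offset in range(n_years):
        first = first_close_on_or_after(prices, date(start_year + offset, 1, 1))
        last = first_close_on_or_after(prices, date(start_year + offset + 1, 1, 1))
        total += (last - first) / first
    return total / n_years

# Made-up prices for illustration only
prices = [
    (date(2000, 1, 3), 100.0),
    (date(2001, 1, 2), 110.0),
    (date(2002, 1, 2), 99.0),
]

# Year 1: +10 %, year 2: -10 %, so the average yearly profit is about zero
result = average_yearly_profit(prices, 2000, 2)
```

Working with scalar closing prices rather than whole table rows keeps the date column out of the arithmetic.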