<html>
<div>
  <img src="https://www.engineersgarage.com/wp-content/uploads/2021/11/TCH36-01-scaled.jpg" width=360px width=auto style="vertical-align: middle;">
  <span style="font-family: Georgia; font-size:30px; color: white;"> <br/> University of Tehran <br/> AI_CA2 <br/> Spring 02 </span>
</div>
<span style="font-family: Georgia; font-size:15pt; color: white; vertical-align: middle;"> low_mist - std id: 810100186 </span>
</html>

In this notebook we learn about genetic algorithms and how to use them to find solutions when ordinary search algorithms are not effective.

## Problem Description
In this problem we are given the return and risk of investing in a number of companies (in `sample.csv`). We want to find a coefficient for each company, showing how much we invest in it, so that the portfolio satisfies the following constraints:
- the return should be at least 1000 percent
- the risk should be at most 60 percent
- we have to invest in at least 30 different companies

A sample result is given in `sample_coeffs.csv`.

Since ordinary search algorithms won't be effective here, we use a genetic algorithm to find a solution. This algorithm is derived from natural selection: we start with an initial population that evolves over time so that only the fittest individuals survive.
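
As a compact restatement, the three constraints can be written as below. The symbols are introduced here only for illustration and do not appear in the data files: $c_i$ is the coefficient of company $i$, $r_i$ and $k_i$ are its return and risk, and $n$ is the number of companies. This assumes that portfolio return and risk are coefficient-weighted sums and that percentages are expressed as fractions (1000 percent as 10, 60 percent as 0.6), which matches the thresholds used in the code.

$$
\sum_{i=1}^{n} c_i r_i \ge 10, \qquad
\sum_{i=1}^{n} c_i k_i \le 0.6, \qquad
\left|\{\, i : c_i \ne 0 \,\}\right| \ge 30
$$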

## Modeling

### Consts
The algorithm has many constant parameters, such as the mutation probability, so I decided to store all of them in one dataclass.

In [4]:
from __future__ import annotations
import random
import bisect
from dataclasses import dataclass
from itertools import accumulate
import pandas as pd
from copy import deepcopy
from typing import Any, Callable, Optional

@dataclass
class Consts:
    crossover_probability: float  # probability in [0, 1]
    mutation_probability: float  # probability in [0, 1]
    maximum_number_of_evolutions: int
    carry_percentage: float  # fraction in [0, 1]
    chromosome_size: int
    return_list: list[float]
    risk_list: list[float]
    return_threshold: float
    risk_threshold: float
    num_of_investments_threshold: int
    population_size: int

### DataFrame
We read the CSV file into a data frame and then extract the return and risk of each company from it.

In [5]:
CSV_ADDRESS = "assets/sample.csv"
df = pd.read_csv(CSV_ADDRESS)

consts = Consts(
    crossover_probability = 0.6,
    mutation_probability = 0.1,
    chromosome_size = len(df),  # one gene per company (one row per company)
    carry_percentage = 0.2,
    maximum_number_of_evolutions = 1000,
    return_list = df["return"].tolist(),  # plain lists, matching the declared types
    risk_list = df["risk"].tolist(),
    population_size = 400,
    return_threshold = 10,
    risk_threshold = 0.6,
    num_of_investments_threshold = 30
)

### Chromosome
A chromosome is one individual in our population. Each gene is a coefficient that shows how much we invest in the corresponding company. The class has some useful methods:
- `mutate` mutates the current chromosome by swapping two randomly chosen genes
- `mate` takes another chromosome and returns a new offspring of the two
- `calc_fitness` computes the fitness, which a genetic algorithm needs in order to judge how good a chromosome is
- `normalize` rescales the coefficients so that they sum to 1
- `is_goal` checks whether the chromosome meets all the constraints
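
To make the two genetic operators concrete, here is a minimal standalone sketch on plain lists. The function names `swap_mutation` and `one_point_crossover`, the toy parents, and the seeded `random.Random` are assumptions made for illustration; they are not part of the `Chromosome` class.

```python
import random


def swap_mutation(genes: list, rng: random.Random) -> list:
    """Return a copy of genes with two randomly chosen positions swapped."""
    mutated = list(genes)
    i = rng.randrange(len(mutated))
    j = rng.randrange(len(mutated))
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def one_point_crossover(a: list, b: list, rng: random.Random) -> list:
    """Take the head of a and the tail of b, split at a random point."""
    point = rng.randrange(len(a))
    return a[:point] + b[point:]


rng = random.Random(0)  # seeded so the run is repeatable
parent_a = [0.1, 0.2, 0.3, 0.4]
parent_b = [0.4, 0.3, 0.2, 0.1]

child = one_point_crossover(parent_a, parent_b, rng)
mutant = swap_mutation(child, rng)
print(child, mutant)
```

Note that a swap only reorders genes, so it never changes the sum of the coefficients, while a crossover usually does.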

In [6]:
class Chromosome:
    def __init__(self, coefficients: Optional[list[float]] = None):
        if coefficients is not None:
            self.coefficients = coefficients
        else: 
            self.coefficients = [random.random() for _ in range(consts.chromosome_size)]
        
    def mutate(self):
        if random.random() < consts.mutation_probability:
            first_index = random.randint(0, consts.chromosome_size - 1)
            second_index = random.randint(0, consts.chromosome_size - 1)
            self.coefficients[first_index], self.coefficients[second_index] = self.coefficients[second_index], self.coefficients[first_index]
            
    def mate(self, other: Any) -> Chromosome:
        if not isinstance(other, Chromosome):
            raise ValueError("can't mate with another type")
        
        # Without crossover the offspring is a copy of this parent
        # (one common convention, assumed here).
        offspring_coefficients = list(self.coefficients)
        if random.random() < consts.crossover_probability:
            crossover_point = random.randint(0, consts.chromosome_size - 1)
            # Choose at random which parent contributes the head.
            if random.random() < 0.5:
                offspring_coefficients = self.coefficients[:crossover_point] + other.coefficients[crossover_point:]
            else:
                offspring_coefficients = other.coefficients[:crossover_point] + self.coefficients[crossover_point:]
            
        return Chromosome(offspring_coefficients) 
    
    def calc_return(self) -> float:
        return sum([x * y for x, y in zip(self.coefficients, consts.return_list)])

    def calc_risk(self) -> float:
        return sum([i * j for i, j in zip(self.coefficients, consts.risk_list)])
    
    def calc_fitness(self) -> float:
        return self.calc_return() - 2 * self.calc_risk()
               
    def normalize(self):
        sum_of_coefficients = sum(self.coefficients)
        self.coefficients = [x / sum_of_coefficients for x in self.coefficients]
        
    def is_goal(self) -> bool:
        # Number of companies with a non-zero coefficient.
        num_of_investments = sum(1 for c in self.coefficients if c != 0)
        return (
            self.calc_return() >= consts.return_threshold
            and self.calc_risk() <= consts.risk_threshold
            and num_of_investments >= consts.num_of_investments_threshold
        )
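
To make the fitness and goal checks concrete, here is a small worked example on three made-up companies. All numbers, the helper name `weighted_sum`, and the toy threshold of 3 investments are assumptions for illustration; only the formulas (weighted sums, fitness as return minus twice the risk, thresholds of 10 and 0.6) follow the class above.

```python
def weighted_sum(coeffs: list, values: list) -> float:
    """Coefficient-weighted sum, as used for both return and risk."""
    return sum(c * v for c, v in zip(coeffs, values))


raw = [2.0, 1.0, 1.0]            # unnormalized coefficients (made up)
returns = [12.0, 8.0, 4.0]       # made-up returns, as fractions
risks = [0.8, 0.4, 0.2]          # made-up risks, as fractions

# Normalize so the coefficients sum to 1: [0.5, 0.25, 0.25]
total = sum(raw)
coeffs = [c / total for c in raw]

ret = weighted_sum(coeffs, returns)   # 0.5*12 + 0.25*8 + 0.25*4 = 9.0
risk = weighted_sum(coeffs, risks)    # 0.4 + 0.1 + 0.05, about 0.55
fit = ret - 2 * risk                  # about 7.9

# Goal check with a toy minimum of 3 investments instead of 30.
invested = sum(1 for c in coeffs if c != 0)
meets = ret >= 10 and risk <= 0.6 and invested >= 3
print(ret, risk, fit, meets)
```

Here the risk and investment-count constraints hold, but the return of 9.0 is below the threshold of 10, so this portfolio is not a goal state.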

### Population
The population consists of many chromosomes. It represents the current generation, which lasts until the next evolution step.

In [None]:
class Population:
    def __init__(self):
        self.chromosomes = [Chromosome() for _ in range(consts.population_size)]
        
    def found_goal(self) -> tuple[bool, Optional[Chromosome]]:
        for chromosome in self.chromosomes:
            if chromosome.is_goal():
                return True, chromosome
        return False, None
    
    def evolve(self):
        # Placeholder: building the next generation is not implemented yet.
        pass
    
    def genetic_algorithm(self) -> Optional[Chromosome]:
        num_of_evolutions = 0
        while num_of_evolutions < consts.maximum_number_of_evolutions:
            found, chromosome = self.found_goal()
            if found:
                return chromosome
    
            self.evolve()
            num_of_evolutions += 1
        return None
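
The `evolve` step is left empty above. One possible shape for it is sketched below on plain lists; this is an assumption, not the notebook's implementation. It carries the best fraction of the population over unchanged (elitism) and fills the rest with offspring of parents chosen by rank-based selection using `accumulate` and `bisect`. Rank weights are used instead of raw fitness because a fitness of return minus twice the risk can be negative. The names `rank_select`, `evolve_step`, and `toy_fitness` are hypothetical.

```python
from __future__ import annotations

import bisect
import random
from itertools import accumulate
from typing import Callable


def rank_select(ranked: list[list[float]], cumulative: list[int],
                rng: random.Random) -> list[float]:
    """Pick one individual; later (better-ranked) positions are more likely."""
    pick = rng.random() * cumulative[-1]
    return ranked[bisect.bisect_right(cumulative, pick)]


def evolve_step(population: list[list[float]],
                fitness: Callable[[list[float]], float],
                carry_fraction: float,
                crossover_probability: float,
                mutation_probability: float,
                rng: random.Random) -> list[list[float]]:
    """Build the next generation from the current one."""
    ranked = sorted(population, key=fitness)  # worst first, best last
    # Rank weights 1..n, stored as running totals for bisect.
    cumulative = list(accumulate(range(1, len(ranked) + 1)))

    # Elitism: copy the best individuals over unchanged.
    n_carry = int(carry_fraction * len(ranked))
    next_generation = [list(g) for g in ranked[len(ranked) - n_carry:]]

    while len(next_generation) < len(ranked):
        mother = rank_select(ranked, cumulative, rng)
        father = rank_select(ranked, cumulative, rng)
        if rng.random() < crossover_probability:
            point = rng.randrange(len(mother))
            child = mother[:point] + father[point:]  # one-point crossover
        else:
            child = list(mother)
        if rng.random() < mutation_probability:
            i, j = rng.randrange(len(child)), rng.randrange(len(child))
            child[i], child[j] = child[j], child[i]  # swap mutation
        next_generation.append(child)
    return next_generation


def toy_fitness(genes: list[float]) -> float:
    """Toy objective for the demo: prefer a large first gene."""
    return genes[0]


rng = random.Random(42)
population = [[rng.random() for _ in range(5)] for _ in range(20)]
best_per_generation = [max(map(toy_fitness, population))]
for _ in range(10):
    population = evolve_step(population, toy_fitness, 0.2, 0.6, 0.1, rng)
    best_per_generation.append(max(map(toy_fitness, population)))
print(best_per_generation)
```

Because the elite individuals are copied unchanged, the best fitness in this sketch never decreases from one generation to the next.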
        