# $\color{red}{\text{Farbod Siahkali - 810198510}}$

# $\color{green}{\text{Genetic Algorithm for Stock Market Investment}}$
In this problem, we want to invest in the stock market. To do this, we must split our capital across a set of equity assets so that the resulting portfolio has the highest return and the lowest risk.

The solution uses a Genetic Algorithm approach to optimize the coefficients that represent the proportion of our capital invested in each asset. These coefficients are restricted to sum up to 1, meaning that the capital is fully invested.

The code is implemented in Python and uses the following libraries: pandas, numpy, and random.

## $\color{green}{\text{Steps of the Genetic Algorithm}}$
The Genetic Algorithm consists of the following steps:

1. Generate an initial population of chromosomes, where each chromosome is a list of coefficients representing a possible solution.
2. Evaluate the fitness of each chromosome in the population, where the fitness function calculates the return and the risk of the investment.
3. Select the best-performing chromosomes based on their fitness, and create a new population by applying genetic operators such as crossover and mutation to them.
4. Repeat steps 2 and 3 for a fixed number of generations or until a satisfactory solution is found.
5. Return the best chromosome found as the optimal solution.

## $\color{green}{\text{Code Explanation}}$
The Python code provided implements the genetic algorithm for the given problem. Here is a brief explanation of each function:

**'generate_gene()'**

This function generates a random number between 5 and 10 which represents the coefficient for each asset in a chromosome.

**'generate_chromosome(num_assets)'**

This function generates a chromosome (possible solution) by creating a list of num_assets coefficients generated by generate_gene(), and then normalizes the list to ensure that the sum of coefficients is equal to 1.

**'generate_population(pop_size, num_assets)'**

This function generates an initial population of pop_size chromosomes using generate_chromosome().

**'fitness(chromosome, data)'**

This function evaluates the fitness of a chromosome by calculating the return and risk of the investment using the given data, which is a 2D array containing the risk and return values of each asset. The function checks if the risk is greater than 0.6 or the return is less than 10, which would make the solution invalid and return a fitness value of 0. Otherwise, it returns the ratio of return to risk.
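As a small worked example of this rule, the sketch below evaluates two hand-made allocations on a three-asset table. The numbers are invented for illustration and are not taken from sample.csv, and `portfolio_fitness` is a hypothetical name that only mirrors the rule described above.

```python
import numpy as np

# Hypothetical data: one row per asset, columns are (risk, return).
data = np.array([
    [0.2,  8.0],
    [0.5, 12.0],
    [0.9, 20.0],
])

def portfolio_fitness(weights, data, max_risk=0.6, min_return=10.0):
    """Return/risk ratio of a weighted portfolio, or 0 if a constraint is violated."""
    weights = np.asarray(weights, dtype=float)
    risk = float(np.sum(weights * data[:, 0]))
    ret = float(np.sum(weights * data[:, 1]))
    if risk > max_risk or ret < min_return:
        return 0.0  # invalid solution
    return ret / risk

# Risk 0.525 and return 13.0: both constraints hold, fitness is 13.0 / 0.525.
feasible = portfolio_fitness([0.25, 0.5, 0.25], data)
# Everything in the riskiest asset: risk 0.9 exceeds the 0.6 limit.
infeasible = portfolio_fitness([0.0, 0.0, 1.0], data)
print(round(feasible, 3), infeasible)
```

Because every infeasible allocation receives the same fitness of 0, the algorithm gets no signal about how far such an allocation is from being valid.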

**'selection(population, fitness_fn)'**

This function selects the best chromosomes from the population based on their fitness value. It calculates the fitness of each chromosome using fitness_fn (which is fitness() in our case), sorts the chromosomes in descending order of fitness, and selects the top half of the population.

**'crossover(chromosome1, chromosome2)'**

This function performs crossover between two chromosomes by selecting a random point and swapping the coefficients of the two chromosomes from that point onwards.
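A detail worth making explicit: swapping tails between two normalized chromosomes generally produces children whose coefficients no longer sum to 1. In this notebook the children are rescaled inside mutation(), which always normalizes its result. The sketch below illustrates the effect with two made-up parents; `single_point_crossover` and `normalize` are illustrative helper names, not functions from the notebook.

```python
import random

def single_point_crossover(parent1, parent2, rng=random):
    """Swap the tails of two equal-length lists at a random cut point."""
    point = rng.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def normalize(weights):
    """Rescale weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

# Two valid parents: each sums to 1.
p1 = [0.7, 0.1, 0.1, 0.1]
p2 = [0.1, 0.1, 0.1, 0.7]

raw1, raw2 = single_point_crossover(p1, p2, random.Random(0))
print(sum(raw1), sum(raw2))      # the raw children no longer sum to 1

child1, child2 = normalize(raw1), normalize(raw2)
print(sum(child1), sum(child2))  # back to 1 up to rounding
```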

**'mutation(chromosome, mutation_rate)'**

This function mutates a chromosome by replacing each coefficient, independently with probability mutation_rate, with a new value generated by generate_gene(). It then normalizes the chromosome so that the coefficients sum to 1.
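One consequence of this design can be shown with a short calculation. The replacement gene comes from generate_gene(), so it lies between 5 and 10, while the coefficients of a normalized chromosome are about 1/num_assets each. After renormalization a single mutated gene therefore takes most of the capital. The sketch assumes 384 assets (matching the "Number of investments" printed by the run in this notebook) and a fixed replacement value of 7.5, both chosen only for illustration.

```python
num_assets = 384                            # assumed portfolio size
weights = [1.0 / num_assets] * num_assets   # evenly spread, normalized chromosome

weights[0] = 7.5                            # a mutated gene from the 5..10 range
total = sum(weights)
weights = [w / total for w in weights]      # renormalize so the sum is 1 again

print(f"mutated gene share: {weights[0]:.3f}")
print(f"share of each other gene: {weights[1]:.5f}")
```

Under these assumptions the mutated gene ends up with close to 88% of the capital, which is consistent with the reported solution concentrating almost all weight in two tickers.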

**'next_generation(population, fitness_fn, mutation_rate)'**

This function generates the next generation of chromosomes by performing selection, crossover, and mutation on the current population. It first selects the best chromosomes using selection(), and then generates new chromosomes by performing crossover and mutation on the selected chromosomes until the new population has the same size as the current population.

**'run_genetic_algorithm(data, pop_size=100, num_generations=100, mutation_rate=0.1)'**

This function is the main function that runs the genetic algorithm. It takes the data containing the risk and return of each asset as input, and also allows for specifying the population size, number of generations, and mutation rate. It generates the initial population using generate_population(), and then runs the genetic algorithm for num_generations using next_generation(). Finally, it selects the chromosome with the highest fitness as the optimal solution and prints the results.

In [None]:
import pandas as pd
import numpy as np
import random

In [None]:
# Define gene and chromosome
def generate_gene():
    return random.uniform(5, 10)

def generate_chromosome(num_assets):
    chromosome = [generate_gene() for _ in range(num_assets)]
    return [x/sum(chromosome) for x in chromosome] # ensure sum to 1

# Generate initial population
def generate_population(pop_size, num_assets):
    population = [generate_chromosome(num_assets) for _ in range(pop_size)]
    return population

# Fitness function
def fitness(chromosome, data):
    returns = np.sum(chromosome * data[:, 1])
    risks = np.sum(chromosome * data[:, 0])
    if risks > 0.6 or returns < 10:
        return 0 # invalid solution
    else:
        return returns / risks

# Selection function
def selection(population, fitness_fn):
    fits = [fitness_fn(chromosome) for chromosome in population]
    idx = np.argsort(fits)[::-1] # descending order
    return [population[i] for i in idx][:len(population)//2]

# Crossover function
def crossover(chromosome1, chromosome2):
    point = random.randint(1, len(chromosome1)-1)
    child1 = chromosome1[:point] + chromosome2[point:]
    child2 = chromosome2[:point] + chromosome1[point:]
    return child1, child2

# Mutation function
def mutation(chromosome, mutation_rate):
    # Replace each gene with a fresh random value with probability mutation_rate
    for i in range(len(chromosome)):
        if random.random() < mutation_rate:
            chromosome[i] = generate_gene()
    # Renormalize so that the coefficients sum to 1 again
    total = sum(chromosome)
    return [c / total for c in chromosome]

# Generate next generation
def next_generation(population, fitness_fn, mutation_rate):
    new_population = []
    elites = selection(population, fitness_fn)
    new_population.extend(elites)
    while len(new_population) < len(population):
        parent1, parent2 = random.sample(elites, 2)
        child1, child2 = crossover(parent1, parent2)
        child1 = mutation(child1, mutation_rate)
        child2 = mutation(child2, mutation_rate)
        new_population.extend([child1, child2])
    return new_population

# Define main function to run the genetic algorithm
def run_genetic_algorithm(data, pop_size=100, num_generations=100, mutation_rate=0.1):
    # Generate initial population
    num_assets = data.shape[0]
    population = generate_population(pop_size, num_assets)
    
    # Run the genetic algorithm
    for i in range(num_generations):
        population = next_generation(population, lambda x: fitness(x, data), mutation_rate)
        
    # Get the top chromosome (solution)
    fits = [fitness(chromosome, data) for chromosome in population]
    idx = np.argmax(fits)
    top_chromosome = population[idx]
    
    # Calculate risk and return for the top chromosome
    risk = np.sum(top_chromosome * data[:, 0])
    ret = np.sum(top_chromosome * data[:, 1])
    
    # Print results
    print("The optimal solution is:")
    counter = 0
    for i, c in enumerate(top_chromosome):
        if c > 0.0001:
            counter += 1
        if c > 0.01:
            print('Ticker ', i+2, 'and with coeff: ', c)
    print(f"Total investment: {np.sum(top_chromosome):.3f}")
    print(f"Number of investments: {counter}")
    print(f"Total risk: {risk:.3f}")
    print(f"Total return: {ret:.3f}")


In [None]:
data = pd.read_csv('/content/sample.csv')
data = data[['risk', 'return']].values
run_genetic_algorithm(data, pop_size=200, num_generations=300, mutation_rate=0.005)


The optimal solution is:
Ticker  136 and with coeff:  0.36366122882628127
Ticker  374 and with coeff:  0.5784158877546695
Total investment: 1.000
Number of investments: 384
Total risk: 0.569
Total return: 10.484


1. What problems does a very small or very large initial population create?

A very small initial population lacks genetic diversity, which limits the search and raises the risk of premature convergence, where the population settles on a suboptimal solution before most of the search space has been explored. A very large initial population costs more time and memory, because every individual must be evaluated in every generation, which slows the algorithm down.

2. What is the effect of increasing population size on the accuracy and speed of the algorithm?

Increasing the population size generally improves the accuracy of the algorithm, as it increases the chance of finding better solutions. However, it can also slow down the algorithm due to the increased number of individuals that need to be evaluated. The speed of the algorithm may be affected by the computational resources available and the efficiency of the fitness evaluation function.

3. Explain and compare the impact of each crossover and mutation operation. Can only one of them be used? Why?

Crossover and mutation are two genetic operators used in genetic algorithms. Crossover involves combining two or more parent solutions to create a new offspring solution. Mutation involves randomly altering the genetic information of an individual to create a new solution.
Crossover can help to generate new solutions that inherit beneficial traits from both parents, leading to a diverse population and a faster convergence towards better solutions. However, it may also lead to the loss of valuable traits and create offspring that are very similar to their parents.
Mutation can help to introduce new and unexpected traits into the population, allowing for exploration of the search space beyond the limits imposed by the initial population. However, excessive mutation can lead to the loss of valuable information and generate solutions that are too different from the existing ones, leading to slow convergence.
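In principle either operator can be used alone, but each leaves a gap. Crossover alone only recombines gene values that already exist in the population and can never introduce new ones, while mutation alone turns the algorithm into a random local search that does not combine good partial solutions. Using both together usually gives better results than using only one of them.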
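The limitation of crossover without mutation can be checked directly. The sketch below applies single-point crossover many times to a small random population and confirms that no gene value outside the initial pool ever appears. Renormalization is left out so that gene values stay comparable, and `single_point_crossover` is an illustrative helper rather than the notebook's function.

```python
import random

def single_point_crossover(parent1, parent2, rng):
    """Swap the tails of two equal-length lists at a random cut point."""
    point = rng.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

rng = random.Random(42)
population = [[rng.uniform(5, 10) for _ in range(6)] for _ in range(4)]
initial_values = {gene for chrom in population for gene in chrom}

# Crossover only: no mutation and no renormalization, repeated many times.
for _ in range(200):
    a, b = rng.sample(range(len(population)), 2)
    population[a], population[b] = single_point_crossover(population[a], population[b], rng)

final_values = {gene for chrom in population for gene in chrom}
print(final_values <= initial_values)  # True: only existing values were rearranged
```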

4. What solutions do you suggest to achieve faster results in this particular problem?

There are several ways to achieve faster results in a genetic algorithm for a particular problem. One solution is to use parallel processing to evaluate the fitness function of multiple individuals simultaneously. This can significantly reduce the time required to evaluate the fitness of each individual and speed up the algorithm. Another solution is to use adaptive mutation and crossover rates, which adjust the probability of each operator based on the current population's performance. This can help to strike a balance between exploration and exploitation and improve the convergence speed. Additionally, using a good initialization method and efficient selection operators can also help to reduce the number of generations required to reach an optimal solution.
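A related option, not used in the notebook, is to evaluate the whole population with one NumPy matrix product instead of calling the fitness function once per chromosome. The sketch below assumes the population is stored as a 2D array with one chromosome per row; `population_fitness` is a hypothetical name, and the data is synthetic.

```python
import numpy as np

def population_fitness(population, data, max_risk=0.6, min_return=10.0):
    """Fitness of every chromosome at once.

    population: array of shape (pop_size, num_assets), rows sum to 1.
    data: array of shape (num_assets, 2) with columns (risk, return).
    """
    population = np.asarray(population, dtype=float)
    risks = population @ data[:, 0]      # one weighted risk per chromosome
    returns = population @ data[:, 1]    # one weighted return per chromosome
    valid = (risks <= max_risk) & (returns >= min_return)
    fits = np.zeros(len(population))     # invalid chromosomes keep fitness 0
    fits[valid] = returns[valid] / risks[valid]
    return fits

rng = np.random.default_rng(0)
# Synthetic stand-in for the (risk, return) table of 50 assets.
data = np.column_stack([rng.uniform(0.1, 1.0, 50), rng.uniform(5.0, 20.0, 50)])
population = rng.uniform(5, 10, size=(200, 50))
population /= population.sum(axis=1, keepdims=True)  # normalize each row

fits = population_fitness(population, data)
print(fits.shape, float(fits.max()))
```

The result matches the per-chromosome rule used in fitness(), so the rest of the algorithm would not need to change beyond storing the population as an array.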

5. Despite using these methods, chromosomes may still not change after several iterations. Explain the reason for this and the problems it creates. What do you suggest to solve it?

This phenomenon is called premature convergence: the population settles on a suboptimal solution and the chromosomes stop changing. It occurs when the population loses diversity, for example under strong selection pressure or when crossover and mutation introduce too little variation. The algorithm then stays stuck at that solution and the rest of the search space goes unexplored.
Remedies include increasing the population size, raising the mutation rate or using different mutation and crossover operators, and replacing part of the population with new random chromosomes when progress stalls. Niching techniques, which encourage the population to maintain diversity, also help. Elitism, where the best solutions of a generation are carried over unchanged, protects good solutions but does not restore diversity by itself, so the elite fraction should be kept small.
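One way to act on a stalled population is sketched below: measure diversity, and when it drops below a threshold, replace part of the population with fresh random chromosomes. This is an illustrative sketch, not part of the notebook. The helper names, the threshold, and the replaced fraction are assumptions, and the population is assumed to be sorted best first so that the replaced tail holds the weakest chromosomes.

```python
import random
import numpy as np

def diversity(population):
    """Mean per-gene standard deviation; 0 means every chromosome is identical."""
    return float(np.mean(np.std(np.asarray(population), axis=0)))

def random_chromosome(num_assets, rng):
    """Random normalized chromosome, built the same way as in the notebook."""
    genes = [rng.uniform(5, 10) for _ in range(num_assets)]
    total = sum(genes)
    return [g / total for g in genes]

def inject_immigrants(population, fraction, rng):
    """Replace the last `fraction` of the population with fresh random chromosomes."""
    count = int(len(population) * fraction)
    if count == 0:
        return list(population)
    num_assets = len(population[0])
    fresh = [random_chromosome(num_assets, rng) for _ in range(count)]
    return population[:len(population) - count] + fresh

rng = random.Random(1)
stuck = [[0.25, 0.25, 0.25, 0.25] for _ in range(10)]  # fully converged population
print(diversity(stuck))                                  # 0.0

if diversity(stuck) < 1e-6:                              # threshold is an arbitrary choice
    stuck = inject_immigrants(stuck, fraction=0.3, rng=rng)
print(diversity(stuck) > 0)                              # True
```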

6. What solution do you suggest to end the program if the problem does not have a solution?

If the problem does not have a solution, the program can be terminated by implementing a stopping criterion based on a maximum number of iterations or fitness evaluations. Alternatively, the program could return the best solution found so far or a message indicating that a solution was not found.
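A minimal sketch of such a stopping rule follows. It relies on the convention used by fitness() in this notebook, where an invalid chromosome has fitness 0, so a best fitness of 0 at the end means no feasible solution was found. The function name, the `patience` parameter, and the toy inputs are assumptions made for illustration.

```python
def run_with_stopping(population, step_fn, fitness_fn, max_generations=300, patience=50):
    """Evolve until the generation budget or the stagnation budget runs out.

    Returns (best_chromosome, best_fitness), or (None, 0.0) when no feasible
    chromosome (fitness > 0) was ever seen.
    """
    best, best_fit, stale = None, 0.0, 0
    for _ in range(max_generations):
        improved = False
        for chrom in population:
            fit = fitness_fn(chrom)
            if fit > best_fit:
                best, best_fit, improved = list(chrom), fit, True
        # Count consecutive generations without any improvement.
        stale = 0 if improved else stale + 1
        if stale >= patience:
            break
        population = step_fn(population)
    return best, best_fit

# Toy check: a fitness that rejects everything stands in for an unsolvable problem.
result, score = run_with_stopping(
    population=[[0.5, 0.5], [0.3, 0.7]],
    step_fn=lambda pop: pop,        # placeholder for next_generation
    fitness_fn=lambda chrom: 0.0,   # every chromosome is infeasible
    max_generations=100,
    patience=10,
)
print("No feasible solution found" if result is None else result)
```

With the notebook's functions, `step_fn` would wrap next_generation() and `fitness_fn` would wrap fitness() with the loaded data.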

# $\color{green}{\text{MinMax Algorithm in Othello}}$

The get_minmax_algorithm_move() function is a method of the Othello class that uses the Minimax algorithm, with optional alpha-beta pruning, to find the best move for the current player. Minimax is a recursive algorithm for decision making in two-player games. It explores the moves of the current player and the replies of the opponent up to a given depth, and chooses the move with the best outcome for the current player under the assumption that the opponent always answers with the reply that is worst for that player.

Here is how the get_minmax_algorithm_move() function works:

1. It collects all valid moves for the minimax player (value 1) by calling get_valid_moves(1). If there are no valid moves, the function returns None.

2. It initializes max_value to negative infinity, best_move to None, and the alpha-beta bounds alpha and beta to negative and positive infinity.

3. For each valid move, it saves a copy of the current board and applies the move in place by calling make_move(1, move).

4. It calls minimax(-1, self.minimax_depth - 1, alpha, beta), which evaluates the resulting position with the opponent to move and one level less of remaining depth, and returns the score of that move.

5. It restores the saved board. If the score is higher than max_value, it records the score and the move as the best so far.

6. It raises alpha to max_value. When prune is enabled and alpha is at least beta, it stops examining further moves.

7. After the loop, it returns best_move, falling back to a random valid move if no move was recorded.

In [None]:
import random
import time
import turtle

In [None]:
class OthelloUI:
    def __init__(self, board_size=6, square_size=60):
        self.board_size = board_size
        self.square_size = square_size
        self.screen = turtle.Screen()
        self.screen.setup(self.board_size * self.square_size + 50, self.board_size * self.square_size + 50)
        self.screen.bgcolor('white')
        self.screen.title('Othello')
        self.pen = turtle.Turtle()
        self.pen.hideturtle()
        self.pen.speed(0)
        turtle.tracer(0, 0)

    def draw_board(self, board):
        self.pen.penup()
        x, y = -self.board_size / 2 * self.square_size, self.board_size / 2 * self.square_size
        for i in range(self.board_size):
            self.pen.penup()
            for j in range(self.board_size):
                self.pen.goto(x + j * self.square_size, y - i * self.square_size)
                self.pen.pendown()
                self.pen.fillcolor('green')
                self.pen.begin_fill()
                self.pen.setheading(0)
                for _ in range(4):
                    self.pen.forward(self.square_size)
                    self.pen.right(90)
                self.pen.penup()
                self.pen.end_fill()
                self.pen.goto(x + j * self.square_size + self.square_size / 2,
                              y - i * self.square_size - self.square_size + 5)
                if board[i][j] == 1:
                    self.pen.fillcolor('white')
                    self.pen.begin_fill()
                    self.pen.circle(self.square_size / 2 - 5)
                    self.pen.end_fill()
                elif board[i][j] == -1:
                    self.pen.fillcolor('black')
                    self.pen.begin_fill()
                    self.pen.circle(self.square_size / 2 - 5)
                    self.pen.end_fill()

        turtle.update()


class Othello:
    def __init__(self, ui, minimax_depth=5, prune=True):
        self.size = 6
        self.ui = OthelloUI(self.size) if ui else None
        self.board = [[0 for _ in range(self.size)] for _ in range(self.size)]
        self.board[int(self.size / 2) - 1][int(self.size / 2) - 1] = self.board[int(self.size / 2)][
            int(self.size / 2)] = 1
        self.board[int(self.size / 2) - 1][int(self.size / 2)] = self.board[int(self.size / 2)][
            int(self.size / 2) - 1] = -1
        self.current_turn = random.choice([1, -1])
        self.minimax_depth = minimax_depth
        self.prune = prune

    def get_winner(self):
        white_count = sum([row.count(1) for row in self.board])
        black_count = sum([row.count(-1) for row in self.board])
        if white_count > black_count:
            return 1
        elif white_count < black_count:
            return -1
        else:
            return 0

    def get_valid_moves(self, player):
        moves = set()
        for i in range(self.size):
            for j in range(self.size):
                if self.board[i][j] == 0:
                    for di in [-1, 0, 1]:
                        for dj in [-1, 0, 1]:
                            if di == 0 and dj == 0:
                                continue
                            x, y = i, j
                            captured = []
                            while 0 <= x + di < self.size and 0 <= y + dj < self.size and self.board[x + di][
                                    y + dj] == -player:
                                captured.append((x + di, y + dj))
                                x += di
                                y += dj
                            if 0 <= x + di < self.size and 0 <= y + dj < self.size and self.board[x + di][
                                    y + dj] == player and len(captured) > 0:
                                moves.add((i, j))
        return list(moves)

    def make_move(self, player, move):
        i, j = move
        self.board[i][j] = player
        for di in [-1, 0, 1]:
            for dj in [-1, 0, 1]:
                if di == 0 and dj == 0:
                    continue
                x, y = i, j
                captured = []
                while 0 <= x + di < self.size and 0 <= y + dj < self.size and self.board[x + di][y + dj] == -player:
                    captured.append((x + di, y + dj))
                    x += di
                    y += dj
                if 0 <= x + di < self.size and 0 <= y + dj < self.size and self.board[x + di][y + dj] == player:
                    for (cx, cy) in captured:
                        self.board[cx][cy] = player

    def get_cpu_move(self):
        moves = self.get_valid_moves(-1)
        if len(moves) == 0:
            return None
        return random.choice(moves)

    def get_minmax_algorithm_move(self):
        moves = self.get_valid_moves(1)
        if len(moves) == 0:
            return None
        max_value = float('-inf')
        best_move = None
        alpha = float('-inf')
        beta = float('inf')
        for move in moves:
            new_board = [row[:] for row in self.board]
            self.make_move(1, move)
            value = self.minimax(-1, self.minimax_depth - 1, alpha, beta)
            self.board = new_board
            if value > max_value:
                max_value = value
                best_move = move
            alpha = max(alpha, max_value)
            if alpha >= beta and self.prune:
                break
        return best_move if best_move is not None else random.choice(moves)

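    def minimax(self, player, depth, alpha, beta):
        if depth == 0:
            return self.heuristic()
        if not self.get_valid_moves(player):
            # No legal move: the game is over if the opponent is stuck too,
            # otherwise the turn passes instead of returning +/- infinity.
            if not self.get_valid_moves(-player):
                return self.heuristic()
            return self.minimax(-player, depth - 1, alpha, beta)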
        if player == 1:
            value = float('-inf')
            for move in self.get_valid_moves(1):
                new_board = [row[:] for row in self.board]
                self.make_move(1, move)
                value = max(value, self.minimax(-player, depth - 1, alpha, beta))
                self.board = new_board
                alpha = max(alpha, value)
                if alpha >= beta and self.prune:
                    break
            return value
        else:
            value = float('inf')
            for move in self.get_valid_moves(-1):
                new_board = [row[:] for row in self.board]
                self.make_move(-1, move)
                value = min(value, self.minimax(-player, depth - 1, alpha, beta))
                self.board = new_board
                beta = min(beta, value)
                if alpha >= beta and self.prune:
                    break
            return value

    def heuristic(self):
        return sum([row.count(1) for row in self.board]) - sum([row.count(-1) for row in self.board])



    def terminal_test(self):
        return len(self.get_valid_moves(1)) == 0 and len(self.get_valid_moves(-1)) == 0

    def play(self):
        winner = None
        while not self.terminal_test():
            if self.ui:
                self.ui.draw_board(self.board)
            if self.current_turn == 1:
                move = self.get_minmax_algorithm_move()
                if move:
                    self.make_move(self.current_turn, move)
            else:
                move = self.get_cpu_move()
                if move:
                    self.make_move(self.current_turn, move)
            self.current_turn = -self.current_turn
            if self.ui:
                self.ui.draw_board(self.board)
                time.sleep(1)
        winner = self.get_winner()
        return winner


In [None]:
res = list()
for i in range(150):
    a = Othello(False, minimax_depth=5, prune=False)
    res.append(a.play())

In [None]:
print('Accuracy:', res.count(1)/len(res)*100)

## Results of playing with Depth of 5:

In [None]:
res = list()
for i in range(150):
    a = Othello(False, minimax_depth=5, prune=True)
    res.append(a.play())


In [None]:
print('Accuracy:', res.count(1)/len(res)*100)

Accuracy: 89.33333333333333


## Results of playing with Depth of 7:

In [None]:
res = list()
for i in range(100):
    a = Othello(False, minimax_depth=7, prune=True)
    res.append(a.play())


In [None]:
print('Accuracy:', res.count(1)/len(res)*100)

Accuracy: 97.0


1. The heuristic implemented in heuristic() is the disc difference: the number of the minimax player's discs (value 1) minus the number of the opponent's discs (value -1) on the board. Positional factors that are often added in Othello, such as corner and edge ownership or the number of available moves (mobility), are not included in this implementation.

2. The search depth strongly affects the win rate, the running time, and the number of nodes visited. In the runs above, the win rate against the random player rose from 89.3% at depth 5 (150 games) to 97.0% at depth 7 (100 games). The cost grows roughly exponentially: with an average branching factor b, a search to depth d visits on the order of b^d nodes, so each extra level multiplies the running time. Memory grows much more slowly, because this depth-first search only keeps the current line of play and its board copies.

3. Yes. Alpha-beta prunes most when the strongest moves are searched first, so the children of each node can be ordered to maximize cutoffs. Common techniques are the killer heuristic, which remembers moves that caused a cutoff at the same depth in sibling positions and tries them first, and a transposition table, which stores the results of previously searched positions so that their best move can be tried first and repeated positions need not be searched again. A cheaper alternative is to sort the moves by a static evaluation, for example the heuristic value of the board after each move.

4. The branching factor is the number of legal moves available in a position. In Othello it changes as the game progresses: the opening position offers four moves, the number typically grows in the middle game as more discs on the board create more capture lines, and it shrinks toward the end as the empty squares run out, since every move fills one square.

5. Alpha-beta pruning skips branches that cannot change the final decision: once a node's value is known to fall outside the window between alpha and beta, its remaining children are not searched. Fewer nodes are evaluated, so the search is faster, and the value returned at the root is the same as with plain minimax. In this code the prune flag turns pruning on and off.

6. Minimax may not be the best choice when the opponent plays randomly, because it assumes the opponent always picks its best move and therefore plans against replies that a random player will rarely find. A more fitting model is expectimax, which replaces the min nodes with chance nodes that average over the opponent's moves. Monte Carlo Tree Search (MCTS) is another option: it estimates the value of each move from random playouts, which match the behavior of a random opponent.
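The points about pruning and move ordering in answers 3 and 5 can be illustrated on a small hand-made game tree. The sketch below is independent of the Othello class: the tree is a nested list whose leaves are scores, and the helper names are chosen for this example only. It counts how many leaves are evaluated without pruning, with pruning, and with pruning after the last subtree has been reordered so that its strongest reply comes first.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, prune=True, counter=None):
    """Minimax value of a nested-list game tree; leaves are numbers.

    counter is a one-element list used to count evaluated leaves.
    """
    if not isinstance(node, list):
        if counter is not None:
            counter[0] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta, prune, counter)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if prune and alpha >= beta:
            break  # the remaining children cannot change the result
    return value

def count_leaves(tree, prune):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    value = alphabeta(tree, True, prune=prune, counter=counter)
    return value, counter[0]

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]     # max node over three min nodes
ordered = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]  # same leaves, best reply first in the last subtree

print(count_leaves(tree, prune=False))   # (3, 9)
print(count_leaves(tree, prune=True))    # (3, 7)
print(count_leaves(ordered, prune=True)) # (3, 5)
```

The root value is 3 in all three cases, so pruning and reordering change only the amount of work, not the decision.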