# Question 1

## 3.11.3 (6)

**question:**

*After an individual is chosen to serve as parent, it may either be discarded or returned to the population so that it can be selected again. How does this choice affect selection pressure (for concreteness, consider each selection mechanism separately)?*

**answer:**

In general, if a successful individual is discarded after being selected, the selection pressure decreases: that individual cannot be selected again, which leaves room for weaker individuals to be chosen. If the individual is returned to the population, the selection pressure stays high, since the fittest individuals can be chosen repeatedly.

For the roulette wheel selection mechanism, if the individual is discarded, the selection pressure is decreased for the reasons mentioned above. If the individual is returned to the population, the selection pressure remains high because the same individual can be chosen again.

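For the rank-based selection mechanism, the effect is the same as for roulette wheel selection, but typically weaker: selection probabilities depend on an individual's rank rather than on its raw fitness, so a single very fit individual dominates the draws less, and discarding it changes the selection pressure less.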

For the tournament selection mechanism, if the individual is discarded, the selection pressure decreases slightly: the same individual cannot win another tournament, but the best of the remaining individuals will still tend to win. If the individual is returned to the population, the selection pressure remains high.

For the steady-state selection mechanism, if the individual is discarded, the selection pressure decreases slightly, as the same individual cannot participate in subsequent selection rounds. However, since steady-state selection generally involves replacing only a small portion of the population at each step, the impact on selection pressure is less pronounced compared to other mechanisms. If the individual is returned to the population, the selection pressure remains the same because the most fit individuals have a higher chance of being selected multiple times, maintaining their dominance in the population.
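
To make the roulette wheel case concrete, here is a minimal sketch that draws parents with and without returning them to the population and compares the mean fitness of the selected parents (a higher mean indicates higher selection pressure). It is not taken from the textbook: the fitness values and the helper name `mean_selected_fitness` are made up for illustration.

```python
import numpy as np

def mean_selected_fitness(fitness, n_parents, replace, n_trials=2000, seed=0):
    """Average fitness of the parents drawn by roulette wheel selection.

    replace=True  -> a chosen parent is returned to the population.
    replace=False -> a chosen parent is discarded for the rest of the round.
    """
    rng = np.random.default_rng(seed)
    fitness = np.asarray(fitness, dtype=float)
    probs = fitness / fitness.sum()  # fitness-proportionate probabilities
    means = []
    for _ in range(n_trials):
        idx = rng.choice(len(fitness), size=n_parents, replace=replace, p=probs)
        means.append(fitness[idx].mean())
    return float(np.mean(means))

# Hypothetical population: one very fit individual among nine weak ones.
fitness = [100, 1, 1, 1, 1, 1, 1, 1, 1, 1]
with_return = mean_selected_fitness(fitness, n_parents=5, replace=True)
with_discard = mean_selected_fitness(fitness, n_parents=5, replace=False)
print(f"returned: {with_return:.1f}, discarded: {with_discard:.1f}")
```

When parents are discarded, the fittest individual can appear at most once among the five parents, so the mean cannot exceed (100 + 4) / 5 = 20.8. When parents are returned, the same individual can fill most of the slots.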

## 3.11.5 (19)

**question:**

*Explain why the niche method lowers the risk of early convergence.*

**answer:**

The niche method lowers the risk of early convergence by maintaining diversity in the population. It adjusts the fitness so that both the quality of an individual and the crowding around it are taken into account: individuals in a densely populated region of the search space are penalized relative to equally good individuals in sparse regions (fitness sharing is one common way to do this). This encourages the population to explore different regions of the search space instead of collapsing prematurely onto a single local optimum. By maintaining diversity, the niche method increases the chances of finding a global optimum and of not getting stuck in a suboptimal solution.
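
One common way to realize a niche method is fitness sharing, sketched below. This is an illustrative assumption rather than the textbook's exact formulation: the function name `shared_fitness`, the niche radius `sigma`, and the triangular sharing function are all choices made for this example.

```python
import numpy as np

def shared_fitness(positions, raw_fitness, sigma=1.0):
    """Divide each raw fitness by its niche count (fitness sharing).

    positions:   (n, d) array of individuals in the search space.
    raw_fitness: (n,) array of non-negative fitness values.
    sigma:       niche radius; individuals farther apart do not share.
    """
    positions = np.asarray(positions, dtype=float)
    raw_fitness = np.asarray(raw_fitness, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    # Triangular sharing function: 1 at distance 0, falling to 0 at sigma.
    sharing = np.clip(1.0 - dists / sigma, 0.0, None)
    niche_count = sharing.sum(axis=1)  # >= 1, each individual shares with itself
    return raw_fitness / niche_count

# Three individuals crowded around x = 0 and one isolated at x = 5,
# all with the same raw fitness.
positions = np.array([[0.0], [0.1], [0.2], [5.0]])
raw = np.array([10.0, 10.0, 10.0, 10.0])
shared = shared_fitness(positions, raw, sigma=1.0)
print(shared)
```

The isolated individual keeps its full fitness, while the three crowded ones are scaled down, so selection no longer favours piling up in one region.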

# Question 2

## Imports

In [None]:
from IPython.display import clear_output
from dataclasses import dataclass, field
from abc import ABC, abstractmethod
from typing import List, Tuple, Dict, Any, Union, Callable
import logging
import time
from pprint import pprint

# Make sure to install these packages.
import torch
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm

logging.basicConfig(
    level=logging.INFO, 
    format="%(asctime)s - %(name)s.%(levelname)s: %(message)s",
    datefmt="%d-%m-%y %H:%M:%S",
    force=True
)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
logging.info(f"Using device {device}")

## Simulation

The simulation of Conway's Game of Life is fully vectorized and uses [PyTorch](https://pytorch.org/) for the calculation.

I've also taken a few things into consideration:

1. The finite size of the grid: if a pattern moves past the edge of the grid, it re-enters from the opposite side, because the grid is padded circularly.
2. The simulation is batched, so multiple runs can be simulated at once efficiently. All the operations can also run on the GPU if one is available.
3. The simulation keeps only the initial and current states of the grid, so the performance can be evaluated at the end of the run, but there is no way to check the stability of the grid within the simulation. I decided not to check stability because doing so (for example, by saving the state at each step and comparing it with the previous states) would require much more memory and computation, especially for large grids and long simulations.
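
The wrap-around border from point 1 can be sketched in plain NumPy with `np.roll`, combined with a rule table indexed by the current state and the number of live neighbours. This is a simplified, unbatched stand-in for illustration, not the PyTorch implementation used in this notebook; `RULES` and `life_step` are names made up for this sketch.

```python
import numpy as np

# RULES[state, n_neighbours] -> new state (standard Conway rules, B3/S23).
RULES = np.zeros((2, 9), dtype=np.uint8)
RULES[0, 3] = 1       # a dead cell with exactly 3 neighbours is born
RULES[1, [2, 3]] = 1  # a live cell with 2 or 3 neighbours survives

def life_step(grid):
    """One Game of Life step on a torus (edges wrap around)."""
    grid = np.asarray(grid, dtype=np.uint8)
    # Sum the 8 neighbours by rolling the grid in every direction;
    # np.roll wraps around, which gives the circular boundary.
    neighbours = sum(
        np.roll(grid, (dy, dx), axis=(0, 1))
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    return RULES[grid, neighbours]

# A vertical blinker on the left edge: its neighbourhood wraps to the right edge.
blinker = np.zeros((5, 5), dtype=np.uint8)
blinker[1:4, 0] = 1
after_one = life_step(blinker)
after_two = life_step(after_one)
print(after_one)
```

After one step the blinker lies horizontally across the border (columns 4, 0 and 1 of row 2), and after two steps it is back in its starting position.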

In [None]:
class BaseVecConwaysGOL:
    # (curr_state, #neighbours) -> new_state
    rules = {
        (0, 0): 0,
        (0, 1): 0,
        (0, 2): 0,
        (0, 3): 1,
        (0, 4): 0,
        (0, 5): 0,
        (0, 6): 0,
        (0, 7): 0,
        (0, 8): 0,

        (1, 0): 0,
        (1, 1): 0,
        (1, 2): 1,
        (1, 3): 1,
        (1, 4): 0,
        (1, 5): 0,
        (1, 6): 0,
        (1, 7): 0,
        (1, 8): 0
    }
    # k=2
    # r=1

    def __init__(self, grid: Union[np.ndarray, torch.Tensor]) -> None:
        self._orig_grid: np.ndarray = (
            grid 
            if isinstance(grid, np.ndarray) 
            else grid.detach().cpu().numpy()
        )
        self._sum_kernel: torch.Tensor = torch.tensor(
            [[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=torch.float32, device=device,
        ).unsqueeze(0).unsqueeze(0)
        self._set_tensor_rules()
        self.reset_grid()

    def reset_grid(self) -> None:
        self.__step: int = 0
        self._pt_grid: torch.Tensor = torch.tensor(
            self._orig_grid, dtype=torch.long, device=device
        ).view(-1, 1, self._orig_grid.shape[-2], self._orig_grid.shape[-1])
    
    @property
    def grid(self) -> np.ndarray:
        return self._pt_grid.squeeze().bool().cpu().numpy()
    
    @property
    def step(self) -> int:
        return self.__step
    
    def _set_tensor_rules(self) -> None:
        self._tensor_rules = torch.zeros((2, 9), dtype=torch.long, device=device)
        for state, num_neighbours in self.rules:
            self._tensor_rules[state, num_neighbours] = self.rules[(state, num_neighbours)]
    
    def _torch_update(self) -> None:
        """Updates the grid using PyTorch operations."""
        padded_pt_grid = F.pad(self._pt_grid, (1, 1, 1, 1), "circular")
        sum_grid = F.conv2d(padded_pt_grid.float(), self._sum_kernel, padding=0).long().squeeze()
        
        # Look up the new state of every cell by (current state, #neighbours).
        new_grid = self._tensor_rules[
            self._pt_grid.flatten(), sum_grid.flatten()
        ].view_as(self._pt_grid)
        
        self._pt_grid = new_grid
    
    def update_n(self, n: int) -> None:
        for _ in range(n):
            self.update()

    def update(self) -> None:
        """Can be overridden to add custom or better update logic."""
        self._torch_update()
        self.__step += 1


class VecConwaysGOL(BaseVecConwaysGOL):
    """Overrides the base class to add plotting capabilities."""
    def __init__(self, grid: Union[np.ndarray, torch.Tensor]) -> None:
        if grid.ndim != 2:
            raise ValueError("Grid must be a 2D array")
        if grid.shape[0] < 3 or grid.shape[1] < 3:
            raise ValueError("Grid must be at least 3x3")
        if grid.shape[0] != grid.shape[1]:
            raise ValueError("Grid must be square")
        super().__init__(grid)
    
    def update_n(self, n: int, plot: bool = False, plot_every: int = 1, 
                 sleep_s: float = 0., **plot_kwargs) -> None:
        if plot:
            if sleep_s < 0:
                raise ValueError("Sleep must be non-negative")
            if plot_every < 1:
                raise ValueError("Plot every must be at least 1")
            
            for i in range(n):
                self.update()
                if (i % plot_every) == 0:
                    clear_output(wait=True)
                    self.plot(**plot_kwargs)
                    if sleep_s != 0:
                        time.sleep(sleep_s)
        else:
            for _ in range(n):
                self.update()

    def plot(self, **kwargs) -> None:
        plt.figure(figsize=kwargs.pop("figsize", (10, 10)))
        plt.title(f"Step {self.step}")
        plt.imshow(self.grid * 255, **kwargs)
        plt.show()


class BatchedConwaysGOL(BaseVecConwaysGOL):
    def __init__(self, grids: Union[np.ndarray, torch.Tensor]) -> None:
        if grids.ndim != 3:
            raise ValueError("Grids must be a 3D array (batch, height, width)")
        if grids.shape[1] < 3 or grids.shape[2] < 3:
            raise ValueError("Grids must be at least 3x3")
        if grids.shape[1] != grids.shape[2]:
            raise ValueError("Grids must be squares")
        super().__init__(grids)

    reset_grids = BaseVecConwaysGOL.reset_grid

    @property
    def grids(self) -> np.ndarray:
        return super().grid

In [None]:
class KnownMethuselahs:
    """Contains some known methuselahs."""
    R_PENTOMINO: np.ndarray = np.array(
        [
            [0, 1, 1],
            [1, 1, 0],
            [0, 1, 0],
        ],
        dtype=int,
    )

    ACORN: np.ndarray = np.array(
        [
            [0, 0, 0, 0, 0, 0, 0],  # Row 0 (padding)
            [0, 0, 0, 0, 0, 0, 0],  # Row 1 (padding)
            [0, 1, 0, 0, 0, 0, 0],  # Row 2
            [0, 0, 0, 1, 0, 0, 0],  # Row 3
            [1, 1, 0, 0, 1, 1, 1],  # Row 4
            [0, 0, 0, 0, 0, 0, 0],  # Row 5 (padding)
            [0, 0, 0, 0, 0, 0, 0],  # Row 6 (padding)
        ],
        dtype=int,
    )

    PI_HEPTOMINO: np.ndarray = np.array(
        [
            [1, 1, 1],
            [1, 0, 1],
            [1, 0, 1],
        ],
        dtype=int,
    )

    M2513: np.ndarray = np.array(
        [
            [1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1],
            [1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0],
            [0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
            [0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1],
            [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0],
            [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
            [0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0],
            [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1],
            [1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0],
            [1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
            [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1],
            [1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1],
            [0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1],
            [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0],
            [1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1]
        ],
        dtype=int
    )


class Grider:
    """Utility class to generate grids."""
    
    @staticmethod
    def grid(grd: np.ndarray) -> np.ndarray:
        """Validates and returns a grid."""
        if not isinstance(grd, np.ndarray):
            raise ValueError("grid must be a numpy array")
        if grd.ndim != 2:
            raise ValueError("grid must be a 2D array")
        if grd.shape[0] != grd.shape[1]:
            raise ValueError("grid must be a square array")
        return grd

    @staticmethod
    def get_random_grid(size: int, states: List[int] = [0, 1], 
                        probs: Union[float, List[float]] = 0.7) -> np.ndarray:
        if isinstance(probs, float):
            probs = [probs]
        else:
            probs = list(probs)  # Copy so the caller's list is not mutated below.
        if len(states) == len(probs) + 1:
            probs.append(1 - sum(probs))
        if len(states) != len(probs):
            raise ValueError("States and probs must have the same length")
        
        return np.random.choice(states, size=(size, size), p=probs).astype(np.uint8)
    
    @classmethod
    def _enter_center_to_grid(cls, grid: np.ndarray, center: np.ndarray) -> None:
        grid = cls.grid(grid)
        center = cls.grid(center)
        if center.shape[0] > grid.shape[0] or center.shape[1] > grid.shape[1]:
            raise ValueError("Center must not be larger than grid")
        
        grid[
            (grid.shape[0] - center.shape[0])//2 : (grid.shape[0] + center.shape[0])//2, 
            (grid.shape[1] - center.shape[1])//2 : (grid.shape[1] + center.shape[1])//2
        ] = center

    @classmethod
    def get_random_center_grid(cls,
            size: int, center_size: int, states: List[int] = [0, 1], 
            probs: Union[float, List[float]] = 0.7
        ) -> np.ndarray:
        grid = np.zeros((size, size), dtype=np.uint8)
        center = cls.get_random_grid(center_size, states, probs)
        cls._enter_center_to_grid(grid, center)

        return grid

    @classmethod
    def get_empty_grid_with_defined_center(cls, size: int, center: np.ndarray) -> np.ndarray:
        grid = np.zeros((size, size), dtype=np.uint8)
        cls._enter_center_to_grid(grid, center)

        return grid

## Visualization - r_pentomino

In [None]:
grid = Grider.get_empty_grid_with_defined_center(200, KnownMethuselahs.R_PENTOMINO)
gol = VecConwaysGOL(grid)
gol.update_n(1400, plot=True, plot_every=11, sleep_s=0., cmap="gray", figsize=(6, 6))

## Fitness Function

To evaluate a grid, I've chosen to use the F/I metric ([source](https://conwaylife.com/wiki/Methuselah#Measuring_methuselahs)). This metric is the ratio of the final population to the initial population. The higher the ratio, the better the grid is at sustaining and generating life.

The advantage of this metric is that we don't need to check the stability of the grid, which would require holding a buffer of previous states. Instead, we simply count the cells that are alive at the end and divide by the number of cells that were alive at the beginning.

**Note:** The metric is implemented behind an abstract base class, so other fitness functions can be implemented and swapped in.
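
As a small worked example of the metric, the ratio can be computed for a batch of grids as follows. This is only a sketch: `f_per_i` is a hypothetical helper that takes the initial and final grids directly, whereas the notebook's implementation reads them from the simulation object.

```python
import numpy as np

def f_per_i(initial, final):
    """F/I for a batch of grids shaped (batch, height, width)."""
    i = np.asarray(initial).astype(bool).sum(axis=(-2, -1))
    f = np.asarray(final).astype(bool).sum(axis=(-2, -1))
    # Grids that start empty get a score of 0 instead of a division by zero.
    return np.divide(f, i, out=np.zeros(i.shape, dtype=float), where=i != 0)

initial = np.zeros((2, 4, 4), dtype=np.uint8)
final = np.zeros((2, 4, 4), dtype=np.uint8)
initial[0, 0, :2] = 1   # grid 0 starts with 2 live cells ...
final[0, :2, :3] = 1    # ... and ends with 6, so F/I = 3.0
initial[1, :2, :2] = 1  # grid 1 starts with 4 live cells ...
final[1, 0, 0] = 1      # ... and ends with 1, so F/I = 0.25
scores = f_per_i(initial, final)
print(scores)
```

The notebook's version additionally raises every score below 1 up to 1, so that a grid whose population shrank still has a non-zero chance of being selected.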

In [None]:
class classproperty:
    """A decorator to create class-level properties."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, owner):
        return self.fget(owner)


class MMetric(ABC):
    """Abstract Base Methuselah Metric class."""
    def __init_subclass__(cls) -> None:
        original_call = cls.__call__

        def new_call(self, *args: Any, **kwds: Any) -> Union[float, np.ndarray]:
            result = original_call(self, *args, **kwds)
            if not isinstance(result, (float, np.ndarray)):
                raise RuntimeError("Metric result must be a number or numpy array")
            if (
                (isinstance(result, float) and result < 0) or
                (isinstance(result, np.ndarray) and np.any(result < 0))
            ):
                raise RuntimeError("Metric result must be non-negative")
            
            return result
        
        cls.__call__ = new_call
    
    @abstractmethod
    def __call__(self, *args: Any, **kwds: Any) -> Any:
        pass

    def __repr__(self) -> str:
        return "Unspecified Methuselah Metric"


class MMetricFuncs:
    """Usage: MMetricFuncs.{metric_name} -> MMetric instance with __call__ method."""

    @classproperty
    def FperI(cls) -> MMetric:
        class _FperI(MMetric):
            def __call__(self, gol: BaseVecConwaysGOL) -> Union[float, np.ndarray]:
                """F/I: Final population per Initial population"""
                # Normalize both single (2D) and batched (3D) grids to (batch, H, W).
                grid = gol.grid.reshape(-1, *gol.grid.shape[-2:])
                orig_grid = gol._orig_grid.reshape(-1, *gol._orig_grid.shape[-2:])
                f = grid.sum(axis=(-2, -1))
                i = orig_grid.astype(bool).sum(axis=(-2, -1))

                # Clamp the score to at least 1 (e.g. when the population died 
                #   out), so that the selection probability of the gene is 
                #   never 0. Grids that started empty (i == 0) score 0; the 
                #   epsilon only guards the division in that case.
                return np.where(i != 0, np.maximum(f / (i + 1e-8), 1), 0)
            
            def __repr__(self) -> str:
                return "F/I Metric"
            
        return _FperI()

## The Genetic Algorithm

The next code block shows the implementation of the genetic algorithm framework. The algorithm is implemented as a class with the following methods:
- `__init__`: Initializes the genetic algorithm with the specified parameters.
- `find`: Runs the genetic algorithm to find the best solution.

This framework uses the roulette wheel selection mechanism; the mating strategy is defined in the child classes, which provides abstraction and flexibility.
It uses a batched and vectorized implementation that can run on a GPU for much faster execution.

See the documentation of the class for more details of the methods and parameters.
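
The selection step can be sketched on its own as follows. This is a simplified illustration under stated assumptions, not the class's actual code: `select_parent_pairs` is a hypothetical helper, and the uniform fallback for an all-zero score vector is a choice made for this sketch.

```python
import numpy as np

def select_parent_pairs(scores, rng):
    """Roulette wheel selection: draw one pair of parent indices per child.

    Parents are drawn with replacement, so a fit individual can be chosen
    several times (and can even be paired with itself).
    """
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    # Fall back to a uniform draw if every score is zero.
    probs = scores / total if total > 0 else np.full(len(scores), 1 / len(scores))
    idxs = rng.choice(len(scores), size=2 * len(scores), p=probs)
    return idxs.reshape(-1, 2), probs

rng = np.random.default_rng(0)
pairs, probs = select_parent_pairs([1.0, 1.0, 6.0, 2.0], rng)
print(probs)  # selection probability of each individual
print(pairs)  # one row of two parent indices per child
```

Because parents are returned to the pool after each draw, this corresponds to the high selection pressure variant discussed in Question 1.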

In [None]:
class BaseMethuselahsFinder(ABC):
    """
    Base Methuselahs Finder class, that uses the roulette wheel selection for mating.
    This is an abstract class and should be inherited.
    Override the _mate method to define the mating strategy, and the __version__ attribute.
    
    Attributes:
        methuselahs: List[np.ndarray] - List of found Methuselahs.
        metric_values: List[np.ndarray] - List of metric values for each generation.

    Methods:
        __init__: Initialize the MethuselahsFinder.
        __repr__: Return the string representation of the MethuselahsFinder.
        reset_finder: Reset the finder to the initial state, while keeping saved Methuselahs.
        find: Find Methuselahs by maximizing the given metric.
        iter: Return the current iteration number (find can be called several times).
    """
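    # Subclasses are expected to override this with a plain class attribute.
    # (Chaining @classmethod with @property is deprecated since Python 3.11.)
    @property
    @abstractmethod
    def __version__(self) -> str: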
        """Version of the Methuselahs Finder"""
        pass

    def __init__(
            self, pop_size: int, center_side_len: int = 8, 
            grid_side_len: int = 250, methuselah_min_life: int = 100,
            start_live_cell_probs: float = 0.3
        ) -> None:
        """
        Initialize the MethuselahsFinder.
        
        Args:
            pop_size: int - Population size for each generation.
            center_side_len: int - Side length of the center grid, where the initial live cells are found.
            grid_side_len: int - Side length of the grid.
            methuselah_min_life: int - Minimum life of a methuselah.
            start_live_cell_probs: float - Probability of a cell being alive in the initial grid.
        """
        self.pop_size = pop_size
        self.center_side_len = center_side_len
        self.grid_side_len = grid_side_len
        self.methuselah_min_life = methuselah_min_life
        self.start_live_cell_probs = start_live_cell_probs
        self.methuselahs: List[np.ndarray] = []
        self.reset_finder()
        self.__pbar: tqdm

    def __repr__(self) -> str:
        return (
            f"MethuselahsFinder_v{self.__version__}(\n"
            f"    pop_size={self.pop_size}, "
            f"center_side_len={self.center_side_len}, " 
            f"grid_side_len={self.grid_side_len}, "
            f"methuselah_min_life={self.methuselah_min_life}\n"
            f"    -------------------------\n"
            f"    #found_methuselahs={len(self.methuselahs)} "
            f"finder_iter={self.iter}\n"
            f")"
        )
    
    @abstractmethod
    def _mate(self, parent1: np.ndarray, parent2: np.ndarray) -> np.ndarray:
        pass

    def reset_finder(self) -> None:
        self.candidates = np.array([
            Grider.get_random_grid(self.center_side_len, probs=1 - self.start_live_cell_probs) 
            for _ in range(self.pop_size)
        ])
        self.metric_values: List[np.ndarray] = []
        self.__iter: int = 0

    @property
    def iter(self) -> int:
        return self.__iter

    def find(self, iters: int, metric: MMetric, metric_ths: float) -> None:
        """
        Find Methuselahs using the given metric.

        Args:
            iters: int - Number of iterations to run.
            metric: MMetric - Metric to use for finding Methuselahs.
            metric_ths: float - Metric threshold to consider a candidate as a Methuselah.
        """
        self.__pbar = tqdm(range(iters), initial=self.iter, total=self.iter + iters, 
                           desc="Finding Methuselahs")
        for _ in self.__pbar:
            gol = BatchedConwaysGOL(self._expand_candidates())
            gol.update_n(self.methuselah_min_life)
            self._update_candidates(gol, metric, metric_ths)
            self.__iter += 1

    def _expand_candidates(self) -> np.ndarray:
        return np.array([
            Grider.get_empty_grid_with_defined_center(
                self.grid_side_len, candidate
            ) for candidate in self.candidates
        ])
    
    def _update_candidates(self, gol: BaseVecConwaysGOL, metric: MMetric, metric_ths: float) -> None:
        metric_vals: np.ndarray = metric(gol)
        self.metric_values.append(metric_vals)
        self.__pbar.set_postfix({"mean_gen_score": round(metric_vals.mean(), 2), 
                                 "max_gen_score": round(metric_vals.max(), 2)})

        winners_mask = metric_vals >= metric_ths
        if np.any(winners_mask):  # Found grid(s) that passed the threshold
            # Filter to unique winners
            unique_winners_idxs = np.unique(gol._orig_grid[winners_mask], return_index=True, axis=0)[1]
            unique_winners = gol._orig_grid[winners_mask][unique_winners_idxs]
            unfound_winners_mask = (
                (unique_winners[:, np.newaxis, ...] != self.methuselahs)
                .any(axis=(-2, -1))
                .all(axis=1)
            ) if self.methuselahs else np.ones(len(unique_winners), dtype=bool)
            if np.any(unfound_winners_mask):  # Found undiscovered methuselah(s)
                winner_metrics = metric_vals[winners_mask][unique_winners_idxs][unfound_winners_mask]
                self.__pbar.write(f"Found new methuselah(s) with: {metric}={winner_metrics}")
                self.methuselahs.extend(unique_winners[unfound_winners_mask])

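        # Normalize to a probability vector; fall back to uniform if every score is 0.
        total = metric_vals.sum()
        mating_probs = (
            metric_vals / total if total > 0 
            else np.full(self.pop_size, 1 / self.pop_size)
        )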
        logging.debug(f"Mating probabilities: {mating_probs}")
        mating_idxs = np.random.choice(np.arange(self.pop_size), size=self.pop_size * 2, p=mating_probs)

        new_candidates = np.zeros_like(self.candidates)
        for i in range(self.pop_size):
            parent1 = self.candidates[mating_idxs[2 * i]]
            parent2 = self.candidates[mating_idxs[2 * i + 1]]
            child = self._mate(parent1, parent2)
            new_candidates[i] = child

        self.candidates = new_candidates
    
    def plot_metric(self, figsize: Tuple[int, int] = (10, 5)) -> None:
        """Plot mean and max for each generation."""
        metric_vals = np.array(self.metric_values)
        mean_vals = metric_vals.mean(axis=1)
        max_vals = metric_vals.max(axis=1)

        plt.figure(figsize=figsize)
        plt.plot(mean_vals, label="Mean", marker="o", linestyle="-", color="C0", linewidth=2, markersize=6)
        plt.plot(max_vals, label="Max", marker="s", linestyle="--", color="C3", linewidth=2, markersize=6)
        plt.title("Generation-wise Metric Values")
        plt.xlabel("Generation")
        plt.ylabel("Metric Value")
        plt.grid(True, which="both", linestyle="--", alpha=0.7)
        plt.legend(loc="best")
        plt.tight_layout()
        plt.show()

The next code block contains the different mating strategies. The strategies are implemented as child classes of the BaseMethuselahsFinder class and override the `_mate` method to implement the specific mating strategy.

I've experimented with the following mating strategies:

1. **MethuselahsFinderV1**: This strategy selects a random cut point and combines the parents by taking the first part from the first parent and the second part from the second parent.

2. **MethuselahsFinderV2**: This strategy is the same as the first one, except that the cell at the cut point is set to a random value, which acts as a mutation.

3. **MethuselahsFinderV3**: This strategy combines the genes of the parents uniformly at random.

4. **MethuselahsFinderV4**: Same as V3, but with a mutation: each cell is re-sampled at random with probability `mutation_rate`.
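
The building blocks behind these strategies can be sketched as standalone functions. This is an illustration only: the function names are hypothetical, and the sketch uses NumPy's `Generator` API instead of the `np.random.*` calls used in the classes.

```python
import numpy as np

def one_point_crossover(p1, p2, cut):
    """Genes before `cut` come from p1, the rest from p2 (row-major order)."""
    child = p2.flatten()  # flatten() returns a copy, so p2 is not modified
    child[:cut] = p1.flatten()[:cut]
    return child.reshape(p1.shape)

def uniform_crossover(p1, p2, rng):
    """Each gene is taken from p1 or p2 with equal probability."""
    take_p1 = rng.integers(0, 2, size=p1.shape).astype(bool)
    return np.where(take_p1, p1, p2)

def resample_mutation(child, rate, rng):
    """With probability `rate`, replace a gene by a fresh random 0/1 value."""
    mutate = rng.random(child.shape) < rate
    fresh = rng.integers(0, 2, size=child.shape).astype(child.dtype)
    return np.where(mutate, fresh, child)

rng = np.random.default_rng(0)
zeros = np.zeros((3, 3), dtype=np.uint8)
ones = np.ones((3, 3), dtype=np.uint8)
print(one_point_crossover(zeros, ones, cut=4))
print(uniform_crossover(zeros, ones, rng))
print(resample_mutation(zeros, rate=0.1, rng=rng))
```

Using an all-zero and an all-one parent makes it easy to see which parent each gene of the child came from.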

In [None]:
class MethuselahsFinderV1(BaseMethuselahsFinder):
    __version__ = "1"

    def _mate(self, parent1: np.ndarray, parent2: np.ndarray) -> np.ndarray:
        cut = np.random.randint(0, parent1.size)
        child = np.zeros(parent1.size, dtype=np.uint8)
        child[:cut] = parent1.flatten()[:cut]
        child[cut:] = parent2.flatten()[cut:]
        
        return child.reshape(parent1.shape)


class MethuselahsFinderV2(BaseMethuselahsFinder):
    __version__ = "2"

    def _mate(self, parent1: np.ndarray, parent2: np.ndarray) -> np.ndarray:
        # The upper bound is exclusive, so any cell (including the last) can be the cut point.
        cut = np.random.randint(0, parent1.size)
        child = np.zeros(parent1.size, dtype=np.uint8)
        child[:cut] = parent1.flatten()[:cut]
        child[cut] = np.random.randint(0, 2, dtype=np.uint8)
        child[cut + 1:] = parent2.flatten()[cut + 1:]
        
        return child.reshape(parent1.shape)
    
class MethuselahsFinderV3(BaseMethuselahsFinder):
    __version__ = "3"

    def _mate(self, parent1: np.ndarray, parent2: np.ndarray) -> np.ndarray:
        p1_mask = np.random.randint(0, 2, parent1.size).astype(bool)
        child = np.zeros(parent1.size, dtype=np.uint8)
        child[p1_mask] = parent1.flatten()[p1_mask]
        child[~p1_mask] = parent2.flatten()[~p1_mask]

        return child.reshape(parent1.shape)
    
class MethuselahsFinderV4(BaseMethuselahsFinder):
    __version__ = "4"

    def __init__(self, mutation_rate: float, *args, **kwds) -> None:
        super().__init__(*args, **kwds)
        self.mutation_rate = mutation_rate

    def __repr__(self) -> str:
        return (
            super().__repr__()[:-1] + 
            f"    -------------------------\n" +
            f"    mutation_rate={self.mutation_rate}" +
            f"\n)"
        )
    
    def _mate(self, parent1: np.ndarray, parent2: np.ndarray) -> np.ndarray:
        p1_mask = np.random.randint(0, 2, parent1.size).astype(bool)
        child = np.zeros(parent1.size, dtype=np.uint8)
        child[p1_mask] = parent1.flatten()[p1_mask]
        child[~p1_mask] = parent2.flatten()[~p1_mask]
        
        mutation_mask = np.random.choice(
            [0, 1], parent1.size, p=[1 - self.mutation_rate, self.mutation_rate]
        )
        child[mutation_mask.astype(bool)] = np.random.randint(0, 2, mutation_mask.sum())

        return child.reshape(parent1.shape)

In [None]:
@dataclass
class BaseFinderArgs:
    pop_size: int
    center_side_len: int
    grid_side_len: int
    methuselah_min_life: int

if device.type == "cpu":  # A torch.device does not compare equal to a plain string.
    finder_args = BaseFinderArgs(
        pop_size=15, center_side_len=5, grid_side_len=125, methuselah_min_life=350
    ).__dict__
    num_iters = 30
    metric_ths = 15
else:
    finder_args = BaseFinderArgs(
        pop_size=30, center_side_len=8, grid_side_len=250, methuselah_min_life=1300
    ).__dict__
    num_iters = 50
    metric_ths = 30

pprint(finder_args)

In [None]:
m_finder_v1 = MethuselahsFinderV1(**finder_args)
logging.info(m_finder_v1)

m_finder_v1.find(num_iters, metric=MMetricFuncs.FperI, metric_ths=metric_ths)
m_finder_v1.plot_metric()

In [None]:
m_finder_v2 = MethuselahsFinderV2(**finder_args)
logging.info(m_finder_v2)

m_finder_v2.find(num_iters, metric=MMetricFuncs.FperI, metric_ths=metric_ths)
m_finder_v2.plot_metric()

In [None]:
m_finder_v3 = MethuselahsFinderV3(**finder_args)
logging.info(m_finder_v3)

m_finder_v3.find(num_iters, metric=MMetricFuncs.FperI, metric_ths=metric_ths)
m_finder_v3.plot_metric()

In [None]:
m_finder_v4 = MethuselahsFinderV4(0.01, **finder_args)
logging.info(m_finder_v4)

m_finder_v4.find(num_iters, metric=MMetricFuncs.FperI, metric_ths=metric_ths)
m_finder_v4.plot_metric()

In [None]:
num_found_methuselahs = len(m_finder_v4.methuselahs)

if num_found_methuselahs == 0:
    logging.info("No methuselahs found!")
    logging.info("Let's plot the last generation...")
    gol = VecConwaysGOL(m_finder_v4._expand_candidates()[0])

else:
    logging.info(f"Found {num_found_methuselahs} methuselah(s)!")
    logging.info("Let's plot one of them...")
    gol = VecConwaysGOL(m_finder_v4.methuselahs[0])

In [None]:
gol.update_n(1500, plot=True, plot_every=17, sleep_s=0., cmap="gray", figsize=(6, 6))

## Summary

As seen in the metric plots, V1 and V3, which have no mutation, both converged to a single gene. V2 was more successful than V1 and V3: thanks to its mutations, it continued to improve with each generation. V4, with a mutation rate of 0.01, was the best of all and was able to find many methuselahs. It is interesting that such a low mutation rate was still enough to find them.

**Note:** In the experiments I ran, when the mutation rate was higher, the algorithm diverged and was unable to find any methuselahs.
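
As a rough back-of-the-envelope check on how much change a rate of 0.01 introduces, the expected number of changed cells per child can be estimated. This sketch assumes the V4 scheme, in which a selected cell is re-sampled uniformly from {0, 1} and therefore only changes half of the time; `expected_flips` is a hypothetical helper.

```python
def expected_flips(center_side_len: int, mutation_rate: float) -> float:
    """Expected number of cells per child whose value actually changes."""
    n_cells = center_side_len ** 2
    # A selected cell is re-sampled from {0, 1}, so it changes with probability 1/2.
    return n_cells * mutation_rate * 0.5

for side in (5, 8):  # the two center sizes used in the configuration cell
    print(side, expected_flips(side, 0.01))
```

With an 8x8 center and a rate of 0.01, this comes to about 0.32 changed cells per child on average, so most children are pure crossovers and only some carry a mutation.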