# Genetic🧬Algorithm

## Introduction

### What is a Genetic Algorithm?

A **genetic algorithm** (GA) is a search technique that **mimics natural selection**: it iteratively refines a population of candidate solutions to find optimal or near-optimal solutions.

### Why use genetic algorithms?

- They are useful for optimization problems where traditional methods fail (for example, gradient-based methods on non-differentiable or multimodal objectives).
- They efficiently navigate large, complex search spaces, which makes them well suited to finding good solutions under constraints.

## Gene Expression Programming (GEP)

### What is GEP?

**Gene Expression Programming (GEP)** is a variant of genetic algorithms in which individuals are encoded as fixed-length linear strings (the genotype) and then expressed as nonlinear structures of varying size and shape (the phenotype, typically expression trees).

GEP has proven effective on complex problems because it combines the advantages of genetic algorithms (simple, fixed-length genomes that are easy to manipulate) and genetic programming (expressive, tree-structured solutions).
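To make the genotype/phenotype split concrete, here is a minimal sketch of how a fixed-length linear gene can be decoded into an expression tree and evaluated. It assumes the breadth-first (Karva-style) reading commonly used in GEP. The symbol set, the helper names `decode` and `evaluate`, and the example genes are illustrative assumptions, not something taken from the text above.

```python
import math
from collections import deque

# Arity of each function symbol; any other symbol is treated as a terminal (a variable).
ARITY = {"+": 2, "-": 2, "*": 2, "Q": 1}  # "Q" stands for square root

def decode(gene):
    """Read a linear gene breadth-first into a nested [symbol, children] tree.

    Assumes the gene is long enough to supply every argument (in GEP the
    terminal-only tail of the gene guarantees this).
    """
    symbols = iter(gene)
    root = [next(symbols), []]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for _ in range(ARITY.get(node[0], 0)):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root  # symbols left unread are the non-coding part of the gene

def evaluate(node, env):
    """Evaluate a decoded tree, looking up terminals in the dict `env`."""
    symbol, children = node
    args = [evaluate(child, env) for child in children]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "Q":
        return math.sqrt(args[0])
    return env[symbol]

env = {"a": 1, "b": 3, "c": 5, "d": 1}
# Two genes of the same length (11 symbols) that express trees of different sizes:
print(evaluate(decode("Q*+-abcdaab"), env))  # sqrt((a + b) * (c - d)) = 4.0
print(evaluate(decode("+ab-abcdaab"), env))  # a + b = 4
```

Both genes have the same length, but the first uses eight symbols and the second only three, which is what "linear strings of fixed length expressed as entities of different sizes" means in practice.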

### Applications of GEP

- Symbolic Regression: Discover mathematical models that best fit a set of data points.
- Classification: Develop models to classify data into predefined categories.
- Time Series Prediction: Forecast future values based on historical data.

## Understanding Genetic Optimization

Genetic optimization refers to the use of genetic algorithms to solve optimization problems. This process involves generating a population of possible solutions and iteratively improving them based on their performance against a defined objective.

## Steps of a Genetic Algorithm

### Initialization

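- Generate an initial population of candidate solutions, either at random or with a problem-specific heuristic.
- The population size is an important parameter: a larger population explores more of the search space but requires more fitness evaluations per generation.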
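As a rough illustration of random initialization, the sketch below builds a population of binary-encoded individuals with NumPy. The binary encoding and the names `init_population`, `pop_size`, and `n_genes` are assumptions made for this example; a real problem may need a different representation.

```python
import numpy as np

def init_population(pop_size, n_genes, rng):
    """Return a (pop_size, n_genes) array of random 0/1 genes."""
    return rng.integers(0, 2, size=(pop_size, n_genes))

rng = np.random.default_rng(seed=42)  # fixed seed for reproducibility
population = init_population(pop_size=6, n_genes=8, rng=rng)
print(population.shape)  # (6, 8)
```

Each row is one individual and each column is one gene, so later steps can score, select, and recombine rows.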

### Fitness Function

- The fitness function evaluates how well each individual in the population performs.
- For example, in a recommendation system, the fitness function could be based on user engagement metrics such as click-through rate and user satisfaction scores.
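A fitness function is ordinary code that maps an individual to a score. The sketch below uses a small knapsack problem as a stand-in, since it is easy to verify by hand. The item values, weights, capacity, and the choice to score infeasible solutions as 0 are all hypothetical.

```python
import numpy as np

# Hypothetical problem data: four items with a value and a weight each.
values = np.array([10, 40, 30, 50])
weights = np.array([5, 4, 6, 3])
CAPACITY = 10

def fitness(individual):
    """Total value of the chosen items, or 0 if they exceed the capacity."""
    individual = np.asarray(individual)
    if weights @ individual > CAPACITY:
        return 0  # infeasible solutions get the lowest score
    return int(values @ individual)

print(fitness([0, 1, 0, 1]))  # weight 7 <= 10, value 40 + 50 = 90
print(fitness([1, 1, 1, 0]))  # weight 15 > 10, so 0
```

Scoring infeasible individuals as 0 is the simplest way to handle a constraint; a graded penalty is a common alternative when too many individuals are infeasible.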

### Selection

Choose the best-performing individuals to act as parents for the next generation. The most common selection methods are:

- Roulette Wheel Selection: Individuals are selected with probability proportional to their fitness.
- Tournament Selection: A small group of individuals is chosen at random, and the fittest among them is selected.
- Rank Selection: Individuals are ranked by fitness, and selection probability depends on rank rather than on raw fitness.
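The three selection methods listed above can be sketched as follows. Each function returns the indices of the selected individuals. The function names, the `tournsize` default, and the linear rank weighting are assumptions for illustration; the roulette version also assumes that fitness values are non-negative and not all zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def roulette_select(fitness, k):
    """Pick k indices with probability proportional to fitness."""
    fitness = np.asarray(fitness, dtype=float)
    return rng.choice(len(fitness), size=k, p=fitness / fitness.sum())

def tournament_select(fitness, k, tournsize=3):
    """For each pick, draw `tournsize` random contenders and keep the fittest."""
    fitness = np.asarray(fitness, dtype=float)
    contenders = rng.integers(0, len(fitness), size=(k, tournsize))
    return contenders[np.arange(k), fitness[contenders].argmax(axis=1)]

def rank_select(fitness, k):
    """Pick k indices with probability proportional to rank (worst = 1, best = n)."""
    fitness = np.asarray(fitness, dtype=float)
    ranks = fitness.argsort().argsort() + 1
    return rng.choice(len(fitness), size=k, p=ranks / ranks.sum())

scores = [1.0, 2.0, 10.0, 3.0]
print(roulette_select(scores, k=4))
print(tournament_select(scores, k=4))
print(rank_select(scores, k=4))
```

Rank selection is less sensitive to outliers: the individual with fitness 10.0 gets 10/16 of the roulette wheel but only 4/10 of the rank-based one.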

### Crossover (recombination)

Crossover merges two parent solutions to form offspring. Common crossover strategies include the following:

1. Single-Point Crossover: One crossover point is selected, and the parents exchange the genes after this point.
2. Two-Point Crossover: Two crossover points are selected, and the genes between these points are exchanged.
3. Uniform Crossover: Each gene is exchanged between the parents independently at random.
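A minimal sketch of the three crossover strategies on list-encoded parents follows. The function names and the `swap_prob` parameter are illustrative assumptions, and `two_point` assumes parents with at least three genes.

```python
import random

def single_point(p1, p2):
    """Swap everything after one random cut point."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def two_point(p1, p2):
    """Swap the segment between two distinct random cut points."""
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

def uniform(p1, p2, swap_prob=0.5):
    """Swap each gene independently with probability `swap_prob`."""
    c1, c2 = list(p1), list(p2)
    for i in range(len(p1)):
        if random.random() < swap_prob:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

mum, dad = [0] * 8, [1] * 8
print(single_point(mum, dad))
print(two_point(mum, dad))
print(uniform(mum, dad))
```

Using all-zero and all-one parents makes it easy to see in the printed children which genes came from which parent.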

### Mutation

- Mutation makes random changes to individual solutions in order to maintain genetic diversity.
- The mutation rate must be balanced carefully: too low and the population converges prematurely, too high and good solutions are destroyed.
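Two common mutation operators are sketched below: bit-flip mutation for binary genes and Gaussian mutation for real-valued genes. The function names and the default `sigma` are assumptions chosen for this example.

```python
import random

def bit_flip(individual, rate):
    """Flip each 0/1 gene independently with probability `rate`."""
    return [1 - gene if random.random() < rate else gene for gene in individual]

def gaussian(individual, rate, sigma=0.1):
    """Add N(0, sigma) noise to each real-valued gene with probability `rate`."""
    return [
        gene + random.gauss(0, sigma) if random.random() < rate else gene
        for gene in individual
    ]

print(bit_flip([0, 1, 0, 1, 1, 0], rate=0.2))
print(gaussian([0.5, 1.5, -2.0], rate=0.5))
```

A rate of about one expected change per individual (`1 / len(individual)`) is a common starting point for bit-flip mutation, though the best value is problem-dependent.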

### Termination

Repeat selection, crossover, and mutation until a stopping criterion is met, such as:

- a predetermined number of generations
- a target fitness level
- a lack of significant improvement over successive generations
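The sketch below shows how the three stopping criteria can sit together in one loop. It uses the OneMax toy problem (maximize the number of 1-bits) as a stand-in objective. The function `run_ga`, its parameters, and the operator choices (tournament selection, single-point crossover, bit-flip mutation) are assumptions made for illustration.

```python
import random

def run_ga(n_genes=20, pop_size=30, max_gens=200, target=None, patience=25, seed=0):
    """Run a small GA on OneMax and report why it stopped."""
    rng = random.Random(seed)
    target = n_genes if target is None else target
    fitness = sum  # OneMax: fitness is the number of 1-bits
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best, stale = max(map(fitness, pop)), 0

    for gen in range(1, max_gens + 1):  # criterion 1: generation budget
        next_pop = []
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_genes - 1)          # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < 1 / n_genes else g for g in child]
            next_pop.append(child)
        pop = next_pop

        gen_best = max(map(fitness, pop))
        if gen_best > best:
            best, stale = gen_best, 0
        else:
            stale += 1

        if best >= target:     # criterion 2: target fitness reached
            return best, gen, "target fitness reached"
        if stale >= patience:  # criterion 3: no improvement for `patience` generations
            return best, gen, "no improvement"
    return best, max_gens, "generation limit reached"

best, generations, reason = run_ga()
print(f"best={best} after {generations} generations ({reason})")
```

Checking the target before the stagnation counter means a run that hits the target on its last allowed stale generation is still reported as a success.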

## Implementation: Genetic Algorithm for Function Optimization

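```python
import numpy as np

# Define the fitness function
def fitness(x):
    # Maximize the function f(x) = x^2
    return x**2

# Define the GA parameters
POP_SIZE = 100
GENS = 100
CROSSOVER_PROB = 0.8
MUTATION_PROB = 0.2
TOURNAMENT_SIZE = 3  # number of contenders per tournament

# Initialize the population with random values in [0, 1)
pop = np.random.rand(POP_SIZE)

# Evaluate the fitness of the initial population
fitness_values = fitness(pop)

# Main GA loop
for gen in range(GENS):
    # Selection: tournament selection. Copying only the single best individual
    # would make every parent identical and turn crossover into a no-op.
    contenders = np.random.randint(0, POP_SIZE, size=(POP_SIZE, TOURNAMENT_SIZE))
    winners = contenders[np.arange(POP_SIZE), np.argmax(fitness_values[contenders], axis=1)]
    parents = pop[winners]

    # Crossover: with probability CROSSOVER_PROB the child is the average of
    # two random parents; otherwise it is a copy of the first parent.
    # Producing POP_SIZE children keeps the population size constant.
    offspring = np.empty(POP_SIZE)
    for i in range(POP_SIZE):
        parent1, parent2 = parents[np.random.randint(0, POP_SIZE, 2)]
        if np.random.rand() < CROSSOVER_PROB:
            offspring[i] = (parent1 + parent2) / 2
        else:
            offspring[i] = parent1

    # Mutation: add Gaussian noise to a random subset of the offspring
    mutate = np.random.rand(POP_SIZE) < MUTATION_PROB
    offspring[mutate] += np.random.normal(0, 0.1, size=mutate.sum())

    # Replace the population with the new offspring
    pop = offspring

    # Evaluate the fitness of the new population
    fitness_values = fitness(pop)

    # Print the best fitness value. Because f(x) = x^2 is unbounded and x is not
    # clipped to a range, the best fitness keeps growing from generation to generation.
    print(f"Generation {gen+1}, Best Fitness: {np.max(fitness_values)}")

# Print the final best solution
print(f"Final Best Solution: {pop[np.argmax(fitness_values)]}")
```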

Example output (truncated in the source; exact values vary from run to run):

```text
Generation 1, Best Fitness: 1.3805798734864907
Generation 2, Best Fitness: 1.8918933698527065
Generation 3, Best Fitness: 2.272549162093107
Generation 4, Best Fitness: 2.551340813154705
Generation 5, Best Fitness: 3.1288981735058266
Generation 6, Best Fitness: 3.6049920349176996
Generation 7, Best Fitness: 4.177347905458683
Generation 8, Best Fitness: 4.630325922407399
Generation 9, Best Fitness: 5.089008404137174
Generation 10, Best Fitness: 6.248126665139105
Generation 11, Best Fitness: 6.97702731961774
Generation 12, Best Fitness: 8.280769174704464
Generation 13, Best Fitness: 9.411689105921049
Generation 14, Best Fitness: 9.70377124693442
Generation 15, Best Fitness: 10.419032672986253
Generation 16, Best Fitness: 10.828679978910708
Generation 17, Best Fitness: 11.776683762080092
Generation 18, Best Fitness: 12.98318221402017
Generation 19, Best Fitness: 13.537779560534759
Generation 20, Best Fitness: 14.005319207874866
Generation 21, Best Fitness: 14.700010958741165
...
```

## Genetic Algorithm in ML

### Why Use Genetic Algorithms in ML?

### Hyperparameter Optimization

### Feature Selection

### Implementation of GA for feature selection in ML

```python
import random

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from deap import base, creator, tools, algorithms

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Define the fitness function
def fitness(individual):
    # Select the features whose bit is set to 1 in the individual
    selected_indices = [i for i, x in enumerate(individual) if x == 1]

    # Handle the case where no features are selected
    if not selected_indices:
        return 0.0,  # Return the lowest possible fitness

    selected_features = X[:, selected_indices]

    # Create a random forest classifier with the selected features
    # (fixed seed so that identical individuals get identical fitness)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)

    # Evaluate the model using 5-fold cross-validation
    scores = cross_val_score(clf, selected_features, y, cv=5)

    # Return the mean score as the fitness value (DEAP expects a tuple)
    return np.mean(scores),

# Create the DEAP fitness and individual types (single objective, maximized)
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

# Create a DEAP toolbox for the GA: one 0/1 gene per feature
toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bool, n=X.shape[1])
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("evaluate", fitness)

# Create a population of 50 individuals
pop = toolbox.population(n=50)

# Evaluate the initial population
fitnesses = toolbox.map(toolbox.evaluate, pop)
for ind, fit in zip(pop, fitnesses):
    ind.fitness.values = fit

# Run the GA for 20 generations
for g in range(20):
    # Apply crossover and mutation to clones of the current population
    offspring = algorithms.varAnd(pop, toolbox, cxpb=0.5, mutpb=0.1)

    # Re-evaluate only the individuals that crossover or mutation changed
    invalid = [ind for ind in offspring if not ind.fitness.valid]
    fits = toolbox.map(toolbox.evaluate, invalid)
    for ind, fit in zip(invalid, fits):
        ind.fitness.values = fit

    # Select the next generation from the offspring
    pop = toolbox.select(offspring, k=len(pop))

# Print the best individual and the corresponding fitness value
best_individual = tools.selBest(pop, k=1)[0]
print("Best Individual:", best_individual)
print("Best Fitness:", best_individual.fitness.values[0])

# Select the feature columns based on the best individual
selected_features = X[:, [i for i, x in enumerate(best_individual) if x == 1]]

# Print the selected features
print("Selected Features:", selected_features)
```

Example output (the printed feature matrix is truncated in the source and abbreviated here; results vary from run to run):

```text
Best Individual: [0, 0, 1, 1]
Best Fitness: 0.9666666666666668
Selected Features: [[1.4 0.2]
 [1.4 0.2]
 [1.3 0.2]
 [1.5 0.2]
 [1.4 0.2]
 ...
```

## Reference

[Genetic🧬Algorithm: Complete Guide With Python Implementation](https://levelup.gitconnected.com/genetic-algorithm-complete-guide-with-python-implementation-747d62dbe9bd)