<a href="https://colab.research.google.com/github/damladmrk/GeneticAlgorithms/blob/main/GeneticAlgorithmsProjects.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🔹 Support Vector Machines (SVM)

### What is SVM?
Support Vector Machine (SVM) is a **supervised learning algorithm** mainly used for **classification** and sometimes **regression**.  
It finds the **best hyperplane** that separates classes while maximizing the margin between them.

---

### How does it work?
- **Hyperplane**: A line (2D) or a plane (3D) that separates classes.  
- **Support Vectors**: The closest data points to the hyperplane; they "support" the boundary.  
- **Margin**: The distance between the hyperplane and support vectors. SVM maximizes this margin.
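
To make these three terms concrete, here is a minimal sketch in plain Python. The weight vector `w`, the bias `b`, and the sample points are made-up illustrative values, not taken from any dataset used in this notebook.

```python
import math

def signed_distance(w, b, point):
    """Signed distance from `point` to the hyperplane w.x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm_w

# Hypothetical 2D hyperplane: x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0

# Hypothetical points; the ones closest to the hyperplane act as support vectors
points = [(1.0, 1.0), (0.0, 0.0), (2.0, 2.0), (4.0, 3.0)]
distances = [signed_distance(w, b, p) for p in points]

# The margin (on one side) is the smallest absolute distance to the hyperplane
margin = min(abs(d) for d in distances)
print(distances, margin)
```

The sign of each distance tells which side of the hyperplane a point falls on; the points `(1, 1)` and `(2, 2)` are the closest ones and therefore determine the margin in this toy setup.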

---

### Linear vs. Nonlinear
- **Linear SVM**: Works well when the classes are (at least approximately) linearly separable.  
- **Nonlinear SVM**: Uses the **kernel trick** to implicitly map the data into a higher-dimensional space where it becomes linearly separable.
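
The kernel trick can be illustrated with a small sketch. It assumes a degree-2 polynomial kernel and a hand-written feature map `phi` (both chosen here for illustration): the kernel value computed in the original 2D space equals the inner product in the explicit 3D feature space, so the higher-dimensional vectors never need to be built.

```python
def phi(v):
    """Explicit degree-2 feature map for a 2D vector (x1, x2)."""
    x1, x2 = v
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: (a.b)^2."""
    return dot(a, b) ** 2

a, b = (1.0, 2.0), (3.0, 0.5)
explicit = dot(phi(a), phi(b))   # inner product in the 3D feature space
implicit = poly2_kernel(a, b)    # same value, computed in the original 2D space
print(explicit, implicit)
```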

---

### Key Parameters
1. **kernel**  
   - Defines the transformation function of data.  
   - Common choices:  
     - `"linear"` → straight hyperplane  
     - `"poly"` → polynomial transformation  
     - `"rbf"` (Radial Basis Function, default) → maps data into higher dimension smoothly  
     - `"sigmoid"` → behaves like a neural network activation  

2. **degree**  
   - Used only with `"poly"` kernel.  
   - Controls the degree of the polynomial (e.g., degree=3 means cubic).  
   - Higher degree → more complex decision boundary.  

3. **gamma**  
   - Defines how far the influence of a single training example reaches.  
   - Low gamma → wide influence (smoother boundary).  
   - High gamma → narrow influence (tighter boundary, risk of overfitting).  

4. **C (Regularization parameter)**  
   - Controls the trade-off between maximizing the margin and minimizing misclassification.  
   - Low C → wider margin, allows some misclassifications (better generalization).  
   - High C → tries to classify all training examples correctly (risk of overfitting).  
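
As a rough illustration of how `gamma` controls the reach of a single training example, the sketch below evaluates the RBF kernel by hand. The two points and the gamma values are arbitrary choices made for this example.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel: exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

anchor = (0.0, 0.0)
neighbor = (1.0, 1.0)  # squared distance to anchor is 2

low = rbf_kernel(anchor, neighbor, gamma=0.05)   # low gamma: still strongly similar
high = rbf_kernel(anchor, neighbor, gamma=5.0)   # high gamma: similarity almost vanishes
print(low, high)
```

With a low gamma the neighbor still counts as similar to the anchor (wide influence); with a high gamma the similarity is close to zero, so each training point only affects its immediate surroundings.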



# 🔹 Neural Networks (NN)

### What is a Neural Network?
A **Neural Network (NN)** is a machine learning model inspired by the human brain.  
It consists of layers of interconnected nodes (**neurons**) that process inputs and learn complex patterns.  

---

### Structure
1. **Input Layer**  
   - Takes raw data (features) and passes it to the network.

2. **Hidden Layers**  
   - Intermediate layers that transform inputs into higher-level features.  
   - Each layer applies weights, bias, and an **activation function**.

3. **Output Layer**  
   - Produces the final prediction (e.g., class probabilities, regression value).
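
The three layer types can be sketched as a single forward pass in plain Python. The weights, biases, and input values below are hypothetical numbers picked for readability, not trained parameters.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w.x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 output (sigmoid)
hidden_w, hidden_b = [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5]
output_w, output_b = [[1.0, -2.0]], [0.0]

features = [2.0, 1.0]                                # input layer: raw features
hidden = dense(features, hidden_w, hidden_b, relu)   # hidden layer
output = dense(hidden, output_w, output_b, sigmoid)  # output layer
print(hidden, output)
```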

---

### Activation Functions
Activation functions decide whether a neuron should "fire" and introduce non-linearity.  
Common ones:  
- **Sigmoid** → outputs values between 0 and 1 (good for probabilities).  
- **Tanh** → outputs between -1 and 1 (zero-centered).  
- **ReLU (Rectified Linear Unit)** → outputs max(0, x); fast and widely used.  
- **Leaky ReLU / ELU** → variations of ReLU to fix the "dying ReLU" problem.  
- **Softmax** → converts outputs into probabilities for classification.
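
The functions listed above are short enough to write out directly. This is a minimal sketch using only the standard library; the `slope` of 0.01 for Leaky ReLU is a commonly used value assumed here, not something specified in this notebook.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # A small negative slope keeps a gradient alive for x < 0
    return x if x > 0 else slope * x

def softmax(values):
    # Subtract the maximum first for numerical stability
    shifted = [v - max(values) for v in values]
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]

for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), math.tanh(x), relu(x), leaky_relu(x))
print(softmax([1.0, 2.0, 3.0]))
```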

---

### Classic Types of Neural Networks
1. **Feedforward Neural Network (FNN / MLP)**  
   - The simplest type; data flows one way from input to output.  

2. **Convolutional Neural Network (CNN)**  
   - Designed for images; uses convolution + pooling to capture spatial patterns.  

3. **Recurrent Neural Network (RNN)**  
   - Designed for sequential data (text, time series); outputs depend on previous states.  

4. **LSTM / GRU (advanced RNNs)**  
   - Handle long-term dependencies in sequences better than vanilla RNNs.  

5. **Transformer-based Models**  
   - Use attention mechanisms; dominant in NLP and vision today (e.g., BERT, GPT).  

---

### Key Hyperparameters
1. **Number of Layers**  
   - More layers = deeper network = can learn more complex functions (risk of overfitting).

2. **Number of Neurons per Layer**  
   - Controls capacity of each layer.  

3. **Learning Rate**  
   - Step size for weight updates.  
   - Too high = unstable, too low = slow learning.  

4. **Batch Size**  
   - Number of samples processed before weights are updated.  

5. **Number of Epochs**  
   - How many times the entire dataset is passed through the network during training.
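
The effect of the learning rate can be seen on a toy problem. The sketch below minimizes the assumed function f(w) = w² with plain gradient descent; the three learning rates are illustrative choices.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w^2 (gradient 2w) and return the final w."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

for lr in (0.01, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With the smallest rate `w` has barely moved toward 0 after 20 steps, the middle rate gets close to the minimum, and the largest rate overshoots on every step so that `w` grows in magnitude instead of shrinking.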


# Complexity Classes: P vs NP

**P (Polynomial Time):**  
- Class of problems that can be solved in polynomial time by a deterministic algorithm.  
- Examples: Sorting numbers, finding the greatest common divisor, shortest path in a graph (Dijkstra).  
- If a problem is in **P**, it means there exists an efficient algorithm that solves it in "reasonable" time as input size grows.  

**NP (Nondeterministic Polynomial Time):**  
- Class of problems where a given solution can be verified in polynomial time, even if we don’t know how to solve it efficiently.  
- Examples: Sudoku, Boolean satisfiability (SAT), Traveling Salesman Problem (decision form).  
- Being in **NP** does not mean we can solve it fast, but we can *check* a solution quickly.  

**P vs NP Problem:**  
- The biggest open question in computer science: **Is P = NP?**  
- If P = NP → all problems with verifiable solutions could also be solved efficiently.  
- If P ≠ NP → there exist problems that are easy to check but hard to solve.  
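
The gap between checking and solving can be sketched with the subset-sum problem, which is used here purely as an illustration (it is not one of the examples listed above). Verifying a proposed subset takes one pass over it, while the naive solver may have to try up to 2^n subsets.

```python
from itertools import combinations

def verify(numbers, target, candidate_indices):
    """Polynomial-time check: does the proposed subset sum to the target?"""
    return sum(numbers[i] for i in candidate_indices) == target

def brute_force_solve(numbers, target):
    """Exhaustive search over all subsets; up to 2^n candidates."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if verify(numbers, target, indices):
                return indices
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(verify(numbers, 9, (2, 4)))      # checking a given certificate is cheap
print(brute_force_solve(numbers, 9))   # finding one may require trying many subsets
```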

**NP-Complete Problems:**  
- Hardest problems in NP.  
- If you solve one NP-complete problem efficiently, all NP problems can be solved efficiently.  
- Examples: SAT, Traveling Salesman (decision form), Knapsack Problem (decision form).  

**NP-Hard:**  
- At least as hard as NP-complete problems but not necessarily in NP (solution may not even be verifiable in polynomial time).  


# Heuristics and Metaheuristics

## 🔹 Heuristics
Heuristics are problem-solving strategies that use rules of thumb or approximations to find solutions faster when exact methods are too slow or impractical.  
They do not guarantee an optimal solution but can provide "good enough" results within a reasonable time.

### Characteristics
- Problem-specific  
- Fast and simple to implement  
- May lead to suboptimal or biased solutions  
- Useful for NP-hard problems where exact algorithms are infeasible  

### Examples
- Greedy Algorithm  
- Hill Climbing  
- Nearest Neighbor for TSP  
- A* Search  
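
As one concrete case, the nearest-neighbor heuristic for TSP fits in a few lines. The city coordinates below are invented for this sketch; the tour it produces is fast to compute but not guaranteed to be the shortest one.

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy TSP heuristic: always visit the closest unvisited city next."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        current = cities[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(current, cities[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(cities, tour):
    """Total length of the closed tour (returns to the start)."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

# Hypothetical city coordinates
cities = [(0, 0), (1, 0), (3, 0), (3, 2), (0, 3)]
tour = nearest_neighbor_tour(cities)
print(tour, tour_length(cities, tour))
```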

---

## 🔹 Metaheuristics
Metaheuristics are higher-level strategies designed to guide heuristics in exploring the solution space more effectively.  
They are **general-purpose** and can be applied to a wide range of optimization problems.

### Characteristics
- Problem-independent  
- Balance between **exploration** (global search) and **exploitation** (local search)  
- Usually stochastic (randomness is used)  
- Aim to avoid local optima  

### Examples
- **Genetic Algorithms (GA)**  
- **Simulated Annealing (SA)**  
- **Particle Swarm Optimization (PSO)**  
- **Ant Colony Optimization (ACO)**  
- **Tabu Search**  

---

## 🔹 Key Differences

| Feature              | Heuristic                  | Metaheuristic                        |
|----------------------|---------------------------|--------------------------------------|
| **Scope**            | Problem-specific          | Problem-independent                  |
| **Optimality**       | May get stuck in local optima | Can escape local optima              |
| **Complexity**       | Simple                    | More complex, often population-based |
| **Speed**            | Fast                      | Slower but more reliable             |
| **Examples**         | Greedy, A*, Hill Climbing | GA, PSO, SA, ACO, Tabu Search        |

---

## 🔹 Hyperparameters in Metaheuristics
Each metaheuristic algorithm comes with **tunable hyperparameters**, such as:
- **Genetic Algorithm**: population size, mutation rate, crossover rate, number of generations  
- **Simulated Annealing**: initial temperature, cooling schedule  
- **PSO**: number of particles, inertia weight, cognitive/social coefficients  
- **ACO**: pheromone evaporation rate, alpha/beta parameters  
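
To show where such hyperparameters enter an algorithm, here is a minimal simulated annealing sketch. The objective function, the step size, the geometric cooling factor of 0.95, and the fixed seed are all assumptions made for this example.

```python
import math
import random

def simulated_annealing(f, start, initial_temp=10.0, cooling=0.95, steps=500, seed=0):
    """Minimize f over the reals with a geometric cooling schedule."""
    rng = random.Random(seed)
    current, best = start, start
    temp = initial_temp
    for _ in range(steps):
        candidate = current + rng.uniform(-1.0, 1.0)
        delta = f(candidate) - f(current)
        # Always accept improvements; accept worse moves with probability exp(-delta / temp)
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if f(current) < f(best):
            best = current
        temp *= cooling  # cooling schedule: the temperature shrinks every step
    return best

best_x = simulated_annealing(lambda x: (x - 3.0) ** 2, start=-5.0)
print(best_x)
```

A high initial temperature lets the search accept worse moves early on (exploration), while the cooling schedule gradually turns it into a greedy local search (exploitation).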

---

✅ **In summary:**  
- Heuristics = fast, problem-specific shortcuts.  
- Metaheuristics = general strategies to explore search space effectively, often inspired by nature.  


### 📌 Introduction to Genetic Algorithms (GA)

---

#### 1. How Genetic Algorithms Were Born
Genetic Algorithms (GAs) are inspired by **Charles Darwin's theory of natural evolution**. They were first formalized in the 1960s and 1970s by **John Holland** at the University of Michigan. Holland wanted to model the process of natural selection and evolution as a way to solve optimization and search problems.

---

#### 2. Who Developed Them
- **John Holland (1960s–1975)** → introduced the concept of Genetic Algorithms.
- **David E. Goldberg (1989)** → expanded practical applications in engineering and optimization.

---

#### 3. What GAs Are Used For
Genetic Algorithms are mainly used to solve **optimization and search problems**, especially when the search space is huge and complex. Applications include:
- Machine learning & neural networks (feature selection, hyperparameter tuning)
- Engineering design optimization
- Game strategy development
- Robotics (path planning)
- Bioinformatics (gene sequence alignment)
- Economics and scheduling problems

---

#### 4. Main Terms and Their Meanings
- **Population** → A set of candidate solutions.
- **Chromosome** → A representation of a single solution (often as a string of numbers/bits).
- **Gene** → A part of the chromosome (one variable/feature).
- **Fitness Function** → Measures how good a solution is.
- **Selection** → Choosing parents based on fitness to reproduce.
- **Crossover (Recombination)** → Combining two parents to form new offspring.
- **Mutation** → Randomly changing genes to maintain diversity.
- **Generation** → One iteration of the GA process.
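
The terms above map onto code fairly directly. The following is a minimal sketch on the OneMax toy problem (maximize the number of 1 genes), which is an assumption made for illustration and unrelated to the SVM task later in this notebook; all parameter values are arbitrary.

```python
import random

def fitness(chromosome):
    """OneMax fitness: the number of 1 genes."""
    return sum(chromosome)

def run_ga(n_genes=20, pop_size=10, generations=30, prob_mutation=0.05, seed=1):
    rng = random.Random(seed)
    # Population: a list of chromosomes, each a list of 0/1 genes
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # one loop iteration = one generation
        new_population = []
        while len(new_population) < pop_size:
            # Selection: best of 3 random candidates, done once per parent
            parent_1 = max(rng.sample(population, 3), key=fitness)
            parent_2 = max(rng.sample(population, 3), key=fitness)
            # Crossover: single cut point
            cut = rng.randint(1, n_genes - 1)
            child = parent_1[:cut] + parent_2[cut:]
            # Mutation: flip each gene with a small probability
            child = [1 - g if rng.random() < prob_mutation else g for g in child]
            new_population.append(child)
        population = new_population
        best = max(population + [best], key=fitness)
    return best

best = run_ga()
print(best, fitness(best))
```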

---

#### 5. Selection Types in Genetic Algorithms
1. **Roulette Wheel Selection (Fitness Proportionate)**
   - Probability of being selected is proportional to fitness.
2. **Tournament Selection**
   - A few candidates are chosen randomly, and the best among them is selected.
3. **Rank Selection**
   - Solutions are ranked, and selection is based on rank rather than raw fitness.
4. **Elitism**
   - The best individuals are guaranteed to survive to the next generation.
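
The four schemes can be sketched as small helper functions. Note the assumption that higher fitness is better here, whereas the SVM code later in this notebook minimizes an error value; the population and fitness values are made up.

```python
import random

def roulette_wheel(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]

def tournament(population, fitnesses, rng, size=3):
    """Pick `size` random candidates and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: fitnesses[i])]

def rank_selection(population, fitnesses, rng):
    """Pick with probability proportional to rank (worst = 1, best = n)."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    ranks = [0] * len(population)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return rng.choices(population, weights=ranks, k=1)[0]

def elites(population, fitnesses, n=1):
    """Return the n fittest individuals, to be copied unchanged."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    return [population[i] for i in order[:n]]

rng = random.Random(0)
population = ["A", "B", "C", "D"]
fitnesses = [1.0, 2.0, 3.0, 10.0]  # higher is better in this sketch
print(roulette_wheel(population, fitnesses, rng))
print(tournament(population, fitnesses, rng))
print(rank_selection(population, fitnesses, rng))
print(elites(population, fitnesses, n=2))
```

Rank selection dampens the advantage of an outlier such as `"D"`: under roulette wheel it is chosen with probability 10/16, under rank selection only with probability 4/10.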





In [None]:
%pip install ucimlrepo



# **SVM Genetic Algorithm Parameter Optimisation**

***Chromosome***: A binary string whose first half encodes gamma and whose second half encodes C.
- C is in the range 10 to 1000
- gamma is in the range 0.05 to 0.99 (the bounds set in `objective_value`)
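
The decoding step can be sketched on its own. This is a compact illustration rather than the function used below: the helper name `decode` is hypothetical, the bounds are the ones set in `objective_value` (gamma from 0.05 to 0.99, C from 10 to 1000), and the example chromosome is the best solution reported at the end of this notebook.

```python
def decode(bits, lower, upper):
    """Map a list of bits (most significant first) onto [lower, upper]."""
    as_int = int("".join(str(int(b)) for b in bits), 2)
    precision = (upper - lower) / (2 ** len(bits) - 1)
    return lower + as_int * precision

# A 24-gene chromosome: the first 12 genes encode gamma, the last 12 encode C
chromosome = [0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1,
              1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0]
half = len(chromosome) // 2

gamma = decode(chromosome[:half], 0.05, 0.99)
c = decode(chromosome[half:], 10, 1000)
print(round(c, 5), round(gamma, 5))
```

With 12 bits per parameter, each half can take 4096 distinct values, so C is searched in steps of about 0.24 and gamma in steps of about 0.00023.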

In [None]:
from ucimlrepo import fetch_ucirepo

energy_efficiency = fetch_ucirepo(id=242)

X = energy_efficiency.data.features
y = energy_efficiency.data.targets

print(energy_efficiency.metadata)

print(energy_efficiency.variables)


{'uci_id': 242, 'name': 'Energy Efficiency', 'repository_url': 'https://archive.ics.uci.edu/dataset/242/energy+efficiency', 'data_url': 'https://archive.ics.uci.edu/static/public/242/data.csv', 'abstract': 'This study looked into assessing the heating load and cooling load requirements of buildings (that is, energy efficiency) as a function of building parameters.', 'area': 'Computer Science', 'tasks': ['Classification', 'Regression'], 'characteristics': ['Multivariate'], 'num_instances': 768, 'num_features': 8, 'feature_types': ['Integer', 'Real'], 'demographics': [], 'target_col': ['Y1', 'Y2'], 'index_col': None, 'has_missing_values': 'no', 'missing_values_symbol': None, 'year_of_dataset_creation': 2012, 'last_updated': 'Mon Feb 26 2024', 'dataset_doi': '10.24432/C51307', 'creators': ['Athanasios Tsanas', 'Angeliki Xifara'], 'intro_paper': {'ID': 379, 'type': 'NATIVE', 'title': 'Accurate quantitative estimation of energy performance of residential buildings using statistical machine 

In [None]:
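import pandas as pd

X_df = pd.DataFrame(X)
y_df = pd.DataFrame(y)

# combine features and targets into a single DataFrame
data = pd.concat([X_df, y_df], axis=1)

# note: the rows stay in their original order here; data.sample(frac=1) returns
# a shuffled copy, so it only takes effect when assigned back to `data`

print(data)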

       X1     X2     X3      X4   X5  X6   X7  X8     Y1     Y2
0    0.98  514.5  294.0  110.25  7.0   2  0.0   0  15.55  21.33
1    0.98  514.5  294.0  110.25  7.0   3  0.0   0  15.55  21.33
2    0.98  514.5  294.0  110.25  7.0   4  0.0   0  15.55  21.33
3    0.98  514.5  294.0  110.25  7.0   5  0.0   0  15.55  21.33
4    0.90  563.5  318.5  122.50  7.0   2  0.0   0  20.84  28.28
..    ...    ...    ...     ...  ...  ..  ...  ..    ...    ...
763  0.64  784.0  343.0  220.50  3.5   5  0.4   5  17.88  21.40
764  0.62  808.5  367.5  220.50  3.5   2  0.4   5  16.54  16.88
765  0.62  808.5  367.5  220.50  3.5   3  0.4   5  16.44  17.11
766  0.62  808.5  367.5  220.50  3.5   4  0.4   5  16.48  16.61
767  0.62  808.5  367.5  220.50  3.5   5  0.4   5  16.64  16.03

[768 rows x 10 columns]


In [None]:
data.columns

Index(['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'Y1', 'Y2'], dtype='object')

In [None]:
import numpy as np
import random as rd
from sklearn import preprocessing
from sklearn.model_selection import KFold
from sklearn import svm

In [None]:
# original data
x_org_data = pd.DataFrame(data,columns=['X1', 'X2', 'X3', 'X4',
                                        'X5', 'X6', 'X7', 'X8'])
y = pd.DataFrame(data,columns=["Y1"]).values # output (Y1)

# one-hot encode the features X6 and X8
x_with_dummies = pd.get_dummies(x_org_data,columns=["X6","X8"])
var_prep = preprocessing.MinMaxScaler()

# scale every feature to the [0, 1] range
x = var_prep.fit_transform(x_with_dummies)
data_count = len(x)
print("number of observations:",data_count)

number of observations: 768


In [None]:
# hyperparameters (user-defined parameters)
prob_crsvr = 1 # probability of crossover
prob_mutation = 0.3 # probability of mutation
population = 10 # population size
generations = 50 # number of generations

kfold = 3

In [None]:
# calculate the fitness value for a chromosome of 0s and 1s
# x and y are the feature matrix and the target; the chromosome encodes
# gamma in its first half and C in its second half
def objective_value(x,y,chromosome,kfold=3):

    half = len(chromosome)//2 # number of genes per hyperparameter

    # bounds for C
    lb_c = 10 # lower bound for C
    ub_c = 1000 # upper bound for C

    # bounds for gamma
    lb_gamma = 0.05 # lower bound for gamma
    ub_gamma = 0.99 # upper bound for gamma

    precision_c = (ub_c-lb_c)/((2**half)-1) # step size for decoding C
    precision_gamma = (ub_gamma-lb_gamma)/((2**half)-1) # step size for decoding gamma

    # C: read the second half from the last gene backwards,
    # so the last gene is the least significant bit (2^0)
    c_bit_sum = 0
    for i in range(half):
        c_bit_sum = c_bit_sum + chromosome[-(i+1)]*(2**i)

    # gamma: read the first half backwards in the same way,
    # e.g. with 4 genes in total the first gamma bit read is at index -3
    gamma_bit_sum = 0
    for j in range(half):
        gamma_bit_sum = gamma_bit_sum + chromosome[-(half+j+1)]*(2**j)

    # decode the bit sums into the actual hyperparameter values
    c_hyperparameter = (c_bit_sum*precision_c)+lb_c
    gamma_hyperparameter = (gamma_bit_sum*precision_gamma)+lb_gamma


    kf = KFold(n_splits=kfold)

    # objective function value for the decoded C and gamma
    sum_of_error = 0
    for train_index,test_index in kf.split(x):
        x_train,x_test = x[train_index],x[test_index]
        y_train,y_test = y[train_index],y[test_index]

        model = svm.SVR(kernel="rbf",
                        C=c_hyperparameter,
                        gamma=gamma_hyperparameter)
        model.fit(x_train,np.ravel(y_train))

        # for a regressor, score() returns R^2 (not a classification accuracy)
        r2 = model.score(x_test,y_test)
        error = 1-r2
        sum_of_error += error

    avg_error = sum_of_error/kfold

    # returns the decoded C, the decoded gamma, and the average error (1 - R^2)
    return c_hyperparameter,gamma_hyperparameter,avg_error




In [None]:
# finding 2 parents from the pool of solutions
# using the tournament selection method
def find_parents_ts(all_solutions,x,y):

    # make an empty array to place the selected parents
    parents = np.empty((0,np.size(all_solutions,1)))

    for i in range(2): # do the process twice to get 2 parents

        # select 3 random parents from the pool of solutions you have

        # get 3 random integers
        indices_list = np.random.choice(len(all_solutions),3,
                                        replace=False)

        # get the 3 possible parents for selection
        posb_parent_1 = all_solutions[indices_list[0]]
        posb_parent_2 = all_solutions[indices_list[1]]
        posb_parent_3 = all_solutions[indices_list[2]]


        # get objective function value (fitness) for each possible parent
        # index no.2 because the objective_value function gives the fitness value at index no.2
        obj_func_parent_1 = objective_value(x=x,y=y,
                                            chromosome=posb_parent_1)[2] # possible parent 1
        obj_func_parent_2 = objective_value(x=x,y=y,
                                            chromosome=posb_parent_2)[2] # possible parent 2
        obj_func_parent_3 = objective_value(x=x,y=y,
                                            chromosome=posb_parent_3)[2] # possible parent 3


        # find which parent is the best
        min_obj_func = min(obj_func_parent_1,obj_func_parent_2,
                           obj_func_parent_3)

        if min_obj_func == obj_func_parent_1:
            selected_parent = posb_parent_1
        elif min_obj_func == obj_func_parent_2:
            selected_parent = posb_parent_2
        else:
            selected_parent = posb_parent_3

        # put the selected parent in the empty array we created above
        parents = np.vstack((parents,selected_parent))

    parent_1 = parents[0,:] # parent_1, first element in the array
    parent_2 = parents[1,:] # parent_2, second element in the array

    return parent_1,parent_2 # the defined function will return 2 arrays



In [None]:
# crossover between the 2 parents to create 2 children
# function inputs are parent_1, parent_2, and the desired probability of crossover
# default probability of crossover is 1
def crossover(parent_1,parent_2,prob_crsvr=1):

    child_1 = np.empty((0,len(parent_1)))
    child_2 = np.empty((0,len(parent_2)))


    rand_num_to_crsvr_or_not = np.random.rand() # do we crossover or no???

    if rand_num_to_crsvr_or_not < prob_crsvr:
        index_1 = np.random.randint(0,len(parent_1))
        index_2 = np.random.randint(0,len(parent_1))

        # get different indices
        # to make sure you will crossover at least one gene
        while index_1 == index_2:
            index_2 = np.random.randint(0,len(parent_1))

        index_parent_1 = min(index_1,index_2)
        index_parent_2 = max(index_1,index_2)


        ### FOR PARENT_1 ###

        # first_seg_parent_1 -->
        # for parent_1: the genes from the beginning of parent_1 to the
                # beginning of the middle segment of parent_1
        first_seg_parent_1 = parent_1[:index_parent_1]

        # middle segment; where the crossover will happen
        # for parent_1: the genes from the index chosen for parent_1 to
                # the index chosen for parent_2
        mid_seg_parent_1 = parent_1[index_parent_1:index_parent_2+1]

        # last_seg_parent_1 -->
        # for parent_1: the genes from the end of the middle segment of
                # parent_1 to the last gene of parent_1
        last_seg_parent_1 = parent_1[index_parent_2+1:]


        ### FOR PARENT_2 ###

        # first_seg_parent_2 --> same as parent_1
        first_seg_parent_2 = parent_2[:index_parent_1]

        # mid_seg_parent_2 --> same as parent_1
        mid_seg_parent_2 = parent_2[index_parent_1:index_parent_2+1]

        # last_seg_parent_2 --> same as parent_1
        last_seg_parent_2 = parent_2[index_parent_2+1:]


        ### CREATING CHILD_1 ###

        # the first segment from parent_1
        # plus the middle segment from parent_2
        # plus the last segment from parent_1
        child_1 = np.concatenate((first_seg_parent_1,mid_seg_parent_2,
                                  last_seg_parent_1))


        ### CREATING CHILD_2 ###

        # the first segment from parent_2
        # plus the middle segment from parent_1
        # plus the last segment from parent_2
        child_2 = np.concatenate((first_seg_parent_2,mid_seg_parent_1,
                                  last_seg_parent_2))


    # when we will not crossover
    # when rand_num_to_crsvr_or_not is NOT less (is greater) than prob_crsvr
    # when prob_crsvr == 1, then rand_num_to_crsvr_or_not will always be less
            # than prob_crsvr, so we will always crossover then
    else:
        child_1 = parent_1
        child_2 = parent_2

    return child_1,child_2 # the defined function will return 2 arrays

In [None]:
# mutation for the 2 children
# function inputs are child_1, child_2, and the desired probability of mutation
# default probability of mutation is 0.2
def mutation(child_1,child_2,prob_mutation=0.2):

    # work on copies so the input children are not modified in place
    mutated_child_1 = np.copy(child_1)
    mutated_child_2 = np.copy(child_2)

    # same process for both children
    for mutated_child in (mutated_child_1,mutated_child_2):

        for t in range(len(mutated_child)): # for each gene (index)

            rand_num_to_mutate_or_not = np.random.rand() # do we mutate or not?

            # if rand_num_to_mutate_or_not is less than the probability of mutation
                    # then we mutate at that given gene (index we are currently at)
            if rand_num_to_mutate_or_not < prob_mutation:
                mutated_child[t] = abs(mutated_child[t]-1) # a 1 becomes a 0 and a 0 becomes a 1

    return mutated_child_1,mutated_child_2 # the defined function will return 2 arrays


# **Optimizing the SVM**

In [None]:
# encoding of the two decision variables (gamma and C)
# 12 genes for gamma and 12 genes for C (arbitrary number)
x_y_string = np.array([0,1,0,0,0,1,0,0,1,0,0,1,
                       0,1,1,1,0,0,1,0,1,1,1,0]) # initial solution


# create an empty array to put initial population
pool_of_solutions = np.empty((0,len(x_y_string)))


# create an empty array to store a solution from each generation
# for each generation, we want to save the best solution in that generation
# to compare with the convergence of the algorithm
best_of_a_generation = np.empty((0,len(x_y_string)+1))

In [None]:
# shuffle the elements in the initial solution (vector)
# shuffle n times, where n is the no. of the desired population
for i in range(population):
    rd.shuffle(x_y_string)
    pool_of_solutions = np.vstack((pool_of_solutions,x_y_string))


# so now, pool_of_solutions, has n (population) chromosomes
print(pool_of_solutions, len(pool_of_solutions))

[[1. 0. 1. 1. 1. 0. 1. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 1. 1. 0. 0. 1.]
 [1. 0. 1. 0. 0. 1. 0. 0. 1. 0. 0. 0. 1. 1. 0. 0. 0. 1. 1. 0. 0. 1. 1. 1.]
 [1. 0. 1. 1. 0. 1. 0. 0. 0. 1. 0. 1. 0. 1. 0. 0. 0. 1. 1. 1. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 1. 1. 1. 0. 1. 0. 1. 1. 1. 0.]
 [0. 0. 1. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 1. 0. 0. 1. 1. 1. 1. 1. 0.]
 [0. 0. 1. 0. 1. 0. 1. 1. 1. 1. 0. 0. 0. 0. 1. 0. 0. 1. 1. 1. 0. 1. 0. 0.]
 [1. 0. 1. 0. 1. 0. 0. 1. 1. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 1. 1. 0. 1. 0.]
 [0. 1. 1. 1. 1. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 1. 1. 0. 0. 1. 1.]
 [0. 0. 0. 0. 1. 0. 1. 0. 1. 0. 0. 0. 1. 0. 1. 0. 0. 1. 1. 1. 1. 0. 1. 1.]
 [0. 0. 1. 1. 0. 0. 0. 1. 1. 1. 0. 0. 1. 0. 0. 1. 0. 1. 0. 1. 1. 1. 0. 0.]] 10
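
Because every chromosome above is a shuffle of the same 24-bit vector, all individuals start with the same number of ones (11); only later mutation changes that count. One possible alternative, sketched here under the assumption that independent uniformly random bits are acceptable for the initial population, is shown below. The helper name `random_population` and the seed are hypothetical.

```python
import random

def random_population(pop_size, n_genes, seed=None):
    """Build a population of independent, uniformly random bit strings."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]

pool = random_population(pop_size=10, n_genes=24, seed=42)
ones_per_chromosome = [sum(chromosome) for chromosome in pool]
print(ones_per_chromosome)
```

The printed counts generally differ between chromosomes, which gives the first generation a wider spread of decoded C and gamma values.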


In [None]:
gen = 1 # we start at generation no.1 (tracking purposes)

for i in range(generations): # do it n (generation) times

    # an empty array for saving the new generation
    # at the beginning of each generation, the array should be empty
    # so that you put all the solutions created in a certain generation
    new_population = np.empty((0,len(x_y_string)))

    # an empty array for saving the new generation plus its obj func val
    new_population_with_obj_val = np.empty((0,len(x_y_string)+1))

    # an empty array for saving the best solution (chromosome)
    # for each generation
    sorted_best = np.empty((0,len(x_y_string)+1))

    print()
    print()
    print("--> Generation: #", gen) # tracking purposes


    family = 1 # we start at family no.1 (tracking purposes)


    for j in range(population//2): # population/2 families, because each family produces 2 children

        print()
        print("--> Family: #", family) # tracking purposes


        # selecting 2 parents using tournament selection
        # find_parents_ts returns (parent_1, parent_2); call it once and unpack
        parent_1,parent_2 = find_parents_ts(pool_of_solutions,
                                            x=x,y=y)


        # crossover the 2 parents to get 2 children
        # crossover returns (child_1, child_2); a single call keeps the
        # two children complementary (same crossover points)
        child_1,child_2 = crossover(parent_1,parent_2,
                                    prob_crsvr=prob_crsvr)


        # mutating the 2 children to get 2 mutated children
        # mutation returns (mutated_child_1, mutated_child_2); a single call
        # applies the mutation step to each child exactly once
        mutated_child_1,mutated_child_2 = mutation(child_1,child_2,
                                                   prob_mutation=prob_mutation)


        # getting the obj val (fitness value) for the 2 mutated children
        # objective_value(...)[2] gives the obj val (avg. error) for the mutated child
        obj_val_mutated_child_1 = objective_value(x=x,y=y,
                                                  chromosome=mutated_child_1,
                                                  kfold=kfold)[2]
        obj_val_mutated_child_2 = objective_value(x=x,y=y,
                                                  chromosome=mutated_child_2,
                                                  kfold=kfold)[2]


        # for each mutated child, put its obj val in front of it (index 0)
        mutant_1_with_obj_val = np.hstack((obj_val_mutated_child_1,
                                           mutated_child_1))

        mutant_2_with_obj_val = np.hstack((obj_val_mutated_child_2,
                                           mutated_child_2))


        # we need to create the new population for the next generation
        # each family contributes 2 solutions, which we stack onto new_population
        # until it holds as many solutions as the initial population
        # at the start of the next generation, new_population is reset to empty
        # and the stacking process starts again
        new_population = np.vstack((new_population,
                                    mutated_child_1,
                                    mutated_child_2))


        # same as above, but we include the obj val for each solution as well
        new_population_with_obj_val = np.vstack((new_population_with_obj_val,
                                                 mutant_1_with_obj_val,
                                                 mutant_2_with_obj_val))


        # after getting 2 mutated children (solutions), we get another 2, and so on
        # until we have the same number of the intended population
        # then we go to the next generation and start over
        # since we ended up with 2 solutions, we move on to the next possible solutions
        family = family+1

    # we replace the initial (before) population with the new one (current generation)
    # this new pool of solutions becomes the starting population of the next generation
    pool_of_solutions = new_population


    # for each generation
    # we want to find the best solution in that generation
    # so we sort them based on index [0], which is the obj val
    sorted_best = np.array(sorted(new_population_with_obj_val,
                                               key=lambda x:x[0]))


    # since we sorted them from best to worst
    # the best in that generation would be the first solution in the array
    # so index [0] of the "sorted_best" array
    best_of_a_generation = np.vstack((best_of_a_generation,
                                      sorted_best[0]))


    # increase the counter of generations (tracking purposes)
    gen = gen+1



--> Generation: # 1

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 2

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 3

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 4

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 5

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 6

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 7

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 8

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 9

--> Family: # 1

--> Family: # 2

--> Family: # 3

--> Family: # 4

--> Family: # 5


--> Generation: # 10

--> Family: #

KeyboardInterrupt: 

In [None]:
# for our very last generation, we have the last population
# for this array of last population (convergence), there is a best solution
# so we sort them from best to worst
sorted_last_population = np.array(sorted(new_population_with_obj_val,
                                         key=lambda x:x[0]))

sorted_best_of_a_generation = np.array(sorted(best_of_a_generation,
                                         key=lambda x:x[0]))

sorted_last_population[:,0] = 1-(sorted_last_population[:,0]) # convert the error (1 - R^2) back to the R^2 score
sorted_best_of_a_generation[:,0] = 1-(sorted_best_of_a_generation[:,0])

# since we sorted them from best to worst
# the best would be the first solution in the array
# so index [0] of the "sorted_last_population" array
best_string_convergence = sorted_last_population[0]

best_string_overall = sorted_best_of_a_generation[0]

print()
#print()
#print("Execution Time in Minutes:",(end_time - start_time)/60) # exec. time


print()
print()
print("------------------------------")
print()
#print("Execution Time in Seconds:",end_time - start_time) # exec. time
#print()
print("Final Solution (Convergence):",best_string_convergence[1:]) # final solution entire chromosome
print("Encoded C (Convergence):",best_string_convergence[13:]) # second half of the chromosome encodes C
print("Encoded Gamma (Convergence):",best_string_convergence[1:13]) # first half of the chromosome encodes gamma
print()
print("Final Solution (Best):",best_string_overall[1:]) # final solution entire chromosome
print("Encoded C (Best):",best_string_overall[13:]) # second half of the chromosome encodes C
print("Encoded Gamma (Best):",best_string_overall[1:13]) # first half of the chromosome encodes gamma

# to decode the C and gamma chromosomes to their real values
final_solution_convergence = objective_value(x=x,y=y,
                                             chromosome=best_string_convergence[1:],
                                             kfold=kfold)

final_solution_overall = objective_value(x=x,y=y,
                                         chromosome=best_string_overall[1:],
                                         kfold=kfold)

# the "objective_value" function returns 3 things -->
# [0] is the decoded C
# [1] is the decoded gamma
# [2] is the obj val for the chromosome (avg. error)
print()
print("Decoded C (Convergence):",round(final_solution_convergence[0],5)) # real value of C
print("Decoded Gamma (Convergence):",round(final_solution_convergence[1],5)) # real value of gamma
print("Obj Value - Convergence:",round(1-(final_solution_convergence[2]),5)) # R^2 score of final chromosome
print()
print("Decoded C (Best):",round(final_solution_overall[0],5)) # real value of C
print("Decoded Gamma (Best):",round(final_solution_overall[1],5)) # real value of gamma
print("Obj Value - Best in Generations:",round(1-(final_solution_overall[2]),5)) # R^2 score of best chromosome
print()
print("------------------------------")





------------------------------

Final Solution (Convergence): [0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 1. 1. 0. 1. 1. 0. 1. 0. 0. 1. 1. 0. 1.]
Encoded C (Convergence): [1. 0. 1. 1. 0. 1. 0. 0. 1. 1. 0. 1.]
Encoded Gamma (Convergence): [0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 1.]

Final Solution (Best): [0. 0. 0. 0. 0. 1. 1. 1. 0. 1. 0. 1. 1. 1. 1. 1. 0. 1. 1. 0. 1. 1. 0. 0.]
Encoded C (Best): [1. 1. 1. 1. 0. 1. 1. 0. 1. 1. 0. 0.]
Encoded Gamma (Best): [0. 0. 0. 0. 0. 1. 1. 1. 0. 1. 0. 1.]

Decoded C (Convergence): 709.40659
Decoded Gamma (Convergence): 0.08328
Obj Value - Convergence: 0.91848

Decoded C (Best): 964.46154
Decoded Gamma (Best): 0.07686
Obj Value - Best in Generations: 0.91887

------------------------------
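
The decoded values printed above can be reproduced by hand from the 24-bit chromosome. The sketch below is a sanity check, not the notebook's `objective_value` function: the helper `decode` is hypothetical, and the bounds C in [10, 1000] and gamma in [0.05, 0.99] are assumptions inferred from the printed numbers, since the bounds themselves are not shown in this section. Under those assumptions, the last 12 bits decode to C and the first 12 bits decode to gamma.

```python
def decode(bits, lower, upper):
    """Map a most-significant-bit-first bit list onto the interval [lower, upper]."""
    as_int = int("".join(str(int(b)) for b in bits), 2)
    return lower + as_int * (upper - lower) / (2 ** len(bits) - 1)

# Chromosome printed as "Final Solution (Convergence)" above
convergence = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,
               1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]

# Assumed bounds (inferred from the printed output, not taken from the notebook)
C_BOUNDS = (10.0, 1000.0)
GAMMA_BOUNDS = (0.05, 0.99)

gamma = decode(convergence[:12], *GAMMA_BOUNDS)  # first 12 bits
C = decode(convergence[12:], *C_BOUNDS)          # last 12 bits

print(round(C, 5), round(gamma, 5))  # 709.40659 0.08328
```

Applying the same function to the "Final Solution (Best)" chromosome gives 964.46154 and 0.07686, which also agree with the printed values.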


# **Optimizing the Multilayer Perceptron Neural Network**

In [None]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import random as rd
from sklearn.model_selection import train_test_split, cross_val_score, KFold
from sklearn import preprocessing
from sklearn.neural_network import MLPRegressor

# Loading the data, shuffling and preprocessing it

from ucimlrepo import fetch_ucirepo

energy_efficiency = fetch_ucirepo(id=242)

X = energy_efficiency.data.features
y = energy_efficiency.data.targets
X_DF = pd.DataFrame(X)
y_DF = pd.DataFrame(y)

Data = pd.concat([X_DF, y_DF], axis=1)
Data = Data.sample(frac=1).reset_index(drop=True) # sample() returns a new frame, so assign it to keep the shuffle
X1 = pd.DataFrame(Data, columns = ['X1','X2','X3','X4','X5','X6','X7','X8'])


Y1 = pd.DataFrame(Data, columns = ['Y1']).values
Y2 = pd.DataFrame(Data, columns = ['Y2']).values

Xbef = pd.get_dummies(X1,columns = ['X6','X8']).values

min_max_scaler = preprocessing.MinMaxScaler()
X = min_max_scaler.fit_transform(Xbef)

Y = Y1[:,0]

Cnt1 = len(X)

# 10, 4, 0, 1, 0, 1, 1, 0, 0

### The solver has no crossover because mutation is enough, since it only has two values
### VARIABLES ###
### VARIABLES ###
p_c_con = 1 # Probability of crossover
p_c_comb = 0.3 # Probability of crossover for integers
p_m_con = 0.1 # Probability of mutation
p_m_comb = 0.2 # Probability of mutation for integers
p_m_solver = 0.3 # Probability of mutation for the solver
# Only "adam" is used because "sgd" did not work here.
# Note: MLPRegressor uses `momentum` only when solver="sgd", so it has no effect with "adam".
K = 3 # For Tournament selection
pop = 20 # Population per generation
gen = 10 # Number of generations
ii2 = 3 # Number of folds for K-fold cross-validation
### VARIABLES ###
### VARIABLES ###


### Combinatorial ###
UB_X1 = 10 # X1, Number of Neurons
LB_X1 = 6
UB_X2 = 8 # X2, Number of Hidden Layers
LB_X2 = 3


In [None]:
# The chromosome is decoded from the right: the last 15 bits encode X3 (learning rate)
# and the first 15 bits encode X4 (momentum)
XY0 = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1,
                1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1]) # Initial solution

Init_Sol = XY0.copy()

n_list = np.empty((0,len(XY0)+2))
n_list_ST = np.empty((0,len(XY0)+2))
Sol_Here = np.empty((0,len(XY0)+2))
Sol_Here_ST = np.empty((0,1))

Solver_Type = ['adam']

In [None]:
for i in range(pop): # Shuffles the elements in the vector n times and stores them
    ST = rd.choice(Solver_Type)
    X1 = rd.randrange(6,10,2)
    X2 = rd.randrange(3,8,1)
    rd.shuffle(XY0)
    Sol_Here = np.append((X1,X2),XY0)
    n_list_ST = np.append(n_list_ST,ST)
    n_list = np.vstack((n_list,Sol_Here))
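
Python's `randrange(start, stop, step)` excludes `stop`, so `randrange(6,10,2)` can only return 6 or 8 and `randrange(3,8,1)` stops at 7; the upper bounds 10 and 8 are then reachable only through mutation. If the initial population should cover the full range, a minimal standard-library sketch could look like the following. The helper name `random_individual` is hypothetical, and the bounds and template are copied from the cells above.

```python
import random

# Bounds copied from the variables cell above
LB_X1, UB_X1 = 6, 10   # neurons per hidden layer
LB_X2, UB_X2 = 3, 8    # number of hidden layers

def random_individual(template_bits, rng):
    """Return [neurons, layers, *bits] using a shuffled copy of the template bits."""
    neurons = rng.randrange(LB_X1, UB_X1 + 1, 2)   # 6, 8 or 10
    layers = rng.randrange(LB_X2, UB_X2 + 1)       # 3 .. 8 inclusive
    bits = list(template_bits)
    rng.shuffle(bits)                              # same shuffle idea as the loop above
    return [neurons, layers] + bits

rng = random.Random(0)
template = [1, 1, 1, 1, 0, 0, 0, 0] * 3 + [1, 0, 1, 1, 0, 1]   # same 30 bits as XY0
population = [random_individual(template, rng) for _ in range(20)]

print(list(range(6, 10, 2)))  # [6, 8]: the stop value 10 is excluded
```

Shuffling a fixed template keeps the number of ones constant in every initial individual; only crossover and mutation change it later.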


In [None]:
n_list_ST

array(['adam', 'adam', 'adam', 'adam', 'adam', 'adam', 'adam', 'adam',
       'adam', 'adam', 'adam', 'adam', 'adam', 'adam', 'adam', 'adam',
       'adam', 'adam', 'adam', 'adam'], dtype='<U32')

In [None]:
# Calculating fitness value

# X3 = Learning Rate
a_X3 = 0.01 # Lower bound of X3
b_X3 = 0.3 # Upper bound of X3
l_X3 = (len(XY0)/2) # Number of bits encoding X3

# X4 = Momentum
a_X4 = 0.01 # Lower bound of X4
b_X4 = 0.99 # Upper bound of X4
l_X4 = (len(XY0)/2) # Number of bits encoding X4


Precision_X = (b_X3 - a_X3)/((2**l_X3)-1)

Precision_Y = (b_X4 - a_X4)/((2**l_X4)-1)


In [None]:
z = 0
t = 1
X0_num_Sum = 0

for i in range(len(XY0)//2):
    X0_num = XY0[-t]*(2**z)
    X0_num_Sum += X0_num
    t = t+1
    z = z+1


p = 0
u = 1 + (len(XY0)//2)
Y0_num_Sum = 0

for j in range(len(XY0)//2):
    Y0_num = XY0[-u]*(2**p)
    Y0_num_Sum += Y0_num
    u = u+1
    p = p+1


Decoded_X3 = (X0_num_Sum * Precision_X) + a_X3
Decoded_X4 = (Y0_num_Sum * Precision_Y) + a_X4


print()
print("Decoded_X3:",Decoded_X3)
print("Decoded_X4:",Decoded_X4)



Decoded_X3: 0.1390560625019074
Decoded_X4: 0.7091326639606922
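
The two loops above implement the usual fixed-point decoding of an $l$-bit segment onto an interval $[a, b]$. Written out, with $s_i$ denoting the bit at position $i$ counted from the right end of the segment (notation introduced here for illustration):

$$
x = a + \frac{b - a}{2^{l} - 1} \sum_{i=0}^{l-1} s_i \, 2^{i}
$$

With $l = 15$, the learning rate ($a = 0.01$, $b = 0.3$) is resolved in steps of $0.29 / 32767 \approx 8.85 \times 10^{-6}$ and the momentum ($a = 0.01$, $b = 0.99$) in steps of $0.98 / 32767 \approx 2.99 \times 10^{-5}$. An all-zero segment decodes to $a$ and an all-one segment decodes to $b$.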


In [None]:
For_Plotting_the_Best = np.empty((0,len(Sol_Here)+1))

One_Final_Guy = np.empty((0,len(Sol_Here)+2))
One_Final_Guy_Final = []

Min_for_all_Generations_for_Mut_1 = np.empty((0,len(Sol_Here)+1))
Min_for_all_Generations_for_Mut_2 = np.empty((0,len(Sol_Here)+1))

Min_for_all_Generations_for_Mut_1_1 = np.empty((0,len(Sol_Here)+2))
Min_for_all_Generations_for_Mut_2_2 = np.empty((0,len(Sol_Here)+2))

Min_for_all_Generations_for_Mut_1_1_1 = np.empty((0,len(Sol_Here)+2))
Min_for_all_Generations_for_Mut_2_2_2 = np.empty((0,len(Sol_Here)+2))
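
The main loop in the next cell combines three operators: tournament selection among three distinct individuals, two-point crossover on the bit segment, and bit-flip mutation. The sketch below isolates them as small standard-library functions so each can be checked on its own. It is an illustration under simplifying assumptions, not the notebook's code: the function names are hypothetical, fitness values are precomputed rather than obtained by training a model per contestant, and the integer genes (neurons, layers) are left out.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Return the individual with the lowest fitness among k distinct random picks."""
    contestants = rng.sample(range(len(population)), k)
    winner = min(contestants, key=lambda idx: fitness[idx])
    return population[winner]

def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two distinct cut points (inclusive) of two bit lists."""
    lo, hi = sorted(rng.sample(range(len(parent_a)), 2))
    child_a = parent_a[:lo] + parent_b[lo:hi + 1] + parent_a[hi + 1:]
    child_b = parent_b[:lo] + parent_a[lo:hi + 1] + parent_b[hi + 1:]
    return child_a, child_b

def bit_flip_mutation(bits, p_mutate, rng):
    """Flip each bit independently with probability p_mutate."""
    return [1 - b if rng.random() < p_mutate else b for b in bits]

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(30)] for _ in range(20)]
fitness = [sum(ind) for ind in population]   # toy objective: fewer ones is better

parent_a = tournament_select(population, fitness, 3, rng)
parent_b = tournament_select(population, fitness, 3, rng)
child_a, child_b = two_point_crossover(parent_a, parent_b, rng)
mutant = bit_flip_mutation(child_a, 0.1, rng)
```

Sampling the contestants without replacement guarantees they are distinct, and the mutation visits every bit position exactly once, whether or not the previous bit was flipped.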



In [None]:
Generation = 1

for i in range(gen):
    New_Population = np.empty((0,len(Sol_Here))) # Saving the new generation

    All_in_Generation_X_1 = np.empty((0,len(Sol_Here)+1))
    All_in_Generation_X_2 = np.empty((0,len(Sol_Here)+1))

    Min_in_Generation_X_1 = []
    Min_in_Generation_X_2 = []

    Save_Best_in_Generation_X = np.empty((0,len(Sol_Here)+1))
    Final_Best_in_Generation_X = []
    Worst_Best_in_Generation_X = []

    print()
    print("--> GENERATION: #",Generation)

    Family = 1
    for j in range(int(pop/2)): # range(int(pop/2))
        print()
        print("--> FAMILY: #",Family)
        # Tournament Selection
        Parents = np.empty((0,len(Sol_Here)))

        for i in range(2):
            Battle_Troops = []
            # Draw three distinct contestants for the tournament
            Warrior_1_index, Warrior_2_index, Warrior_3_index = np.random.choice(
                len(n_list), size=3, replace=False)

            Warrior_1 = n_list[Warrior_1_index,:]
            Warrior_2 = n_list[Warrior_2_index,:]
            Warrior_3 = n_list[Warrior_3_index,:]

            Battle_Troops = [Warrior_1,Warrior_2,Warrior_3]

            # For Warrior #1
            W1_Comb_1 = Warrior_1[0]
            W1_Comb_1 = int(W1_Comb_1)
            W1_Comb_2 = Warrior_1[1]
            W1_Comb_2 = int(W1_Comb_2)

            W1_Con = Warrior_1[2:]

            X0_num_Sum_W1 = 0
            Y0_num_Sum_W1 = 0

            z = 0
            t = 1
            OF_So_Far_W1 = 0

            for i in range(len(XY0)//2):
                X0_num_W1 = W1_Con[-t]*(2**z)
                X0_num_Sum_W1 += X0_num_W1
                t = t+1
                z = z+1

            p = 0
            u = 1 + (len(XY0)//2)

            for j in range(len(XY0)//2):
                Y0_num_W1 = W1_Con[-u]*(2**p)
                Y0_num_Sum_W1 += Y0_num_W1
                u = u+1
                p = p+1


            Decoded_X3_W1 = (X0_num_Sum_W1 * Precision_X) + a_X3
            Decoded_X4_W1 = (Y0_num_Sum_W1 * Precision_Y) + a_X4

            Emp_3 = 0

            kf = KFold(n_splits=ii2, shuffle=True, random_state=42)

            for train_index, test_index in kf.split(X):
                X_train, X_test = X[train_index], X[test_index]
                Y_train, Y_test = Y[train_index], Y[test_index]

                Hid_Lay = ()

                # Objective Function
                for i in range(W1_Comb_2):
                    Hid_Lay = Hid_Lay + (W1_Comb_1,)

                model1 = MLPRegressor(activation='relu',hidden_layer_sizes=Hid_Lay,
                                       learning_rate_init=Decoded_X3_W1,momentum=Decoded_X4_W1)

                model1.fit(X_train, Y_train)
                PL1=model1.predict(X_test)
                AC1=model1.score(X_test,Y_test)

                OF_So_Far_3 = 1-(model1.score(X_test,Y_test))


                Emp_3 += OF_So_Far_3

            OF_So_Far_W1 = Emp_3/ii2
            Prize_Warrior_1 = OF_So_Far_W1

            # For Warrior #2

            W2_Comb_1 = Warrior_2[0]
            W2_Comb_1 = int(W2_Comb_1)
            W2_Comb_2 = Warrior_2[1]
            W2_Comb_2 = int(W2_Comb_2)
            W2_Con = Warrior_2[2:]
            X0_num_Sum_W2 = 0
            Y0_num_Sum_W2 = 0

            z = 0
            t = 1
            OF_So_Far_W2 = 0

            for i in range(len(XY0)//2):
                X0_num_W2 = W2_Con[-t]*(2**z)
                X0_num_Sum_W2 += X0_num_W2
                t = t+1
                z = z+1
            p = 0
            u = 1 + (len(XY0)//2)

            for j in range(len(XY0)//2):
                Y0_num_W2 = W2_Con[-u]*(2**p)
                Y0_num_Sum_W2 += Y0_num_W2
                u = u+1
                p = p+1
            Decoded_X3_W2 = (X0_num_Sum_W2 * Precision_X) + a_X3
            Decoded_X4_W2 = (Y0_num_Sum_W2 * Precision_Y) + a_X4
            Emp_4 = 0

            kf = KFold(n_splits=ii2, shuffle=True, random_state=42)
            for train_index, test_index in kf.split(X):
                X_train, X_test = X[train_index], X[test_index]
                Y_train, Y_test = Y[train_index], Y[test_index]

                Hid_Lay = ()

                # Objective Function
                for i in range(W2_Comb_2):
                    Hid_Lay = Hid_Lay + (W2_Comb_1,)

                model1 = MLPRegressor(activation='relu',hidden_layer_sizes=Hid_Lay,
                                       learning_rate_init=Decoded_X3_W2,momentum=Decoded_X4_W2)

                model1.fit(X_train, Y_train)
                PL1=model1.predict(X_test)
                AC1=model1.score(X_test,Y_test)

                OF_So_Far_4 = 1-(model1.score(X_test,Y_test))

                Emp_4 += OF_So_Far_4

            OF_So_Far_W2 = Emp_4/ii2
            Prize_Warrior_2 = OF_So_Far_W2

            # For Warrior #3
            W3_Comb_1 = Warrior_3[0]
            W3_Comb_1 = int(W3_Comb_1)
            W3_Comb_2 = Warrior_3[1]
            W3_Comb_2 = int(W3_Comb_2)
            W3_Con = Warrior_3[2:]
            X0_num_Sum_W3 = 0
            Y0_num_Sum_W3 = 0

            z = 0
            t = 1
            OF_So_Far_W3 = 0

            for i in range(len(XY0)//2):
                X0_num_W3 = W3_Con[-t]*(2**z)
                X0_num_Sum_W3 += X0_num_W3
                t = t+1
                z = z+1
            p = 0
            u = 1 + (len(XY0)//2)

            for j in range(len(XY0)//2):
                Y0_num_W3 = W3_Con[-u]*(2**p)
                Y0_num_Sum_W3 += Y0_num_W3
                u = u+1
                p = p+1
            Decoded_X3_W3 = (X0_num_Sum_W3 * Precision_X) + a_X3
            Decoded_X4_W3 = (Y0_num_Sum_W3 * Precision_Y) + a_X4
            Emp_5 = 0
            kf = KFold(n_splits=ii2, shuffle=True, random_state=42)
            for train_index, test_index in kf.split(X):
                X_train, X_test = X[train_index], X[test_index]
                Y_train, Y_test = Y[train_index], Y[test_index]
                Hid_Lay = ()
                # Objective Function
                for i in range(W3_Comb_2):
                    Hid_Lay = Hid_Lay + (W3_Comb_1,)

                model1 = MLPRegressor(activation='relu',hidden_layer_sizes=Hid_Lay,
                                       learning_rate_init=Decoded_X3_W3,momentum=Decoded_X4_W3)
                model1.fit(X_train, Y_train)
                PL1=model1.predict(X_test)
                AC1=model1.score(X_test,Y_test)

                OF_So_Far_5 = 1-(model1.score(X_test,Y_test))
                Emp_5 += OF_So_Far_5

            OF_So_Far_W3 = Emp_5/ii2
            Prize_Warrior_3 = OF_So_Far_W3

            if Prize_Warrior_1 == min(Prize_Warrior_1,Prize_Warrior_2,Prize_Warrior_3):
                Winner = Warrior_1
                Winner_str = "Warrior_1"
                Prize = Prize_Warrior_1
            elif Prize_Warrior_2 == min(Prize_Warrior_1,Prize_Warrior_2,Prize_Warrior_3):
                Winner = Warrior_2
                Winner_str = "Warrior_2"
                Prize = Prize_Warrior_2
            else:
                Winner = Warrior_3
                Winner_str = "Warrior_3"
                Prize = Prize_Warrior_3
            Parents = np.vstack((Parents,Winner))
        Parent_1 = Parents[0]
        Parent_2 = Parents[1]

        # Crossover
        Child_1 = np.empty((0,len(Sol_Here)))
        Child_2 = np.empty((0,len(Sol_Here)))

        # Crossover the Integers
        # For X1
        Ran_CO_1 = np.random.rand()
        if Ran_CO_1 < p_c_comb:
            # For X1
            Int_X1_1 = Parent_2[0]
            Int_X1_2 = Parent_1[0]
        else:
            # For X1
            Int_X1_1 = Parent_1[0]
            Int_X1_2 = Parent_2[0]
        # For X2
        Ran_CO_1 = np.random.rand()
        if Ran_CO_1 < p_c_comb:
            Int_X2_1 = Parent_2[1]
            Int_X2_2 = Parent_1[1]
        else:
            Int_X2_1 = Parent_1[1]
            Int_X2_2 = Parent_2[1]

        # Where to crossover
        # Two-point crossover
        Ran_CO_1 = np.random.rand()
        if Ran_CO_1 < p_c_con:

            Cr_1 = np.random.randint(2,len(Sol_Here))
            Cr_2 = np.random.randint(2,len(Sol_Here))

            while Cr_1 == Cr_2:
                Cr_2 = np.random.randint(2,len(Sol_Here))
            if Cr_1 < Cr_2:
                Cr_2 = Cr_2 + 1

                Copy_1 = Parent_1[2:]
                Mid_Seg_1 = Parent_1[Cr_1:Cr_2]

                Copy_2 = Parent_2[2:]
                Mid_Seg_2 = Parent_2[Cr_1:Cr_2]

                First_Seg_1 = Parent_1[2:Cr_1]
                Second_Seg_1 = Parent_1[Cr_2:]

                First_Seg_2 = Parent_2[2:Cr_1]
                Second_Seg_2 = Parent_2[Cr_2:]

                Child_1 = np.concatenate((First_Seg_1,Mid_Seg_2,Second_Seg_1))
                Child_2 = np.concatenate((First_Seg_2,Mid_Seg_1,Second_Seg_2))

                Child_1 = np.insert(Child_1,0,(Int_X1_1,Int_X2_1))###
                Child_2 = np.insert(Child_2,0,(Int_X1_2,Int_X2_2))
            else:
                Cr_1 = Cr_1 + 1

                Copy_1 = Parent_1[2:]
                Mid_Seg_1 = Parent_1[Cr_2:Cr_1]

                Copy_2 = Parent_2[2:]
                Mid_Seg_2 = Parent_2[Cr_2:Cr_1]

                First_Seg_1 = Parent_1[2:Cr_2]
                Second_Seg_1 = Parent_1[Cr_1:]

                First_Seg_2 = Parent_2[2:Cr_2]
                Second_Seg_2 = Parent_2[Cr_1:]

                Child_1 = np.concatenate((First_Seg_1,Mid_Seg_2,Second_Seg_1))
                Child_2 = np.concatenate((First_Seg_2,Mid_Seg_1,Second_Seg_2))
                Child_1 = np.insert(Child_1,0,(Int_X1_1,Int_X2_1))
                Child_2 = np.insert(Child_2,0,(Int_X1_2,Int_X2_2))
        else:
            Child_1 = Parent_1[2:]
            Child_2 = Parent_2[2:]
            Child_1 = np.insert(Child_1,0,(Int_X1_1,Int_X2_1))
            Child_2 = np.insert(Child_2,0,(Int_X1_2,Int_X2_2))
        # Mutation Child #1
        Mutated_Child_1 = []

        # For X1
        Ran_M1_1 = np.random.rand()
        if Ran_M1_1 < p_m_comb:
            Ran_M1_2 = np.random.rand()
            if Ran_M1_2 >= 0.5:
                if Child_1[0] == UB_X1:
                    C_X1_M1 = Child_1[0]
                elif Child_1[0] == LB_X1:
                    C_X1_M1 = Child_1[0]
                else:
                    C_X1_M1 = Child_1[0] + 2
            else:
                if Child_1[0] == UB_X1:
                    C_X1_M1 = Child_1[0]
                elif Child_1[0] == LB_X1:
                    C_X1_M1 = Child_1[0]
                else:
                    C_X1_M1 = Child_1[0] - 2
        else:
            C_X1_M1 = Child_1[0]

        # For X2
        Ran_M1_3 = np.random.rand()
        if Ran_M1_3 < p_m_comb:
            Ran_M1_4 = np.random.rand()
            if Ran_M1_4 >= 0.5:
                if Child_1[1] == UB_X2:
                    C_X2_M1 = Child_1[1]
                elif Child_1[1] == LB_X2:
                    C_X2_M1 = Child_1[1]
                else:
                    C_X2_M1 = Child_1[1] + 1
            else:
                if Child_1[1] == UB_X2:
                    C_X2_M1 = Child_1[1]
                elif Child_1[1] == LB_X2:
                    C_X2_M1 = Child_1[1]
                else:
                    C_X2_M1 = Child_1[1] - 1
        else:
            C_X2_M1 = Child_1[1]

        t = 0
        Child_1n = Child_1[2:]
        for i in Child_1n:
            Ran_Mut_1 = np.random.rand() # Probability to mutate
            if Ran_Mut_1 < p_m_con: # If the random draw is below p_m_con, flip this bit
                if Child_1n[t] == 0:
                    Child_1n[t] = 1
                else:
                    Child_1n[t] = 0
            t = t+1 # Advance to the next bit whether or not this one mutated
        Mutated_Child_1n = Child_1n

        Mutated_Child_1 = np.insert(Mutated_Child_1n,0,(C_X1_M1,C_X2_M1))
        # Mutation Child #2
        Mutated_Child_2 = []

        # For X1
        Ran_M2_1 = np.random.rand()
        if Ran_M2_1 < p_m_comb:
            Ran_M2_2 = np.random.rand()
            if Ran_M2_2 >= 0.5:
                if Child_2[0] == UB_X1:
                    C_X1_M2 = Child_2[0]
                elif Child_2[0] == LB_X1:
                    C_X1_M2 = Child_2[0]
                else:
                    C_X1_M2 = Child_2[0] + 2
            else:
                if Child_2[0] == UB_X1:
                    C_X1_M2 = Child_2[0]
                elif Child_2[0] == LB_X1:
                    C_X1_M2 = Child_2[0]
                else:
                    C_X1_M2 = Child_2[0] - 2
        else:
            C_X1_M2 = Child_2[0]
        # For X2
        Ran_M2_3 = np.random.rand()
        if Ran_M2_3 < p_m_comb:
            Ran_M2_4 = np.random.rand()
            if Ran_M2_4 >= 0.5:
                if Child_2[1] == UB_X2:
                    C_X2_M2 = Child_2[1]
                elif Child_2[1] == LB_X2:
                    C_X2_M2 = Child_2[1]
                else:
                    C_X2_M2 = Child_2[1] + 1
            else:
                if Child_2[1] == UB_X2:
                    C_X2_M2 = Child_2[1]
                elif Child_2[1] == LB_X2:
                    C_X2_M2 = Child_2[1]
                else:
                    C_X2_M2 = Child_2[1] - 1
        else:
            C_X2_M2 = Child_2[1]

        t = 0
        Child_2n = Child_2[2:]
        for i in Child_2n:
            Ran_Mut_2 = np.random.rand() # Probability to mutate
            if Ran_Mut_2 < p_m_con: # If the random draw is below p_m_con, flip this bit
                if Child_2n[t] == 0:
                    Child_2n[t] = 1
                else:
                    Child_2n[t] = 0
            t = t+1 # Advance to the next bit whether or not this one mutated
        Mutated_Child_2n = Child_2n

        Mutated_Child_2 = np.insert(Mutated_Child_2n,0,(C_X1_M2,C_X2_M2))
        # Calculate fitness values of mutated children
        fit_val_muted_children = np.empty((0,2))

        # For mutated child #1
        MC_1_Comb_1 = Mutated_Child_1[0]
        MC_1_Comb_1 = int(MC_1_Comb_1)
        MC_1_Comb_2 = Mutated_Child_1[1]
        MC_1_Comb_2 = int(MC_1_Comb_2)

        MC_1_Con = Mutated_Child_1[2:]

        X0_num_Sum_MC_1 = 0
        Y0_num_Sum_MC_1 = 0

        z = 0
        t = 1
        OF_So_Far_MC_1 = 0

        for i in range(len(XY0)//2):
            X0_num_MC_1 = MC_1_Con[-t]*(2**z)
            X0_num_Sum_MC_1 += X0_num_MC_1
            t = t+1
            z = z+1

        p = 0
        u = 1 + (len(XY0)//2)

        for j in range(len(XY0)//2):
            Y0_num_MC_1 = MC_1_Con[-u]*(2**p)
            Y0_num_Sum_MC_1 += Y0_num_MC_1
            u = u+1
            p = p+1

        Decoded_X3_MC_1 = (X0_num_Sum_MC_1 * Precision_X) + a_X3
        Decoded_X4_MC_1 = (Y0_num_Sum_MC_1 * Precision_Y) + a_X4

        Emp_6 = 0

        kf = KFold(n_splits=ii2, shuffle=True, random_state=42)
        for train_index, test_index in kf.split(X):
            X_train, X_test = X[train_index], X[test_index]
            Y_train, Y_test = Y[train_index], Y[test_index]

            Hid_Lay = ()
            # Objective Function

            for i in range(MC_1_Comb_2):
                Hid_Lay = Hid_Lay + (MC_1_Comb_1,)

            model1 = MLPRegressor(activation='relu',hidden_layer_sizes=Hid_Lay,
                                   learning_rate_init=Decoded_X3_MC_1,momentum=Decoded_X4_MC_1)
            model1.fit(X_train, Y_train)
            PL1=model1.predict(X_test)
            AC1=model1.score(X_test,Y_test)
            OF_So_Far_6 = 1-(model1.score(X_test,Y_test))
            Emp_6 += OF_So_Far_6

        OF_So_Far_MC_1 = Emp_6/ii2

        # For mutated child #2
        MC_2_Comb_1 = Mutated_Child_2[0]
        MC_2_Comb_1 = int(MC_2_Comb_1)
        MC_2_Comb_2 = Mutated_Child_2[1]
        MC_2_Comb_2 = int(MC_2_Comb_2)

        MC_2_Con = Mutated_Child_2[2:]

        X0_num_Sum_MC_2 = 0
        Y0_num_Sum_MC_2 = 0

        z = 0
        t = 1
        OF_So_Far_MC_2 = 0

        for i in range(len(XY0)//2):
            X0_num_MC_2 = MC_2_Con[-t]*(2**z)
            X0_num_Sum_MC_2 += X0_num_MC_2
            t = t+1
            z = z+1
        p = 0
        u = 1 + (len(XY0)//2)

        for j in range(len(XY0)//2):
            Y0_num_MC_2 = MC_2_Con[-u]*(2**p)
            Y0_num_Sum_MC_2 += Y0_num_MC_2
            u = u+1
            p = p+1
        Decoded_X3_MC_2 = (X0_num_Sum_MC_2 * Precision_X) + a_X3
        Decoded_X4_MC_2 = (Y0_num_Sum_MC_2 * Precision_Y) + a_X4
        Emp_7 = 0
        kf = KFold(n_splits=ii2, shuffle=True, random_state=42)

        for train_index, test_index in kf.split(X):
            X_train, X_test = X[train_index], X[test_index]
            Y_train, Y_test = Y[train_index], Y[test_index]
            Hid_Lay = ()

            # Objective Function
            for i in range(MC_2_Comb_2):
                Hid_Lay = Hid_Lay + (MC_2_Comb_1,)

            model1 = MLPRegressor(activation='relu',hidden_layer_sizes=Hid_Lay,
                                   learning_rate_init=Decoded_X3_MC_2,momentum=Decoded_X4_MC_2)
            model1.fit(X_train, Y_train)
            PL1=model1.predict(X_test)
            AC1=model1.score(X_test,Y_test)
            OF_So_Far_7 = 1-(model1.score(X_test,Y_test))
            Emp_7 += OF_So_Far_7
        OF_So_Far_MC_2 = Emp_7/ii2
        All_in_Generation_X_1_1_temp = Mutated_Child_1[np.newaxis]
        All_in_Generation_X_1_1 = np.column_stack((OF_So_Far_MC_1, All_in_Generation_X_1_1_temp))

        All_in_Generation_X_2_1_temp = Mutated_Child_2[np.newaxis]
        All_in_Generation_X_2_1 = np.column_stack((OF_So_Far_MC_2, All_in_Generation_X_2_1_temp))

        All_in_Generation_X_1 = np.vstack((All_in_Generation_X_1,All_in_Generation_X_1_1))
        All_in_Generation_X_2 = np.vstack((All_in_Generation_X_2,All_in_Generation_X_2_1))

        Save_Best_in_Generation_X = np.vstack((All_in_Generation_X_1,All_in_Generation_X_2))

        New_Population = np.vstack((New_Population,Mutated_Child_1,Mutated_Child_2))

        t = 0
        R_1 = []
        for i in All_in_Generation_X_1:

            if (All_in_Generation_X_1[t,:1]) <= min(All_in_Generation_X_1[:,:1]):
                R_1 = All_in_Generation_X_1[t,:]
            t = t+1

        Min_in_Generation_X_1 = R_1[np.newaxis]
        t = 0
        R_2 = []
        for i in All_in_Generation_X_2:

            if (All_in_Generation_X_2[t,:1]) <= min(All_in_Generation_X_2[:,:1]):
                R_2 = All_in_Generation_X_2[t,:]
            t = t+1

        Min_in_Generation_X_2 = R_2[np.newaxis]
        Family = Family+1
    t = 0
    R_Final = []
    for i in Save_Best_in_Generation_X:

        if (Save_Best_in_Generation_X[t,:1]) <= min(Save_Best_in_Generation_X[:,:1]):
            R_Final = Save_Best_in_Generation_X[t,:]
        t = t+1

    Final_Best_in_Generation_X = R_Final[np.newaxis]
    t = 0
    R_22_Final = []
    for i in Save_Best_in_Generation_X:

        if (Save_Best_in_Generation_X[t,:1]) >= max(Save_Best_in_Generation_X[:,:1]):
            R_22_Final = Save_Best_in_Generation_X[t,:]
        t = t+1

    Worst_Best_in_Generation_X = R_22_Final[np.newaxis]
    # Elitism, the best in the generation lives
    Darwin_Guy = Final_Best_in_Generation_X[:]
    Not_So_Darwin_Guy = Worst_Best_in_Generation_X[:]

    Darwin_Guy = Darwin_Guy[0:,1:].tolist()
    Not_So_Darwin_Guy = Not_So_Darwin_Guy[0:,1:].tolist()
    Best_1 = np.where((New_Population == Darwin_Guy).all(axis=1))
    Worst_1 = np.where((New_Population == Not_So_Darwin_Guy).all(axis=1))
    New_Population[Worst_1] = Darwin_Guy
    n_list = New_Population
    Min_for_all_Generations_for_Mut_1 = np.vstack((Min_for_all_Generations_for_Mut_1,Min_in_Generation_X_1))
    Min_for_all_Generations_for_Mut_2 = np.vstack((Min_for_all_Generations_for_Mut_2,Min_in_Generation_X_2))

    Min_for_all_Generations_for_Mut_1_1 = np.insert(Min_in_Generation_X_1, 0, Generation)
    Min_for_all_Generations_for_Mut_2_2 = np.insert(Min_in_Generation_X_2, 0, Generation)
    Min_for_all_Generations_for_Mut_1_1_1 = np.vstack((Min_for_all_Generations_for_Mut_1_1_1,Min_for_all_Generations_for_Mut_1_1))
    Min_for_all_Generations_for_Mut_2_2_2 = np.vstack((Min_for_all_Generations_for_Mut_2_2_2,Min_for_all_Generations_for_Mut_2_2))


    Generation = Generation+1

One_Final_Guy = np.vstack((Min_for_all_Generations_for_Mut_1_1_1,Min_for_all_Generations_for_Mut_2_2_2))

t = 0
Final_Here = []
for i in One_Final_Guy:

    if (One_Final_Guy[t,1]) <= min(One_Final_Guy[:,1]):
        Final_Here = One_Final_Guy[t,:]
    t = t+1

One_Final_Guy_Final = Final_Here[np.newaxis]
XY0_Encoded_After = Final_Here[4:]

# DECODING
z = 0
t = 1
X0_num_Sum_Encoded_After = 0

for i in range(len(XY0)//2):
    X0_num_Encoded_After = XY0_Encoded_After[-t]*(2**z)
    X0_num_Sum_Encoded_After += X0_num_Encoded_After
    t = t+1
    z = z+1
p = 0
u = 1 + (len(XY0)//2)
Y0_num_Sum_Encoded_After = 0

for j in range(len(XY0)//2):
    Y0_num_Encoded_After = XY0_Encoded_After[-u]*(2**p)
    Y0_num_Sum_Encoded_After += Y0_num_Encoded_After
    u = u+1
    p = p+1
Decoded_X_After = (X0_num_Sum_Encoded_After * Precision_X) + a_X3
Decoded_Y_After = (Y0_num_Sum_Encoded_After * Precision_Y) + a_X4

print()
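# MLPRegressor.score returns R^2, so the value labelled "accuracy" here is the mean cross-validated R^2
print("The High Accuracy is:",(1-One_Final_Guy_Final[:,1]))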
print("Number of Neurons:",Final_Here[2])
print("Number of Hidden Layers:",Final_Here[3])
print("Learning Rate:",Decoded_X_After)
print("Momentum:",Decoded_Y_After)



--> GENERATION: # 1

--> FAMILY: # 1

--> FAMILY: # 2

--> FAMILY: # 3

--> FAMILY: # 4





--> FAMILY: # 5

--> FAMILY: # 6

--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10

--> GENERATION: # 2

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5

--> FAMILY: # 6





--> FAMILY: # 7

--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 3

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 4

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 5

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 6

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 7

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 8

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 9

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10





--> GENERATION: # 10

--> FAMILY: # 1





--> FAMILY: # 2





--> FAMILY: # 3





--> FAMILY: # 4





--> FAMILY: # 5





--> FAMILY: # 6





--> FAMILY: # 7





--> FAMILY: # 8





--> FAMILY: # 9





--> FAMILY: # 10






The High Accuracy is: [0.94011569]
Number of Neurons: 10.0
Number of Hidden Layers: 4.0
Learning Rate: 0.012327646717734307
Momentum: 0.6947468489638966




* Best **score** (printed as "accuracy"; `MLPRegressor.score` returns R²): 0.94011569
* Number of **Neurons** per hidden layer: 10
* Number of **Hidden Layers**: 4
* **Learning Rate**: 0.012327646717734307
* **Momentum**: 0.6947468489638966
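
To reuse the result, the four values have to be turned back into `MLPRegressor` arguments the same way the loop builds them: the neuron count repeated once per hidden layer. A minimal sketch follows; the helper name `mlp_params` is hypothetical, while the keyword names are the ones already used in the loop above.

```python
def mlp_params(neurons, layers, learning_rate, momentum):
    """Build keyword arguments in the form the loop above passes to MLPRegressor."""
    return {
        "activation": "relu",
        "hidden_layer_sizes": (int(neurons),) * int(layers),
        "learning_rate_init": learning_rate,
        "momentum": momentum,  # used by scikit-learn only when solver="sgd"
    }

best = mlp_params(10.0, 4.0, 0.012327646717734307, 0.6947468489638966)
print(best["hidden_layer_sizes"])  # (10, 10, 10, 10)
```

The dictionary can then be passed as `MLPRegressor(**best)`. Because the notebook trains with the default `"adam"` solver, the momentum entry does not influence training in that configuration.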

