AutoMLP-GA automates the optimization of Multi-Layer Perceptron (MLP) topology using genetic algorithms in PyTorch, targeting classification on the MNIST dataset.

AutoMLP-GA evolves MLP architectures by tuning the following parameters:
- Number of Neurons: Adjusts the number of neurons in each layer.
- Number of Layers: Determines the depth of the network by varying the number of hidden layers.
- Activation Functions: Experiments with different activation functions (e.g., ReLU, Tanh, Sigmoid, LeakyReLU) for each layer.
- Initial Learning Rate: Optimizes the initial learning rate for the training algorithm.
- Optimizer Type: Selects the optimizer, such as Adam, SGD, AdamW, or RMSprop.
- Learning Rate Decay: Selects the learning-rate schedule, such as LinearLR, CosineAnnealingLR, or ExponentialLR.
- Batch Size: Determines the batch size for training the network (32, 64, 128, 256).
- Dropout: Determines the dropout rate for the network (0.0, 0.1, 0.2, 0.3, 0.4).
These parameters are crucial in defining the structure and learning capability of the MLPs. By evolving these aspects, AutoMLP-GA aims to discover the most effective network configurations for specific datasets and tasks.
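The tuned parameters above form each network's "genome". As a rough illustration, such a genome can be represented as a dictionary sampled from a fixed search space; the names and value lists below mirror the parameter list above but are illustrative, and the actual choices in the repository may differ.

```python
import random

# Hypothetical search space; value lists follow the parameter list above,
# but the exact names and ranges in AutoMLP-GA's code may differ.
PARAM_CHOICES = {
    "nb_neurons": [64, 128, 256, 512],
    "nb_layers": [1, 2, 3, 4],
    "activation": ["relu", "tanh", "sigmoid", "leaky_relu"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "optimizer": ["adam", "sgd", "adamw", "rmsprop"],
    "lr_decay": ["linear", "cosine", "exponential"],
    "batch_size": [32, 64, 128, 256],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4],
}

def random_genome():
    """Sample one random MLP configuration from the search space."""
    return {key: random.choice(values) for key, values in PARAM_CHOICES.items()}
```

The genetic algorithm then searches this discrete space instead of training every possible combination exhaustively.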
To set up AutoMLP-GA, follow these steps:
- Clone the Repository:

  ```bash
  git clone https://github.com/yourusername/AutoMLP-GA.git
  cd AutoMLP-GA
  ```

- Install Python: Ensure that Python (version 3.9 or higher) is installed on your system. You can download it from python.org.

- Install Dependencies: AutoMLP-GA requires PyTorch and a few other libraries. Install them using requirements.txt:

  ```bash
  pip install -r requirements.txt
  ```
Using AutoMLP-GA involves running the `main.py` script, which orchestrates the process of evolving neural networks with genetic algorithms. Here's how to use it:

- Configure Parameters: Before running `main.py`, you may want to configure certain parameters, such as the batch size (fixed at 128) or the number of generations and population size, which can be set via prompt.
- Run the Evolution Process: Execute the main script to start the evolution:

  ```bash
  python main.py --gen 10 --pop 20
  ```
The `Network` class in `network.py` defines and handles a neural network based on the parameters provided (such as the number of neurons, number of layers, and activation function). The `Optimizer` class in `optimizer.py` manages the evolutionary process: it creates a population of these networks, breeds them, mutates them, and selects the fittest networks over generations.
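To make this concrete, here is a minimal sketch of how a `Network`-style class might compile its genome into a PyTorch model for flattened 28x28 MNIST images. The function name `build_mlp` and the genome keys are assumptions for illustration; the actual construction in `network.py` may differ.

```python
import torch
import torch.nn as nn

# Map genome activation names to PyTorch modules (illustrative).
ACTIVATIONS = {"relu": nn.ReLU, "tanh": nn.Tanh,
               "sigmoid": nn.Sigmoid, "leaky_relu": nn.LeakyReLU}

def build_mlp(params, in_features=784, n_classes=10):
    """Build an MLP from a genome dict: repeated Linear -> activation -> Dropout
    blocks, followed by a final classification layer."""
    layers, width = [], in_features
    for _ in range(params["nb_layers"]):
        layers += [nn.Linear(width, params["nb_neurons"]),
                   ACTIVATIONS[params["activation"]](),
                   nn.Dropout(params["dropout"])]
        width = params["nb_neurons"]
    layers.append(nn.Linear(width, n_classes))
    return nn.Sequential(*layers)

model = build_mlp({"nb_layers": 2, "nb_neurons": 128,
                   "activation": "relu", "dropout": 0.2})
```

Each genome therefore maps deterministically to a concrete `nn.Module` that can be trained and scored.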
- In `network.py`, the `Network` class can create a random network configuration using `create_random`.
- In `optimizer.py`, the `create_population` method uses `Network` to create an initial population of randomly configured networks.
- The `breed` and `mutate` methods in `optimizer.py` handle the crossover and mutation of networks. After these operations, the network's structure (layers, neurons, activation functions) is updated.
- The `Network` class has methods (`create_set` and `create_network`) to update its structure based on these new configurations.
- The `fitness` method in `optimizer.py` evaluates networks based on their accuracy, which is set and updated in the `Network` class during training.
- Training is handled outside of these classes, typically in a script like `train.py`, where each network is trained and its performance (accuracy) is evaluated.
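Since accuracy serves as the fitness signal, the evaluation step can be sketched as a standard PyTorch accuracy loop. This is an illustrative stand-in for what `train.py` does, not the repository's actual code; the function name `evaluate_accuracy` is an assumption.

```python
import torch
import torch.nn as nn

def evaluate_accuracy(model, loader, device="cpu"):
    """Compute classification accuracy on a held-out set.
    The result would be stored on the network and used as its fitness."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            # Flatten images (e.g. 28x28 MNIST digits) into feature vectors.
            preds = model(x.view(x.size(0), -1).to(device)).argmax(dim=1)
            correct += (preds.cpu() == y).sum().item()
            total += y.size(0)
    return correct / total
```

The returned fraction in [0, 1] is exactly the quantity the `fitness` method would compare across the population.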
The selection method used in the `Optimizer` class is a combination of truncation selection and random selection:

- Truncation Selection: After each generation, a certain percentage of the best-performing networks (as determined by their fitness, which in this case is accuracy) are retained for the next generation. This is controlled by the `retain` attribute.
- Random Selection: In addition to the top performers, there is also a chance (the `random_select` probability) of selecting networks that are not among the top performers. This introduces diversity into the gene pool, preventing premature convergence to a local optimum.
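The two selection mechanisms described above can be sketched together in a few lines. This is an illustrative reconstruction under the assumption that the population is a list of (accuracy, genome) pairs; the real `Optimizer` code may organize things differently.

```python
import random

def select_survivors(population, retain=0.4, random_select=0.1):
    """Truncation selection plus random selection.

    `population` is a list of (accuracy, genome) pairs; `retain` and
    `random_select` mirror the attributes described in the text.
    """
    ranked = sorted(population, key=lambda p: p[0], reverse=True)
    retain_length = int(len(ranked) * retain)
    parents = ranked[:retain_length]           # truncation: keep the fittest
    for individual in ranked[retain_length:]:  # random selection: add diversity
        if random.random() < random_select:
            parents.append(individual)
    return parents
```

With `random_select=0` this reduces to pure truncation selection; raising it trades selective pressure for diversity.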
The crossover method in the `breed` function implements a form of uniform crossover:

- Uniform Crossover: For each network parameter (number of neurons, layers, activation functions, and so on), the child network randomly inherits the value of that parameter from either one of its two parents. This method treats each gene (parameter) independently and gives an equal chance for a gene to be inherited from either parent.
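With genomes as parameter dictionaries, uniform crossover is a one-liner. This is a sketch of the technique described above, not the repository's actual `breed` implementation.

```python
import random

def breed(mother, father):
    """Uniform crossover: each gene is inherited from either parent
    independently, with equal probability. Both genomes share the same keys."""
    return {gene: random.choice([mother[gene], father[gene]]) for gene in mother}
```

Because genes are sampled independently, two parents can produce up to 2^k distinct children for k differing parameters.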
The mutation method in the `mutate` function is a basic form of random mutation:

- Random Mutation: A network is chosen to undergo mutation with a certain probability (`mutate_chance`). When a mutation occurs, one of the network's parameters is randomly selected and then randomly altered by picking a new value for that parameter from the available choices.
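The mutation step described above can be sketched as follows, again assuming dictionary genomes and a search-space dictionary of allowed values per parameter; the actual `mutate` method may differ in detail.

```python
import random

def mutate(genome, param_choices, mutate_chance=0.2):
    """Random mutation: with probability `mutate_chance`, pick one parameter
    at random and re-sample its value from the allowed choices."""
    if random.random() < mutate_chance:
        gene = random.choice(list(param_choices))
        genome[gene] = random.choice(param_choices[gene])
    return genome
```

Note that the re-sampled value may equal the old one, so the effective mutation rate is slightly below `mutate_chance`; some implementations exclude the current value to guarantee a change.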