Welcome to my Restricted Boltzmann Machine (RBM) project, a key part of the CA 20581 Biological Computation course, 2024a, at the Open University of Israel. This project explores generative stochastic neural networks, aiming to learn complex probability distributions from input data.
- Project Overview
- Getting Started
- RBM Core Architecture
- Utilizing the Network
- Contributing to the Project
Restricted Boltzmann Machines (RBMs) are generative stochastic neural networks adept at learning probability distributions over input sets. Featuring both visible and hidden units with undirected connections, RBMs harness the contrastive divergence algorithm for efficient parameter optimization.
This RBM implementation stems from a Biological Computation course at the Open University of Israel, designed to delve into generative learning's nuances and RBMs' application in machine learning and AI. Remarkably, this project foregoes external libraries, utilizing only Python and NumPy to construct the RBM and its components.
The project leverages the Iris dataset, a staple in machine learning for its detailed feature set of iris flower species, to train the RBM in classifying iris flowers, showcasing the model's prowess in learning complex distributions and executing classification tasks. Additionally, a GUI application enhances user engagement with the RBM, offering an intuitive interface for dataset interaction.
This RBM project encompasses a suite of features to navigate the intricacies of generative learning and neural networks:
- Customizable Hidden Units: Adapt the RBM with varying hidden unit counts to accommodate different datasets and learning challenges.
- Efficient Learning with Contrastive Divergence: Employ this cutting-edge algorithm for streamlined learning and model refinement.
- Robust Model State Management: Effortlessly manage the model's weights, biases, and settings, facilitating continuity in projects or analyses.
- Insightful Visualizations: Gain a deeper understanding of the model through comprehensive visualization tools that illustrate the network's structure, weights, biases, and learning evolution.
- Interactive User Experience: Engage with the RBM via a polished GUI or command-line interface, designed for ease of data management, model training, and evaluation.
- Advanced Data Preprocessing: Utilize the `DataSet` class for proficient dataset management, ensuring smooth integration with the RBM's generative learning requirements.
Interacting with the RBM model is streamlined through both a command-line interface and a GUI application, offering varied functionalities for model training, evaluation, and visualization.
Navigate the RBM model with these command-line flags for a tailored interaction:
- `-train`: Initiates model training with designated or default settings.
- `-test`: Evaluates the model's accuracy on a given dataset.
- `-plot`: Visualizes the model's structure, weights, and biases.
- `-save`: Archives the model's weights and biases for future use.
- `-load`: Restores the model's weights and biases from saved files.
To train the model and then save the weights:

```shell
python rbm.py -train -save
```
To load an existing model and test its performance:

```shell
python rbm.py -load -test
```
To enhance user interaction and simplify the process of managing the RBM, this project includes a GUI application built with Tkinter. The GUI provides an intuitive interface for performing key operations such as data loading, model training/testing, and visualization.
From the GUI, you can load the dataset, train the network, test the network, and plot the synapses and biases of the network. I tried to make it friendly and easy to use. Make sure to load the dataset before training or testing the network; a file dialog will open when you click the `Load` button. The `iris.data` file is included in the repository.
This section outlines the core components of the Restricted Boltzmann Machine (RBM) network, implemented in Python. The architecture is modular, designed for flexibility across various datasets, with specific optimization for the Iris dataset.
- DataSet.py: Defines the `DataSet` class, responsible for loading and parsing datasets into NumPy arrays of discrete values, tailored for RBM processing.
- RBM.py: Introduces the `RBM` class, embodying the core functionality of the RBM network, including training and sampling methods. This file also contains the `Synapse` class, which is currently under review for removal due to potential redundancy.
- GUI.py: Implements a graphical user interface (`GUI` class) to facilitate user interaction with the RBM network.
- layers.py: Contains the `Layer` class, representing individual layers within the RBM. This class is crucial for managing neuron states, biases, and potentially enabling parallel processing within the network. The structure aims to highlight network restrictions and enhance modularity.
The RBM is configured for the Iris dataset as follows:
- Input Neurons: 12 neurons, allocated as 3 per each of the 4 dataset features, post-normalization to discrete values.
- Output Neurons: 3 neurons, one per dataset class.
- Hidden Neurons: 16 neurons, an empirically determined number based on experimental outcomes.
Note that the input and output neurons together form the visible units (15 in total), while the hidden neurons are the hidden units (16 in total).
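To make the 15-unit visible encoding concrete, here is a minimal sketch of how an Iris instance might be mapped to the visible layer: each of the 4 features is discretized to one of 3 levels (one-hot, giving 12 input units), followed by a one-hot class label (3 output units). The function name and the assumption that features are pre-normalized to [0, 1) are illustrative, not the project's actual API.

```python
import numpy as np

def encode_instance(features, class_index, bins=3):
    """Encode one Iris instance as a 15-unit visible vector:
    12 input units (one-hot over 3 discrete levels per feature)
    followed by 3 output units (one-hot class label).

    `features` are assumed to be already normalized to [0, 1).
    """
    visible = np.zeros(4 * bins + 3)
    for i, value in enumerate(features):
        level = min(int(value * bins), bins - 1)  # discrete level 0..bins-1
        visible[i * bins + level] = 1.0           # one-hot feature encoding
    visible[4 * bins + class_index] = 1.0         # one-hot class unit
    return visible

v = encode_instance([0.1, 0.5, 0.9, 0.3], class_index=2)
print(v.shape)  # (15,)
```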
The network employs two primary layer types, managed through inheritance:
- `Layer`: A generic class for RBM layers, managing units (neurons) and biases. It serves as the foundation for:
- `HiddenLayer` and `VisibleLayer`: Specialized classes derived from `Layer`, tailored to their specific roles in the network structure.
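A minimal sketch of how this inheritance hierarchy might look is shown below. The `sample` method and attribute names are illustrative assumptions, not the actual contents of `layers.py`; the sketch only shows the division of responsibility the text describes (a generic `Layer` holding units and biases, with two thin subclasses).

```python
import numpy as np

class Layer:
    """Generic RBM layer: holds unit states and per-unit biases."""
    def __init__(self, size):
        self.size = size
        self.units = np.zeros(size)
        self.bias = np.zeros(size)

    def sample(self, activations):
        """Stochastically set binary unit states from sigmoid probabilities."""
        probs = 1.0 / (1.0 + np.exp(-(activations + self.bias)))
        self.units = (np.random.rand(self.size) < probs).astype(float)
        return self.units

class VisibleLayer(Layer):
    """Visible side: 12 input units plus 3 output (class) units."""

class HiddenLayer(Layer):
    """Hidden side: 16 units, conditionally independent given the visibles."""
```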
The diagram details the network's composition, including the hidden and visible neurons and the synapses connecting them.
An indexing discrepancy led to a significant bug affecting class classification accuracy. The issue, stemming from a mix of 1-based and 0-based indexing, primarily impacted the third class's recognition. This has been rectified to ensure consistent and accurate network performance.
```python
def energy(self, visible_units, hidden_units):
    """
    Calculate the energy of the model using the formula:
    E(v, h) = -Σ(v_i * a_i) - Σ(h_j * b_j) - Σ_i Σ_j (v_i * w_ij * h_j)

    args:
    - visible_units: the visible layer units
    - hidden_units: the hidden layer units
    """
    visible_bias = self.visible_layer.bias
    hidden_bias = self.hidden_layer.bias
    weights = self.synapses.weights
    return (-np.dot(visible_bias, visible_units)
            - np.dot(hidden_bias, hidden_units)
            - np.dot(visible_units.T @ weights,
                     hidden_units))  # note the @ operator
```
The energy function of the network is thus:

$$E(\vec{v}, \vec{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{ij} h_j$$

Where $a_i$ and $b_j$ are the visible and hidden biases, respectively, and $w_{ij}$ are the synaptic weights.
The `DataSet` class serves as the cornerstone for data management within the RBM framework. It is engineered to streamline the loading, preprocessing, and handling of datasets, ensuring seamless compatibility with the RBM's operational requirements.
Key Features:
- Instance Management: Facilitates the encapsulation of data entries through the `Instance` object, streamlining data manipulation and access.
- Discrete Value Transformation: Employs a method to convert continuous attributes into discrete values, optimizing data for the RBM's generative learning process.
- Adaptable Data Processing: Supports dynamic integration of labels, attributes, and instances, catering to diverse dataset structures.
- Efficient Data Loading: Implements the `createDataSet` method for direct file-based data initialization, accommodating real values that are discretized as necessary.
- Setup: Initiates with an empty framework for labels, attributes, and instances.
- Data Ingestion: Utilizes file input to populate the model with relevant data, identifying and storing components via specialized methods.
- Data Conversion: Applies the `convertToDiscrete` method to continuous values, preparing the dataset for RBM training.
Before running the classification, the following steps must precede it:
- Define the energy function with suitable parameters.
- Lock the input units as required.
- Initialize the other units with random values, and choose a large temperature $T$ to allow the network to converge.
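The steps above can be sketched as annealed Gibbs sampling: the 12 input units stay locked while the 3 output units and the hidden units are resampled as the temperature $T$ decreases. Everything here is a hedged illustration under assumed shapes (15 visible = 12 inputs + 3 outputs, 16 hidden, as described earlier); the function name, annealing schedule, and step count are not taken from the project.

```python
import numpy as np

rng = np.random.default_rng(0)

def classify(weights, a, b, input_units, n_steps=200, t_start=5.0, t_end=0.5):
    """Classify by annealed Gibbs sampling: inputs locked, T cooled."""
    n_visible, n_hidden = len(a), len(b)
    v = rng.integers(0, 2, n_visible).astype(float)
    v[:12] = input_units                      # lock the input units
    for step in range(n_steps):
        t = t_start + (t_end - t_start) * step / (n_steps - 1)
        p_h = 1.0 / (1.0 + np.exp(-(v @ weights + b) / t))
        h = (rng.random(n_hidden) < p_h).astype(float)
        p_v = 1.0 / (1.0 + np.exp(-(weights @ h + a) / t))
        v = (rng.random(n_visible) < p_v).astype(float)
        v[:12] = input_units                  # keep inputs locked every step
    return int(np.argmax(p_v[12:]))           # most probable output (class) unit

weights = rng.uniform(-0.1, 0.1, (15, 16))
a, b = np.zeros(15), np.zeros(16)
print(classify(weights, a, b, rng.integers(0, 2, 12).astype(float)))
```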
The synapses and biases are initialized using the `train` method, which is called from the `RBM` class. The process is as follows:
- Synaptic Initialization: The weights and biases are initialized using a random number generator from the numpy library, with values ranging between -0.1 and 0.1, laying the foundation for diverse neural connections.
- Choose the learning rate: The learning rate is set to 0.1, a value that has been empirically determined to facilitate efficient learning and convergence.
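The two initialization choices above amount to just a few lines of NumPy; this is a sketch of the idea, not the project's exact code (variable names and the use of `default_rng` are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

n_visible, n_hidden = 15, 16
# Uniform initialization in [-0.1, 0.1], as described above.
weights = rng.uniform(-0.1, 0.1, size=(n_visible, n_hidden))
visible_bias = rng.uniform(-0.1, 0.1, size=n_visible)
hidden_bias = rng.uniform(-0.1, 0.1, size=n_hidden)
learning_rate = 0.1  # empirically chosen, per the text
```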
Then the following steps are repeated until the network converges, or until a maximum number of iterations is reached:
- Pick a random instance $\vec{v} = (v_1, \dots, v_{15})$ from the dataset.
- Calculate the probability $P_k$ for every hidden unit $h_k$ using the formula $P_k = \frac{1}{1 + e^{-\sum_{i=1}^{15} v_i w_{ik} - b_k}}$. Note that it does not depend on the other hidden units; this is the reason for the name *Restricted* Boltzmann Machine, and also why it enables parallel processing.
- Initialize the classification algorithm with $\vec{v} = (v_1, \dots, v_{15})$ and perform a single step of the Gibbs sampling algorithm to get the network state $(\vec{h}(1), \vec{v}(1))$. This is done without locking the input units.
- Update the weights and biases using the contrastive divergence algorithm:
  - $a_i^{new} = a_i + \eta(v_i - v_i(1))$
  - $b_j^{new} = b_j + \eta(P_j - h_j(1))$
  - $J_{ij}^{new} = J_{ij} + \eta(v_i P_j - v_i(1) h_j(1))$
- Repeat the previous steps until the network converges, or until a maximum number of iterations is reached.
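The steps above can be sketched as a single CD-1 update in NumPy. This is an illustrative implementation of the update rules just listed, not the project's actual `train` method; the function name and shapes (15 visible, 16 hidden) are assumptions carried over from earlier in the document.

```python
import numpy as np

rng = np.random.default_rng(1)

def cd1_step(v0, w, a, b, eta=0.1):
    """One contrastive-divergence (CD-1) update, mirroring the steps above:
    hidden probabilities P, one Gibbs step to (v(1), h(1)), then
    a += eta*(v0 - v1), b += eta*(P - h1), w += eta*(v0 P^T - v1 h1^T)."""
    p_h = 1.0 / (1.0 + np.exp(-(v0 @ w + b)))       # P_k for each hidden unit
    h0 = (rng.random(len(b)) < p_h).astype(float)   # sample hidden state
    p_v = 1.0 / (1.0 + np.exp(-(w @ h0 + a)))
    v1 = (rng.random(len(a)) < p_v).astype(float)   # reconstructed visibles
    p_h1 = 1.0 / (1.0 + np.exp(-(v1 @ w + b)))
    h1 = (rng.random(len(b)) < p_h1).astype(float)  # reconstructed hiddens
    a += eta * (v0 - v1)                            # visible bias update
    b += eta * (p_h - h1)                           # hidden bias update
    w += eta * (np.outer(v0, p_h) - np.outer(v1, h1))  # weight update
    return w, a, b

w = rng.uniform(-0.1, 0.1, (15, 16))
a, b = np.zeros(15), np.zeros(16)
v0 = rng.integers(0, 2, 15).astype(float)
w, a, b = cd1_step(v0, w, a, b)
print(w.shape)  # (15, 16)
```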
- Pre-training Evaluation: Utilize the GUI, or set `CONSOLE_LOGGING` in `RBM.py` to `True`, for insights into the network's initial state.
- Training Execution: Initiate the learning process via the GUI or the command line, employing the `test` function.
Post-training, the network exhibits around 90% accuracy on the test set, demonstrating stable performance and precise classification across multiple iterations. For custom runs, use the `-load` argument or the GUI to input predefined biases and weights.
The evolution of biases and synapses post-learning is visually represented, indicating convergence towards optimal values and illustrating the learning impact.
| Condition | Before Learning | After Learning |
|---|---|---|
| Biases | *(plot)* | *(plot)* |
| Synapses | *(plot)* | *(plot)* |
Your expertise can significantly propel this project forward. We invite contributions in all forms, from code enhancements to feedback. Join us in refining and expanding the capabilities of this RBM implementation.