This project is a minimalistic deep learning framework inspired by PyTorch, built using JAX for automatic differentiation and GPU acceleration. It includes core components for building and training neural networks, such as layers, optimizers, and loss functions.
To install the required dependencies, run:
```
pip install -r requirements.txt
```
The repository is organized as follows:

- core/: Base classes for computation nodes, losses, and optimizers.
- nets/: Network and layer implementations.
- legacy_utils.py: Legacy utility functions for operations like convolution and pooling.
- plotutils.py: Utilities for visualizing model outputs.
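Before the per-class reference below, here is a hypothetical end-to-end sketch of how the pieces fit together. The import paths, constructor arguments, and method names (forward, backward) are assumptions inferred from the docstrings that follow, not verified API:

```python
# Hypothetical usage sketch; import paths and signatures are assumptions.
from nets import Net, Linear, ReLU, SoftMax
from core import CCE, SGD

net = Net(layers=[Linear(784, 128), ReLU(), Linear(128, 10), SoftMax()])
loss_fn = CCE()
optimizer = SGD(net, lr=0.01)  # assumed signature

for x_batch, y_batch in batches:         # batches: your data iterator
    pred = net.forward(x_batch)          # assumed forward method
    loss = loss_fn.loss(pred, y_batch)
    net.backward(loss_fn.backward(1.0))  # assumed gradient chaining
    optimizer.step()
```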
class ComputationNode(abc.ABC):
"""
Abstract base class for computation nodes in a neural network.
Attributes:
input : Input data to the node.
output : Output data from the node.
seed : Random seed for reproducibility.
requires_grad : Boolean indicating if the node requires gradient computation.
_grad_norm : Dictionary to store gradient norms.
_accum_params : Dictionary to store accumulated parameters.
_accum_param_var_mean : Dictionary to store variance and mean of accumulated parameters.
grad_cache : Dictionary to cache gradients.
accumulate_grad_norm : Boolean indicating if gradient norms should be accumulated.
accumulate_parameters : Boolean indicating if parameters should be accumulated.
"""
class Loss(abc.ABC):
"""
Abstract base class for loss functions.
Methods:
loss(pred, true): Computes the loss between predictions and true labels.
backward(output_grad): Computes the gradient of the loss with respect to the input.
"""
class Optimizer(abc.ABC):
"""
Abstract base class for optimizers.
Methods:
step(): Performs a single optimization step.
"""
class Net:
"""
Represents a neural network composed of multiple computation layers.
Attributes:
layers : List of computation nodes forming the network.
layer_seed_keys : Dictionary storing seed keys for layer initialization.
master_key : Key for random number generation to ensure reproducibility.
"""
class Linear(ComputationNode):
"""
Represents a linear layer in a neural network.
Attributes:
input_size : Size of the input features.
output_size : Size of the output features.
initialization : Method for initializing weights.
parameters : Dictionary containing weights and biases.
accumulate_grad_norm : Boolean indicating if gradient norms should be accumulated.
accumulate_parameters : Boolean indicating if parameters should be accumulated.
"""
class Conv2D(ComputationNode):
"""
Represents a 2D convolutional layer in a neural network.
Attributes:
input_channels : Number of input channels.
kernel_size : Size of the convolutional kernel.
no_of_filters : Number of filters in the layer.
stride : Stride of the convolution.
pad : Padding size.
accumulate_grad_norm : Boolean indicating if gradient norms should be accumulated.
accumulate_params : Boolean indicating if parameters should be accumulated.
initialization : Method for initializing weights.
parameters : Dictionary containing weights and biases.
bias : Boolean indicating if the layer uses a bias term.
"""
class ReLU(ComputationNode):
"""
Represents a Rectified Linear Unit (ReLU) activation layer.
"""
class PReLU(ComputationNode):
"""
Represents a Parametric Rectified Linear Unit (PReLU) activation layer.
Attributes:
parameters : Dictionary containing the parameter 'a'.
accumulate_grad_norm : Boolean indicating if gradient norms should be accumulated.
accumulate_parameters : Boolean indicating if parameters should be accumulated.
"""
class SoftMax(ComputationNode):
"""
Represents a SoftMax activation layer.
Attributes:
use_legacy_backward : Boolean indicating if legacy backward computation should be used.
"""
class CCE(Loss):
"""
Categorical Cross-Entropy (CCE) loss class.
Inherits from the base Loss class and implements the loss and backward functions.
"""
class SGD(Optimizer):
"""
Stochastic Gradient Descent (SGD) optimizer class.
Inherits from the base Optimizer class and implements the step function for updating network parameters.
"""
This project is licensed under the MIT License.