A lightweight and extensible PyTorch machine learning framework designed for rapid prototyping and scalable training of deep learning models.
- Distributed Training: Built-in support for multi-GPU and multi-node training with torchrun
- Mixed Precision: Automatic mixed precision training with gradient scaling
- Flexible Architecture: Easy-to-extend modular design for models, datasets, and callbacks
- Comprehensive Training: Automated checkpointing, learning rate scheduling, and statistics tracking
- Command-line Interface: YAML configuration with command-line overrides
- Production Ready: Gradient clipping, memory monitoring, and robust error handling
mlutils.py/
├── mlutils/              # Core ML framework
│   ├── trainer.py        # Main Trainer class with distributed training support
│   ├── callbacks.py      # Callback system for training hooks
│   ├── schedule.py       # Learning rate schedulers
│   └── utils.py          # Utilities (device selection, parameter counting, etc.)
├── project/              # Example project implementation
│   ├── models/           # Model implementations
│   │   └── transformer.py  # Example transformer model
│   ├── datasets/         # Dataset implementations
│   │   ├── dummy.py      # Example dummy dataset
│   │   └── utils.py      # Dataset loading utilities
│   ├── callbacks.py      # Project-specific callbacks
│   ├── utils.py          # Optimizers (AdamW, Lion), normalizers, loss functions
│   └── __main__.py       # Training script with CLI
├── scripts/              # Installation and utility scripts
└── pyproject.toml        # Project dependencies
- Python >= 3.11
- PyTorch (automatically installed)
git clone git@github.com:vpuri3/mlutils.py.git
cd mlutils.py
./scripts/install.sh

Single GPU training:
python -m project --train true --dataset dummy --exp_name my_experiment --model_type 0 --epochs 50

Multi-GPU training:
torchrun --nproc-per-node 2 -m project --train true --dataset dummy --exp_name my_experiment --model_type 0 --epochs 50

Restart training:

python -m project --restart true --exp_name my_experiment

Evaluation:

python -m project --evaluate true --exp_name my_experiment

The framework uses YAML configuration files with command-line overrides. Run python -m project --help to see all options. Key configuration options:

- --epochs: Number of training epochs (default: 100)
- --batch_size: Batch size (default: 4)
- --learning_rate: Learning rate (default: 1e-3)
- --weight_decay: Weight decay (default: 0.0)
- --optimizer: Optimizer choice (adamw, lion)
- --mixed_precision: Enable mixed precision training (default: true)

- --model_type: Model type (0: Transformer)
- --channel_dim: Model hidden dimension (default: 64)
- --num_blocks: Number of transformer blocks (default: 4)
- --num_heads: Number of attention heads (default: 8)
- --mlp_ratio: MLP expansion ratio (default: 4.0)
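Any of these flags can be combined with the run commands above; for example (the experiment name and values here are placeholders, not defaults):

python -m project --train true --dataset dummy --exp_name wide_model --channel_dim 128 --num_blocks 6 --learning_rate 5e-4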
The mlutils.Trainer class provides:
- Distributed Training: Automatic handling of multi-GPU training with DDP
- Mixed Precision: Built-in support for FP16 training with gradient scaling
- Checkpointing: Automatic model and optimizer state saving/loading
- Callbacks: Extensible hook system for custom training logic
- Statistics: Training/validation loss tracking and custom metrics
import mlutils
import project
# Load data
train_data, test_data, metadata = project.load_dataset('dummy', 'data/')
# Create model
model = project.Transformer(
in_dim=metadata['in_dim'],
out_dim=metadata['out_dim'],
channel_dim=64,
num_blocks=4,
num_heads=8
)
# Create trainer
trainer = mlutils.Trainer(
model=model,
_data=train_data,
data_=test_data,
epochs=100,
lr=1e-3,
mixed_precision=True
)
# Train
trainer.train()

Create new models in project/models/:
import torch.nn as nn
class MyModel(nn.Module):
def __init__(self, in_dim, out_dim, **kwargs):
super().__init__()
self.layers = nn.Linear(in_dim, out_dim)
def forward(self, x):
        return self.layers(x)

Register in project/models/__init__.py:
from .my_model import MyModel
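How --model_type maps to a model class is handled in project/__main__.py and is not shown in this README; a hypothetical dispatch helper extended with the new class might look like:

import project
from project.models import MyModel

def make_model(model_type, in_dim, out_dim, **kwargs):
    # Hypothetical helper; the real dispatch in project/__main__.py may differ.
    if model_type == 0:
        return project.Transformer(in_dim=in_dim, out_dim=out_dim, **kwargs)
    if model_type == 1:
        return MyModel(in_dim=in_dim, out_dim=out_dim, **kwargs)
    raise ValueError(f"Unknown model_type: {model_type}")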
Create new datasets in project/datasets/:

import torch.utils.data as data
class MyDataset(data.Dataset):
def __init__(self, root):
        # Load your data into self.data (used by __len__ / __getitem__ below)
        self.data = []
def __len__(self):
return len(self.data)
def __getitem__(self, idx):
        return self.data[idx]

Register in project/datasets/utils.py:
def load_dataset(dataset_name, DATADIR_BASE):
if dataset_name == 'my_dataset':
# Load and return train, test, metadata
        pass
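For illustration, a hypothetical completion of this hook (the class and metadata values are placeholders; the in_dim/out_dim keys follow the Trainer usage example above):

def load_dataset(dataset_name, DATADIR_BASE):
    if dataset_name == 'my_dataset':
        train_data = MyDataset(DATADIR_BASE)    # hypothetical: reuse the Dataset defined above
        test_data = MyDataset(DATADIR_BASE)
        metadata = {'in_dim': 2, 'out_dim': 1}  # hypothetical dimensions
        return train_data, test_data, metadata
    raise ValueError(f'Unknown dataset: {dataset_name}')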
Training results are automatically saved to out/<exp_name>/:

out/
└── my_experiment/
    ├── config.yaml          # Experiment configuration
    ├── ckpt01/              # Checkpoints (model + optimizer state)
    ├── ckpt02/
    ├── ...
    ├── ckpt10/
    ├── grad_norm.png        # Gradient norm tracking
    ├── learning_rate.png    # Learning rate schedule
    ├── losses.png           # Training/validation loss plots
    └── model_stats.json     # Memory/timing statistics
The framework includes implementations of:
- AdamW: With automatic parameter group separation (decay vs no-decay)
- Lion: Memory-efficient optimizer with sign-based updates
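The grouping logic itself lives in project/utils.py and is not reproduced here; the sketch below only illustrates the standard decay/no-decay split referred to above (assuming 1-D parameters such as biases and normalization weights are excluded from weight decay), not the framework's exact implementation.

import torch

def split_param_groups(model, weight_decay=1e-2):
    # Illustrative sketch: exclude 1-D tensors (biases, norm weights) from weight decay.
    decay, no_decay = [], []
    for p in model.parameters():
        if not p.requires_grad:
            continue
        (no_decay if p.ndim <= 1 else decay).append(p)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]

# optimizer = torch.optim.AdamW(split_param_groups(model), lr=1e-3)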
Built-in normalizers for data preprocessing:
- IdentityNormalizer: No normalization
- UnitCubeNormalizer: Scale to [0,1] range
- UnitGaussianNormalizer: Zero mean, unit variance
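These classes live in project/utils.py; their exact interface is not documented in this README, so the following is only a sketch of the unit-Gaussian idea (the class name suffix and the encode/decode method names are assumptions):

import torch

class UnitGaussianNormalizerSketch:
    # Illustrative only: per-feature zero mean / unit variance.
    def __init__(self, x, eps=1e-8):
        self.mean = x.mean(dim=0)
        self.std = x.std(dim=0) + eps

    def encode(self, x):
        return (x - self.mean) / self.std

    def decode(self, x):
        return x * self.std + self.mean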
Extensible callback system for training hooks:
class MyCallback(mlutils.Callback):
def __call__(self, trainer, **kwargs):
# Custom logic during training
pass
trainer.add_callback('epoch_end', MyCallback())

Supported schedules:
- OneCycleLR: One-cycle learning rate policy
- CosineAnnealingLR: Cosine annealing
- CosineAnnealingWarmRestarts: Cosine annealing with restarts
- ConstantLR: Constant learning rate
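These policy names mirror the schedulers in torch.optim.lr_scheduler; the framework's own wrappers live in mlutils/schedule.py and are not shown here. A standalone sketch of one policy in plain PyTorch:

import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Cosine annealing from the initial lr toward eta_min over 100 epochs.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100, eta_min=1e-6)

for epoch in range(100):
    # ... forward/backward passes would go here ...
    optimizer.step()
    scheduler.step()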
The included dummy dataset demonstrates the framework (a data-generation sketch follows the list below):
- Input: 2D coordinates (x, y) on a 32×32 grid
- Output: sin(πx) × sin(πy) function values
- Task: Function approximation with a transformer model
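A minimal sketch of how such data can be generated (the actual implementation lives in project/datasets/dummy.py and may differ; the unit-square domain is an assumption):

import torch

# 32x32 grid of (x, y) coordinates, assumed to lie in [0, 1] x [0, 1].
coords = torch.linspace(0.0, 1.0, 32)
x, y = torch.meshgrid(coords, coords, indexing="ij")
inputs = torch.stack([x, y], dim=-1).reshape(-1, 2)                           # (1024, 2)
targets = (torch.sin(torch.pi * x) * torch.sin(torch.pi * y)).reshape(-1, 1)  # (1024, 1)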
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch
- Add your model/dataset/feature following the existing patterns
- Submit a pull request
If you use this framework in your research, please cite:
@misc{mlutils2025,
title={mlutils.py: A Lightweight PyTorch ML Framework},
author={Vedant Puri},
year={2025},
url={https://github.com/vpuri3/mlutils.py}
}

Note: This framework is designed to be simple yet powerful. It provides the essential components needed for most ML projects while remaining easy to understand and extend.