TritonHub

🌐 Overview

TritonHub is a differentiable, efficient, and modular open-source library of PyTorch neural network modules and operations implemented in Triton. It provides GPU-accelerated primitives that leverage Triton's low-level control and parallelism, so deep learning building blocks can be dropped directly into PyTorch workflows. The modules support both forward and backward passes while maintaining full compatibility with PyTorch, and the library can easily be extended and adapted to the needs of the deep learning research and development community.

📦 Installation

Clone the repository and install using setup.py:

git clone https://github.com/ayoussf/Triton-Hub.git
cd Triton-Hub
python setup.py install

For development:

python setup.py develop
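
To verify the installation, try importing the package (TritonHub is the import name used in the Quick Start below); if the import succeeds, the install worked:

python -c "import TritonHub"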

⚙️ Prerequisites

TritonHub requires the following dependencies (a quick environment check follows the list):

  • Linux operating system (WSL for Windows users)
  • CUDA
  • GPU hardware
  • Triton (installed via pip or from source)
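
A minimal sanity check (not part of TritonHub itself) is to confirm that PyTorch sees a CUDA GPU and that Triton is importable:

import torch
import triton

print(torch.cuda.is_available())  # should print True on a working CUDA setup
print(triton.__version__)         # confirms Triton is installed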

🚀 Quick Start

import torch
from TritonHub.Normalization import LayerNorm
from TritonHub.Activation import GeLU

batch, length, dim = 2, 100, 128
device = "cuda"
dtype = torch.float32 # or torch.float16

x = torch.randn(batch, length, dim, device=device, dtype=dtype)

layernorm = LayerNorm(dim, eps=1e-6, elementwise_affine=True, bias=True, device=device, dtype=dtype)
gelu = GeLU(approximate='None') # or tanh approximation.

x = layernorm(x)
x = gelu(x)
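
Because the modules implement backward as well as forward passes, gradients flow through them like any other PyTorch layer. A minimal sketch continuing the example above:

x = torch.randn(batch, length, dim, device=device, dtype=dtype, requires_grad=True)
out = gelu(layernorm(x))
out.sum().backward()  # runs the Triton backward kernels
print(x.grad.shape)   # torch.Size([2, 100, 128])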

🧩 Supported Modules

TritonHub currently supports the following modules, each with forward and backward passes (a short usage sketch follows the list):

  • Activation Functions

    • GeLU (with/without tanh approximation)
    • ReLU
    • LeakyReLU
    • ReLU6
    • Sigmoid
    • Tanh
    • Mish
    • SiLU (Swish)
    • Softmax
    • LogSoftmax
    • Softmin
    • Softplus
    • Threshold
  • Normalization Layers

    • LayerNorm
    • RMSNorm
    • Planned: BatchNorm
  • Neural Network Layers

    • Linear
    • Dropout
    • Multi-Layer Perceptron (Gated-MLP or FFN)
    • Planned: Convolution Layers (1D/2D)
  • Distance Functions

    • Pairwise cosine similarity
  • Ops

    • Batched Matmul (bmm): supports unbatched inputs
    • Normalize (L1, L2 and p tensor normalization)
    • Norm (matrix/vector L1, L2 and p-norms)
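
As an illustration, the TritonHub.<Category> import pattern from the Quick Start extends to the other modules listed above. The sketch below is an assumption-laden example: the Softmax dim argument and the RMSNorm constructor are modelled on their torch.nn counterparts, so check the repository for the exact signatures.

import torch
from TritonHub.Activation import Softmax        # listed above under Activation Functions
from TritonHub.Normalization import RMSNorm     # listed above under Normalization Layers

x = torch.randn(2, 100, 128, device="cuda")
x = RMSNorm(128, device="cuda")(x)              # constructor assumed to mirror LayerNorm above
probs = Softmax(dim=-1)(x)                      # dim argument assumed to mirror torch.nn.Softmax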

🗺️ Roadmap

  • Linear Layer Backward Pass
  • Include Triton Block Sizes in Autotune
  • Convolution Layer (1D/2D)
  • BatchNorm
  • L1 and p Tensor Normalization
  • Matrix/Vector L1, L2 and p Norms
  • Activation Functions
  • Distance Functions
  • Batched Matmul
  • Warmup Unit Tests more efficiently

🤝 Contributions

Contributions are welcome! To add a new feature or improve an existing module:

  1. Fork the repository and create a pull request.
  2. Include a unit test for your module under the UnitTests directory (a sketch of such a test follows this list).
  3. Follow existing coding conventions and ensure compatibility with PyTorch + Triton.
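
A common pattern for such a test is to compare a TritonHub module against its torch.nn reference on random inputs, for both the forward output and the gradients. The sketch below is only an illustration and does not claim to follow the exact conventions of the UnitTests directory:

import torch
from TritonHub.Activation import GeLU

def test_gelu_matches_torch():
    x = torch.randn(2, 100, 128, device="cuda", requires_grad=True)
    x_ref = x.detach().clone().requires_grad_(True)

    out = GeLU(approximate='None')(x)                # argument value taken from the Quick Start
    ref = torch.nn.GELU(approximate='none')(x_ref)   # PyTorch reference implementation

    torch.testing.assert_close(out, ref, atol=1e-5, rtol=1e-5)

    out.sum().backward()
    ref.sum().backward()
    torch.testing.assert_close(x.grad, x_ref.grad, atol=1e-5, rtol=1e-5)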

Found a bug or have a suggestion? Feel free to open an issue or submit a PR.

📄 License

TritonHub is released under the MIT License. You're free to use, modify, and distribute it.

🙏 Acknowledgments

Special thanks to the authors of Mamba. Their work has been a valuable reference for parts of this repository.
