Efficient Triton Kernels for LLM Training (Python, updated Nov 7, 2024)
ClearML - Model-Serving Orchestration and Repository Solution
FlagGems is an operator library for large language models, implemented in the Triton language.
(WIP) A simple, lightweight, fast, integrated, pipelined deployment framework for algorithm services that ensures reliability, high concurrency, and scalability.
SymGDB - symbolic execution plugin for gdb
Automatic ROPChain Generation
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Deploy DL/ ML inference pipelines with minimal extra code.
A performance library for machine learning applications.
Symbolic debugging tool using JonathanSalwan/Triton
Triton implementation of FlashAttention2 that adds Custom Masks.
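The "custom mask" idea above amounts to adding a user-supplied bias to the attention scores before the softmax. A minimal NumPy sketch of that math (not the repo's Triton code; function names are hypothetical):

```python
import numpy as np

def attention_with_mask(q, k, v, mask=None):
    # Scaled dot-product attention with an optional additive mask.
    # mask entries of -inf block the corresponding key positions;
    # a causal mask is just one special case of this.
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    if mask is not None:
        scores = scores + mask
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(scores)
    p = p / p.sum(axis=-1, keepdims=True)
    return p @ v
```

FlashAttention computes the same result tile by tile without materializing the full score matrix; the sketch above only shows the reference semantics the kernel must match.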
Training-free, post-training, efficient sub-quadratic-complexity attention, implemented with OpenAI Triton.
Efficient kernel for RMS normalization with fused operations; includes both forward and backward passes, with PyTorch compatibility.
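For reference, the forward/backward math such a fused RMSNorm kernel implements can be sketched in plain NumPy (a minimal sketch, not the repo's kernel; function names are hypothetical):

```python
import numpy as np

def rmsnorm_forward(x, weight, eps=1e-6):
    # RMSNorm: y = weight * x / sqrt(mean(x^2) + eps)
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight, rms  # rms is cached for the backward pass

def rmsnorm_backward(dy, x, weight, rms):
    # From y_i = w_i * x_i / rms with rms = sqrt(mean(x^2) + eps):
    # dx_j = (w_j*dy_j - xhat_j * mean_i(w_i*dy_i*xhat_i)) / rms
    xhat = x / rms
    dxhat = dy * weight
    dx = (dxhat - xhat * np.mean(dxhat * xhat, axis=-1, keepdims=True)) / rms
    dweight = np.sum(dy * xhat, axis=tuple(range(x.ndim - 1)))
    return dx, dweight
```

A Triton kernel fuses the mean-of-squares reduction, the rsqrt, and the scale into one pass over the row, avoiding the intermediate tensors this NumPy version creates.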
Completely free Telegram-integrated RAT. Triton_RAT is one of the best RATs written in Python. Good luck😉!
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
Optimize, convert, and deploy machine learning models as fast inference APIs using Triton and ORT. Currently supports Hugging Face transformers, PyTorch, TensorFlow, SKLearn, and XGBoost models.
QuickStart for Deploying a Basic Model on the Triton Inference Server
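Deploying a basic model on the Triton Inference Server centers on a `config.pbtxt` in the model repository describing its inputs and outputs. A minimal sketch (model name, dims, and backend here are illustrative assumptions, not taken from the repo):

```protobuf
name: "my_model"              # hypothetical model name
platform: "onnxruntime_onnx"  # backend; depends on the model format
max_batch_size: 8
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

The file lives at `<model_repository>/my_model/config.pbtxt`, with the model binary in a numbered version subdirectory (e.g. `1/`).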
Morion is a PoC tool to experiment with symbolic execution on real-world (ARMv7) binaries.
Increase the inference speed of the model