Large-scale LLM inference engine
A common base representation of python source code for pylint and other projects
Simple first-order logic implementation for .NET.
OneDiff: An out-of-the-box acceleration library for diffusion models.
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
A great project for campus recruiting (autumn/spring hiring and internships): build an LLM inference framework supporting LLaMA from scratch.
(project) A prediction machine that learns a dataset based on an equation created using various operators
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
A great project for campus recruiting (autumn/spring hiring and internships): build a high-performance deep learning inference library from scratch, supporting inference for LLaMA 2, UNet, YOLOv5, ResNet, and other models. Implement a high-performance deep learning inference library step by step.
Library of C++ functions that support applications of Stan in pharmacometrics.
Neural network inference template for real-time critical audio environments - presented at ADC23
Python Computer Vision & Video Analytics Framework With Batteries Included
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
An example of performing inference using the MobileNetV3 model with the Tract library
Repository for OpenVINO's extra modules
Friendli: the fastest serving engine for generative AI
Aussie AI Base C++ Library is the source code repo for the book Generative AI in C++, along with various other AI/ML kernels.
The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
A robust and efficient TinyML inference engine.