A high-throughput and memory-efficient inference and serving engine for LLMs
Large Language Model Text Generation Inference
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Making large AI models cheaper, faster and more accessible
A universal scalable machine learning model deployment solution
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
Cross-platform, customizable ML solutions for live and streaming media.
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Swift native on-device speech recognition with Whisper for Apple Silicon
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
Utilities to use the Hugging Face Hub API
📚 Jupyter notebook tutorials for OpenVINO™
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Scalable Tool for Gene Network Reverse Engineering