A high-throughput and memory-efficient inference and serving engine for LLMs
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
GPT2 for Multiple Languages, including pretrained models (multilingual GPT-2 support; 1.5-billion-parameter pretrained Chinese model)
Everything you want to know about Google Cloud TPU
Differentiable Fluid Dynamics Package
DECIMER: Deep Learning for Chemical Image Recognition using Efficient-Net V2 + Transformer
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Benchmarking suite to evaluate 🤖 robotics computing performance. Vendor-neutral. ⚪Grey-box and ⚫Black-box approaches.
🖼 Training StyleGAN2 on TPUs in JAX
EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax
Simple and efficient RevNet library for PyTorch with XLA and DeepSpeed support and parameter offloading
Edge TPU Accelerator / Multi-TPU + MobileNet-SSD v2 + Python + Async + LattePandaAlpha/RaspberryPi3/LaptopPC
EvoPose2D is a two-stage human pose estimation model designed via neuroevolution; it achieves state-of-the-art accuracy on COCO.
Repository for Google Summer of Code 2019 https://summerofcode.withgoogle.com/projects/#4662790671826944
Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP
🎯 Accumulated Gradients for TensorFlow 2