Popular repositories

  1. nanochat (Public)

    Forked from karpathy/nanochat

    The best ChatGPT that $100 can buy.

    Python

  2. autoresearch (Public)

    Forked from karpathy/autoresearch

    AI agents running research on single-GPU nanochat training automatically

    Python

  3. reference-kernels (Public)

    Forked from gpu-mode/reference-kernels

    Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!

    Python

  4. ollama (Public)

    Forked from ollama/ollama

    Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

    Go

  5. blog (Public)

    ML inference in production: learnings, optimizations, and insights

  6. TensorRT-LLM (Public)

    Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.

    Python