Popular repositories

  1. nanochat (Public)

    Forked from karpathy/nanochat

    The best ChatGPT that $100 can buy.

    Python

  2. autoresearch (Public)

    Forked from karpathy/autoresearch

    AI agents running research on single-GPU nanochat training automatically

    Python

  3. reference-kernels (Public)

    Forked from gpu-mode/reference-kernels

    Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!

    Python

  4. ollama (Public)

    Forked from ollama/ollama

    Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

    Go

  5. blog (Public)

    ML inference in production: learnings, optimizations, and insights

  6. TensorRT-LLM (Public)

    Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.

    Python