Rob Clucas robclu

🎯

Focusing

Building simple, scalable, efficient infrastructure for AI @spyral-ai.

27 followers · 10 following

Spyral AI
London, UK
https://www.spyral.ai
@rob_clucas

Stars

nickcoutsos / keymap-editor

A web based graphical editor of ZMK keymaps.

JavaScript · 1,489 stars · 386 forks · Updated Mar 3, 2025

NLPOptimize / flash-tokenizer

EFFICIENT AND OPTIMIZED TOKENIZER ENGINE FOR LLM INFERENCE SERVING

C++ · 42 stars · 1 fork · Updated Mar 29, 2025

ai-dynamo / dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust · 3,367 stars · 232 forks · Updated Mar 28, 2025

Infini-AI-Lab / TriForce

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python · 242 stars · 17 forks · Updated Aug 31, 2024

themanojdesai / genai-llm-ml-case-studies

A collection of 500+ real-world ML & LLM system design case studies from 100+ companies. Learn how top tech firms implement GenAI in production.

200 stars · 36 forks · Updated Mar 9, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ · 2,945 stars · 188 forks · Updated Mar 29, 2025

Aider-AI / aider

aider is AI pair programming in your terminal

Python · 30,188 stars · 2,735 forks · Updated Mar 29, 2025

olimorris / codecompanion.nvim

✨ AI-powered coding, seamlessly in Neovim

Lua · 2,959 stars · 179 forks · Updated Mar 27, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ · 8,396 stars · 814 forks · Updated Mar 27, 2025

deepseek-ai / smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Python · 4,440 stars · 389 forks · Updated Mar 5, 2025

ym689 / rec_icl

Python · 3 stars · Updated Oct 31, 2024

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 5,109 stars · 536 forks · Updated Mar 28, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda · 7,329 stars · 684 forks · Updated Mar 28, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

C++ · 11,386 stars · 811 forks · Updated Mar 1, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,948 stars · 230 forks · Updated Mar 4, 2025

jonefeewang / stonemq

A high-performance and efficient message queue developed in Rust

Rust · 72 stars · 5 forks · Updated Feb 19, 2025

stepfun-ai / Step-Video-T2V

Python · 2,734 stars · 238 forks · Updated Mar 17, 2025

BaohaoLiao / RSD

Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

Python · 21 stars · 3 forks · Updated Mar 21, 2025

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 2,518 stars · 261 forks · Updated Mar 29, 2025

lucidrains / titans-pytorch

Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch

Python · 1,247 stars · 109 forks · Updated Mar 14, 2025

exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python · 27,245 stars · 1,670 forks · Updated Mar 21, 2025

yetone / avante.nvim

Use your Neovim like using Cursor AI IDE!

Lua · 11,771 stars · 476 forks · Updated Mar 29, 2025

hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python · 1,223 stars · 73 forks · Updated Mar 6, 2025

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python · 1,109 stars · 119 forks · Updated Mar 23, 2025

NVIDIA / kvpress

LLM KV cache compression made easy

Python · 444 stars · 31 forks · Updated Mar 19, 2025

apple / ml-cross-entropy

Python · 400 stars · 33 forks · Updated Mar 26, 2025

dust-tt / llama-ssp

Experiments on speculative sampling with Llama models

Python · 125 stars · 6 forks · Updated Jun 8, 2023

facebookresearch / LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Python · 277 stars · 23 forks · Updated Feb 24, 2025

VectorSpaceLab / OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook · 3,848 stars · 330 forks · Updated Feb 20, 2025

chipsalliance / Caliptra

Caliptra IP and firmware for integrated Root of Trust block

275 stars · 40 forks · Updated Mar 29, 2025

Lists (4)

Database

GPU

LLM

Vision
