tdh-archive
Popular repositories
- fast_mamba.np Public Forked from idoh/fast_mamba.np
A pure and fast NumPy implementation of Mamba with cache support.
Python
- NotepadNext Public Forked from dail8859/NotepadNext
A cross-platform reimplementation of Notepad++
C++
- candle Public Forked from johnma2006/candle
Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.
Jupyter Notebook
- cake Public Forked from evilsocket/cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Rust
Repositories
- SmolChat-Android Public Forked from shubham0204/SmolChat-Android
Running any GGUF SLMs/LLMs locally, on-device in Android
- distributed-llama Public Forked from b4rtaz/distributed-llama
Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
- x-transformers Public Forked from lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
- prima.cpp Public Forked from Lizonghang/prima.cpp
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
- burn Public Forked from tracel-ai/burn
Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
- native-sparse-attention-pytorch Public Forked from lucidrains/native-sparse-attention-pytorch
Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
- DeepGEMM Public Forked from deepseek-ai/DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
People
This organization has no public members. You must be a member to see who’s a part of this organization.