GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,267 544 Updated Feb 15, 2025

stevelaskaridis / awesome-mobile-llm

Awesome Mobile LLMs

156 12 Updated Mar 23, 2025

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,349 520 Updated May 3, 2024

alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 10,139 1,796 Updated Mar 28, 2025

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 142,078 28,450 Updated Mar 28, 2025

salykova / matmul.c

Multi-Threaded FP32 Matrix Multiplication on x86 CPUs

C 343 21 Updated Feb 20, 2025

akanyaani / gpt-2-tensorflow2.0

OpenAI GPT-2 pre-training and sequence prediction implementation in TensorFlow 2.0

Python 262 85 Updated Mar 25, 2023

openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 23,250 5,635 Updated Aug 14, 2024

andrewkchan / deepseek.cpp

CPU inference for the DeepSeek family of large language models in pure C++

C++ 281 29 Updated Feb 11, 2025

frank911 jacknrose

Stars

shouxieai / tensorRT_quantization

deepseek-ai / DeepGEMM

mit-han-lab / llm-awq

NLKNguyen / papercolor-theme

gty111 / PTX-EMU

gpgpu-sim / cutlass-gpgpu-sim

accel-sim / accel-sim-framework

gpu-mode / resource-stream

Mozilla-Ocho / llamafile

flame / blislab

Zhao-Dongyu / sgemm_riscv

surez-ok / blislab_riscv

google / minimalloc

LucasKl / riscv-function-profiling

ucb-bar / chipyard

jiegec / rvv-kernels

openhwgroup / cva6

nibrunie / rvv-examples

pulp-platform / ara

yzhaiustc / Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

zdevito / ATen

gpgpu-sim / gpgpu-sim_distribution