simon4nie • Joined Feb 27, 2025

C 166 65 Updated May 31, 2023

Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍

Shell 22,657 3,401 Updated Feb 27, 2025

LLM notes covering model inference, transformer model structure, and code analysis of LLM frameworks.

Python 575 59 Updated Mar 7, 2025

Learn GPU CUDA programming by example

CMake 6 2 Updated Aug 17, 2023

Hand-written CUDA operators and an interview guide

Cuda 185 19 Updated Jan 15, 2025

Material for gpu-mode lectures

Jupyter Notebook 3,920 396 Updated Feb 9, 2025

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,741 285 Updated Mar 4, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,184 779 Updated Mar 1, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,059 608 Updated Mar 6, 2025

Examples for using ONNX Runtime for machine learning inferencing.

C++ 1,318 357 Updated Jan 23, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 15,876 3,083 Updated Mar 7, 2025

Use your Neovim like using Cursor AI IDE!

Lua 10,806 428 Updated Mar 7, 2025