Let your Claude be able to think

TypeScript 14,820 1,725 Updated Mar 10, 2025

Advanced Quantization Algorithm for LLMs/VLMs.

Python 410 31 Updated Mar 28, 2025

Intel® NPU Acceleration Library

Python 653 74 Updated Jan 13, 2025

An innovative library for efficient LLM inference via low-bit quantization

C++ 351 38 Updated Aug 30, 2024

C++ 61 20 Updated Dec 18, 2024

How to optimize some algorithms in CUDA.

Cuda 2,047 183 Updated Mar 26, 2025

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,164 210 Updated Oct 8, 2024

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,363 264 Updated Mar 28, 2025

Writing a minimal x86-64 JIT compiler in C++

C++ 101 18 Updated Apr 28, 2018

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Jupyter Notebook 454 127 Updated Mar 27, 2025

Intel® Extension for TensorFlow*

C++ 336 42 Updated Mar 18, 2025

Jupyter Notebook 204 67 Updated Nov 22, 2024

MLIR Sample dialect

C++ 118 35 Updated Feb 18, 2025

MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com

38 9 Updated Dec 1, 2023

Parallel Algorithm Scheduling Library

C++ 106 19 Updated Jul 24, 2017

This is an implementation of sgemm_kernel on L1d cache.

Assembly 225 33 Updated Feb 26, 2024

An educational compiler intermediate representation

Rust 646 280 Updated Mar 12, 2025

A primitive library for neural networks

C++ 1,325 218 Updated Nov 24, 2024

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,526 308 Updated Oct 19, 2024

Fast sparse deep learning on CPUs

Python 52 8 Updated Sep 28, 2022

Assembler for NVIDIA Maxwell architecture

Sass 984 165 Updated Jan 3, 2023

Samples for Intel® oneAPI Toolkits

C++ 1,012 718 Updated Mar 24, 2025

LLVM Optimization to extract a function, embedded in its intermediate representation in the binary, and execute it using the LLVM Just-In-Time compiler.

C++ 519 31 Updated May 15, 2021

Transform ONNX model to PyTorch representation

Python 329 65 Updated Nov 13, 2024

Optimize GEMM. With AVX512 and AVX512-BF16, 800x improvement.

C++ 15 1 Updated Oct 26, 2020

Intel Data Parallel C++ (and SYCL 2020) Tutorial.

C++ 93 16 Updated Dec 15, 2021

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,262 330 Updated May 16, 2023

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

C++ 832 337 Updated Mar 27, 2025

IPEX verbose toolkit

Python 2 Updated Mar 10, 2022

Python Framework for sparse neural networks

Cuda 19 5 Updated Apr 28, 2017