- Intel
- Shanghai
- https://read.cv/hym
Stars
Let your Claude think
Advanced Quantization Algorithm for LLMs/VLMs.
Intel® NPU Acceleration Library
An innovative library for efficient LLM inference via low-bit quantization
How to optimize algorithms in CUDA.
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
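As a rough illustration of what the low-bit quantization above involves, here is a minimal C++ sketch of generic symmetric per-tensor INT8 quantization (scale = max|x| / 127). It is an assumption-based example of the technique, not code from the library itself.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Symmetric per-tensor INT8 quantization: q = round(x / scale), scale = max|x| / 127.
std::vector<int8_t> quantize_int8(const std::vector<float>& x, float& scale) {
    float amax = 0.0f;
    for (float v : x) amax = std::max(amax, std::fabs(v));
    scale = amax > 0.0f ? amax / 127.0f : 1.0f;

    std::vector<int8_t> q(x.size());
    for (size_t i = 0; i < x.size(); ++i)
        q[i] = static_cast<int8_t>(std::lround(x[i] / scale));
    return q;
}

int main() {
    std::vector<float> w = {0.12f, -1.7f, 0.03f, 0.9f};  // toy weights
    float scale = 0.0f;
    auto q = quantize_int8(w, scale);
    for (size_t i = 0; i < q.size(); ++i)
        std::printf("w=%+.3f  q=%4d  dequant=%+.3f\n", w[i], (int)q[i], q[i] * scale);
}
```

Lower-bit formats (INT4/FP4/NF4) follow the same quantize/dequantize idea with smaller codebooks and usually per-group scales.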
Writing a minimal x86-64 JIT compiler in C++
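A minimal, hedged sketch of the core JIT idea on Linux/x86-64: copy machine-code bytes into an mmap'd page, mark it executable, and call it through a function pointer. It illustrates the technique only and is not taken from that project.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>

int main() {
    // x86-64 machine code for: mov eax, 42 ; ret
    unsigned char code[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3};

    // Allocate a writable page, copy the code in, then flip it to executable (W^X).
    void* mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    std::memcpy(mem, code, sizeof(code));
    mprotect(mem, 4096, PROT_READ | PROT_EXEC);

    auto fn = reinterpret_cast<int (*)()>(mem);
    std::printf("jit returned %d\n", fn());  // prints 42

    munmap(mem, 4096);
}
```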
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
Intel® Extension for TensorFlow*
MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com
An implementation of sgemm_kernel tuned for the L1d cache.
An educational compiler intermediate representation
A list of awesome compiler projects and papers for tensor computation and deep learning.
LLVM Optimization to extract a function, embedded in its intermediate representation in the binary, and execute it using the LLVM Just-In-Time compiler.
Transform an ONNX model into a PyTorch representation
Optimize GEMM: with AVX-512 and AVX-512 BF16, an 800x improvement.
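For context, here is a small C++ sketch of the kind of AVX-512 FMA inner loop such an optimization builds on. The 1x16 micro-kernel shape is an assumption for illustration, not that project's actual kernel; compile with -mavx512f and run on AVX-512 hardware.

```cpp
#include <immintrin.h>
#include <cstdio>

// Generic 1x16 micro-kernel: C[0:16] += sum_k A[k] * B[k][0:16],
// with B stored row-major with a leading dimension of 16.
void gemm_1x16(const float* A, const float* B, float* C, int K) {
    __m512 acc = _mm512_loadu_ps(C);             // 16 running partial sums
    for (int k = 0; k < K; ++k) {
        __m512 b = _mm512_loadu_ps(B + k * 16);  // row k of B (16 floats)
        __m512 a = _mm512_set1_ps(A[k]);         // broadcast A[k]
        acc = _mm512_fmadd_ps(a, b, acc);        // acc += A[k] * B[k][:]
    }
    _mm512_storeu_ps(C, acc);
}

int main() {
    const int K = 4;
    float A[K], B[K * 16], C[16] = {0};
    for (int k = 0; k < K; ++k) A[k] = 1.0f;
    for (int i = 0; i < K * 16; ++i) B[i] = 1.0f;
    gemm_1x16(A, B, C, K);
    std::printf("C[0] = %.1f (expect %d)\n", C[0], K);  // every C[j] equals K
}
```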
Intel Data Parallel C++ (and SYCL 2020) Tutorial.
LightSeq: A High Performance Library for Sequence Processing and Generation
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
A Python framework for sparse neural networks