  • Beijing, China


Contains the code examples from The UVM Primer Book sorted by chapters.

SystemVerilog 518 211 Updated Dec 24, 2021

Reference examples and short projects using UVM Methodology

SystemVerilog 261 156 Updated May 18, 2022

Builds a SystemVerilog verification environment for an ALU using OOP testbench components: stimulus generator, driver, monitor, and scoreboard. The ALU was verified using QuestaSim.

SystemVerilog 9 Updated Mar 4, 2023

🧠 Guide to Building RAG (Retrieval-Augmented Generation) Applications

107 15 Updated Mar 27, 2025

Simple text generator built on a PyTorch implementation of OpenAI GPT-2

Python 990 230 Updated Jul 8, 2019

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 23,247 5,635 Updated Aug 14, 2024

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 36,270 6,162 Updated Mar 29, 2025

Learn where some of the network sysctl variables fit into the Linux/Kernel network flow. Translations: 🇷🇺

5,626 526 Updated Feb 23, 2025

bpftune uses BPF to auto-tune Linux systems

C 1,575 85 Updated Mar 27, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,434 388 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,390 814 Updated Mar 27, 2025

Jupyter Notebook 151 32 Updated Sep 11, 2022

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,108 536 Updated Mar 28, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,329 684 Updated Mar 28, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,385 811 Updated Mar 1, 2025

Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks

Makefile 92 19 Updated Sep 2, 2021

Vendor-neutral programmable observability pipelines.

Go 1,617 489 Updated Mar 28, 2025

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

TypeScript 47,046 4,333 Updated Mar 28, 2025

Fully-featured web interface for Ollama LLMs

TypeScript 1,165 282 Updated Feb 4, 2025

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 104,510 16,910 Updated Mar 28, 2025

An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large language model course series

Jupyter Notebook 16,512 2,046 Updated Feb 25, 2025

Code for benchmarking GPU performance based on cublasSgemm and cublasHgemm

Cuda 31 18 Updated May 20, 2022

Sample code to test and benchmark large cuFFT transforms on NVIDIA GPUs

Cuda 3 Updated Aug 9, 2024

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,173 899 Updated Mar 25, 2025

LLM inference in C/C++

C++ 77,325 11,238 Updated Mar 28, 2025

Example models using DeepSpeed

Python 6,391 1,075 Updated Mar 27, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,662 4,321 Updated Mar 29, 2025

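Build a large language model from scratch with only basic Python knowledge; step-by-step builds of GLM4, Llama3, and RWKV6 from scratch for a deep understanding of how large models work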

Jupyter Notebook 2,572 364 Updated Aug 15, 2024