Organizations

@vuejs @uiv-lib @vuejs-translations

Starred repositories

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 4,051 391 Updated Feb 23, 2025

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

JavaScript 1,256 66 Updated Dec 3, 2024

Go configuration with fangs

Go 27,978 2,032 Updated Feb 18, 2025

🐶 Kubernetes CLI To Manage Your Clusters In Style!

Go 28,835 1,813 Updated Mar 7, 2025

FUSE-based file system backed by Amazon S3

C++ 8,961 1,035 Updated Feb 28, 2025

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs

C++ 471 61 Updated Feb 20, 2025

Asynchronous HTTP client/server framework for asyncio and Python

Python 15,477 2,064 Updated Mar 7, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 12 4 Updated Mar 8, 2025

NVIDIA GPUDirect Storage Driver

C 228 34 Updated Dec 11, 2024

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,697 361 Updated Mar 8, 2025

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

Python 29,947 3,742 Updated Aug 6, 2024

An easy to use PyTorch to TensorRT converter

Python 4,688 683 Updated Aug 17, 2024

VNC client web application

JavaScript 12,060 2,368 Updated Feb 28, 2025

A Kubernetes media gateway for WebRTC. Contact: info@l7mp.io

Go 812 69 Updated Feb 20, 2025

Open-Source Low-Latency Accelerated Linux WebRTC HTML5 Remote Desktop Streaming Platform for Self-Hosting, Containers, Kubernetes, or Cloud/HPC

CSS 436 54 Updated Mar 8, 2025

**Official** 李宏毅 (Hung-yi Lee) 機器學習 Machine Learning 2022 Spring

Jupyter Notebook 2,272 516 Updated Oct 18, 2022

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 364 27 Updated Feb 21, 2025

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Cuda 880 61 Updated Mar 9, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,637 1,136 Updated Mar 7, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,447 130 Updated Mar 9, 2025

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,834 123 Updated Jan 13, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,293 2,165 Updated Mar 7, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 1,065 94 Updated Mar 7, 2025

PyTorch native quantization and sparsity for training and inference

Python 1,887 229 Updated Mar 8, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 2,315 240 Updated Mar 9, 2025

a Go package to interact with arbitrary JSON

Go 3,759 495 Updated Apr 8, 2024

A modern replacement for Redis and Memcached

C++ 27,416 1,012 Updated Mar 9, 2025

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration

Go 4,662 928 Updated Mar 7, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,153 650 Updated Mar 6, 2025

Connect, secure, control, and observe services.

Go 36,550 7,885 Updated Mar 9, 2025