📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,596 252 Updated Mar 4, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 482 52 Updated Aug 19, 2024

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

Python 3,640 335 Updated Mar 7, 2025

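Frontend and backend are both open source. Ai-to-pptx is an assistant that uses AI (DeepSeek) to create PPTX files, with support for online generation and export. Main features: 1. Uses DeepSeek and other large language models to generate outlines. 2. Lets you choose among different templates when generating a PPTX. 3. Supports exporting to PPTX.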

TypeScript 807 111 Updated Mar 6, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 7,637 667 Updated Mar 7, 2025

Stable Diffusion and Flux in pure C/C++

C++ 3,894 351 Updated Mar 1, 2025

Community maintained hardware plugin for vLLM on Ascend

Python 291 48 Updated Mar 7, 2025

Cost-efficient and pluggable Infrastructure components for GenAI inference

Jupyter Notebook 3,031 268 Updated Mar 8, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,192 781 Updated Mar 1, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,065 609 Updated Mar 6, 2025

WeSQL is an innovative MySQL distribution that adopts a compute-storage separation architecture, with storage backed by S3 (and S3-compatible systems). It can run on any cloud, ensuring no vendor l…

C++ 785 27 Updated Feb 25, 2025

POE2 Game Assistance / Autopot and more

26 4 Updated Feb 25, 2025

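Thor (雷神托尔) is a powerful AI model management tool whose main purpose is the unified management and use of multiple AI models. With Thor, users can easily manage and use many AI models, and Thor is compatible with the OpenAI API format, which makes it more convenient to use.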

C# 172 38 Updated Mar 6, 2025

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

Python 1,225 100 Updated Mar 6, 2025

Blog

17 2 Updated Jan 24, 2025

A text extraction library supporting PDFs, images, office documents and more

Python 1,553 50 Updated Mar 7, 2025

AI-first Search & Answer Engine for work. Open-source alternative to Glean.

TypeScript 478 34 Updated Mar 7, 2025

The finest Windows Optimizer

C# 15,504 1,011 Updated Aug 18, 2024

Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.

TypeScript 1,008 58 Updated Mar 7, 2025

JupyterLab extension to create GitHub commits & pull requests

TypeScript 117 14 Updated Jun 25, 2024

Tools for diffing and merging of Jupyter notebooks.

TypeScript 2,710 161 Updated Sep 21, 2024

Roo Code (prev. Roo Cline) gives you a whole dev team of AI agents in your code editor.

TypeScript 7,604 671 Updated Mar 7, 2025

Benchmark LLM performance

Python 93 44 Updated Jul 25, 2024

A Docker image based on Ubuntu for running Docker containers inside Docker containers

Shell 209 89 Updated Mar 2, 2025

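We are committed to open-sourcing quantitative finance knowledge and translating it into Chinese, closing the information gap between the quantitative finance industry in China and abroad.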

935 68 Updated Mar 5, 2025

PromptSite is a lightweight prompt version management package that helps you version-control, track, experiment with, and debug your LLM prompts with ease.

Python 22 Updated Feb 3, 2025

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 65,130 15,822 Updated Mar 7, 2025

A repo for Packs written and maintained by Nomad community members

HCL 227 78 Updated Aug 26, 2024