Stars

llm

8 repositories

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 70,816 13,578 Updated Feb 21, 2026

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 52,516 4,360 Updated Feb 19, 2026

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 36,455 5,901 Updated Feb 21, 2026

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

Python 8,858 699 Updated Feb 20, 2026

Supercharge Your LLM Application Evaluations 🚀

Python 12,666 1,254 Updated Jan 31, 2026

High-performance In-browser LLM Inference Engine

TypeScript 17,381 1,200 Updated Feb 18, 2026

ML-powered speech recognition directly in your browser

TypeScript 3,245 420 Updated Oct 1, 2024

ML-powered speech synthesis directly in your browser

TypeScript 175 16 Updated Feb 14, 2025