Pulse · vllm-project/aibrix · GitHub

March 4, 2025 – March 11, 2025

Overview

29 Active pull requests

28 Active issues

1 Release published by 1 person

v0.2.1
published Mar 9, 2025

26 Pull requests merged by 8 people

[Docs] Update Slack link
#841 merged Mar 11, 2025
Bump AIBrix version to v0.2.1 in manifests
#839 merged Mar 10, 2025
Add deepseek-r1 671B deployment sample and docs
#835 merged Mar 10, 2025
Fix repeated initialization of gateway routers and add unit test for prefix cache
#838 merged Mar 10, 2025
[Misc] Fix CI issue on release branch and clean up logs
#837 merged Mar 9, 2025
Disable Github Container Registry build & push in release workflow
#836 merged Mar 9, 2025
Update version and tags to v0.2.1
#833 merged Mar 9, 2025
cherry-pick #825 #826 part of #717 in release branch
#828 merged Mar 8, 2025
Support to create default HttpRoute for RayClusterFleet
#826 merged Mar 8, 2025
Increase envoy proxy memory config and client connection buffersize
#825 merged Mar 8, 2025
Add /v1/models endpoint to gateway
#802 merged Mar 8, 2025
[Bug] Added Startup Probe in Quickstart Model
#773 merged Mar 8, 2025
Fix the paths in lambda cloud doc
#824 merged Mar 8, 2025
Workload generation scripts for prefix aware routing
#820 merged Mar 7, 2025
Reconfigure workload generator for predefined synthetic patterns
#771 merged Mar 7, 2025
[Docs]: fix vllm mock app Unauthorized response
#817 merged Mar 7, 2025
[CI]: update release tags pattern
#815 merged Mar 7, 2025
[Refactor]: gateway-plugins ext-proc server codebase
#810 merged Mar 7, 2025
[Misc] Adding model field to each request
#812 merged Mar 6, 2025
Move modelAdapter runtime validation to webhook
#786 merged Mar 6, 2025
Cherry pick #776 #779 #788 #789 #794 to release branch
#809 merged Mar 6, 2025
cherry-pick Enable CI tests for release branch (#805)
#808 merged Mar 6, 2025
cherry-pick Enable CI tests for release branch (#805)
#807 merged Mar 6, 2025
Enable CI tests for release branch
#805 merged Mar 6, 2025
Added target pod to client result and made clients consistent
#799 merged Mar 6, 2025
Update request message processing for /v1/completion input
#794 merged Mar 6, 2025

3 Pull requests opened by 3 people

[Misc][Docs]: GCP and Kubernetes Terraform Deployment Modules
#823 opened Mar 7, 2025
Support OpenAI api style /v1/models response
#829 opened Mar 8, 2025
[WIP] Generate workload based on prefix sharing synthetic data
#840 opened Mar 10, 2025

12 Issues closed by 4 people

[Bug][Docs]: Slack Link is Dead
#830 closed Mar 11, 2025
Add Deepseek R1 multi-node yaml samples
#754 closed Mar 10, 2025
E2E TestPrefixCacheModelInference test is flaky
#816 closed Mar 10, 2025
Support multi-node & autoscaling & routing together for models like Deepseek-R1
#758 closed Mar 8, 2025
HTTPRoute is not created for RayClusterFleet result in default routing doesn't work
#827 closed Mar 8, 2025
RayClusterReplicaSet didn't populate annotations to headers and workers
#775 closed Mar 8, 2025
Request for Documentation on AI Accelerator Diagnostic and Failure Mockup Tools
#818 closed Mar 8, 2025
an endpoint to list all the models deployed in the AiBrix system
#800 closed Mar 8, 2025
[bug] Probes for quickstart model kill pod
#772 closed Mar 8, 2025
Basic prefix workload generation with configurable parameter
#819 closed Mar 7, 2025
Tests on release branch are not enabled and it blocks minor version release
#804 closed Mar 6, 2025
Making the streaming client and non-streaming client consistent
#798 closed Mar 6, 2025

16 Issues opened by 5 people

[RateLimit] TPM computation based on origin response
#848 opened Mar 11, 2025
Provide production grade overlay manifests
#847 opened Mar 11, 2025
[RFC]: Make API Gateway interface OpenAI compatible
#846 opened Mar 11, 2025
[Observation] Improve AIBrix control plane monitoring
#845 opened Mar 11, 2025
[Docs] Provide AIBrix upgrade guidance
#844 opened Mar 11, 2025
[Feature] Support inference engine SGLang
#843 opened Mar 11, 2025
Ask for testing suggestions
#842 opened Mar 10, 2025
OSError: /models/deepseek-r1 does not appear to have a file named configuration_deepseek.py. Checkout 'https://huggingface.co//models/deepseek-r1/tree/None' for available files.
#834 opened Mar 9, 2025
Some prompts with special character fail the benchmark script
#832 opened Mar 9, 2025
RayClusterFleet controllers shows some reconcilation issues
#831 opened Mar 8, 2025
Making prefix-cache-and-load-aware routing more general
#814 opened Mar 7, 2025
Prefix sharing workload generation
#813 opened Mar 7, 2025
ModelAdapter seems to be working abnormally
#801 opened Mar 5, 2025
Making max-tokens configurable in the benchmark client.
#797 opened Mar 5, 2025
Recording request routing(target-pod) in the benchmark client
#796 opened Mar 5, 2025
Piggybacking more information in response header
#795 opened Mar 5, 2025

15 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Refactor make deploy to use apply instead of create
#793 commented on Mar 6, 2025 • 4 new comments
Update documentation for Quick Start and Base Model deployment
#745 commented on Mar 8, 2025 • 2 new comments
Use string based tokenizer in prefix cache
#774 commented on Mar 10, 2025 • 1 new comment
Do LLM Cache Support V100 hardware?
#791 commented on Mar 5, 2025 • 0 new comments
[Question] How to access the vLLM-Vineyard integration code mentioned in Distributed KV Cache documentation?
#733 commented on Mar 6, 2025 • 0 new comments
We still see some errors that not explainable if httpRoute is missing
#778 commented on Mar 8, 2025 • 0 new comments
Replace our cloned 3rd-party yamls with helm charts
#452 commented on Mar 8, 2025 • 0 new comments
[CI] Generate helm package from kubebuilder manifests
#66 commented on Mar 8, 2025 • 0 new comments
Failed to run benchmark scripts against the endpoint
#783 commented on Mar 9, 2025 • 0 new comments
Introduce wait time or retry before moving adapter to another pod if that pod is not ready
#258 commented on Mar 10, 2025 • 0 new comments
[RFC] Support different inference engines like vLLM, SGLang, TensorRT-LLM
#137 commented on Mar 11, 2025 • 0 new comments
v0.3.0 roadmap
#698 commented on Mar 11, 2025 • 0 new comments
Add model API
#299 commented on Mar 6, 2025 • 0 new comments
[WIP] Gateway refactoring
#393 commented on Mar 6, 2025 • 0 new comments
WIP: Add unit test code coverage
#627 commented on Mar 6, 2025 • 0 new comments