Insights: vllm-project/aibrix
Overview
1 Release published by 1 person
- v0.2.1 (published Mar 9, 2025)
26 Pull requests merged by 8 people
- [Docs] Update Slack link (#841 merged Mar 11, 2025)
- Bump AIBrix version to v0.2.1 in manifests (#839 merged Mar 10, 2025)
- Add deepseek-r1 671B deployment sample and docs (#835 merged Mar 10, 2025)
- Fix repeated initialization of gateway routers and add unit test for prefix cache (#838 merged Mar 10, 2025)
- [Misc] Fix CI issue on release branch and clean up logs (#837 merged Mar 9, 2025)
- Disable Github Container Registry build & push in release workflow (#836 merged Mar 9, 2025)
- Update version and tags to v0.2.1 (#833 merged Mar 9, 2025)
- cherry-pick #825 #826 part of #717 in release branch (#828 merged Mar 8, 2025)
- Support to create default HttpRoute for RayClusterFleet (#826 merged Mar 8, 2025)
- Increase envoy proxy memory config and client connection buffersize (#825 merged Mar 8, 2025)
- Add /v1/models endpoint to gateway (#802 merged Mar 8, 2025)
- [Bug] Added Startup Probe in Quickstart Model (#773 merged Mar 8, 2025)
- Fix the paths in lambda cloud doc (#824 merged Mar 8, 2025)
- Workload generation scripts for prefix aware routing (#820 merged Mar 7, 2025)
- Reconfigure workload generator for predefined synthetic patterns (#771 merged Mar 7, 2025)
- [Docs]: fix vllm mock app Unauthorized response (#817 merged Mar 7, 2025)
- [CI]: update release tags pattern (#815 merged Mar 7, 2025)
- [Refactor]: gateway-plugins ext-proc server codebase (#810 merged Mar 7, 2025)
- [Misc] Adding model field to each request (#812 merged Mar 6, 2025)
- Move modelAdapter runtime validation to webhook (#786 merged Mar 6, 2025)
- Cherry pick #776 #779 #788 #789 #794 to release branch (#809 merged Mar 6, 2025)
- cherry-pick Enable CI tests for release branch (#805) (#808 merged Mar 6, 2025)
- cherry-pick Enable CI tests for release branch (#805) (#807 merged Mar 6, 2025)
- Enable CI tests for release branch (#805 merged Mar 6, 2025)
- Added target pod to client result and made clients consistent (#799 merged Mar 6, 2025)
- Update request message processing for /v1/completion input (#794 merged Mar 6, 2025)
3 Pull requests opened by 3 people
- [Misc][Docs]: GCP and Kubernetes Terraform Deployment Modules (#823 opened Mar 7, 2025)
- Support OpenAI api style /v1/models response (#829 opened Mar 8, 2025)
- [WIP] Generate workload based on prefix sharing synthetic data (#840 opened Mar 10, 2025)
12 Issues closed by 4 people
- [Bug][Docs]: Slack Link is Dead (#830 closed Mar 11, 2025)
- Add Deepseek R1 multi-node yaml samples (#754 closed Mar 10, 2025)
- E2E TestPrefixCacheModelInference test is flaky (#816 closed Mar 10, 2025)
- Support multi-node & autoscaling & routing together for models like Deepseek-R1 (#758 closed Mar 8, 2025)
- HTTPRoute is not created for RayClusterFleet result in default routing doesn't work (#827 closed Mar 8, 2025)
- RayClusterReplicaSet didn't populate annotations to headers and workers (#775 closed Mar 8, 2025)
- Request for Documentation on AI Accelerator Diagnostic and Failure Mockup Tools (#818 closed Mar 8, 2025)
- an endpoint to list all the models deployed in the AiBrix system (#800 closed Mar 8, 2025)
- [bug] Probes for quickstart model kill pod (#772 closed Mar 8, 2025)
- Basic prefix workload generation with configurable parameter (#819 closed Mar 7, 2025)
- Tests on release branch are not enabled and it blocks minor version release (#804 closed Mar 6, 2025)
- Making the streaming client and non-streaming client consistent (#798 closed Mar 6, 2025)
16 Issues opened by 5 people
- [RateLimit] TPM computation based on origin response (#848 opened Mar 11, 2025)
- Provide production grade overlay manifests (#847 opened Mar 11, 2025)
- [RFC]: Make API Gateway interface OpenAI compatible (#846 opened Mar 11, 2025)
- [Observation] Improve AIBrix control plane monitoring (#845 opened Mar 11, 2025)
- [Docs] Provide AIBrix upgrade guidance (#844 opened Mar 11, 2025)
- [Feature] Support inference engine SGLang (#843 opened Mar 11, 2025)
- Ask for testing suggestions (#842 opened Mar 10, 2025)
- Some prompts with special character fail the benchmark script (#832 opened Mar 9, 2025)
- RayClusterFleet controllers shows some reconcilation issues (#831 opened Mar 8, 2025)
- Making prefix-cache-and-load-aware routing more general (#814 opened Mar 7, 2025)
- Prefix sharing workload generation (#813 opened Mar 7, 2025)
- ModelAdapter seems to be working abnormally (#801 opened Mar 5, 2025)
- Making max-tokens configurable in the benchmark client. (#797 opened Mar 5, 2025)
- Recording request routing(target-pod) in the benchmark client (#796 opened Mar 5, 2025)
- Piggybacking more information in response header (#795 opened Mar 5, 2025)
15 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Refactor make deploy to use apply instead of create (#793 commented on Mar 6, 2025 • 4 new comments)
- Update documentation for Quick Start and Base Model deployment (#745 commented on Mar 8, 2025 • 2 new comments)
- Use string based tokenizer in prefix cache (#774 commented on Mar 10, 2025 • 1 new comment)
- Do LLM Cache Support V100 hardware? (#791 commented on Mar 5, 2025 • 0 new comments)
- [Question] How to access the vLLM-Vineyard integration code mentioned in Distributed KV Cache documentation? (#733 commented on Mar 6, 2025 • 0 new comments)
- We still see some errors that not explainable if httpRoute is missing (#778 commented on Mar 8, 2025 • 0 new comments)
- Replace our cloned 3rd-party yamls with helm charts (#452 commented on Mar 8, 2025 • 0 new comments)
- [CI] Generate helm package from kubebuilder manifests (#66 commented on Mar 8, 2025 • 0 new comments)
- Failed to run benchmark scripts against the endpoint (#783 commented on Mar 9, 2025 • 0 new comments)
- Introduce wait time or retry before moving adapter to another pod if that pod is not ready (#258 commented on Mar 10, 2025 • 0 new comments)
- [RFC] Support different inference engines like vLLM, SGLang, TensorRT-LLM (#137 commented on Mar 11, 2025 • 0 new comments)
- v0.3.0 roadmap (#698 commented on Mar 11, 2025 • 0 new comments)
- Add model API (#299 commented on Mar 6, 2025 • 0 new comments)
- [WIP] Gateway refactoring (#393 commented on Mar 6, 2025 • 0 new comments)
- WIP: Add unit test code coverage (#627 commented on Mar 6, 2025 • 0 new comments)