usage-based-routing-ttl-on-cache #3412
The TTL for usage is 1 minute because the TPM/RPM limits apply across a 1-minute window. @sumanth13131, can you share a repro of the problem you're facing?
Bump on this? @sumanth13131
Hi @krrishdholakia, PS: the same key can occur again the next day as well. This causes [screenshot not preserved]
@sumanth13131 so if i understand the problem
Can we fix this by having a more precise key name, including the current date as part of it? We do this for lowest latency routing -
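The snippet referenced in this comment was not preserved. As an illustration only, the sketch below shows why a counter key built from the hour and minute alone can repeat on the next day, and how adding the date avoids that. The function names and the `model_group:tpm:...` key layout are assumptions made for this example, not the project's actual code.

```python
from datetime import datetime


def minute_only_key(model_group: str, now: datetime) -> str:
    # Hypothetical key layout: only hour and minute, so it repeats every 24 hours.
    return f"{model_group}:tpm:{now.strftime('%H-%M')}"


def dated_key(model_group: str, now: datetime) -> str:
    # Hypothetical key layout: the date is part of the key, so each minute is unique.
    return f"{model_group}:tpm:{now.strftime('%Y-%m-%d-%H-%M')}"


day_one = datetime(2024, 5, 1, 9, 30)
day_two = datetime(2024, 5, 2, 9, 30)

# Same wall-clock minute on two different days.
collides = minute_only_key("my-model", day_one) == minute_only_key("my-model", day_two)
still_collides = dated_key("my-model", day_one) == dated_key("my-model", day_two)
print(collides, still_collides)
```

A date-qualified key removes the collision but does not by itself remove stale keys from the cache; that is what a TTL addresses.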
Putting a TTL on it is fine, right?
You're right. But instead of hardcoding it, can we make it a controllable param? Like in lowest latency routing -
Then in the test, let's set it to a very low value (5s?) and check that the key is set and then evicted within the expected timeframe.
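The kind of check suggested here could be sketched as follows. This is a minimal stand-in, not the router's real cache: `InMemoryTTLCache`, `FakeClock`, and `check_ttl_eviction` are hypothetical names, and a fake clock replaces real sleeping so the check stays fast and deterministic.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Hypothetical stand-in for a cache whose entries expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        # Store the value together with its absolute expiry time.
        self._store[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry outlived its TTL: evict it and report a miss.
            del self._store[key]
            return None
        return value


class FakeClock:
    """Manually advanced clock so the check does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def check_ttl_eviction(ttl: float = 5.0) -> Tuple[Any, Any]:
    clock = FakeClock()
    cache = InMemoryTTLCache(clock=clock)
    cache.set("my-model:tpm:09-30", 120, ttl=ttl)
    before = cache.get("my-model:tpm:09-30")  # still inside the TTL
    clock.advance(ttl + 1)
    after = cache.get("my-model:tpm:09-30")  # past the TTL
    return before, after
```

Against a real Redis-backed cache, the same shape of test would need an actual wait slightly longer than the TTL, which is why a small TTL value is convenient.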
Sure
Hi @krrishdholakia,
@@ -134,6 +134,56 @@ async def test_acompletion_caching_on_router():
traceback.print_exc()
pytest.fail(f"Error occurred: {e}")


@pytest.mark.asyncio
async def test_completion_caching_on_router():
hey @sumanth13131 how does this test your cache ttl change?
redis_port=os.environ["REDIS_PORT"],
cache_responses=True,
timeout=30,
routing_strategy_args={"ttl": 10},
Hi @krrishdholakia,
I followed the same approach as latency-based routing.
The default TTL is 1 minute (60 seconds); for this test case alone, we set it to 10 seconds.
The RPM/TPM counter cache uses a 1-minute window policy, so it is better to put a TTL on that counter key.
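To illustrate the point about the counter key, here is a minimal sketch of a per-key counter whose window expires after a configurable TTL (60 seconds by default). `WindowedCounter` is a hypothetical class written for this example and is not taken from the pull request; the window here starts at the first increment for a key.

```python
import time
from typing import Callable, Dict, Tuple


class WindowedCounter:
    """Hypothetical RPM/TPM-style counter: each key's count resets once its TTL passes."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (current count, absolute expiry time of the window)
        self._counts: Dict[str, Tuple[int, float]] = {}

    def increment(self, key: str, amount: int = 1) -> int:
        now = self._clock()
        entry = self._counts.get(key)
        if entry is None or now >= entry[1]:
            # No live window for this key: start a new one with a fresh expiry.
            count, expires_at = 0, now + self._ttl
        else:
            count, expires_at = entry
        count += amount
        self._counts[key] = (count, expires_at)
        return count


# Usage with a manually controlled clock (a one-element list acts as mutable time).
fake_now = [0.0]
counter = WindowedCounter(ttl_seconds=60.0, clock=lambda: fake_now[0])
first = counter.increment("my-model:rpm")
second = counter.increment("my-model:rpm")
fake_now[0] = 61.0  # move past the 1-minute window
after_expiry = counter.increment("my-model:rpm")
print(first, second, after_expiry)
```

With the TTL in place, a key that happens to be reused later starts from zero instead of inheriting an old count.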