Open
Labels
tech-debt (Technical Debt)
Description
🤔 What is the technical debt you think should be addressed?
Currently we store the full input with every Responses API request. For an n-turn Responses conversation chained with previous_response_id, each turn re-stores everything from the earlier turns, so total storage grows as O(n^2).
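As a rough illustration of the quadratic growth, assume (a simplification, not something measured from the code) that every turn contributes a constant number c of items and that turn k stores all k·c items accumulated so far. Comparing that with storing only the c new items per turn:

```latex
\underbrace{\sum_{k=1}^{n} k\,c}_{\text{full input per turn}} = \frac{c\,n(n+1)}{2} = O(n^2)
\qquad\text{vs.}\qquad
\underbrace{\sum_{k=1}^{n} c}_{\text{new input only}} = c\,n = O(n)
```

Under these assumptions, a 10-turn conversation with c = 2 stores 110 items in the first scheme and 20 in the second.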
We can optimize this by storing each response's previous_response_id and only the new input at each turn.
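A minimal sketch of what delta storage could look like, using an in-memory dict rather than the real responses store. `StoredResponse`, `DeltaResponseStore`, `put`, `history`, and `stored_item_count` are hypothetical names for illustration only and are not part of llama-stack; items are plain strings to keep the example small.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredResponse:
    """One stored turn (hypothetical schema, not the llama-stack one)."""
    id: str
    new_input: list[str]  # only the items added at this turn
    output: list[str]
    previous_response_id: Optional[str] = None


class DeltaResponseStore:
    """In-memory store that keeps per-turn deltas plus a back-pointer."""

    def __init__(self) -> None:
        self._rows: dict[str, StoredResponse] = {}

    def put(
        self,
        new_input: list[str],
        output: list[str],
        previous_response_id: Optional[str] = None,
    ) -> str:
        # Refuse to create a turn whose parent is unknown.
        if previous_response_id is not None and previous_response_id not in self._rows:
            raise KeyError(f"unknown previous_response_id: {previous_response_id}")
        rid = f"resp-{uuid.uuid4().hex}"
        self._rows[rid] = StoredResponse(
            rid, list(new_input), list(output), previous_response_id
        )
        return rid

    def history(self, response_id: str) -> list[str]:
        """Rebuild the full conversation by walking the chain backwards."""
        chain: list[StoredResponse] = []
        current: Optional[str] = response_id
        while current is not None:
            row = self._rows[current]
            chain.append(row)
            current = row.previous_response_id
        items: list[str] = []
        for row in reversed(chain):  # oldest turn first
            items.extend(row.new_input)
            items.extend(row.output)
        return items

    def stored_item_count(self) -> int:
        """Total number of items held across all turns."""
        return sum(len(r.new_input) + len(r.output) for r in self._rows.values())
```

The trade-off in this sketch is that loading walks the whole chain (one lookup per prior turn), and deleting an intermediate response would break reconstruction for every later turn, so a real implementation would likely need to handle both.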
Code pointer:
Writing:
llama-stack/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py
Line 167 in 4819a2e
input=input_items_data,
Loading:
llama-stack/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py
Line 227 in 4819a2e
input = await self._prepend_previous_response(input, previous_response_id)
💡 What is the benefit of addressing this technical debt?
Storage optimization: stored input would grow as O(n) instead of O(n^2) over an n-turn conversation.
Other thoughts
No response