Optimize responses storage

### 🤔 What is the technical debt you think should be addressed?

Currently, we store *all* inputs for every Responses API request. As a result, storage grows as O(n^2) for an n-turn Responses conversation that uses `previous_response_id`.

We can optimize this by storing the `previous_response_id` and only the new input at each turn.
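To make the proposal concrete, here is a minimal sketch of delta storage with chain reconstruction on load. Everything in it is hypothetical and for illustration only: `StoredResponse`, `DeltaResponseStore`, and the in-memory dict are not the llama-stack responses store or its schema, and it assumes that earlier turns' outputs are part of later turns' input context.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredResponse:
    """One stored turn (hypothetical schema): only the input new to this turn."""
    id: str
    new_input: list[str]
    output: list[str]
    previous_response_id: Optional[str] = None


@dataclass
class DeltaResponseStore:
    """Hypothetical in-memory store; stands in for the real persistence layer."""
    _rows: dict[str, StoredResponse] = field(default_factory=dict)

    def write(self, response: StoredResponse) -> None:
        # Persist only this turn's delta, never the accumulated history.
        self._rows[response.id] = response

    def load_full_input(self, response_id: str) -> list[str]:
        """Rebuild a turn's full input by walking the previous_response_id chain."""
        chain: list[StoredResponse] = []
        current: Optional[str] = response_id
        while current is not None:
            row = self._rows[current]
            chain.append(row)
            current = row.previous_response_id
        chain.reverse()  # oldest turn first

        items: list[str] = []
        for index, row in enumerate(chain):
            items.extend(row.new_input)
            # Assumption: outputs of earlier turns feed into later turns' input.
            if index < len(chain) - 1:
                items.extend(row.output)
        return items

    def stored_item_count(self) -> int:
        return sum(len(r.new_input) + len(r.output) for r in self._rows.values())


# Usage: a 4-turn conversation with one input item and one output item per turn.
store = DeltaResponseStore()
previous_id = None
for turn in range(1, 5):
    response_id = f"resp_{turn}"
    store.write(
        StoredResponse(response_id, [f"user_{turn}"], [f"assistant_{turn}"], previous_id)
    )
    previous_id = response_id

full = store.load_full_input("resp_4")

# What storing the full input at every turn would cost, for comparison.
naive_count = sum(
    len(store.load_full_input(f"resp_{turn}")) + 1 for turn in range(1, 5)
)
delta_count = store.stored_item_count()
```

In this sketch the delta scheme stores 8 items for 4 turns, while storing the full input at every turn would store 20. The trade-off is that loading walks the chain, so a load costs one lookup per prior turn unless the chain is fetched in a single query or cached.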

### Code pointers
Writing:
https://github.com/llamastack/llama-stack/blob/4819a2e0ee43ca623a365b58ff8c3497629cee8f/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py#L167

Loading:
https://github.com/llamastack/llama-stack/blob/4819a2e0ee43ca623a365b58ff8c3497629cee8f/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py#L227

### 💡 What is the benefit of addressing this technical debt?

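Storage optimization: per-conversation storage would grow roughly linearly, rather than quadratically, with the number of turns.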

### Other thoughts

_No response_

Optimize responses storage #3646
