Can we cache the K and V of a pre-known set of prompts for use in the context phase? #975

@yifeihappy

Description

In my application, I need to determine whether a product is relevant to a given search term. The set of products is known in advance. How can this be optimized? My idea is to compute and cache the K and V for each product title ahead of time, then concatenate them with the K and V of the search terms in the context phase and use them directly. Can this be achieved with TensorRT-LLM?
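To make the idea concrete, here is a minimal sketch in plain Python. It assumes a single attention head, a single layer, causal masking, and no positional encoding. The helpers `rand_matrix`, `matmul`, and `causal_attention` are hypothetical names for illustration and are not TensorRT-LLM APIs. It only checks that reusing the title's precomputed K and V gives the same attention output for the search-term tokens as recomputing everything from scratch.

```python
import math
import random


def rand_matrix(rows, cols):
    """Random matrix standing in for weights or token embeddings."""
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def causal_attention(q, k, v, q_offset):
    """Query i sits at absolute position q_offset + i and sees keys 0..q_offset + i."""
    d = len(q[0])
    out = []
    for i, q_row in enumerate(q):
        visible = q_offset + i + 1
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d)
            for k_row in k[:visible]
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) / total for j in range(d)]
        )
    return out


random.seed(0)
d = 4
w_q, w_k, w_v = rand_matrix(d, d), rand_matrix(d, d), rand_matrix(d, d)
title = rand_matrix(5, d)  # stand-in embeddings for a product title (known in advance)
query = rand_matrix(3, d)  # stand-in embeddings for the search terms (known at request time)

# Offline: compute and cache K and V for the product title once.
title_k, title_v = matmul(title, w_k), matmul(title, w_v)

# Online: compute K and V only for the search terms and append them to the cache.
k = title_k + matmul(query, w_k)
v = title_v + matmul(query, w_v)
out_cached = causal_attention(matmul(query, w_q), k, v, q_offset=len(title))

# Reference: recompute everything for the full sequence, keep the search-term rows.
full = title + query
out_full = causal_attention(
    matmul(full, w_q), matmul(full, w_k), matmul(full, w_v), q_offset=0
)[len(title):]

matches = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for row_a, row_b in zip(out_cached, out_full)
    for a, b in zip(row_a, row_b)
)
print("cached result matches full recomputation:", matches)
```

Note that this equivalence relies on the product title coming first in the sequence. In a multi-layer causal model, the K and V of a token depend on every token before it, so cached K and V can only be reused as-is when the cached text is a prefix placed at the same positions.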
