feat: support server side response cache #395

Open
kemingy opened this issue Jun 21, 2023 · 5 comments
Labels
enhancement New feature or request help wanted Extra attention is needed level/hard rust Pull requests that update Rust code
Milestone

Comments

@kemingy
Member

kemingy commented Jun 21, 2023

Describe the feature

refer to:

Some ML models might benefit from the cache.

As for the storage part, I think ideally we should support both local and remote cache.

Why do you need this feature?

No response

Additional context

No response

@kemingy kemingy added enhancement New feature or request help wanted Extra attention is needed rust Pull requests that update Rust code level/hard labels Jun 21, 2023
@AlexXi19

Hey Keming, I'm interested in taking a look at this issue. I briefly looked into some Rust crates for this feature and found this crate. It seems to support a Redis cache, a sized cache, and a timed cache (although I don't believe it has a combined timed + sized cache). My first thought would be to add an axum middleware to handle the caching logic. What are your thoughts on this?

@kemingy
Member Author

kemingy commented Jun 26, 2023

> Hey Keming, I'm interested in taking a look at this issue. I briefly looked into some Rust crates for this feature and found this crate. It seems to support a Redis cache, a sized cache, and a timed cache (although I don't believe it has a combined timed + sized cache). My first thought would be to add an axum middleware to handle the caching logic. What are your thoughts on this?

I think this PR should come with a benchmark. I'm not sure whether this library fits our requirements:

  • multiple routes
  • local & remote cache
  • cache TTL
  • cache size limit

I don't know how it handles the cache key, since the key/value could be a huge image (like 3 x 1000 x 1000 f32). The benchmark should include different key/value types, such as a simple string, an image, an embedding, etc.
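
To make the requirements above concrete, here is a minimal sketch of a local cache with a TTL and an entry limit, keyed per route. It is only an illustration, not a proposed implementation: `CacheKey`, `ResponseCache`, and `cached_response` are hypothetical names, the standard library's `DefaultHasher` stands in for whatever digest the real cache would use, and eviction is plain FIFO. The remote cache is out of scope here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Hypothetical key: the route plus a 64-bit digest of the raw request body,
/// so a 12 MB image is never stored as the key itself.
/// Assumption: a 64-bit digest can collide; a real cache would need a wider
/// digest or a comparison of the stored body.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct CacheKey {
    route: String,
    body_digest: u64,
}

impl CacheKey {
    fn new(route: &str, body: &[u8]) -> Self {
        let mut hasher = DefaultHasher::new();
        body.hash(&mut hasher);
        CacheKey { route: route.to_string(), body_digest: hasher.finish() }
    }
}

struct Entry {
    value: Vec<u8>,
    inserted_at: Instant,
}

/// Hypothetical local cache: entries expire after `ttl`, and the oldest
/// entry is evicted once `max_entries` is reached (FIFO, not LRU).
struct ResponseCache {
    ttl: Duration,
    max_entries: usize,
    entries: HashMap<CacheKey, Entry>,
    order: VecDeque<CacheKey>,
}

impl ResponseCache {
    fn new(ttl: Duration, max_entries: usize) -> Self {
        ResponseCache { ttl, max_entries, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Returns a copy of the cached response, dropping it if the TTL has passed.
    fn get(&mut self, key: &CacheKey) -> Option<Vec<u8>> {
        let expired = self.entries.get(key)?.inserted_at.elapsed() >= self.ttl;
        if expired {
            self.entries.remove(key);
            self.order.retain(|k| k != key);
            return None;
        }
        self.entries.get(key).map(|e| e.value.clone())
    }

    /// Stores a response, evicting the oldest entries to respect the limit.
    fn insert(&mut self, key: CacheKey, value: Vec<u8>) {
        if self.max_entries == 0 {
            return;
        }
        // Re-inserting an existing key refreshes its age and position.
        if self.entries.remove(&key).is_some() {
            self.order.retain(|k| k != &key);
        }
        while self.entries.len() >= self.max_entries {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.entries.remove(&oldest);
                }
                None => break,
            }
        }
        self.order.push_back(key.clone());
        self.entries.insert(key, Entry { value, inserted_at: Instant::now() });
    }
}

/// Returns the cached response for (route, body), or computes and stores it.
fn cached_response<F>(cache: &mut ResponseCache, route: &str, body: &[u8], compute: F) -> Vec<u8>
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    let key = CacheKey::new(route, body);
    if let Some(hit) = cache.get(&key) {
        return hit;
    }
    let response = compute(body);
    cache.insert(key, response.clone());
    response
}
```

A handler or middleware could call `cached_response(&mut cache, "/inference", &body, run_model)`, where `run_model` is a placeholder for the real inference call. Limiting by entry count is a simplification; with values as large as an image tensor, a byte-size limit would probably be the more useful bound.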

@AlexXi19

Good point. Do you think the cache should be aware of the exact content type?

@kemingy
Member Author

kemingy commented Jun 26, 2023

> Good point. Do you think the cache should be aware of the exact content type?

No, because we don't really parse the HTTP request body on the Rust side. I listed different types of data only because their sizes differ.
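
Since the body is treated as opaque bytes, a benchmark only needs payloads of the right sizes, not the right formats. The sketch below builds three such payloads and times a key digest over each. The names are hypothetical, the 768-dimension embedding is an assumed size, and `DefaultHasher` is just a stand-in for the hash a real cache would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Digest of the raw body bytes; the content type is never inspected.
fn body_digest(body: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    hasher.finish()
}

/// Builds `len` f32 values serialized as little-endian bytes (4 bytes each).
fn f32_payload(len: usize) -> Vec<u8> {
    (0..len).flat_map(|i| (i as f32).to_le_bytes()).collect()
}

/// Payloads that differ only in size: a short string, an embedding
/// (assumed 768 dimensions), and a 3 x 1000 x 1000 f32 image (12 MB).
fn benchmark_payloads() -> Vec<(&'static str, Vec<u8>)> {
    vec![
        ("string", b"hello world".to_vec()),
        ("embedding", f32_payload(768)),
        ("image", f32_payload(3 * 1000 * 1000)),
    ]
}

/// Returns the digest and how long it took to compute.
fn time_digest(body: &[u8]) -> (u64, Duration) {
    let start = Instant::now();
    let digest = body_digest(body);
    (digest, start.elapsed())
}
```

Iterating over `benchmark_payloads()` and printing the duration from `time_digest` gives a rough idea of how key derivation scales with body size. A proper benchmark would repeat each measurement many times and would also cover the cost of storing and cloning the values.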

@kemingy kemingy added this to the v1 release milestone Jun 27, 2023
@kemingy
Member Author

kemingy commented Jun 28, 2023

For the benchmark, you can check https://github.com/tensorchord/inference-benchmark/tree/main/benchmark
