Conversation

@Guikingone (Contributor) commented Sep 3, 2025

| Q            | A                       |
|--------------|-------------------------|
| Bug fix?     | no                      |
| New feature? | yes                     |
| Docs?        | yes                     |
| Issues       | Somehow related to #337 |
| License      | MIT                     |

Hi 👋🏻

This PR aims to introduce a caching layer for the Ollama platform (as OpenAI, Anthropic and others already do).

@carsonbot added the Feature, Platform and Status: Needs Review labels on Sep 3, 2025
@Guikingone force-pushed the ollama/prompt_caching branch from 51fa81c to 1fa590a on September 3, 2025 12:04
@OskarStark changed the title from [Platform] Add Ollama prompt cache to [Platform][Ollama] Add prompt cache on Sep 3, 2025
@Guikingone force-pushed the ollama/prompt_caching branch 2 times, most recently from bf5a1fe to 5ef4417 on September 3, 2025 12:18
@Guikingone force-pushed the ollama/prompt_caching branch from 5ef4417 to cc5f431 on September 5, 2025 11:57
@Guikingone force-pushed the ollama/prompt_caching branch 3 times, most recently from 8e15d3d to 46bca63 on September 5, 2025 15:31
@Guikingone force-pushed the ollama/prompt_caching branch 4 times, most recently from 914eddf to 8112ab2 on September 15, 2025 08:57
@Guikingone force-pushed the ollama/prompt_caching branch 3 times, most recently from de85ef7 to e038555 on September 23, 2025 11:56
@chr-hertel (Member) commented:

Let's zoom out a bit here, for two reasons:

  1. doesn't Ollama already do caching?
  2. if we want to have it in userland, why only Ollama?

@Guikingone (Contributor, Author) commented:

> doesn't Ollama already do caching?

Ollama does "context caching" and/or K/V caching: it keeps the latest messages within the model's context window (or pending tokens, to speed up TTFT), but it is not a cache that returns the previously generated response when an identical request comes in.
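
To illustrate the difference, here is a minimal sketch (assuming a local Ollama on the default port and a pulled `llama3.2` model, both placeholders): `keep_alive` keeps the model and its K/V state warm between calls, so the second identical request gets a faster TTFT, but it still triggers a full generation instead of returning a stored response.

```php
<?php
// Sketch: Ollama's keep_alive keeps the model loaded between requests,
// but identical requests are still re-generated, not served from a cache.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create();
$payload = [
    'model' => 'llama3.2',
    'prompt' => 'Explain prompt caching in one sentence.',
    'stream' => false,
    'keep_alive' => '5m', // keep the model in memory for 5 minutes
];

// First call: cold start, the model gets loaded.
$first = $client->request('POST', 'http://localhost:11434/api/generate', ['json' => $payload])->toArray();

// Second call: warm model, faster TTFT, yet the answer is generated again.
$second = $client->request('POST', 'http://localhost:11434/api/generate', ['json' => $payload])->toArray();

var_dump($first['response'] === $second['response']); // usually false unless a seed is fixed
```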

> if we want to have it in userland, why only Ollama?

Well, because that's the one I use the most and the easiest to implement first. But we can integrate it for every platform if that's the question; we just need to use the API contract, and both Anthropic and OpenAI already support it natively 🤔
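
For reference, this is roughly what "natively" means on the Anthropic side: prompt caching is part of the API contract itself, enabled by annotating a content block with `cache_control` (the model name and context variable below are placeholders):

```php
<?php
// Sketch of Anthropic's native prompt caching: the provider caches the marked
// prompt prefix server-side; the client merely annotates the request payload.
$payload = [
    'model' => 'claude-sonnet-4-5', // placeholder model name
    'max_tokens' => 1024,
    'system' => [
        [
            'type' => 'text',
            'text' => $largeSharedContext, // e.g. a long system prompt or document
            'cache_control' => ['type' => 'ephemeral'], // cache this prefix
        ],
    ],
    'messages' => [
        ['role' => 'user', 'content' => 'Summarize the context above.'],
    ],
];
```

OpenAI goes a step further and applies prompt caching automatically for sufficiently long prompts, with no payload changes at all.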

If the question is: could we implement it at the platform layer for every platform, without relying on API calls? Honestly, that's not a big deal and we could easily integrate it 🙂
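
A minimal sketch of that platform-layer idea; the `PlatformInterface` below is a simplified stand-in for the real Symfony AI contract (which works with richer input and result objects), but the mechanics are the same: hash the request, serve the stored response on a hit, delegate on a miss.

```php
<?php

use Symfony\Contracts\Cache\CacheInterface;
use Symfony\Contracts\Cache\ItemInterface;

// Simplified stand-in for the actual Symfony AI platform contract.
interface PlatformInterface
{
    public function invoke(string $model, array $messages, array $options = []): string;
}

// Provider-agnostic response cache: decorates any platform (Ollama, OpenAI, ...).
final class CachedPlatform implements PlatformInterface
{
    public function __construct(
        private readonly PlatformInterface $inner,
        private readonly CacheInterface $cache,
        private readonly int $ttl = 3600,
    ) {
    }

    public function invoke(string $model, array $messages, array $options = []): string
    {
        // Identical requests hash to the same key, so the stored response is reused.
        $key = hash('xxh128', serialize([$model, $messages, $options]));

        return $this->cache->get($key, function (ItemInterface $item) use ($model, $messages, $options): string {
            $item->expiresAfter($this->ttl);

            return $this->inner->invoke($model, $messages, $options);
        });
    }
}
```

Wiring it up is then just a matter of decorating the existing platform service with any `symfony/cache` adapter.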

@chr-hertel (Member) left a comment:

There are leftovers in the Ollama bundle config and the OllamaResultConverter (see previous comments); those changes are not relevant anymore, right?

@Guikingone (Contributor, Author) commented:

IMHO, token usage should be added in a separate PR; it requires updating the existing platforms to send that information in the result.

Out of scope for this PR, IMO.

@Guikingone (Contributor, Author) commented:

@junaidbinfarooq I created an issue for the TokenUsage DTO 🙂

@Guikingone force-pushed the ollama/prompt_caching branch from 15948ea to bd7637d on November 22, 2025 08:56
@OskarStark (Contributor) left a comment:

After my comments

@Guikingone force-pushed the ollama/prompt_caching branch from fcf5955 to c66e733 on November 22, 2025 17:40
@OskarStark (Contributor) commented:

Thank you @Guikingone.

@OskarStark merged commit b24a508 into symfony:main on Nov 22, 2025
18 checks passed