
Ollama: cache model show metadata #64753

Merged
ImLukeF merged 4 commits into main from imlukef/local-ollama-show-cache
Apr 11, 2026

Conversation

Contributor

@ImLukeF ImLukeF commented Apr 11, 2026

Summary

  • cache Ollama /api/show enrichment results for discovery/model-picker flows when a model digest or modified timestamp is stable
  • reuse cached context-window and capability metadata across repeated enrich calls without changing model semantics
  • add targeted tests covering cache reuse and cache invalidation on digest changes

Testing

  • pnpm test extensions/ollama/src/provider-models.test.ts extensions/ollama/src/provider-models.ssrf.test.ts

@openclaw-barnacle openclaw-barnacle bot added size: S maintainer Maintainer-authored PR labels Apr 11, 2026
@ImLukeF ImLukeF marked this pull request as ready for review April 11, 2026 10:39
Copilot AI review requested due to automatic review settings April 11, 2026 10:39
Contributor

greptile-apps bot commented Apr 11, 2026

Greptile Summary

This PR adds a module-level Map cache for Ollama /api/show enrichment results, keyed on apiBase|modelName|digest (or modified_at as a fallback), with a 256-entry FIFO eviction cap. Cache hits skip the network call entirely; the cache is cleared between tests via an exported resetOllamaModelShowInfoCacheForTest helper.

  • Failed fetches are permanently cached. queryOllamaModelShowInfo never rejects — it swallows every error and returns {}. The empty-resolved promise is stored in the cache immediately, so if Ollama is temporarily down during the first enrichment call, that {} entry is served from cache on every subsequent discovery pass for the same model digest. Users end up with persistently missing context-window and capability metadata until the digest changes.

Confidence Score: 4/5

Safe to merge after addressing the failed-fetch caching issue; the P1 defect can lead to permanently missing model metadata for a session.

The cache hit/miss logic and eviction are structurally sound, and the tests are well-structured. The one P1 finding — that transient fetch failures are cached as permanent empty results for the lifetime of the process (until the model digest changes) — is a real defect that could silently degrade the model-picker UX.

extensions/ollama/src/provider-models.ts — specifically queryOllamaModelShowInfoCached (lines 145–162) and setOllamaModelShowCacheEntry (lines 78–88).


Reviews (1): Last reviewed commit: "Ollama: cache model show metadata"

Comment on lines +78 to +88
function setOllamaModelShowCacheEntry(key: string, value: Promise<OllamaModelShowInfo>): void {
  if (ollamaModelShowInfoCache.has(key)) {
    ollamaModelShowInfoCache.delete(key);
  } else if (ollamaModelShowInfoCache.size >= MAX_OLLAMA_SHOW_CACHE_ENTRIES) {
    const oldestKey = ollamaModelShowInfoCache.keys().next().value;
    if (typeof oldestKey === "string") {
      ollamaModelShowInfoCache.delete(oldestKey);
    }
  }
  ollamaModelShowInfoCache.set(key, value);
}
Contributor

P2 Unreachable has(key) branch

queryOllamaModelShowInfoCached only calls setOllamaModelShowCacheEntry after confirming the key is absent (if (cached) { return … } covers the present-key path). The ollamaModelShowInfoCache.has(key) branch in this setter can therefore never be true in practice — it is dead code that will silently mislead future readers who assume it handles a "refresh LRU position" case.


Contributor Author

Cleaned up in b2380c0c99 while fixing the empty-result caching path. The setter now only handles bounded insertion/eviction, without the dead has(key) branch.
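The simplified setter described in that reply might look like the following sketch. This is an illustration, not the merged code from b2380c0c99: it assumes the caller only inserts keys it has already confirmed are absent, so the dead has(key) branch can be dropped and the function reduces to bounded FIFO insertion.

```typescript
// Hypothetical simplified setter (sketch, not the PR's actual code).
// Assumes callers never insert a key that is already present.
const MAX_OLLAMA_SHOW_CACHE_ENTRIES = 256;
const ollamaModelShowInfoCache = new Map<string, Promise<unknown>>();

function setOllamaModelShowCacheEntry(key: string, value: Promise<unknown>): void {
  // Bounded insertion: once at capacity, evict the oldest entry.
  // Map preserves insertion order, so keys().next() yields the oldest key.
  if (ollamaModelShowInfoCache.size >= MAX_OLLAMA_SHOW_CACHE_ENTRIES) {
    const oldestKey = ollamaModelShowInfoCache.keys().next().value;
    if (typeof oldestKey === "string") {
      ollamaModelShowInfoCache.delete(oldestKey);
    }
  }
  ollamaModelShowInfoCache.set(key, value);
}
```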

Comment on lines +145 to +162
async function queryOllamaModelShowInfoCached(
  apiBase: string,
  model: Pick<OllamaTagModel, "name" | "digest" | "modified_at">,
): Promise<OllamaModelShowInfo> {
  const cacheKey = buildOllamaModelShowCacheKey(apiBase, model);
  if (!cacheKey) {
    return await queryOllamaModelShowInfo(apiBase, model.name);
  }

  const cached = ollamaModelShowInfoCache.get(cacheKey);
  if (cached) {
    return await cached;
  }

  const pending = queryOllamaModelShowInfo(apiBase, model.name);
  setOllamaModelShowCacheEntry(cacheKey, pending);
  return await pending;
}
Contributor

P1 Failed fetches permanently cached as empty results

queryOllamaModelShowInfo always resolves — it catches every error and non-OK response and returns {}. That settled-to-empty promise is stored in the cache before await pending is called. On any subsequent call for the same name+digest the cached empty promise is served immediately, so a model whose enrichment failed (Ollama timeout, restart, 500) will never be retried until its digest changes. Users end up with permanently missing context-window and capability metadata for that session.

To fix this, remove the cache entry when the resolved value is empty (e.g., in a .then() on pending that calls ollamaModelShowInfoCache.delete(cacheKey) when both contextWindow and capabilities are absent), so the next discovery pass can attempt a fresh fetch.


Contributor Author

Addressed in b2380c0c99. The cached lookup now evicts empty {} results so transient /api/show failures are retried on the next discovery pass instead of sticking for the session.
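A minimal sketch of that evict-on-empty behaviour follows. The helper name queryCached and its fetcher parameter are illustrative, not the PR's actual signature; the point is that an empty result (the sentinel queryOllamaModelShowInfo returns after swallowing an error) is removed from the cache so the next discovery pass retries.

```typescript
// Sketch only: assumes the fetcher resolves to {} on failure, as the
// PR's queryOllamaModelShowInfo does. Field names follow the review.
interface OllamaModelShowInfo {
  contextWindow?: number;
  capabilities?: string[];
}

const cache = new Map<string, Promise<OllamaModelShowInfo>>();

async function queryCached(
  cacheKey: string,
  fetcher: () => Promise<OllamaModelShowInfo>,
): Promise<OllamaModelShowInfo> {
  const cached = cache.get(cacheKey);
  if (cached) return await cached;

  // Store the in-flight promise so concurrent callers share one fetch.
  const pending = fetcher();
  cache.set(cacheKey, pending);
  const result = await pending;
  // A swallowed error resolves to {}: drop that entry so the next
  // discovery pass retries instead of serving the empty result forever.
  if (result.contextWindow === undefined && result.capabilities === undefined) {
    cache.delete(cacheKey);
  }
  return result;
}
```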

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23c10dfd05


Comment on lines +159 to +160
const pending = queryOllamaModelShowInfo(apiBase, model.name);
setOllamaModelShowCacheEntry(cacheKey, pending);

P2 Do not cache failed /api/show lookups

This stores the in-flight /api/show result in cache before knowing whether it succeeded, so transient failures (timeouts/non-OK responses that currently resolve to {} in queryOllamaModelShowInfo) get memoized under a stable digest/modified timestamp key. In that case, later discovery/model-picker runs for the same model never retry and keep missing contextWindow/capabilities (for example, losing vision) until process restart or digest change. Cache insertion should be conditional on a successful metadata fetch, or failed/empty results should evict the key.


Copilot AI left a comment

Pull request overview

Adds an in-memory cache for Ollama /api/show enrichment so repeated model discovery/model-picker flows can reuse context-window/capability metadata when a model’s digest/modified timestamp is stable.

Changes:

  • Introduce a bounded (256 entry) cache for /api/show results keyed by base URL + model + digest/modified timestamp.
  • Route enrichOllamaModelsWithContext() through the cached lookup path.
  • Add tests asserting cache reuse on stable digest and invalidation when digest changes.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
extensions/ollama/src/provider-models.ts | Adds /api/show cache + cached query helper and wires it into enrichment.
extensions/ollama/src/provider-models.test.ts | Adds tests covering cache reuse and invalidation; clears cache between tests.

Comment on lines +159 to +161
const pending = queryOllamaModelShowInfo(apiBase, model.name);
setOllamaModelShowCacheEntry(cacheKey, pending);
return await pending;
Copilot AI Apr 11, 2026

The cache key is built from a normalized base URL (via resolveOllamaApiBase), but the actual request is still made with the original apiBase argument. If callers ever pass a base URL with /v1 (or trailing slashes) in one call and a normalized base in another, this can cause cache poisoning (e.g., caching {} from a 404 at /v1/api/show) and then reusing it for the correct base. Normalize apiBase once in this function (or inside queryOllamaModelShowInfo) and use the same normalized value both for the cache key and the fetch URL.

Contributor Author

Addressed in b2380c0c99. queryOllamaModelShowInfo() now normalizes the base URL up front and uses that same normalized value for both the fetch path and the SSRF policy, so the cache key and request target stay aligned.
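The normalization the reply describes could be sketched as below. This is an assumption about what resolveOllamaApiBase-style normalization does (trimming trailing slashes and an OpenAI-compatible /v1 suffix); the actual helper in the PR may behave differently. The point is that the same normalized value must feed both the cache key and the fetch URL.

```typescript
// Hypothetical normalizer (sketch; the PR's resolveOllamaApiBase may differ).
// Strips trailing slashes and a trailing "/v1" so that, e.g.,
// "http://localhost:11434/v1/" and "http://localhost:11434" key identically.
function normalizeOllamaApiBase(apiBase: string): string {
  let base = apiBase.trim().replace(/\/+$/, "");
  if (base.endsWith("/v1")) {
    base = base.slice(0, -"/v1".length);
  }
  return base;
}
```

Using one normalized base for both the key and the request avoids the cache-poisoning case Copilot flagged, where a 404 from /v1/api/show gets cached under the key of the correct base.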

@ImLukeF ImLukeF force-pushed the imlukef/local-ollama-show-cache branch from fe9b15a to de56dfb on April 11, 2026 12:29
@ImLukeF ImLukeF merged commit 3da5822 into main Apr 11, 2026
21 of 27 checks passed
@ImLukeF ImLukeF deleted the imlukef/local-ollama-show-cache branch April 11, 2026 12:30
Contributor Author

ImLukeF commented Apr 11, 2026

Merged via squash.


aisle-research-bot bot commented Apr 11, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

# | Severity | Title
1 | 🟡 Medium | Cache key collision allows /api/show metadata cache poisoning in Ollama provider
2 | 🔵 Low | Unbounded cache key size from remote Ollama model metadata can cause memory pressure (DoS)

1. 🟡 Cache key collision allows /api/show metadata cache poisoning in Ollama provider

Severity: Medium
CWE: CWE-20
Location: extensions/ollama/src/provider-models.ts:67-76

Description

A module-level cache (ollamaModelShowInfoCache) keys entries by string concatenation of untrusted model fields (model.name, model.digest/modified_at) separated with |.

  • model.name, digest, and modified_at originate from the remote Ollama endpoint (/api/tags) and are not validated/escaped before being used in the cache key.
  • Because the key format is a raw concatenated string, an Ollama endpoint that returns values containing the delimiter (|) can deliberately create colliding keys, causing one model’s /api/show metadata (context window / capabilities) to be reused for a different model.
  • Downstream, capabilities influences behavior (e.g., enabling vision/tooling paths), so poisoning this cache can lead to capability spoofing/state confusion.

Vulnerable code:

return `${resolveOllamaApiBase(apiBase)}|${model.name}|${version}`;

Recommendation

Avoid building cache keys via ambiguous string concatenation of untrusted components.

Use a structured key that cannot collide, such as nested Maps or a stable encoding that escapes separators.

Example (nested maps):

const cache = new Map<string, Map<string, Map<string, Promise<OllamaModelShowInfo>>>>();

function getEntry(apiBase: string, modelName: string, version: string) {
  let byBase = cache.get(apiBase);
  if (!byBase) cache.set(apiBase, (byBase = new Map()));
  let byName = byBase.get(modelName);
  if (!byName) byBase.set(modelName, (byName = new Map()));
  return { byName, key: version };
}

Additionally, validate/normalize model.name (and other fields used for keying) to an expected character set (e.g., disallow control characters and |) before using them as cache identifiers.
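A guard along those lines could be as small as the following sketch. The name isSafeKeyComponent and the 256-character cap are illustrative choices, assuming `|` is the key delimiter as in the vulnerable code above.

```typescript
// Illustrative validation sketch: reject key components that contain the
// "|" delimiter or control characters, are empty, or are unreasonably long.
// The 256-char bound is an assumption, not a value from the PR.
function isSafeKeyComponent(value: string): boolean {
  return (
    value.length > 0 &&
    value.length <= 256 &&
    !/[|\u0000-\u001f]/.test(value)
  );
}
```

Callers would skip caching (return an undefined key) whenever any component fails this check, which also addresses the key-size concern in finding 2.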

2. 🔵 Unbounded cache key size from remote Ollama model metadata can cause memory pressure (DoS)
Severity: Low
CWE: CWE-400
Location: extensions/ollama/src/provider-models.ts:67-76

Description

The /api/show metadata cache key is built by concatenating apiBase, model.name, and model.digest/modified_at obtained from /api/tags without any size limits.

  • fetchOllamaModels() populates OllamaTagModel fields directly from the server response (only filtering for truthy name).
  • queryOllamaModelShowInfoCached() uses those fields to construct a cache key string.
  • An attacker controlling the configured Ollama base URL (or a compromised Ollama endpoint) can return extremely large name, digest, or modified_at values, causing large cache key allocations.
  • Even with a 256-entry cap, 256 very large keys (and their associated Promise closures) can lead to significant memory usage and UI/process instability.

Vulnerable code:

const version = model.digest?.trim() || model.modified_at?.trim();
return `${resolveOllamaApiBase(apiBase)}|${model.name}|${version}`;

Recommendation

Mitigate memory pressure by ensuring cache keys are bounded in size and derived from validated inputs.

Options:

  1. Hash the key components instead of concatenating full strings:

     import { createHash } from "node:crypto";

     function buildOllamaModelShowCacheKey(apiBase: string, model: Pick<OllamaTagModel, "name" | "digest" | "modified_at">) {
       const version = model.digest?.trim() || model.modified_at?.trim();
       if (!version) return undefined;

       const raw = `${resolveOllamaApiBase(apiBase)}|${model.name}|${version}`;
       return createHash("sha256").update(raw).digest("hex");
     }

  2. Enforce maximum lengths for model.name and version (e.g., drop caching if too long):

     if (model.name.length > 256 || version.length > 256) return undefined;
Additionally, consider validating/sanitizing fields from /api/tags (e.g., ignore unusually large values) before using them in caching or UI logic.


Analyzed PR: #64753 at commit de56dfb

Last updated on: 2026-04-11T12:37:35Z

amittell pushed a commit to amittell/openclaw that referenced this pull request Apr 11, 2026
leonardsellem pushed a commit to leonardsellem/moltbot that referenced this pull request Apr 11, 2026