From 3d90f9923e7cf501b52566ef544614d67b10f357 Mon Sep 17 00:00:00 2001
From: Oleksandr Kuvshynov <661042+okuvshynov@users.noreply.github.com>
Date: Sun, 5 Oct 2025 15:50:11 -0400
Subject: [PATCH] server: update readme to mention n_past_max metric

https://github.com/ggml-org/llama.cpp/pull/15361 added a new exported
metric, but the README was not updated to mention it.
---
 tools/server/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/server/README.md b/tools/server/README.md
index 9f7ab229f7ddf..6825c8bf300c6 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -1045,6 +1045,7 @@ Available metrics:
 - `llamacpp:kv_cache_tokens`: KV-cache tokens.
 - `llamacpp:requests_processing`: Number of requests processing.
 - `llamacpp:requests_deferred`: Number of requests deferred.
+- `llamacpp:n_past_max`: High watermark of the observed context size.
 
 ### POST `/slots/{id_slot}?action=save`: Save the prompt cache of the specified slot to a file.