diff --git a/tools/server/README.md b/tools/server/README.md
index 9f7ab229f7ddf..6825c8bf300c6 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -1045,6 +1045,7 @@ Available metrics:
 - `llamacpp:kv_cache_tokens`: KV-cache tokens.
 - `llamacpp:requests_processing`: Number of requests processing.
 - `llamacpp:requests_deferred`: Number of requests deferred.
+- `llamacpp:n_past_max`: High watermark of the context size observed.
 
 ### POST `/slots/{id_slot}?action=save`: Save the prompt cache of the specified slot to a file.