Feature Request: tool to list and delete cached models #16393

@sultanqasim

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I'd love to have a tool to list and delete cached models (that were fetched automatically when using the -hf option). This would be akin to Ollama's ls and rm commands.

Motivation

A lot of people (myself included) use the -hf option to automatically fetch models from Hugging Face. This places models in a model cache directory, which can get rather big over time. Each model typically has at least three associated files (manifest, gguf, and etag), and sometimes five (adding an mmproj gguf and its associated etag). Manually managing files in the cache is cumbersome. It would be nice to have an elegant way to see which models are cached and how much space they take, along with a single command to delete all cached files associated with a model.

Possible Implementation

I built a Python script to do this for my own convenience: https://gist.github.com/sultanqasim/5b6d9654236e18dea4896d3c9ce2dc1b

The output of the script looks like this:

$ ./llama-cache ls
Name                                                        Size (GB) Modified
--------------------------------------------------------------------------------
ibm-granite/granite-4.0-h-small-GGUF:Q4_K_M                 18.1      2025-10-02 13:34
unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:Q4_K_XL            16.5      2025-08-07 00:32
unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF:IQ4_XS     12.7      2025-07-24 12:10
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL           16.5      2025-10-01 17:19
unsloth/Qwen3-14B-GGUF:Q4_K_XL                              8.5       2025-08-01 23:03

$ ./llama-cache rm unsloth/Qwen3-14B-GGUF:Q4_K_XL
Deleted: /Users/sultan/Library/Caches/llama.cpp/unsloth_Qwen3-14B-GGUF_Qwen3-14B-UD-Q4_K_XL.gguf
Deleted: /Users/sultan/Library/Caches/llama.cpp/unsloth_Qwen3-14B-GGUF_Qwen3-14B-UD-Q4_K_XL.gguf.json
Deleted: /Users/sultan/Library/Caches/llama.cpp/manifest=unsloth_Qwen3-14B-GGUF=Q4_K_XL.json
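The gist linked above is the full script; as a rough illustration of the idea, the listing and removal logic could be sketched roughly as below. This is a minimal sketch, not the gist's actual code: it assumes the filename conventions visible in the output above (manifest=org_repo=quant.json for manifests, with weight and etag files sharing the org_repo prefix), and the cache path and LLAMA_CACHE fallback are assumptions for illustration, not llama.cpp's canonical lookup logic.

```python
import os
from datetime import datetime
from pathlib import Path


def cache_dir() -> Path:
    # Hypothetical default path (macOS-style, as in the output above);
    # llama.cpp resolves its cache directory per-platform.
    return Path(os.environ.get(
        "LLAMA_CACHE",
        Path.home() / "Library" / "Caches" / "llama.cpp"))


def list_models(cache: Path):
    """Yield (name, size_gb, mtime) for each cached model manifest.

    Assumes manifests are named manifest=<org>_<repo>=<quant>.json and
    that all other files for the model share the <org>_<repo>_ prefix.
    """
    for manifest in sorted(cache.glob("manifest=*.json")):
        # e.g. manifest=unsloth_Qwen3-14B-GGUF=Q4_K_XL.json
        _, org_repo, quant = manifest.stem.split("=", 2)
        # Caveat: only the first underscore can be mapped back to the
        # org/repo separator; repo names containing underscores would
        # need smarter handling.
        name = f"{org_repo.replace('_', '/', 1)}:{quant}"
        files = [manifest] + [p for p in cache.iterdir()
                              if p.name.startswith(org_repo + "_")]
        size_gb = sum(p.stat().st_size for p in files) / 1e9
        mtime = datetime.fromtimestamp(manifest.stat().st_mtime)
        yield name, size_gb, mtime


def remove_model(cache: Path, name: str):
    """Delete every cache file belonging to `name` (org/repo:quant)."""
    org_repo, _, quant = name.partition(":")
    flat = org_repo.replace("/", "_")
    targets = [cache / f"manifest={flat}={quant}.json"]
    targets += [p for p in cache.iterdir()
                if p.name.startswith(flat + "_")]
    for p in targets:
        if p.exists():
            p.unlink()
            print(f"Deleted: {p}")
```

A real implementation inside llama.cpp itself could reuse the same path-construction code the -hf download path already uses, rather than re-deriving names from the filenames as this sketch does.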

Labels

enhancement (New feature or request)