
API Endpoint for Listing Loaded Running Models #4013

Closed
strikeoncmputrz opened this issue Apr 29, 2024 · 3 comments · Fixed by #4327
Labels: feature request

Comments

strikeoncmputrz commented on Apr 29, 2024

It would be excellent to be able to interrogate the API to determine which models are running at any given time, rather than just seeing which checkpoints were pulled.

I use a variety of clients to interact with Ollama's API. I sometimes run models with a long keep_alive and assume others have similar use cases.

The only way I know of to identify a running model is through processes: `ps aux | grep -- '--model' | grep -v grep | grep -Po '(?<=--model\s).*' | cut -d ' ' -f1`. This will give you the full path to the model's blob. From there, you can compare that with the output of `ollama show --modelfile` (or the `/api/show` endpoint).

I checked the open issues and reddit and didn't see any similar RFIs or requests.

I wrote a bash script (depends on jq) that implements this as a POC.
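
For reference, here is a minimal sketch of that approach (not the original script): it assumes jq is installed, a single model is running, and the server is at the default address, and it maps the runner's blob path back to a model name via `/api/tags` and `/api/show`.

```bash
#!/usr/bin/env bash
# POC sketch: resolve the blob path of the running model back to a model name.
# Assumptions: jq installed, Ollama server on the default port, one runner process.

# Blob path taken from the runner process's --model flag.
blob="$(ps aux | grep -- '--model' | grep -v grep \
  | grep -Po '(?<=--model\s)\S+' | head -n1)"

if [ -z "$blob" ]; then
  echo "no model appears to be running" >&2
  exit 1
fi

# Check each pulled model's Modelfile for a FROM line pointing at that blob.
curl -s http://localhost:11434/api/tags | jq -r '.models[].name' |
while read -r name; do
  from="$(curl -s http://localhost:11434/api/show -d "{\"name\": \"$name\"}" \
    | jq -r '.modelfile' | awk '/^FROM /{print $2; exit}')"
  if [ "$from" = "$blob" ]; then
    echo "running: $name"
  fi
done
```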

strikeoncmputrz added the feature request label on Apr 29, 2024
pdevine (Contributor) commented on Apr 29, 2024

I think this would be great, along with an `ollama ps` command that shows which models are currently loaded in memory. It should also include when each model's TTL is going to expire.
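
For anyone finding this later: the fix in #4327 shipped this as `ollama ps`, backed by a `GET /api/ps` endpoint. A minimal usage sketch, assuming the default server address and that each entry in the response carries an `expires_at` timestamp:

```bash
# List loaded models and when their keep-alive expires.
curl -s http://localhost:11434/api/ps \
  | jq -r '.models[] | "\(.name)\texpires_at: \(.expires_at)"'
```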

strikeoncmputrz (Author) commented

TTL is a great idea!

unmotivatedgene commented

Yes, please add this, especially with the new concurrency options; I want to know which models are sticking around and taking up all my VRAM.
