Skip to content
This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Conversation

@thunhuanh
Copy link

@thunhuanh thunhuanh commented Oct 31, 2023

Add API to unload model.
Resolve #86

@tikikun tikikun requested review from hiro-v and tikikun October 31, 2023 09:44
Copy link
Contributor

@tikikun tikikun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct approach , please do some manual test and merge @vuonghoainam tks

@hiro-v
Copy link
Contributor

hiro-v commented Nov 2, 2023

Hi @thunhu99
Thank you for your contribution. It's great.

However, the server is not working as expected as the server stops working after I sent the DELETE request to unload model

  1. Load the model
curl --location 'http://localhost:3928/inferences/llamacpp/loadModel' \
--header 'Content-Type: application/json' \
--data '{
    "llama_model_path": "<model_path>",
    "ctx_len": 2048,
    "ngl": 100,
    "embedding": true
}'
  1. Test the model to make sure it's working correctly:
curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--header 'Access-Control-Allow-Origin: *' \
--data '{
        "messages": [
            {"content": "[INST] Write code to solve the following coding problem that obeys the constraints and passes the example test cases. Please wrap your code answer using ```:{prompt}[/INST]", "role": "system"},
            {"content": "python code for fibonacci", "role": "user"},
            {"content": "Here is a Python code for Fibonacci sequence:\n```def fib(n):if n <= 1:return else:return fib(n-1) + fib(n-2)```This code takes an integer `n` as input and returns the `n`-th Fibonacci number.", "role": "assistant"},
            {"content": "please continue", "role": "user"}
        ],
        "stream": true,
        "model": "gpt-3.5-turbo",
        "max_tokens": 2048,
        "stop": ["hello"],
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "temperature": 0
     }'
  1. Try to unload the model (as your code change)
curl --location --request DELETE 'http://localhost:3928/inferences/llamacpp/unloadmodel' \
--header 'Content-Type: application/json' \
--data ''

However, after the 3rd step, the server stops working and I cannot send the 1st step again (which is similar to killing process)
What we expect is that after 3rd step, I can loadmodel again.

Could you please check and make some changes, thanks

{
Json::Value jsonResp;
if (model_loaded) {
llama.unloadModel();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I tested, the server stops working after this line.
The below lines do not execute to return result

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry my mistake. The API endpoint map to the wrong handler:
it should be METHOD_ADD(llamaCPP::unloadModel, "unloadmodel", Delete);
instead of METHOD_ADD(llamaCPP::loadModel, "unloadmodel", Delete);

@hiro-v hiro-v added P1: important Important feature / fix type: enhancement labels Nov 2, 2023
@hiro-v hiro-v added this to the Nitro v2.0.0 milestone Nov 2, 2023
@thunhuanh
Copy link
Author

thunhuanh commented Nov 2, 2023

I have fixed the issue, and test it locally, the changes should work now @vuonghoainam @tikikun

@tikikun
Copy link
Contributor

tikikun commented Nov 7, 2023

Hi @thunhuanh there has been quite intense i need to refactor this PR into another PR a bit to merge, will credit back to this issue.

@tikikun
Copy link
Contributor

tikikun commented Nov 13, 2023

Hi @thunhuanh I have added your change #122 to this PR with a little bit of change, thank you very much to take the time to contribute to the project

@tikikun tikikun closed this Nov 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

P1: important Important feature / fix

Projects

No open projects
Archived in project

Development

Successfully merging this pull request may close these issues.

feat: Add API to unload model

4 participants