Feat/unload model api #97

thunhuanh · 2023-10-31T09:21:24Z

Add an API to unload the model.
Resolves #86

tikikun

Correct approach. Please do some manual testing and merge, @vuonghoainam. Thanks.

hiro-v · 2023-11-02T03:06:23Z

Hi @thunhu99
Thank you for your contribution. It's great.

However, the server does not behave as expected: it stops working after I send the DELETE request to unload the model.

Load the model

curl --location 'http://localhost:3928/inferences/llamacpp/loadModel' \
--header 'Content-Type: application/json' \
--data '{
    "llama_model_path": "<model_path>",
    "ctx_len": 2048,
    "ngl": 100,
    "embedding": true
}'

Test the model to make sure it's working correctly:

curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--header 'Access-Control-Allow-Origin: *' \
--data '{
        "messages": [
            {"content": "[INST] Write code to solve the following coding problem that obeys the constraints and passes the example test cases. Please wrap your code answer using ```:{prompt}[/INST]", "role": "system"},
            {"content": "python code for fibonacci", "role": "user"},
            {"content": "Here is a Python code for Fibonacci sequence:\n```def fib(n):if n <= 1:return else:return fib(n-1) + fib(n-2)```This code takes an integer `n` as input and returns the `n`-th Fibonacci number.", "role": "assistant"},
            {"content": "please continue", "role": "user"}
        ],
        "stream": true,
        "model": "gpt-3.5-turbo",
        "max_tokens": 2048,
        "stop": ["hello"],
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "temperature": 0
     }'

Try to unload the model (using your code change)

curl --location --request DELETE 'http://localhost:3928/inferences/llamacpp/unloadmodel' \
--header 'Content-Type: application/json' \
--data ''

However, after the 3rd step the server stops working and I cannot repeat the 1st step (the effect is similar to killing the process).
What we expect is that after the 3rd step I can load the model again.

Could you please check and make some changes? Thanks.
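
The expected load, unload, load-again cycle can be sketched in isolation. This is a minimal illustration only, not the code in controllers/llamaCPP.cc: `ModelHolder`, `FakeModel`, and `FakeContext` are hypothetical names standing in for the real llama.cpp model and context objects. It shows the idea from the commit "set context and model back to nullptr after delete": after unloading, both pointers are null again, so a later load starts from a clean state.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real llama.cpp model and context objects.
struct FakeModel { std::string path; };
struct FakeContext { int ctx_len; };

// Hypothetical holder used only for this sketch.
class ModelHolder {
public:
  // Loads a model; returns false if one is already loaded.
  bool load(const std::string &path, int ctx_len) {
    if (loaded()) return false;
    model_.reset(new FakeModel{path});
    ctx_.reset(new FakeContext{ctx_len});
    return true;
  }

  // Frees the context first, then the model. reset() leaves both pointers
  // null, so a later load() is possible. Returns false if nothing was loaded.
  bool unload() {
    if (!loaded()) return false;
    ctx_.reset();
    model_.reset();
    assert(!loaded());
    return true;
  }

  bool loaded() const { return model_ != nullptr && ctx_ != nullptr; }

private:
  std::unique_ptr<FakeModel> model_;
  std::unique_ptr<FakeContext> ctx_;
};
```

With this shape, calling `load`, then `unload`, then `load` again succeeds each time, which is the behavior requested above; a second `unload` in a row simply returns false instead of touching freed memory.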

hiro-v · 2023-11-02T03:07:49Z

controllers/llamaCPP.cc

+{
+  Json::Value jsonResp;
+  if (model_loaded) {
+    llama.unloadModel();


In my testing, the server stops working after this line.
The lines below it do not execute, so no result is returned.

Oh, sorry, my mistake. The API endpoint maps to the wrong handler:
it should be METHOD_ADD(llamaCPP::unloadModel, "unloadmodel", Delete);
instead of METHOD_ADD(llamaCPP::loadModel, "unloadmodel", Delete);
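
To make the effect of that mapping mistake concrete, here is a small standalone sketch. It does not use Drogon or the project's real handlers: `RouteTable`, `dispatch`, `buggyRoutes`, `fixedRoutes`, and the two handler functions are hypothetical names invented for illustration. It only shows that when the "unloadmodel" path is bound to the load handler, a request to that path runs the load logic instead of the unload logic.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical handlers; each returns its own name so the caller can see
// which one actually ran.
std::string loadModelHandler() { return "loadModel"; }
std::string unloadModelHandler() { return "unloadModel"; }

using Handler = std::function<std::string()>;
using RouteTable = std::map<std::string, Handler>;

// Mirrors the mistake: "unloadmodel" is bound to the load handler.
RouteTable buggyRoutes() {
  RouteTable routes;
  routes["loadmodel"] = loadModelHandler;
  routes["unloadmodel"] = loadModelHandler;
  return routes;
}

// Mirrors the fix: "unloadmodel" is bound to the unload handler.
RouteTable fixedRoutes() {
  RouteTable routes;
  routes["loadmodel"] = loadModelHandler;
  routes["unloadmodel"] = unloadModelHandler;
  return routes;
}

// Looks up the path and runs its handler; unknown paths yield "not found".
std::string dispatch(const RouteTable &routes, const std::string &path) {
  auto it = routes.find(path);
  if (it == routes.end()) return "not found";
  assert(it->second != nullptr);  // a registered route must have a handler
  return it->second();
}
```

Dispatching "unloadmodel" against `buggyRoutes()` returns "loadModel", while the same call against `fixedRoutes()` returns "unloadModel". A check of this kind per registered route is one way to catch a copy-paste slip in a handler table before manual testing.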

thunhuanh · 2023-11-02T11:11:00Z

I have fixed the issue and tested it locally. The changes should work now, @vuonghoainam @tikikun.


tikikun · 2023-11-07T08:23:43Z

Hi @thunhuanh, things have been quite intense. I need to refactor this PR into another PR a bit in order to merge it, and I will credit it back to this issue.

tikikun · 2023-11-13T02:09:11Z

Hi @thunhuanh, I have added your change to PR #122 with a few small modifications. Thank you very much for taking the time to contribute to the project.

thunhu99 added 2 commits October 31, 2023 16:12

Add api to unload model

a4b7226

set context and model back to nullptr after delete

3d81378

tikikun requested review from hiro-v and tikikun October 31, 2023 09:44

tikikun reviewed Oct 31, 2023


hiro-v reviewed Nov 2, 2023


hiro-v assigned thunhuanh Nov 2, 2023

hiro-v added the P1: important and type: enhancement labels Nov 2, 2023

hiro-v added this to the Nitro v2.0.0 milestone Nov 2, 2023

thunhuanh and others added 3 commits November 2, 2023 18:11

Merge branch 'janhq:main' into feat/unload-model-api

b617433

fix wrong api endpoind - handler mapping

af3bdb9

Merge branch 'feat/unload-model-api' of https://github.com/thunhuanh/…

29411b5

…nitro into feat/unload-model-api

tikikun mentioned this pull request Nov 13, 2023

Unload model stop background #122

Merged

tikikun closed this Nov 13, 2023

4 participants
