
gpt4all_api update and in working state - added gguf support and refactored endpoints. All tests pass except embedding #1659

Merged
4 commits merged on Nov 21, 2023

Conversation

dpsalvatierra
Contributor

Describe your changes

  • Edited docker-compose.yaml with the following changes:
      • New variable: line 15 replaced the bin model with the variable ${MODEL_ID}
      • New volume: line 19 added a models folder to place gguf LLMs
  • Added a .env file with instructions on using the MODEL_BIN variable
  • Created a models folder under gpt4all_api with a README placeholder
  • Cleaned up some code in the make test folder, specifically the batch completion function test
  • Added python-dotenv and openai==0.28.0 to requirements
  • Cleaned up some orphan __init__.py files that don't need to be added to the branch
  • Added a *.gguf exception to .gitignore to avoid uploading models during push
  • Using "venv" instead of "env" for the virtual environment
  • Refactored the chat completion and engines endpoints.

Issue ticket number and link

Checklist before requesting a review

  • [x] I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • I have added thorough documentation for my code.
  • I have tagged the PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
  • If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Before:
https://app.screenclip.com/GMfa

After change:
https://app.screenclip.com/Mi3r

Steps to Reproduce
1. docker compose up --build : tests gpt4all_api
2. docker compose -f docker-compose.yaml -f docker-compose.gpu.yaml up --build -d : tests gpt4all_gpu
3. make test : all tests pass except embedding. The embedding test is outdated; the new OpenAI client uses "chat completion" instead of "completion" during inference. The workaround is to pin openai==0.28.0 (edited in requirements.txt), as shown in the sketch below.
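
For reference, a minimal sketch of the legacy-style completion call that the pinned openai==0.28.0 client still supports; the port and model name are taken from the request logs later in this thread, and the prompt is illustrative:

```python
import openai

# Point the openai 0.28.x client at the local gpt4all_api container
# (port 4891 per the request logs below; any key works locally).
openai.api_base = "http://localhost:4891/v1"
openai.api_key = "not-needed-locally"

# Legacy Completion API, removed in openai>=1.0 -- hence the 0.28.0 pin.
response = openai.Completion.create(
    model="mistral-7b-openorca.Q4_0.gguf",
    prompt="List three uses of the gguf model format.",
    max_tokens=64,
)
print(response["choices"][0]["text"])
```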

Notes

Problems addressed:
A) The Docker API build was downloading old bin files, which are now deprecated. The new file format is gguf.
B) The API was using outdated completion functions in the Makefile during testing. The workaround is to use openai==0.28.0.
C) Not able to test embedding successfully.
D) Added variables.
E) All endpoints working:

Engines.py

  • Added "requests.get" function to fetch engines from github list
  • Better error handling, moving loggers to bottom next to exception requests.
  • Removed unused libraries
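
A minimal sketch of what such an endpoint could look like; the route, the list URL, and the status codes here are illustrative assumptions, not the exact code from the PR:

```python
import logging

import requests
from fastapi import APIRouter, HTTPException

logger = logging.getLogger(__name__)
router = APIRouter(prefix="/engines", tags=["Engines"])

# Illustrative URL; the PR fetches the engines list from a GitHub-hosted file.
ENGINES_LIST_URL = "https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json"


@router.get("/")
async def list_engines():
    try:
        resp = requests.get(ENGINES_LIST_URL, timeout=10)
        resp.raise_for_status()
        return {"object": "list", "data": resp.json()}
    except requests.RequestException as exc:
        # Logger call sits next to the exception handling, per the refactor.
        logger.error("Failed to fetch engines list: %s", exc)
        raise HTTPException(status_code=502, detail="Could not fetch engines list")
```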

chat.py

  • Error handling
  • Added API settings to fetch the model used to run tests
  • Added a streaming response (see the sketch below)
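
A rough sketch of how a streaming response can be wired up with FastAPI's StreamingResponse and the gpt4all Python bindings; the request shape and hard-coded model name are assumptions for illustration (the PR reads the model from API settings):

```python
from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from gpt4all import GPT4All
from pydantic import BaseModel

router = APIRouter(prefix="/chat", tags=["Chat"])

# Illustrative; the PR resolves the model name from API settings.
MODEL_NAME = "mistral-7b-openorca.Q4_0.gguf"


class ChatRequest(BaseModel):
    prompt: str
    stream: bool = False
    max_tokens: int = 64


@router.post("/completions")
async def chat_completions(body: ChatRequest):
    model = GPT4All(MODEL_NAME)

    if body.stream:
        def token_stream():
            # gpt4all's generate(..., streaming=True) yields tokens as produced.
            for token in model.generate(body.prompt, max_tokens=body.max_tokens, streaming=True):
                yield token

        return StreamingResponse(token_stream(), media_type="text/event-stream")

    text = model.generate(body.prompt, max_tokens=body.max_tokens)
    return {"choices": [{"text": text, "finish_reason": "stop"}]}
```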

@dpsalvatierra dpsalvatierra changed the title gpt4all_api update to get it working with gguf and refactoring endpoints gpt4all_api update and in working state - added gguf support and refactored endpoints. All tests pass except embedding Nov 18, 2023
@AndriyMulyar
Contributor

Approved.

@nadar

nadar commented Nov 20, 2023

Since docker-compose was broken until this PR, I have tested the PR and the API works again, thanks! I also ran into two problems that may not be related to this change, but I thought they could be useful to report here; if not, I'm sorry and I will create a separate issue.

1.) The /v1/completions endpoint always redirects to the trailing-slash version /v1/completions/ (one possible workaround is sketched after the headers below)

["request_headers"]=>
  array(6) {
    [0]=>
    string(29) "POST /v1/completions HTTP/1.1"
    [1]=>
    string(20) "Host: localhost:4891"
    [2]=>
    string(59) "User-Agent: PHP Curl/2.5 (+https://github.com/php-mod/curl)"
    [3]=>
    string(11) "Accept: */*"
    [4]=>
    string(30) "Content-Type: application/json"
    [5]=>
    string(18) "Content-Length: 86"
  }
  ["response_headers"]=>
  array(5) {
    [0]=>
    string(31) "HTTP/1.1 307 Temporary Redirect"
    [1]=>
    string(35) "date: Mon, 20 Nov 2023 08:18:32 GMT"
    [2]=>
    string(15) "server: uvicorn"
    [3]=>
    string(17) "content-length: 0"
    [4]=>
    string(47) "location: http://localhost:4891/v1/completions/"
  }
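
For context, this 307 is the framework's automatic slash redirect: FastAPI/Starlette redirects when a route is registered only with a trailing slash ("/v1/completions/") and the client requests "/v1/completions". A minimal sketch of one way to avoid the redirect, assuming the route lives on the main FastAPI app (illustrative, not the PR's actual code):

```python
from fastapi import FastAPI

app = FastAPI()


# Registering both path variants avoids the 307 that Starlette issues when
# only the trailing-slash route exists.
@app.post("/v1/completions")
@app.post("/v1/completions/", include_in_schema=False)
async def completions(body: dict):
    # Actual inference would happen here; stubbed for illustration.
    return {"object": "text_completion", "choices": []}
```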

2.) No matter the input, /v1/completions/ always returns with finish_reason "stop"

  ["request_headers"]=>
  array(6) {
    [0]=>
    string(30) "POST /v1/completions/ HTTP/1.1"
    [1]=>
    string(20) "Host: localhost:4891"
    [2]=>
    string(59) "User-Agent: PHP Curl/2.5 (+https://github.com/php-mod/curl)"
    [3]=>
    string(11) "Accept: */*"
    [4]=>
    string(30) "Content-Type: application/json"
    [5]=>
    string(18) "Content-Length: 86"
  }
  ["response_headers"]=>
  array(5) {
    [0]=>
    string(15) "HTTP/1.1 200 OK"
    [1]=>
    string(35) "date: Mon, 20 Nov 2023 08:19:43 GMT"
    [2]=>
    string(15) "server: uvicorn"
    [3]=>
    string(19) "content-length: 275"
    [4]=>
    string(30) "content-type: application/json"
  }
  ["response"]=>
  string(275) "{"id":"eb87d52c-e64d-4bb3-842c-fa1341f601ad","object":"text_completion","created":1700468392,"model":"mistral-7b-openorca.Q4_0.gguf","choices":[{"text":"\n","index":0,"logprobs":-1.0,"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}"
  ["response_header_continue":protected]=>
  bool(false)
}

Decoded response:

{
  ["id"]=>
  string(36) "eb87d52c-e64d-4bb3-842c-fa1341f601ad"
  ["object"]=>
  string(15) "text_completion"
  ["created"]=>
  int(1700468392)
  ["model"]=>
  string(29) "mistral-7b-openorca.Q4_0.gguf"
  ["choices"]=>
  array(1) {
    [0]=>
    array(4) {
      ["text"]=>
      string(1) "
"
      ["index"]=>
      int(0)
      ["logprobs"]=>
      float(-1)
      ["finish_reason"]=>
      string(4) "stop"
    }
  }
  ["usage"]=>
  array(3) {
    ["prompt_tokens"]=>
    int(0)
    ["completion_tokens"]=>
    int(0)
    ["total_tokens"]=>
    int(0)
  }
}

@nadar

nadar commented Nov 20, 2023

@dpsalvatierra in the README the variable is called MODEL_ID, but in the docker-compose file it's MODEL_BIN.

@dpsalvatierra
Contributor Author

dpsalvatierra commented Nov 20, 2023

@dpsalvatierra in the README the variable is called MODEL_ID, but in the docker-compose file it's MODEL_BIN.

Thanks for bringing this up. The MODEL_ID variable is used in the GPU docker compose file; however, I also noticed that the Makefile needs some housekeeping.

Expect new additions and some proper documentation soon.

If you are only running the default docker compose file, just adding the model to the models folder and setting the "MODEL_BIN" variable in .env should get you up and running (a minimal sketch follows).
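
As an illustration of how the API side can pick up that variable via the python-dotenv dependency added in this PR (the default value and path here are assumptions, using the model name from this thread):

```python
import os

from dotenv import load_dotenv

# Load the .env file added in this PR; MODEL_BIN names the gguf file placed
# in the models folder.
load_dotenv()

model_bin = os.getenv("MODEL_BIN", "mistral-7b-openorca.Q4_0.gguf")
model_path = os.path.join("models", model_bin)
print(f"Serving model from {model_path}")
```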
