gpt4all_api updated and in working state - added gguf support and refactored endpoints. All tests pass except embedding #1659
and modify test batch
from API Update chat.py Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>
Approved.
Since docker-compose was broken until this PR, I have tested the PR and the API works again, thanks! I ran into two problems that may not be related, but I thought they could be useful to report; if not, I am sorry and I will create an issue. 1.) /v1/completions always redirects to the trailing-slash version, /v1/completions/
2.) No matter what the
response
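For anyone hitting the redirect in point 1, one client-side workaround is to call the trailing-slash URL directly, since some HTTP clients do not replay a POST body across a redirect. This is only a sketch: the helper name `with_trailing_slash` is made up for illustration, and the host and port in the usage note are assumptions about a local setup, not something this PR defines.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with a trailing slash on its path (hypothetical helper).

    Query string and fragment are preserved, so only the path changes.
    """
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

Usage, assuming the API is reachable on a local port: `with_trailing_slash("http://localhost:4891/v1/completions")` yields the `/v1/completions/` form, which can then be passed to whatever HTTP client the caller uses.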
@dpsalvatierra in the Readme the variable is called
Thanks for bringing this up. The MODEL_ID variable is used in the GPU docker compose file; however, I also noticed that the Makefile needs some housekeeping. Expect new additions and some proper documentation soon. If you are only running the default docker compose file, adding the model to the folder and setting the "MODEL_BIN" variable in the .env should get you up and running.
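A quick way to sanity-check that setup before running docker compose could look like the sketch below. It assumes the .env holds simple KEY=VALUE lines and that MODEL_BIN names (or points at) a gguf file sitting in a local models folder; the function names and the folder argument are illustrative and not part of this PR.

```python
from pathlib import Path


def read_env(env_path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(env_path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_model_bin(env_path, models_dir):
    """Return a list of problems with the MODEL_BIN setting (empty if none)."""
    model_bin = read_env(env_path).get("MODEL_BIN")
    if not model_bin:
        return ["MODEL_BIN is not set in the .env file"]
    problems = []
    if not model_bin.lower().endswith(".gguf"):
        problems.append(f"{model_bin} does not end in .gguf")
    # Assumption: only the file name matters locally, because the path in
    # MODEL_BIN may refer to a location inside the container.
    local_file = Path(models_dir) / Path(model_bin).name
    if not local_file.is_file():
        problems.append(f"model file not found: {local_file}")
    return problems
```

Calling `check_model_bin(".env", "models")` from the folder that holds the compose file returns an empty list when the variable is set, ends in .gguf, and the file is present; the `models` folder name is a guess and should be adjusted to the actual layout.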
Describe your changes
Issue ticket number and link
Checklist before requesting a review
Before:
https://app.screenclip.com/GMfa
After change:
https://app.screenclip.com/Mi3r
## Steps to Reproduce
1- docker compose up --build : test gpt4all_api
2- docker compose -f docker-compose.yaml -f docker-compose.gpu.yaml up --build -d : test gpt4all_gpu
3- make test : all tests pass except embedding. The tests are outdated: the new openai API uses "chat completion" instead of completion during inference. The workaround is to use openai==0.28.0 (edited in requirements.txt)
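Since step 3 depends on the pinned client, a small guard can confirm which openai version is installed before running the tests. This is a sketch only: it assumes the tests need a pre-1.0 release (inferred from the openai==0.28.0 pin), and both function names are hypothetical.

```python
from importlib import metadata


def is_legacy_openai(version: str) -> bool:
    """True when the version string belongs to the pre-1.0 series."""
    major = version.split(".", 1)[0]
    return major.isdigit() and int(major) < 1


def installed_openai_version():
    """Return the installed openai version string, or None if it is absent."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_legacy_openai(installed_openai_version() or "")` returns False when the package is missing or is a 1.x release, which suggests reinstalling from requirements.txt before running make test.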
Notes
Two problems addressed (A and B), plus further notes (C to E):
A) The Docker API build was downloading old .bin files, which are now deprecated. The new file format is gguf.
B) The API tests run from the Makefile were using outdated completion functions. Workaround is to use openai==0.28.0
C) Not able to test embedding successfully
D) Added variables
E) All endpoints working:
Engines.py
chat.py