this is a light OpenAI API wrapper for grall.
- clone the repo
- start your grall models and keep all the sockets in the same directory
let's say your models are in ./models/:
grall ipc ./models/model1.gril ./sockets/model1.sock
grall ipc ./models/model2.gril ./sockets/model2.sock
- set up environment variables; here are the most basic ones for the above example
SOCKS_DIR=sockets
see configuration for more env variables.
- start
go run cmd/opengrall/main.go

endpoints:
1. GET /v1/models
1. POST /v1/chat/completions (streaming and non-streaming)

configuration (environment variables):
ADDR=:8080 # note the leading colon
SOCKS_DIR=./sockets # directory that holds the model sockets
LIMIT=1024 # limit each generation to this many bytes
DELAY=8 # delay in ms for each generation
ID_MAX=256 # increase if you have many concurrent generations
OWNED_BY=user # model owner info on /v1/models

todo:
- authentication
- better config