API Improvements #962
Comments
+1 for the chat agent support and potential template format change. Even though LangChain supports Ollama out of the box, its model implementation is wrong because it uses its own prompt format (i.e. Alpaca-like) to preprocess the input, which is then wrapped again with a model-specific prompt template once the request reaches the server. (See https://github.com/langchain-ai/langchainjs/blob/main/langchain/src/chat_models/ollama.ts#L256) It's a problem that LangChain should fix, but the real issue is that there's no way to implement the model correctly given how Ollama currently handles the prompt template. To be specific, LangChain presupposes that a chat model can process a list of messages (from the system, user, or AI) in a single request. But even though we may change …
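To make the double templating concrete, here is an illustrative trace (the tags are stand-ins, not LangChain's verbatim output): the client flattens the message list into one prompt string in its own format, and the server then substitutes that entire string into the model's template as if it were a single user turn:

```
# 1. Client-side preprocessing flattens [system, user] messages into one string:
<<SYS>> You are a helpful assistant. <</SYS>> [INST] Hello! [/INST]

# 2. Server-side, the Modelfile template wraps that whole string again,
#    so the model sees nested, duplicated control tokens:
[INST] <<SYS>> You are a helpful assistant. <</SYS>> [INST] Hello! [/INST] [/INST]
```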
Hi, is someone working on the feature to enable batch processing with embeddings? Without it, the feature is not usable beyond basic testing with small corpora of text.
Batch embeddings really are a must for the whole embeddings feature to be usable. It looks like some work was done in #3642, though it's been in draft state for a while.
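For context, `POST /api/embeddings` currently embeds one string per request, so embedding a corpus costs one HTTP round-trip per chunk. The batched shape being asked for would accept an array; the `prompts` field below is hypothetical, just to make the request concrete:

```
# Today: one round-trip per chunk
POST /api/embeddings
{ "model": "llama2", "prompt": "first chunk" }

# Hypothetical batched form: one round-trip for the whole corpus
POST /api/embeddings
{ "model": "llama2", "prompts": ["first chunk", "second chunk", "third chunk"] }
```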
I'm currently writing a webui for ollama but I find the API quite limited/cumbersome.
What is your vision/plan regarding it? Is it in a frozen state, or are you planning to improve it?
Here's some criticism:
- Mixed model/generation endpoints; some namespacing would be nice.
- Mixed `model`/`name` params that refer to the same thing.
- `/api/tags`: why is this named "tags"? And why `GET /api/tags` to get all available local models, but `POST /api/show` to get a single one?
- Some endpoints throw errors, others return `status` as a JSON property.
- No way to query the available public models repository.
- `POST /api/create`: doesn't allow specifying the `Modelfile` as raw text, so there's no way to create models without file system access (client-side). There's also no way to specify the model file as a plain object. For this to work, `FROM` would also need to handle remote resources as well.
- `POST /api/show`: returns a string, which forces the client to parse it to get the actual data. It would be nice if it also returned a JSON object (see the sketch after this list).
- `POST /api/embeddings`: mostly useless without batching support.
- `template` in `Modelfile`: to properly support chat agents it would be nice to have a chat-specific generation endpoint and to be able to iterate over the messages inside the `template`. Otherwise the feature is quite limited and requires the client to override and re-implement all the prompting logic (and to know all the underlying model parameters). (This is how Hugging Face does it: https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/tokenizer_config.json#L34)
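On the `POST /api/show` point, a rough sketch of what a structured response could look like (the pre-parsed field names below are hypothetical, not an existing response shape):

```
POST /api/show
{ "name": "llama2" }

# Today: the Modelfile arrives as one string the client has to parse
{ "modelfile": "FROM llama2\nTEMPLATE \"...\"\nPARAMETER stop \"...\"" }

# Hypothetical: the same data, pre-parsed into JSON
{ "from": "llama2", "template": "...", "parameters": { "stop": ["..."] } }
```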
Example: define the `Modelfile` `template` so it iterates over the list of messages instead of substituting a single prompt, and then pass the messages as a JSON array to `POST /api/chat/generate`.
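A minimal sketch of the idea. The message-iterating syntax (`.Messages`, `.Role`, `.Content`) and the `/api/chat/generate` endpoint are hypothetical; the `.System`/`.Prompt` form is the current Ollama style:

```
# Hypothetical Modelfile: TEMPLATE iterates over a list of chat messages
FROM llama2
TEMPLATE """{{ range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
```

instead of:

```
# Current style: a single system string and a single prompt string
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
```

with the client sending the conversation as data rather than pre-rendered text:

```
POST /api/chat/generate
{
  "model": "mymodel",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hi!" }
  ]
}
```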
Here's my app if you want to have a peek: https://github.com/knoopx/llm-workbench