
API Improvements #962

Open
knoopx opened this issue Nov 1, 2023 · 3 comments
Labels
feature request New feature or request

Comments

@knoopx

knoopx commented Nov 1, 2023

I'm currently writing a web UI for Ollama, but I find the API quite limited and cumbersome.
What is your vision/plan regarding it? Is it in a frozen state, or are you planning to improve it?

Here's some criticism:

  • Mixed model/generation endpoints; some namespacing would be nice.

  • Mixed model/name params that refer to the same thing.

  • /api/tags: why is this named tags?

  • GET /api/tags to get all available local models but POST /api/show to get one?

  • Some endpoints throw errors, while others return the status as a JSON property.

  • No way to query the available public models repository

  • POST /api/create: doesn't allow specifying the Modelfile as raw text, so there's no way to create models without file system access (client-side). There's also no way to specify the Modelfile as a plain object. For this to work, FROM would also need to handle remote resources. (See the sketch after this list.)

  • POST /api/show: returns a string, which forces the client to parse it to get the actual data. It would be nice if it also returned a JSON object.

  • POST /api/embeddings: without batching support it is mostly useless.

  • template in Modelfile:
    To properly support chat agents, it would be nice to have a chat-specific generation endpoint and to be able to iterate over the messages in the template.
    Otherwise the feature itself is quite limited and requires the client to mostly override and re-implement all the logic (and it also needs to know all the underlying model parameters).
    (This is how Hugging Face does it: https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/tokenizer_config.json#L34)

    Example:

    Define the Modelfile template as:

    {{range .Messages}}
    <|{{ .Role }}|>
    {{ .Content }}
    </s>
    {{end}}
    

    instead of:

    {{- if .System }}
    <|system|>
    {{ .System }}
    </s>
    {{- end }}
    <|user|>
    {{ .Prompt }}
    </s>
    <|assistant|>
    

    and then passing the messages as a JSON array to

    POST /api/chat/generate

    {
        ...,
        "messages": [
            {"role": "system", "content": "you are an assistant"},
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": "hi there"}
        ]
    }
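
Related to the /api/create point above, here is a sketch of what creating a model without file system access could look like, assuming a hypothetical field that accepts the Modelfile contents as raw text (the field name and payload shape are illustrative, not the current API):

    POST /api/create

    {
        "name": "zephyr-agent",
        "modelfile": "FROM zephyr\nTEMPLATE \"\"\"{{range .Messages}}\n<|{{ .Role }}|>\n{{ .Content }}\n</s>\n{{end}}\"\"\""
    }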
    

Here's my app if you want to have a peek: https://github.com/knoopx/llm-workbench

@BruceMacD BruceMacD added the feature request New feature or request label Nov 1, 2023
@mysticfall

+1 for the chat agent support and potential template format change.

Even though LangChain supports Ollama out of the box, its model implementation is wrong because it uses its own prompt format (i.e. Alpaca-like) to preprocess the input, which is again wrapped with a model-specific prompt template once the request is sent to the server. (See https://github.com/langchain-ai/langchainjs/blob/main/langchain/src/chat_models/ollama.ts#L256)

It's a problem that LangChain should fix, but the real issue is that there's no way to correctly implement the model with how Ollama currently handles the prompt template.

To be specific, LangChain presupposes a chat model can process a list of messages in a single prompt, which can be from the system, user, or AI.

But even if we changed ChatOllama to query the model-specific template and send a pre-formatted prompt using the raw parameter, there's no way to parse that template to extract the proper format for each message type.
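
In practice, the only workaround right now is for the client to render the model-specific prompt itself and bypass Ollama's templating via the raw flag, roughly along these lines (the prompt string assumes a zephyr-style template; the exact format differs per model):

    POST /api/generate

    {
        "model": "zephyr",
        "raw": true,
        "prompt": "<|system|>\nyou are an assistant\n</s>\n<|user|>\nhello\n</s>\n<|assistant|>\n"
    }

And that per-model prompt format is exactly the knowledge LangChain cannot extract from the API today.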

@qsdhj

qsdhj commented May 14, 2024

Hi, is someone working on the feature to enable batch processing with embeddings? Without it, the feature is not usable beyond basic testing with small corpora of text.

@IvanoBilenchi

Batch embeddings really are a must for the whole embeddings feature to be usable. It looks like some work was done in #3642, though it's been in draft state for a while.
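
For reference, a batched call would presumably accept a list of inputs and return one embedding per entry, roughly like the sketch below (the array-valued field is illustrative, not the current /api/embeddings API, which takes a single prompt):

    POST /api/embeddings

    {
        "model": "nomic-embed-text",
        "input": [
            "first document",
            "second document"
        ]
    }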
