
Improve OpenAI API compatibility #216

Open
rbollampally opened this issue Jan 31, 2024 · 13 comments

@rbollampally

rbollampally commented Jan 31, 2024

Feature request

Implement a `/v1/models` endpoint like the OpenAI API's to list available local LoRAs. This depends on #199.

There is also a hurdle: a user may have multiple base models and multiple local LoRAs, and I don't know of an effective way to filter the LoRAs applicable to the currently loaded base model. This could probably be worked on after an initial release.

Motivation

Many OpenAI-compatible web UI projects, such as ollama-webui, expect a list of available models via the `/v1/models` endpoint.
This would only work for local LoRAs as suggested in #199, since it is practically impossible to list every LoRA on the Hugging Face Hub.
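For reference, the OpenAI `/v1/models` endpoint returns a list object of model entries, so a LoRAX implementation would presumably emit one entry per local LoRA. A minimal sketch of that response shape (the `id` and `owned_by` values here are hypothetical):

```python
import json

# Sketch of an OpenAI-style /v1/models response listing local LoRAs.
# The adapter id and "owned_by" value are made-up placeholders.
models_response = {
    "object": "list",
    "data": [
        {
            "id": "my-org/my-local-lora",  # hypothetical local adapter name
            "object": "model",
            "created": 1706659200,
            "owned_by": "local",
        }
    ],
}

print(json.dumps(models_response, indent=2))
```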

Your contribution

I'm trying to extend the current OpenAI implementation to support listing models. I can submit a PR when it's complete, if the community is interested, but it will be of little use without #199.

@magdyksaleh
Collaborator

This is interesting. I can see us listing either (1) all available local LoRAs or (2) all previously used / cached LoRAs.

@nidhoggr-nil

While trying to use LoRAX for structured generation with different frameworks, I've found that its divergence from the OpenAI structured-generation mode causes issues. I believe it's due to this part of the LoRAX documentation:

"Note

Currently a schema is required. This differs from the existing OpenAI JSON mode, in which no schema is supported."

Looking at #389, it seems LoRAX requires a "response_format" field to be set.

But an OpenAI request looks more like:

{
    "messages": [{"role": "user", "content": "Create a Superhero named Garden Man."}],
    "model": "gradientai/Llama-3-8B-Instruct-Gradient-1048k",
    "stream": true,
    "tool_choice": {"type": "function", "function": {"name": "return_superhero"}},
    "tools": [{
        "type": "function",
        "function": {
            "name": "return_superhero",
            "parameters": {
                "properties": {
                    "name": {"title": "Name", "type": "string"},
                    "age": {"title": "Age", "type": "integer"},
                    "power": {"title": "Power", "type": "string"},
                    "enemies": {"items": {"type": "string"}, "title": "Enemies", "type": "array"}
                },
                "required": ["name", "age", "power", "enemies"],
                "type": "object"
            }
        }
    }]
}

This is missing some chunks of what LoRAX actually expects, not to mention that the expected keys are not set.

LoRAX would expect something like this inside the "response_format" key:

schema = {
    "$defs": {
        "Armor": {
            "enum": ["leather", "chainmail", "plate"],
            "title": "Armor",
            "type": "string"
        }
    },
    "properties": {
        "name": {"maxLength": 10, "title": "Name", "type": "string"},
        "age": {"title": "Age", "type": "integer"},
        "armor": {"$ref": "#/$defs/Armor"},
        "strength": {"title": "Strength", "type": "integer"}
    },
    "required": ["name", "age", "armor", "strength"],
    "title": "Character",
    "type": "object"
}

This makes one of the great use cases for LoRAX, namely multi-model systems, harder to use with frameworks that are already written for structured generation / function calling / tool use against OpenAI-compatible endpoints.
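Until native support lands, one workaround would be a small client-side shim that lifts the JSON Schema out of the OpenAI-style "tools" payload and places it under "response_format". A rough sketch (the exact "response_format" shape LoRAX accepts is an assumption based on the note quoted above):

```python
def tools_to_response_format(request: dict) -> dict:
    """Convert an OpenAI-style tools request into a schema-based
    response_format request. Sketch only: the "json_object"/"schema"
    keys are assumed, mirroring the structured-generation docs."""
    out = {k: v for k, v in request.items() if k not in ("tools", "tool_choice")}
    # Find the function the caller forced via tool_choice.
    chosen = request["tool_choice"]["function"]["name"]
    for tool in request["tools"]:
        fn = tool["function"]
        if fn["name"] == chosen:
            out["response_format"] = {
                "type": "json_object",       # assumed key name
                "schema": fn["parameters"],  # the JSON Schema to enforce
            }
            break
    return out

converted = tools_to_response_format({
    "messages": [{"role": "user", "content": "Create a Superhero."}],
    "tool_choice": {"type": "function", "function": {"name": "return_superhero"}},
    "tools": [{"type": "function", "function": {
        "name": "return_superhero",
        "parameters": {"type": "object",
                       "properties": {"name": {"type": "string"}},
                       "required": ["name"]},
    }}],
})
```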

@codybum

codybum commented Jun 7, 2024

I second the need for this.

@jeffreyftang
Contributor

Hi @nidhoggr-nil, you're absolutely correct that LoRAX does not yet support the tool-choice / function-calling style of structured generation.

That being said, I have a PR in progress to add support for functions/tools that I'm hoping to land in the next week or so. Stay tuned :)

@codybum

codybum commented Jun 8, 2024

Fantastic! Looking forward to it.

@codybum

codybum commented Jun 18, 2024

Hi @jeffreyftang, is there anything we can do to help or test? I noticed a function_calling branch, which I would be happy to test if it is functional. Thanks!

@jeffreyftang
Contributor

Hi @codybum, the branch is mostly functional in terms of enforcing the desired output format, but there's still some work to be done for automatically injecting the available tools into the prompt (currently the prompter would need to do so manually).

Once that's done and the code is cleaned up a bit, it should be ready to go.
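To illustrate what "injecting manually" could look like in the meantime, here is a rough sketch of splicing tool definitions into a Llama-3-style system prompt on the client side (the prompt wording, and whether a given model was tuned for it, are assumptions):

```python
import json

def inject_tools(system_prompt: str, tools: list) -> str:
    """Manually splice tool definitions into a Llama-3-style system
    prompt, since the branch does not yet inject them automatically.
    Sketch only: the exact phrasing a model expects may differ."""
    tool_text = "\n".join(json.dumps(t["function"]) for t in tools)
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>"
        f"{system_prompt}\nThe functions available to you are:\n{tool_text}"
        "<|eot_id|>"
    )

prompt = inject_tools(
    "You are a helpful assistant that can call external functions.",
    [{"type": "function", "function": {
        "name": "get_current_weather",
        "parameters": {"type": "object"},
    }}],
)
```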

@codybum

codybum commented Jun 21, 2024

@jeffreyftang this is great news. We have been experimenting with structured output and the results are promising.

Do you happen to have an example tool and prompt configuration that we could work from? I'm happy to give it a go.

@codybum

codybum commented Jul 3, 2024

@jeffreyftang given the weather-calling example, what would need to be provided in the input prompt? I tried to piece together the following example, which appears to hit some of the tools code but returns only {"generated_text":"[\n ]"}:

curl http://10.33.31.21:8080/generate \
    -X POST \
    -d '{
        "inputs": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls. The functions available to you are as follows:\nget_current_weather\n<|eot_id|><|start_header_id|>user<|end_header_id|>What is the current temperature of New York, San Francisco and Chicago?<|eot_id|><|start_header_id|>assistant<|end_header_id|>",
        "parameters": {
            "tools": [{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}}}}}]
        }
    }' \
    -H 'Content-Type: application/json'

@jeffreyftang
Contributor

Sorry for the long delay here - there's a PR up for review now: #536

There's an example of how to invoke it in the PR description as well :)

@codybum

codybum commented Jul 18, 2024

Hi @jeffreyftang, I still need to try Mistral as in your example, but with Llama3 8B Instruct, Hermes-2-Theta-Llama-3-8B-32k, and Llama-3-Groq-8B-Tool-Use I get a response like the following: {"generated_text":"[ ]"}

I will wait until merge then try and build again, perhaps I am doing something wrong in the build process.

@jeffreyftang
Contributor

Hi @codybum, thanks for the feedback! It's possible I'm doing something wrong with the prompt modification - I'll take a closer look at some of those models.

@llama-shepard
Contributor

llama-shepard commented Jul 19, 2024

@jeffreyftang Chat templates (https://huggingface.co/docs/transformers/chat_templating) will already handle this, right?

But some models (like Llama 3) do not support tools in their default chat templates.
