OpenAI compat API adapter #466

Merged
merged 2 commits into from
Oct 9, 2023


@lofcz lofcz commented Oct 8, 2023

The current OpenAI-like API uses hardcoded chat templates. This PR implements a non-breaking adapter that lets users supply the chat template a given model requires. Test request against Mistral7B Dolphin:

{
    "temperature": 0.5,
    "max_tokens": 1024,
    "messages": [
        {
            "role": "system",
            "content": "You roleplay as a dungeon master engaged in a session of Dungeons and Dragons with the user. Write in an immersive way to avoid spoiling the user's experience."
        },
        {
            "role": "user",
            "content": "I am a kobold named Nico, what should I do?"
        }
    ],
    "adapter": {
        "templates": {
          "system": {
              "start": "<|im_start|>system\n",
              "end": "<|im_end|>\n"
          },
          "user": {
              "start": "<|im_start|>user\n",
              "end": "<|im_end|>\n"
          },
          "assistent": {
              "start": "",
              "end": ""
          },
          "after_last_message": ""
        }
    }
}

This PR proposes the following non-breaking addition to the /v1/chat/completions endpoint:

+"adapter": {
+    "templates": {
+        "system": {
+            "start": String | None,
+            "end": String | None
+        },
+        "user": {
+            "start": String | None,
+            "end": String | None
+        },
+        "assistent": {
+            "start": String | None,
+            "end": String | None
+        },
+        "after_last_message": String | None
+    }
+}

If users omit the adapter object in the request, we fall back to the default Vicuna-style template.
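
A minimal Python sketch of how the nested templates proposed above might be applied to a message list. It is not the code from this PR: `render_prompt` and `DEFAULT_TEMPLATES` are hypothetical names, and the default strings are placeholders rather than koboldcpp's actual built-in template.

```python
from typing import Optional

# Placeholder defaults for illustration only (assumption, not koboldcpp's real strings).
DEFAULT_TEMPLATES = {
    "system": {"start": "", "end": "\n"},
    "user": {"start": "USER: ", "end": "\n"},
    "assistent": {"start": "ASSISTANT: ", "end": "\n"},  # key spelled as in the proposal
    "after_last_message": "ASSISTANT: ",
}


def render_prompt(messages: list, adapter: Optional[dict] = None) -> str:
    """Wrap each message in its role's start/end strings, falling back to defaults."""
    templates = (adapter or {}).get("templates") or {}
    parts = []
    for msg in messages:
        # The proposal spells the assistant key "assistent"; map the role onto it.
        role = "assistent" if msg["role"] == "assistant" else msg["role"]
        custom = templates.get(role) or {}
        default = DEFAULT_TEMPLATES.get(role, DEFAULT_TEMPLATES["user"])
        start, end = custom.get("start"), custom.get("end")
        parts.append(default["start"] if start is None else start)
        parts.append(msg["content"])
        parts.append(default["end"] if end is None else end)
    # Appended once after the final message; an empty string disables it.
    tail = templates.get("after_last_message")
    parts.append(DEFAULT_TEMPLATES["after_last_message"] if tail is None else tail)
    return "".join(parts)
```

Calling `render_prompt(messages)` without an adapter uses only the placeholder defaults, which mirrors the non-breaking fallback described above.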

Response with this patch:
image

Response with stock 1.46.1 build:
image

@LostRuins
Owner

This is a great idea, although I would probably simplify the syntax a bit into a single object. Is there any other project that does this currently? If there's an established spec, I could follow it.

@LostRuins
Owner

LostRuins commented Oct 9, 2023

Without any other spec, what about something like:

"adapter": {
    "system_start": "str",
    "system_end": "str",
    "user_start": "str",
    "user_end": "str",
    "assistant_start": "str",
    "assistant_end": "str"
}

With any missing or null field replaced with the default value for it.

What's a good use case for after_last_message? It seems like it would break most bot responses.

@LostRuins LostRuins added the enhancement New feature or request label Oct 9, 2023
@lofcz
Author

lofcz commented Oct 9, 2023

@LostRuins thanks. I've implemented preliminary support in my lib OpenAiNg, but the format can still be changed; I'm open to the one you've proposed.

As for after_last_message, it's used to support the old behaviour:
https://github.com/LostRuins/koboldcpp/pull/466/files#diff-885e6237f0dc0cc77c7b4a47ef801248f4d2e6a7743b37b85a451c3ac446cbd2L424

@LostRuins
Owner

I see. I think after_last_message should not really be needed, as the trailing tag is intended to be the AI's assistant_start response tag, consistent with the earlier user/AI dialog. Most instruct formats only use 2 (user/assistant) or 3 (user/system/assistant) tags, so this should align with them.
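
To make the simplified shape concrete, here is a rough Python sketch of the two ideas from this thread: missing or null adapter fields fall back to defaults, and the prompt is closed with the assistant_start tag instead of a separate trailing field. The names `resolve_adapter`, `build_prompt` and the `DEFAULTS` values are assumptions for illustration, not the merged implementation.

```python
# Placeholder defaults, for illustration only (not koboldcpp's real strings).
DEFAULTS = {
    "system_start": "",
    "system_end": "\n",
    "user_start": "USER: ",
    "user_end": "\n",
    "assistant_start": "ASSISTANT: ",
    "assistant_end": "\n",
}


def resolve_adapter(adapter=None):
    """Return all six tags, replacing any missing or null field with its default."""
    adapter = adapter or {}
    return {
        key: default if adapter.get(key) is None else adapter[key]
        for key, default in DEFAULTS.items()
    }


def build_prompt(messages, adapter=None):
    """Wrap each message in its role tags, then open the assistant's turn."""
    tags = resolve_adapter(adapter)
    prompt = ""
    for msg in messages:
        role = msg["role"]
        if role not in ("system", "user", "assistant"):
            role = "user"  # assumption: treat unknown roles as user turns
        prompt += tags[role + "_start"] + msg["content"] + tags[role + "_end"]
    # Ending on assistant_start cues the model to answer as the assistant.
    return prompt + tags["assistant_start"]
```

With a ChatML-style adapter, the prompt would end in `<|im_start|>assistant\n`, which keeps the final tag consistent with earlier assistant turns in the dialog.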

@LostRuins
Owner

Hi @lofcz please take a look, I have simplified the API as mentioned, let me know if it works adequately with your frontend (try both with and without the adapter to see if everything is ok)

@lofcz
Author

lofcz commented Oct 9, 2023

@LostRuins thanks for the edit, I've tried it both with and without and it works great. I'd share a video but my frontend is not in English so it wouldn't be legible for most.

@LostRuins
Owner

okay then looks good to me. will merge this :) cheers

@LostRuins LostRuins added the completed completed label Oct 9, 2023
@lofcz
Author

lofcz commented Oct 9, 2023

thanks!

@LostRuins LostRuins changed the base branch from concedo to concedo_experimental October 9, 2023 15:24
@LostRuins LostRuins merged commit 96e9539 into LostRuins:concedo_experimental Oct 9, 2023
@teddybear082

Nice addition lofcz and LostRuins!!! Very cool.

@aseichter2007

What is the final format expected?

@LostRuins
Owner

As shown above. Just add the adapter to the regular json request body.
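
For readers wanting a concrete starting point, a small Python sketch of attaching the flat adapter object to an otherwise ordinary chat completions body. The URL (local server on port 5001) and the helper name `make_request` are assumptions; only the `adapter` field names come from this thread.

```python
import json
import urllib.request

# Assumed local server address; adjust to your own setup.
URL = "http://localhost:5001/v1/chat/completions"


def make_request(messages, adapter=None, **params):
    """Build a POST request; the adapter is optional and simply omitted if None."""
    body = dict(params, messages=messages)
    if adapter is not None:
        body["adapter"] = adapter
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = make_request(
    [{"role": "user", "content": "I am a kobold named Nico, what should I do?"}],
    adapter={
        "system_start": "<|im_start|>system\n",
        "system_end": "<|im_end|>\n",
        "user_start": "<|im_start|>user\n",
        "user_end": "<|im_end|>\n",
        "assistant_start": "<|im_start|>assistant\n",
        "assistant_end": "<|im_end|>\n",
    },
    temperature=0.5,
    max_tokens=1024,
)
# To actually send it: urllib.request.urlopen(req)
```

Leaving out `adapter` produces a plain OpenAI-style body, so existing clients should keep working unchanged.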
