
[Feature]: vLLM support for function calling in Mistral-7B-Instruct-v0.3 #5156

Closed
javierquin opened this issue May 31, 2024 · 2 comments

Comments

@javierquin

🚀 The feature, motivation and pitch

Mistral-7B-Instruct-v0.3 implements function calling. This is a powerful tool, and Mistral AI is one of the first to support it in a "small" model.

from mistral_inference.model import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.tool_calls import Function, Tool


# mistral_models_path is the directory containing the downloaded
# Mistral-7B-Instruct-v0.3 weights and tokenizer.model.v3
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
model = Transformer.from_folder(mistral_models_path)

completion_request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris?"),
    ],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

vLLM has the best inference speed, but I cannot find a way to use function calling with it (a sketch of the interface I am looking for is below).
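For reference, this is roughly what the requested workflow would look like through vLLM's OpenAI-compatible server, passing the same tool schema via the standard OpenAI client. This is only a sketch of the desired behaviour: at the time of this issue vLLM does not parse Mistral's tool-call output, so the tools field is accepted but no structured tool calls are returned.

# Sketch only, assuming the model is already being served, e.g. with:
#   python -m vllm.entrypoints.openai.api_server \
#       --model mistralai/Mistral-7B-Instruct-v0.3
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Same tool schema as in the mistral_inference example above
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=[{"role": "user", "content": "What's the weather like today in Paris?"}],
    tools=tools,
    temperature=0.0,
)

# With full support, a structured tool call would appear in
# response.choices[0].message.tool_calls; without it, only raw text is returned.
print(response.choices[0].message)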

Alternatives

No response

Additional context

No response

@hmellor
Collaborator

hmellor commented May 31, 2024

Duplicate of #1869

hmellor marked this as a duplicate of #1869 on May 31, 2024
hmellor closed this as not planned (duplicate) on May 31, 2024
@K-Mistele
Contributor

@hmellor @javierquin Please see #5649; limited support for Mistral-7B-Instruct-v0.3 is WIP and ready for testing. A couple of caveats still exist, but these are being worked out :)
