
[Feature Request] When generating using mlx_lm, specify data format #761

Closed
konarkm opened this issue May 8, 2024 · 2 comments

konarkm commented May 8, 2024

Regarding llms/mlx_lm/LORA.md

The instructions describe three dataset formats: chat, completions, and text. However, the Generate usage section doesn't explain how to use any of those three formats at generation time.
For example, I'm not sure how I would pass a system prompt along with a user message when calling the mlx_lm.generate command.

I'm not sure whether this is supported but not documented in LORA.md, or whether I just missed something!
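
For reference, the three dataset formats described in LORA.md look roughly like this (one JSON object per line; values abbreviated):

chat:        {"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
completions: {"prompt": "...", "completion": "..."}
text:        {"text": "..."}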


awni commented May 11, 2024

So usually generate uses the model's default chat template (so it's the chat format). You can use the raw prompt (so it's just the text format) by specifying --ignore-chat-template. There is currently no way to do the completions version from the CLI. But if you use the Python API you can do it like this:

from mlx_lm import load, generate

model, tokenizer = load("mistralai/Mistral-7B-Instruct-v0.1")

prompt = ""
completion = ""

# Render the prompt/completion pair through the chat template to build a completions-style input
text = tokenizer.apply_chat_template(
    [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": completion},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

response = generate(model, tokenizer, prompt=text)
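
From the CLI, the chat and text cases would look something like this (a sketch assuming the standard mlx_lm.generate flags; substitute your own model and prompt):

# chat format: the prompt is wrapped in the model's default chat template
python -m mlx_lm.generate --model mistralai/Mistral-7B-Instruct-v0.1 --prompt "..."

# text format: the raw prompt is used as-is
python -m mlx_lm.generate --model mistralai/Mistral-7B-Instruct-v0.1 --prompt "..." --ignore-chat-template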

awni closed this as completed May 11, 2024

konarkm commented May 12, 2024

I see, thank you!

In case it helps anyone down the line:
My goal was to generate using fine-tuned adapters, together with a system prompt and a user prompt in the chat format.

Based on generate.py and utils.py, I load the model with the adapters like this:

from mlx_lm import load, generate

model_repo = "mlx-community/Meta-Llama-3-8B-Instruct-4bit"
adapter_path = "/adapters"

model, tokenizer = load(model_repo, adapter_path=adapter_path)

Then, I can generate:

system_prompt = "Be a helpful assistant"
prompt = "Hey, tell me about Llama"

text = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

response = generate(model, tokenizer, prompt=text, verbose=True)
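
If you only need the adapters with the default chat template (no explicit system prompt), I believe mlx_lm.generate also accepts an adapter path directly from the CLI, something like:

python -m mlx_lm.generate --model mlx-community/Meta-Llama-3-8B-Instruct-4bit --adapter-path ./adapters --prompt "Hey, tell me about Llama"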
