Regarding llms/mlx_lm/LORA.md:

The instructions specify chat, completions, and text formats for the dataset. However, the Generate section gives no instructions for how to use any of those three formats at generation time.

For example, I'm not sure how I would pass a system prompt as well as a user message when calling the mlx_lm.generate command.

I'm not sure if this is something that is supported but not documented in LORA.md, or maybe I missed something!
By default, generate uses the model's chat template (so it's the chat format). You can use the raw prompt (so it's just the text format) by specifying --ignore-chat-template. There is currently no way to do the completions version in the CLI, but if you use the Python API you could do it like this:
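A minimal sketch (the model path and prompt are placeholders; this assumes `load` and `generate` from the `mlx_lm` package):

```python
from mlx_lm import load, generate

# Placeholder model path; use your own model or HF repo
model, tokenizer = load("mlx-community/Mistral-7B-v0.1-4bit")

# Completions-style: build the raw prompt text yourself.
# The Python generate() does not apply a chat template, so whatever
# string you pass is exactly what the model sees.
prompt = "Q: What is the capital of France?\nA:"

response = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(response)
```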
In case it helps anyone down the line: my goal was to generate with fine-tuned adapters, using both a system prompt and a user prompt in the chat format.

Based on generate.py and utils.py, I load the model with the adapters like this:
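Something along these lines (the base model, adapter path, and messages are placeholders; on older mlx-lm versions the load keyword was adapter_file rather than adapter_path):

```python
from mlx_lm import load, generate

# Load the base model together with the fine-tuned LoRA adapters
model, tokenizer = load(
    "mlx-community/Mistral-7B-Instruct-v0.2-4bit",  # placeholder base model
    adapter_path="adapters",  # directory produced by mlx_lm.lora training
)

# Build a chat-format prompt containing both a system and a user message
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize LoRA in one sentence."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```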