A Ruby LLM inference toolkit built on the `mlx` gem.
For full reference pages and deep dives, start at docs/index.md.
Install with:

```sh
gem install mlx-ruby-lm
```

Or add it to a project:

```sh
bundle add mlx-ruby-lm
```

See docs/installation.md for requirements and source installs.
Executable: `mlx_lm`

Commands:

- `mlx_lm generate`
- `mlx_lm chat`
- `mlx_lm server`
Quick examples:

```sh
mlx_lm generate --model /path/to/model --prompt "Hello"
mlx_lm chat --model /path/to/model --system-prompt "You are concise."
mlx_lm server --model /path/to/model --host 127.0.0.1 --port 8080
```

See docs/cli.md for options, defaults, and current parser/behavior caveats.
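Once the server is running, requests can be made from plain Ruby. A minimal sketch of building such a request; note the endpoint path (`/v1/chat/completions`) and payload shape are assumptions (an OpenAI-style chat API), not something this README confirms, so check docs/cli.md for the server's actual routes:

```ruby
require "json"
require "net/http"

# Hypothetical request builder: the endpoint path and JSON payload shape
# here are ASSUMPTIONS (OpenAI-style chat completions), not confirmed by
# this README -- see docs/cli.md for the server's actual API.
def build_chat_request(host:, port:, prompt:, max_tokens: 64)
  uri = URI("http://#{host}:#{port}/v1/chat/completions")
  req = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  req.body = JSON.generate(
    messages: [{ role: "user", content: prompt }],
    max_tokens: max_tokens
  )
  [uri, req]
end

uri, req = build_chat_request(host: "127.0.0.1", port: 8080, prompt: "Hello")
# Send it once the server is up:
# Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
```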
```ruby
require "mlx"
require "mlx_lm"

model, tokenizer = MlxLm::LoadUtils.load("/path/to/model")
text = MlxLm::Generate.generate(model, tokenizer, "Hello", max_tokens: 64)
puts text
```

Streaming:

```ruby
MlxLm::Generate.stream_generate(model, tokenizer, "Hello", max_tokens: 64).each do |resp|
  print resp.text
end
puts
```

See docs/ruby-apis.md for the full API inventory.
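Because `stream_generate` yields an enumerable of response objects, it composes with ordinary `Enumerable` methods. A sketch of accumulating the streamed text into one string, using a stand-in `Struct` in place of the real responses (which need a loaded model, but expose `.text` the same way per the example above):

```ruby
# Stand-in for the response objects stream_generate yields; the real
# ones require a loaded model, but expose .text as shown in this README.
Resp = Struct.new(:text)
stream = [Resp.new("Hel"), Resp.new("lo"), Resp.new("!")].each

# Accumulate chunks into the full completion instead of printing them.
full = stream.map(&:text).join
puts full
```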
`LoadUtils.load` expects a local model directory containing files such as
`config.json`, `tokenizer.json`, and `model*.safetensors`.
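A quick pre-flight check for that layout can turn a confusing load failure into a clear error. A sketch using only the file names listed above (the exact set a given model needs may vary):

```ruby
# Check that a directory has the files LoadUtils.load expects, per this
# README: config.json, tokenizer.json, and at least one model*.safetensors.
def model_dir_ready?(dir)
  %w[config.json tokenizer.json].all? { |f| File.exist?(File.join(dir, f)) } &&
    !Dir.glob(File.join(dir, "model*.safetensors")).empty?
end
```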
To inspect supported model keys at runtime:

```ruby
require "mlx_lm"

puts MlxLm::Models::REGISTRY.keys.sort
```

See docs/models.md for the full registry keys and remapping behavior.
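One use for those keys is checking whether a downloaded model's architecture is supported before trying to load it. A sketch that assumes registry keys correspond to the `"model_type"` field in `config.json`; that correspondence (and any remapping) is described in docs/models.md, not guaranteed here:

```ruby
require "json"

# ASSUMPTION: REGISTRY keys line up with config.json's "model_type"
# values; docs/models.md has the real keys and remapping behavior.
def supported_model?(model_dir, registry_keys)
  config = JSON.parse(File.read(File.join(model_dir, "config.json")))
  registry_keys.include?(config["model_type"])
end

# Usage (with the gem loaded):
# supported_model?("/path/to/model", MlxLm::Models::REGISTRY.keys)
```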