Lazy import + refactor Lora layer addition #426

awni · 2024-02-09T01:32:51Z

Add lazy loading of model architectures
Refactor LoRA as a step towards making it more general
Add support for LoRA tuning of OLMo
Move Phixtral into mlx-lm and enable it with LoRA
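
A minimal sketch of the lazy-loading idea, not the code in this PR: instead of importing every architecture module up front, the module for a given `model_type` is imported only when it is asked for. `load_architecture`, its `package` parameter, and the `remapping` argument are hypothetical names, and the standard-library `json` package stands in for a models package so the snippet runs on its own.

```python
import importlib
from types import ModuleType
from typing import Mapping, Optional


def load_architecture(
    model_type: str,
    package: str,
    remapping: Optional[Mapping[str, str]] = None,
) -> ModuleType:
    """Import `<package>.<model_type>` only when it is requested."""
    # Resolve aliases first, e.g. one model type reusing another's module.
    name = (remapping or {}).get(model_type, model_type)
    try:
        return importlib.import_module(f"{package}.{name}")
    except ImportError as exc:
        # Turn a missing module into a clearer, user-facing error.
        raise ValueError(f"Model type {model_type!r} is not supported.") from exc


# Stand-in usage: "json" plays the role of the models package here.
decoder = load_architecture("decoder", package="json")
aliased = load_architecture("alias", package="json", remapping={"alias": "decoder"})
```

With this shape, adding an architecture means adding a module file rather than editing a central dictionary of imports, and an unused architecture is never imported.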

mzbac · 2024-02-09T01:47:43Z

llms/mlx_lm/tuner/utils.py

+        starting from the last layer.
+    """
+    if model.model_type in [
+        "mistral",


Nit: It would be easier for future updates if we made those model types constants and defined them at the top of utils.py.

I think this function is going to turn into a sequence of if elif depending on the model types, making each one a constant might not make sense? (e.g. see the branch with olmo)

I admit it's not super clean, but I couldn't think of a better approach yet.

I also want to add a config option to choose which layers (e.g. to make it easy to add MLP layers). Will do it in a follow up probably.

Makes sense, and the function is already at the top of utils.py, so it wouldn't be too bad to update it.
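
The trade-off in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the function from the diff: `QV_PROJ_MODEL_TYPES`, `last_layers`, and `lora_target_names` are hypothetical names, and the projection names are placeholders. A constant works for model types that share one layout, while a model with a different layout (such as the olmo branch mentioned above) still needs its own branch.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical module-level constant, along the lines of the nit above.
QV_PROJ_MODEL_TYPES: Tuple[str, ...] = ("mistral", "llama")


def last_layers(layers: Sequence[T], num_lora_layers: int) -> List[T]:
    """Return the final `num_lora_layers` layers, keeping their order."""
    if not 0 <= num_lora_layers <= len(layers):
        raise ValueError(
            f"Requested {num_lora_layers} LoRA layers, "
            f"but the model has {len(layers)}."
        )
    # Slice from an explicit start index so that 0 yields an empty list.
    return list(layers[len(layers) - num_lora_layers:])


def lora_target_names(model_type: str) -> Tuple[str, ...]:
    """Name the projections to adapt; the names are placeholders."""
    if model_type in QV_PROJ_MODEL_TYPES:
        return ("q_proj", "v_proj")
    elif model_type == "olmo":
        # A differently shaped model needs its own branch, not just a constant.
        return ("att_proj",)
    raise ValueError(f"LoRA is not set up for model type {model_type!r}.")
```

A config option for choosing layers, as floated above, could later replace the hard-coded tuples returned by `lora_target_names`.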

mzbac · 2024-02-09T01:48:40Z

Looks very good, a much cleaner solution than I thought 🚀

mzbac · 2024-02-10T03:02:24Z

llms/mlx_lm/utils.py

-    "qwen2": qwen2,
+MODEL_REMAPPING = {
+    "mistral": "llama",  # mistral is compatible with llama
+    "phi-msft": "phixtral",


This may cause some issues because the old phi2 was using this model type. If a user tries to load the old phi2 model, it will be mapped to phixtral, which won't work. See https://huggingface.co/microsoft/phi-2/blob/5d8f23da6be3205c16c06a9db3f22279ee23dbbf/config.json

Hmm, that looks like an old version of phi2. It wouldn't work with mlx-lm either way, regardless of that remapping. We could try to put a helpful error message in when constructing the Phixtral model?

It is too bad that they use the same model_type; I think it should have been different.

Maybe just mention it in the docs? It will error out for missing parameters during the model loading process anyway.
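
One way the "helpful error message" idea could look, as a sketch only: check the config for the fields the remapped model expects before constructing it, and explain the `phi-msft` ambiguity when they are absent. `check_config_keys` is a hypothetical helper, and the key names in `PHIXTRAL_KEYS` are assumptions for illustration; the real Phixtral config fields may differ.

```python
from typing import Any, Iterable, Mapping


def check_config_keys(
    config: Mapping[str, Any], required: Iterable[str], hint: str
) -> None:
    """Raise a descriptive error if `config` lacks any required key."""
    missing = sorted(key for key in required if key not in config)
    if missing:
        raise ValueError(f"Config is missing {missing}. {hint}")


# Assumed key names, for illustration only.
PHIXTRAL_KEYS = ("num_local_experts", "num_experts_per_tok")
HINT = (
    "model_type 'phi-msft' is remapped to Phixtral; older phi-2 checkpoints "
    "that use the same model_type are not supported."
)

# A config that has the expected fields passes silently.
check_config_keys(
    {"model_type": "phi-msft", "num_local_experts": 4, "num_experts_per_tok": 2},
    PHIXTRAL_KEYS,
    HINT,
)
```

Compared with letting loading fail on missing parameters, this points the user at the cause rather than the symptom.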

* lazy model import in mlx_lm
* change lora loading
* fix olmo lora
* remove a bunch of unused stuff from plamo
* move phixtral to mlx-lm and out of llms/

awni added 3 commits February 8, 2024 14:42

lazy model import in mlx_lm

baede24

change lora loading

358d9a3

fix olmo lora

0a8e299

mzbac reviewed Feb 9, 2024

awni requested a review from angeloskath February 9, 2024 05:51

awni added 2 commits February 8, 2024 22:18

remove a bunch of unused stuff from plamo

2f14279

move phixtral to mlx-lm and out of llms/

7dbb848

awni mentioned this pull request Feb 9, 2024

Enable the Mixtral-like Moe model without the quantized gate layer #394

Open

mzbac reviewed Feb 10, 2024

awni requested review from andresy and jagrit06 February 12, 2024 18:39

angeloskath approved these changes Feb 12, 2024

awni merged commit d466661 into main Feb 12, 2024

awni deleted the lazy_import branch February 12, 2024 18:51

awni mentioned this pull request Feb 13, 2024

LoRA Qwen2 Error: non-default argument 'hidden_size' follows default argument #438

Closed
