From 7fe56f4d7862dde48ba9023b50f17b4203c29c92 Mon Sep 17 00:00:00 2001
From: EricLBuehler
Date: Mon, 18 Mar 2024 19:14:28 -0400
Subject: [PATCH] Update readme

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 6687c84..d977b8c 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,8 @@ the original layers. Because they contain fewer trainable parameters, LoRA allow
 
 However, using a fine-tuned LoRA model for inference will have a negative impact on performance. This is because the original layer must still be used to calculate the outputs. However, for a LoRA model, an algorithm known as weight merging nullifies the added cost of using the fine-tuned LoRA model by merging the LoRA and original weights. Weights may also be unmerged.
 
+Please see our recent paper, [X-LoRA](https://github.com/EricLBuehler/xlora), which introduces a MoE-inspired method that densely gates LoRA adapters, driven by a model self-reflection forward pass. For inference, we have created [mistral.rs](https://github.com/EricLBuehler/mistral.rs), which is written in Rust and supports inference of X-LoRA and other models, including quantized models.
+
 ## Get started
 1) To install, run the following:
 ```