From 7fe56f4d7862dde48ba9023b50f17b4203c29c92 Mon Sep 17 00:00:00 2001
From: EricLBuehler
Date: Mon, 18 Mar 2024 19:14:28 -0400
Subject: [PATCH] Update readme

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 6687c84..d977b8c 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,8 @@ the original layers. Because they contain fewer trainable parameters, LoRA allow
 
 However, using a fine-tuned LoRA model for inference will have a negative impact on performance. This is because the original layer must still be used to calculate the outputs. However, for a LoRA model, an algorithm known as weight merging nullifies the added cost of using the fine-tuned LoRA model by merging the LoRA and original weights. Weights may also be unmerged.
 
+Please see our recent paper, [X-LoRA](https://github.com/EricLBuehler/xlora), which introduces a MoE-inspired method that densely gates LoRA adapters, driven by a model self-reflection forward pass. For inference, we have created [mistral.rs](https://github.com/EricLBuehler/mistral.rs), which is written in Rust and supports inference of X-LoRA and other models, including quantized models.
+
 ## Get started
 1) To install, run the following:
 ```