diff --git a/README.md b/README.md
index d4a88cf81..8e6cc2bb4 100644
--- a/README.md
+++ b/README.md
@@ -204,6 +204,14 @@ pip install "sglang[all]"
 You'll first launch a SGLang backend worker which will execute the models on GPUs. Remember the `--port` you've set and you'll use that later.
 
+For llava-v1.6-mistral-7b only:
+
+1. Run `git lfs install`.
+2. Run `git clone https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b`.
+3. Apply the patches from this PR: https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b/discussions/2/files.
+4. Specify the folder name (`llava-v1.6-mistral-7b` by default) as the `--model-path` instead, and remove `--tokenizer-path`; otherwise, the model will not load under SGLang serving.
+5. Example: `CUDA_VISIBLE_DEVICES=0 python3 -m sglang.launch_server --model-path llava-v1.6-mistral-7b --port 30000`
+
 ```Shell
 # Single GPU
 CUDA_VISIBLE_DEVICES=0 python3 -m sglang.launch_server --model-path liuhaotian/llava-v1.5-7b --tokenizer-path llava-hf/llava-1.5-7b-hf --port 30000