## Vicuna Weights

FastChat releases [Vicuna](https://vicuna.lmsys.org/) weights as delta weights to comply with the LLaMA model license. Adding the delta to the original LLaMA weights yields the Vicuna weights. The steps below apply the delta:

1. Get the original LLaMA weights in the Hugging Face format by running the code cell below or by following the instructions [here](https://huggingface.co/docs/transformers/main/model_doc/llama).

In [None]:
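import os
import subprocess
import sys

import transformers

# Directory containing the original LLaMA weights
Llama_model_dir = "path/to/LLaMA"

# Output directory for the weights converted to the Hugging Face format
Huggingface_model_dir = "huggingface_LLaMA"

# Installation directory of the transformers library
transformers_dir = os.path.dirname(transformers.__file__)

# Model size to convert; set to "13B" before the Vicuna-13B step
model_size = "7B"

# Build the conversion command as an argument list, using the interpreter
# that runs this notebook so the same transformers installation is used
convert_script = os.path.join(transformers_dir, "models", "llama", "convert_llama_weights_to_hf.py")
cmd = [
    sys.executable, convert_script,
    "--input_dir", Llama_model_dir,
    "--model_size", model_size,
    "--output_dir", f"{Huggingface_model_dir}/{model_size}",
]

# Run the command; check=True raises CalledProcessError if the conversion fails
subprocess.run(cmd, check=True)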
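Before moving on, it can help to confirm that the conversion produced a usable output directory. The sketch below is only a minimal check: the helper name `missing_hf_files` is hypothetical, and it assumes that a converted Hugging Face model directory contains at least a `config.json` file. It does not validate the weight files themselves.

```python
import os


def missing_hf_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are not files inside `model_dir`."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


# Same default output location as the conversion cell above
converted_dir = os.path.join("huggingface_LLaMA", "7B")

missing = missing_hf_files(converted_dir)
if missing:
    print(f"{converted_dir} is missing: {', '.join(missing)}")
else:
    print(f"{converted_dir} looks like a converted model directory")
```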


2. Use the following scripts to get Vicuna weights by applying our delta. They will automatically download delta weights from our Hugging Face [account](https://huggingface.co/lmsys).

**NOTE**:
Our released weights are only compatible with the latest main branch of huggingface/transformers.
Installing FastChat also installs a compatible version of transformers.
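Conceptually, applying a delta means adding it to the base weights parameter by parameter. The toy sketch below illustrates only that idea with plain Python lists and made-up values. It is not FastChat's implementation, which operates on full model checkpoints, and the parameter names are hypothetical.

```python
# Toy illustration only: real checkpoints hold tensors, not Python lists.

def apply_delta(base, delta):
    """Return target weights where target[name] = base[name] + delta[name]."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Elementwise sum of the base weights and the released delta
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Hypothetical two-parameter "model" with made-up values
base_weights = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta_weights = {"layer.weight": [0.25, -0.5], "layer.bias": [0.25]}

vicuna_weights = apply_delta(base_weights, delta_weights)
print(vicuna_weights)
```

Because only the delta is distributed, the result cannot be reconstructed without access to the original LLaMA weights.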




### Vicuna-7B
This conversion command needs around 30 GB of CPU RAM.
If you do not have enough memory, you can create a large swap file so that the operating system can use the disk as virtual memory.
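A quick way to see whether a swap file is likely to be needed is to compare the machine's physical RAM with the figure quoted above. The sketch below is an assumption-laden helper, not part of FastChat: `total_ram_gb` is a hypothetical name, it relies on `os.sysconf`, which is available on Linux and most Unix-like systems but not on Windows, and it reports total RAM rather than currently free memory.

```python
import os


def total_ram_gb():
    """Return total physical RAM in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the names are unsupported
        return None
    return page_size * page_count / 1024**3


required_gb = 30  # approximate requirement quoted above for Vicuna-7B

ram_gb = total_ram_gb()
if ram_gb is None:
    print("Could not determine the amount of physical RAM on this platform")
elif ram_gb < required_gb:
    print(f"Only {ram_gb:.1f} GB of RAM found; consider adding swap space")
else:
    print(f"{ram_gb:.1f} GB of RAM found; this should be enough for the conversion")
```

For Vicuna-13B, the same check applies with `required_gb = 60`.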

In [None]:
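import subprocess
import sys

delta_path = "lmsys/vicuna-7b-delta-v1.1"
vicuna_model_dir = "vicuna_LLaMA"

# Build the apply_delta command as an argument list
# (Huggingface_model_dir is defined in the conversion cell of step 1)
cmd = [
    sys.executable, "-m", "fastchat.model.apply_delta",
    "--base", f"{Huggingface_model_dir}/7B",
    "--target", f"{vicuna_model_dir}/7B",
    "--delta", delta_path,
]

# Run the command; check=True raises CalledProcessError if it fails
subprocess.run(cmd, check=True)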

### Vicuna-13B
This conversion command needs around 60 GB of CPU RAM.
If you do not have enough memory, you can create a large swap file so that the operating system can use the disk as virtual memory.
The command expects the converted 13B LLaMA weights to exist, so rerun step 1 with `model_size = "13B"` first.


In [None]:
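import subprocess
import sys

delta_path = "lmsys/vicuna-13b-delta-v1.1"
vicuna_model_dir = "vicuna_LLaMA"

# Build the apply_delta command as an argument list
# (Huggingface_model_dir is defined in the conversion cell of step 1)
cmd = [
    sys.executable, "-m", "fastchat.model.apply_delta",
    "--base", f"{Huggingface_model_dir}/13B",
    "--target", f"{vicuna_model_dir}/13B",
    "--delta", delta_path,
]

# Run the command; check=True raises CalledProcessError if it fails
subprocess.run(cmd, check=True)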
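The 7B and 13B cells differ only in the model size, so the command can be built by a single helper. The sketch below is one possible way to do that. The function name `build_apply_delta_cmd` is hypothetical, the default directories mirror the ones used above, and the delta repository naming pattern is assumed from the two names shown in this notebook (7B and 13B), so it may not hold for other sizes or versions.

```python
import sys


def build_apply_delta_cmd(model_size, base_root="huggingface_LLaMA",
                          target_root="vicuna_LLaMA"):
    """Build the apply_delta argument list for a model size such as "7B"."""
    size = model_size.upper()
    # Naming pattern assumed from the 7B and 13B delta repositories above
    delta_path = f"lmsys/vicuna-{model_size.lower()}-delta-v1.1"
    return [
        sys.executable, "-m", "fastchat.model.apply_delta",
        "--base", f"{base_root}/{size}",
        "--target", f"{target_root}/{size}",
        "--delta", delta_path,
    ]


# Print the commands instead of running them, so they can be inspected first
for size in ("7B", "13B"):
    print(" ".join(build_apply_delta_cmd(size)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the conversion and raise an error if it fails.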