
# Vicuna Weights

| Weights version | v1.3 | v1.1 | v0 |
| --- | --- | --- | --- |
| Link | 7B, 13B, 33B | 7B, 13B | 7B, 13B |
| Separator | `</s>` | `</s>` | `###` |
| Is delta weights | No | Yes | Yes |
| FastChat PyPI package compatibility | >= v0.2.1 | >= v0.2.1 | <= v0.1.10 |
| FastChat source code compatibility | after tag v0.2.1 | after tag v0.2.1 | tag v0.1.10 |

## Updates

- Major updates of weights v1.3
  - Train with twice the amount of ShareGPT data compared to previous versions.
  - Provide merged weights directly instead of delta weights.
- Major updates of weights v1.1
  - Refactor the tokenization and separator. In Vicuna v1.1, the separator has been changed from `###` to the EOS token `</s>`. This change makes it easier to determine the generation stop criterion and improves compatibility with other libraries.
  - Fix the supervised fine-tuning loss computation for better model quality.

## Prompt Template

### Example prompt (weights v1.1 and v1.3)

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

USER: Hello!
ASSISTANT: Hello!</s>
USER: How are you?
ASSISTANT: I am good.</s>
```

See a full prompt template here.
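The v1.1/v1.3 format above can be assembled programmatically. The sketch below is only illustrative of the string layout (the `build_prompt_v11` helper and its argument shape are assumptions, not FastChat's API; FastChat ships its own conversation templates):

```python
# Illustrative sketch of the v1.1/v1.3 prompt layout; not FastChat's own API.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
SEP = "</s>"  # v1.1/v1.3 use the EOS token as the separator


def build_prompt_v11(turns):
    """turns: list of (user_msg, assistant_msg_or_None) pairs.

    A None assistant message ends the prompt with "ASSISTANT:" so the
    model generates the reply; completed turns end with the EOS separator.
    """
    parts = [SYSTEM, "\n\n"]
    for user, assistant in turns:
        parts.append(f"USER: {user} ")
        if assistant is None:
            parts.append("ASSISTANT:")  # model completes from here
        else:
            parts.append(f"ASSISTANT: {assistant}{SEP}")
    return "".join(parts)


print(build_prompt_v11([("Hello!", "Hello!"), ("How are you?", None)]))
```

Because the separator is the EOS token, generation naturally stops at the end of each assistant turn without any custom stop-string logic.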

### Example prompt (weights v0)

```
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.

### Human: Hello!
### Assistant: Hello!
### Human: How are you?
### Assistant: I am good.
```

See the full prompt template here.
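Since v0 uses the literal `###` string rather than the EOS token as its separator, callers typically have to cut the model's output at the next separator themselves. A minimal sketch of that truncation (the function name is an assumption, not FastChat code):

```python
# Illustrative: with the v0 format there is no EOS-based stop as in v1.1,
# so generated text is usually truncated at the next "###" separator.
def truncate_at_separator(text, sep="\n### "):
    idx = text.find(sep)
    return text if idx == -1 else text[:idx]


print(truncate_at_separator("I am good.\n### Human: next question"))
```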

## How to Apply Delta Weights (for weights v1.1 and v0)

We release Vicuna weights (v1.1 and v0) as delta weights to comply with the LLaMA model license. You can add our delta to the original LLaMA weights to obtain the Vicuna weights. Instructions:

  1. Get the original LLaMA weights in the Hugging Face format by following the instructions here.
  2. Use the following scripts to get Vicuna weights by applying our delta. They will automatically download delta weights from our Hugging Face account.

NOTE: Weights v1.1 are only compatible with transformers>=4.28.0 and fschat>=0.2.0. Please update your local packages accordingly. If you do a fresh install following the instructions above, you should get all the correct versions.
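A quick way to confirm your environment meets these requirements is to query the installed package metadata. The helpers below (`version_tuple`, `meets`) are illustrative assumptions, and the crude version parsing only handles simple `X.Y.Z` strings:

```python
# Hedged sketch: check installed versions against the v1.1 requirements
# (transformers>=4.28.0, fschat>=0.2.0).
from importlib import metadata


def version_tuple(v):
    # Crude parse, good enough for simple "X.Y.Z" version strings.
    return tuple(int(p) for p in v.split(".")[:3] if p.isdigit())


def meets(pkg, minimum):
    try:
        return version_tuple(metadata.version(pkg)) >= version_tuple(minimum)
    except metadata.PackageNotFoundError:
        return False  # not installed at all


for pkg, minimum in [("transformers", "4.28.0"), ("fschat", "0.2.0")]:
    print(pkg, "ok" if meets(pkg, minimum) else f"needs >= {minimum}")
```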

### Vicuna-7B

This conversion command needs around 30 GB of CPU RAM. See the "Low CPU Memory Conversion" section below if you do not have enough memory. Replace /path/to/* with the real paths.

```
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path /path/to/output/vicuna-7b \
    --delta-path lmsys/vicuna-7b-delta-v1.1
```

### Vicuna-13B

This conversion command needs around 60 GB of CPU RAM. See the "Low CPU Memory Conversion" section below if you do not have enough memory. Replace /path/to/* with the real paths.

```
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-13b \
    --target-model-path /path/to/output/vicuna-13b \
    --delta-path lmsys/vicuna-13b-delta-v1.1
```
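Conceptually, applying a delta just adds the delta tensors to the base weights, parameter by parameter. The toy sketch below uses plain Python lists to show the idea; the real `fastchat.model.apply_delta` operates on PyTorch state dicts and handles sharded checkpoint files:

```python
# Toy sketch of delta application: target = base + delta, elementwise,
# for each named parameter. The real script works on PyTorch state dicts.
def apply_delta(base, delta):
    assert base.keys() == delta.keys(), "base and delta must share parameter names"
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }


base = {"layer.weight": [0.1, 0.2], "layer.bias": [0.0, 0.0]}
delta = {"layer.weight": [0.05, -0.1], "layer.bias": [0.01, 0.02]}
print(apply_delta(base, delta))
```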

## Low CPU Memory Conversion

You can try these methods to reduce the CPU RAM requirement of weight conversion.

  1. Append --low-cpu-mem to the commands above, which will split large weight files into smaller ones and use the disk as temporary storage. This can keep the peak memory at less than 16GB.
  2. Create a large swap file and rely on the operating system to automatically utilize the disk as virtual memory.
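The idea behind option 1 can be sketched as follows: write each piece of the model to its own file, then process pieces one at a time so only a single piece is ever resident in memory. This toy version uses one JSON file per parameter and is an illustration only, not how `--low-cpu-mem` is actually implemented:

```python
# Illustrative sketch of the split-to-disk idea behind --low-cpu-mem:
# persist each parameter separately, then stream them back one at a time
# so peak memory stays bounded by the largest single piece.
import json
import os
import tempfile


def shard_to_disk(state, out_dir):
    for name, values in state.items():
        path = os.path.join(out_dir, name.replace(".", "_") + ".json")
        with open(path, "w") as f:
            json.dump(values, f)


def process_shards(out_dir, fn):
    result = {}
    for fname in sorted(os.listdir(out_dir)):
        with open(os.path.join(out_dir, fname)) as f:
            values = json.load(f)  # only one shard in memory at a time
        result[fname] = fn(values)
    return result


with tempfile.TemporaryDirectory() as d:
    shard_to_disk({"layer.weight": [1.0, 2.0]}, d)
    print(process_shards(d, lambda v: [x * 2 for x in v]))
```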

## FAQ

### Tokenizer issues

There are some frequently asked tokenizer questions (lm-sys#408). Some of them are related not only to FastChat or the Vicuna weights but also to how you converted the base LLaMA model.

We suggest using transformers>=4.28.0 and redoing the weight conversion for the base LLaMA model. After applying the delta, you should have a file named special_tokens_map.json in your converted weight folder, for either v0 or v1.1. Its contents should match this file: https://huggingface.co/lmsys/vicuna-13b-delta-v0/blob/main/special_tokens_map.json. If the file is missing, copy the special_tokens_map.json and tokenizer_config.json files from https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main into your converted weight folder. This works for both v0 and v1.1.