Fix ExLlama-v2 code snippet (#3281)
peterjunpark authored Jun 12, 2024
1 parent e864aa5 commit d24b3fa
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions docs/how-to/llm-fine-tuning-optimization/model-quantization.rst
@@ -154,13 +154,13 @@ kernels by configuring the ``exllama_config`` parameter as the following.
 .. code-block:: python
 
     from transformers import AutoModelForCausalLM, GPTQConfig
-    pretrained_model_dir = "meta-llama/Llama-2-7b"
-    gptq_config = GPTQConfig(bits=4, exllama_config={"version":2})
+    #pretrained_model_dir = "meta-llama/Llama-2-7b"
+    base_model_name = "NousResearch/Llama-2-7b-hf"
+    gptq_config = GPTQConfig(bits=4, dataset="c4", exllama_config={"version":2})
     quantized_model = AutoModelForCausalLM.from_pretrained(
-        base_model_name,
-        device_map="auto",
+        base_model_name,
+        device_map="auto",
         quantization_config=gptq_config)
 
 bitsandbytes
 ============

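For reference, below is a minimal sketch of the corrected snippet as it reads after this commit, assembled from the added lines in the hunk above. The tokenizer and generate() lines at the end are an illustrative assumption (not part of the diff), and the sketch assumes the optimum/auto-gptq stack and a supported GPU are available.

    # Corrected snippet after this commit, plus illustrative usage.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    #pretrained_model_dir = "meta-llama/Llama-2-7b"
    base_model_name = "NousResearch/Llama-2-7b-hf"

    # 4-bit GPTQ quantization calibrated on the "c4" dataset, using ExLlama-v2 kernels.
    gptq_config = GPTQConfig(bits=4, dataset="c4", exllama_config={"version": 2})

    quantized_model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        device_map="auto",
        quantization_config=gptq_config)

    # Illustrative usage of the quantized model (assumption; not part of the commit).
    tokenizer = AutoTokenizer.from_pretrained(base_model_name)
    inputs = tokenizer("ExLlama-v2 kernels accelerate", return_tensors="pt").to(quantized_model.device)
    output = quantized_model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(output[0], skip_special_tokens=True))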
