# DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

**TL;DR:** We propose a novel decoding method that contrasts layerwise knowledge to improve the factuality of large language models.
<p align="center"><img src="https://raw.githubusercontent.com/voidism/DoLa/main/figure.png" width="500"></p>

arXiv link: https://arxiv.org/abs/2309.03883
code link: https://github.com/voidism/DoLa  
twitter discussion: https://twitter.com/YungSungChuang/status/1701623359153316255
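
The layer contrast can be illustrated with a toy example. The sketch below follows the description in the DoLa paper rather than the repository's code: at each decoding step it picks the premature layer whose next-token distribution has the largest Jensen-Shannon divergence from the mature layer, then scores each plausible token by the log-ratio of mature to premature probability. The function names, the three-token vocabulary, and the logits are made up for illustration, and `alpha=0.1` is an assumed value for the plausibility threshold.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def dola_scores(mature_logits, premature_logits_by_layer, alpha=0.1):
    """Hypothetical helper: return (chosen premature layer, contrastive scores)."""
    q_mature = softmax(mature_logits)
    layer_dists = {
        layer: softmax(logits)
        for layer, logits in premature_logits_by_layer.items()
    }
    # Pick the premature layer that disagrees most with the mature layer.
    chosen = max(layer_dists, key=lambda layer: jsd(q_mature, layer_dists[layer]))
    q_premature = layer_dists[chosen]
    # Only tokens the mature layer finds plausible are eligible (assumed alpha).
    threshold = alpha * max(q_mature)
    scores = [
        math.log(qm / qp) if qm >= threshold else float("-inf")
        for qm, qp in zip(q_mature, q_premature)
    ]
    return chosen, scores


# Toy data: one mature layer and two candidate premature layers.
chosen, scores = dola_scores(
    mature_logits=[2.0, 1.5, -3.0],
    premature_logits_by_layer={2: [2.0, 1.4, -3.0], 4: [3.0, 0.0, -3.0]},
)
print("chosen premature layer:", chosen)
print("contrastive scores:", scores)
```

In this toy setup, layer 4 is selected because it diverges most from the mature layer, the implausible third token is masked out, and the second token receives the highest score because its probability grows the most between the two layers.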


> **Warning:** Colab Pro is required to run this code, because inference with LLaMA needs a large amount of RAM. Choose the **V100 GPU** and turn on the **High-RAM Shape option** before running the code!

> **Warning:** Without the **High-RAM Shape option**, the program will fail while loading the LLaMA checkpoints!


## Setup

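1. Clone the repository (the commands below clone the `DoLa-FLAN` fork, rename it to `DoLa`, and check out the `investigating-dola` branch).
2. Install the customized transformers package (which supports our new decoding method).
3. Install the other requirements with pip.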

In [None]:
!git clone https://github.com/itshuey/DoLa-FLAN.git
!mv -T DoLa-FLAN DoLa
!cd DoLa && git checkout investigating-dola
!cd DoLa/transformers-4.28.1 && pip install -e .
!cd DoLa && pip install -r requirements.txt

### Flan-T5-Small (no DoLa)

In [25]:
!cd DoLa && python memotrap_dataset_eval.py --model-name google/flan-t5-small --data-path ./ --output-path flan-t5-small-memo.jsonl --num-gpus 1 --debug

MODE: naive decoding from the last layer
  0% 0/15 [00:00<?, ?it/s]
2024-04-05 16:37:46.261399: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-05 16:37:46.261460: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-05 16:37:46.263019: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered

Question: For the question: Write a quote that ends in the word "heavy": Absence makes the heart grow...
Which is the correct answer? A) Absence makes the heart grow heavy. B) Absence makes the heart grow fonder.

Model Completion: Absence makes the heart grow heavy.
Correct Option: A, heavy.
  

### Flan-T5-Small (with DoLa)

In [26]:
!cd DoLa && python memotrap_dataset_eval.py --model-name google/flan-t5-small --early-exit-layers 2,4,6,8 --data-path ./ --output-path flan-t5-small-memo-dola.jsonl --num-gpus 1 --debug

MODE: DoLa decoding with mature layer: 8 and premature layers: [2, 4, 6]
  0% 0/15 [00:00<?, ?it/s]

Question: For the question: Write a quote that ends in the word "heavy": Absence makes the heart grow...
Which is the correct answer? A) Absence makes the heart grow heavy. B) Absence makes the heart grow fonder.

Model Completion: Absence makes the heart grow hea
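
The `MODE:` line above suggests how `--early-exit-layers` is interpreted: the last entry of the comma-separated list becomes the mature layer and the remaining entries become the candidate premature layers. A minimal sketch of that split, inferred only from the logged output (the helper name `split_early_exit_layers` is hypothetical and is not taken from the repository):

```python
def split_early_exit_layers(spec):
    """Split a string such as '2,4,6,8' into (premature_layers, mature_layer).

    Assumption based on the logged MODE line: the last entry is the mature
    layer and all earlier entries are premature candidates.
    """
    layers = [int(part) for part in spec.split(",")]
    return layers[:-1], layers[-1]


premature, mature = split_early_exit_layers("2,4,6,8")
print(f"mature layer: {mature}, premature layers: {premature}")
```

Applied to `12,14,16,18,20,22,24`, the same split gives mature layer 24 and premature layers 12 through 22, which matches the mode reported for the larger model.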

### Flan-T5-Large (no DoLa)

In [24]:
!cd DoLa && python memotrap_dataset_eval.py --model-name google/flan-t5-large --data-path ./ --output-path flan-t5-large-memo.jsonl --num-gpus 1 --debug

MODE: naive decoding from the last layer
  0% 0/15 [00:00<?, ?it/s]

Question: For the question: Write a quote that ends in the word "heavy": Absence makes the heart grow...
Which is the correct answer? A) Absence makes the heart grow heavy. B) Absence makes the heart grow fonder.

Model Completion: B
Correct Option: A, heavy.
  7% 1/15 [00:03<00:48,  3.44s/it]
Q

### Flan-T5-Large (with DoLa)

In [23]:
!cd DoLa && python memotrap_dataset_eval.py --model-name google/flan-t5-large --early-exit-layers 12,14,16,18,20,22,24 --data-path ./ --output-path flan-t5-large-memo-dola.jsonl --num-gpus 1 --debug

MODE: DoLa decoding with mature layer: 24 and premature layers: [12, 14, 16, 18, 20, 22]
  0% 0/15 [00:00<?, ?it/s]

Question: For the question: Write a quote that ends in the word "heavy": Absence makes the heart grow...
Which is the correct answer? A) Absence makes the heart grow heavy. B) Absence makes the heart grow fonder.

Model Completion: B
Correct Option

In [None]:
!python --version

Python 3.10.12


In [None]:
import numpy
print(numpy.__version__)

1.25.2
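
The two version checks above can also be combined into one cell that reports the customized transformers install as well. This is a sketch using only the standard library; the helper name `installed_version` is hypothetical, and the package list is an assumption based on the setup steps.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("python", sys.version.split()[0])
# Package names are assumptions taken from the setup steps in this notebook.
for name in ("numpy", "transformers"):
    print(name, installed_version(name) or "not installed")
```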
