# Prompt Injection

----------------

We use the tuned lens to reveal knowledge that models represent during its processing. <br>

<u>Setups:</u> <br>
<u>Prompt Injection:</u> <br>
    - Inject some verbage (e.g. "ignore the task and output X") to make the model ignore the task and output X.

We are interested in the following metrics: <br>
    1) top-1 accuracy: if the top logit is the correct label. <br>
    2) label space: sum of probabilities of labels. <br>
    3) correct over incorrect: if the correct label's probability is greater than all other label probabilities. <br>
    4) calibrated correct over incorrect: if the correct label probability is greater than the kth quantile (where k is 1/(number_of_labels - 1)) probability of that label on n inputs. <br>


----------------

## Table of Contents
- <a href='#0)-Prereqs-[required]'>0) Prereqs [required]</a>
- <a href='#1)-Walkthrough-[demo]'>1) Walkthrough [demo]</a>
  * <a href='#1.1)-Model-and-Tokenizer'>1.1) Model and Tokenizer</a>
  * <a href='#1.2)-Data-Parameters'>1.2) Data Parameters</a>
  * <a href='#1.3)-Dataset'>1.3) Dataset</a>
  * <a href='#1.4)-Prefixes'>1.4) Prefixes</a>
  * <a href='#1.5)-Tokenized-Inputs-and-Label-Indices'>1.5) Tokenized Inputs and Label Indices</a>
  * <a href='#1.6)-Intermediate-Logits-and-Probabilities'>1.6) Intermediate Logits and Probabilities</a>
  * <a href='#1.7)-Metrics:-Top-1-Accuracy,-Top-Number-of-Labels-Match,-Label-Space-Probability'>1.7) Metrics: Top-1 Accuracy, Top Number of Labels Match, Label Space Probability</a>
  * <a href='#1.8)-Metrics:-Correct-over-Incorrect'>1.8) Metrics: Correct over Incorrect</a>
  * <a href='#1.9)-Metrics:-Calibrated-Correct-over-Incorrect,-Calibrated-Permute-Score'>1.9) Metrics: Calibrated Correct over Incorrect, Calibrated Permute Score</a>
  * <a href='#1.10)-Plot-Metrics'>1.10) Plot Metrics</a>
- <a href='#2)-Datasets-[start-here]'>2) Datasets [start here]</a>
- <a href='#3)-Model-Inference-[skip]'>3) Model Inference [skip]</a>
- <a href='#4)-Load-Metrics'>4) Load Metrics</a>
- <a href='#5)-Plot-Metrics'>5) Plot Metrics</a>

## 0) Prereqs [required]

In [1]:
!nvidia-smi

Sun Dec  4 19:37:06 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100 80G...  On   | 00000000:4F:00.0 Off |                    0 |
| N/A   47C    P0    88W / 250W |  24939MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [3]:
import sys
import os
sys.path.append("../dev")
import json
import pickle
import dill

from utils import *
from model import *
from data import *
from metrics import *
from visualize import *
from run import *