# Maximizing Variance on Questionnaires via Prompt Optimization

In this notebook we will develop a system to:
1. Run an LLM on questionnaires, recording the probability distribution over answers.
2. Compare the answer probability distributions across multiple prompts (e.g., character descriptions) that condition how the LLM answers the questionnaire questions.
3. Visualize the "landscape" of prompts.


## Synthetic Dataset

We will start with a simple synthetic dataset of question-answer records of the form

```json
{"question": "How are you feeling? ", "answers": ["I feel happy.", "I feel sad.", ...]}
{"question": "Where would you like to be right now? ", "answers": ["At home.", "In bed.", "At a party.", ...]}
...
```

We will then generate a set of system prompts, e.g., 
```json
{"prompt": "You are an extremely happy, extroverted, confident person. ", "id": 0, "tag": "high valence"}
{"prompt": "You are an extremely depressed, introverted person. ", "id": 1, "tag": "low valence"}
...
```

Ensure that the prompts describe the agent in the second person ("You are..."),
the questions address the model in the second person ("How are you..."), and the
answers are short, simple statements in the first person ("I am...").

Include a question mark at the end of the question, and a period at the end of
each answer and each prompt. Ensure there is a trailing space at the end of each
prompt and question so that the concatenated `prompt + question + answer` is a
properly formatted string.

Both the question-answer dataset and the prompt dataset should be stored in JSONL format (one JSON object per line).
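
The formatting rules above are easy to violate when the datasets are generated by another model, so a quick check before saving can help. The following is a minimal sketch, not part of the original pipeline; the helper names `check_prompt`, `check_question`, and `to_jsonl` are hypothetical, and the second-person check is only a rough heuristic.

```python
import json


def check_prompt(record):
    """Return a list of formatting problems for one prompt record."""
    problems = []
    text = record["prompt"]
    # Rough heuristic (an assumption): second-person prompts start with "You ".
    if not text.startswith("You "):
        problems.append("prompt should be written in the second person")
    if not text.endswith(". "):
        problems.append("prompt should end with a period and a trailing space")
    return problems


def check_question(record):
    """Return a list of formatting problems for one question record."""
    problems = []
    if not record["question"].endswith("? "):
        problems.append("question should end with '? ' (note the trailing space)")
    for answer in record["answers"]:
        if not answer.endswith("."):
            problems.append(f"answer should end with a period: {answer!r}")
    return problems


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


example_prompts = [
    {"prompt": "You are an extremely happy, extroverted, confident person. ",
     "id": 0, "tag": "high valence"},
]
example_questions = [
    {"question": "How are you feeling? ",
     "answers": ["I feel happy.", "I feel sad."]},
]

for record in example_prompts:
    print(check_prompt(record))
for record in example_questions:
    print(check_question(record))
print(to_jsonl(example_prompts), end="")
```

Records that pass return an empty list of problems; the string produced by `to_jsonl` can be written directly to a `.jsonl` file.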


## Running LLM on Questionnaire

We will use the [minference](https://github.com/amanb2000/minference) code to 
run an efficient, batch-parallelized, multi-threaded inference server with 
Llama-3 8b. To get started, clone the repository and run the setup commands in 
the README. Then run this command to start the server: 

```bash
python3 languagegame/inference_server/main.py \
	--config configs/min_llama_3_8b_instruct.json \
	--port 4444
```

We can then send requests to the server to get the cross-entropy (CE) loss on
each answer to each question, given a prompt, using:

```python
import requests


def loss_call(prompt_question_string, answer_string, API_URL):
    # The server scores `corpus_string` conditioned on `context_string`.
    data = {
        "context_string": prompt_question_string,
        "corpus_string": answer_string
    }
    response = requests.post(API_URL, json=data)

    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error {response.status_code}: {response.text}")
        return None
```

Here the API URL would be `http://localhost:4444/ce_loss`.

For each prompt, we want to compute the CE loss corresponding to each `P(answer_i | prompt + question)`, i.e., for every answer to every question.
We will store the losses in a dictionary and eventually save it to disk.
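
One possible shape for that loop is sketched below. The function `collect_losses` and the nested layout `results[prompt_id][question_index]` are assumptions made for illustration. The loss function is passed in as a callable so the loop can be exercised without a running server; here a stub stands in for the real call.

```python
def collect_losses(prompts, questions, loss_fn):
    """Return {prompt_id: {question_index: [loss for each answer]}}.

    `loss_fn(context, answer)` is any callable returning a scalar loss.
    """
    results = {}
    for prompt in prompts:
        per_question = {}
        for k, question in enumerate(questions):
            # The trailing spaces in the data keep this concatenation well formed.
            context = prompt["prompt"] + question["question"]
            per_question[k] = [
                loss_fn(context, answer) for answer in question["answers"]
            ]
        results[prompt["id"]] = per_question
    return results


def stub_loss(context, answer):
    """Stand-in for the server call: NOT a real loss, just the answer length."""
    return float(len(answer))


toy_prompts = [
    {"prompt": "You are an extremely happy, extroverted, confident person. ", "id": 0},
    {"prompt": "You are an extremely depressed, introverted person. ", "id": 1},
]
toy_questions = [
    {"question": "How are you feeling? ", "answers": ["I feel happy.", "I feel sad."]},
]

toy_results = collect_losses(toy_prompts, toy_questions, stub_loss)
print(toy_results)
```

In the notebook, `loss_fn` could wrap `loss_call`, for example `lambda context, answer: loss_call(context, answer, API_URL)`. How to extract a scalar from the JSON response depends on the server's response schema, which is not specified here.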


## Analyzing Results

The result of the CE loss calls should be $-\log P(a_{ki} \mid p_j + q_k)$ for all
$i, j, k$, where $a_{ki}$ is the $i$th answer to the $k$th question, $p_j$ is
the $j$th prompt, and $q_k$ is the $k$th question.
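
To recover a probability distribution over the listed answers to a single question, the losses can be exponentiated and renormalized. The sketch below assumes each loss is the total negative log-likelihood of the answer in nats; if the server instead reports a per-token mean, the values would first need to be rescaled by the answer's token count. The function name `answer_distribution` is hypothetical.

```python
import math


def answer_distribution(losses):
    """Softmax over negative losses: p_i is proportional to exp(-loss_i)."""
    shift = min(losses)  # subtract the smallest loss for numerical stability
    weights = [math.exp(-(loss - shift)) for loss in losses]
    total = sum(weights)
    return [weight / total for weight in weights]


# Made-up losses for three answers to one question under one prompt.
distribution = answer_distribution([0.5, 2.0, 3.0])
print(distribution)
```

The result sums to one over the listed answers only, so it is a distribution conditioned on the model choosing one of them.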

We will concatenate these loss values into one vector per prompt $p_j$. We can
then apply PCA or t-SNE and look at the landscape of, and distances between, different prompts.
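
A minimal sketch of the vector construction and a two-dimensional PCA projection follows. It assumes a hypothetical nested dictionary `results[prompt_id][question_index]` holding the list of answer losses, and it computes PCA directly with NumPy's SVD rather than a dedicated library. The loss values are made up for illustration.

```python
import numpy as np


def loss_matrix(results, prompt_ids, num_questions):
    """Stack one loss vector per prompt in a fixed prompt/question/answer order."""
    rows = []
    for prompt_id in prompt_ids:
        row = []
        for k in range(num_questions):
            row.extend(results[prompt_id][k])
        rows.append(row)
    return np.array(rows, dtype=float)


def pca_project(X, n_components=2):
    """Project the rows of X onto their top principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Made-up losses: 3 prompts, 2 questions, 2 answers each.
toy_results = {
    0: {0: [0.5, 3.0], 1: [1.0, 2.5]},
    1: {0: [3.0, 0.6], 1: [2.8, 0.9]},
    2: {0: [0.7, 2.9], 1: [1.2, 2.4]},
}
X = loss_matrix(toy_results, prompt_ids=[0, 1, 2], num_questions=2)
coords = pca_project(X)
print(X.shape, coords.shape)
```

Iterating over an explicit `prompt_ids` list and `range(num_questions)` keeps the concatenation order identical for every prompt, which is what makes the vectors comparable.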

We can also try to train a classifier that discriminates between prompts tagged
`high valence` and `low valence`.
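
As a rough illustration of such a classifier, the sketch below fits a logistic regression with plain NumPy gradient descent and evaluates it on a held-out split. This is a stand-in for whatever library model is eventually used, and the feature vectors are synthetic placeholders rather than real loss values.

```python
import numpy as np


def train_logistic_regression(X, y, lr=0.1, steps=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(label = 1)
        grad = p - y  # derivative of the log loss with respect to the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict(X, w, b):
    """Return 1 where the logit is positive, else 0."""
    return (X @ w + b > 0).astype(int)


# Synthetic stand-in data: label 1 = 'high valence', label 0 = 'low valence'.
rng = np.random.default_rng(0)
high = rng.normal(loc=[1.0, 3.0], scale=0.1, size=(10, 2))
low = rng.normal(loc=[3.0, 1.0], scale=0.1, size=(10, 2))
features = np.vstack([high, low])
labels = np.array([1] * 10 + [0] * 10)

# Shuffle, then hold out the last 6 examples as a test set.
order = rng.permutation(len(labels))
train_idx, test_idx = order[:14], order[14:]
w, b = train_logistic_regression(features[train_idx], labels[train_idx])
accuracy = (predict(features[test_idx], w, b) == labels[test_idx]).mean()
print(f"Held-out accuracy: {accuracy:.2f}")
```

With only 30 prompts, the test set will be small, so the held-out accuracy should be read as a coarse signal rather than a precise estimate.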


## Todo
 - [ ] (Claude-3 Opus) Generate a questionnaire of 60 questions, focusing on 
 valence as the primary target for the questions. Store this in `valence_questionnaire_20240523_60.jsonl`.
 - [ ] (Claude-3 Opus) Generate a dataset of 30 prompts, focusing on varying the 
 valence of each (tags 'high valence', 'low valence'). Store this in `valence_prompts_20240523_30.jsonl`.
 - [ ] Write an inference loop that uses the `loss_call` function to get the CE loss of every 
 answer for every question and prompt. Store the results in a dictionary. 
 - [ ] Extract a vector representation for each prompt consisting of the concatenated CE losses on 
 each answer for each question. Ensure the concatenation occurs in the same order every time so the vectors are comparable. 
 - [ ] Make a PCA Plotly plot where each data point shows its prompt on hover. Color the prompts based on 
 their valence label. 
 - [ ] Make a t-SNE Plotly plot of the same format. 
 - [ ] Train a simple logistic regression model to predict valence label of each prompt given the vector of answer losses. 
 Ensure you have a separate training/test set. 

In [None]:
import requests
import jsonlines

# Load the datasets
with jsonlines.open('../datasets/valence_questionnaire_20240523_60.jsonl') as f:
    questions = list(f)

with jsonlines.open('../datasets/valence_prompts_20240523_30.jsonl') as f:
    prompts = list(f)


# CE loss endpoint, used below to test the first prompt-question-answer triple
API_URL = "http://localhost:4444/ce_loss"


In [None]:
# Define the CE loss function
def loss_call(prompt_question_string, answer_string, API_URL):
    data = {
        "context_string": prompt_question_string,
        "corpus_string": answer_string
    }
    response = requests.post(API_URL, json=data)

    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error {response.status_code}: {response.text}")
        return None


# Build the first prompt-question-answer triple and score it
prompt = prompts[0]['prompt']
question = questions[0]['question']
answer = questions[0]['answers'][0]

prompt_question_string = prompt + question
print(f"Prompt + Question: {prompt_question_string}")
print(f"Answer: {answer}")

loss = loss_call(prompt_question_string, answer, API_URL)
print(f"CE Loss: {loss}")