# Reproduce our project

This notebook contains all the steps needed to reproduce the results presented in our project: dependency installation, API key setup, model evaluations, and analysis and plotting of the results.

## 1. Install Dependencies

First, we install the project package itself along with its dependencies defined in `setup.py`. The `-e` flag installs it in "editable" mode, meaning changes to the source code are reflected immediately without needing reinstallation.

In [None]:
! pip install -e ..

## 2. Import Libraries

Import the necessary Python libraries for running the evaluations and displaying results within the notebook.

In [None]:
import os
from inspect_ai import eval
from darkthoughtbench.task import darkthoughtbench
from IPython.display import Image, display

## 3. Set Initial API Keys

Set the environment variables for the API keys required for the initial evaluations and the subsequent analysis.

* **`GOOGLE_API_KEY`**: Your Google AI Studio key (for Gemini). Needed now if `darkthoughtbench` uses Gemini as an overseer/scorer, and also later for the consistency analysis.
* **`ANTHROPIC_API_KEY`**: Your Anthropic key (for Claude). Needed now if `darkthoughtbench` uses Claude as an overseer/scorer, and also later for the consistency analysis.
* **`OPENAI_API_KEY`**: **For now, set this to your *DeepSeek* API key.** The `inspect_ai.eval` calls below target the DeepSeek API endpoint (`model_base_url='https://api.deepseek.com'`) through the OpenAI client interface, which takes its key from this variable. We replace it with a real OpenAI key in Section 7, before running the consistency analysis script that calls the actual OpenAI API.

**Important:** Replace `'your-...'` placeholders with your actual API keys.

In [None]:
# IMPORTANT: Replace placeholders with your actual keys!
# Use your DeepSeek key for OPENAI_API_KEY here initially.
os.environ['GOOGLE_API_KEY'] = 'your-google-api-key'
os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-api-key'
os.environ['OPENAI_API_KEY'] = 'your-deepseek-api-key'
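
Because the evaluations below fail late and confusingly if a key is missing, it can help to check the three variables before continuing. The following is a minimal sketch, not part of the project: `find_unset_keys` is a hypothetical helper, and the `'your-'` prefix test assumes the placeholder strings used in the cell above.

```python
import os

# Hypothetical helper, not part of darkthoughtbench.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY")

def find_unset_keys(env, names=REQUIRED_KEYS):
    """Return the names that are missing, empty, or still a 'your-...' placeholder."""
    bad = []
    for name in names:
        value = env.get(name, "")
        # Assumption: placeholders follow the 'your-...' pattern used in this notebook.
        if not value or value.startswith("your-"):
            bad.append(name)
    return bad

unset = find_unset_keys(os.environ)
if unset:
    print("Still unset or placeholder:", ", ".join(unset))
```

The check only looks at whether a value is present; it does not verify that a key is valid for its provider.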

## 4. Run Evaluation: DeepSeek-R1 (with CoT)

Run the `inspect_ai` evaluation using the `darkthoughtbench` task on the DeepSeek-R1 model (`deepseek-reasoner`). We specify `model_is_reasoning=True` and use the DeepSeek API endpoint. The log file is written to the `../logs` directory and then renamed to `DeepSeek-R1_with_CoT.eval`; if a file with that name already exists, the rename is skipped and the new log keeps its generated name.

In [None]:
log_r1 = eval(
    tasks=darkthoughtbench(model_is_reasoning=True),
    model='openai/deepseek-reasoner',
    model_base_url='https://api.deepseek.com',
    log_dir='../logs',
)

old_log_r1_path = log_r1[0].location
log_r1_directory = os.path.dirname(old_log_r1_path)
new_log_r1_filename = 'DeepSeek-R1_with_CoT.eval'
new_log_r1_path = os.path.join(log_r1_directory, new_log_r1_filename)
if not os.path.exists(new_log_r1_path):
    os.rename(old_log_r1_path, new_log_r1_path)
else:
    print(f"File '{new_log_r1_filename}' already exists. Skipping rename.")

## 5. Run Evaluation: DeepSeek-V3

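Run the `inspect_ai` evaluation using the `darkthoughtbench` task on the DeepSeek-V3 model (`deepseek-chat`). We specify `model_is_reasoning=False` and use the same DeepSeek API endpoint. The log file is written to the `../logs` directory and then renamed to `DeepSeek-V3.eval`; if a file with that name already exists, the rename is skipped.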

In [None]:
log_v3 = eval(
    tasks=darkthoughtbench(model_is_reasoning=False),
    model='openai/deepseek-chat',
    model_base_url='https://api.deepseek.com',
    log_dir='../logs',
)

old_log_v3_path = log_v3[0].location
log_v3_directory = os.path.dirname(old_log_v3_path)
new_log_v3_filename = 'DeepSeek-V3.eval'
new_log_v3_path = os.path.join(log_v3_directory, new_log_v3_filename)
if not os.path.exists(new_log_v3_path):
    os.rename(old_log_v3_path, new_log_v3_path)
else:
    print(f"File '{new_log_v3_filename}' already exists. Skipping rename.")
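
The rename logic in Sections 4 and 5 is identical apart from the target filename, so it could be factored into one function. This is a sketch only: `rename_log` is a hypothetical name, and it assumes, as the cells above do, that the log's `location` is a plain local file path.

```python
import os

def rename_log(old_path, new_filename):
    """Rename a log file within its directory unless the target already exists.

    Returns the path where the log ends up (the new path on success,
    the old path if the rename was skipped).
    """
    new_path = os.path.join(os.path.dirname(old_path), new_filename)
    if os.path.exists(new_path):
        print(f"File '{new_filename}' already exists. Skipping rename.")
        return old_path
    os.rename(old_path, new_path)
    return new_path

# Usage with the variables from the cells above:
# rename_log(log_r1[0].location, 'DeepSeek-R1_with_CoT.eval')
# rename_log(log_v3[0].location, 'DeepSeek-V3.eval')
```

Returning the final path makes it explicit which file holds the latest run when the rename is skipped.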

## 6. View Evaluation Logs (Optional)

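Use the `inspect view` command to launch a web-based UI where you can explore the evaluation logs generated in the previous steps. The command is run from the parent directory so that the viewer picks up the `logs` directory there. The cell keeps running while the viewer is being served; interrupt it to continue with the notebook.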

In [None]:
! (cd .. && inspect view)

## 7. Update OpenAI API Key for Overseer Analysis

Now we update the `OPENAI_API_KEY` environment variable to your **actual OpenAI API key**. This is necessary because the `CoT_response_consistency.py` script calls the OpenAI API directly (specifically GPT-4o) as one of the overseer models. The Google and Anthropic keys are already set correctly from Section 3.

**Important:** Replace `'your-openai-api-key'` with your actual OpenAI key.

In [None]:
# IMPORTANT: Replace placeholder with your actual OpenAI key!
os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
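
Once the variable is overwritten, re-running Section 4 or 5 would send the OpenAI key to the DeepSeek endpoint. One way to guard against that is to set the DeepSeek key only for the duration of an evaluation. The context manager below is a sketch using the standard library; `temporary_env` is a hypothetical helper, and it assumes the key is read from the environment at the time `eval` is called.

```python
import os
from contextlib import contextmanager

@contextmanager
def temporary_env(name, value):
    """Set an environment variable inside the `with` block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous

# Usage sketch (placeholder key, as elsewhere in this notebook):
# with temporary_env('OPENAI_API_KEY', 'your-deepseek-api-key'):
#     log_r1 = eval(...)
```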

## 8. Run CoT-Response Consistency Analysis

Execute the Python script `CoT_response_consistency.py`. This script reads the `DeepSeek-R1_with_CoT.eval` log file generated earlier. It then uses the overseer models (GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet) via their respective APIs to analyze each conversation for consistency between the Chain-of-Thought (CoT) and the final response regarding dark patterns. The script generates JSON logs for each overseer's analysis and confusion matrix plots, saving them to the `../logs` and `../plots` directories, respectively.

In [None]:
! python ../src/CoT_response_consistency.py
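
Since the script depends on the renamed R1 log, a quick existence check before launching it can save a wasted run. This is a minimal sketch: `missing_inputs` is a hypothetical helper, and the path `../logs/DeepSeek-R1_with_CoT.eval` is assumed from Section 4 rather than taken from the script itself.

```python
import os

def missing_inputs(log_dir, filenames):
    """Return the filenames that do not exist as regular files in log_dir."""
    return [f for f in filenames if not os.path.isfile(os.path.join(log_dir, f))]

# Assumption: the script looks for this file in ../logs.
missing = missing_inputs('../logs', ['DeepSeek-R1_with_CoT.eval'])
if missing:
    print("Missing log file(s):", ", ".join(missing), "- re-run Section 4 first.")
```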

## 9. Display Generated Plots

Load and display the confusion matrix plots (`.png` files) generated by the `CoT_response_consistency.py` script from the `../plots` directory.

In [None]:
folder = '../plots'
for file in sorted(os.listdir(folder)):  # sorted so plots appear in a stable order
    if file.endswith('.png'):
        print(os.path.splitext(file)[0])
        display(Image(filename=os.path.join(folder, file)))