chatbot-evals

A Chrome extension for running structured, multi-turn evaluations against multiple AI chatbots in batch. Designed for research use — prepare a prompt script, run it across targets, export the full transcript.

Installation

1. Clone or download this repository.
2. Open Chrome and navigate to chrome://extensions.
3. Enable Developer mode (top-right toggle).
4. Click Load unpacked and select this folder.
5. The extension icon appears in the toolbar. Click it to open the sidepanel.

Usage

1. Prepare an eval script: a JSON file following the format below.
2. Load the script: click "📂 Load JSON" in the sidepanel and select your file.
3. Select targets: check the chatbots you want to evaluate.
4. Click "▶ Run All Targets". The extension opens a tab for each target, injects your prompts in order, waits for responses, and closes the tab.
5. Save the transcript: click "💾 Save transcript" to download the full results as JSON.

Note: Targets open as active (foreground) tabs during the run — chatbot sites do not stream responses in background tabs. Do not close them manually; they are closed automatically when each target finishes.
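
The run described above (open a tab, send each prompt in order, wait for the response, close the tab) can be sketched as a plain loop. This is a minimal illustration, not the code in orchestrator.js: the `driver` object and its `openTab`, `sendPrompt`, and `closeTab` methods are a hypothetical interface, and recording a failure in `error` and then moving on to the next target is an assumption. The result follows the shape shown under Transcript Format.

```javascript
// Sketch of the run loop. `driver` is a hypothetical interface, not the
// extension's actual API: openTab(url), sendPrompt(tab, prompt), closeTab(tab).
async function runScript(script, targets, driver) {
  const transcript = {
    script_id: script.id,
    script_name: script.name,
    run_timestamp: new Date().toISOString(),
    targets: [],
  };
  for (const target of targets) {
    const result = { id: target.id, name: target.name, model_version: null, error: null, turns: [] };
    let tab = null;
    try {
      tab = await driver.openTab(target.url);
      // Every turn goes into the same tab, so the chat keeps its context.
      for (const turn of script.turns) {
        const response = await driver.sendPrompt(tab, turn.prompt);
        result.turns.push({ timestamp: new Date().toISOString(), prompt: turn.prompt, response });
      }
    } catch (err) {
      // Assumption: a failure is recorded on the target and the run continues.
      result.error = err instanceof Error ? err.message : String(err);
    } finally {
      if (tab !== null) await driver.closeTab(tab);
    }
    transcript.targets.push(result);
  }
  return transcript;
}

// In-memory stand-in for a browser, so the loop can be exercised in isolation.
function createFakeDriver() {
  const log = [];
  return {
    log,
    openTab: async (url) => { log.push(`open ${url}`); return { url }; },
    sendPrompt: async (tab, prompt) => { log.push(`send ${prompt}`); return `echo: ${prompt}`; },
    closeTab: async (tab) => { log.push(`close ${tab.url}`); },
  };
}
```

In the extension itself, a driver like this would presumably wrap Chrome's tab and scripting APIs. The fake driver only makes the sequencing easy to see and to test.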

Eval Script Format

{
  "id": "my-eval-id",
  "name": "My Eval Name",
  "turns": [
    { "prompt": "First prompt text" },
    { "prompt": "Second prompt text" }
  ]
}

Field	Required	Description
`id`	Yes	Short identifier used in the exported filename
`name`	Yes	Human-readable name shown in the sidepanel
`turns`	Yes	Array of turns, each with a `prompt` string

Multi-turn scripts maintain conversation context — each prompt is sent into the same ongoing chat session for that target.
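
Before loading a script, it can help to check it against the fields in the table above. The following is a minimal sketch of such a check. `validateEvalScript` is a hypothetical helper, not part of the extension, and treating empty strings or an empty `turns` array as errors is an assumption.

```javascript
// Hypothetical helper, not part of the extension: checks an eval script
// against the documented fields and returns a list of problems.
function validateEvalScript(script) {
  if (script === null || typeof script !== "object" || Array.isArray(script)) {
    return ["script must be a JSON object"];
  }
  const errors = [];
  for (const field of ["id", "name"]) {
    // Assumption: an empty or whitespace-only string counts as missing.
    if (typeof script[field] !== "string" || script[field].trim() === "") {
      errors.push(`"${field}" must be a non-empty string`);
    }
  }
  // Assumption: a script with zero turns is treated as invalid.
  if (!Array.isArray(script.turns) || script.turns.length === 0) {
    errors.push('"turns" must be a non-empty array');
  } else {
    script.turns.forEach((turn, i) => {
      if (turn === null || typeof turn !== "object" || typeof turn.prompt !== "string") {
        errors.push(`turns[${i}] must be an object with a "prompt" string`);
      }
    });
  }
  return errors;
}

const exampleScript = {
  id: "my-eval-id",
  name: "My Eval Name",
  turns: [{ prompt: "First prompt text" }, { prompt: "Second prompt text" }],
};
console.log(validateEvalScript(exampleScript)); // an empty array means no problems were found
```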

Transcript Format

Exported as JSON:

{
  "script_id": "string",
  "script_name": "string",
  "run_timestamp": "ISO8601",
  "targets": [
    {
      "id": "string",
      "name": "string",
      "model_version": null,
      "error": "string or null",
      "turns": [
        {
          "timestamp": "ISO8601",
          "prompt": "string",
          "response": "string"
        }
      ]
    }
  ]
}

model_version is reserved for future use (automatic detection from the UI) and is always null in the current version.
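
For analysis, the nested transcript is often easier to handle as one row per turn. The sketch below shows one way to do that. `flattenTranscript` and `failedTargets` are hypothetical helpers, the sample values are illustrative only, and a failed target having an empty `turns` array is an assumption.

```javascript
// Hypothetical helper: flattens an exported transcript into one row per turn.
function flattenTranscript(transcript) {
  const rows = [];
  for (const target of transcript.targets) {
    target.turns.forEach((turn, index) => {
      rows.push({
        script_id: transcript.script_id,
        target_id: target.id,
        turn: index + 1, // 1-based position within the conversation
        timestamp: turn.timestamp,
        prompt: turn.prompt,
        response: turn.response,
      });
    });
  }
  return rows;
}

// Hypothetical helper: ids of targets whose "error" field is not null.
function failedTargets(transcript) {
  return transcript.targets.filter((t) => t.error !== null).map((t) => t.id);
}

// Illustrative data only; the ids, timestamps, and replies are made up.
const sampleTranscript = {
  script_id: "my-eval-id",
  script_name: "My Eval Name",
  run_timestamp: "2024-01-01T12:00:00.000Z",
  targets: [
    {
      id: "target-a", name: "Target A", model_version: null, error: null,
      turns: [
        { timestamp: "2024-01-01T12:00:05.000Z", prompt: "First prompt text", response: "First reply" },
        { timestamp: "2024-01-01T12:00:20.000Z", prompt: "Second prompt text", response: "Second reply" },
      ],
    },
    {
      id: "target-b", name: "Target B", model_version: null, error: "illustrative error message",
      turns: [],
    },
  ],
};

const transcriptRows = flattenTranscript(sampleTranscript);
console.log(transcriptRows.length);           // 2
console.log(failedTargets(sampleTranscript)); // [ 'target-b' ]
```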

Targets

All three built-in targets can be used without logging in.

Name	URL
ChatGPT	chatgpt.com
Gemini	gemini.google.com/app
Mistral Le Chat	chat.mistral.ai

Repository Structure

chatbot-evals/
├── manifest.json       Chrome extension manifest (MV3)
├── background.js       Service worker — opens the sidepanel on toolbar click
├── sidepanel.html      Extension sidepanel UI
├── sidepanel.js        Sidepanel logic: script loading, run orchestration,
│                       markdown rendering, transcript export
├── targets.js          Built-in target definitions (URLs, CSS selectors,
│                       response post-processing)
├── orchestrator.js     Multi-tab run orchestrator: opens tabs, injects
│                       content script, sequences turns, closes tabs
└── core/
    └── content.js      Injectable content script: prompt injection, send-button
                        detection, response capture (MutationObserver + debounce),
                        ToU/consent dialog dismissal
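
The content.js entry above mentions response capture with MutationObserver plus debounce. The general idea is to treat a streamed response as finished once the DOM has stopped changing for a short quiet period. The following is a minimal sketch of that technique, not the code in content.js: `createSettleDetector` and `waitForResponseText` are hypothetical names, and the 1500 ms default is an arbitrary assumption.

```javascript
// Hypothetical debounce helper: calls onSettled once no change has been
// reported for quietMs milliseconds. Timer functions are injectable so the
// logic can be tested without real delays.
function createSettleDetector(onSettled, quietMs, timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
}) {
  let handle = null;
  return function notifyChange() {
    // Each new change cancels the pending timer and starts a fresh one.
    if (handle !== null) timers.clear(handle);
    handle = timers.set(() => {
      handle = null;
      onSettled();
    }, quietMs);
  };
}

// Browser-only wiring (needs a DOM): resolves with the node's text once its
// subtree has been quiet for quietMs. The default of 1500 is an assumption.
function waitForResponseText(node, quietMs = 1500) {
  return new Promise((resolve) => {
    let observer = null;
    const notifyChange = createSettleDetector(() => {
      observer.disconnect();
      resolve(node.textContent.trim());
    }, quietMs);
    observer = new MutationObserver(notifyChange);
    observer.observe(node, { childList: true, subtree: true, characterData: true });
    notifyChange(); // start the timer in case the response is already complete
  });
}
```

A longer quiet period is more tolerant of pauses in streaming but adds the same delay to every turn, so the value is a trade-off to tune per target.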

Adding Targets

Edit targets.js. Each entry needs:

{
  id:               "unique-id",
  name:             "Display Name",
  url:              "https://...",
  inputSelector:    "CSS selector for the text input",
  sendSelector:     "CSS selector for the send button",
  responseSelector: "CSS selector for assistant message text",
  // Optional: strip unwanted text (e.g. timestamps) from captured responses
  responseClean:    (text) => text.replace(/pattern/, "").trim(),
}

Selectors may need updating as chatbot UIs change. The sendSelector button is expected to be hidden until the input contains text — that is normal.
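
As an illustration of the entry shape above, here is a filled-in example together with a small sanity check. Everything in it is hypothetical: the URL and selectors are placeholders for a site that does not exist, `checkTarget` is not part of the extension, and the timestamp pattern in `responseClean` is just one example of text a target might need to strip.

```javascript
// Hypothetical example entry; the URL and selectors are placeholders.
const exampleTarget = {
  id: "example-bot",
  name: "Example Bot",
  url: "https://chat.example.com",
  inputSelector: "textarea#prompt",
  sendSelector: "button[type='submit']",
  responseSelector: ".assistant-message",
  // Strip a trailing clock time such as "10:42 AM" from the captured text.
  responseClean: (text) => text.replace(/\s*\d{1,2}:\d{2}\s?(AM|PM)\s*$/i, "").trim(),
};

const REQUIRED_TARGET_FIELDS = [
  "id", "name", "url", "inputSelector", "sendSelector", "responseSelector",
];

// Hypothetical helper: returns a list of problems found in a target entry.
function checkTarget(target) {
  const problems = [];
  for (const field of REQUIRED_TARGET_FIELDS) {
    if (typeof target[field] !== "string" || target[field] === "") {
      problems.push(`missing or empty "${field}"`);
    }
  }
  // responseClean is optional, but must be callable when present.
  if (target.responseClean !== undefined && typeof target.responseClean !== "function") {
    problems.push('"responseClean" must be a function when present');
  }
  return problems;
}

console.log(checkTarget(exampleTarget));                          // []
console.log(exampleTarget.responseClean("Hello there 10:42 AM")); // Hello there
```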

License

MIT
