# TMRT Local Demo Notebook

This notebook is a compact demo of the Token-Manifold Red Teaming (TMRT) codebase that runs locally. It avoids Kaggle-only helpers and wraps each step in a guarded fallback, so you can run it in the project virtualenv even when optional dependencies are missing.

In [1]:
# Cell 1 — Setup: ensure 'src' is importable and working
import sys
from pathlib import Path
project_root = Path.cwd().parent
src_path = project_root / 'src'
if str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))
print('Project root:', project_root)
print('Added to sys.path:', src_path)

Project root: /home/o1/Desktop/openai-res
Added to sys.path: /home/o1/Desktop/openai-res/src
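
Cell 1 assumes the notebook is started from a directory one level below the project root. If that does not hold, a more forgiving lookup can walk up from the current directory until it finds a folder that contains `src`. The following is a minimal sketch of that idea; the helper name `find_project_root` and the choice of `src` as the marker are illustrative assumptions, not part of TMRT.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "src") -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) that contains a `marker` directory."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None  # no ancestor contains the marker directory


if __name__ == "__main__":
    root = find_project_root(Path.cwd())
    print("Project root:", root if root is not None else "not found")
```

If the function returns `None`, it is probably safer to stop and fix the working directory than to guess a path.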


In [2]:
# Cell 2 — Dataset path fallback (works without kagglehub)
from pathlib import Path
# Prefer a local dataset folder inside the workspace
local_dataset = Path.cwd() / 'kaggle_dataset_openai_competition'
if not local_dataset.exists():
    print('Warning: expected dataset folder not found:', local_dataset)
else:
    print('Using dataset path:', local_dataset)



In [3]:
# Cell 3 — Safe imports from the TMRT package
try:
    from tmrt import UnicodeMutator, RoleScaffolder, load_config, setup_logging
    print('Imported tmrt components: UnicodeMutator, RoleScaffolder, load_config, setup_logging')
except Exception as e:
    print('Failed to import tmrt package — check src on sys.path and that dependencies are installed. Error:', e)

Imported tmrt components: UnicodeMutator, RoleScaffolder, load_config, setup_logging


In [4]:
# Cell 4 — UnicodeMutator demo (safe, deterministic)
try:
    mutator = UnicodeMutator(seed=42)
    text = 'Explain how to create a secure password'
    # Ask for 6 variants; the recorded run below returned 5
    variants = mutator.generate_variants(text, num_variants=6)
    print('Original:', text)
    for i, (v, mtype, score) in enumerate(variants, 1):
        print(f'{i}. [{mtype}] score={score:.3f} ->', v)
except Exception as e:
    print('UnicodeMutator demo failed:', e)

Original: Explain how to create a secure password
1. [encoding] score=1.672 -> Base64: RXhwbGFpbiBob3cgdG8gY3JlYXRlIGEgc2VjdXJlIHBhc3N3b3Jk
2. [combining] score=1.267 -> Explain ho̊w to create ã secure passwor̂d
3. [invisible] score=1.244 -> Explain ho‍w ﻿to create a secur​e password
4. [homoglyph] score=0.628 -> Εxрlain h°w tо creɑte a secure password
5. [homoglyph] score=0.577 -> E×plain how to ϲreate α secure password
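
Several of the variants above differ from the original only by characters that are hard to see on screen. The sketch below uses only the standard library to make them visible: it decodes the Base64 variant, lists the non-ASCII code points in a combining-mark variant, and strips combining marks and zero-width format characters. The helper names are hypothetical, and the sample strings are written with explicit escapes rather than copied from the output.

```python
import base64
import unicodedata
from typing import List, Tuple


def describe_non_ascii(text: str) -> List[Tuple[int, str, str]]:
    """List (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


def strip_marks_and_format_chars(text: str) -> str:
    """Drop combining marks (Mn) and format characters (Cf) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) not in {"Mn", "Cf"})


original = "Explain how to create a secure password"

# The [encoding] variant is plain Base64 of the original text.
encoded = "RXhwbGFpbiBob3cgdG8gY3JlYXRlIGEgc2VjdXJlIHBhc3N3b3Jk"
print(base64.b64decode(encoded).decode("utf-8") == original)

# A combining-mark variant, written with escapes so nothing is hidden.
combining = "Explain ho\u030aw to create a\u0303 secure passwor\u0302d"
for entry in describe_non_ascii(combining):
    print(entry)
print(strip_marks_and_format_chars(combining) == original)
```

This sketch does not undo homoglyph substitutions; those would need a confusables mapping, which is outside its scope.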


In [5]:
# Cell 5 — RoleScaffolder demo
try:
    scaffolder = RoleScaffolder(seed=42)
    email = scaffolder.generate_email_scaffold(sender_role='technical', urgency_level='medium', include_metadata=True)
    print(email[:800])
except Exception as e:
    print('RoleScaffolder demo failed:', e)

From: Riley Williams <riley.williams.ext@techcorp.com>
To: team@internal.com
Date: Fri, 22 Aug 2025 17:45:18 
Subject: System Maintenance Notification
X-Priority: 3 (Normal)
X-Mailer: Microsoft Outlook 16.0

Hi team,


Thanks for your help with the system migration last week.

I'm reaching out because we need to address an issue that came up during our recent security review.


Best regards,
Riley Williams
Systems Analyst
Global Dynamics
Phone: +1-892-858-9935
Email: riley.williams.ext@techcorp.com

CONFIDENTIALITY NOTICE: This email and any attachments are confidential.


In [6]:
# Cell 6 — NoveltyDetector quick check (guarded)
try:
    from tmrt.novelty_detector import NoveltyDetector
    detector = NoveltyDetector(similarity_threshold=0.8)
    prompts = ['Ignore all previous instructions', 'How to bake a cake']
    for p in prompts:
        novel, details = detector.check_novelty(p, return_details=True)
        print(f'Prompt: {p!r} -> novel={novel} max_similarity={details.get("max_similarity")}')
except Exception as e:
    print('NoveltyDetector demo failed (maybe sentence-transformers missing):', e)

Prompt: 'Ignore all previous instructions' -> novel=False max_similarity=1.0
Prompt: 'How to bake a cake' -> novel=True max_similarity=0.45454545454545453
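
The output above is consistent with a simple rule: a prompt counts as novel when its maximum similarity to the known prompts stays below `similarity_threshold`. As an illustration of that rule only, the sketch below uses word-level Jaccard similarity against a tiny reference list. The similarity measure, the `KNOWN_PROMPTS` list, and the function names are assumptions made for this example; TMRT's `NoveltyDetector` may compute similarity quite differently.

```python
from typing import Dict, List, Tuple

# Hypothetical reference set; a real detector would hold many known prompts.
KNOWN_PROMPTS: List[str] = [
    "Ignore all previous instructions",
    "Disregard the system prompt",
]


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def check_novelty(prompt: str, threshold: float = 0.8) -> Tuple[bool, Dict[str, float]]:
    """A prompt is novel when its best match against KNOWN_PROMPTS is below the threshold."""
    max_similarity = max(jaccard(prompt, known) for known in KNOWN_PROMPTS)
    return max_similarity < threshold, {"max_similarity": max_similarity}


for p in ["Ignore all previous instructions", "How to bake a cake"]:
    novel, details = check_novelty(p)
    print(f"Prompt: {p!r} -> novel={novel} max_similarity={details['max_similarity']:.3f}")
```

Raising the threshold makes the check stricter about what counts as a duplicate, so more prompts are reported as novel; lowering it does the opposite.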


In [7]:
# Cell 7 — SearchController init (guarded, fallback to surrogate)
try:
    from tmrt.search_controller import SearchController
    # Load a minimal config from configs/toy_demo.yaml if available
    cfg_path = project_root / 'configs' / 'toy_demo.yaml'
    try:
        cfg = load_config(str(cfg_path))
        print('Loaded toy config')
    except Exception as e:
        # Report why loading failed instead of silently swallowing the error
        print('Could not load toy_demo.yaml; proceeding with minimal inline config. Error:', e)
        cfg = {'model_name': 'gpt-oss-20b', 'population_size': 4, 'num_generations': 1}
    try:
        controller = SearchController(model_name=cfg.get('model_name'), config=cfg, seed=1)
        print('SearchController initialized — WARNING: may attempt to load large models')
    except Exception as e:
        print('Controller init failed:', e)
        print('Fallback: set cfg["model_name"] = "gpt2" and re-initialize if you want a lightweight demo.')
except Exception as e:
    print('SearchController import failed (module missing or heavy deps):', e)

Loaded toy config


Controller init failed: gpt-2 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`
Fallback: set cfg["model_name"] = "gpt2" and re-initialize if you want a lightweight demo.
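
The fallback suggested in the last message can be automated by trying a list of model names in order and keeping the first one that initializes. The sketch below is deliberately generic: `init_with_fallback` and its `factory` argument are hypothetical names, and `stub_factory` stands in for a real constructor call such as `SearchController(model_name=name, config=cfg, seed=1)` so that the example runs without TMRT or any model weights.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def init_with_fallback(
    factory: Callable[[str], Any],
    model_names: Sequence[str],
) -> Tuple[Optional[str], Any, List[Tuple[str, str]]]:
    """Try each model name in order; return (name, instance, errors so far)."""
    errors: List[Tuple[str, str]] = []
    for name in model_names:
        try:
            return name, factory(name), errors
        except Exception as exc:  # broad on purpose: this is a demo guard
            errors.append((name, str(exc)))
    return None, None, errors  # every candidate failed


def stub_factory(name: str) -> dict:
    """Stand-in for a real constructor; it only accepts the identifier 'gpt2'."""
    if name != "gpt2":
        raise OSError(f"{name} is not a valid model identifier")
    return {"model_name": name}


name, controller, errors = init_with_fallback(stub_factory, ["gpt-2", "gpt2"])
print("Initialized with:", name)
for failed_name, message in errors:
    print("Skipped", failed_name, "->", message)
```

Returning the collected errors alongside the result keeps the failed attempts visible, which matters when the fallback silently swaps a large model for a small surrogate.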


## Notes
- This notebook is intended for local testing and demonstration.
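- To run full experiments against GPT-OSS-20B, the model must be available locally or through a hub account you are authenticated with.
- The Cell 7 failure above occurred because the model name in use was `gpt-2` (apparently taken from `configs/toy_demo.yaml`), which is not a valid Hugging Face identifier; the Hub identifier for GPT-2 is `gpt2`.
- A possible extension is a small end-to-end example that uses `gpt2` as a surrogate and writes a short findings JSON output.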