# XDetox with DecompX Masking, LLM Infilling, and DecompX Reranking

This notebook runs an XDetox variant with:

1. **DecompX masking** (token-level toxicity attribution on RoBERTa).
2. **LLM infilling** using Mistral-7B-Instruct  
   (`mistralai/Mistral-7B-Instruct-v0.2`).
3. **DecompX-based reranking** of multiple LLM candidates, following the
   reranking strategy described in *“XDetox: Text Detoxification with Token-Level Toxicity Explanations”*.

For each toxic input sentence, the goal is to pick **one final detoxified candidate** that:

- has the **lowest DecompX toxicity score** among the candidates, and
- was produced by an LLM that sees both the **raw toxic sentence** and the **DecompX-masked sentence**.

Compared to the **DecompX-masking + LLM infilling + global reranking** pipeline, this notebook:

- Uses the **same DecompX masking** and **same LLM infilling** setup,
- But replaces global reranking (toxicity + similarity + fluency) with **pure DecompX toxicity-based reranking**.

---

## Scoring: DecompX-Based Reranking

The reranking stage follows the idea described in Section 2.3 of the XDetox paper:

1. For each input, we generate multiple candidate detoxified sentences $s_j$ with the LLM.
2. For each candidate sentence $s_j$, we run **DecompX** and compute **token-level importance scores** with respect to toxicity.
3. For each candidate, we sum the importance scores of its tokens:

   - Let $t_{i,j}$ be the $i$-th token in candidate sentence $s_j$.
   - Let $\text{Importance}(t_{i,j})$ be its DecompX importance score (toxicity contribution).

   The **total toxicity importance** of candidate $s_j$ is:

   $$
   \sum_{i=1}^{N_j} \text{Importance}(t_{i,j})
   $$

4. We choose the candidate with the **lowest total importance**:

   $$
   s^* = \arg\min_{s_j} \sum_{i=1}^{N_j} \text{Importance}(t_{i,j})
   $$

Intuitively:

- A **lower sum** of token-level importance scores means **lower overall toxicity**.
- Reranking selects the candidate with the **minimum DecompX toxicity** among all candidates for that input.
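
As a minimal sketch of this selection rule, assume the per-token importance scores are already available as plain lists of floats. The numbers below are made up for illustration, and `select_least_toxic` is a hypothetical name, not part of XDetox or DecompX:

```python
from typing import List, Sequence

def select_least_toxic(importances: Sequence[Sequence[float]]) -> int:
    """Return the index j that minimizes sum_i Importance(t_{i,j})."""
    if not importances:
        raise ValueError("need at least one candidate")
    totals: List[float] = [sum(tok_scores) for tok_scores in importances]
    # min over indices; ties go to the earliest candidate
    return min(range(len(totals)), key=totals.__getitem__)

# Illustrative (made-up) token importances for three candidates of one input.
candidate_importances = [
    [0.10, 0.40, 0.05],        # total 0.55
    [0.02, 0.03, 0.01, 0.04],  # total 0.10
    [0.30, 0.30],              # total 0.60
]
best = select_least_toxic(candidate_importances)
print(best)  # 1
```

Candidates may have different lengths $N_j$, so the sum is taken per candidate before comparing.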

In the implementation, this is encapsulated in a helper like:

- `rerank_candidates_decompx(sources, candidates, threshold, batch_size_mask)`

which:

- Flattens all candidates,
- Runs DecompX on each candidate sentence,
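- Computes a DecompX-based toxicity score per candidate (in this notebook, approximated as the number of `<mask>` tokens DecompX inserts divided by the number of whitespace-separated tokens, rather than the raw importance sum),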
- Reshapes scores back to `[num_inputs, num_candidates]`,
- Picks the **index of the candidate with the lowest score** for each input.

There is **no XLM-R / LaBSE / GPT-2 global scoring** in this notebook.  
All reranking is done by **DecompX**.
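
The flatten / score / reshape / argmin steps listed above can be sketched without DecompX by plugging in any scoring function. Here `rerank_by_score` and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for the DecompX-based score, not an approximation of it:

```python
from typing import Callable, List, Sequence

def rerank_by_score(
    candidates: List[List[str]],
    score_fn: Callable[[List[str]], Sequence[float]],
) -> List[int]:
    """Flatten [N][C] candidates, score them in one pass, pick the argmin per input."""
    n = len(candidates)
    if n == 0:
        return []
    c = len(candidates[0])
    if c == 0 or any(len(row) != c for row in candidates):
        raise ValueError("every input needs the same, non-zero number of candidates")
    flat = [cand for row in candidates for cand in row]           # length N*C
    flat_scores = list(score_fn(flat))                            # length N*C
    rows = [flat_scores[i * c:(i + 1) * c] for i in range(n)]     # back to [N][C]
    return [min(range(c), key=row.__getitem__) for row in rows]   # argmin per input

# Hypothetical stand-in scorer: fraction of words found in a tiny blocklist.
def toy_score(texts: List[str]) -> List[float]:
    bad = {"stupid", "idiot"}
    return [
        sum(w.strip(".,!?").lower() in bad for w in t.split()) / max(len(t.split()), 1)
        for t in texts
    ]

cands = [
    ["You are a stupid idiot.", "You are mistaken."],
    ["That is fine.", "That is a stupid plan."],
]
best_idx = rerank_by_score(cands, toy_score)
final = [row[j] for row, j in zip(cands, best_idx)]
print(final)  # ['You are mistaken.', 'That is fine.']
```

Scoring the flattened list in one pass means the scorer is loaded and batched once rather than once per input.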

---

## DecompX Masking

Masking uses the original XDetox **DecompX masker**:

- Implementation: `rewrite.mask_orig.Masker`.
- Backend: RoBERTa with DecompX token-level toxicity attribution.

For each input sentence:

1. DecompX computes an **importance score** for each token based on its contribution to toxicity.
2. If the importance score of a token exceeds a **threshold** $t$, that token is considered **toxic**.
3. Such tokens are replaced by the `<mask>` token.
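
The threshold rule itself is simple enough to sketch on its own. The snippet below assumes word-level tokens and made-up importance scores; the real `Masker` works on RoBERTa subword tokens and handles details this sketch omits. `mask_tokens` is a hypothetical helper name:

```python
from typing import List, Sequence

def mask_tokens(tokens: Sequence[str], importances: Sequence[float],
                threshold: float, mask_token: str = "<mask>") -> List[str]:
    """Replace every token whose importance exceeds `threshold` with `mask_token`."""
    if len(tokens) != len(importances):
        raise ValueError("tokens and importances must have the same length")
    return [mask_token if score > threshold else tok
            for tok, score in zip(tokens, importances)]

tokens = ["You", "are", "such", "an", "idiot"]
scores = [0.01, 0.02, 0.05, 0.03, 0.85]   # made-up importances
masked = mask_tokens(tokens, scores, threshold=0.20)
print(" ".join(masked))  # You are such an <mask>
```

A lower threshold masks more tokens and leaves the LLM more to rewrite; a higher one keeps more of the original wording.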

For a given dataset and DecompX threshold $t$:

- Inputs are loaded via `rewrite_example.get_data`.
- Masked outputs are written to:

```text
  data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/masked_inputs.txt
```

(The directory name encodes the DecompX threshold. The code formats it with `:g`, so $t = 0.20$ yields `DecompX_LLM0.2`.)

These masked sentences are later fed into the LLM infiller.

---

## LLM Infilling (Mistral-7B-Instruct)

After DecompX masking, we use **Mistral-7B-Instruct** as an **infilling model**.

### Inputs to the LLM

For each example we provide **both**:

* **Toxic Sentence**: the original toxic sentence, unchanged.
* **Masked Sentence**: the DecompX-masked sentence, where toxic spans have been replaced by `<mask>`.

The LLM prompt is structured along the lines of:

```text
You are a helpful assistant trained to make toxic or offensive sentences more polite and respectful
while keeping their original meaning. ...

Toxic Sentence: {raw_toxic}
Masked Sentence: {masked_by_DecompX}
Final Output:
```

The instructions emphasize:

* **Only fill in the `<mask>` tokens** in the Masked Sentence.
* Keep **all non-masked parts** of the Masked Sentence as close as possible to their original form.
* Preserve the **meaning and intent** of the Toxic Sentence.
* Use the **same language** as the Toxic Sentence.
* Return **only** the final detoxified sentence **inside one pair of square brackets**:

```text
[Detoxified sentence here.]
```

### Candidate generation

For each $(\text{toxic}, \text{masked})$ pair, we ask Mistral for `num_candidates` completions:

* The generation parameters include:

  * `llm_sample` (sampling vs greedy),
  * `llm_temperature`,
  * `llm_top_p`,
  * `max_new_tokens`.

* For each completion, we:

  1. **Extract content inside the first `[ ... ]` block.**
  2. **Strip any remaining outer brackets.**
  3. **Normalize whitespace.**

If a cleaned candidate is empty, we fall back to using the Masked Sentence.

The result: for each input sentence, we obtain a list of `num_candidates` **LLM-generated candidates** to be reranked by DecompX.
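
To make the three cleanup steps and the fallback concrete, here is a compact sketch. `postprocess_completion` is a hypothetical helper written for this illustration; it approximates the behaviour described above and is not the notebook's own implementation:

```python
import re

def postprocess_completion(raw: str, masked_sentence: str) -> str:
    """Bracket extraction, outer-bracket stripping, whitespace cleanup, fallback."""
    text = raw.strip()
    match = re.search(r"\[([^\]]*)\]", text, flags=re.DOTALL)
    if match:                       # first complete [ ... ] block
        text = match.group(1)
    elif "[" in text:               # opening bracket without a closing one
        text = text.split("[", 1)[1]
    text = text.strip().strip("[]")           # drop any leftover outer brackets
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text if text else masked_sentence  # empty result -> Masked Sentence

masked = "You are such a <mask>."
print(postprocess_completion("[You are such a   rude person.] Hope this helps!", masked))
# You are such a rude person.
print(postprocess_completion("[]", masked))
# You are such a <mask>.
```

Note that the fallback returns the Masked Sentence verbatim, so a fallback candidate can still contain `<mask>` tokens.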

---

## End-to-End Flow: Masking, LLM Infilling, DecompX Reranking

For each dataset:

1. **Subset selection**

   * The script can run on the full dataset or only the first `num_examples` instances.
   * A subset file is written under:

```text
datasets/_subsets/{data_type}/
```

2. **DecompX masking (per threshold)**

   * For each threshold $t$ in `thresholds`, we run DecompX masking.
   * Masked sentences are saved to:

```text
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/masked_inputs.txt
```

3. **LLM infilling (Mistral-7B-Instruct)**

   * For each masked sentence and its corresponding toxic input, we call Mistral with the prompt described above.
   * We generate `num_candidates` candidates per input.
   * The raw LLM outputs are post-processed (bracket extraction, whitespace cleanup).

4. **DecompX-based reranking**

   * For each threshold $t$ and each input sentence, we apply DecompX to **all LLM candidates**.

   * For each candidate $s_j$, we compute the total toxicity importance:

$$
\sum_{i=1}^{N_j} \text{Importance}(t_{i,j})
$$

   * We choose the candidate with the **lowest** total importance as the final output.

   * For each run, we write:

```text
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/orig.txt
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/gen.txt
```

   where:

   * `orig.txt`: original toxic inputs (one per line),
   * `gen.txt`: selected (reranked) LLM outputs (one per line),
   * `{run_folder}` encodes LLM generation hyperparameters.

---

## Evaluation

If `run_eval=True`, the pipeline calls `evaluation.evaluate_all` to compute:

* BERTScore (F1),
* MeaningBERT,
* BLEU-4,
* Toxicity (orig / gen),
* Perplexity (orig / gen).

For each `(threshold, run_folder)` we write:

```text
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/gen_stats.txt
```

The notebook also builds a **summary CSV per dataset** by scanning all `DecompX_LLM*` directories:

```text
data/model_outputs/{output_folder}/{data_type}/{data_type}.csv
```

This CSV aggregates:

* `threshold` (DecompX masking / reranking threshold),
* `folder` (run folder name),
* `bertscore`, `meaningbert`, `bleu4`,
* `perplexity_gen`, `perplexity_orig`,
* `toxicity_gen`, `toxicity_orig`.
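
One way to use the summary is to rank runs by generated-text toxicity and break ties by BERTScore. The sketch below uses only the standard library and an inline CSV with made-up numbers that follow the column layout above; `rank_runs` is a hypothetical helper, and with real data you would pass the contents of the summary file instead:

```python
import csv
import io
from typing import Dict, List

def rank_runs(csv_text: str) -> List[Dict[str, str]]:
    """Sort summary rows by toxicity_gen (ascending), then bertscore (descending)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: (float(r["toxicity_gen"]), -float(r["bertscore"])))

# Made-up numbers, only to show the column layout.
example_csv = """threshold,folder,bertscore,meaningbert,bleu4,perplexity_gen,perplexity_orig,toxicity_gen,toxicity_orig
0.15,run_a,0.93,70.1,55.0,80.2,120.5,0.12,0.90
0.2,run_a,0.95,72.3,60.4,85.0,120.5,0.08,0.90
0.25,run_a,0.96,74.0,63.2,90.1,120.5,0.08,0.90
"""
ranked = rank_runs(example_csv)
print([r["threshold"] for r in ranked])  # ['0.25', '0.2', '0.15']
```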

---

## How to Use `detoxify()`

A typical function signature for this notebook looks like:

```python
def detoxify(
    data_type: str = "paradetox",
    output_folder: str = "colab_run_decompx_mask_llm_decompx",
    thresholds = (0.20,),
    echo: bool = False,
    num_examples: int = 100,        # None = full dataset
    overwrite_gen: bool = False,
    run_eval: bool = False,
    overwrite_eval: bool = False,
    skip_ref_eval: bool = False,
    # DecompX masking
    mask_batch_size: int = 10,
    # LLM infilling
    llm_sample: bool = True,
    llm_temperature: float = 0.7,
    llm_top_p: float = 0.95,
    max_new_tokens: int = 64,
    num_candidates: int = 3,        # LLM candidates per input
): ...
```

### Key arguments

#### Core I/O

* `data_type`:

  * Key in `data_configs`, for example:

    * `"paradetox"`, `"dynabench_val"`, `"dynabench_test"`,
    * `"jigsaw_toxic"`, `"microagressions_val"`, `"sbf_val"`,
    * `"appdia_original"`, `"appdia_discourse"`, etc.

* `output_folder`:

  * Top-level directory under:

    ```text
    data/model_outputs/{output_folder}/{data_type}/...
    ```

* `num_examples`:

  * `None`: use the full dataset.
  * Integer: run on the first `num_examples` examples.

* `overwrite_gen`:

  * `False`: if `gen.txt` already exists for a given `(threshold, run_folder)`, reuse existing generations.
  * `True`: regenerate and overwrite `gen.txt`.

* `echo`:

  * If `True`, print:

    * Basic dataset info,
    * Example inputs,
    * Example masked sentences,
    * Example final outputs,
    * Per-run metrics (if `run_eval=True`).

#### DecompX masking and thresholds

* `thresholds`:

  * Tuple of DecompX thresholds (e.g. `(0.15, 0.20, 0.25)`).
  * For each $t$, we:

    * Run DecompX masking with threshold $t$,
    * Run LLM infilling,
    * Apply DecompX-based reranking (using the same DecompX mechanism).

* `mask_batch_size`:

  * Batch size used when running DecompX masking over inputs.

#### LLM infilling (Mistral)

* `llm_sample`:

  * `True`: sampling.
  * `False`: deterministic decoding.

* `llm_temperature`:

  * Sampling temperature for Mistral (used when `llm_sample=True`).

* `llm_top_p`:

  * Top-p nucleus sampling cutoff.

* `max_new_tokens`:

  * Maximum number of new tokens generated per candidate.

* `num_candidates`:

  * Number of LLM candidates per input to be reranked by DecompX.
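
As a rough guide to how these options relate to Hugging Face `generate()` keyword arguments, the sketch below builds the keyword dictionary without loading a model. `build_generate_kwargs` is a hypothetical helper, and the greedy branch reflects an assumption of this sketch: without sampling or beam search only one sequence can be returned, so it would be reused for every candidate slot.

```python
from typing import Any, Dict

def build_generate_kwargs(llm_sample: bool, llm_temperature: float, llm_top_p: float,
                          max_new_tokens: int, num_candidates: int) -> Dict[str, Any]:
    """Map the notebook-level options onto keyword arguments for `model.generate`."""
    kwargs: Dict[str, Any] = {"max_new_tokens": max_new_tokens, "do_sample": llm_sample}
    if llm_sample:
        # temperature / top_p only apply when sampling
        kwargs.update(temperature=llm_temperature, top_p=llm_top_p,
                      num_return_sequences=num_candidates)
    else:
        # Greedy decoding is deterministic: request a single sequence.
        kwargs["num_return_sequences"] = 1
    return kwargs

sampling = build_generate_kwargs(True, 0.7, 0.95, 64, 3)
greedy = build_generate_kwargs(False, 0.7, 0.95, 64, 3)
print(sampling)
print(greedy)
```

With `llm_sample=False` all candidates for an input are identical, so reranking only has an effect when sampling is enabled.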

#### Evaluation

* `run_eval`:

  * If `True`, compute evaluation metrics and write `gen_stats.txt` files.

* `overwrite_eval`:

  * If `True`, recompute metrics even if `gen_stats.txt` already exists.

* `skip_ref_eval`:

  * If `True`, skip reference-based evaluation (for example, perplexity on reference outputs).

---

## Example Calls

### Quick sanity check (single threshold, small subset)

```python
detoxify(
    data_type="paradetox",
    output_folder="colab_run_decompx_mask_llm_decompx_demo_50_ex",
    thresholds=(0.20,),
    echo=True,
    num_examples=50,          # small subset
    overwrite_gen=True,
    run_eval=True,            # BLEU / BERTScore / MeaningBERT / PPL / Toxicity
    overwrite_eval=True,
    skip_ref_eval=False,
    mask_batch_size=8,
    llm_sample=True,
    llm_temperature=0.7,
    llm_top_p=0.95,
    max_new_tokens=64,
    num_candidates=10,
)
```

### Larger run (multiple thresholds, full dataset)

```python
detoxify(
    data_type="paradetox",
    output_folder="paradetox_decompx_mask_llm_decompx_full",
    thresholds=(0.15, 0.20, 0.25),
    echo=True,
    num_examples=None,        # full dataset
    overwrite_gen=False,
    run_eval=True,
    overwrite_eval=False,
    skip_ref_eval=False,
    mask_batch_size=8,
    llm_sample=True,
    llm_temperature=0.7,
    llm_top_p=0.95,
    max_new_tokens=64,
    num_candidates=10,
)
```

After running `detoxify`, you can inspect:

* Per-threshold, per-run outputs:

```text
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/orig.txt
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/gen.txt
data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{t}/{run_folder}/gen_stats.txt
```

* Aggregated metrics:

```text
data/model_outputs/{output_folder}/{data_type}/{data_type}.csv
```
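
For a quick side-by-side look at inputs and selected outputs, the two text files can be paired line by line. The sketch below uses a temporary directory as a stand-in for a real `{run_folder}`; `load_pairs` is a hypothetical helper, not part of the repository:

```python
import os
import tempfile
from typing import List, Tuple

def load_pairs(run_dir: str) -> List[Tuple[str, str]]:
    """Pair each line of orig.txt with the same line of gen.txt."""
    with open(os.path.join(run_dir, "orig.txt"), encoding="utf-8") as f:
        orig = [line.rstrip("\n") for line in f]
    with open(os.path.join(run_dir, "gen.txt"), encoding="utf-8") as f:
        gen = [line.rstrip("\n") for line in f]
    if len(orig) != len(gen):
        raise ValueError(f"line count mismatch: {len(orig)} vs {len(gen)}")
    return list(zip(orig, gen))

# Demo with a temporary directory standing in for a real run folder.
with tempfile.TemporaryDirectory() as run_dir:
    for name, lines in [("orig.txt", ["toxic one", "toxic two"]),
                        ("gen.txt", ["polite one", "polite two"])]:
        with open(os.path.join(run_dir, name), "w", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")
    pairs = load_pairs(run_dir)

print(pairs[0])  # ('toxic one', 'polite one')
```

The length check matters because both files are written one sentence per line, and a mismatch means the pairing would be misaligned.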

This notebook lets you compare, on the same datasets:

* **DecompX masking + LLM infilling + DecompX reranking** (this pipeline, which selects candidates by DecompX toxicity alone),
* against:

  * **DecompX masking + LLM infilling + global reranking**, and
  * the original **DecompX masking + MaRCo + DecompX/global reranking** pipelines.



In [1]:
#@title Mount Drive, Imports & locate XDetox
from google.colab import drive; drive.mount('/content/drive')

import os, glob, re, sys, json, shutil, math
import numpy as np
import pandas as pd
from tqdm.auto import tqdm
from pathlib import Path
from subprocess import run, PIPE
import torch
import nltk
from typing import List

# Try My Drive
candidate = "/content/drive/MyDrive/w266 - Project/XDetox"
print("Try MyDrive:", candidate, "->", os.path.isdir(candidate))

XDETOX_DIR = candidate
print("Using XDETOX_DIR:", XDETOX_DIR)
assert os.path.isdir(XDETOX_DIR), f"XDETOX_DIR does not exist: {XDETOX_DIR}"

Mounted at /content/drive
Try MyDrive: /content/drive/MyDrive/w266 - Project/XDetox -> True
Using XDETOX_DIR: /content/drive/MyDrive/w266 - Project/XDetox


In [2]:
#@title Runtime setup (paths, cache, GPU)
HF_CACHE = os.path.join(XDETOX_DIR, "cache")
os.makedirs(HF_CACHE, exist_ok=True)
os.environ["TRANSFORMERS_CACHE"] = HF_CACHE

if XDETOX_DIR not in sys.path:
    sys.path.append(XDETOX_DIR)

print("XDETOX_DIR:", XDETOX_DIR)
print("TRANSFORMERS_CACHE:", HF_CACHE)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

XDETOX_DIR: /content/drive/MyDrive/w266 - Project/XDetox
TRANSFORMERS_CACHE: /content/drive/MyDrive/w266 - Project/XDetox/cache
CUDA available: True
GPU: NVIDIA A100-SXM4-80GB


In [3]:
#@title Verify XDetox repo layout
for d in ["rewrite", "evaluation", "datasets"]:
    assert os.path.isdir(os.path.join(XDETOX_DIR, d)), f"Missing folder: {d}"
print("Repo folders OK.")

Repo folders OK.


In [4]:
#@title Install dependencies (restart runtime if major errors)
!pip -q install --upgrade pip setuptools wheel
!pip -q install "transformers==4.41.2" "tokenizers==0.19.1" \
                "datasets==2.19.0" "evaluate==0.4.1" \
                "sacrebleu==2.4.1" sacremoses ftfy nltk matplotlib pandas jedi \
                sentencepiece
!pip -q install bert-score

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.34.0 requires jedi>=0.16, which is not installed.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gcsfs 2025.3.0 requires fsspec==2025.3.0, but you have fsspec 2024.3.1 which is incompatible.

In [5]:
#@title Imports from transformers / rewrite
from transformers import AutoTokenizer, AutoModelForCausalLM
from rewrite.mask_orig import Masker as Masker_single
from rewrite import rewrite_example as rx
import argparse as _argparse



In [6]:
#@title NLTK data
nltk.download("punkt", quiet=True)
try:
    nltk.download("punkt_tab", quiet=True)
except Exception:
    pass
print("NLTK ready")

NLTK ready


In [7]:
#@title Data configs
data_configs = {
    "microagressions_val": {
        "data_path": "./datasets/microagressions/val.csv",
    },
    "microagressions_test": {
        "data_path": "./datasets/microagressions/test.csv",
    },
    "sbf_val": {
        "data_path": "./datasets/sbf/sbfdev.csv",
    },
    "sbf_test": {
        "data_path": "./datasets/sbf/sbftst.csv",
    },
    "dynabench_val": {
        "data_path": "./datasets/dynabench/db_dev.csv",
    },
    "dynabench_test": {
        "data_path": "./datasets/dynabench/db_test.csv",
    },
    "jigsaw_toxic": {
        "data_path": "./datasets/jigsaw_full_30/test_10k_toxic.txt",
    },
    "paradetox": {
        "data_path": "./datasets/paradetox/test_toxic_parallel.txt",
    },
    "appdia_original": {
        "data_path": "./datasets/appdia/original-annotated-data/original-test.tsv",
    },
    "appdia_discourse": {
        "data_path": "./datasets/appdia/discourse-augmented-data/discourse-test.tsv",
    }
}
print("Datasets:", ", ".join(data_configs.keys()))

REPO = XDETOX_DIR

Datasets: microagressions_val, microagressions_test, sbf_val, sbf_test, dynabench_val, dynabench_test, jigsaw_toxic, paradetox, appdia_original, appdia_discourse


In [8]:
#@title Helpers: subset data
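def _abs_repo_path(rel: str) -> str:
    # Absolute paths are returned as-is. A leading "./" is removed as a
    # prefix; str.lstrip("./") would instead strip any run of "." and "/"
    # characters (e.g. it would mangle "../x" or an absolute path).
    if os.path.isabs(rel):
        return rel
    if rel.startswith("./"):
        rel = rel[2:]
    return os.path.join(REPO, rel)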

def _ensure_dir(p: str):
    Path(p).mkdir(parents=True, exist_ok=True)

def _subset_for_data_type(data_type, data_path, n, out_dir):
    """
    Create a small subset file matching rewrite_example.get_data().
    Returns the new subset path (or original path if n is None/<=0).
    """
    if n is None or n <= 0:
        return data_path

    src = _abs_repo_path(data_path)
    _ensure_dir(out_dir)

    # CSV-backed datasets share the same head(n) subsetting logic.
    if any(k in data_path for k in ["microagressions", "sbf", "dynabench"]):
        df = pd.read_csv(src)
        sub = df.head(n)
        out = os.path.join(out_dir, os.path.basename(src))
        sub.to_csv(out, index=False)
        return out

    if any(k in data_path for k in ["paradetox", "jigsaw"]):
        if data_path.endswith(".txt"):
            with open(src, "r") as f:
                lines = [s.rstrip("\n") for s in f.readlines()]
            out = os.path.join(out_dir, os.path.basename(src))
            with open(out, "w") as g:
                for s in lines[:n]:
                    g.write(s + "\n")
            return out
        elif data_path.endswith(".csv"):
            df = pd.read_csv(src).head(n)
            out = os.path.join(out_dir, os.path.basename(src))
            df.to_csv(out, index=False)
            return out

    if "appdia" in data_path:
        df = pd.read_csv(src, sep="\t").head(n)
        out = os.path.join(out_dir, os.path.basename(src))
        df.to_csv(out, sep="\t", index=False)
        return out

    out = os.path.join(out_dir, os.path.basename(src))
    shutil.copy(src, out)
    return out

In [9]:
#@title DecompX helpers for masking and reranking

def _decompx_mask_texts(texts: List[str],
                        threshold: float = 0.20,
                        batch_size: int = 16) -> List[str]:
    """
    Run DecompX Masker on a list of texts and return masked versions.
    """
    if not texts:
        return []

    masker = Masker_single()
    masked_all = []
    try:
        for i in tqdm(range(0, len(texts), batch_size),
                      desc="DecompX masking", leave=False):
            batch = texts[i:i + batch_size]
            batch_out = masker.process_text(sentence=batch, threshold=threshold)
            masked_all.extend(batch_out)
    finally:
        # Free the DecompX/RoBERTa model even if a batch fails.
        masker.release_model()

    cleaned = [
        m.replace("<s>", "").replace("</s>", "").strip()
        for m in masked_all
    ]
    return cleaned

def _decompx_toxicity_scores(texts: List[str],
                             threshold: float = 0.20,
                             batch_size: int = 16) -> np.ndarray:
    """
    Approximate DecompX toxicity for texts by:
      score = (# of <mask> tokens DecompX inserts) / (# tokens)

    Lower score => less DecompX-toxic.
    """
    if not texts:
        return np.zeros((0,), dtype=float)

    masked = _decompx_mask_texts(texts, threshold=threshold, batch_size=batch_size)
    scores = []
    for m in masked:
        num_masks = len(re.findall(r"<mask>", m))
        tokens = m.split()
        length = max(len(tokens), 1)
        scores.append(num_masks / length)
    return np.asarray(scores, dtype=float)

def rerank_candidates_decompx(
    sources: List[str],
    candidates: List[List[str]],
    threshold: float = 0.20,
    batch_size_mask: int = 16,
):
    """
    DecompX-based reranking (XDetox-style):

    For each candidate sentence s_j:
      - apply DecompX
      - compute a toxicity score (here approximated by #masks / length)
      - select the candidate with MINIMUM score.

    Returns:
      best_idx: np.ndarray (N,) with chosen candidate index per source
      details: dict with 'score' matrix shape [N, C]
    """
    N = len(sources)
    assert len(candidates) == N, "candidates length mismatch"

    if N == 0:
        # Keep the documented return shape even for empty input.
        return np.array([], dtype=int), {"score": np.zeros((0, 0), dtype=float)}

    C_list = [len(c) for c in candidates]
    assert len(set(C_list)) == 1, "All inputs must have same num_candidates"
    C = C_list[0]
    if C == 0:
        raise ValueError("num_candidates must be >= 1")

    flat_cands = []
    flat_src_idx = []
    for i, cand_list in enumerate(candidates):
        for cand in cand_list:
            flat_cands.append(cand)
            flat_src_idx.append(i)
    flat_src_idx = np.array(flat_src_idx, dtype=int)

    scores = _decompx_toxicity_scores(
        flat_cands,
        threshold=threshold,
        batch_size=batch_size_mask,
    )  # [N*C]

    scores2 = scores.reshape(N, C)
    best_idx = np.argmin(scores2, axis=1)

    details = {
        "score": scores2,
    }
    return best_idx, details


In [10]:
#@title Evaluation helpers (evaluate_all.py with MeaningBERT + toxicity)
def _parse_run_folder_name(folder_name):
    """
    For this pipeline we simply treat all subfolders as valid runs.
    """
    return True

def _eval_with_toxicity(base_path, overwrite_eval=False, skip_ref=False,
                        tox_threshold=0.5, tox_batch_size=32):
    """
    Call evaluation.evaluate_all on each gen folder under base_path.
    """
    import sys as _sys, os as _os
    for folder in os.listdir(base_path):
        gen_dir = os.path.join(base_path, folder)
        if not os.path.isdir(gen_dir) or not _parse_run_folder_name(folder):
            continue
        orig_path = os.path.join(gen_dir, "orig.txt")
        gen_path  = os.path.join(gen_dir, "gen.txt")
        out_stats = os.path.join(gen_dir, "gen_stats.txt")
        if not (os.path.exists(orig_path) and os.path.exists(gen_path)):
            continue
        if os.path.exists(out_stats) and not overwrite_eval:
            continue

        env = os.environ.copy()
        # Prepend the repo to PYTHONPATH using the platform's path separator.
        env["PYTHONPATH"] = os.pathsep.join(
            p for p in (REPO, env.get("PYTHONPATH", "")) if p
        )
        cmd = [
            _sys.executable, "-m", "evaluation.evaluate_all",
            "--orig_path", orig_path,
            "--gen_path",  gen_path,
            "--tox_threshold", str(tox_threshold),
            "--tox_batch_size", str(tox_batch_size),
        ]
        if skip_ref:
            cmd.append("--skip_ref")
        print("Eval:", " ".join(cmd))
        res = run(cmd, cwd=REPO, env=env, stdout=PIPE, stderr=PIPE, text=True)
        if res.returncode != 0:
            print(res.stdout)
            print(res.stderr)
            res.check_returncode()

def _safe_float(x):
    try:
        return float(x)
    except Exception:
        return float('nan')

def _read_stats_file(path):
    out = {}
    with open(path, "r") as f:
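        for line in f:
            # Partition on the first ":" so lines like "key:" or "key:value"
            # (no space after the colon) are handled instead of raising.
            k, sep, v = line.strip().partition(":")
            if not sep:
                continue
            k = k.replace("(skipped)", "").strip().lower()
            out[k] = _safe_float(v.strip())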
    return out

def _aggregate_eval_csv(output_folder, data_type, base_out_dir):
    """
    Aggregate eval metrics for DecompX-masking + LLM-infilling + DecompX-reranking.

    Layout (absolute base_out_dir):
      base_out_dir/
        └── {data_type}/
            └── DecompX_LLM{thresh}/
                └── {run_folder}/
                    └── gen_stats.txt
    """
    rows = []

    root = os.path.join(base_out_dir, data_type)
    if not os.path.isdir(root):
        print("No evaluation directory found:", root)
        return

    for mask_dir in os.listdir(root):
        if not mask_dir.startswith("DecompX_LLM"):
            continue
        thresh_str = mask_dir.replace("DecompX_LLM", "")
        try:
            threshold = float(thresh_str)
        except Exception:
            threshold = np.nan

        base_path = os.path.join(root, mask_dir)
        for folder in os.listdir(base_path):
            gen_dir = os.path.join(base_path, folder)
            stats_path = os.path.join(gen_dir, "gen_stats.txt")
            if not os.path.exists(stats_path):
                continue
            s = _read_stats_file(stats_path)
            rows.append({
                "threshold":        threshold,
                "folder":           folder,
                "bertscore":        s.get("bertscore", np.nan),
                "meaningbert":      s.get("meaningbert", np.nan),
                "bleu4":            s.get("bleu4", np.nan),
                "perplexity_gen":   s.get("perplexity gen", np.nan),
                "perplexity_orig":  s.get("perplexity orig", np.nan),
                "toxicity_gen":     s.get("toxicity gen", np.nan),
                "toxicity_orig":    s.get("toxicity orig", np.nan),
            })

    if rows:
        cols = [
            "threshold", "folder",
            "bertscore", "meaningbert", "bleu4",
            "perplexity_gen", "perplexity_orig",
            "toxicity_gen", "toxicity_orig",
        ]
        df = pd.DataFrame(rows)
        df = df[cols]
        out_csv = os.path.join(base_out_dir, data_type, f"{data_type}.csv")
        df.to_csv(out_csv, index=False)
        print("Wrote summary CSV:", out_csv)
    else:
        print("No evaluation files found to summarize.")

In [11]:
#@title LLM infilling (Mistral-7B-Instruct) for masked sentences
USE_LLM_GPU = True
DEVICE_LLM = torch.device("cuda" if USE_LLM_GPU and torch.cuda.is_available() else "cpu")
print("LLM infiller device:", DEVICE_LLM)

LLM_MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"
_LLM_TOKENIZER = None
_LLM_MODEL = None

INFILL_SYSTEM_PROMPT = """You are a helpful assistant trained to make toxic or offensive sentences
more polite and respectful by INFILLING <mask> tokens in a masked sentence.

You are given:
- Toxic Sentence: the original sentence, which may contain offensive language.
- Masked Sentence: the same sentence, but all toxic spans are replaced by <mask>.

Your rules:
1. Only replace <mask> tokens in the Masked Sentence with polite, non-toxic alternatives.
2. Do NOT change any other words or punctuation from the Masked Sentence,
   except for small grammar fixes needed after infilling.
3. Preserve the original meaning and intent of the Toxic Sentence as much as possible.
4. Keep the same language as the Toxic Sentence.
5. If the Masked Sentence has no <mask> tokens, return it unchanged.

Output rules (very strict):
- Return ONLY the final detoxified sentence inside ONE pair of square brackets, like:
  [You are such a rude person, nobody wants to hear your opinion.]
- Do NOT print anything before or after the brackets.
- Do NOT add explanations, comments, or extra lines.
- Do NOT include additional '[' or ']' characters inside the sentence.
"""

INFILL_FEW_SHOT = """Toxic Sentence: You're such a stupid idiot, nobody wants to hear your crap.
Masked Sentence: You're such a <mask>, nobody wants to hear your <mask>.
Step 1 - Decide polite replacements for <mask>: "rude person", "opinion"
Step 2 - Insert them into the Masked Sentence, keeping all other tokens:
You're such a rude person, nobody wants to hear your opinion.
Final Output: [You're such a rude person, nobody wants to hear your opinion.]"""

def _lazy_load_llm_infiller():
    global _LLM_MODEL, _LLM_TOKENIZER
    if _LLM_MODEL is not None and _LLM_TOKENIZER is not None:
        return
    print(f"Loading LLM infiller: {LLM_MODEL_NAME} on {DEVICE_LLM} ...")
    _LLM_TOKENIZER = AutoTokenizer.from_pretrained(LLM_MODEL_NAME)
    _LLM_MODEL = AutoModelForCausalLM.from_pretrained(
        LLM_MODEL_NAME,
        torch_dtype=torch.float16 if DEVICE_LLM.type == "cuda" else torch.float32,
        device_map=None,
    ).to(DEVICE_LLM)
    _LLM_MODEL.eval()
    print("LLM infiller loaded.")

def _extract_bracket_content(text: str) -> str:
    """
    Extract content inside the first [ ... ] block.
    If missing ']', take everything after '['.
    If no '[', return the whole string.
    """
    text = text.strip()
    m = re.search(r"\[([^\]]*)\]", text, flags=re.DOTALL)
    if m:
        return m.group(1).strip()
    if "[" in text:
        return text.split("[", 1)[1].strip()
    return text

def _cleanup_llm_output(s: str) -> str:
    """
    Remove outer brackets if still present and normalize whitespace.
    """
    s = s.strip()
    if s.startswith("[") and s.endswith("]") and len(s) > 2:
        s = s[1:-1].strip()
    else:
        if s.startswith("["):
            s = s[1:].strip()
        if s.endswith("]"):
            s = s[:-1].strip()
    s = re.sub(r"\s+", " ", s).strip()
    return s

@torch.no_grad()
def llm_infill_candidates(
    toxic_sentences: List[str],
    masked_sentences: List[str],
    num_candidates: int = 3,
    temperature: float = 0.7,
    top_p: float = 0.95,
    max_new_tokens: int = 64,
    sample: bool = True,
) -> List[List[str]]:
    """
    For each (toxic, masked) pair, generate `num_candidates` detoxified candidates
    by infilling <mask> tokens with Mistral-7B-Instruct.

    Returns: list of length N; each element is a list of length num_candidates.
    """
    _lazy_load_llm_infiller()
    assert len(toxic_sentences) == len(masked_sentences), "length mismatch"

    all_candidates: List[List[str]] = []

    for idx in tqdm(range(len(toxic_sentences)), desc="LLM infilling", leave=False):
        toxic = toxic_sentences[idx]
        masked = masked_sentences[idx]

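        # Mistral-7B-Instruct-v0.2's chat template expects alternating
        # user/assistant turns and may reject a separate "system" role, in
        # which case the except-branch below would silently drop the [INST]
        # formatting. The instructions are therefore sent in the user turn.
        messages = [
            {
                "role": "user",
                "content": (
                    INFILL_SYSTEM_PROMPT
                    + "\n\nBelow is an example:\n"
                    + INFILL_FEW_SHOT
                    + "\n\n"
                    + f"Toxic Sentence: {toxic}\n"
                    + f"Masked Sentence: {masked}\n"
                    + "Final Output:"
                ),
            },
        ]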
        try:
            prompt = _LLM_TOKENIZER.apply_chat_template(
                messages,
                tokenize=False,
                add_generation_prompt=True,
            )
        except Exception:
            prompt = (
                INFILL_SYSTEM_PROMPT
                + "\n\nExample:\n"
                + INFILL_FEW_SHOT
                + "\n\nToxic Sentence: "
                + toxic
                + "\nMasked Sentence: "
                + masked
                + "\nFinal Output:"
            )

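        inputs = _LLM_TOKENIZER(prompt, return_tensors="pt").to(DEVICE_LLM)
        gen_kwargs = dict(
            max_new_tokens=max_new_tokens,
            do_sample=sample,
            pad_token_id=_LLM_TOKENIZER.eos_token_id,
        )
        if sample:
            # temperature / top_p only apply when sampling, so they are
            # not passed at all for greedy decoding.
            gen_kwargs.update(
                temperature=temperature,
                top_p=top_p,
                num_return_sequences=num_candidates,
            )
        # Without sampling, generate() rejects num_return_sequences > 1
        # (no beam search here), so the single greedy output is reused
        # for every candidate slot below.
        gen = _LLM_MODEL.generate(**inputs, **gen_kwargs)
        input_len = inputs["input_ids"].shape[1]

        cands_for_this = []
        for k in range(num_candidates):
            gen_text = _LLM_TOKENIZER.decode(
                gen[min(k, gen.shape[0] - 1)][input_len:], skip_special_tokens=True
            )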
            detox = _extract_bracket_content(gen_text)
            detox = _cleanup_llm_output(detox)
            if not detox:
                detox = masked  # fallback
            cands_for_this.append(detox)

        all_candidates.append(cands_for_this)

    return all_candidates


LLM infiller device: cuda
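
The per-candidate post-processing in `llm_infill_candidates` (extract, clean, fall back to the masked sentence) can be illustrated in isolation. The sketch below is a minimal stand-in, not the notebook's implementation: `extract_bracket_content` and `postprocess_candidates` are hypothetical names, and the bracket-extraction rule (take the text inside the first `[...]` pair) is an assumption about what `_extract_bracket_content` does.

```python
import re
from typing import List


def extract_bracket_content(text: str) -> str:
    """Hypothetical stand-in: text inside the first [...] pair, else the stripped text."""
    match = re.search(r"\[(.*?)\]", text, flags=re.DOTALL)
    return (match.group(1) if match else text).strip()


def postprocess_candidates(raw_outputs: List[str], masked: str) -> List[str]:
    """Clean each raw completion; use the masked sentence when nothing usable remains."""
    cleaned = []
    for raw in raw_outputs:
        # Collapse whitespace runs, mirroring the normalization used when saving files.
        detox = re.sub(r"\s+", " ", extract_bracket_content(raw)).strip()
        cleaned.append(detox if detox else masked)
    return cleaned


masked = "right <mask> now would be good ."
raw = ["[right now would be good .]", "   ", "no brackets here"]
result = postprocess_candidates(raw, masked)
print(result)
```

Note that a fallback candidate still contains `<mask>` tokens and enters the reranking pool like any other candidate, so it may be worth counting how often the fallback fires.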


In [12]:
#@title Masking + LLM infilling + DecompX reranking

def _bool2str(x: bool) -> str:
    return "T" if x else "F"

def _build_llm_run_folder_name(
    temperature: float,
    sample: bool,
    top_p: float,
    max_new_tokens: int,
    num_candidates: int,
):
    return (
        "llmtemp" + str(temperature) +
        "_sample" + _bool2str(sample) +
        "_topp" + str(top_p) +
        "_maxnew" + str(max_new_tokens) +
        "_ncand" + str(num_candidates)
    )

def _run_decompx_masking_llm_infill_and_decompx_reranking_for_threshold(
    data_type,
    subset_path,
    thresh,
    base_out_rel,
    mask_batch_size,
    llm_sample,
    llm_temperature,
    llm_top_p,
    max_new_tokens,
    num_candidates,
    decompx_threshold,
    overwrite_gen=False,
    inputs=None,
    rerank_batch_size: int = 16,
    echo: bool = False,
):
    """
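    For one DecompX threshold:
      - Mask with DecompX (on raw toxic inputs); masked_inputs.txt is cached and reused if present.
      - Use Mistral LLM to infill <mask> tokens and generate num_candidates per input.
      - Rerank candidates with DecompX-based toxicity (lowest score wins).
      - Save orig.txt / gen.txt under:
          data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{thresh}/{run_folder}/
        (an existing gen.txt is reused unless overwrite_gen=True).

    Returns (inputs, masked_inputs, best_generations, run_dir).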
    """
    # Load inputs if not provided
    if inputs is None:
        args_data = _argparse.Namespace(data_type=data_type, data_path=subset_path)
        inputs = rx.get_data(args_data)
    print(f"#inputs at thresh={thresh}: {len(inputs)}")

    # Paths
    mask_dir = f"DecompX_LLM{abs(thresh):g}" if thresh != 0 else "DecompX_LLM0.0"
    cur_rel = os.path.join(base_out_rel, data_type, mask_dir)
    cur_abs = os.path.join(REPO, cur_rel)
    _ensure_dir(cur_abs)

    masked_file = os.path.join(cur_abs, "masked_inputs.txt")

    # DecompX masking (reuse if exists)
    if not os.path.exists(masked_file):
        print(f"Running DecompX masking (threshold={thresh:.2f}) to create masked_inputs.txt ...")
        decoded_mask_inputs = _decompx_mask_texts(
            inputs, threshold=thresh, batch_size=mask_batch_size
        )
        with open(masked_file, "w") as f:
            for d in decoded_mask_inputs:
                f.write(re.sub(r"\s+", " ", d).strip() + "\n")
    else:
        with open(masked_file, "r") as f:
            decoded_mask_inputs = [s.strip() for s in f.readlines()]
        print("Reusing existing masked_inputs.txt")

    assert len(decoded_mask_inputs) == len(inputs), "Masked vs inputs mismatch"

    if echo:
        print("\n[echo] Example masked inputs (first up to 3):")
        for i, m in enumerate(decoded_mask_inputs[:3]):
            print(f"  masked[{i}]: {m}")

    # Run folder for this LLM configuration
    run_folder = _build_llm_run_folder_name(
        llm_temperature, llm_sample, llm_top_p, max_new_tokens, num_candidates
    )
    final_abs = os.path.join(cur_abs, run_folder)
    gen_txt = os.path.join(final_abs, "gen.txt")
    orig_txt = os.path.join(final_abs, "orig.txt")

    if os.path.exists(gen_txt) and not overwrite_gen:
        print("Generation already exists at:", gen_txt, "— skipping generation.")
        _ensure_dir(final_abs)
        with open(gen_txt, "r") as f:
            best_generations = [s.strip() for s in f.readlines()]
        return inputs, decoded_mask_inputs, best_generations, final_abs

    _ensure_dir(final_abs)

    # LLM infilling: generate candidates
    print(f"LLM infilling: generating {num_candidates} candidates per input (sampling={llm_sample})")
    all_candidates = llm_infill_candidates(
        toxic_sentences=inputs,
        masked_sentences=decoded_mask_inputs,
        num_candidates=num_candidates,
        temperature=llm_temperature,
        top_p=llm_top_p,
        max_new_tokens=max_new_tokens,
        sample=llm_sample,
    )

    # DecompX-based reranking
    print(f"DecompX reranking of LLM candidates (threshold={decompx_threshold:.2f}) ...")
    best_idx, details = rerank_candidates_decompx(
        sources=inputs,
        candidates=all_candidates,
        threshold=decompx_threshold,
        batch_size_mask=rerank_batch_size,
    )
    best_generations = [
        all_candidates[i][best_idx[i]] for i in range(len(inputs))
    ]

    if echo:
        print("\n[echo] Example detoxified outputs (first up to 3):")
        for i, g in enumerate(best_generations[:3]):
            print(f"  detox[{i}]: {g}")

    # Save orig + chosen gen
    with open(orig_txt, "w") as f:
        for l in inputs:
            f.write(re.sub(r"\s+", " ", l).strip() + "\n")
    with open(gen_txt, "w") as f:
        for l in best_generations:
            f.write(re.sub(r"\s+", " ", l).strip() + "\n")

    print("Saved:", orig_txt)
    print("Saved:", gen_txt)

    return inputs, decoded_mask_inputs, best_generations, final_abs
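
The selection step that turns `all_candidates` into `best_generations` amounts to an argmin over summed token importance, as in the scoring formula at the top of this notebook. The following is a minimal sketch with made-up scores; `pick_lowest_importance` is a hypothetical helper and does not reproduce the internals of `rerank_candidates_decompx`, which computes the importance scores with DecompX.

```python
from typing import Sequence


def pick_lowest_importance(token_scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the candidate whose summed token importance is smallest."""
    totals = [sum(scores) for scores in token_scores]
    # min() returns the first minimal index, so ties go to the earlier candidate.
    return min(range(len(totals)), key=totals.__getitem__)


# Toy importance scores for three candidates of one input (illustrative numbers only).
toy_scores = [
    [0.10, 0.40, 0.05],  # total 0.55
    [0.02, 0.03, 0.01],  # total 0.06
    [0.20, 0.00, 0.10],  # total 0.30
]
best = pick_lowest_importance(toy_scores)
print(best)
```

Because the score is a plain sum rather than a mean, shorter candidates can be favored when token importances are mostly positive; this is a property of the formula, not something the sketch adds.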

In [13]:
#@title `detoxify()` — DecompX masking + LLM infilling + DecompX reranking + optional eval

def detoxify(
    data_type: str = "paradetox",
    output_folder: str = "colab_run_decompx_mask_llm_decompx",
    thresholds = (0.20,),
    echo: bool = False,
    num_examples: int = 100,        # None = full dataset
    overwrite_gen: bool = False,
    run_eval: bool = False,
    overwrite_eval: bool = False,
    skip_ref_eval: bool = False,
    # DecompX params
    mask_batch_size: int = 10,
    rerank_batch_size: int = 16,
    decompx_rerank_threshold: float = None,  # None = reuse each masking threshold t for reranking
    # LLM infilling params
    llm_sample: bool = True,
    llm_temperature: float = 0.7,
    llm_top_p: float = 0.95,
    max_new_tokens: int = 64,
    num_candidates: int = 3,
):
    """
    Run XDetox with:
      - DecompX masking (token-level) on raw toxic inputs,
      - LLM infilling (Mistral-7B-Instruct) to fill <mask> tokens,
      - DecompX-based reranking over LLM candidates,
      - optional evaluation via evaluation.evaluate_all.

    Outputs are stored under:
      data/model_outputs/{output_folder}/{data_type}/DecompX_LLM{thresh}/{run_folder}/
    """
    assert data_type in data_configs, f"Unknown data_type: {data_type}"
    cfg = data_configs[data_type].copy()

    if num_candidates < 1:
        raise ValueError("num_candidates must be >= 1")

    base_out_rel = os.path.join("data", "model_outputs", output_folder)
    base_out_abs = os.path.join(REPO, base_out_rel)
    _ensure_dir(base_out_abs)

    # subset path (file)
    original_data_path = cfg["data_path"]
    subset_dir = os.path.join(REPO, "datasets", "_subsets", data_type)
    _ensure_dir(subset_dir)
    subset_path = _subset_for_data_type(
        data_type, original_data_path, num_examples, subset_dir
    )

    # Load inputs once for echo and reuse
    args_data = _argparse.Namespace(data_type=data_type, data_path=subset_path)
    inputs = rx.get_data(args_data)
    num_inputs = len(inputs)

    if echo:
        print("=" * 80)
        print(f"[echo] Dataset: {data_type}")
        print(f"[echo] Subset path: {subset_path}")
        print(f"[echo] Output base: {base_out_abs}")
        print(f"[echo] Number of examples to detoxify: {num_inputs}")
        print(f"[echo] Thresholds (DecompX masking): {', '.join(f'{t:.2f}' for t in thresholds)}")
        print(f"[echo] LLM: temperature={llm_temperature}, top_p={llm_top_p}, "
              f"sample={llm_sample}, max_new_tokens={max_new_tokens}")
        print(f"[echo] num_candidates per input: {num_candidates}")
        print("\n[echo] Example inputs (first up to 3):")
        for i, s in enumerate(inputs[:3]):
            print(f"  input[{i}]: {s}")
        print("=" * 80)

    # Run for each DecompX masking threshold
    last_run_dir = None
    for t in thresholds:
        print("=" * 60)
        print(f"DecompX masking threshold = {t:.2f}")
        effective_rerank_t = decompx_rerank_threshold if decompx_rerank_threshold is not None else t

        inputs, masked_inputs, best_generations, run_dir = \
            _run_decompx_masking_llm_infill_and_decompx_reranking_for_threshold(
                data_type=data_type,
                subset_path=subset_path,
                thresh=t,
                base_out_rel=base_out_rel,
                mask_batch_size=mask_batch_size,
                llm_sample=llm_sample,
                llm_temperature=llm_temperature,
                llm_top_p=llm_top_p,
                max_new_tokens=max_new_tokens,
                num_candidates=num_candidates,
                decompx_threshold=effective_rerank_t,
                overwrite_gen=overwrite_gen,
                inputs=inputs,
                rerank_batch_size=rerank_batch_size,
                echo=echo,
            )
        last_run_dir = run_dir

        if run_eval:
            mask_dir = f"DecompX_LLM{abs(t):g}" if t != 0 else "DecompX_LLM0.0"
            base_path = os.path.join(base_out_abs, data_type, mask_dir)
            _eval_with_toxicity(
                base_path,
                overwrite_eval=overwrite_eval,
                skip_ref=skip_ref_eval,
                tox_threshold=0.5,
                tox_batch_size=32,
            )

            if echo:
                run_folder = os.path.basename(run_dir)
                stats_path = os.path.join(base_path, run_folder, "gen_stats.txt")
                if os.path.exists(stats_path):
                    stats = _read_stats_file(stats_path)
                    print("\n[echo] Evaluation metrics for this run "
                          f"(t={t:.2f}):")
                    metric_keys = [
                        ("bertscore",        "BERTScore"),
                        ("meaningbert",      "MeaningBERT"),
                        ("bleu4",            "BLEU-4"),
                        ("perplexity gen",   "Perplexity (gen)"),
                        ("perplexity orig",  "Perplexity (orig)"),
                        ("toxicity gen",     "Toxicity (gen)"),
                        ("toxicity orig",    "Toxicity (orig)"),
                    ]
                    for key, label in metric_keys:
                        val = stats.get(key)
                        # Skip metrics that are missing or NaN.
                        if val is None or (isinstance(val, float) and math.isnan(val)):
                            continue
                        print(f"  {label}: {val:.4f}")
                else:
                    print(f"[echo] gen_stats.txt not found at {stats_path}")

    # Summarize across thresholds
    if run_eval:
        _aggregate_eval_csv(output_folder, data_type, base_out_abs)
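
To make the output layout concrete, the sketch below rebuilds the two folder names from the same expressions used in this notebook. `mask_dir_name` and `run_folder_name` are hypothetical helper names introduced only for illustration, and `my_run` is a placeholder output folder.

```python
import os


def mask_dir_name(thresh: float) -> str:
    """Folder name for one masking threshold (same expression as in the pipeline)."""
    return f"DecompX_LLM{abs(thresh):g}" if thresh != 0 else "DecompX_LLM0.0"


def run_folder_name(temperature, sample, top_p, max_new_tokens, num_candidates) -> str:
    """Folder name encoding the LLM generation settings."""
    return (
        f"llmtemp{temperature}_sample{'T' if sample else 'F'}"
        f"_topp{top_p}_maxnew{max_new_tokens}_ncand{num_candidates}"
    )


mask_dir = mask_dir_name(0.20)
run_folder = run_folder_name(0.7, True, 0.95, 64, 10)
# "my_run" is a placeholder for output_folder.
print(os.path.join("data", "model_outputs", "my_run", "paradetox", mask_dir, run_folder))
```

Since the threshold passes through `abs()`, a negative threshold maps to the same folder as its positive counterpart, so the two would share cached `masked_inputs.txt` and `gen.txt` files.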


In [14]:
#@title Example run — paradetox, DecompX masking + LLM infilling + DecompX reranking

# Example (small subset; adjust as needed):
# detoxify(
#     data_type="paradetox",
#     output_folder="colab_run_decompx_mask_llm_decompx_demo_50_examples",
#     thresholds=(0.20,),
#     echo=True,
#     num_examples=50,
#     overwrite_gen=True,
#     run_eval=True,
#     overwrite_eval=True,
#     skip_ref_eval=False,
#     mask_batch_size=8,
#     rerank_batch_size=16,
#     llm_sample=True,
#     llm_temperature=0.7,
#     llm_top_p=0.95,
#     max_new_tokens=64,
#     num_candidates=10,
# )

In [15]:
detoxify(
    data_type="paradetox",
    output_folder="XDetox_w_DecompX-Masking_LLM-Infilling_DecompX-Reranking_Pipeline",
    thresholds=(0.20,),
    echo=True,
    num_examples=1000,
    overwrite_gen=True,
    run_eval=True,
    overwrite_eval=True,
    skip_ref_eval=False,
    mask_batch_size=8,
    rerank_batch_size=16,
    llm_sample=True,
    llm_temperature=0.7,
    llm_top_p=0.95,
    max_new_tokens=64,
    num_candidates=10,
)


[echo] Dataset: paradetox
[echo] Subset path: /content/drive/MyDrive/w266 - Project/XDetox/datasets/_subsets/paradetox/test_toxic_parallel.txt
[echo] Output base: /content/drive/MyDrive/w266 - Project/XDetox/data/model_outputs/XDetox_w_DecompX-Masking_LLM-Infilling_DecompX-Reranking_Pipeline
[echo] Number of examples to detoxify: 671
[echo] Thresholds (DecompX masking): 0.20
[echo] LLM: temperature=0.7, top_p=0.95, sample=True, max_new_tokens=64
[echo] num_candidates per input: 10

[echo] Example inputs (first up to 3):
  input[0]: . or the loud ass one - thousand ton beast roaring towards you howling its horn .
  input[1]: mandated  and " right fucking now " would be good .
  input[2]: * neither * of my coworkers gave a shit when it came time to ditch mitch . ugh .
DecompX masking threshold = 0.20
#inputs at thresh=0.2: 671
Running DecompX masking (threshold=0.20) to create masked_inputs.txt ...


Error while fetching `HF_TOKEN` secret value from your vault: 'Requesting secret HF_TOKEN timed out. Secrets can only be fetched when running from the Colab UI.'.
You are not authenticated with the Hugging Face Hub in this notebook.
If the error persists, please let us know by opening an issue on GitHub (https://github.com/huggingface/huggingface_hub/issues/new).
Some weights of the model checkpoint at s-nlp/roberta_toxicity_classifier were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequence

DecompX masking:   0%|          | 0/84 [00:00<?, ?it/s]




[echo] Example masked inputs (first up to 3):
  masked[0]: . or the loud <mask> one - thousand ton beast roaring towards you howling its horn .
  masked[1]: mandated  and " right <mask> now " would be good .
  masked[2]: * neither * of my coworkers gave a <mask> when it came time to ditch mitch . ugh .
LLM infilling: generating 10 candidates per input (sampling=True)
Loading LLM infiller: mistralai/Mistral-7B-Instruct-v0.2 on cuda ...


Downloading shards:   0%|          | 0/3 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

LLM infiller loaded.


LLM infilling:   0%|          | 0/671 [00:00<?, ?it/s]

DecompX reranking of LLM candidates (threshold=0.20) ...




DecompX masking:   0%|          | 0/420 [00:00<?, ?it/s]




[echo] Example detoxified outputs (first up to 3):
  detox[0]: or the loud person one - thousand ton beast roaring towards you howling its horn.
  detox[1]: mandated and " right now " would be good .
  detox[2]: neither of my coworkers showed interest when it came time to ditch mitch . ugh .
Saved: /content/drive/MyDrive/w266 - Project/XDetox/data/model_outputs/XDetox_w_DecompX-Masking_LLM-Infilling_DecompX-Reranking_Pipeline/paradetox/DecompX_LLM0.2/llmtemp0.7_sampleT_topp0.95_maxnew64_ncand10/orig.txt
Saved: /content/drive/MyDrive/w266 - Project/XDetox/data/model_outputs/XDetox_w_DecompX-Masking_LLM-Infilling_DecompX-Reranking_Pipeline/paradetox/DecompX_LLM0.2/llmtemp0.7_sampleT_topp0.95_maxnew64_ncand10/gen.txt
Eval: /usr/bin/python3 -m evaluation.evaluate_all --orig_path /content/drive/MyDrive/w266 - Project/XDetox/data/model_outputs/XDetox_w_DecompX-Masking_LLM-Infilling_DecompX-Reranking_Pipeline/paradetox/DecompX_LLM0.2/llmtemp0.7_sampleT_topp0.95_maxnew64_ncand10/orig.txt --ge