<a href="https://colab.research.google.com/github/SuperMusey/FoundationOfPrivacy/blob/main/MIA_phase2_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## MIA FOR Finetuned LLM

In [1]:
# update the downloading command as my LFS runs out so cannot directly clone model.safetensors
%cd /content

!git clone https://github.com/2020pyfcrawl/18734-17731_Project_Phase2_3.git

%cd /content/18734-17731_Project_Phase2_3


/content
Cloning into '18734-17731_Project_Phase2_3'...
remote: Enumerating objects: 55, done.[K
remote: Counting objects: 100% (19/19), done.[K
remote: Compressing objects: 100% (14/14), done.[K
remote: Total 55 (delta 8), reused 15 (delta 5), pack-reused 36 (from 1)[K
Receiving objects: 100% (55/55), 55.70 MiB | 20.78 MiB/s, done.
Resolving deltas: 100% (11/11), done.
Updating files: 100% (36/36), done.
/content/18734-17731_Project_Phase2_3


### Variables and libraries

In [None]:
# install the required libraries if you have not done so (on you local machine or GPU server)
# you may not need to run this if you use colab as they are pre-installed, but you can always do it.
%pip install -r requirements.txt

In [None]:
import os, math, argparse
os.environ.setdefault("TRANSFORMERS_NO_TORCHVISION", "1")

import torch
import numpy as np
from datasets import load_from_disk
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForCausalLM
from sklearn.metrics import roc_auc_score, roc_curve, auc as _auc
import matplotlib.pyplot as plt
import json
from pathlib import Path
from datasets import Dataset

In [None]:
# global variable, check the current position to adjust the path
phase = "train" # or train / val / final
target_model_dir = f"./models/{phase}/gpt2_3_lora32_adamw_b8_lr2"
data_dir = f"./data/{phase}/"
batch_size = 50

# you may change block size if you like (max length for the tokenizer below)
block_size = 512

### Data pre-processing

In [None]:
def tokenize_dataset(ds, tok, max_len):
    ds = ds.filter(lambda ex: ex.get("text", None) and len(ex["text"].strip()) > 0)

    def _map(ex):
        out = tok(ex["text"], truncation=True, padding=True, max_length=max_len, return_attention_mask=True)
        out["labels"] = out["input_ids"].copy()
        return out

    ds = ds.map(_map, batched=True, remove_columns=ds.column_names)
    ds.set_format(type="torch", columns=["input_ids", "attention_mask", "labels"])
    return ds

def _read_json(path: Path):
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)

In [None]:
# for tests, you may only load a part of the data to save time while implementing,
# as running all 2000 samples on CPU may be slow, but not a problem here for GPU

# load test data
data_dir = Path(data_dir)
test_path = data_dir / "test.json"
test_items = _read_json(test_path)
ds_test = Dataset.from_dict({"text": test_items})

# tokenizer the test data
tokenizer = AutoTokenizer.from_pretrained(target_model_dir, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side="right"

ds_test = tokenize_dataset(ds_test, tokenizer, block_size)
dl_test = DataLoader(ds_test, batch_size=batch_size)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# you may load the model using the code:

Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

In [None]:
model = AutoModelForCausalLM.from_pretrained(target_model_dir, dtype="auto").to(device)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

### MIA

Implement your attack here! \
Hint: use shadow models.

In [None]:
# implement your attack here
@torch.no_grad()
def your_attack(

):
    pass

scores_test = your_attack(...)

### Result

In [None]:
# load the label here to compute the performance, you will only have full access to the label in train set to test your method
if phase == "train":
    label_path = data_dir / "test_label.json"
    label_items = _read_json(label_path)

    y_true = np.array(label_items)
    scores = np.array(scores_test)
    fpr, tpr, thr = roc_curve(y_true, scores)
    auc_val = roc_auc_score(y_true, scores)
    print(auc_val)

    print(max(tpr[fpr < 0.01])) # TPR @ 0.01FPR
    # WE ONLY CARE TPR @ 0.01FPR!!! SO INCREASE THIS AS MUCH AS POSSIBLE!!!
elif phase == "val" or phase == "final":
    pred_path = data_dir / "prediction.csv"
    with open(pred_path, "w", encoding="utf-8") as f:
        for s in scores_test:
            f.write(json.dumps(float(s), ensure_ascii=False) + "\n")
else:
    print("Wrong phase.")

### Packaging the submission

zip the prediction file and upload to the leaderboard.

In [None]:
import os
from pathlib import Path
import zipfile

with zipfile.ZipFile(f"project_submission.zip", 'w') as zipf:
    for phase in ["val", "final"]:
        data_dir = f"./data/{phase}/"
        data_dir = Path(data_dir)

        file = data_dir / "prediction.csv"
        if file.exists():
            arcname = os.path.join(phase, file.name)
            zipf.write(file, arcname=arcname)
        else:
            raise FileNotFoundError(f"`prediction.csv` not found in {data_dir}.")

### Visualization

A few visualizations that may help you develop your method and write reports.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

eps = 1e-12
fpr_ = np.clip(fpr, 1e-5, 1)
tpr_ = np.clip(tpr, 1e-5, 1)

fig, ax = plt.subplots(figsize=(8, 6))

ax.plot(fpr_, tpr_, lw=2, label=f'ROC (AUC = {auc_val:.4f})')
ax.plot([1e-5, 1], [1e-5, 1], lw=2, ls='--', label='Chance')

ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlim(1e-5, 1.0)
ax.set_ylim(1e-5, 1.0)

ticks = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1]
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.get_xaxis().set_minor_formatter(plt.NullFormatter())
ax.get_yaxis().set_minor_formatter(plt.NullFormatter())

ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('MIA ROC (log–log focus on small FPR/TPR)')
ax.legend(loc='lower right')
ax.grid(True, which='both', alpha=0.5)

plt.show()

In [None]:
# draw distribution
import matplotlib.pyplot as plt
import seaborn as sns

# y_true = np.array(label_items)
# scores = np.array(scores_test)

scores_mem = scores[y_true == 1]
scores_non = scores[y_true == 0]

plt.figure(figsize=(12, 6))
sns.histplot(scores_mem, bins=50, color='salmon', kde=True, label='member')
sns.histplot(scores_non, bins=50, color='skyblue', kde=True, label='non-member')

threshold_value = np.percentile(scores_non, q=99)
print(threshold_value)
plt.axvline(
    x=threshold_value,
    color='purple',
    linestyle='--',
    linewidth=2,
    label=f'0.01 FPR: {threshold_value:.2f}'
)


plt.title('Loss distribution', fontsize=16)
plt.xlabel('Loss', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.legend(fontsize=10)

plt.show()

In [None]:
# draw ROC curve and attach the figure in the report
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (AUC = {auc_val:.4f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Chance line')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title(f'MIA ROC Curve for Train Data')
plt.legend(loc="lower right")
plt.grid(alpha=0.5)
plt.show()

**ADDITIONALS**

**For Shadow Data**

In [None]:
#!/usr/bin/env python
# prepare_data.py
# curate training data for the model

import os, json, random
from pathlib import Path
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer

# dataset - seed
# 0 - 4042
# 1 - 3042
# 2 - 2042
# 3 - 1042
# 4 - 420

OUTDIR = "shadow_data/shadow_0"
SEED = 4042
TRAIN_PER_SRC = 10_000
MIN_TOKENS = 25
TEST_MEMBERS = 1000
TEST_NONMEMBERS = 1000

def set_seed_all(seed: int):
    import numpy as np
    random.seed(seed); np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
    except Exception:
        pass

def ensure_text_column(ds: Dataset, src: str) -> Dataset:
    if src == "wikitext103":
        assert "text" in ds.column_names
        return ds.remove_columns([c for c in ds.column_names if c != "text"])
    raise ValueError(src)

def basic_clean(ds: Dataset) -> Dataset:
    ds = ds.filter(lambda ex: isinstance(ex.get("text", None), str) and len(ex["text"].strip()) > 0)
    def _strip_map(ex): return {"text": " ".join(ex["text"].split())}
    return ds.map(_strip_map, batched=False)

def filter_by_tokens(ds: Dataset, tok, min_tokens: int) -> Dataset:
    def _len_map(batch):
        enc = tok(batch["text"], add_special_tokens=False)
        return {"_tok_len": [len(ids) for ids in enc["input_ids"]]}
    ds = ds.map(_len_map, batched=True)
    ds = ds.filter(lambda ex: ex["_tok_len"] >= min_tokens)
    return ds.remove_columns(["_tok_len"])

def sample_n(ds: Dataset, n: int, seed: int):
    n = min(n, len(ds))
    idx = list(range(len(ds)))
    random.Random(seed).shuffle(idx)
    take = sorted(idx[:n])
    return ds.select(take), set(take)

def dump_json(path: Path, obj):
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False, indent=2)

def main():
    set_seed_all(SEED)
    os.makedirs(OUTDIR, exist_ok=True)

    tok = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    # ---------- Load & filter (WikiText-103-raw-v1) ----------
    wiki_raw = load_dataset("Salesforce/wikitext", "wikitext-103-raw-v1")["train"]
    wiki = ensure_text_column(wiki_raw, "wikitext103")
    wiki = basic_clean(wiki)
    wiki = filter_by_tokens(wiki, tok, MIN_TOKENS)

    # train set
    wiki_train, wiki_train_idx = sample_n(wiki, TRAIN_PER_SRC, SEED + 1)

    out_dir = Path(OUTDIR)
    train_json = [{"text": ex["text"]} for ex in wiki_train]
    dump_json(out_dir / "train_finetune.json", train_json)

    train_texts = [ex["text"] for ex in wiki_train]
    train_set = set(train_texts)

    # Get member samples (from training set)
    member_samples = random.sample(train_texts, min(TEST_MEMBERS, len(train_texts)))

    # Get non-member samples (from wiki but NOT in training)
    nonmember_candidates = [ex["text"] for ex in wiki if ex["text"] not in train_set]
    nonmember_samples = random.sample(nonmember_candidates, min(TEST_NONMEMBERS, len(nonmember_candidates)))

    # Combine: members first, then non-members
    test_texts = member_samples + nonmember_samples
    test_labels = [1] * len(member_samples) + [0] * len(nonmember_samples)

    # Save test.json and test_label.json
    dump_json(out_dir / "test.json", test_texts)
    dump_json(out_dir / "test_label.json", test_labels)

    print(f"[OK] Train JSON ({len(train_texts)} samples) saved to {OUTDIR}")
    print(f"[OK] Test JSON ({len(member_samples)} members + {len(nonmember_samples)} non-members) saved to {OUTDIR}")


    print("[OK] JSON saved to", OUTDIR)

if __name__ == "__main__":
    main()


[OK] Train JSON (10000 samples) saved to shadow_data/shadow_0
[OK] Test JSON (1000 members + 1000 non-members) saved to shadow_data/shadow_0
[OK] JSON saved to shadow_data/shadow_0


**Canary Data Set Gen**

In [2]:
#!/usr/bin/env python
# prepare_canary_shadow_data.py
# Prepare shadow model data using canary-based partitioning

import os, json, random
from pathlib import Path
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer

def set_seed_all(seed: int):
    import numpy as np
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except Exception:
        pass

def basic_clean_text(text):
    """Clean a single text string"""
    if not isinstance(text, str) or len(text.strip()) == 0:
        return None
    return " ".join(text.split())

def main():
    set_seed_all(42)  # Global seed for reproducibility

    # ========== STEP 1: Load Target Test Set (Canaries) ==========
    print("="*60)
    print("STEP 1: Loading Target Test Set")
    print("="*60)

    target_test_path = Path("./data/train/test.json")
    if not target_test_path.exists():
        raise FileNotFoundError(f"Target test file not found: {target_test_path}")

    target_test_items = json.load(open(target_test_path))
    print(f"Loaded {len(target_test_items)} test samples from target model\n")

    # ========== STEP 2: Split into 5 Canary Sets ==========
    print("="*60)
    print("STEP 2: Creating Canary Partitions")
    print("="*60)

    # Shuffle with fixed seed
    random.seed(42)
    shuffled_indices = list(range(len(target_test_items)))
    random.shuffle(shuffled_indices)

    canary_sets = []
    canary_size = len(target_test_items) // 5

    for i in range(5):
        start_idx = i * canary_size
        end_idx = (i + 1) * canary_size if i < 4 else len(target_test_items)
        canary_indices = shuffled_indices[start_idx:end_idx]
        canary_texts = [target_test_items[idx] for idx in canary_indices]
        canary_sets.append(canary_texts)
        print(f"Canary set {i}: {len(canary_texts)} samples")

    # Convert to sets for fast lookup
    canary_sets_lookup = [set(texts) for texts in canary_sets]
    all_canaries_set = set(target_test_items)

    # ========== STEP 3: Load WikiText for Filling ==========
    print(f"\n{'='*60}")
    print("STEP 3: Loading WikiText-103")
    print("="*60)

    tok = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    wiki_raw = load_dataset("Salesforce/wikitext", "wikitext-103-raw-v1")["train"]
    print(f"✓ Loaded {len(wiki_raw)} WikiText samples")

    # Clean and filter WikiText, excluding ALL canaries
    wiki_texts = []
    for item in wiki_raw:
        if "text" in item:
            cleaned = basic_clean_text(item["text"])
            if cleaned and len(cleaned) > 50:  # Min length filter
                # Exclude all target test samples
                if cleaned not in all_canaries_set:
                    wiki_texts.append(cleaned)

    print(f"Filtered to {len(wiki_texts)} WikiText samples (excluding all canaries)\n")

    # ========== STEP 4: Create Shadow Datasets ==========
    print("="*60)
    print("STEP 4: Creating Shadow Model Datasets")
    print("="*60)

    for shadow_id in range(5):
        print(f"\n--- Shadow Model {shadow_id} ---")

        # This model's canaries (members)
        my_canaries = canary_sets[shadow_id]

        # Other models' canaries (non-members for testing)
        other_canaries = []
        for j in range(5):
            if j != shadow_id:
                other_canaries.extend(canary_sets[j])

        # Sample only 400 non-members to match members**
        random.seed(shadow_id + 2000)
        sampled_non_members = random.sample(other_canaries, len(my_canaries))

        # Sample WikiText to fill to 10,000 training samples
        fill_size = 10000 - len(my_canaries)
        random.seed(shadow_id + 1000)
        wiki_sample = random.sample(wiki_texts, min(fill_size, len(wiki_texts)))

        # Training data: my canaries + WikiText fill
        train_data = my_canaries + wiki_sample
        random.shuffle(train_data)

        # Test data: BALANCED 400 members + 400 non-members
        test_data = my_canaries + sampled_non_members
        test_labels = [1] * len(my_canaries) + [0] * len(sampled_non_members)

        # Shuffle test data with labels together
        combined = list(zip(test_data, test_labels))
        random.shuffle(combined)
        test_data, test_labels = zip(*combined)
        test_data = list(test_data)
        test_labels = list(test_labels)

        # Save to files
        out_dir = Path(f"./data/shadow_data_canary/shadow_{shadow_id}")
        out_dir.mkdir(parents=True, exist_ok=True)

        # Save train_finetune.json
        train_json = [{"text": t} for t in train_data]
        with open(out_dir / "train_finetune.json", "w", encoding="utf-8") as f:
            json.dump(train_json, f, ensure_ascii=False, indent=2)

        # Save test.json
        with open(out_dir / "test.json", "w", encoding="utf-8") as f:
            json.dump(test_data, f, ensure_ascii=False, indent=2)

        # Save test_label.json
        with open(out_dir / "test_label.json", "w", encoding="utf-8") as f:
            json.dump(test_labels, f, ensure_ascii=False, indent=2)

        print(f"  Training: {len(train_data):,} samples")
        print(f"    - Canaries (members): {len(my_canaries)}")
        print(f"    - WikiText fill: {len(wiki_sample)}")
        print(f"  Test: {len(test_data):,} samples")
        print(f"    - Members: {sum(test_labels)}")
        print(f"    - Non-members: {len(test_labels) - sum(test_labels)}")
        print(f"    Saved to {out_dir}")

    print(f"\n{'='*60}")
    print("ALL SHADOW DATASETS CREATED SUCCESSFULLY")

if __name__ == "__main__":
    main()

STEP 1: Loading Target Test Set
Loaded 2000 test samples from target model

STEP 2: Creating Canary Partitions
Canary set 0: 400 samples
Canary set 1: 400 samples
Canary set 2: 400 samples
Canary set 3: 400 samples
Canary set 4: 400 samples

STEP 3: Loading WikiText-103


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

wikitext-103-raw-v1/test-00000-of-00001.(…):   0%|          | 0.00/733k [00:00<?, ?B/s]

wikitext-103-raw-v1/train-00000-of-00002(…):   0%|          | 0.00/157M [00:00<?, ?B/s]

wikitext-103-raw-v1/train-00001-of-00002(…):   0%|          | 0.00/157M [00:00<?, ?B/s]

wikitext-103-raw-v1/validation-00000-of-(…):   0%|          | 0.00/657k [00:00<?, ?B/s]

Generating test split:   0%|          | 0/4358 [00:00<?, ? examples/s]

Generating train split:   0%|          | 0/1801350 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/3760 [00:00<?, ? examples/s]

✓ Loaded 1801350 WikiText samples
Filtered to 795044 WikiText samples (excluding all canaries)

STEP 4: Creating Shadow Model Datasets

--- Shadow Model 0 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
  Test: 2,000 samples
    - Members: 400
    - Non-members: 1600
    Saved to data/shadow_data_canary/shadow_0

--- Shadow Model 1 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
  Test: 2,000 samples
    - Members: 400
    - Non-members: 1600
    Saved to data/shadow_data_canary/shadow_1

--- Shadow Model 2 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
  Test: 2,000 samples
    - Members: 400
    - Non-members: 1600
    Saved to data/shadow_data_canary/shadow_2

--- Shadow Model 3 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
  Test: 2,000 samples
    - Members: 400
    - Non-members: 1600
    Saved to data/shadow_data_canary/sh

In [3]:
#!/usr/bin/env python
# prepare_canary_shadow_data.py
# Prepare shadow model data using canary-based partitioning (direct comparison method)

import os, json, random
from pathlib import Path
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer


def set_seed_all(seed: int):
    import numpy as np
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except Exception:
        pass


def basic_clean_text(text):
    """Clean a single text string"""
    if not isinstance(text, str) or len(text.strip()) == 0:
        return None
    return " ".join(text.split())


def main():
    set_seed_all(42)

    # ========== STEP 1: Load Target Test Set (Canaries) ==========
    print("=" * 60)
    print("STEP 1: Loading Target Test Set")
    print("=" * 60)

    target_test_path = Path("./data/train/test.json")
    if not target_test_path.exists():
        raise FileNotFoundError(f"Target test file not found: {target_test_path}")

    target_test_items = json.load(open(target_test_path))
    print(f"Loaded {len(target_test_items)} test samples from target model\n")

    # ========== STEP 2: Split into 5 Canary Sets ==========
    print("=" * 60)
    print("STEP 2: Creating Canary Partitions")
    print("=" * 60)

    random.seed(42)
    shuffled_indices = list(range(len(target_test_items)))
    random.shuffle(shuffled_indices)

    canary_sets = []
    canary_size = len(target_test_items) // 5

    for i in range(5):
        start_idx = i * canary_size
        end_idx = (i + 1) * canary_size if i < 4 else len(target_test_items)
        canary_indices = shuffled_indices[start_idx:end_idx]
        canary_texts = [target_test_items[idx] for idx in canary_indices]
        canary_sets.append(canary_texts)
        print(f"Canary set {i}: {len(canary_texts)} samples")

    all_canaries_set = set(target_test_items)

    # ========== STEP 3: Load WikiText for Filling ==========
    print(f"\n{'=' * 60}")
    print("STEP 3: Loading WikiText-103")
    print("=" * 60)

    tok = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    wiki_raw = load_dataset("Salesforce/wikitext", "wikitext-103-raw-v1")["train"]
    print(f"Loaded {len(wiki_raw)} WikiText samples")

    wiki_texts = []
    for item in wiki_raw:
        if "text" in item:
            cleaned = basic_clean_text(item["text"])
            if cleaned and len(cleaned) > 50:
                if cleaned not in all_canaries_set:
                    wiki_texts.append(cleaned)

    print(f"Filtered to {len(wiki_texts)} WikiText samples (excluding all canaries)\n")

    # ========== STEP 4: Create Shadow Datasets ==========
    print("=" * 60)
    print("STEP 4: Creating Shadow Model Datasets")
    print("=" * 60)

    for shadow_id in range(5):
        print(f"\n--- Shadow Model {shadow_id} ---")

        # This model's canaries (members)
        my_canaries = canary_sets[shadow_id]

        # Sample WikiText to fill to 10,000 training samples
        fill_size = 10000 - len(my_canaries)
        random.seed(shadow_id + 1000)
        wiki_sample = random.sample(wiki_texts, min(fill_size, len(wiki_texts)))

        # Training data: my canaries + WikiText fill
        train_data = my_canaries + wiki_sample
        random.shuffle(train_data)

        # Save to files
        out_dir = Path(f"./data/shadow_data_canary/shadow_{shadow_id}")
        out_dir.mkdir(parents=True, exist_ok=True)

        # Save train_finetune.json
        train_json = [{"text": t} for t in train_data]
        with open(out_dir / "train_finetune.json", "w", encoding="utf-8") as f:
            json.dump(train_json, f, ensure_ascii=False, indent=2)

        # Save canary mapping (which canaries this model saw)
        canary_mapping = {
            "canary_indices": list(range(shadow_id * 400, (shadow_id + 1) * 400)),
            "canary_texts": my_canaries
        }
        with open(out_dir / "canary_mapping.json", "w", encoding="utf-8") as f:
            json.dump(canary_mapping, f, ensure_ascii=False, indent=2)

        print(f"  Training: {len(train_data):,} samples")
        print(f"    - Canaries (members): {len(my_canaries)}")
        print(f"    - WikiText fill: {len(wiki_sample)}")
        print(f"    Saved to {out_dir}")

    print(f"\n{'=' * 60}")
    print("ALL SHADOW DATASETS CREATED SUCCESSFULLY")

if __name__ == "__main__":
    main()

STEP 1: Loading Target Test Set
Loaded 2000 test samples from target model

STEP 2: Creating Canary Partitions
Canary set 0: 400 samples
Canary set 1: 400 samples
Canary set 2: 400 samples
Canary set 3: 400 samples
Canary set 4: 400 samples

STEP 3: Loading WikiText-103


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

wikitext-103-raw-v1/test-00000-of-00001.(…):   0%|          | 0.00/733k [00:00<?, ?B/s]

wikitext-103-raw-v1/train-00000-of-00002(…):   0%|          | 0.00/157M [00:00<?, ?B/s]

wikitext-103-raw-v1/train-00001-of-00002(…):   0%|          | 0.00/157M [00:00<?, ?B/s]

wikitext-103-raw-v1/validation-00000-of-(…):   0%|          | 0.00/657k [00:00<?, ?B/s]

Generating test split:   0%|          | 0/4358 [00:00<?, ? examples/s]

Generating train split:   0%|          | 0/1801350 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/3760 [00:00<?, ? examples/s]

Loaded 1801350 WikiText samples
Filtered to 795044 WikiText samples (excluding all canaries)

STEP 4: Creating Shadow Model Datasets

--- Shadow Model 0 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
    Saved to data/shadow_data_canary/shadow_0

--- Shadow Model 1 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
    Saved to data/shadow_data_canary/shadow_1

--- Shadow Model 2 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
    Saved to data/shadow_data_canary/shadow_2

--- Shadow Model 3 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
    Saved to data/shadow_data_canary/shadow_3

--- Shadow Model 4 ---
  Training: 10,000 samples
    - Canaries (members): 400
    - WikiText fill: 9600
    Saved to data/shadow_data_canary/shadow_4

ALL SHADOW DATASETS CREATED SUCCESSFULLY


In [2]:
# Save to cloud
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [4]:
# Copy shadow data to Drive
!cp -r /content/18734-17731_Project_Phase2_3/data/shadow_data_canary /content/drive/MyDrive/FoundOfPriv/shadow_data_canary/

In [None]:
# To load back:
!cp -r /content/drive/MyDrive/FoundOfPriv/shadow_data/ ./data/shadow_data

**For training shadow models**

In [9]:
!python ft_llm/ft_llm_colab.py \
  --data_dir ./data/shadow_data_canary/shadow_3 \
  --train_file "train_finetune.json" \
  -m gpt2 \
  --block_size 512 \
  --epochs 3 \
  --batch_size 8 \
  --gradient_accumulation_steps 1 \
  --lr 2e-4 \
  --outdir ./models/shadow_model_canary/shadow_3/gpt2_shadow \
  --lora \
  --lora_r 32 \
  --lora_alpha 64 \
  --lora_dropout 0.05 \
  --merge_lora

!cp -r ./models/shadow_model_canary/shadow_3/ /content/drive/MyDrive/FoundOfPriv/models/shadow_3/

2025-11-07 21:33:24.130104: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1762551204.151399   26132 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1762551204.158567   26132 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1762551204.175748   26132 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1762551204.175778   26132 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1762551204.175795   26132 computation_placer.cc:177] computation placer alr

In [None]:
# To copy models from drive to folder
!cp -r /content/drive/MyDrive/FoundOfPriv/models/ ./models/shadow_models/

In [None]:
!cp -r ./models/shadow_models/shadow_1/ /content/drive/MyDrive/FoundOfPriv/models/shadow_1/

**Using Shadow Models**

In [None]:
# Verify Shadow Model Signal
@torch.no_grad()
def compute_losses(model, dataloader, device):
    """Compute per-sample losses"""
    model.eval()
    losses = []

    for batch in dataloader:
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)

        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

        # Get per-sample loss
        loss_fct = torch.nn.CrossEntropyLoss(reduction='none')
        shift_logits = outputs.logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()

        per_token_loss = loss_fct(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1)
        )
        per_sample_loss = per_token_loss.view(shift_labels.size(0), -1).mean(dim=1)
        losses.extend(per_sample_loss.cpu().numpy())

    return np.array(losses)

# Collect features from shadow models
shadow_features = []
shadow_labels = []

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 32

for shadow_id in range(5):
    print(f"\nProcessing shadow model {shadow_id}...")

    # Load shadow model
    model_dir = f"./models/shadow_models/shadow_{shadow_id}/gpt2_shadow"
    model = AutoModelForCausalLM.from_pretrained(model_dir).to(device)
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # Load shadow data
    data_dir = Path(f"./data/shadow_data/shadow_{shadow_id}")
    test_texts = json.load(open(data_dir / "test.json"))
    test_labels = json.load(open(data_dir / "test_label.json"))

    # Create dataset
    ds = Dataset.from_dict({"text": test_texts})
    ds = tokenize_dataset(ds, tokenizer, max_len=512)
    dl = DataLoader(ds, batch_size=batch_size)

    # Compute losses
    losses = compute_losses(model, dl, device)

    shadow_features.extend(losses)
    shadow_labels.extend(test_labels)

    del model
    torch.cuda.empty_cache()

shadow_features = np.array(shadow_features)
shadow_labels = np.array(shadow_labels)

print(f"\n{'='*60}")
print(f"Total samples collected: {len(shadow_features)}")
print(f"Members: {sum(shadow_labels)}, Non-members: {len(shadow_labels) - sum(shadow_labels)}")
print(f"{'='*60}")

Processing shadow model 0...


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.


Processing shadow model 1...


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Processing shadow model 2...


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Processing shadow model 3...


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Processing shadow model 4...


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Collected 10000 samples for training attack model


**Visualize the shadow model label losses**

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

member_losses = shadow_features[shadow_labels == 1]
nonmember_losses = shadow_features[shadow_labels == 0]

print(f"\n LOSS DISTRIBUTION ANALYSIS:")
print(f"Member losses     - Mean: {member_losses.mean():.4f}, Std: {member_losses.std():.4f}")
print(f"Non-member losses - Mean: {nonmember_losses.mean():.4f}, Std: {nonmember_losses.std():.4f}")
print(f"Difference: {abs(member_losses.mean() - nonmember_losses.mean()):.4f}")

# Visual check
plt.figure(figsize=(12, 6))
sns.histplot(member_losses, bins=50, color='salmon', kde=True, label='Member', alpha=0.6)
sns.histplot(nonmember_losses, bins=50, color='skyblue', kde=True, label='Non-member', alpha=0.6)
plt.xlabel('Loss')
plt.ylabel('Count')
plt.title('Shadow Model Loss Distribution - Member vs Non-member')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

# Statistical test
from scipy import stats
t_stat, p_value = stats.ttest_ind(member_losses, nonmember_losses)
print(f"\n T-test: t-statistic = {t_stat:.4f}, p-value = {p_value:.6f}")

if p_value < 0.05 and abs(member_losses.mean() - nonmember_losses.mean()) > 0.05:
    print(" GOOD SIGNAL! Distributions are significantly different.")
    print("   Expected AUC: Likely > 0.60")
else:
    print("WEAK SIGNAL! Members and non-members have similar losses.")
    print("   Expected AUC: ~0.50 (random guessing)")

# ========== STEP 3: Quick Baseline Test ==========
print(f"\n{'='*60}")
print("TESTING SIMPLE BASELINE (Negative Loss as Score)")
print(f"{'='*60}")

# Simple attack: negate loss (lower loss = member)
X = -shadow_features.reshape(-1, 1)
y = shadow_labels

from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, roc_curve

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Simple threshold-based attack (no ML model needed)
from sklearn.linear_model import LogisticRegression
baseline_model = LogisticRegression(random_state=42, max_iter=1000)
baseline_model.fit(X_train, y_train)

val_preds = baseline_model.predict_proba(X_val)[:, 1]
val_auc = roc_auc_score(y_val, val_preds)

print(f"\n Baseline Results:")
print(f"   Validation AUC: {val_auc:.4f}")

# Calculate TPR @ 1% FPR
fpr, tpr, thresholds = roc_curve(y_val, val_preds)
tpr_at_1fpr = max(tpr[fpr <= 0.01]) if any(fpr <= 0.01) else 0.0
print(f"   TPR @ 1% FPR: {tpr_at_1fpr:.4f}")

**Train Attack Model**

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Reshape features for sklearn (make it 2D if needed)
X = shadow_features.reshape(-1, 1)  # Loss is a single feature
y = shadow_labels

# Split for validation
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Train attack model
attack_model = LogisticRegression(random_state=42, max_iter=1000)
# Or use: attack_model = RandomForestClassifier(n_estimators=100, random_state=42)

attack_model.fit(X_train, y_train)

# Validate
from sklearn.metrics import roc_auc_score
val_preds = attack_model.predict_proba(X_val)[:, 1]
val_auc = roc_auc_score(y_val, val_preds)
print(f"Attack model validation AUC: {val_auc:.4f}")

Attack model validation AUC: 0.5087


In [None]:
# Check if data is correctly loaded for each shadow model
for shadow_id in range(5):
    data_dir = Path(f"./data/shadow_data/shadow_{shadow_id}")

    test_texts = json.load(open(data_dir / "test.json"))
    test_labels = json.load(open(data_dir / "test_label.json"))

    print(f"\n=== Shadow {shadow_id} ===")
    print(f"Total samples: {len(test_texts)}")
    print(f"Total labels: {len(test_labels)}")
    print(f"Members (label=1): {sum(test_labels)}")
    print(f"Non-members (label=0): {len(test_labels) - sum(test_labels)}")
    print(f"First 10 labels: {test_labels[:10]}")
    print(f"Last 10 labels: {test_labels[-10:]}")

    # Verify alignment
    if len(test_texts) != len(test_labels):
        print(f"❌ ERROR: Mismatch in shadow_{shadow_id}!")

    # Re-collect shadow features with detailed logging
shadow_features = []
shadow_labels = []

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 32

for shadow_id in range(5):
    print(f"\n=== Processing shadow model {shadow_id} ===")

    # Load shadow model
    model_dir = f"./models/shadow_models/shadow_{shadow_id}/gpt2_shadow"

    # Check if model exists
    if not Path(model_dir).exists():
        print(f"❌ Model directory not found: {model_dir}")
        continue

    model = AutoModelForCausalLM.from_pretrained(model_dir).to(device)
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # Load shadow data
    data_dir = Path(f"./data/shadow_data/shadow_{shadow_id}")
    test_texts = json.load(open(data_dir / "test.json"))
    test_labels = json.load(open(data_dir / "test_label.json"))

    print(f"Loaded {len(test_texts)} texts and {len(test_labels)} labels")

    # Create dataset
    ds = Dataset.from_dict({"text": test_texts})
    ds = tokenize_dataset(ds, tokenizer, max_len=256)
    dl = DataLoader(ds, batch_size=batch_size)

    print(f"Created dataloader with {len(dl)} batches")

    # Compute losses
    losses = compute_losses(model, dl, device)

    print(f"Computed {len(losses)} loss values")
    print(f"Loss stats - Mean: {losses.mean():.4f}, Std: {losses.std():.4f}, Min: {losses.min():.4f}, Max: {losses.max():.4f}")

    # Verify lengths match
    if len(losses) != len(test_labels):
        print(f"ERROR: Loss count {len(losses)} != Label count {len(test_labels)}")
        continue

    shadow_features.extend(losses)
    shadow_labels.extend(test_labels)

    del model
    torch.cuda.empty_cache()

shadow_features = np.array(shadow_features)
shadow_labels = np.array(shadow_labels)

print(f"\n=== TOTAL ===")
print(f"Total features: {len(shadow_features)}")
print(f"Total labels: {len(shadow_labels)}")
print(f"Total members: {sum(shadow_labels)}")
print(f"Total non-members: {len(shadow_labels) - sum(shadow_labels)}")

# Check both with and without negation
print("\n=== WITHOUT Negation ===")
X = shadow_features.reshape(-1, 1)  # No negation
y = shadow_labels

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
attack_model_no_neg = LogisticRegression(random_state=42, max_iter=1000)
attack_model_no_neg.fit(X_train, y_train)
val_preds = attack_model_no_neg.predict_proba(X_val)[:, 1]
val_auc = roc_auc_score(y_val, val_preds)
print(f"AUC without negation: {val_auc:.4f}")

print("\n=== WITH Negation ===")
X_neg = -shadow_features.reshape(-1, 1)  # WITH negation
X_train_neg, X_val_neg, y_train, y_val = train_test_split(X_neg, y, test_size=0.2, random_state=42)
attack_model_neg = LogisticRegression(random_state=42, max_iter=1000)
attack_model_neg.fit(X_train_neg, y_train)
val_preds_neg = attack_model_neg.predict_proba(X_val_neg)[:, 1]
val_auc_neg = roc_auc_score(y_val, val_preds_neg)
print(f"AUC with negation: {val_auc_neg:.4f}")

# One of these should be significantly better than 0.5
# If both are ~0.5, there's no signal
# Test loss computation on a single batch
print("\n=== Testing Loss Computation ===")

# Load one shadow model
model_dir = f"./models/shadow_models/shadow_0/gpt2_shadow"
model = AutoModelForCausalLM.from_pretrained(model_dir).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_dir)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Create a small test dataset
data_dir = Path(f"./data/shadow_data/shadow_0")
test_texts = json.load(open(data_dir / "test.json"))[:10]  # Just 10 samples

ds = Dataset.from_dict({"text": test_texts})
ds = tokenize_dataset(ds, tokenizer, max_len=256)
dl = DataLoader(ds, batch_size=2)

# Compute losses manually
model.eval()
for i, batch in enumerate(dl):
    print(f"\nBatch {i}:")
    input_ids = batch["input_ids"].to(device)
    attention_mask = batch["attention_mask"].to(device)
    labels = batch["labels"].to(device)

    print(f"  Input shape: {input_ids.shape}")
    print(f"  Batch size: {input_ids.shape[0]}")

    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        print(f"  Model loss (averaged): {outputs.loss.item():.4f}")

        # Per-sample loss
        logits = outputs.logits
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()

        loss_fct = torch.nn.CrossEntropyLoss(reduction='none')
        per_token_loss = loss_fct(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1)
        )
        per_token_loss = per_token_loss.view(labels.size(0), -1)
        per_sample_loss = per_token_loss.mean(dim=1)

        print(f"  Per-sample losses: {per_sample_loss.cpu().numpy()}")

del model
torch.cuda.empty_cache()


=== Shadow 0 ===
Total samples: 2000
Total labels: 2000
Members (label=1): 1000
Non-members (label=0): 1000
First 10 labels: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Last 10 labels: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

=== Shadow 1 ===
Total samples: 2000
Total labels: 2000
Members (label=1): 1000
Non-members (label=0): 1000
First 10 labels: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Last 10 labels: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

=== Shadow 2 ===
Total samples: 2000
Total labels: 2000
Members (label=1): 1000
Non-members (label=0): 1000
First 10 labels: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Last 10 labels: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

=== Shadow 3 ===
Total samples: 2000
Total labels: 2000
Members (label=1): 1000
Non-members (label=0): 1000
First 10 labels: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Last 10 labels: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

=== Shadow 4 ===
Total samples: 2000
Total labels: 2000
Members (label=1): 1000
Non-members (label=0): 1000
First 10 labels: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Last 10 labels: [0, 0, 0, 0

Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Created dataloader with 63 batches
Computed 2000 loss values
Loss stats - Mean: 5.6726, Std: 1.5104, Min: 2.1416, Max: 10.3367

=== Processing shadow model 1 ===
Loaded 2000 texts and 2000 labels


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Created dataloader with 63 batches
Computed 2000 loss values
Loss stats - Mean: 5.5962, Std: 1.4481, Min: 1.8306, Max: 9.4466

=== Processing shadow model 2 ===
Loaded 2000 texts and 2000 labels


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Created dataloader with 63 batches
Computed 2000 loss values
Loss stats - Mean: 5.9332, Std: 1.7241, Min: 1.7887, Max: 10.0979

=== Processing shadow model 3 ===
Loaded 2000 texts and 2000 labels


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Created dataloader with 63 batches
Computed 2000 loss values
Loss stats - Mean: 5.8881, Std: 1.6274, Min: 2.1148, Max: 10.0281

=== Processing shadow model 4 ===
Loaded 2000 texts and 2000 labels


Filter:   0%|          | 0/2000 [00:00<?, ? examples/s]

Map:   0%|          | 0/2000 [00:00<?, ? examples/s]

Created dataloader with 63 batches
Computed 2000 loss values
Loss stats - Mean: 5.6982, Std: 1.5593, Min: 2.1203, Max: 9.9180

=== TOTAL ===
Total features: 10000
Total labels: 10000
Total members: 5000
Total non-members: 5000

=== WITHOUT Negation ===
AUC without negation: 0.5087

=== WITH Negation ===
AUC with negation: 0.5087

=== Testing Loss Computation ===


Filter:   0%|          | 0/10 [00:00<?, ? examples/s]

Map:   0%|          | 0/10 [00:00<?, ? examples/s]


Batch 0:
  Input shape: torch.Size([2, 243])
  Batch size: 2
  Model loss (averaged): 6.3486
  Per-sample losses: [9.485578 3.211627]

Batch 1:
  Input shape: torch.Size([2, 243])
  Batch size: 2
  Model loss (averaged): 6.9557
  Per-sample losses: [6.0211473 7.890325 ]

Batch 2:
  Input shape: torch.Size([2, 243])
  Batch size: 2
  Model loss (averaged): 7.3463
  Per-sample losses: [6.9518785 7.740688 ]

Batch 3:
  Input shape: torch.Size([2, 243])
  Batch size: 2
  Model loss (averaged): 4.8796
  Per-sample losses: [4.2994947 5.459803 ]

Batch 4:
  Input shape: torch.Size([2, 243])
  Batch size: 2
  Model loss (averaged): 5.2855
  Per-sample losses: [5.199942 5.371129]
