<a href="https://colab.research.google.com/github/jamelof23/AG-PTR/blob/main/AG_PTR1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Experiment 1**

In [None]:
# Cell 1: Install dependencies
!pip -q install opacus==1.4.0 tqdm pandas matplotlib



In [None]:
# Cell 2: Imports + reproducibility
import os, math, random
from copy import deepcopy
import numpy as np
import pandas as pd
from tqdm import tqdm

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset
import torchvision
import torchvision.transforms as T

import matplotlib.pyplot as plt

from opacus.accountants import RDPAccountant

def seed_all(seed: int):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

device = "cuda" if torch.cuda.is_available() else "cpu"
print("device:", device)

In [None]:
# Cell 3: Download Fashion-MNIST
# torchvision downloads it automatically on first use.
transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.5,), (0.5,))
])

train_ds = torchvision.datasets.FashionMNIST(
    root="./data", train=True, download=True, transform=transform
)
test_ds = torchvision.datasets.FashionMNIST(
    root="./data", train=False, download=True, transform=transform
)

test_loader = DataLoader(test_ds, batch_size=512, shuffle=False, num_workers=2, pin_memory=True)
print(len(train_ds), len(test_ds))

In [None]:
# Cell 4: Make 6000 clients × 10 samples/client (FedVRDP-style)
def make_cross_device_clients(train_dataset, num_clients=6000, samples_per_client=10, seed=0):
    assert num_clients * samples_per_client <= len(train_dataset)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(train_dataset)).tolist()
    clients = []
    for c in range(num_clients):
        clients.append(idx[c*samples_per_client:(c+1)*samples_per_client])
    return clients

clients = make_cross_device_clients(train_ds, num_clients=6000, samples_per_client=10, seed=0)
print("num clients:", len(clients), "samples/client:", len(clients[0]))
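
The partition above shuffles the sample indices once and cuts them into equal slices, so no sample should appear on two clients. A minimal stdlib-only sketch of that idea on a toy index range follows; `make_shards` is an illustrative name, and it uses `random` rather than NumPy, so its shuffle order will not match the notebook's.

```python
import random

def make_shards(n_samples, num_clients, samples_per_client, seed=0):
    """Shuffle sample indices once and cut them into equal, non-overlapping shards."""
    assert num_clients * samples_per_client <= n_samples
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[c * samples_per_client:(c + 1) * samples_per_client]
            for c in range(num_clients)]

shards = make_shards(n_samples=60, num_clients=6, samples_per_client=10)
flat = [i for shard in shards for i in shard]
print(len(shards), len(flat), len(set(flat)))  # 6 60 60: no index appears twice
```

The same disjointness check (`len(flat) == len(set(flat))`) can be run on the real `clients` list as a sanity test.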

In [None]:
# Cell 5: Optional "public anchor set" (2 samples/class)
# This mirrors the idea that "tiny auxiliary data is easy to obtain", used by Xiang et al.
# To skip this, set public_idx = [] and don't filter.
def extract_public_per_class(dataset, per_class=2, seed=0):
    rng = np.random.default_rng(seed)
    targets = np.array(dataset.targets)
    public_idx = []
    for k in range(10):
        cls_idx = np.where(targets == k)[0]
        rng.shuffle(cls_idx)
        public_idx.extend(cls_idx[:per_class].tolist())
    public_idx = sorted(public_idx)
    return public_idx

public_idx = extract_public_per_class(train_ds, per_class=2, seed=0)
public_loader = DataLoader(Subset(train_ds, public_idx), batch_size=20, shuffle=False)

# Remove public samples from clients so they are not used privately
public_set = set(public_idx)
clients_wo_public = []
for cid in range(len(clients)):
    filtered = [i for i in clients[cid] if i not in public_set]
    clients_wo_public.append(filtered)

clients = clients_wo_public
print("public samples:", len(public_idx), "| example client size after removal:", len(clients[0]))

In [None]:
# Cell 6: Model (CNN following the FedVRDP description)
# FedVRDP describes a CNN with 2 conv layers, max pooling, ReLU, and FC(512).
class FMNIST_CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, padding=0)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0)
        self.fc1 = nn.Linear(64*4*4, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))       # 28->24
        x = F.max_pool2d(x, 2)          # 24->12
        x = F.relu(self.conv2(x))       # 12->8
        x = F.max_pool2d(x, 2)          # 8->4
        x = x.view(x.size(0), -1)       # 64*4*4
        x = F.relu(self.fc1(x))
        return self.fc2(x)

@torch.no_grad()
def evaluate(model, loader):
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        logits = model(x)
        pred = logits.argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
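
The `64*4*4` input size of `fc1` follows from the shape comments in `forward`. A small stdlib-only sketch of that arithmetic, assuming stride-1 convolutions without padding and 2x2 pooling as in the class above (the helpers `conv_out` and `pool_out` are illustrative and not part of the notebook):

```python
def conv_out(size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size: int, window: int = 2) -> int:
    """Spatial output size of non-overlapping max pooling."""
    return size // window

size = 28                      # Fashion-MNIST images are 28x28
size = conv_out(size, 5)       # conv1: 28 -> 24
size = pool_out(size)          # pool:  24 -> 12
size = conv_out(size, 5)       # conv2: 12 -> 8
size = pool_out(size)          # pool:  8 -> 4
flat_features = 64 * size * size
print(size, flat_features)     # 4 1024
```

If the kernel size or padding changes, the `nn.Linear` input dimension has to change with it.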

In [None]:
# Cell 7: Tensor-list utilities (fast, no giant flatten)
def model_param_list(model):
    return [p for p in model.parameters() if p.requires_grad]

@torch.no_grad()
def zero_like_params(model):
    return [torch.zeros_like(p) for p in model_param_list(model)]

@torch.no_grad()
def copy_params_(dst_model, src_model):
    for dp, sp in zip(model_param_list(dst_model), model_param_list(src_model)):
        dp.data.copy_(sp.data)

@torch.no_grad()
def add_update_(model, update_list, scale=1.0):
    for p, u in zip(model_param_list(model), update_list):
        p.data.add_(u, alpha=scale)

@torch.no_grad()
def l2_norm_list(tlist):
    s = None
    for t in tlist:
        v = (t*t).sum()
        s = v if s is None else s + v
    return torch.sqrt(s + 1e-12)

@torch.no_grad()
def dot_list(a_list, b_list):
    s = None
    for a, b in zip(a_list, b_list):
        v = (a*b).sum()
        s = v if s is None else s + v
    return s

@torch.no_grad()
def add_scaled_list_(dst, src, alpha):
    for d, s in zip(dst, src):
        d.add_(s, alpha=alpha)

@torch.no_grad()
def sub_list(a, b):
    return [x - y for x, y in zip(a, b)]
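
The utilities above accumulate norms and inner products layer by layer instead of concatenating all parameters into one vector. A plain-Python sketch of why that gives the same result, with nested lists standing in for per-layer tensors (`l2_norm_chunks` and `dot_chunks` are illustrative names):

```python
import math

def l2_norm_chunks(chunks):
    """L2 norm over a list of chunks, accumulated chunk by chunk."""
    return math.sqrt(sum(sum(v * v for v in chunk) for chunk in chunks))

def dot_chunks(a_chunks, b_chunks):
    """Inner product over two lists of equally shaped chunks."""
    return sum(sum(x * y for x, y in zip(a, b)) for a, b in zip(a_chunks, b_chunks))

chunks = [[3.0], [0.0, 4.0]]                 # stands in for per-layer tensors
flat = [v for chunk in chunks for v in chunk]
print(l2_norm_chunks(chunks), math.sqrt(sum(v * v for v in flat)))  # both 5.0
print(dot_chunks(chunks, chunks))            # 25.0, the squared norm
```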

In [None]:
# Cell 8: Local client training (returns update = local_model - global_model)
def client_update(global_model, client_indices, lr, momentum, local_epochs, batch_size=10):
    local_model = deepcopy(global_model)
    local_model.train()

    loader = DataLoader(Subset(train_ds, client_indices),
                        batch_size=batch_size, shuffle=True, drop_last=False)

    opt = torch.optim.SGD(local_model.parameters(), lr=lr, momentum=momentum)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(local_epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(local_model(x), y)
            loss.backward()
            opt.step()

    # delta = local - global
    delta = []
    for lp, gp in zip(model_param_list(local_model), model_param_list(global_model)):
        delta.append((lp.data - gp.data).detach())
    return delta
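
How much local work one call to `client_update` does depends on the client size, the batch size, and the number of local epochs. A rough stdlib-only sketch of that arithmetic, together with the exponential learning-rate decay used by the training loops later in the notebook (`local_steps` and `lr_at_round` are illustrative helpers; their defaults copy the notebook's `lr0=0.125` and `lr_decay=0.99`):

```python
import math

def local_steps(n_samples, batch_size, local_epochs):
    """Number of SGD steps one client takes per round (drop_last=False)."""
    return math.ceil(n_samples / batch_size) * local_epochs

def lr_at_round(t, lr0=0.125, lr_decay=0.99):
    """Exponentially decayed learning rate used in round t (0-indexed)."""
    return lr0 * lr_decay ** t

print(local_steps(10, 10, 10))        # 10: one full-batch step per epoch
print(local_steps(9, 10, 10))         # 10: a client that lost a sample still takes one step per epoch
print(round(lr_at_round(179), 4))     # learning rate in the last of 180 rounds
```

With at most 10 samples per client and `batch_size=10`, every local epoch is a single full-batch step, so `shuffle=True` does not change which samples enter each step.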

In [None]:
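# Cell 9: DP accounting helpers (compute noise multiplier for target ε)
# We use the RDP accountant: sample rate q = clients per round / total clients, steps = rounds.
# Note: the accountant's analysis assumes Poisson sampling, while the training loops
# sample a fixed number of clients per round, so the reported ε is an approximation.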
def epsilon_from_sigma_dp_sgd(sigma, q, steps, delta):
    acc = RDPAccountant()
    for _ in range(steps):
        acc.step(noise_multiplier=sigma, sample_rate=q)
    return acc.get_epsilon(delta)

def epsilon_from_sigma_two_mech(sigma_sel, sigma_rel, q, steps, delta):
    acc = RDPAccountant()
    for _ in range(steps):
        acc.step(noise_multiplier=sigma_sel, sample_rate=q)  # selection
        acc.step(noise_multiplier=sigma_rel, sample_rate=q)  # release
    return acc.get_epsilon(delta)

def find_sigma_for_target_eps_single(target_eps, q, steps, delta, lo=0.1, hi=50.0, iters=40):
    for _ in range(iters):
        mid = (lo + hi) / 2
        eps = epsilon_from_sigma_dp_sgd(mid, q, steps, delta)
        if eps > target_eps:
            lo = mid
        else:
            hi = mid
    return hi

def find_sigma_for_target_eps_two(target_eps, q, steps, delta, sel_factor=4.0, lo=0.1, hi=50.0, iters=40):
    for _ in range(iters):
        mid = (lo + hi) / 2
        eps = epsilon_from_sigma_two_mech(sel_factor*mid, mid, q, steps, delta)
        if eps > target_eps:
            lo = mid
        else:
            hi = mid
    return hi
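
Both `find_sigma_*` helpers rely on epsilon shrinking as sigma grows and bisect on that. The sketch below shows the same search on a toy function and adds a bracket check that the helpers above do not perform. `toy_epsilon` is a hypothetical stand-in and not the Opacus accountant; `bisect_sigma` is an illustrative name.

```python
def toy_epsilon(sigma: float) -> float:
    """Hypothetical stand-in for the accountant: strictly decreasing in sigma."""
    return 10.0 / sigma

def bisect_sigma(eps_fn, target_eps, lo=0.1, hi=50.0, iters=40):
    """Smallest sigma in [lo, hi] (up to bisection error) with eps_fn(sigma) <= target_eps."""
    if eps_fn(hi) > target_eps:
        raise ValueError("upper bracket is too small for this target epsilon")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if eps_fn(mid) > target_eps:
            lo = mid      # too little noise: move the lower bound up
        else:
            hi = mid      # enough noise: try a smaller sigma
    return hi

sigma = bisect_sigma(toy_epsilon, target_eps=2.0)
print(round(sigma, 6))  # close to 5.0, since 10 / 5 = 2
```

Returning `hi` rather than `mid` means the reported sigma always satisfies the target. Without the bracket check, a target that is unreachable within `[lo, hi]` would silently return the upper bound.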

In [None]:
# Cell 10: DP-FedAvg training loop
def train_dp_fedavg(seed, eps_total, delta=1e-5,
                    num_clients=6000, clients_per_round=100,
                    rounds=180, local_epochs=10, batch_size=10,
                    lr0=0.125, lr_decay=0.99, momentum=0.5,
                    clip_C=1.0):

    seed_all(seed)
    model = FMNIST_CNN().to(device)
    q = clients_per_round / num_clients

    sigma = find_sigma_for_target_eps_single(eps_total, q, rounds, delta)
    achieved_eps = epsilon_from_sigma_dp_sgd(sigma, q, rounds, delta)
    print(f"[DP-FedAvg] target eps={eps_total} -> sigma={sigma:.4f}, achieved eps≈{achieved_eps:.3f}")

    for t in tqdm(range(rounds), desc=f"DP-FedAvg eps={eps_total}"):
        lr_t = lr0 * (lr_decay ** t)
        chosen = np.random.choice(num_clients, size=clients_per_round, replace=False)

        sum_update = zero_like_params(model)

        for cid in chosen:
            delta_i = client_update(model, clients[cid], lr=lr_t, momentum=momentum,
                                    local_epochs=local_epochs, batch_size=batch_size)

            norm = l2_norm_list(delta_i)
            scale = min(1.0, clip_C / (norm.item() + 1e-12))
            add_scaled_list_(sum_update, delta_i, scale)

        # add Gaussian noise to the SUM (std = sigma * C)
        for j in range(len(sum_update)):
            sum_update[j].add_(torch.randn_like(sum_update[j]) * (sigma * clip_C))

        # average and apply
        avg_update = [u / clients_per_round for u in sum_update]
        add_update_(model, avg_update, scale=1.0)

    acc = evaluate(model, test_loader)
    return acc, achieved_eps, sigma
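
The aggregation step in `train_dp_fedavg` clips each client update to L2 norm `clip_C`, adds Gaussian noise with standard deviation `sigma * clip_C` to the sum, and then averages. A minimal plain-Python sketch of that step on flat lists, where `clip_update` and `noisy_mean` are illustrative names and the two toy updates are made up for the example:

```python
import math
import random

def clip_update(update, clip_c):
    """Scale a flat update so its L2 norm is at most clip_c."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_c / (norm + 1e-12))
    return [v * scale for v in update]

def noisy_mean(updates, clip_c, sigma, rng):
    """Clip each update, add N(0, (sigma*clip_c)^2) noise to the sum, then average."""
    dim = len(updates[0])
    total = [0.0] * dim
    for u in updates:
        for j, v in enumerate(clip_update(u, clip_c)):
            total[j] += v
    noisy = [s + rng.gauss(0.0, sigma * clip_c) for s in total]
    return [s / len(updates) for s in noisy]

rng = random.Random(0)
updates = [[3.0, 4.0], [0.3, 0.4]]           # L2 norms 5.0 and 0.5
clipped = [clip_update(u, 1.0) for u in updates]
print(clipped)                                # first is rescaled to norm 1, second is unchanged
print(noisy_mean(updates, clip_c=1.0, sigma=0.0, rng=rng))  # sigma=0 gives the plain clipped mean
```

Because the noise is added before dividing, the per-coordinate noise in the averaged update has standard deviation `sigma * clip_C / clients_per_round`.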

In [None]:
# Cell 11: Minimal AG-PTR training loop (efficient R=2 anchors: ±previous_update)
# This is a "minimal but faithful" implementation for Experiment 1.
def train_ag_ptr(seed, eps_total, delta=1e-5,
                 num_clients=6000, clients_per_round=100,
                 rounds=180, local_epochs=10, batch_size=10,
                 lr0=0.125, lr_decay=0.99, momentum=0.5,
                 rho=0.3, tau=60, sel_factor=4.0):

    """
    rho: anchor-relative clipping radius for offsets
    tau: minimum population threshold for release (gate)
    sel_factor: sigma_sel = sel_factor * sigma_rel (make selection cost small)
    """

    seed_all(seed)
    model = FMNIST_CNN().to(device)
    q = clients_per_round / num_clients

    sigma_rel = find_sigma_for_target_eps_two(eps_total, q, rounds, delta, sel_factor=sel_factor)
    sigma_sel = sel_factor * sigma_rel
    achieved_eps = epsilon_from_sigma_two_mech(sigma_sel, sigma_rel, q, rounds, delta)
    print(f"[AG-PTR] target eps={eps_total} -> sigma_rel={sigma_rel:.4f}, sigma_sel={sigma_sel:.4f}, achieved eps≈{achieved_eps:.3f}")

    # anchor a_t (list of tensors), initialize as zero update
    anchor = zero_like_params(model)

    accept_count = 0

    for t in tqdm(range(rounds), desc=f"AG-PTR eps={eps_total}"):
        lr_t = lr0 * (lr_decay ** t)
        chosen = np.random.choice(num_clients, size=clients_per_round, replace=False)

        # ---- Phase 1: Propose/Test (R=2 anchors: +a, -a) ----
        # Train each sampled client once and cache its update together with the
        # sign of dot(delta, a), so Phase 2 reuses exactly what was voted on.
        deltas, signs = [], []

        for cid in chosen:
            delta_i = client_update(model, clients[cid], lr=lr_t, momentum=momentum,
                                    local_epochs=local_epochs, batch_size=batch_size)
            deltas.append(delta_i)
            signs.append(dot_list(delta_i, anchor).item() >= 0)

        count_pos = sum(signs)
        count_neg = clients_per_round - count_pos

        # DP noisy counts (std = sigma_sel * 1 for counts under add/remove)
        noisy_pos = count_pos + np.random.normal(0.0, sigma_sel)
        noisy_neg = count_neg + np.random.normal(0.0, sigma_sel)

        # select anchor
        select_pos = (noisy_pos >= noisy_neg)
        noisy_winner = noisy_pos if select_pos else noisy_neg

        if noisy_winner < tau:
            # reject: no update released
            continue

        accept_count += 1

        chosen_anchor = anchor if select_pos else [(-a) for a in anchor]

        # ---- Phase 2: Release anchored mean with clipped offsets + DP noise ----
        sum_offsets = zero_like_params(model)
        contributors = 0

        # Reuse the cached Phase 1 updates instead of retraining every client.
        for delta_i, assigned_pos in zip(deltas, signs):
            if assigned_pos != select_pos:
                continue

            contributors += 1

            offset = sub_list(delta_i, chosen_anchor)
            off_norm = l2_norm_list(offset).item()
            scale = min(1.0, rho / (off_norm + 1e-12))
            add_scaled_list_(sum_offsets, offset, scale)

        m_hat = max(tau, int(max(noisy_winner, 0)))

        # add DP Gaussian noise to SUM offsets (std = sigma_rel * rho)
        for j in range(len(sum_offsets)):
            sum_offsets[j].add_(torch.randn_like(sum_offsets[j]) * (sigma_rel * rho))

        mean_update = [ca + (so / m_hat) for ca, so in zip(chosen_anchor, sum_offsets)]

        # apply update
        add_update_(model, mean_update, scale=1.0)

        # update anchor to be the released update (DP-safe by construction)
        anchor = [u.detach() for u in mean_update]

    acc = evaluate(model, test_loader)
    accept_rate = accept_count / rounds
    return acc, achieved_eps, (sigma_sel, sigma_rel), accept_rate
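
Phase 1 of `train_ag_ptr` boils down to a noisy two-way vote followed by a threshold test against `tau`. The stdlib-only sketch below isolates that gate so its behaviour can be checked without training anything; `noisy_vote` is an illustrative name, and `random.gauss` replaces `np.random.normal`.

```python
import random

def noisy_vote(count_pos, n_clients, sigma_sel, tau, rng):
    """Return (accepted, select_pos, noisy_winner) for a two-anchor noisy vote."""
    count_neg = n_clients - count_pos
    noisy_pos = count_pos + rng.gauss(0.0, sigma_sel)
    noisy_neg = count_neg + rng.gauss(0.0, sigma_sel)
    select_pos = noisy_pos >= noisy_neg
    noisy_winner = noisy_pos if select_pos else noisy_neg
    return noisy_winner >= tau, select_pos, noisy_winner

rng = random.Random(0)
# With sigma_sel=0 the gate is deterministic: 70 of 100 clients clears tau=60.
print(noisy_vote(70, 100, sigma_sel=0.0, tau=60, rng=rng))
# A 55/45 split does not reach tau=60, so the round would be skipped.
print(noisy_vote(55, 100, sigma_sel=0.0, tau=60, rng=rng))
```

In the first round the anchor is all zeros, so every dot product is 0 and every client counts as positive; the gate can only reject that round through noise.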

In [None]:
# Cell 12: Run the Experiment 1 sweep and tabulate the results
EPS_GRID = [1, 2, 4]
SEEDS = [1, 2, 3]   # papers often do multiple seeds

# Common settings
DELTA = 1e-5
ROUNDS = 180
NUM_CLIENTS = 6000
CLIENTS_PER_ROUND = 100
LOCAL_EPOCHS = 10
BATCH_SIZE = 10
LR0 = 0.125
LR_DECAY = 0.99
MOMENTUM = 0.5

# DP hyperparams (you can tune later)
CLIP_C = 1.0      # DP-FedAvg clip
RHO = 0.3         # AG-PTR anchor-relative clip for offsets
TAU = 60          # threshold
SEL_FACTOR = 4.0  # sigma_sel = SEL_FACTOR * sigma_rel

rows = []

for eps in EPS_GRID:
    # DP-FedAvg
    accs = []
    for sd in SEEDS:
        acc, achieved_eps, sigma = train_dp_fedavg(
            sd, eps, delta=DELTA,
            num_clients=NUM_CLIENTS, clients_per_round=CLIENTS_PER_ROUND,
            rounds=ROUNDS, local_epochs=LOCAL_EPOCHS, batch_size=BATCH_SIZE,
            lr0=LR0, lr_decay=LR_DECAY, momentum=MOMENTUM,
            clip_C=CLIP_C
        )
        accs.append(acc)
    rows.append({"method":"DP-FedAvg", "eps_target":eps, "acc_mean":np.mean(accs), "acc_std":np.std(accs)})

    # AG-PTR
    accs = []
    acc_rates = []
    for sd in SEEDS:
        acc, achieved_eps, (sig_sel, sig_rel), ar = train_ag_ptr(
            sd, eps, delta=DELTA,
            num_clients=NUM_CLIENTS, clients_per_round=CLIENTS_PER_ROUND,
            rounds=ROUNDS, local_epochs=LOCAL_EPOCHS, batch_size=BATCH_SIZE,
            lr0=LR0, lr_decay=LR_DECAY, momentum=MOMENTUM,
            rho=RHO, tau=TAU, sel_factor=SEL_FACTOR
        )
        accs.append(acc)
        acc_rates.append(ar)
    rows.append({"method":"AG-PTR", "eps_target":eps, "acc_mean":np.mean(accs), "acc_std":np.std(accs),
                 "accept_rate_mean":np.mean(acc_rates)})

df = pd.DataFrame(rows)
print(df)

In [None]:
# Cell 13: Save CSV + plot accuracy vs epsilon
os.makedirs("results", exist_ok=True)
csv_path = "results/exp1_privacy_utility.csv"
df.to_csv(csv_path, index=False)
print("Saved:", csv_path)

plt.figure()
for method in df["method"].unique():
    sub = df[df["method"]==method].sort_values("eps_target")
    plt.errorbar(sub["eps_target"], sub["acc_mean"], yerr=sub["acc_std"], marker="o", capsize=4, label=method)

plt.xscale("log", base=2)
plt.xlabel(r"Target $\varepsilon_{\mathrm{total}}$ (log2 scale)")
plt.ylabel("Final test accuracy")
plt.title("Experiment 1: Privacy–utility (Fashion-MNIST, no attack)")
plt.grid(True, which="both", linestyle="--", alpha=0.4)
plt.legend()
fig_path = "results/exp1_privacy_utility.png"
plt.savefig(fig_path, dpi=200, bbox_inches="tight")
plt.show()

print("Saved:", fig_path)

In [None]:
# Cell 14: Xiang et al. (official repo), "no attack" run
!git clone https://github.com/zihangxiang/Practical-Differentially-Private-and-Byzantine-resilient-Federated.git
%cd Practical-Differentially-Private-and-Byzantine-resilient-Federated
!pip -q install -r requirements.txt

Run the following (example pattern taken from their README; adapt it for eps = 1, 2, 4 and no Byzantine clients):

bash code:

python main.py \
  --dataset fashion \
  --att_key nobyz \
  --epsilon 1 \
  --DP_mode centralDP \
  --seed 1 \
  --mal_worker_portion 0 \
  --anti_byz 1 \
  --non_iid 0 \
  --start_att 0.0 \
  --base_lr 0.2

Repeat for --epsilon 2 and --epsilon 4, with seeds 1, 2, and 3 for each.
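
One way to avoid retyping the command nine times is to generate the command lines. The sketch below only builds and prints them; it reuses exactly the flags shown above and does not verify them against the repository. `build_command` is an illustrative helper.

```python
import itertools
import shlex

EPSILONS = [1, 2, 4]
SEEDS = [1, 2, 3]

def build_command(epsilon, seed):
    """Assemble the argument list for one run, using only the flags shown above."""
    return [
        "python", "main.py",
        "--dataset", "fashion",
        "--att_key", "nobyz",
        "--epsilon", str(epsilon),
        "--DP_mode", "centralDP",
        "--seed", str(seed),
        "--mal_worker_portion", "0",
        "--anti_byz", "1",
        "--non_iid", "0",
        "--start_att", "0.0",
        "--base_lr", "0.2",
    ]

commands = [build_command(e, s) for e, s in itertools.product(EPSILONS, SEEDS)]
for cmd in commands:
    print(shlex.join(cmd))   # paste into a cell with a leading "!" or pass cmd to subprocess.run
```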

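How to merge results into your plot:
After each run, record the printed final test accuracy. For each epsilon, average over the seeds and add a row to results/exp1_privacy_utility.csv under the method name "Xiang-et-al". The plotting cell (Cell 13) draws from df, so reload the CSV into df (for example with pd.read_csv) before re-plotting.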
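
A minimal sketch of the merge step, assuming the CSV has the columns written by Cell 13 (`method`, `eps_target`, `acc_mean`, `acc_std`, `accept_rate_mean`). The helper `append_method_rows` is illustrative, and the accuracies in the demo are placeholders that exist only to exercise the function; they are not results.

```python
import csv
import os
import statistics
import tempfile

FIELDS = ["method", "eps_target", "acc_mean", "acc_std", "accept_rate_mean"]

def append_method_rows(csv_path, method, accs_by_eps):
    """Append one row per epsilon; accs_by_eps maps epsilon -> list of per-seed accuracies."""
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, restval="")
        if write_header:
            writer.writeheader()
        for eps, accs in sorted(accs_by_eps.items()):
            writer.writerow({
                "method": method,
                "eps_target": eps,
                "acc_mean": statistics.fmean(accs),
                "acc_std": statistics.pstdev(accs),   # population std, matching np.std
            })

# PLACEHOLDER numbers only, to exercise the function; replace with the accuracies you recorded.
placeholder = {1: [0.5, 0.5, 0.5], 2: [0.25, 0.5, 0.75]}
demo_path = os.path.join(tempfile.mkdtemp(), "exp1_privacy_utility.csv")
append_method_rows(demo_path, "Xiang-et-al", placeholder)
with open(demo_path, newline="") as f:
    rows = list(csv.DictReader(f))
print(rows)
```

Pointing `csv_path` at results/exp1_privacy_utility.csv appends to the existing file; the `accept_rate_mean` column is left empty for methods that have no acceptance gate.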

In [None]:
# Cell 15: DP-BREM/+ (official repo)
%cd /content
!git clone https://github.com/xiaolangu/DP-BREM.git
%cd DP-BREM

According to the repo, you run main.py and consult args.py for the required parameters.

To list the available arguments quickly in Colab:

bash code:

grep -n "add_argument" args.py | head -n 80

Then run with 0 Byzantine clients, Fashion-MNIST, and your target epsilon(s). If the repo takes a noise multiplier instead of epsilon, sweep the noise multiplier and report the achieved epsilon computed with their accounting; that is acceptable as long as the achieved epsilon is what you report.