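Cell 1: Environment Setup and Path Configuration

Description:
Add the project root directory to Python's module search path so that the project's internal modules can be imported in later cells.

In [1]:
# Cell 1: Environment setup and path configuration
import sys
import os

# Add the project root directory to sys.path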
project_path = "/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA"
if project_path not in sys.path:
    sys.path.insert(0, project_path)
print("Project path:", project_path)

Project path: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA
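The hard-coded path above ties the notebook to one machine. A minimal sketch of a more portable variant follows, assuming a hypothetical environment variable `VIP5_PROJECT_ROOT` (not part of the original project) that overrides a default location:

```python
import os
import sys
from pathlib import Path

def add_project_root(default_root, env_var="VIP5_PROJECT_ROOT"):
    """Insert the project root at the front of sys.path exactly once.

    `env_var` is a hypothetical override introduced for this sketch.
    """
    root = str(Path(os.environ.get(env_var, default_root)).expanduser())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `add_project_root("/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA")` would reproduce the behaviour of the cell when the variable is unset.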


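Cell 2: Importing Dependencies and Modules

Description:
Import all required third-party libraries and project-internal modules. Note that some of them (such as P5Tokenizer) are only used in later cells.

In [2]:
# Cell 2: Import dependencies and modules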
import collections
import random
import re
import os
import logging
import shutil
import time
from pathlib import Path
from packaging import version
from collections import defaultdict

from tqdm import tqdm
import numpy as np
import gzip
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
import torch.distributed as dist
import torch.backends.cudnn as cudnn

# Import project-internal modules
from src.param import parse_args
from src.utils import LossMeter, load_state_dict, set_global_logging_level
from src.dist_utils import reduce_dict
from transformers import T5Tokenizer
from src.tokenization import P5Tokenizer
from src.model import VIP5Tuning
from src.trainer_base import TrainerBase

# Decide whether to use native AMP or Apex
_use_native_amp = False
_use_apex = False
if version.parse(torch.__version__) < version.parse("1.6"):
    from transformers.file_utils import is_apex_available
    if is_apex_available():
        from apex import amp
        _use_apex = True
else:
    _use_native_amp = True
    from torch.cuda.amp import autocast

print("All dependencies imported")


  from .autonotebook import tqdm as notebook_tqdm


All dependencies imported
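The AMP selection rule in the cell above can be isolated into a small pure function, which makes it easy to check without installing different PyTorch versions. This is only a sketch: the function name `pick_amp_backend` is hypothetical, and it assumes Apex should be selected only when it is actually installed.

```python
import re

def pick_amp_backend(torch_version, apex_available):
    """Return 'native', 'apex', or None for a given PyTorch version string.

    Hypothetical helper: native AMP for PyTorch >= 1.6, otherwise Apex if present.
    """
    major, minor = (int(x) for x in re.match(r"(\d+)\.(\d+)", torch_version).groups())
    if (major, minor) >= (1, 6):
        return "native"
    return "apex" if apex_available else None
```

Comparing `(major, minor)` tuples avoids the pitfall of string comparison, where "1.10" would sort before "1.6".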


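Cell 3: Defining Helper Functions

Description:
Define common helper functions, such as loaders for pickle and JSON files and a line-by-line file reader, for use in later cells.

In [3]:
# Cell 3: Define helper functions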
import pickle
import json

def load_pickle(filename):
    with open(filename, "rb") as f:
        return pickle.load(f)

def save_pickle(data, filename):
    with open(filename, "wb") as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
        
def load_json(file_path):
    with open(file_path, "r") as f:
        return json.load(f)
    
def ReadLineFromFile(path):
    lines = []
    with open(path, 'r') as fd:
        for line in fd:
            lines.append(line.rstrip('\n'))
    return lines

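def parse(path):
    # Each line of the gzip file is expected to hold one Python dict literal
    import ast  # literal_eval only evaluates literals, unlike eval
    with gzip.open(path, 'rt', encoding='utf-8') as g:
        for l in g:
            yield ast.literal_eval(l)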

print("Helper functions defined")


Helper functions defined
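To illustrate the line-by-line gzip reader, the sketch below writes a tiny two-record file and reads it back. It assumes each line holds one Python dict literal, as the `parse` helper above implies; the function name `parse_literal_lines` and the sample records are made up for the example.

```python
import ast
import gzip
import os
import tempfile

def parse_literal_lines(path):
    """Yield one Python literal per line of a gzip-compressed text file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            yield ast.literal_eval(line)

# Write a tiny two-record file (illustrative data) and read it back
records = [{"asin": "X1", "helpful": [0, 0]}, {"asin": "X2", "helpful": [1, 2]}]
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
with gzip.open(tmp_path, "wt", encoding="utf-8") as fh:
    for rec in records:
        fh.write(repr(rec) + "\n")

loaded = list(parse_literal_lines(tmp_path))
```

Unlike `eval`, `ast.literal_eval` refuses anything that is not a plain literal, so a malformed or malicious line raises an error instead of executing code.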


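Cell 4: Defining the DotDict Class and Setting Parameters

Description:
Define a DotDict class that allows dictionary values to be accessed as attributes, then set all experiment parameters and the random seeds to help make the results reproducible.

In [6]:
# Cell 4: Set parameters and random seeds
# Purpose: build the argument object, then set the random seeds and all experiment parameters to help make the results reproducible.

class DotDict(dict):
    """Dictionary whose values can also be accessed as attributes."""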
    def __init__(self, **kwds):
        self.update(kwds)
        self.__dict__ = self
    def __repr__(self):
        # Call dict's __repr__ directly to avoid recursive __repr__ calls
        return dict.__repr__(self)

# Build the argument object
args = DotDict()

# ----------------- Basic training parameters -----------------
args.distributed = False
args.multiGPU = True
args.fp16 = True

args.split = "toys"
args.train = args.split
args.valid = args.split
args.test = args.split
args.batch_size = 16
args.optim = 'adamw'
args.warmup_ratio = 0.1
args.lr = 1e-3
args.num_workers = 4
args.clip_grad_norm = 5.0
args.losses = 'sequential,direct,explanation'
args.backbone = 't5-small'

# ----------------- Model and visual feature parameters -----------------
args.image_feature_type = 'vitb32'
args.image_feature_size_ratio = 2
args.use_adapter = True
args.reduction_factor = 8
args.use_single_adapter = True
args.use_vis_layer_norm = True
args.add_adapter_cross_attn = True
args.use_lm_head_adapter = True

# ----------------- Epochs, random seed, and other settings -----------------
args.epoch = 20
args.local_rank = 0
args.comment = ''
args.train_topk = -1
args.valid_topk = -1
args.dropout = 0.1
args.tokenizer = 'p5'
args.max_text_length = 1024
args.gen_max_length = 64
args.do_lower_case = False
args.weight_decay = 0.01
args.adam_eps = 1e-6
args.gradient_accumulation_steps = 1

# Set the random seeds
args.seed = 2022
torch.manual_seed(args.seed)
random.seed(args.seed)
np.random.seed(args.seed)

# ----------------- Enable whole-word and category embeddings -----------------
args.whole_word_embed = True
args.category_embed = True

# ----------------- cudnn and GPU parameters -----------------
cudnn.benchmark = True
ngpus_per_node = torch.cuda.device_count()
args.world_size = ngpus_per_node

# Build the list of loss names
LOSSES_NAME = [f'{name}_loss' for name in args.losses.split(',')]
LOSSES_NAME.append('total_loss')
args.LOSSES_NAME = LOSSES_NAME

# ----------------- Attack mode and malicious ratio -----------------
# Replace the two lines below to match the current experiment
args.attack_mode = "DirectBoostingAttack"
args.mr = 0.1

print("Current configuration:")
print(args)


Current configuration:
{'distributed': False, 'multiGPU': True, 'fp16': True, 'split': 'toys', 'train': 'toys', 'valid': 'toys', 'test': 'toys', 'batch_size': 16, 'optim': 'adamw', 'warmup_ratio': 0.1, 'lr': 0.001, 'num_workers': 4, 'clip_grad_norm': 5.0, 'losses': 'sequential,direct,explanation', 'backbone': 't5-small', 'image_feature_type': 'vitb32', 'image_feature_size_ratio': 2, 'use_adapter': True, 'reduction_factor': 8, 'use_single_adapter': True, 'use_vis_layer_norm': True, 'add_adapter_cross_attn': True, 'use_lm_head_adapter': True, 'epoch': 20, 'local_rank': 0, 'comment': '', 'train_topk': -1, 'valid_topk': -1, 'dropout': 0.1, 'tokenizer': 'p5', 'max_text_length': 1024, 'gen_max_length': 64, 'do_lower_case': False, 'weight_decay': 0.01, 'adam_eps': 1e-06, 'gradient_accumulation_steps': 1, 'seed': 2022, 'whole_word_embed': True, 'category_embed': True, 'world_size': 4, 'LOSSES_NAME': ['sequential_loss', 'direct_loss', 'explanation_loss', 'total_loss'], 'attack_mode': 'DirectBoostingAttack', '
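The attribute-style access used throughout the cell relies on the object's `__dict__` being the dictionary itself. A minimal standalone sketch of that pattern, with illustrative values taken from the configuration above:

```python
class DotDict(dict):
    """Dictionary whose keys are also readable and writable as attributes."""
    def __init__(self, **kwds):
        super().__init__()
        self.update(kwds)
        self.__dict__ = self  # attribute storage and dict storage are the same object

cfg = DotDict(split="toys")
cfg.losses = "sequential,direct,explanation"  # attribute write also creates a dict key

# Same derivation of the loss names as in the cell above
losses_name = [f"{name}_loss" for name in cfg["losses"].split(",")] + ["total_loss"]
```

Because `vars(cfg)` returns the dictionary itself, code such as `for k, v in vars(args).items()` sees every parameter that was set by attribute assignment.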

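Cell 5: GPU Setup and Run Name Generation

Description:
Manually select the GPU to use and build a run name (run_name) that distinguishes this run in later logs and saved results.

In [9]:
# Cell 5: GPU setup and run name generation
# Purpose: select the GPU manually and build a run name

# Manually specify the GPU ID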
gpu = 3
args.gpu = gpu
args.rank = gpu
print(f'Process Launching at GPU {gpu}')

# Set the current GPU device
torch.cuda.set_device(f'cuda:{gpu}')

# Build the run name
comments = []
dsets = []
if 'toys' in args.train:
    dsets.append('toys')
if 'beauty' in args.train:
    dsets.append('beauty')
if 'sports' in args.train:
    dsets.append('sports')
if 'clothing' in args.train:
    dsets.append('clothing')
comments.append(''.join(dsets))
if args.backbone:
    comments.append(args.backbone)
comments.append(''.join(args.losses.split(',')))
if args.comment != '':
    comments.append(args.comment)
comment = '_'.join(comments)

from datetime import datetime
current_time = datetime.now().strftime('%m%d')  # e.g. '0304'

if args.local_rank in [0, -1]:
    run_name = f'{current_time}_GPU{args.world_size}'
    if len(comments) > 0:
        run_name += f'_{comment}'
    args.run_name = run_name
    print("Run name:", args.run_name)


Process Launching at GPU 3
Run name: 0425_GPU4_toys_t5-small_sequentialdirectexplanation
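The run-name logic above can be condensed into one function, which also makes it testable with a fixed date. This is a sketch only; `build_run_name` and its `now` parameter are hypothetical names introduced here, and the year in the usage line is arbitrary.

```python
from datetime import datetime

def build_run_name(train, backbone, losses, world_size, comment="", now=None):
    """Assemble '<MMDD>_GPU<n>_<datasets>_<backbone>_<losses>[_<comment>]'."""
    now = now or datetime.now()
    # Keep the dataset names that occur in the training split string, in fixed order
    dsets = [d for d in ("toys", "beauty", "sports", "clothing") if d in train]
    parts = ["".join(dsets)]
    if backbone:
        parts.append(backbone)
    parts.append("".join(losses.split(",")))
    if comment:
        parts.append(comment)
    return f"{now.strftime('%m%d')}_GPU{world_size}_" + "_".join(parts)

name = build_run_name("toys", "t5-small", "sequential,direct,explanation", 4,
                      now=datetime(2000, 4, 25))
```

With these inputs the function yields the same string as the cell output above, `0425_GPU4_toys_t5-small_sequentialdirectexplanation`.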


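Cell 6: Building the Model Config, Tokenizer, and Model

Description:
Build the model configuration (config) from the arguments, create the tokenizer, and load the pretrained model.
Note: the checkpoint was saved with T5Tokenizer but is loaded here through P5Tokenizer, so a warning is printed, but functionality is not affected.
In addition, to fit the adapters, config.d_model must be assigned to adapter_config.

In [10]:
# Cell 6: Build the model config, tokenizer, and model
# Purpose: build the model config from the arguments, create the tokenizer, and load the pretrained model
import re  # make sure the re module is imported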

def create_config(args):
    from transformers import T5Config
    from adapters import AdapterConfig  # adapter configuration class

    # Load the T5 config from the pretrained checkpoint
    config = T5Config.from_pretrained(args.backbone)
    # Copy every argument onto the config
    for k, v in vars(args).items():
        setattr(config, k, v)
    config.non_linearity = "relu"

    # Visual feature settings
    image_feature_dim_dict = {
        'vitb32': 512,
        'vitb16': 512,
        'vitl14': 768,
        'rn50': 1024,
        'rn101': 512
    }
    config.feat_dim = image_feature_dim_dict[args.image_feature_type]
    config.n_vis_tokens = args.image_feature_size_ratio
    config.use_vis_layer_norm = args.use_vis_layer_norm
    config.reduction_factor = args.reduction_factor

    config.use_adapter = args.use_adapter
    config.add_adapter_cross_attn = args.add_adapter_cross_attn
    config.use_lm_head_adapter = args.use_lm_head_adapter
    config.use_single_adapter = args.use_single_adapter

    config.dropout_rate = args.dropout
    config.dropout = args.dropout
    config.attention_dropout = args.dropout
    config.activation_dropout = args.dropout

    config.losses = args.losses

    # If adapters are enabled, build the adapter config and pass it the main config's d_model
    tasks = re.split("[, ]+", args.losses)
    if args.use_adapter:
        adapter_config = AdapterConfig()
        adapter_config.tasks = tasks
        adapter_config.d_model = config.d_model  # pass the hidden size
        adapter_config.use_single_adapter = args.use_single_adapter
        adapter_config.reduction_factor = args.reduction_factor
        adapter_config.track_z = False
        config.adapter_config = adapter_config
    else:
        config.adapter_config = None

    return config

def create_tokenizer(args):
    from transformers import T5Tokenizer
    # Choose P5Tokenizer or T5Tokenizer based on args.tokenizer
    if 'p5' in args.tokenizer:
        from src.tokenization import P5Tokenizer
        tokenizer_class = P5Tokenizer
    else:
        tokenizer_class = T5Tokenizer

    tokenizer = tokenizer_class.from_pretrained(
        args.backbone,
        max_length=args.max_text_length,
        do_lower_case=args.do_lower_case,
    )
    print("Tokenizer:", tokenizer_class, args.backbone)
    return tokenizer

def create_model(model_class, config):
    print(f'Building Model at GPU {args.gpu}')
    model = model_class.from_pretrained(
        args.backbone,
        config=config
    )
    return model

# Build the config, tokenizer, and model
config = create_config(args)
if args.tokenizer is None:
    args.tokenizer = args.backbone
tokenizer = create_tokenizer(args)
model_class = VIP5Tuning
model = create_model(model_class, config)

# Move the model to the selected GPU
model = model.cuda()

# When using P5Tokenizer, resize the model's token embeddings
if 'p5' in args.tokenizer:
    model.resize_token_embeddings(tokenizer.vocab_size)
model.tokenizer = tokenizer

print("Model and tokenizer built")


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'T5Tokenizer'. 
The class this function is called from is 'P5Tokenizer'.


Tokenizer: <class 'src.tokenization.P5Tokenizer'> t5-small
Building Model at GPU 3
JointEncoder initialized successfully.
T5Stack initialized successfully.


Some weights of VIP5Tuning were not initialized from the model checkpoint at t5-small and are newly initialized: ['decoder.block.0.layer.2.ff_adapter.adapters.direct.down_sampler.weight', 'decoder.block.2.layer.2.ff_adapter.adapters.direct.up_sampler.bias', 'encoder.block.3.layer.0.attn_adapter.adapters.explanation.down_sampler.bias', 'encoder.visual_embedding.feat_embedding.1.weight', 'decoder.block.2.layer.1.enc_attn_adapter.adapters.direct.up_sampler.weight', 'decoder.block.2.layer.2.ff_adapter.adapters.sequential.up_sampler.weight', 'encoder.block.1.layer.1.ff_adapter.adapters.direct.up_sampler.weight', 'encoder.block.0.layer.1.ff_adapter.adapters.sequential.down_sampler.weight', 'decoder.block.1.layer.0.attn_adapter.adapters.explanation.up_sampler.bias', 'decoder.block.0.layer.2.ff_adapter.adapters.sequential.down_sampler.weight', 'encoder.block.3.layer.0.attn_adapter.adapters.sequential.down_sampler.weight', 'encoder.block.4.layer.1.ff_adapter.adapters.explanation.up_sampler.bias

lm_head initialized successfully.
OutputParallelAdapterLayer initialized successfully.
AdapterConfig: AdapterConfig(add_layer_norm_before_adapter=False, add_layer_norm_after_adapter=False, non_linearity='gelu_new', reduction_factor=8)
Model and tokenizer built
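The newly initialized weights listed above all belong to adapters with a `down_sampler` and an `up_sampler`. As a rough sizing sketch, assuming each adapter is a pair of linear layers whose bottleneck width is `d_model // reduction_factor` (an assumption about the project's adapter implementation, not something this notebook shows), the parameter count per adapter can be estimated as follows:

```python
def adapter_param_count(d_model, reduction_factor):
    """Estimate bottleneck width and parameter count of one adapter.

    Assumes a down-projection and an up-projection, each with a bias.
    """
    bottleneck = d_model // reduction_factor
    down = d_model * bottleneck + bottleneck  # down_sampler weight + bias
    up = bottleneck * d_model + d_model       # up_sampler weight + bias
    return bottleneck, down + up

# t5-small has d_model = 512; reduction_factor = 8 as configured above
bottleneck, n_params = adapter_param_count(512, 8)
```

Under these assumptions each adapter would have a 64-dimensional bottleneck and 66,112 parameters.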


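Cell 7: Loading the Pretrained Model Weights

Description:
Load the pretrained model weights from the specified checkpoint path and print the loading result.

In [11]:
# Cell 7: Load the pretrained model weights
# Purpose: load the pretrained weights from the checkpoint and print the path actually used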

from pprint import pprint

def load_checkpoint(ckpt_path):
    # Append the .pth extension if it is missing
    if not ckpt_path.endswith('.pth'):
        ckpt_path = ckpt_path + '.pth'
    print(f"📥 Loading checkpoint from: {ckpt_path}")
    state_dict = load_state_dict(ckpt_path, 'cpu')
    results = model.load_state_dict(state_dict, strict=False)
    print("ℹ️  load_state_dict result:")
    pprint(results)

# Full path to the checkpoint (with or without the .pth extension)
args.load = "/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/snap/toys/0425/DirectBoostingAttack_0.1_toys-vitb32-2-8-20/BEST_EVAL_LOSS.pth"
# Actually load the checkpoint
load_checkpoint(args.load)


📥 Loading checkpoint from: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/snap/toys/0425/DirectBoostingAttack_0.1_toys-vitb32-2-8-20/BEST_EVAL_LOSS.pth
ℹ️  load_state_dict result:
_IncompatibleKeys(missing_keys=['output_adapter.adapter.down_sampler.weight', 'output_adapter.adapter.down_sampler.bias', 'output_adapter.adapter.up_sampler.weight', 'output_adapter.adapter.up_sampler.bias'], unexpected_keys=[])
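With `strict=False`, key mismatches are only reported, never raised, so a silently incomplete load is easy to miss. The sketch below shows one way to turn the report into a hard check. The `LoadResult` stand-in mirrors the two fields of the object printed above, and the allowlist of `output_adapter.` keys is an assumption for illustration; whether those weights may stay freshly initialized is for the experimenter to decide.

```python
from collections import namedtuple

# Stand-in with the same two fields as the result printed above
LoadResult = namedtuple("LoadResult", ["missing_keys", "unexpected_keys"])

def check_load_result(result, allowed_missing_prefixes=("output_adapter.",)):
    """Raise if any key is unexpected, or missing outside the allowed prefixes."""
    bad_missing = [k for k in result.missing_keys
                   if not k.startswith(tuple(allowed_missing_prefixes))]
    if bad_missing or result.unexpected_keys:
        raise RuntimeError(
            f"missing={bad_missing}, unexpected={list(result.unexpected_keys)}")
    return True

ok = check_load_result(LoadResult(
    missing_keys=["output_adapter.adapter.down_sampler.weight"],
    unexpected_keys=[]))
```

In the notebook, the same function could be called on the value returned by `model.load_state_dict(...)`, since it exposes `missing_keys` and `unexpected_keys` as well.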


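Cell 8: Loading the Dataset and Data Maps

Description:
Load the data split file (rating_splits_augmented.pkl) and the data map file (datamaps.json) for use in the evaluation below.

In [12]:
# Cell 8: Load the dataset and data maps
# Purpose: load the rating_splits_augmented.pkl and datamaps.json data files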

data_splits = load_pickle(f'../data/{args.split}/rating_splits_augmented.pkl')
test_review_data = data_splits['test']
print("Test data size:", len(test_review_data))
print("Test data example:", test_review_data[0])

data_maps = load_json(os.path.join('../data', args.split, 'datamaps.json'))
print("Number of users:", len(data_maps['user2id']))
print("Number of items:", len(data_maps['item2id']))


Test data size: 16759
Test data example: {'reviewerID': 'A5K3CK2PWYQ7O', 'asin': 'B00F4CFEYG', 'reviewerName': 'Ellie "mittbooks"', 'helpful': [0, 0], 'reviewText': "I've found the Melissa & Doug brand to be overall good, although there are occasional negatives.  This is definitely one of the toys we'll mark a &#34;winner.&#34;  The vacuum comes in two pieces that require minimal assembly (the long handle and the base need to be put together - no tools required).  The height is perfect for our two year old who is 3 feet tall.  The top part moves at about a 45 degree angle to facilitate little people pushing the vacuum.  I'm not sure how long the six wooden pieces of &#34;trash&#34; will last.  Although not tiny, they would be easy to lose.  The vacuum does a good job of picking them up easily and there is a small area in the back of the base to take them out again.  There is also a rotating knob on the front of the handle that makes a good clicking noise when it moves.  Our son is truly enjoyin

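Cell 9: Importing the Data Loader and Evaluation Metrics

Description:
Import the data loading function and the metric functions used below to build the evaluation data loaders and to compute BLEU, ROUGE, and other metrics.

In [13]:
# Cell 9: Import the data loader and evaluation metric functions
# Purpose: import get_loader and the BLEU, ROUGE, and other metric functions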

from torch.utils.data import DataLoader, Dataset, Sampler
from src.data import get_loader
from evaluate.utils import rouge_score, bleu_score, unique_sentence_percent, root_mean_square_error, mean_absolute_error, feature_detect, feature_matching_ratio, feature_coverage_ratio, feature_diversity
from evaluate.metrics4rec import evaluate_all

print("Data loader and evaluation metric functions imported")


Data loader and evaluation metric functions imported


Cell 10: Evaluation - Explanation Task

Description:
Build the data loader for the explanation task, generate outputs with the model, and compute the BLEU and ROUGE metrics.

In [15]:
print("Loading checkpoint from:", args.load)

# =============================================================================
# Cell 10: Evaluation - Explanation task (with prompt information)
# =============================================================================

import os
from datetime import datetime
from pathlib import Path
from tqdm import tqdm
import torch

# If args.load is set, extract the date from its path; otherwise use the current date
if args.load is not None:
    eval_date = Path(args.load).parents[1].name
else:
    eval_date = datetime.now().strftime("%m%d")

# Specify the prompt and sample numbers for the explanation task
exp_prompt = 'C-3'  # can be changed to another prompt ID, e.g. 'C-12'
test_task_list = {'explanation': [exp_prompt]}
test_sample_numbers = {'sequential': (1, 1), 'direct': (1, 1), 'explanation': 1}

# Get the test data loader for the explanation task
zeroshot_test_loader = get_loader(
    args,
    test_task_list,
    test_sample_numbers,
    split=args.test, 
    mode='test', 
    batch_size=args.batch_size,
    workers=args.num_workers,
    distributed=args.distributed,
    data_root="../data",
    feature_root="../features"
)
print(f"Explanation task (Prompt: {exp_prompt}) number of batches:", len(zeroshot_test_loader))

tokens_predict = []
tokens_test = []

# Iterate over the test data loader and generate predictions with the model
for _, batch in tqdm(enumerate(zeroshot_test_loader), total=len(zeroshot_test_loader), ncols=100):
    with torch.no_grad():
        results = model.generate_step(batch)
        tokens_predict.extend(results)
        tokens_test.extend(batch['target_text'])

# Compute the BLEU and ROUGE metrics
BLEU1 = bleu_score(tokens_test, tokens_predict, n_gram=1, smooth=False)
BLEU4 = bleu_score(tokens_test, tokens_predict, n_gram=4, smooth=False)
ROUGE = rouge_score(tokens_test, tokens_predict)

print(f'BLEU-1 {BLEU1:7.4f}, BLEU-4 {BLEU4:7.4f}')
for k, v in ROUGE.items():
    print(f'{k} {v:7.4f}')

# Build the directory and file name for saving the evaluation results
eval_dir = Path("/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log") \
           / args.split / eval_date / "evaluation_logs"
eval_dir.mkdir(parents=True, exist_ok=True)

# Keep the same base_name as used during training
suffix = args.attack_mode
mr = args.mr
dataset = args.split
img_feat = args.image_feature_type
reduction = args.reduction_factor
epoch = args.epoch
base_name = f"{suffix}_{mr}_VIP5_{dataset}_{img_feat}_{reduction}_{epoch}"

explanation_filename = f"{base_name}_eval_explanation_{exp_prompt}.txt"
explanation_log_path = eval_dir / explanation_filename

# Save the evaluation results; the file header records Dataset, AttackMode, and MaliciousRatio
with open(explanation_log_path, "w", encoding="utf-8") as f:
    f.write("Explanation Evaluation Results\n")
    f.write(f"Dataset: {dataset}\n")
    f.write(f"AttackMode: {args.attack_mode}\n")
    f.write(f"MaliciousRatio: {args.mr}\n")
    f.write(f"Prompt: {exp_prompt}\n\n")
    f.write(f"BLEU-1: {BLEU1:7.4f}\n")
    f.write(f"BLEU-4: {BLEU4:7.4f}\n")
    for k, v in ROUGE.items():
        f.write(f"{k}: {v:7.4f}\n")

print(f"Explanation task (Prompt: {exp_prompt}) evaluation results saved to: {explanation_log_path}")


Loading checkpoint from: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/snap/toys/0425/DirectBoostingAttack_0.1_toys-vitb32-2-8-20/BEST_EVAL_LOSS.pth


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'T5Tokenizer'. 
The class this function is called from is 'P5Tokenizer'.


Data sources:  ['toys']
[INFO] 加载扩展映射: ../data/toys/user_id2name_poisoned.pkl
compute_datum_info
Explanation task (Prompt: C-3) number of batches: 646


100%|█████████████████████████████████████████████████████████████| 646/646 [01:32<00:00,  6.99it/s]


BLEU-1  2.9930, BLEU-4  1.0293
rouge_1/f_score  6.7310
rouge_1/r_score  5.1061
rouge_1/p_score 13.6365
rouge_2/f_score  1.2484
rouge_2/r_score  1.0052
rouge_2/p_score  2.3944
rouge_l/f_score  5.0368
rouge_l/r_score  4.8688
rouge_l/p_score 13.2084
Explanation task (Prompt: C-3) evaluation results saved to: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log/toys/0425/evaluation_logs/DirectBoostingAttack_0.1_VIP5_toys_vitb32_8_20_eval_explanation_C-3.txt
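The log directory above depends on `eval_date`, which is read from the checkpoint path as the name of the directory two levels above the file. A small sketch of that lookup, using a shortened, illustrative version of the checkpoint path (the helper name is hypothetical):

```python
from pathlib import PurePosixPath

def eval_date_from_checkpoint(ckpt_path):
    """Return the name of the directory two levels above the checkpoint file."""
    return PurePosixPath(ckpt_path).parents[1].name

# Shortened, illustrative path with the same layout as args.load
ckpt = "snap/toys/0425/DirectBoostingAttack_0.1_toys-vitb32-2-8-20/BEST_EVAL_LOSS.pth"
eval_date = eval_date_from_checkpoint(ckpt)
```

Here `parents[0]` is the run directory and `parents[1]` is the date directory, so the lookup only works when checkpoints follow the `<split>/<date>/<run>/<file>` layout.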


Cell 11: Evaluation - Direct Task

Description:
Load the test data for the direct task, generate outputs, and compute the evaluation metrics.

In [63]:
# =============================================================================
# Cell 11: Evaluation - Direct task (with prompt information)
# =============================================================================

import os
from datetime import datetime
from pathlib import Path
from tqdm import tqdm
import torch

# 1. Determine eval_date: take it from the checkpoint directory name if a checkpoint was loaded, otherwise use the current date
if args.load is not None:
    eval_date = Path(args.load).parents[1].name
else:
    eval_date = datetime.now().strftime("%m%d")

# 2. Specify the prompt for the direct task
test_task_list = {'direct': ['B-8']}  # options: 'B-5' or 'B-8'
prompt = test_task_list['direct'][0]

test_sample_numbers = {
    'sequential': (1, 1),
    'direct': (1, 1),
    'explanation': 1
}

# 3. Get the test loader for the direct task
zeroshot_test_loader = get_loader(
    args,
    test_task_list,
    test_sample_numbers,
    split=args.test,
    mode='test',
    batch_size=args.batch_size,
    workers=args.num_workers,
    distributed=args.distributed,
    data_root="../data",
    feature_root="../features"
)

print(f"Attack mode: {args.attack_mode}, malicious ratio: {args.mr}")
print(f"Direct task (Prompt: {prompt}) number of batches:", len(zeroshot_test_loader))

# 4. Collect the ground truth and model outputs for all samples
all_info = []
for _, batch in tqdm(enumerate(zeroshot_test_loader), total=len(zeroshot_test_loader)):
    with torch.no_grad():
        results = model.generate_step(batch)
        beam_outputs = model.generate(
            input_ids=batch['input_ids'].cuda(),
            whole_word_ids=batch['whole_word_ids'].cuda(),
            category_ids=batch['category_ids'].cuda(),
            vis_feats=batch['vis_feats'].cuda(),
            task=batch["task"][0],
            max_length=50,
            num_beams=20,
            no_repeat_ngram_size=0,
            num_return_sequences=20,
            early_stopping=True
        )
        generated_sents = model.tokenizer.batch_decode(beam_outputs, skip_special_tokens=True)

        for j, (_, tgt_text, _) in enumerate(zip(results, batch['target_text'], batch['source_text'])):
            all_info.append({
                'target_item': tgt_text,
                'gen_item_list': generated_sents[j * 20: (j + 1) * 20]
            })

# 5. Build the ground-truth and score dictionaries
gt = {}
ui_scores = {}
for i, info in enumerate(all_info):
    gt[i] = [int(info['target_item'])]
    pred_dict = {}
    for j, pred in enumerate(info['gen_item_list']):
        try:
            pred_dict[int(pred)] = -(j + 1)
        except ValueError:
            # Skip generations that are not valid integer item IDs
            pass
    ui_scores[i] = pred_dict

# 6. Define the target set used for ER@K
targeted_items = gt.copy()

# 7. Compute the metrics, including ER@K
msg1, res1 = evaluate_all(ui_scores, gt, topk=1,  targeted_items=targeted_items)
msg5, res5 = evaluate_all(ui_scores, gt, topk=5,  targeted_items=targeted_items)
msg10, res10 = evaluate_all(ui_scores, gt, topk=10, targeted_items=targeted_items)

print("\nMetrics @1:")
print(msg1)
print(f"ER@1: {res1['er']:.4f}")
print("\nMetrics @5:")
print(msg5)
print(f"ER@5: {res5['er']:.4f}")
print("\nMetrics @10:")
print(msg10)
print(f"ER@10: {res10['er']:.4f}")

# 8. Directory for saving the results
eval_dir = Path("/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log") \
           / args.split / eval_date / "evaluation_logs"
eval_dir.mkdir(parents=True, exist_ok=True)

# 9. Build the same base_name as used during training and assemble the file name (with eval_direct)
suffix = args.attack_mode          # e.g. DirectBoostingAttack
mr = args.mr                       # e.g. 0.1
dataset = args.split               # e.g. toys
img_feat = args.image_feature_type # e.g. vitb32
reduction = args.reduction_factor  # e.g. 8
epoch = args.epoch                 # e.g. 20
base_name = f"{suffix}_{mr}_VIP5_{dataset}_{img_feat}_{reduction}_{epoch}"

direct_filename = f"{base_name}_eval_direct_{prompt}.txt"
direct_log_path = eval_dir / direct_filename

# 10. Write the file, starting with "Direct Evaluation Results" and the dataset
with open(direct_log_path, "w", encoding="utf-8") as f:
    f.write("Direct Evaluation Results\n")
    f.write(f"Dataset: {dataset}\n\n")
    f.write(f"AttackMode: {args.attack_mode}\n")
    f.write(f"MaliciousRatio: {args.mr}\n\n")
    f.write("=== Metrics @1 ===\n")
    f.write(msg1 + "\n")
    f.write(f"ER@1: {res1['er']:.4f}\n\n")
    f.write("=== Metrics @5 ===\n")
    f.write(msg5 + "\n")
    f.write(f"ER@5: {res5['er']:.4f}\n\n")
    f.write("=== Metrics @10 ===\n")
    f.write(msg10 + "\n")
    f.write(f"ER@10: {res10['er']:.4f}\n")

print(f"Direct results saved to: {direct_log_path}")


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'T5Tokenizer'. 
The class this function is called from is 'P5Tokenizer'.


Data sources:  ['toys']
[INFO] 加载扩展映射: ../data/toys/user_id2name_poisoned.pkl
compute_datum_info
Attack mode: DirectBoostingAttack, malicious ratio: 0.1
Direct task (Prompt: B-8) number of batches: 1214


100%|██████████| 1214/1214 [16:31<00:00,  1.22it/s]



NDCG@1	Rec@1	Hits@1	Prec@1	MAP@1	MRR@1	ER@1
0.0414	0.0414	0.0414	0.0414	0.0414	0.0414	0.0414

NDCG@5	Rec@5	Hits@5	Prec@5	MAP@5	MRR@5	ER@5
0.0826	0.1218	0.1218	0.0244	0.0696	0.0696	0.1218

NDCG@10	Rec@10	Hits@10	Prec@10	MAP@10	MRR@10	ER@10
0.1037	0.1879	0.1879	0.0188	0.0783	0.0783	0.1879

Metrics @1:

NDCG@1	Rec@1	Hits@1	Prec@1	MAP@1	MRR@1	ER@1
0.0414	0.0414	0.0414	0.0414	0.0414	0.0414	0.0414
ER@1: 0.0414

Metrics @5:

NDCG@5	Rec@5	Hits@5	Prec@5	MAP@5	MRR@5	ER@5
0.0826	0.1218	0.1218	0.0244	0.0696	0.0696	0.1218
ER@5: 0.1218

Metrics @10:

NDCG@10	Rec@10	Hits@10	Prec@10	MAP@10	MRR@10	ER@10
0.1037	0.1879	0.1879	0.0188	0.0783	0.0783	0.1879
ER@10: 0.1879
Direct results saved to: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log/toys/0417/evaluation_logs/DirectBoostingAttack_0.1_VIP5_toys_vitb32_8_20_eval_direct_B-8.txt
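The scoring step in the cell above turns a ranked list of generated strings into a dictionary of item scores, where earlier beams get higher (less negative) scores. The sketch below reproduces that step and adds a simple hit@k check. `hit_at_k` is a simplified stand-in written for this example, not the project's `evaluate_all`, and the generated strings are made up.

```python
def build_scores(gen_item_list):
    """Map each numeric generation to a score; rank 1 gets -1, rank 2 gets -2, ..."""
    scores = {}
    for rank, pred in enumerate(gen_item_list):
        try:
            scores[int(pred)] = -(rank + 1)
        except ValueError:
            continue  # skip generations that are not integer item IDs
    return scores

def hit_at_k(scores, target, k):
    """Return 1 if the target item is among the k highest-scored items, else 0."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return int(target in top)

scores = build_scores(["1203", "n/a", "877", "45"])  # illustrative beam outputs
```

Non-numeric generations are dropped without shifting the scores of later beams, so a valid item at beam position 3 still receives the score -3.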


Cell 12: Evaluation - Sequential Task

Description:
Load the test data for the sequential task, generate outputs, and compute the evaluation metrics, decoding the beam search results along the way.

In [64]:
# =============================================================================
# Cell 12: Evaluation - Sequential task (with prompt information)
# =============================================================================

import os
from datetime import datetime
from pathlib import Path
from tqdm import tqdm
import torch

# If args.load is set, extract the date from the load path; otherwise use the current date
if args.load is not None:
    eval_date = Path(args.load).parents[1].name
else:
    eval_date = datetime.now().strftime("%m%d")

# Specify the prompt and sample numbers for the sequential task
test_task_list = {'sequential': ['A-3']}
prompt = test_task_list['sequential'][0]
test_sample_numbers = {'sequential': (1, 1), 'direct': (1, 1), 'explanation': 1}

# Get the test data loader for the sequential task
zeroshot_test_loader = get_loader(
    args,
    test_task_list,
    test_sample_numbers,
    split=args.test,
    mode='test',
    batch_size=args.batch_size,
    workers=args.num_workers,
    distributed=args.distributed,
    data_root="../data",
    feature_root="../features"
)

print(f"Attack mode: {args.attack_mode}, malicious ratio: {args.mr}")
print(f"Sequential task (Prompt: {prompt}) number of batches:", len(zeroshot_test_loader))

# Generate candidates and collect the results
all_info = []
for _, batch in tqdm(enumerate(zeroshot_test_loader), total=len(zeroshot_test_loader), ncols=100):
    with torch.no_grad():
        # Single-pass generation
        results = model.generate_step(batch)
        # Beam search generation with multiple returned sequences
        beam_outputs = model.generate(
            input_ids=batch['input_ids'].cuda(),
            whole_word_ids=batch['whole_word_ids'].cuda(),
            category_ids=batch['category_ids'].cuda(),
            vis_feats=batch['vis_feats'].cuda(),
            task=batch["task"][0],
            max_length=50,
            num_beams=20,
            no_repeat_ngram_size=0,
            num_return_sequences=20,
            early_stopping=True
        )
        generated_sents = model.tokenizer.batch_decode(beam_outputs, skip_special_tokens=True)

        for j in range(len(batch['target_text'])):
            all_info.append({
                'target_item': batch['target_text'][j],
                'gen_item_list': generated_sents[j * 20: (j + 1) * 20]
            })

# Build the ground-truth and score dictionaries
gt = {}
ui_scores = {}
for i, info in enumerate(all_info):
    gt[i] = [int(info['target_item'])]
    pred_dict = {}
    for j, pred in enumerate(info['gen_item_list']):
        try:
            pred_dict[int(pred)] = -(j + 1)
        except ValueError:
            # Skip generations that are not valid integer item IDs
            pass
    ui_scores[i] = pred_dict

# Define the target set and compute the metrics
targeted_items = gt.copy()
msg1, res1 = evaluate_all(ui_scores, gt, topk=1, targeted_items=targeted_items)
msg5, res5 = evaluate_all(ui_scores, gt, topk=5, targeted_items=targeted_items)
msg10, res10 = evaluate_all(ui_scores, gt, topk=10, targeted_items=targeted_items)

print("\nMetrics @1:", msg1, f"ER@1: {res1['er']:.4f}")
print("Metrics @5:", msg5, f"ER@5: {res5['er']:.4f}")
print("Metrics @10:", msg10, f"ER@10: {res10['er']:.4f}")

# Build the output directory
eval_dir = Path("/scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log") \
           / args.split / eval_date / "evaluation_logs"
eval_dir.mkdir(parents=True, exist_ok=True)

# Keep the same file name prefix as in training, followed by eval_sequential
suffix = args.attack_mode
mr = args.mr
dataset = args.split
img_feat = args.image_feature_type
reduction = args.reduction_factor
epoch = args.epoch
base_name = f"{suffix}_{mr}_VIP5_{dataset}_{img_feat}_{reduction}_{epoch}"

sequential_filename = f"{base_name}_eval_sequential_{prompt}.txt"
sequential_log_path = eval_dir / sequential_filename

# Write the file, starting with Dataset, AttackMode, MaliciousRatio, and Prompt
with open(sequential_log_path, "w", encoding="utf-8") as f:
    f.write("Sequential Evaluation Results\n")
    f.write(f"Dataset: {dataset}\n")
    f.write(f"AttackMode: {args.attack_mode}\n")
    f.write(f"MaliciousRatio: {args.mr}\n")
    f.write(f"Prompt: {prompt}\n\n")
    f.write("=== Metrics @1 ===\n")
    f.write(msg1 + "\n")
    f.write(f"ER@1: {res1['er']:.4f}\n\n")
    f.write("=== Metrics @5 ===\n")
    f.write(msg5 + "\n")
    f.write(f"ER@5: {res5['er']:.4f}\n\n")
    f.write("=== Metrics @10 ===\n")
    f.write(msg10 + "\n")
    f.write(f"ER@10: {res10['er']:.4f}\n")

print(f"Sequential results saved to: {sequential_log_path}")


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'T5Tokenizer'. 
The class this function is called from is 'P5Tokenizer'.


Data sources:  ['toys']
[INFO] 加载扩展映射: ../data/toys/user_id2name_poisoned.pkl
compute_datum_info
Attack mode: DirectBoostingAttack, malicious ratio: 0.1
Sequential task (Prompt: A-3) number of batches: 1214


100%|███████████████████████████████████████████████████████████| 1214/1214 [11:17<00:00,  1.79it/s]



NDCG@1	Rec@1	Hits@1	Prec@1	MAP@1	MRR@1	ER@1
0.0506	0.0506	0.0506	0.0506	0.0506	0.0506	0.0506

NDCG@5	Rec@5	Hits@5	Prec@5	MAP@5	MRR@5	ER@5
0.0619	0.0721	0.0721	0.0144	0.0586	0.0586	0.0721

NDCG@10	Rec@10	Hits@10	Prec@10	MAP@10	MRR@10	ER@10
0.0647	0.0805	0.0805	0.0081	0.0597	0.0597	0.0805

Metrics @1: 
NDCG@1	Rec@1	Hits@1	Prec@1	MAP@1	MRR@1	ER@1
0.0506	0.0506	0.0506	0.0506	0.0506	0.0506	0.0506 ER@1: 0.0506
Metrics @5: 
NDCG@5	Rec@5	Hits@5	Prec@5	MAP@5	MRR@5	ER@5
0.0619	0.0721	0.0721	0.0144	0.0586	0.0586	0.0721 ER@5: 0.0721
Metrics @10: 
NDCG@10	Rec@10	Hits@10	Prec@10	MAP@10	MRR@10	ER@10
0.0647	0.0805	0.0805	0.0081	0.0597	0.0597	0.0805 ER@10: 0.0805
Sequential results saved to: /scratch/guanguowei/Code/MyWork/VIP5_Shadowcast_DPA/log/toys/0417/evaluation_logs/DirectBoostingAttack_0.1_VIP5_toys_vitb32_8_20_eval_sequential_A-3.txt
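Both ranking cells slice the flat list of decoded beam outputs with `generated_sents[j * 20: (j + 1) * 20]`, which relies on the generator returning `num_return_sequences` consecutive sequences per input. A minimal sketch of the same grouping with an explicit length check follows; the helper name `group_beams` is hypothetical.

```python
def group_beams(flat_outputs, num_return_sequences):
    """Split a flat list of decoded beams into one list per input sample."""
    if len(flat_outputs) % num_return_sequences != 0:
        raise ValueError("output count is not a multiple of num_return_sequences")
    return [flat_outputs[i:i + num_return_sequences]
            for i in range(0, len(flat_outputs), num_return_sequences)]

# Two samples with three beams each (illustrative strings)
groups = group_beams(["11", "12", "13", "21", "22", "23"], 3)
```

Keeping the group size in one parameter avoids repeating the literal 20 in both the `generate` call and the slicing expression.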
