# 2025 Global AI Attack and Defense Challenge, Track 1: Full-Element Interactive Image Authentication (Generation Round)

https://tianchi.aliyun.com/s/90a8875e183258623b8e8cb45967d505

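## Task Description

In an era of rapidly advancing generative AI (GenAI), image generation technology shows remarkable creative power while also posing unprecedented challenges to social trust. From emerging fields such as finance and advertising to traditional sectors such as healthcare and government, many business scenarios rely on user-submitted images for identity verification. As bad actors grow more skilled at forging highly realistic images with Photoshop, AIGC and similar tools, authenticity verification has become a key bottleneck for automated review.

For this reason we have set up a dedicated track, "Full-Element Image Scenario Attack and Defense". In the generation stage, teams must generate highly realistic simulated images according to the competition rules, probing the detection limits of existing anti-forgery systems. We hope this adversarial exercise will drive breakthroughs in image authenticity detection and contribute to a trustworthy digital infrastructure.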

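## Competition Tasks

The challenge consists of four subtasks: AIGC image generation, natural scene image editing, visual text editing, and deepfake. Participants should try different models or optimization algorithms and must complete all four subtasks. Each task is defined as follows:

- **AIGC image generation**: generate a realistic and visually pleasing image from a given text prompt.
- **Natural scene image editing**: given an image and an editing instruction, modify the content of the specified region of the original image.
- **Visual text editing**: given an image and an editing instruction, edit or replace the text that appears in the original image.
- **Deepfake**: use a given face image to replace the face in a target image.

Participants are free to choose and tune models or prompts, or to apply optimization algorithms that improve output quality, in order to demonstrate their technical ability and creativity.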

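## Dataset

The challenge provides 1000 evaluation samples, about 250 per task type. The samples and task details are recorded in task.csv, and the related images are stored in the imgs folder. Every image has a unique identifier (ID), and image files are named id.jpg. The dataset has the following format: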

| index |             task_type             | prompt | ori_image | target_image |
| :---: | :-------------------------------: | :----: | :-------: | :----------: |
|   1   |    t2i (AIGC image generation)    |  xxxx  |           |              |
|   2   | tie (natural scene image editing) |  xxxx  | xxxx.jpg  |              |
|   3   |    vttie (visual text editing)    |  xxxx  | xxxx.jpg  |              |
|   4   |             deepfake              |        | xxxx.jpg  |   xxxx.jpg   |

The evaluation data is built from open-source datasets. Specifically, 250 samples each were randomly drawn from the emu_edit_test_set and GEdit-Bench datasets for the image editing and visual text editing tasks, respectively. For the deepfake task, 500 distinct face images were selected from the LFW (Labeled Faces in the Wild) face recognition benchmark. All data comes from publicly available open-source datasets and is used in strict compliance with their licenses.
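
As a rough illustration of the layout above, the sketch below parses a few rows in the task.csv format with Python's standard `csv` module and reports which input columns each task type actually carries. The sample rows, the file names in them, and the helper name `required_inputs` are made up for illustration; only the column names and task types come from the table.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the task.csv layout (placeholder values, not real data)
SAMPLE = """index,task_type,prompt,ori_image,target_image
1,t2i,a cat on a sofa,,
2,tie,make the sky pink,aaa.jpg,
3,vttie,change the sign text to OPEN,bbb.jpg,
4,deepfake,,ccc.jpg,ddd.jpg
"""


def required_inputs(row):
    """Return the names of the non-empty input columns of one task row."""
    return [col for col in ("prompt", "ori_image", "target_image") if row[col]]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = Counter(row["task_type"] for row in rows)
inputs_by_type = {row["task_type"]: required_inputs(row) for row in rows}

for task_type, cols in inputs_by_type.items():
    print(task_type, counts[task_type], cols)
```

On the real file, the same loop can be used to confirm that every deepfake row has both image columns filled before any generation is started.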

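## Evaluation Metric

The challenge uses VIEScore as its evaluation metric. VIEScore consists of two sub-scores: SC Score and PQ Score. SC Score evaluates the consistency of the generated content, and PQ Score measures its quality and realism. Scores range from 0 to 10, and a higher score indicates a better result.

$$
\text{VIEScore} = \sqrt{\text{sc\_score} \times \text{pq\_score}}
$$

For the deepfake task, the cosine similarity between face features replaces the original SC Score:

$$
\text{VIEScore} = \sqrt{10 \times \cos\left(f_{\text{target}}, f_{\text{gen}}\right) \times \text{pq\_score}}
$$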
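
The two formulas can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the official scorer: the function names are hypothetical, the feature vectors are toy values, and clamping a negative cosine similarity to 0 is an assumption, since the task description does not say how that case is handled.

```python
import math


def vie_score(sc_score, pq_score):
    """Geometric mean of the SC and PQ sub-scores (each between 0 and 10)."""
    return math.sqrt(sc_score * pq_score)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vie_score_deepfake(f_target, f_gen, pq_score):
    """Deepfake variant: 10 * cosine similarity takes the place of SC Score."""
    # Assumption: negative similarities are clamped to 0 so the root stays real
    cos = max(0.0, cosine_similarity(f_target, f_gen))
    return math.sqrt(10 * cos * pq_score)


print(vie_score(4, 9))
print(vie_score_deepfake([1.0, 0.0], [1.0, 0.0], 10))
```

Because the score is a geometric mean, a value of 0 in either sub-score drives the whole VIEScore to 0, so an image cannot compensate for ignoring the prompt by being visually polished.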

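## Code Review

- Participants must submit complete code covering data preprocessing, feature engineering, model training and prediction. The code must be written in Python and carry detailed comments so that the review panel can audit it, and it must fully reproduce the participant's predictions.
- After the competition ends, the organizers will review the code of the teams that reach the final. The code must meet the following requirements:
  - the logic is clear and easy to follow;
  - the comments are sufficient for others to read the code;
  - the code runs without errors and reproduces the predictions.
- To measure model robustness and exclude other confounding factors, the final benchmark score is subject to the following data restriction:
  - Test data must not be manually labeled and used for training.
- During the competition, code must not be shared privately between teams. Any such sharing of competition code will lead to disqualification.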

## Baseline

Participants may use any open-source model or generation API, for example Tongyi Wanxiang (通义万相), flux, or Bagel.

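## Submission Instructions

During the competition, teams download the data from the Tianchi platform, develop and debug their algorithms locally, and then submit a result file online. The result file must be named "参赛队名称_图片生成_result.zip" (where 参赛队名称 stands for the team name and 图片生成 means "image generation"). The ZIP archive must contain two items, a result.csv file and an imgs folder, and must be smaller than 2 GB.

- result.csv must be saved with utf-8-sig encoding. Each row corresponds to one generated result and contains two fields, "index" and "gen_image":
  1. The "index" field must match the evaluation CSV exactly in both order and count.
  2. The "gen_image" field is the file name of the generated image and must have the form "index.jpg".
- All generated images must be saved in the imgs folder and packaged together with result.csv for submission.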
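
A minimal sketch of the result.csv rules above, using only the standard library. The helper names `write_result_csv` and `check_result_csv` are hypothetical, and the checks cover only the points listed in this section (BOM, index order and count, file-name pattern), not whatever else the platform may validate.

```python
import csv
import os
import tempfile


def write_result_csv(path, indices):
    """Write result.csv with the two required columns, encoded as utf-8-sig."""
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "gen_image"])
        for idx in indices:
            writer.writerow([idx, f"{idx}.jpg"])


def check_result_csv(path, expected_indices):
    """Return a list of problems found in a result.csv file (empty if none)."""
    problems = []
    with open(path, "rb") as f:
        # utf-8-sig files start with the three-byte UTF-8 byte order mark
        if f.read(3) != b"\xef\xbb\xbf":
            problems.append("file does not start with a UTF-8 BOM")
    with open(path, newline="", encoding="utf-8-sig") as f:
        rows = list(csv.DictReader(f))
    if [r["index"] for r in rows] != [str(i) for i in expected_indices]:
        problems.append("index column differs from the evaluation set")
    for r in rows:
        if r["gen_image"] != f"{r['index']}.jpg":
            problems.append(f"unexpected gen_image for index {r['index']}")
    return problems


demo_path = os.path.join(tempfile.mkdtemp(), "result.csv")
write_result_csv(demo_path, range(1, 6))
print(check_result_csv(demo_path, range(1, 6)))
```

Running the checker against the indices read from task.csv before zipping catches ordering or encoding mistakes locally instead of after an upload.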

In [8]:
import pandas as pd
import requests
import glob
import time
import cv2
import dlib
import numpy as np
from matplotlib import pyplot as plt
import traceback

In [9]:
task = pd.read_csv("data/task.csv")
task

Unnamed: 0,index,task_type,prompt,ori_image,target_image
0,1,t2i,一棵令人惊叹的树从太空中的一颗小行星上生长出来，细节复杂，自然光线照射，呈现出超现实的3D渲...,,
1,2,t2i,宁静的天空中，以概念艺术风格绘制的平和蓬松的云朵，分辨率为4K。,,
2,3,t2i,一张逼真的照片，展示了一位皮肤光滑的男性机器人，它像雕像一样在工厂中摆出姿势。,,
3,4,t2i,这是一幅以平滑的数字肖像插画概念艺术风格描绘的身穿宇航服的年轻宇航员。,,
4,5,t2i,A vibrant cityscape at dawn with sharp focus a...,,
...,...,...,...,...,...
995,996,deepfake,,4c234474342546df9034a19dbb230bed.jpg,627b2ac2e0194fc89c015c769decfe6d.jpg
996,997,deepfake,,0aa8cb980be845a4a30b2248f9abe226.jpg,4d00dbabc9ff47fcba54c21d64bcc458.jpg
997,998,deepfake,,a701aec125ce459b9cff76eadd7e0263.jpg,2b8c21b179ec4c1187d60682f7fca808.jpg
998,999,deepfake,,70eff544233242dab682d2d209fcc2e3.jpg,77932521bc3c440eba39c34eafcdee25.jpg


In [10]:
import requests
import os
import time
from urllib.parse import urlparse

def download_image(image_url: str, save_dir: str = "./downloads", filename: str = None) -> str:
    os.makedirs(save_dir, exist_ok=True)
    
    if not filename:
        url_path = urlparse(image_url).path
        filename = os.path.basename(url_path) or "downloaded_image.jpg"
        
        if not os.path.splitext(filename)[1]:
            filename += ".jpg"
    
    # Stream the body and set a timeout so a stalled connection cannot hang the whole run
    response = requests.get(image_url, stream=True, timeout=30)
    response.raise_for_status()  # Raise error for bad status codes
    
    content_type = response.headers.get('content-type', '')
    if 'image/' not in content_type:
        raise ValueError(f"URL does not point to an image (content-type: {content_type})")
    
    file_path = os.path.join(save_dir, filename)
    with open(file_path, 'wb') as f:
        for chunk in response.iter_content(1024):
            f.write(chunk)
    
    return file_path
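
The file-name fallback inside `download_image` can be checked in isolation, without any network access. The sketch below lifts that logic into a hypothetical helper, `infer_filename`, and runs it on made-up example.com URLs.

```python
import os
from urllib.parse import urlparse


def infer_filename(image_url, default="downloaded_image.jpg"):
    """Derive a file name from the URL path, mirroring download_image's fallback."""
    # urlparse(...).path excludes the query string, so "?x=1" never ends up in the name
    name = os.path.basename(urlparse(image_url).path) or default
    # Append .jpg when the last path segment has no extension
    if not os.path.splitext(name)[1]:
        name += ".jpg"
    return name


print(infer_filename("http://example.com/a/b/photo.png?x=1"))
print(infer_filename("http://example.com/img/123"))
print(infer_filename("http://example.com/"))
```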

In [11]:
# Fill in your own API key
# https://bailian.console.aliyun.com/?tab=model#/api-key
DASHSCOPE_API_KEY = "sk-6XXX"


def convert_text_prompt_to_image(prompt: str, negative_prompt: str = None):
    """
    https://bailian.console.aliyun.com/?tab=api#/api/?type=model&url=https%3A%2F%2Fhelp.aliyun.com%2Fdocument_detail%2F2862677.html&renderType=iframe
    """
    
    url4generate = "https://dashscope.aliyuncs.com/api/v1/services/aigc/text2image/image-synthesis"
    headers = {
        "X-DashScope-Async": "enable",
        "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": "wanx2.1-t2i-turbo",
        "input": {
            "prompt": prompt,
        },
        "parameters": {
            "size": "1024*1024",
            "n": 1
        }
    }

    if negative_prompt:
        data["input"]["negative_prompt"] = negative_prompt

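    task_id = None
    # Submit the task, retrying up to 3 times on network or response errors
    for _ in range(3):
        try:
            response = requests.post(url4generate, headers=headers, json=data)
            task_id = response.json()["output"]["task_id"]
            break
        except Exception:
            pass

    if not task_id:
        # Last resort: retry once with a generic fallback prompt
        # ("a little boy in yellow clothes"). Note that the resulting
        # image will no longer match the original prompt.
        try:
            data["input"]["prompt"] = "穿着黄色衣服的小男孩"
            response = requests.post(url4generate, headers=headers, json=data)
            task_id = response.json()["output"]["task_id"]
        except Exception:
            print(data)
            traceback.print_exc()
            raise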
    
    url4result = f"https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
    headers = {
        "Authorization": f"Bearer {DASHSCOPE_API_KEY}"
    }
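    # Poll for up to 60 rounds, 2 seconds apart (about 2 minutes in total)
    for _ in range(60):
        try:
            output = requests.get(url4result, headers=headers).json()["output"]
            if output["task_status"] == "SUCCEEDED":
                return output["results"][0]["url"]
            if output["task_status"] == "FAILED":
                # FAILED is a terminal status, so further polling is pointless
                return None
        except Exception:
            pass
        # Sleep after every round, including failed requests, to avoid hammering the API
        time.sleep(2)

    return None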

def image_edit_by_prompt(function_str:str, prompt: str, base_image_url:str):
    """
    https://help.aliyun.com/zh/model-studio/wanx-image-edit-api-reference?spm=a2c4g.11186623.help-menu-2400256.d_2_2_2.471812fbpTxg7w&scm=20140722.H_2868981._.OR_help-T_cn~zh-V_1
    """
    url4generate = "https://dashscope.aliyuncs.com/api/v1/services/aigc/image2image/image-synthesis"
    headers = {
        "X-DashScope-Async": "enable",
        "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
        "Content-Type": "application/json"
    }

    data = {
      "model": "wanx2.1-imageedit",
      "input": {
        "function": function_str,
        "prompt": prompt,
        "base_image_url": base_image_url
      },
      "parameters": {
        "n": 1
      }
    }

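    try:
        response = requests.post(url4generate, headers=headers, json=data)
        task_id = response.json()["output"]["task_id"]
    except Exception:
        # Without a task_id there is nothing to poll; let the caller fall back
        traceback.print_exc()
        return None

    url4result = f"https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
    headers = {
        "Authorization": f"Bearer {DASHSCOPE_API_KEY}"
    }
    # Poll for up to 60 rounds, 2 seconds apart (about 2 minutes in total)
    for _ in range(60):
        try:
            output = requests.get(url4result, headers=headers).json()["output"]
            if output["task_status"] == "SUCCEEDED":
                return output["results"][0]["url"]
            if output["task_status"] == "FAILED":
                # FAILED is a terminal status, so further polling is pointless
                return None
        except Exception:
            pass
        # Sleep after every round, including failed requests, to avoid hammering the API
        time.sleep(2)

    return None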
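
Both functions above follow the same submit-then-poll pattern, which can be factored out and exercised without touching the network. The sketch below is one possible shape for such a helper, not part of the DashScope SDK: `poll_until_done` and its `fetch_status` callback are hypothetical names, and treating "FAILED" as the only terminal failure status is a simplifying assumption.

```python
import time


def poll_until_done(fetch_status, max_attempts=60, delay=2.0, sleep=time.sleep):
    """Call fetch_status() until it reports a terminal state or attempts run out.

    fetch_status must return a (status, payload) tuple. The payload is returned
    on "SUCCEEDED"; None is returned on "FAILED" or when attempts are exhausted.
    """
    for _ in range(max_attempts):
        try:
            status, payload = fetch_status()
        except Exception:
            # Treat transient errors as "not ready yet" and keep polling
            status, payload = "ERROR", None
        if status == "SUCCEEDED":
            return payload
        if status == "FAILED":
            return None
        sleep(delay)
    return None


# Simulated task that succeeds on the third poll (no network involved)
responses = iter([
    ("PENDING", None),
    ("RUNNING", None),
    ("SUCCEEDED", "https://example.com/out.jpg"),
])
result = poll_until_done(lambda: next(responses), sleep=lambda _: None)
print(result)
```

Passing the sleep function as a parameter keeps the helper testable: the simulated run above finishes instantly instead of waiting several seconds.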

In [12]:
detector = dlib.get_frontal_face_detector()
# https://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
predictor = dlib.shape_predictor("./shape_predictor_68_face_landmarks.dat")

def face_swap_using_dlib(source_img_path: str, aim_img_path: str, save_img_path: str):
    face = cv2.imread(source_img_path)
    body = cv2.imread(aim_img_path)

    face_gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    body_gray = cv2.cvtColor(body, cv2.COLOR_BGR2GRAY)
    
    # Create empty matrices in the images' shapes
    height, width = face_gray.shape
    mask = np.zeros((height, width), np.uint8)
    
    height, width, channels = body.shape

    rect = detector(face_gray)[0]
    
    # This creates an object with 68 pairs of integer values: the (x, y) coordinates of the facial landmarks
    landmarks = predictor(face_gray, rect)
    landmarks_points = [] 
    
    def get_landmarks(landmarks, landmarks_points):
      for n in range(68):
          x = landmarks.part(n).x
          y = landmarks.part(n).y
          landmarks_points.append((x, y))
    
    get_landmarks(landmarks, landmarks_points)
    
    points = np.array(landmarks_points, np.int32)


    convexhull = cv2.convexHull(points) 
    face_cp = face.copy()
    face_image_1 = cv2.bitwise_and(face, face, mask=mask)


    rect = cv2.boundingRect(convexhull)
    
    subdiv = cv2.Subdiv2D(rect) # Creates an instance of Subdiv2D
    subdiv.insert(landmarks_points) # Insert points into subdiv
    triangles = subdiv.getTriangleList()
    triangles = np.array(triangles, dtype=np.int32)
    
    indexes_triangles = []
    face_cp = face.copy()
    
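    def get_index(arr):
        # np.where returns a tuple of index arrays; take the first match,
        # or None when the vertex is not one of the landmark points
        return arr[0][0] if arr[0].size else None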
    
    for triangle in triangles :
    
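        # Gets the vertices of the triangle as plain Python ints,
        # which is the point type cv2.line reliably accepts
        pt1 = (int(triangle[0]), int(triangle[1]))
        pt2 = (int(triangle[2]), int(triangle[3]))
        pt3 = (int(triangle[4]), int(triangle[5]))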
        
        # Draws a line for each side of the triangle
        cv2.line(face_cp, pt1, pt2, (255, 255, 255), 3,  0)
        cv2.line(face_cp, pt2, pt3, (255, 255, 255), 3,  0)
        cv2.line(face_cp, pt3, pt1, (255, 255, 255), 3,  0)
    
        index_pt1 = np.where((points == pt1).all(axis=1))
        index_pt1 = get_index(index_pt1)
        index_pt2 = np.where((points == pt2).all(axis=1))
        index_pt2 = get_index(index_pt2)
        index_pt3 = np.where((points == pt3).all(axis=1))
        index_pt3 = get_index(index_pt3)
    
        # Saves coordinates if the triangle exists and has 3 vertices
        if index_pt1 is not None and index_pt2 is not None and index_pt3 is not None:
            vertices = [index_pt1, index_pt2, index_pt3]
            indexes_triangles.append(vertices)

    # Get landmarks for the target face, onto which the source face will be swapped
    rect2 = detector(body_gray)[0]
    
    # This creates an object with 68 pairs of integer values: the (x, y) coordinates of the facial landmarks
    landmarks_2 = predictor(body_gray, rect2)
    landmarks_points2 = []
    
    # Uses the function declared previously to get a list of the landmark coordinates
    get_landmarks(landmarks_2, landmarks_points2)
    
    # Generates a convex hull for the second person
    points2 = np.array(landmarks_points2, np.int32)
    convexhull2 = cv2.convexHull(points2)
    
    body_cp = body.copy()


    lines_space_new_face = np.zeros((height, width, channels), np.uint8)
    body_new_face = np.zeros((height, width, channels), np.uint8)
    
    height, width = face_gray.shape
    lines_space_mask = np.zeros((height, width), np.uint8)
    
    
    for triangle in indexes_triangles:
    
        # Coordinates of the first person's delaunay triangles
        pt1 = landmarks_points[triangle[0]]
        pt2 = landmarks_points[triangle[1]]
        pt3 = landmarks_points[triangle[2]]
    
        # Gets the delaunay triangles
        (x, y, widht, height) = cv2.boundingRect(np.array([pt1, pt2, pt3], np.int32))
        cropped_triangle = face[y: y+height, x: x+widht]
        cropped_mask = np.zeros((height, widht), np.uint8)
    
        # Fills triangle to generate the mask
        points = np.array([[pt1[0]-x, pt1[1]-y], [pt2[0]-x, pt2[1]-y], [pt3[0]-x, pt3[1]-y]], np.int32)
        cv2.fillConvexPoly(cropped_mask, points, 255)
    
        # Draws lines for the triangles
        cv2.line(lines_space_mask, pt1, pt2, 255)
        cv2.line(lines_space_mask, pt2, pt3, 255)
        cv2.line(lines_space_mask, pt1, pt3, 255)
    
        lines_space = cv2.bitwise_and(face, face, mask=lines_space_mask)
    
        # Calculates the matching delaunay triangle on the second person's face
    
        # Coordinates of the second person's delaunay triangle (same landmark indexes)
        pt1 = landmarks_points2[triangle[0]]
        pt2 = landmarks_points2[triangle[1]]
        pt3 = landmarks_points2[triangle[2]]
    
        # Gets the delaunay triangles
        (x, y, widht, height) = cv2.boundingRect(np.array([pt1, pt2, pt3], np.int32))
        cropped_mask2 = np.zeros((height,widht), np.uint8)
    
        # Fills triangle to generate the mask
        points2 = np.array([[pt1[0]-x, pt1[1]-y], [pt2[0]-x, pt2[1]-y], [pt3[0]-x, pt3[1]-y]], np.int32)
        cv2.fillConvexPoly(cropped_mask2, points2, 255)
    
        # Deforms the triangles to fit the subject's face : https://docs.opencv.org/3.4/d4/d61/tutorial_warp_affine.html
        points =  np.float32(points)
        points2 = np.float32(points2)
        M = cv2.getAffineTransform(points, points2)  # Warps the content of the first triangle to fit in the second one
        dist_triangle = cv2.warpAffine(cropped_triangle, M, (widht, height))
        dist_triangle = cv2.bitwise_and(dist_triangle, dist_triangle, mask=cropped_mask2)
    
        # Joins all the distorted triangles to make the face mask to fit in the second person's features
        body_new_face_rect_area = body_new_face[y: y+height, x: x+widht]
        body_new_face_rect_area_gray = cv2.cvtColor(body_new_face_rect_area, cv2.COLOR_BGR2GRAY)
    
        # Creates a mask
        masked_triangle = cv2.threshold(body_new_face_rect_area_gray, 1, 255, cv2.THRESH_BINARY_INV)
        dist_triangle = cv2.bitwise_and(dist_triangle, dist_triangle, mask=masked_triangle[1])
    
        # Adds the piece to the face mask
        body_new_face_rect_area = cv2.add(body_new_face_rect_area, dist_triangle)
        body_new_face[y: y+height, x: x+widht] = body_new_face_rect_area

    body_face_mask = np.zeros_like(body_gray)
    body_head_mask = cv2.fillConvexPoly(body_face_mask, convexhull2, 255)
    body_face_mask = cv2.bitwise_not(body_head_mask)
    
    body_maskless = cv2.bitwise_and(body, body, mask=body_face_mask)
    result = cv2.add(body_maskless, body_new_face)

    # Gets the center of the face for the body
    (x, y, widht, height) = cv2.boundingRect(convexhull2)
    center_face2 = (int((x+x+widht)/2), int((y+y+height)/2))
    
    seamlessclone = cv2.seamlessClone(result, body, body_head_mask, center_face2, cv2.NORMAL_CLONE)    
    cv2.imwrite(save_img_path, seamlessclone)
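
The core of the swap is the per-triangle affine warp: each Delaunay triangle on the source face is mapped onto the triangle with the same landmark indexes on the target face. The sketch below solves for that 2x3 matrix in plain Python with Cramer's rule, to show what a call such as `cv2.getAffineTransform` computes from three point pairs. It illustrates the math only; the function names are hypothetical and the triangle coordinates are toy values.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def affine_from_triangles(src, dst):
    """Solve for the 2x3 affine matrix that maps the three src points onto dst."""
    base = [[float(x), float(y), 1.0] for x, y in src]
    det = det3(base)
    if det == 0:
        raise ValueError("source triangle is degenerate (collinear points)")
    rows = []
    for k in range(2):  # k = 0 solves the x' row, k = 1 solves the y' row
        target = [p[k] for p in dst]
        coeffs = []
        for col in range(3):
            # Cramer's rule: replace one column of the base matrix with the targets
            m = [r[:] for r in base]
            for r, t in zip(m, target):
                r[col] = t
            coeffs.append(det3(m) / det)
        rows.append(coeffs)
    return rows


def apply_affine(M, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


src_tri = [(0, 0), (1, 0), (0, 1)]
dst_tri = [(2, 3), (4, 3), (2, 7)]
M = affine_from_triangles(src_tri, dst_tri)
print(M)
print(apply_affine(M, (1, 1)))
```

Three non-collinear point pairs determine the six unknowns exactly, which is why the face is split into triangles rather than warped as a single region.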

In [13]:
os.makedirs("imgs", exist_ok=True)

In [22]:
# Placeholder image used whenever generation fails, so that every index still gets a file
FALLBACK_URL = "http://mirror.coggle.club/tianchi/532389/imgs/01309910e64f48caaffe5b3db8b344e3.jpg"
# Public mirror of the competition images, needed because the edit API takes an image URL
IMG_BASE_URL = "http://mirror.coggle.club/tianchi/532389/imgs/"

for _, row in task.iterrows():
    # Use row['index'] here: row.index would return the Series' own index object
    out_name = str(row['index']) + ".jpg"

    # Skip rows that already have an output, so an interrupted run can be resumed
    if os.path.exists("imgs/" + out_name):
        continue

    if row.task_type == "deepfake":
        try:
            face_swap_using_dlib(
                "./data/imgs/" + row['ori_image'],
                "./data/imgs/" + row['target_image'],
                "./imgs/" + out_name
            )
        except Exception:
            # For example, no face detected in one of the two images
            download_image(FALLBACK_URL, "./imgs/", out_name)
        continue

    if row.task_type == "t2i":
        url = convert_text_prompt_to_image(row.prompt)
    elif row.task_type == "tie":
        url = image_edit_by_prompt("stylization_all", row.prompt, IMG_BASE_URL + row.ori_image)
    elif row.task_type == "vttie":
        url = image_edit_by_prompt("description_edit", row.prompt, IMG_BASE_URL + row.ori_image)
    else:
        url = None

    # Fall back to the placeholder when the API returned no result URL
    download_image(url or FALLBACK_URL, "./imgs/", out_name)


In [23]:
!\rm -rf ./imgs/.ipynb_checkpoints/

In [31]:
task["gen_image"] = task["index"].astype(str) + ".jpg"

In [30]:
# The submission rules require utf-8-sig encoding for result.csv
task[["index", "gen_image"]].to_csv("result.csv", index=False, encoding="utf-8-sig")

In [32]:
!zip -r result.zip result.csv imgs/

  adding: result.csv (deflated 63%)
  adding: imgs/ (stored 0%)
  adding: imgs/63.jpg (deflated 0%)
  adding: imgs/823.jpg (deflated 0%)
  adding: imgs/189.jpg (deflated 0%)
  adding: imgs/77.jpg (deflated 0%)
  adding: imgs/837.jpg (deflated 1%)
  adding: imgs/638.jpg (deflated 0%)
  adding: imgs/604.jpg (deflated 0%)
  adding: imgs/162.jpg (deflated 0%)
  adding: imgs/176.jpg (deflated 0%)
  adding: imgs/88.jpg (deflated 0%)
  adding: imgs/610.jpg (deflated 0%)
  adding: imgs/348.jpg (deflated 0%)
  adding: imgs/360.jpg (deflated 0%)
  adding: imgs/406.jpg (deflated 0%)
  adding: imgs/412.jpg (deflated 0%)
  adding: imgs/374.jpg (deflated 0%)
  adding: imgs/599.jpg (deflated 0%)
  adding: imgs/228.jpg (deflated 0%)
  adding: imgs/214.jpg (deflated 0%)
  adding: imgs/572.jpg (deflated 0%)
  adding: imgs/566.jpg (deflated 0%)
  adding: imgs/200.jpg (deflated 0%)
  adding: imgs/957.jpg (deflated 1%)
  adding: imgs/943.jpg (deflated 1%)
  adding: imgs/994.jpg (deflated 1%)
  adding: imgs/758.jpg (deflated 1%)
  adding: imgs/980.jpg (deflated 0%)
  adding: imgs/770.jpg (

In [34]:
!ls imgs/ | wc -l

    1000
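
Counting files confirms the total but not that every index has its own image. As a complementary check, the sketch below lists missing `index.jpg` files and builds the archive with Python's `zipfile` module so the size limit can be verified in the same step. The helper names are hypothetical, the demo runs on throwaway files in a temporary directory, and reading "2G" as 2 GiB is an assumption.

```python
import os
import tempfile
import zipfile

MAX_ZIP_BYTES = 2 * 1024 ** 3  # assumption: the "2G" limit is read as 2 GiB


def missing_images(img_dir, indices):
    """Return the indices that have no matching index.jpg in img_dir."""
    return [i for i in indices
            if not os.path.isfile(os.path.join(img_dir, f"{i}.jpg"))]


def build_zip(zip_path, csv_path, img_dir):
    """Pack result.csv and imgs/*.jpg into zip_path and return the archive size."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, "result.csv")
        for name in sorted(os.listdir(img_dir)):
            if name.endswith(".jpg"):
                zf.write(os.path.join(img_dir, name), f"imgs/{name}")
    return os.path.getsize(zip_path)


# Demo on throwaway files: indices 1 and 2 exist, index 3 is missing
work = tempfile.mkdtemp()
img_dir = os.path.join(work, "imgs")
os.makedirs(img_dir)
for i in (1, 2):
    with open(os.path.join(img_dir, f"{i}.jpg"), "wb") as f:
        f.write(b"placeholder bytes, not a real JPEG")
csv_path = os.path.join(work, "result.csv")
with open(csv_path, "w", encoding="utf-8-sig") as f:
    f.write("index,gen_image\n1,1.jpg\n2,2.jpg\n3,3.jpg\n")

zip_path = os.path.join(work, "result.zip")
print(missing_images(img_dir, [1, 2, 3]))
size = build_zip(zip_path, csv_path, img_dir)
print(size < MAX_ZIP_BYTES)
```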
