# XDAC Inference Example

250623



In [None]:
## Install
!pip install captum

Collecting captum
  Downloading captum-0.8.0-py3-none-any.whl.metadata (26 kB)
Collecting numpy<2.0 (from captum)
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.10->captum)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.10->captum)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.10->captum)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.10->captum)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1

In [None]:
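## 0) Download the XDAC model and data from Hugging Face (~90 s)
import os

from huggingface_hub import snapshot_download

XDAC_root_path = './XDAC'

# Fetch the full repository snapshot into a local directory
snapshot_download(
    repo_id="keepsteady/XDAC",
    local_dir=XDAC_root_path,
    local_dir_use_symlinks=False  # deprecated and ignored by recent huggingface_hub releases
)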

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.


'/content/XDAC'
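
Before moving on, it can help to confirm that the snapshot contains the pieces the later cells rely on. The following is a minimal sketch, not part of XDAC: `missing_entries` is a hypothetical helper, and the entry names are simply the paths this notebook references (`XDAC-D`, `XDAC-A`, and the LGC JSON file).

```python
import os

def missing_entries(root, expected):
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in expected if not os.path.exists(os.path.join(root, rel))]

# Relative paths referenced elsewhere in this notebook
EXPECTED = ["XDAC-D", "XDAC-A", os.path.join("LGC_data", "LGC_data_v1.0.json")]

missing = missing_entries("./XDAC", EXPECTED)
print("missing:", missing if missing else "nothing")
```

An empty list means every expected entry was found; anything listed suggests the download was interrupted or the repository layout has changed.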

In [None]:
## 1) Load the Korean LGC dataset
import json
from pprint import pprint

from datasets import Dataset

path_data = os.path.join(XDAC_root_path, 'LGC_data', 'LGC_data_v1.0.json')

# Read the JSON list of records, then wrap it in a Hugging Face Dataset
with open(path_data, 'r', encoding='utf-8') as f:
    data_list = json.load(f)
dataset_LGC = Dataset.from_list(data_list)

print(dataset_LGC)
pprint(dataset_LGC[-1])  # show the last record

Dataset({
    features: ['idx', 'comment_language', 'llm_model_family', 'llm_model_selection', 'temperature', 'enhancing_comment_naturalness', 'sentiment', 'sentiment_subtype_selection', 'reference_augmented_generation_strategy', 'reference_comment_input', 'reference_news_input', 'reference_opinion_input', 'generated_comment', 'toxicity'],
    num_rows: 2000
})
{'comment_language': 'Korean',
 'enhancing_comment_naturalness': '',
 'generated_comment': '이런 망나니가 국회의원이라니 XXXX 정치는 끝났다. 국민들이 다 바보인가?',
 'idx': 6396,
 'llm_model_family': 'Claude',
 'llm_model_selection': 'claude-3-5-haiku-20241022',
 'reference_augmented_generation_strategy': 'Opinion-based Generation',
 'reference_comment_input': True,
 'reference_news_input': True,
 'reference_opinion_input': True,
 'sentiment': 'negative',
 'sentiment_subtype_selection': 'Promotion of Social Division: '
                                'Discrimination/Blame of Specific Groups',
 'temperature': 0.9,
 'toxicity': True}
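
A quick way to get a feel for the dataset is to tally records by field. The sketch below uses plain dictionaries with the same field names as the record printed above; the three sample records are made up for illustration and are not taken from the LGC data, and `tally` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical records that mimic the schema shown above (values are illustrative only)
sample_records = [
    {"llm_model_family": "Claude", "sentiment": "negative", "toxicity": True},
    {"llm_model_family": "Claude", "sentiment": "positive", "toxicity": False},
    {"llm_model_family": "Llama", "sentiment": "negative", "toxicity": False},
]

def tally(records, field):
    """Count how many records carry each value of `field`."""
    return Counter(rec[field] for rec in records)

family_counts = tally(sample_records, "llm_model_family")
# True counts as 1, so the mean of the flags is the toxic fraction
toxic_share = sum(rec["toxicity"] for rec in sample_records) / len(sample_records)

print(family_counts)
print(f"toxic share: {toxic_share:.2f}")
```

Iterating over a `datasets.Dataset` yields one dictionary per record, so `tally(dataset_LGC, 'llm_model_family')` should work on the real data as well.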


In [None]:
## 2) Load XDAC
from XDAC.XDAC_Unified import AIUnifiedEngine

print("XDAC Unified Engine: AI Detection & Attribution")
print("=" * 60)

device = 'cuda'  # set to 'cpu' if no GPU is available

# Initialize unified engine
unified_engine = AIUnifiedEngine(
    detection_model_path=os.path.join(XDAC_root_path, 'XDAC-D'),    # Path to XDAC-D model
    attribution_model_path=os.path.join(XDAC_root_path, 'XDAC-A'),  # Path to XDAC-A model
    device=device,                      # or 'cpu', or None for auto-detection
    xai_enabled=True                    # Enable XAI analysis
)

XDAC Unified Engine: AI Detection & Attribution
XDAC Unified Engine: Detection & Attribution
Initializing Detection Engine (XDAC-D)...
Loading XDAC-D model from: ./XDAC/XDAC-D
Using device: cuda
Model loaded successfully!

Initializing Attribution Engine (XDAC-A)...
Loading XDAC-A model from: ./XDAC/XDAC-A
Using device: cuda
Model loaded successfully!

Unified Engine initialized successfully!
Device: cuda
XAI enabled: True
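
The cell above sets `device = 'cuda'` by hand. A minimal sketch of choosing the device at run time follows, assuming PyTorch is the backend (the install log shows captum pulling in `torch`). `pick_device` is a hypothetical helper and not part of XDAC.

```python
def pick_device(prefer="cuda"):
    """Return 'cuda' when it is requested and available, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    if prefer == "cuda" and torch.cuda.is_available():
        return "cuda"
    return "cpu"

device = pick_device()
print("Selected device:", device)
```

The result can then be passed as `device=device` when constructing `AIUnifiedEngine`.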


In [None]:
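## 3) Run XDAC
# Sample Korean comments; the English glosses in the comments are approximate
multiple_texts = [
    # "You want to move Seoul National University too? Just how greedy are you going to get?"
    '서울대도 옮기고 싶냐? 대체 어디까지 욕심 부릴 거냐?',
    # "12 trillion in support for 17 trillion in exports! Practically a giveaway, no?! Still, national pride is swelling"
    '17조 수출에 12조 지원이라니! 🤔 거의 뭐 퍼주는 수준 아닌가?! 그래도 국뽕 차오르네 🤣',
    # "Aren't the Sejong speculators getting worked up again? So annoying."
    '세종 투기꾼들 또 설레발 치는 거 아냐? 진짜 짜증난다.',
    # "It is so obvious they are trying to win votes with pledges like this."
    '이런 공약으로 표 얻으려는 게 너무 뻔히 보인다.',
    # "Former official Kim's resignation was the right decision. I hope the truth about the child's school-violence case comes out soon."
    '김 전 XX관의 사퇴는 옳은 결정이었어요. 자녀의 학교폭력 문제에 대한 진상이 조만간 밝혀지길 바라요.',
    # "Do it properly, properly, huh?? Do a good job, come on!!!"
    '똑바로 좀 해라 똑바로 어??   잘 해봐 좀!!!',
    # "Damn lol zzzzzzz, what a joke"
    '염병 ㅋㅋㅋㅋzzzzzzz   놀고 앉았네',
    # "Does this really make sense? Isn't this rigged???" (misspelled in the original)
    '이게 참말로 말이됀다구? 이거 조작 아냐???',
    # Nonsense humming (onomatopoeia)
    '빠람빠빠밤 빠라빠라빰 빠라라라라~~~',
    # "Wow, seriously unbelievable???? Isn't this rigged??" (contains embedded newlines)
    '와나 \n\n\n진짜 어이털리네???? 이거 조작 아냐??',
]

# Unified inference: detection first, then attribution for samples flagged as LLM-generated
results = unified_engine.predict(multiple_texts, batch_size=10)
unified_engine.print_results(results, save_path='result_XDAC.txt')
unified_engine.save_xai_results_to_html(results, html_file_path='result_XDAC.html')

# Get top predictions for detailed analysis
top_results = unified_engine.get_top_predictions(results, top_k=3)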

=== Stage 1: AI Detection (10 samples) ===


XAI predicting: 100%|██████████| 10/10 [00:06<00:00,  1.53it/s]

=== Stage 2: AI Attribution (5 LLM samples) ===

AI Unified Detection & Attribution Results (10 samples) 1.5/s | 6.60s
  1. 서울대도 옮기고 싶냐? 대체 어디까지 욕심 부릴 거냐?                     -> LLM   ( 99.6%) | Llama ( 88.4%) | llama-3-Korean-Bllossom-8B ( 65.0%)
  => XAI Important: ['옮기', '대체', '욕심']
--------------------------------------------------------------------------------
  2. 17조 수출에 12조 지원이라니! 🤔 거의 뭐 퍼주는 수준 아닌가?! 그래도 국뽕 차오르네... -> LLM   (100.0%) | Gemini (100.0%) | gemini-2.0-flash-001 (100.0%)
  => XAI Important: ['수출', '지원이라니! ', ' 거의', '아닌가?! ', '국뽕']
--------------------------------------------------------------------------------
  3. 세종 투기꾼들 또 설레발 치는 거 아냐? 진짜 짜증난다.                    -> LLM   (100.0%) | Gemini (100.0%) | gemini-2.0-flash-001 (100.0%)
  => XAI Important: ['아냐?', '짜증난다.']
--------------------------------------------------------------------------------
  4. 이런 공약으로 표 얻으려는 게 너무 뻔히 보인다.                        -> LLM   (100.0%) | HCX ( 96.6%) | HCX-003 ( 96.5%)
  => XAI Impo
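
For downstream analysis, the printed report can be turned back into structured records. The sketch below parses lines in the layout shown above. It assumes that `result_XDAC.txt` mirrors this console layout and that samples skipped by the attribution stage simply omit the `| family | model` part; neither assumption is confirmed by the source. `parse_result_line` and `parse_xai_line` are hypothetical helpers, not XDAC functions.

```python
import ast
import re

# One result line: index, text, detection label/score, optional family and model scores
RESULT_RE = re.compile(
    r"^\s*(?P<idx>\d+)\.\s+(?P<text>.*?)\s+->\s+"
    r"(?P<label>\w+)\s+\(\s*(?P<conf>[\d.]+)%\)"
    r"(?:\s*\|\s*(?P<family>\S+)\s+\(\s*(?P<family_conf>[\d.]+)%\)"
    r"\s*\|\s*(?P<model>\S+)\s+\(\s*(?P<model_conf>[\d.]+)%\))?\s*$"
)

def parse_result_line(line):
    """Parse one numbered result line into a dict, or return None if it does not match."""
    m = RESULT_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["idx"] = int(rec["idx"])
    for key in ("conf", "family_conf", "model_conf"):
        if rec[key] is not None:
            rec[key] = float(rec[key])  # percentages as floats, e.g. 99.6
    return rec

def parse_xai_line(line):
    """Parse an '=> XAI Important: [...]' line into a list of tokens, or return None."""
    prefix = "=> XAI Important:"
    stripped = line.strip()
    if not stripped.startswith(prefix):
        return None
    # The list is printed as a Python literal, so literal_eval can read it safely
    return ast.literal_eval(stripped[len(prefix):].strip())

# The first sample from the output above
line = "  1. 서울대도 옮기고 싶냐? 대체 어디까지 욕심 부릴 거냐?                     -> LLM   ( 99.6%) | Llama ( 88.4%) | llama-3-Korean-Bllossom-8B ( 65.0%)"
xai = "  => XAI Important: ['옮기', '대체', '욕심']"

record = parse_result_line(line)
tokens = parse_xai_line(xai)
print(record)
print(tokens)
```

Separator lines and progress bars do not match the pattern, so both helpers return `None` for them and they can be filtered out when reading the file line by line.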


