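# 2025 DATA·AI Analysis Competition - Paper and Data Recommendation Agent

This notebook runs inference for a research data and paper recommendation system built on Qwen3-14B.

**Runtime environment**
- GPU: NVIDIA RTX 3080 or better (14GB VRAM with INT8 quantization)
- CUDA: 11.8+
- Python: 3.10+

## Inference

### 1. Environment Setup and Library Imports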

In [1]:
import sys
import os
import logging

# TODO: set the project root path explicitly for your environment
# This notebook lives in paper-reco-agent/notebooks/
project_root = '/home/infidea/backup-data/paper-reco-agent'

# Add the project root to sys.path
sys.path.insert(0, project_root)

print(f"Project root: {project_root}")

# Logging setup (so logs are visible in the Jupyter cell)
# Create the log directory
os.makedirs(os.path.join(project_root, 'logs'), exist_ok=True)

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(sys.stdout),  # print to the Jupyter cell
        logging.FileHandler(os.path.join(project_root, 'logs', 'app.log'))  # also write to a file
    ],
    force=True  # override any existing logging configuration
)

print("✅ Logging configured (console + file)")

# Required library imports
import asyncio
import json
from dotenv import load_dotenv

# Load environment variables
env_path = os.path.join(project_root, '.env')
load_dotenv(env_path)
print("✅ Environment variables loaded")

Project root: /home/infidea/backup-data/paper-reco-agent
✅ Logging configured (console + file)
✅ Environment variables loaded


### 2. GPU and CUDA Check

In [2]:
import torch

# Check the GPU
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"Available GPUs: {torch.cuda.device_count()}")
    print(f"Current GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("⚠️  No GPU available. Running in CPU mode or DEV_MODE.")

PyTorch version: 2.8.0+cu128
CUDA available: True
CUDA version: 12.8
Available GPUs: 1
Current GPU: NVIDIA H100 80GB HBM3
GPU memory: 79.1 GB


### 3. Initialize the Recommendation Agent

**Important**: This step loads the Qwen3-14B model in FP16.
- FP16: requires ~28GB VRAM
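
The VRAM figures quoted in this notebook can be sanity-checked with simple arithmetic: parameter count times bytes per parameter. The sketch below uses the 14.8B parameter count reported by the model info output and covers weights only. Activations, the KV cache, and the embedding model need additional memory, so treat the result as a lower bound. The helper name `weight_memory_gib` is hypothetical.

```python
# Rough weight-only memory estimate (a lower bound, not a measurement).
BYTES_PER_PARAM = {"float16": 2, "int8": 1}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


N_PARAMS = 14.8e9  # parameter count reported by get_model_info()

for dtype in ("float16", "int8"):
    print(f"{dtype}: {weight_memory_gib(N_PARAMS, dtype):.1f} GiB")
```

This gives roughly 27.6 GiB for FP16 and 13.8 GiB for INT8, in line with the ~28GB and 14GB figures stated above.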

In [3]:
# Import and initialize the recommendation agent
from src.agents.recommendation_agent import KoreanResearchRecommendationAgent
from src.config.settings import settings

print("Model settings:")
print(f"  - Model name: {settings.MODEL_NAME}")
print(f"  - Embedding model: {settings.EMBEDDING_MODEL}")
print(f"  - Dev mode: {settings.DEV_MODE}")
print("\n🚀 Initializing agent... (this may take several minutes)")

agent = KoreanResearchRecommendationAgent()

print("\n✅ Agent initialization complete")
print(f"Model info: {json.dumps(agent.llm_model.get_model_info(), indent=2, ensure_ascii=False)}")

2025-10-14 09:24:26,444 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cuda:0
2025-10-14 09:24:26,446 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: intfloat/multilingual-e5-large
Model settings:
  - Model name: Qwen/Qwen3-14B
  - Embedding model: intfloat/multilingual-e5-large
  - Dev mode: False

🚀 Initializing agent... (this may take several minutes)
2025-10-14 09:24:36,682 - src.agents.recommendation_agent - INFO - 🚀 Running in production mode: using the real Qwen model
2025-10-14 09:24:36,687 - src.models.qwen_model - INFO - 🚀 Loading Qwen model: Qwen/Qwen3-14B
2025-10-14 09:24:36,688 - src.models.qwen_model - INFO -    - Device: cuda
2025-10-14 09:24:37,414 - src.models.qwen_model - INFO -    - FP16 mode (~28GB VRAM)

`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 100%|██████████| 8/8 [00:14<00:00,  1.83s/it]

2025-10-14 09:25:25,573 - src.models.qwen_model - INFO - ✅ Qwen model loaded

✅ Agent initialization complete
Model info: {
  "model_name": "Qwen/Qwen3-14B",
  "device": "cuda",
  "dtype": "float16",
  "max_tokens": 512,
  "temperature": 0.1,
  "parameters": "14.8B",
  "context_length": "32K (extendable to 128K)"
}


### 4. Set the Test Dataset ID

Enter the ID of an actual dataset registered in DataON.
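
Before calling the API, it can help to catch copy-paste mistakes in the ID. Every dataset ID used in this notebook is a 32-character lowercase hexadecimal string, so the sketch below checks for that shape. This format is an assumption inferred from the examples here, not a documented DataON rule, and `looks_like_dataset_id` is a hypothetical helper.

```python
import re

# Assumption: DataON dataset IDs are 32 lowercase hex characters,
# as in every example ID used in this notebook.
_HEX32 = re.compile(r"[0-9a-f]{32}")


def looks_like_dataset_id(value: str) -> bool:
    """Return True if value has the shape of the IDs seen in this notebook."""
    return _HEX32.fullmatch(value) is not None


print(looks_like_dataset_id("c7dc77b406795dcc332dcc733efb2261"))
print(looks_like_dataset_id("c7dc77b4-0679"))
```

A failed check only means the ID looks unusual; the DataON API remains the authority on whether a dataset exists.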

In [4]:
# Dataset ID used for testing
# Example: enter an actual dataset ID from KISTI DataON
test_dataset_id = "c7dc77b406795dcc332dcc733efb2261"  # TODO: replace with a real dataset ID

print(f"Test dataset ID: {test_dataset_id}")

Test dataset ID: c7dc77b406795dcc332dcc733efb2261


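### 5. Run Inference

The agent performs the following steps:
1. Fetch the source dataset's metadata (DataON API)
2. Generate search queries with the LLM
3. Collect candidates (DataON + ScienceON API)
4. Compute hybrid similarity (E5 + BM25)
5. Generate the final recommendations with the LLM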
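
Step 4 combines a dense signal (E5 embedding similarity) with a sparse lexical one (BM25). The notebook does not show how the agent fuses the two, so the following is only a sketch under assumptions: the min-max normalization, the weight `alpha=0.7`, the function names, and the toy 2-D vectors standing in for E5 embeddings are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def min_max(values):
    """Rescale values to [0, 1]; all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(query_vec, doc_vecs, query_tokens, docs_tokens, alpha=0.7):
    """Weighted sum of dense cosine similarity and normalized BM25."""
    dense = [cosine(query_vec, vec) for vec in doc_vecs]
    sparse = min_max(bm25_scores(query_tokens, docs_tokens))
    return [alpha * d + (1 - alpha) * s for d, s in zip(dense, sparse)]


# Toy data: the 2-D vectors stand in for real E5 embeddings.
query_tokens = ["plasma", "metabolomics"]
docs_tokens = [
    ["plasma", "metabolomics", "nmr"],
    ["web", "map", "service"],
    ["metabolomics", "cancer"],
]
query_vec = [1.0, 0.0]
doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.6, 0.4]]

scores = hybrid_scores(query_vec, doc_vecs, query_tokens, docs_tokens)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Normalizing BM25 before mixing keeps its unbounded scores from swamping the cosine term. A rank-based fusion would be another reasonable choice.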

In [5]:
# Run inference (async)
import time

start_time = time.time()
print("🔍 Starting recommendation...\n")

# Top-level await works in Jupyter, which already runs an event loop
result = await agent.recommend(test_dataset_id)

elapsed_time = time.time() - start_time
print(f"\n✅ Recommendation complete! (elapsed: {elapsed_time:.2f}s)")

🔍 Starting recommendation...

2025-10-14 09:25:25,588 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID c7dc77b406795dcc332dcc733efb2261
2025-10-14 09:25:25,710 - httpx - INFO - HTTP Request: GET https://dataon.kisti.re.kr/rest/api/search/dataset/c7dc77b406795dcc332dcc733efb2261?key=4936BC43D48603524DEDA2E2D56D6B46 "HTTP/1.1 200 200"
2025-10-14 09:25:25,712 - src.clients.dataon_client - INFO - Successfully retrieved metadata for dataset c7dc77b406795dcc332dcc733efb2261
2025-10-14 09:25:25,713 - src.clients.dataon_client - INFO - API Response:
{
  "response": {
    "elapsed time": "34 ms",
    "status": "200",
    "message": "OK",
    "total count": "1",
    "type": "json"
  },
  "records": {
    "svc_id": "c7dc77b406795dcc332dcc733efb2261",
    "ctlg_type": "02",
    "dataset_type": "01",
    "ctlg_type_pc": "dataset",
    "dataset_type_pc": "국내",
    "dataset_pub_dt_pc": "2025",
    "dataset_access_type_pc": "공개",
    "file_yn_pc": "랜딩페이지이동",
    "dataset_cc_license_pc": "none",
    "datas

2025-10-14 09:25:40,370 - src.agents.recommendation_agent - INFO - Ranked top 15 candidates
2025-10-14 09:25:40,371 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:25:47,576 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 5,
      "reason": "Shares common terms like '대사체', '혈장', and focuses on metabolomics research similar to the source dataset.",
      "level": "참고"
    },
    {
      "candidate_number": 1,
      "reason": "Involves targeted metabolomics and shares keywords such as '연구', '분석', and '대사체', aligning with the source's focus on metabolomics.",
      "level": "참고"
    },
    {
      "candidate_number": 3,
      "reason": "Utilizes metabolomics techniques and discusses disease-related metabolic pathways, which is relevant to the source’s investigation of environmental pollutants’ effects.",
      "level": "참고"
    },
    {
      "candidate_number": 4,
      "reason": "Focuses on metabolomics and explores metabolic mechanisms related to diseases, similar to the source’s approach.",
      "level": "참고"
    },
    {
      "candidate_number": 6,
      "reason": "Di

### 6. Review the Results

In [7]:
# Check for errors
if 'error' in result:
    print(f"❌ Error: {result['error']}")
else:
    print("=" * 80)
    print("📊 Recommendation summary")
    print("=" * 80)
    print("\nSource dataset:")
    print(f"  ID: {result['source_dataset']['id']}")
    print(f"  Title: {result['source_dataset']['title']}")
    print(f"  Keywords: {', '.join(result['source_dataset']['keywords'])}")

    print(f"\nRecommendations: {len(result['recommendations'])}")
    print(f"Candidates analyzed: {result['candidates_analyzed']}")
    print(f"Processing time: {result['processing_time_ms']}ms")

    print("\nModel info:")
    for key, value in result['model_info'].items():
        print(f"  {key}: {value}")

    print("\n" + "=" * 80)

    # Print the recommendation list in detail
    print("📝 Recommendations")
    print("=" * 80)
    if 'recommendations' in result:
        for rec in result['recommendations']:
            print(f"\n[Rank {rec['rank']}] {rec['title']}")
            print(f"Score: {rec['score']:.3f}")
            print(f"Type: {rec['type'].upper()}")
            print(f"Level: {rec['level']}")
            print(f"Reason: {rec['reason']}")
            print(f"URL: {rec['url']}")
            print("-" * 80)

📊 Recommendation summary

Source dataset:
  ID: c7dc77b406795dcc332dcc733efb2261
  Title: 지속성 유기 오염물질 노출에 대한 인간 혈장의 NMR 기반 대사체 분석
  Keywords: K-BDS, 대사체, 표적 대사체학

Recommendations: 5
Candidates analyzed: 15
Processing time: 21992ms

Model info:
  model_name: Qwen/Qwen3-14B
  device: cuda
  dtype: float16
  max_tokens: 512
  temperature: 0.1
  parameters: 14.8B
  context_length: 32K (extendable to 128K)

📝 Recommendations

[Rank 1] Mass spectrometry-based metabolomics to discover candidate biomarkers for alopecia and cancer
Score: 0.518
Type: PAPER
Level: 참고
Reason: Shares common terms like '대사체', '혈장', and focuses on metabolomics research similar to the source dataset.
URL: http://click.ndsl.kr/servlet/OpenAPIDetailView?keyValue=05787966&target=DIKO&cn=DIKO0015913085
--------------------------------------------------------------------------------

[Rank 2] 표적 대사체 분석 및 비표적 대사체학을 통한 암의 기저 메커니즘 연구
Score: 0.578
Type: PAPER
Level: 참고
Reason: Involves targeted metabolomics and shares keywords such as '연구', '분석', and '대사체', aligning with the source's focus on metabolomics.
URL: http://click.ndsl.kr/

### 7. Save Results to a JSON File (Optional)

In [8]:
# Save the result to a JSON file
from datetime import datetime

output_dir = os.path.join(project_root, 'data', 'inference_results')
os.makedirs(output_dir, exist_ok=True)

# Build a timestamp (YYYYMMDDHHMM)
timestamp = datetime.now().strftime("%Y%m%d%H%M")
output_file = os.path.join(output_dir, f"single_result_{timestamp}.json")

with open(output_file, 'w', encoding='utf-8') as f:
    json.dump(result, f, ensure_ascii=False, indent=2)

print(f"✅ Result saved: {output_file}")

✅ Result saved: /home/infidea/backup-data/paper-reco-agent/data/inference_results/single_result_202510140928.json


### 8. Batch Inference

You can run batch inference over multiple datasets.
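
The batch cell in this section hands every dataset to `asyncio.gather` at once. With a single GPU-resident model and external APIs involved, it may be worth capping how many requests are in flight. A minimal sketch of that idea using `asyncio.Semaphore` follows. `gather_bounded`, the limit of 2, and `fake_recommend` (a stand-in for `agent.recommend`) are hypothetical, and the notebook does not establish whether the agent benefits from such a cap.

```python
import asyncio


async def gather_bounded(worker, items, max_concurrency=2):
    """Run worker(item) for every item, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


# Counters used only to observe how many workers run at the same time.
active = 0
peak = 0


async def fake_recommend(dataset_id):
    """Hypothetical stand-in for agent.recommend(dataset_id)."""
    global active, peak
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)  # simulate I/O or model latency
    active -= 1
    return {"dataset_id": dataset_id, "recommendations": []}


ids = ["a", "b", "c", "d", "e"]
results = asyncio.run(gather_bounded(fake_recommend, ids))
print(len(results), peak)
```

Inside Jupyter, which already runs an event loop, call `await gather_bounded(...)` directly instead of `asyncio.run(...)`.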

In [10]:
# Batch inference over multiple dataset IDs (run concurrently)
import asyncio
import time
from datetime import datetime

test_dataset_ids = [
    "a27774ddf0c702847a996cee9d660ba4",
    "c94e17ab632d04afe17beb9dbdc3496f",
    "a4baf597d993e908bc333cba31d4b458",
    "eb587504cc55f00372e05a6d2abb4dca",
    "07b3b3d6f6245f4fc51436edf3957a95",
    "c7dc77b406795dcc332dcc733efb2261"
]

print(f"📦 Starting batch inference: {len(test_dataset_ids)} datasets, processed concurrently\n")
batch_start_time = time.time()

# Batch inference helper
async def process_single_dataset(dataset_id):
    """Run inference for a single dataset."""
    try:
        print(f"Processing: {dataset_id}")
        result = await agent.recommend(dataset_id)
        print(f"✅ Done: {dataset_id} - {len(result.get('recommendations', []))} recommendations")
        return {
            'dataset_id': dataset_id,
            'success': 'error' not in result,
            'result': result
        }
    except Exception as e:
        print(f"❌ Failed: {dataset_id} - {e}")
        return {
            'dataset_id': dataset_id,
            'success': False,
            'error': str(e)
        }

# Run the batch concurrently
batch_results = await asyncio.gather(*[process_single_dataset(dataset_id) for dataset_id in test_dataset_ids])

batch_elapsed_time = time.time() - batch_start_time
print(f"\n⏱️  Total batch inference time: {batch_elapsed_time:.2f}s")
print(f"📊 Average time: {batch_elapsed_time / len(test_dataset_ids):.2f}s per dataset")

# Save the batch results (with timestamp)
# Recreate output_dir so this cell also works when the optional save cell in section 7 was skipped
output_dir = os.path.join(project_root, 'data', 'inference_results')
os.makedirs(output_dir, exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d%H%M")
batch_output_file = os.path.join(output_dir, f'batch_results_{timestamp}.json')
with open(batch_output_file, 'w', encoding='utf-8') as f:
    json.dump(batch_results, f, ensure_ascii=False, indent=2)

print(f"\n✅ Batch results saved: {batch_output_file}")

📦 Starting batch inference: 6 datasets, processed concurrently

Processing: a27774ddf0c702847a996cee9d660ba4
2025-10-14 09:30:45,003 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID a27774ddf0c702847a996cee9d660ba4
Processing: c94e17ab632d04afe17beb9dbdc3496f
2025-10-14 09:30:45,019 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID c94e17ab632d04afe17beb9dbdc3496f
Processing: a4baf597d993e908bc333cba31d4b458
2025-10-14 09:30:45,029 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID a4baf597d993e908bc333cba31d4b458
Processing: eb587504cc55f00372e05a6d2abb4dca
2025-10-14 09:30:45,040 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID eb587504cc55f00372e05a6d2abb4dca
Processing: 07b3b3d6f6245f4fc51436edf3957a95
2025-10-14 09:30:45,050 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID 07b3b3d6f6245f4fc51436edf3957a95
Processing: c7dc77b406795dcc332dcc733efb2261
2025-10-14 09:30:45,061 - src.agents.recommendation_agent - INFO - Recommendation process started: dataset ID c7dc77b406795dcc332dcc733efb2261
2025-10-14 09:30:45

2025-10-14 09:30:58,155 - src.agents.recommendation_agent - INFO - Ranked top 4 candidates
2025-10-14 09:30:58,156 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:31:01,740 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 1,
      "reason": "High semantic similarity with common keyword '좌표계' related to mapping services",
      "level": "참고"
    },
    {
      "candidate_number": 2,
      "reason": "Shared term 'wms' indicates potential relevance to web map service frameworks",
      "level": "참고"
    },
    {
      "candidate_number": 3,
      "reason": "Low similarity but relates to digital technologies in creative fields",
      "level": "참고"
    },
    {
      "candidate_number": 4,
      "reason": "Low similarity but discusses digital content management systems",
      "level": "참고"
    }
  ]
}
2025-10-14 09:31:01,743 - src.agents.recommendation_agent - INFO - ✅ JSON parsed successfully
2025-10-14 09:31:01,744 - src.agents.recommendation_agent - INFO - Parsed type: <class 'dict'>, keys: dict_keys(['recommendations'])
2025-10-14 09:31:01,745 - src.agents.recommendation_agent - INFO - recomm

2025-10-14 09:31:02,081 - src.agents.recommendation_agent - INFO - Ranked top 14 candidates
2025-10-14 09:31:02,082 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:31:09,390 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 12,
      "reason": "Shares regional focus on East Asian horn-shaped cup culture and its spread across the Korean Peninsula and Japanese archipelago",
      "level": "참고"
    },
    {
      "candidate_number": 11,
      "reason": "Focuses on pottery production and cultural exchange during the Hanseong period in Baekje, relevant to archaeological studies of ancient Korea",
      "level": "참고"
    },
    {
      "candidate_number": 10,
      "reason": "Examines early iron age artifacts on the Korean Peninsula, aligning with archaeological interest in technological development",
      "level": "참고"
    },
    {
      "candidate_number": 6,
      "reason": "Analyzes Japanese scholar's research on Korean mythology under colonial context, relevant to historical and cultural studies",
      "level": "참고"
    },
    {
      "candidate_number": 9,
      "reason":

2025-10-14 09:31:16,164 - src.agents.recommendation_agent - INFO - Ranked top 15 candidates
2025-10-14 09:31:16,165 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:31:23,589 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 5,
      "reason": "Shares common terms like '대사체', '혈장', and focuses on metabolomics research similar to the source dataset.",
      "level": "참고"
    },
    {
      "candidate_number": 1,
      "reason": "Involves targeted metabolomics and shares keywords such as '연구', '분석', and '대사체', aligning with the source's focus on metabolomic profiling.",
      "level": "참고"
    },
    {
      "candidate_number": 3,
      "reason": "Utilizes metabolomics techniques and discusses disease-related metabolic pathways, which parallels the source's investigation into environmental pollutants and metabolic associations.",
      "level": "참고"
    },
    {
      "candidate_number": 4,
      "reason": "Focuses on metabolomics and explores metabolic mechanisms related to diseases, similar to the source's approach in analyzing metabolic profiles linked to environmental fact

2025-10-14 09:31:23,915 - src.agents.recommendation_agent - INFO - Ranked top 5 candidates
2025-10-14 09:31:23,916 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:31:28,634 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 1,
      "reason": "High semantic similarity with the topic of biological classification",
      "level": "참고"
    },
    {
      "candidate_number": 2,
      "reason": "Covers folk biological classification, which relates to broader classification concepts",
      "level": "참고"
    },
    {
      "candidate_number": 3,
      "reason": "Focuses on biological classification and organizing organisms, aligning closely with the source dataset's theme",
      "level": "참고"
    },
    {
      "candidate_number": 4,
      "reason": "Similar content to candidate 3, discussing biological classification and its organization",
      "level": "참고"
    },
    {
      "candidate_number": 5,
      "reason": "Explores the meaning and importance of biological classification, relevant to the source dataset’s focus",
      "level": "참고"
    }
  ]
}
2025-10-14 09:31:28,636 

2025-10-14 09:31:29,069 - src.agents.recommendation_agent - INFO - Ranked top 15 candidates
2025-10-14 09:31:29,071 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 1/2

2025-10-14 09:31:29,404 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": []
}
2025-10-14 09:31:29,405 - src.agents.recommendation_agent - INFO - ✅ JSON parsed successfully
2025-10-14 09:31:29,405 - src.agents.recommendation_agent - INFO - Parsed type: <class 'dict'>, keys: dict_keys(['recommendations'])
2025-10-14 09:31:29,406 - src.agents.recommendation_agent - INFO - Found 'recommendations' key, 0 items
{"recommendations": []}...
2025-10-14 09:31:29,408 - src.agents.recommendation_agent - INFO - Retrying with a simplified prompt
2025-10-14 09:31:29,409 - src.agents.recommendation_agent - INFO - LLM recommendation generation attempt 2/2
2025-10-14 09:31:34,620 - src.agents.recommendation_agent - INFO - Extracted JSON:
{
  "recommendations": [
    {
      "candidate_number": 1,
      "reason": "Related to health data and interdisciplinary approaches",
      "level": "참고"
    },
    {
      "candidate_number": 2,
      "reason": "Involves health sciences education and participant retention",
      "level": "참고"
    },
    {
      "candidat

### 9. Clean Up Resources (Optional)

In [None]:
# Free GPU memory
if hasattr(agent, 'llm_model') and agent.llm_model:
    agent.llm_model.cleanup()
    print("✅ Model resources released")

if torch.cuda.is_available():
    torch.cuda.empty_cache()
    print("✅ GPU memory cache cleared")