# 🏭 CoT Factory Runner (Colab Edition)

This notebook is designed to batch-process hundreds of papers using a **Chain-of-Thought (CoT)** architecture.

**Features:**
- ☁️ **Groq API** for cloud-based reasoning.
- 🖼️ **Visual Integration:** automatically reads `visual_analysis.txt` (chart/table data) when it is present.
- ⚡ **Parallel Execution:** You can open several copies of this notebook and process different index ranges at the same time (e.g. 0-50 and 50-100; the end index is exclusive).
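
The batch bounds are applied later as a Python slice, so ranges handed to parallel copies should be half-open and share their boundary value. Otherwise one file per boundary is skipped. The helper below is a minimal sketch of one way to compute such ranges; `split_ranges` is a hypothetical name and not part of the project.

```python
def split_ranges(total, workers):
    """Split range(total) into `workers` contiguous half-open (start, end) pairs."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional item each.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


print(split_ranges(100, 2))  # [(0, 50), (50, 100)]
```

Each pair can be copied into `BATCH_START_INDEX` and `BATCH_END_INDEX` of one notebook copy.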

## 1️⃣ Setup and Connection

In [None]:
# 1. Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# 2. Install libraries
!pip install groq pyyaml -q

import os
import sys
import glob
import time
import json
from google.colab import userdata

# 3. Define the project path (check your own Drive path!)
PROJECT_ROOT = "/content/drive/MyDrive/AKU_WorkSpace/survey_fdgit/OISAC_PRISMA_COMST"
sys.path.append(PROJECT_ROOT)

# 4. Import the core module
try:
    from analysis.cot_laboratory.core.assembler import CoTAssembler
    print("✅ CoT system loaded successfully!")
except ImportError as e:
    print(f"❌ ERROR: The project path may be wrong. Please check the 'PROJECT_ROOT' variable.\nDetails: {e}")
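
When the import fails, it helps to know which part of the project tree is missing. The sketch below checks for the two directories this notebook relies on (the `core` package it imports and the `data/processed_markdowns` folder it scans). `missing_project_parts` is a hypothetical helper, and the demo runs against a temporary directory rather than Drive.

```python
import os
import tempfile

# Sub-paths taken from the import statement and the data folder used in this notebook.
REQUIRED_DIRS = ("analysis/cot_laboratory/core", "data/processed_markdowns")


def missing_project_parts(project_root, required=REQUIRED_DIRS):
    """Return the required sub-paths that do not exist under project_root."""
    return [
        rel for rel in required
        if not os.path.isdir(os.path.join(project_root, rel))
    ]


# Demo on a throwaway directory that only contains the core package folder.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "analysis/cot_laboratory/core"))
    print(missing_project_parts(root))  # ['data/processed_markdowns']
```

In the notebook itself, calling `missing_project_parts(PROJECT_ROOT)` before the import would list whatever is absent.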

## 2️⃣ API Key

In [None]:
# Make sure you have added GROQ_API_KEY under 'Secrets' (the key icon) in the left-hand menu.
try:
    os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')
    print("✅ API key loaded.")
except Exception:
    # Fallback: manual entry
    print("⚠️ Key not found in Secrets, please enter it manually:")
    os.environ["GROQ_API_KEY"] = input("Groq API Key (gsk_...): ")

## 3️⃣ Factory Settings (Batch Config)

In [None]:
# Select the recipe to run
RECIPE_PATH = os.path.join(PROJECT_ROOT, "analysis/cot_laboratory/recipes/experiment_v1_full_analysis.yaml")

# ------------------------------------------------------------------
# ⚡ PARALLEL EXECUTION SETTINGS
# Set START=0, END=50 in one notebook and
# START=50, END=100 in another to roughly double throughput.
# END is exclusive because the list is sliced as [START:END].
# ------------------------------------------------------------------
BATCH_START_INDEX = 0   # Start index (inclusive)
BATCH_END_INDEX = 200   # End index (exclusive; if larger than the list, all remaining files are taken)
# ------------------------------------------------------------------
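
Python slicing clamps out-of-range bounds instead of raising, which is why an oversized end index simply selects everything that is left. The sketch below reports the range a given setting would actually cover; `effective_range` is a hypothetical helper, and the totals in the demo are made-up numbers.

```python
def effective_range(total, start, end):
    """Return the (first, last_exclusive) indices that items[start:end] covers."""
    first, stop, _ = slice(start, end).indices(total)
    # An empty slice is reported as (first, first).
    return first, max(first, stop)


print(effective_range(120, 0, 200))    # (0, 120)
print(effective_range(120, 50, 100))   # (50, 100)
print(effective_range(120, 130, 200))  # (120, 120)
```

The last call shows an empty batch: a start index past the end of the list selects no files.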

## 4️⃣ Start Production (Run Factory)
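
Paper discovery in this section relies on a recursive glob for `O_ISAC_*.md`. The sketch below runs the same pattern against a throwaway directory to show which files match. The folder layout in the demo (one sub-folder per paper) is an assumption made for illustration, and `find_papers` is a hypothetical name.

```python
import glob
import os
import tempfile


def find_papers(data_dir):
    """Return sorted paths of all O_ISAC_*.md files at any depth under data_dir."""
    pattern = os.path.join(data_dir, "**", "O_ISAC_*.md")
    # With recursive=True, "**" matches zero or more nested directories.
    return sorted(glob.glob(pattern, recursive=True))


with tempfile.TemporaryDirectory() as root:
    demo_files = (
        "O_ISAC_001/O_ISAC_001.md",
        "O_ISAC_002/O_ISAC_002.md",
        "O_ISAC_002/visual_analysis.txt",  # ignored: not a matching .md file
        "notes/readme.md",                 # ignored: name does not match
    )
    for rel in demo_files:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as f:
            f.write("demo")
    names = [os.path.basename(p) for p in find_papers(root)]

print(names)  # ['O_ISAC_001.md', 'O_ISAC_002.md']
```

Sorting keeps the order identical across notebook copies, so index ranges refer to the same files in every copy as long as the folder contents do not change between runs.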

In [None]:
def get_pending_papers(root_dir):
    """Return the sorted list of candidate paper files to process."""
    data_dir = os.path.join(root_dir, "data", "processed_markdowns")
    # Recursive search: data/processed_markdowns/**/O_ISAC_*.md
    search_pattern = os.path.join(data_dir, "**", "O_ISAC_*.md")
    all_candidates = glob.glob(search_pattern, recursive=True)
    return sorted(all_candidates)

def run_batch():
    print("üèóÔ∏è Fabrika Hazƒ±rlanƒ±yor...")
    assembler = CoTAssembler(PROJECT_ROOT)
    
    all_papers = get_pending_papers(PROJECT_ROOT)
    
    # Slice list by batch settings
    target_papers = all_papers[BATCH_START_INDEX:BATCH_END_INDEX]
    
    print(f"üì¶ Toplam Aday Dosya: {len(all_papers)}")
    print(f"üéØ Hedef Aralƒ±ƒüƒ±: {BATCH_START_INDEX} - {BATCH_END_INDEX}")
    print(f"üöÄ ƒ∞≈ülenecek Dosya Sayƒ±sƒ±: {len(target_papers)}")
    print("="*60)

    success_count = 0
    fail_count = 0

    for index, paper_path in enumerate(target_papers):
        # Global index for logging clarity
        global_idx = BATCH_START_INDEX + index
        
        filename = os.path.basename(paper_path)
        paper_id = os.path.splitext(filename)[0]
        
        print(f"\n[{global_idx}] 🦾 Processing {paper_id}...")
        
        try:
            # 1. Read Markdown
            with open(paper_path, 'r', encoding='utf-8') as f:
                content = f.read()
            
            if not content.strip():
                print("   ⚠️ SKIPPING: File empty.")
                continue

            # 2. Read Visual Analysis (DePlot/BLIP) if available
            paper_dir = os.path.dirname(paper_path)
            visual_path = os.path.join(paper_dir, "visual_analysis.txt")
            visual_content = None
            
            if os.path.exists(visual_path):
                with open(visual_path, 'r', encoding='utf-8') as vf:
                    visual_content = vf.read()
                print(f"   üñºÔ∏è Visual Data Loaded ({len(visual_content)} chars)")
            
            # 3. Execute CoT Extraction
            result = assembler.run_extraction(
                RECIPE_PATH, 
                content, 
                paper_id=paper_id, 
                visual_content=visual_content
            )
            
            if result.get("status") == "success":
                print("   ✅ SUCCESS")
                success_count += 1
            else:
                print(f"   ❌ FAILED: {result.get('error_message')}")
                fail_count += 1
                
            # API Rate Limit Protection
            time.sleep(1)
            
        except Exception as e:
            print(f"   üí• CRITICAL ERROR: {str(e)}")
            fail_count += 1

    print("="*60)
    print(f"üèÅ Batch Completed.")
    print(f"‚úÖ Success: {success_count}")
    print(f"‚ùå Failed:  {fail_count}")

# Run the Factory
run_batch()
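
The loop above spaces requests with a fixed one-second sleep. A common alternative is to retry a failed call with exponentially growing delays. The sketch below shows that technique in isolation. `call_with_backoff` is a hypothetical helper, and it assumes the wrapped call raises on failure; the notebook does not show whether `run_extraction` raises on rate-limit errors or reports them in its result dict.

```python
import time


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); after an exception wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * (2 ** attempt))


# Demo: a stand-in call that fails twice before succeeding.
attempts = {"count": 0}
delays = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


# Delays are recorded instead of slept so the demo finishes instantly.
print(call_with_backoff(flaky, sleep=delays.append))  # ok
print(delays)  # [1.0, 2.0]
```

Inside the loop, the extraction call could be wrapped as `call_with_backoff(lambda: assembler.run_extraction(...))`, provided the assumption about raised exceptions holds.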