# Synthetic Data Generation & Fine-Tuning (QLoRA) Assignment

## Project Design

This week‚Äôs project involves creating a synthetic dataset and using it to fine-tune the model for better academic Q\&A performance. The plan is as follows:

1. **Data Sampling:** Select **100 academic papers** (e.g. using their abstracts and key sections). Ensure a diverse mix of subjects or paper types to provide a broad training base.
2. **Synthetic Q\&A Generation:** Use **GPT-4** to generate \~5 question-answer pairs for each paper. Craft a prompt that provides GPT-4 with the paper‚Äôs abstract or content and asks for informative Q\&A pairs. The questions should cover important points, definitions, or insights from the paper, and the answers should be correct summaries or explanations based on the text. This yields roughly **500 Q\&A pairs** in total.
3. **Include Edge-Case Examples:** Incorporate some **edge-case questions** among the above pairs ‚Äì for example, a question that reflects a misunderstanding or a **hallucinated detail** about the paper. For these, provide an answer that corrects the false premise or clarifies that the paper doesn‚Äôt contain that information. Including a few such Q\&A examples (e.g. *‚ÄúQ: According to the paper, what is the value of constant XYZ?‚Äù* when XYZ is not actually in the paper, and *‚ÄúA: The paper does not specify XYZ; in fact, that detail is not discussed.‚Äù*) will teach the model to handle incorrect or unanswerable queries gracefully.
4. **Format Data for Instruction Tuning:** Convert all the Q\&A pairs into the **instruction-tuning JSONL format** expected by our fine-tuning pipeline. Each line in the dataset should represent a complete prompt-response dialogue. We will use a chat-style format with explicit roles. For example, you can prepend a fixed system instruction (such as `"You are a helpful academic assistant."`) and then format each Q\&A as:

   ```
   <|system|> You are a helpful academic Q&A assistant specialized in scholarly content.
   <|user|> [Question from the dataset]
   <|assistant|> [Answer from the dataset]
   ```

   Structure each JSONL entry to contain this composite prompt. This ensures the model is trained in a conversational format where it receives a user question and produces an answer, following any system instructions (tone, style) you provided.
5. **Fine-Tune LLaMA¬†3 7B with QLoRA:** Run a fine-tuning job on **Google Colab** (or a local GPU) using **QLoRA** via the Unsloth library. QLoRA (Quantized LoRA) will load the 7B model in 4-bit precision and train low-rank adaptation weights. This drastically lowers memory usage, allowing even a 7B (and larger) model to be fine-tuned on a single GPU without out-of-memory errors. Using Unsloth‚Äôs tools, load the base LLaMA¬†3 (7B) model (preferably an instruct variant) and fine-tune it on your synthetic Q\&A dataset. We‚Äôll use LoRA adapters so the base model weights remain fixed; the training will produce a small set of adapted weights after 1‚Äì3 epochs over the dataset. *(Expect the fine-tuning to be relatively fast given \~500 examples ‚Äî on a T4 or similar GPU, a few epochs should only take minutes.)*
6. **Evaluation (Pre vs. Post-Tuning):** Finally, evaluate the model‚Äôs academic QA performance **before and after fine-tuning**. Prepare a set of **10 test questions** covering various papers or concepts (you can come up with these manually, ensuring they are challenging). Run the original base model and the fine-tuned model on each question, and compare the answers. Look for improvements such as: the fine-tuned model‚Äôs answers are more detailed, use terminology from the papers, correct mistakes the base model made, or cite relevant concepts from the training data. This comparison will let you quantify accuracy gains. You might measure accuracy as the number of questions answered correctly or with relevant info, or simply note qualitatively how the responses differ.

Throughout this design, the key idea is that **domain-aligned data** will make the model more knowledgeable in that domain. Instead of the agent relying solely on retrieval each time, the fine-tuned model will have *internalized* some academic knowledge and answer patterns. Fine-tuning on a well-structured QA dataset (as opposed to just dumping raw text) is crucial for the model to learn effectively.

## Setup

In [1]:
pip install arxiv openai tqdm pandas jsonlines dotenv pypdf


Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting arxiv
  Downloading arxiv-2.3.1-py3-none-any.whl.metadata (5.2 kB)
Collecting openai
  Downloading openai-2.14.0-py3-none-any.whl.metadata (29 kB)
Collecting jsonlines
  Downloading jsonlines-4.0.0-py3-none-any.whl.metadata (1.6 kB)
Collecting dotenv
  Downloading dotenv-0.9.9-py2.py3-none-any.whl.metadata (279 bytes)
Collecting pypdf
  Downloading pypdf-6.5.0-py3-none-any.whl.metadata (7.1 kB)
Collecting feedparser~=6.0.10 (from arxiv)
  Downloading feedparser-6.0.12-py3-none-any.whl.metadata (2.7 kB)
Collecting requests~=2.32.0 (from arxiv)
  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting sgmllib3k (from feedparser~=6.0.10->arxiv)
  Downloading sgmllib3k-1.0.0.tar.gz (5.8 kB)
  Preparing metadata (setup.py) ... [?25ldone
Collecting distro<2,>=1.7.0 (from openai)
  Downloading distro-1.9.0-py3-none

In [13]:
# Imports
import arxiv
from openai import OpenAI
import json
import jsonlines
import pandas as pd
from tqdm import tqdm
import random
from dotenv import load_dotenv
import os
import re
import json
import time

# üîë Set your OpenAI API key

# Load environment variables (make sure you have OPENAI_API_KEY in your .env file)
from pathlib import Path
from dotenv import load_dotenv

ENV_PATH = Path.cwd() / ".env"
load_dotenv(dotenv_path=ENV_PATH, override=True)

print("Using .env at:", ENV_PATH)


# Initialize client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Create output directory
os.makedirs("dataset_arxiv_json", exist_ok=True)

Using .env at: /home/jovyan/work/MLE_in_Gen_AI-Course/class7/Homework_Kevin/.env


## Data Sampling

In [None]:
import os, re, json, gzip, time, hashlib, tempfile
from pathlib import Path
from typing import Optional, Dict, Any, Iterable, Set

import arxiv
from tqdm import tqdm
from pypdf import PdfReader

# -----------------------------
# Config
# -----------------------------
CATEGORIES = [
    "q-fin.MF", "q-fin.TR", "q-fin.ST", "q-fin.RM", "q-fin.PR",
    "q-fin.CP", "q-fin", "q-fin.EC", "q-fin.GN", "q-fin.PM", 
    "econ.EM", "econ.GN", "econ.TH"
]
MAX_RESULTS_PER_CAT = 50

FIRST_N_PAGES = 10
MAX_BODY_CHARS = 25000
MAX_ABS_CHARS  = 3000
MIN_BODY_CHARS = 3000


OUT_DIR = Path("arxiv_finance_corpus")
OUT_DIR.mkdir(parents=True, exist_ok=True)

META_PATH = OUT_DIR / "papers_meta.jsonl.gz"
TEXT_PATH = OUT_DIR / "papers_text_p1_10.jsonl.gz"

SLEEP_BETWEEN_REQUESTS_SEC = 0.5  # be polite / avoid throttling

# -----------------------------
# Utilities
# -----------------------------
_ws = re.compile(r"\s+")
def clean_text(s: str) -> str:
    if not s:
        return ""
    s = s.replace("\x00", " ")
    s = _ws.sub(" ", s).strip()
    return s

def safe_trunc(s: str, max_chars: int) -> str:
    if len(s) <= max_chars:
        return s
    # cut on a boundary to avoid mid-word spam
    cut = s[:max_chars]
    last_space = cut.rfind(" ")
    return (cut[:last_space] if last_space > 0 else cut).rstrip()

def sha1(s: str) -> str:
    return hashlib.sha1(s.encode("utf-8", errors="ignore")).hexdigest()

def gzip_jsonl_append(path: Path, record: Dict[str, Any]) -> None:
    # append in text mode
    with gzip.open(path, "at", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

def extract_first_pages_text(pdf_path: Path, n_pages: int) -> str:
    reader = PdfReader(str(pdf_path))
    texts = []
    for i in range(min(n_pages, len(reader.pages))):
        try:
            t = reader.pages[i].extract_text() or ""
        except Exception:
            t = ""
        texts.append(t)
    return "\n".join(texts)

def download_pdf_to_temp(result: arxiv.Result) -> Path:
    # arxiv library can download via result.download_pdf()
    # We force temp dir and then delete after.
    tmpdir = Path(tempfile.mkdtemp(prefix="arxiv_pdf_"))
    pdf_path = Path(result.download_pdf(dirpath=str(tmpdir)))
    return pdf_path

def sizeof_dir(path: Path) -> int:
    total = 0
    for p in path.rglob("*"):
        if p.is_file():
            total += p.stat().st_size
    return total

# -----------------------------
# Main collection
# -----------------------------
seen_ids: Set[str] = set()
retrieved = 0
enriched = 0
failed_extract = 0

# fresh files each run (comment out if you want to append)
for p in [META_PATH, TEXT_PATH]:
    if p.exists():
        p.unlink()

for cat in tqdm(CATEGORIES, desc="Fetching arXiv categories"):
    search = arxiv.Search(
        query=f"cat:{cat}",
        max_results=MAX_RESULTS_PER_CAT,
        sort_by=arxiv.SortCriterion.SubmittedDate
    )

    for result in search.results():
        arxiv_id = (result.get_short_id() if hasattr(result, "get_short_id") else None) or result.entry_id
        arxiv_id = arxiv_id.replace("http://arxiv.org/abs/", "").replace("https://arxiv.org/abs/", "")
        if arxiv_id in seen_ids:
            continue
        seen_ids.add(arxiv_id)

        title = clean_text(result.title or "")
        abstract = clean_text(result.summary or "")
        abstract = safe_trunc(abstract, MAX_ABS_CHARS)

        authors = []
        try:
            authors = [a.name for a in (result.authors or [])][:8]
        except Exception:
            authors = []

        categories = []
        try:
            categories = list(getattr(result, "categories", []) or [])
        except Exception:
            categories = []

        published = None
        try:
            published = result.published.isoformat() if result.published else None
        except Exception:
            published = None

        pdf_url = ""
        try:
            pdf_url = result.pdf_url or ""
        except Exception:
            pdf_url = ""

        meta = {
            "id": arxiv_id,
            "source_category": cat,
            "primary_title": title,
            "authors": authors,
            "categories": categories,
            "published": published,
            "pdf_url": pdf_url,
            "abstract": abstract,
            "abstract_chars": len(abstract),
            "title_hash": sha1(title.lower()),
        }
        gzip_jsonl_append(META_PATH, meta)
        retrieved += 1

        # Enrich: first 5 pages text (then discard pdf)
        body_text = ""
        ok = True
        try:
            time.sleep(SLEEP_BETWEEN_REQUESTS_SEC)
            pdf_path = download_pdf_to_temp(result)
            raw = extract_first_pages_text(pdf_path, FIRST_N_PAGES)
            body_text = clean_text(raw)
            body_text = safe_trunc(body_text, MAX_BODY_CHARS)
        except Exception:
            ok = False
            failed_extract += 1
        finally:
            # delete temp dir
            try:
                if "pdf_path" in locals() and pdf_path.exists():
                    tmp = pdf_path.parent
                    for fp in tmp.glob("*"):
                        fp.unlink(missing_ok=True)
                    tmp.rmdir()
            except Exception:
                pass

        text_rec = {
            **meta,
            "body_p1_5": body_text,
            "body_chars": len(body_text),
            "extraction_ok": ok,
        }
        gzip_jsonl_append(TEXT_PATH, text_rec)
        enriched += 1

print(f"‚úÖ Retrieved (metadata): {retrieved}")
print(f"‚úÖ Enriched (p1-5 text): {enriched}")
print(f"‚ö†Ô∏è Extraction failed: {failed_extract}")

total_bytes = sizeof_dir(OUT_DIR)
print(f"üì¶ Output folder size: {total_bytes/1024/1024:.2f} MB")
print(f"- {META_PATH.name}: {META_PATH.stat().st_size/1024/1024:.2f} MB")
print(f"- {TEXT_PATH.name}: {TEXT_PATH.stat().st_size/1024/1024:.2f} MB")


  for result in search.results():
Fetching arXiv categories:  15%|‚ñà‚ñå        | 2/13 [02:56<16:01, 87.40s/it]Ignoring wrong pointing object 6 0 (offset 0)
Ignoring wrong pointing object 8 0 (offset 0)
Ignoring wrong pointing object 14 0 (offset 0)
Ignoring wrong pointing object 16 0 (offset 0)
Ignoring wrong pointing object 18 0 (offset 0)
Ignoring wrong pointing object 20 0 (offset 0)
Ignoring wrong pointing object 27 0 (offset 0)
Ignoring wrong pointing object 29 0 (offset 0)
Ignoring wrong pointing object 42 0 (offset 0)
Ignoring wrong pointing object 44 0 (offset 0)
Ignoring wrong pointing object 53 0 (offset 0)
Ignoring wrong pointing object 65 0 (offset 0)
Ignoring wrong pointing object 67 0 (offset 0)
Ignoring wrong pointing object 69 0 (offset 0)
Ignoring wrong pointing object 71 0 (offset 0)
Ignoring wrong pointing object 106 0 (offset 0)
Ignoring wrong pointing object 108 0 (offset 0)
Ignoring wrong pointing object 112 0 (offset 0)
Ignoring wrong pointing object 114 0 (offs

‚úÖ Retrieved (metadata): 455
‚úÖ Enriched (p1-5 text): 455
‚ö†Ô∏è Extraction failed: 3
üì¶ Output folder size: 4.26 MB
- papers_meta.jsonl.gz: 0.40 MB
- papers_text_p1_5.jsonl.gz: 3.86 MB





## Synthetic Data Generation

In [None]:
import os, json, gzip, re, time, sys
from pathlib import Path
from typing import List, Dict, Optional
from openai import OpenAI
from tqdm import tqdm

# -------------------------
# Paths
# -------------------------
CORPUS_DIR = Path("arxiv_finance_corpus")
IN_PATH = CORPUS_DIR / "papers_text_p1_10.jsonl.gz"
OUT_PATH = CORPUS_DIR / "synthetic_finance_qa.chatml.jsonl"

# -------------------------
# Controls
# -------------------------
MODEL = "gpt-5"
TEMPERATURE = 1
RETRIES = 3
BACKOFF = 2
SLEEP_BETWEEN_CALLS = 0.2

MIN_BODY_CHARS = 3000
MAX_CONTEXT_CHARS = 22000

SYSTEM_PROMPT = (
    "You are a finance and investing expert. You explain concepts clearly, "
    "use correct financial terminology, and avoid speculation. "
    "When answering, you only use information provided in my training knowledge."
)

PRINT_EVERY = 10          # per N papers processed, print status
SNIPPET_CHARS = 400       # show small snippet of model output on failure
FLUSH_EVERY = 20          # flush output file every N successful papers

# -------------------------
# Helpers
# -------------------------
_ws = re.compile(r"\s+")
def clean_text(s: str) -> str:
    s = (s or "").replace("\x00", " ")
    return _ws.sub(" ", s).strip()

def trunc(s: str, max_chars: int) -> str:
    s = s or ""
    if len(s) <= max_chars:
        return s
    cut = s[:max_chars]
    last_space = cut.rfind(" ")
    return (cut[:last_space] if last_space > 0 else cut).rstrip()

def make_generation_prompt(title: str, abstract: str, body: str) -> str:
    source = f"TITLE: {title}\n\nABSTRACT: {abstract}\n\nEXCERPT: {body}"
    source = trunc(source, MAX_CONTEXT_CHARS)
    return f"""
You will create training Q&A pairs to teach a model finance/investing expertise.

Rules:
- Use ONLY my training knowledge. Do not invent facts, datasets, results, or numbers.
- If details are missing, say so explicitly (e.g., "The abstract does not specify ...").
- Questions should be what a serious finance student or analyst would ask:
  mix of conceptual, methodology, implications, limitations, and definitions.
- Answers should be precise, finance-literate, and grounded.
- Return STRICT JSON with exactly 5 items.

Output JSON schema:
{{
  "qas": [
    {{"question": "...", "answer": "..."}},
    {{"question": "...", "answer": "..."}},
    {{"question": "...", "answer": "..."}},
    {{"question": "...", "answer": "..."}},
    {{"question": "...", "answer": "..."}}
  ]
}}

SOURCE:
\"\"\"{source}\"\"\"
""".strip()

def parse_qas(content: str) -> Optional[List[Dict[str, str]]]:
    try:
        obj = json.loads(content)
    except Exception:
        return None
    if not isinstance(obj, dict) or "qas" not in obj or not isinstance(obj["qas"], list):
        return None
    qas = []
    for item in obj["qas"]:
        if not isinstance(item, dict):
            continue
        q, a = item.get("question"), item.get("answer")
        if isinstance(q, str) and isinstance(a, str):
            qas.append({"question": q.strip(), "answer": a.strip()})
    return qas if len(qas) == 5 else None

def chatml_line(system_prompt: str, user_q: str, assistant_a: str) -> Dict[str, str]:
    return {"text": f"<|system|>{system_prompt}<|user|>{user_q}<|assistant|>{assistant_a}"}

# -------------------------
# OpenAI client
# -------------------------
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("Missing OPENAI_API_KEY in environment. Add it to .env or export it.")

# -------------------------
# Load count for tqdm
# -------------------------
def count_lines_gz(path: Path) -> int:
    n = 0
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for _ in f:
            n += 1
    return n

print(f"IN_PATH: {IN_PATH} (exists={IN_PATH.exists()})")
print(f"OUT_PATH: {OUT_PATH}")
print("Counting papers (for progress bar)...")
total_lines = count_lines_gz(IN_PATH)
print(f"Total records in corpus file: {total_lines}\n")

# -------------------------
# Main
# -------------------------
OUT_PATH.parent.mkdir(parents=True, exist_ok=True)

total_papers = 0
used_papers = 0
total_qas = 0

skipped_extraction = 0
skipped_short = 0
skipped_empty_body_field = 0
model_failures = 0

start_time = time.time()

with gzip.open(IN_PATH, "rt", encoding="utf-8") as fin, open(OUT_PATH, "a", encoding="utf-8") as fout:
    for line in tqdm(fin, total=total_lines, desc="Generating QAs"):
        total_papers += 1
        rec = json.loads(line)

        # Check which body field exists
        # Your corpus writer used "body_p1_5" even when FIRST_N_PAGES=10.
        # We'll try both keys and print if missing.
        body = rec.get("body_p1_10")
        if body is None:
            body = rec.get("body_p1_5", "")

        if not rec.get("extraction_ok", True):
            skipped_extraction += 1
            continue

        if int(rec.get("body_chars", 0)) < MIN_BODY_CHARS:
            skipped_short += 1
            continue

        body = clean_text(body)
        if not body:
            skipped_empty_body_field += 1
            continue

        title = clean_text(rec.get("primary_title", ""))
        abstract = clean_text(rec.get("abstract", ""))

        prompt = make_generation_prompt(title, abstract, body)

        qas = None
        last_raw = None

        for attempt in range(1, RETRIES + 1):
            try:
                t0 = time.time()
                resp = client.chat.completions.create(
                    model=MODEL,
                    messages=[{"role": "user", "content": prompt}],
                    temperature=TEMPERATURE,
                    response_format={"type": "json_object"},
                )
                dt = time.time() - t0
                last_raw = resp.choices[0].message.content
                qas = parse_qas(last_raw)
                if qas is None:
                    raise ValueError("Invalid JSON or not exactly 5 qas")
                # success
                break
            except Exception as e:
                if attempt < RETRIES:
                    print(f"\n‚ö†Ô∏è Paper #{total_papers} attempt {attempt} failed: {e}")
                    if last_raw:
                        snip = clean_text(last_raw)[:SNIPPET_CHARS]
                        print(f"   model snippet: {snip!r}")
                    time.sleep(BACKOFF * attempt)
                else:
                    model_failures += 1
                    print(f"\n‚ùå Paper #{total_papers} FAILED after {RETRIES} retries. id={rec.get('id')}")
                    print(f"   title={title[:120]!r}")
                    if last_raw:
                        snip = clean_text(last_raw)[:SNIPPET_CHARS]
                        print(f"   last model snippet: {snip!r}")
                    qas = None

        if qas is None:
            continue

        # write 5 lines
        for qa in qas:
            row = chatml_line(SYSTEM_PROMPT, qa["question"], qa["answer"])
            fout.write(json.dumps(row, ensure_ascii=False) + "\n")
            total_qas += 1

        used_papers += 1

        if used_papers % FLUSH_EVERY == 0:
            fout.flush()

        if total_papers % PRINT_EVERY == 0:
            elapsed = time.time() - start_time
            print(
                f"\nüìå Progress: read={total_papers}/{total_lines} | used={used_papers} | qas={total_qas} | "
                f"skipped(extraction={skipped_extraction}, short={skipped_short}, empty_body={skipped_empty_body_field}) | "
                f"model_failures={model_failures} | elapsed={elapsed/60:.1f}m"
            )

        time.sleep(SLEEP_BETWEEN_CALLS)

print("\n‚úÖ DONE")
print(f"Read papers: {total_papers}")
print(f"Used papers: {used_papers}")
print(f"Total QAs written: {total_qas}")
print(f"Skipped - extraction_ok false: {skipped_extraction}")
print(f"Skipped - body_chars too small: {skipped_short}")
print(f"Skipped - empty body field: {skipped_empty_body_field}")
print(f"Model failures: {model_failures}")
print(f"Output: {OUT_PATH}")


IN_PATH: arxiv_finance_corpus/papers_text_p1_10.jsonl.gz (exists=True)
OUT_PATH: arxiv_finance_corpus/synthetic_finance_qa.chatml.jsonl
Counting papers (for progress bar)...
Total records in corpus file: 451



Generating QAs:   2%|‚ñè         | 10/451 [07:11<4:51:18, 39.63s/it]


üìå Progress: read=10/451 | used=10 | qas=50 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=7.2m


Generating QAs:   4%|‚ñç         | 19/451 [15:31<6:31:00, 54.31s/it]


üìå Progress: read=20/451 | used=20 | qas=100 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=17.0m


Generating QAs:   7%|‚ñã         | 30/451 [26:26<7:03:20, 60.33s/it]


üìå Progress: read=30/451 | used=30 | qas=150 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=26.4m


Generating QAs:   9%|‚ñâ         | 40/451 [35:25<5:36:59, 49.20s/it]


üìå Progress: read=40/451 | used=40 | qas=200 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=35.4m


Generating QAs:  11%|‚ñà         | 50/451 [46:32<6:41:36, 60.09s/it]


üìå Progress: read=50/451 | used=50 | qas=250 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=46.5m


Generating QAs:  13%|‚ñà‚ñé        | 60/451 [54:56<5:24:13, 49.75s/it]


üìå Progress: read=60/451 | used=60 | qas=300 | skipped(extraction=0, short=0, empty_body=0) | model_failures=0 | elapsed=54.9m


Generating QAs:  15%|‚ñà‚ñå        | 69/451 [1:01:45<5:18:12, 49.98s/it]


üìå Progress: read=70/451 | used=69 | qas=345 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=62.6m


Generating QAs:  18%|‚ñà‚ñä        | 79/451 [1:10:55<5:07:39, 49.62s/it]


üìå Progress: read=80/451 | used=79 | qas=395 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=71.6m


Generating QAs:  20%|‚ñà‚ñâ        | 90/451 [1:20:19<5:00:27, 49.94s/it]


üìå Progress: read=90/451 | used=89 | qas=445 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=80.3m


Generating QAs:  20%|‚ñà‚ñà        | 91/451 [1:20:54<4:32:47, 45.47s/it]


‚ö†Ô∏è Paper #92 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #92 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  20%|‚ñà‚ñà        | 92/451 [1:21:06<3:32:00, 35.43s/it]


‚ùå Paper #92 FAILED after 3 retries. id=2510.22341v1
   title='Understanding Carbon Trade Dynamics: A European Union Emissions Trading System Perspective'

‚ö†Ô∏è Paper #93 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #93 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  21%|‚ñà‚ñà        | 93/451 [1:21:17<2:47:58, 28.15s/it]


‚ùå Paper #93 FAILED after 3 retries. id=2510.22206v1
   title='Right Place, Right Time: Market Simulation-based RL for Execution Optimisation'

‚ö†Ô∏è Paper #94 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #94 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  21%|‚ñà‚ñà        | 94/451 [1:21:28<2:16:05, 22.87s/it]


‚ùå Paper #94 FAILED after 3 retries. id=2510.17165v1
   title='Trading with the Devil: Risk and Return in Foundation Model Strategies'

‚ö†Ô∏è Paper #95 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #95 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  21%|‚ñà‚ñà        | 95/451 [1:21:38<1:53:20, 19.10s/it]


‚ùå Paper #95 FAILED after 3 retries. id=2510.15612v1
   title='SoK: Market Microstructure for Decentralized Prediction Markets (DePMs)'

‚ö†Ô∏è Paper #96 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #96 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  21%|‚ñà‚ñà‚ñè       | 96/451 [1:21:49<1:38:06, 16.58s/it]


‚ùå Paper #96 FAILED after 3 retries. id=2510.15995v1
   title='The Invisible Handshake: Tacit Collusion between Adaptive Market Agents'

‚ö†Ô∏è Paper #97 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #97 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  22%|‚ñà‚ñà‚ñè       | 97/451 [1:21:59<1:26:39, 14.69s/it]


‚ùå Paper #97 FAILED after 3 retries. id=2512.20477v1
   title='Switching between states and the COVID-19 turbulence'

‚ö†Ô∏è Paper #98 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #98 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  22%|‚ñà‚ñà‚ñè       | 98/451 [1:22:09<1:18:18, 13.31s/it]


‚ùå Paper #98 FAILED after 3 retries. id=2512.20460v1
   title='The Aligned Economic Index & The State Switching Model'

‚ö†Ô∏è Paper #99 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #99 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  22%|‚ñà‚ñà‚ñè       | 99/451 [1:22:19<1:13:10, 12.47s/it]


‚ùå Paper #99 FAILED after 3 retries. id=2512.17225v1
   title='Modelling financial time series with $œÜ^{4}$ quantum field theory'

‚ö†Ô∏è Paper #100 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #100 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  22%|‚ñà‚ñà‚ñè       | 100/451 [1:22:30<1:09:38, 11.90s/it]


‚ùå Paper #100 FAILED after 3 retries. id=2512.17945v1
   title="What's the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD"

‚ö†Ô∏è Paper #102 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #102 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  23%|‚ñà‚ñà‚ñé       | 102/451 [1:22:41<51:45,  8.90s/it]  


‚ùå Paper #102 FAILED after 3 retries. id=2512.12334v1
   title='Extending the application of dynamic Bayesian networks in calculating market risk: Standard and stressed expected shortf'

‚ö†Ô∏è Paper #103 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #103 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  23%|‚ñà‚ñà‚ñé       | 103/451 [1:22:51<54:09,  9.34s/it]


‚ùå Paper #103 FAILED after 3 retries. id=2512.12054v1
   title='Universal Dynamics of Financial Bubbles in Isolated Markets: Evidence from the Iranian Stock Market'

‚ö†Ô∏è Paper #104 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #104 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  23%|‚ñà‚ñà‚ñé       | 104/451 [1:23:02<55:36,  9.61s/it]


‚ùå Paper #104 FAILED after 3 retries. id=2512.19705v1
   title='Generative AI for Analysts'

‚ö†Ô∏è Paper #105 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #105 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  23%|‚ñà‚ñà‚ñé       | 105/451 [1:23:12<56:38,  9.82s/it]


‚ùå Paper #105 FAILED after 3 retries. id=2512.17936v1
   title='Risk-Aware Financial Forecasting Enhanced by Machine Learning and Intuitionistic Fuzzy Multi-Criteria Decision-Making'

‚ö†Ô∏è Paper #106 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #106 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  24%|‚ñà‚ñà‚ñé       | 106/451 [1:23:23<58:05, 10.10s/it]


‚ùå Paper #106 FAILED after 3 retries. id=2512.17929v1
   title='Reinforcement Learning for Monetary Policy Under Macroeconomic Uncertainty: Analyzing Tabular and Function Approximation'

‚ö†Ô∏è Paper #107 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #107 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  24%|‚ñà‚ñà‚ñé       | 107/451 [1:23:33<58:26, 10.19s/it]


‚ùå Paper #107 FAILED after 3 retries. id=2512.17925v1
   title='Stylized Facts and Their Microscopic Origins: Clustering, Persistence, and Stability in a 2D Ising Framework'

‚ö†Ô∏è Paper #108 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #108 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  24%|‚ñà‚ñà‚ñç       | 108/451 [1:23:44<59:02, 10.33s/it]


‚ùå Paper #108 FAILED after 3 retries. id=2512.08000v1
   title="Analysis of Contagion in China's Stock Market: A Hawkes Process Perspective"

‚ö†Ô∏è Paper #109 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #109 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  24%|‚ñà‚ñà‚ñç       | 109/451 [1:23:55<59:31, 10.44s/it]


‚ùå Paper #109 FAILED after 3 retries. id=2512.17923v1
   title='Inferring Latent Market Forces: Evaluating LLM Detection of Gamma Exposure Patterns via Obfuscation Testing'

‚ö†Ô∏è Paper #110 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #110 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  24%|‚ñà‚ñà‚ñç       | 110/451 [1:24:05<59:03, 10.39s/it]


‚ùå Paper #110 FAILED after 3 retries. id=2512.15738v1
   title='Hybrid Quantum-Classical Ensemble Learning for S\\&P 500 Directional Prediction'

‚ö†Ô∏è Paper #111 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #111 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  25%|‚ñà‚ñà‚ñç       | 111/451 [1:24:16<59:20, 10.47s/it]


‚ùå Paper #111 FAILED after 3 retries. id=2512.06473v1
   title='Detrended cross-correlations and their random matrix limit: an example from the cryptocurrency market'

‚ö†Ô∏è Paper #112 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #112 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  25%|‚ñà‚ñà‚ñç       | 112/451 [1:24:26<59:26, 10.52s/it]


‚ùå Paper #112 FAILED after 3 retries. id=2512.06420v1
   title='Thermodynamic description of world GDP distribution over countries'

‚ö†Ô∏è Paper #113 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #113 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  25%|‚ñà‚ñà‚ñå       | 113/451 [1:24:37<58:49, 10.44s/it]


‚ùå Paper #113 FAILED after 3 retries. id=2512.07887v1
   title='Does it take two to tango: Interaction between Credit Default Swaps and National Stock Indices'

‚ö†Ô∏è Paper #114 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #114 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  25%|‚ñà‚ñà‚ñå       | 114/451 [1:24:48<59:15, 10.55s/it]


‚ùå Paper #114 FAILED after 3 retries. id=2512.07886v1
   title='The Endogenous Constraint: Hysteresis, Stagflation, and the Structural Inhibition of Monetary Velocity in the Bitcoin Ne'

‚ö†Ô∏è Paper #115 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #115 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  25%|‚ñà‚ñà‚ñå       | 115/451 [1:24:58<59:09, 10.56s/it]


‚ùå Paper #115 FAILED after 3 retries. id=2512.00893v1

‚ö†Ô∏è Paper #116 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #116 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  26%|‚ñà‚ñà‚ñå       | 116/451 [1:25:08<58:29, 10.47s/it]


‚ùå Paper #116 FAILED after 3 retries. id=2512.07860v1
   title='Integrating LSTM Networks with Neural Levy Processes for Financial Forecasting'

‚ö†Ô∏è Paper #117 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #117 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  26%|‚ñà‚ñà‚ñå       | 117/451 [1:25:19<58:33, 10.52s/it]


‚ùå Paper #117 FAILED after 3 retries. id=2511.18125v1
   title='Random processes for long-term market simulations'

‚ö†Ô∏è Paper #118 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #118 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  26%|‚ñà‚ñà‚ñå       | 118/451 [1:25:30<58:45, 10.59s/it]


‚ùå Paper #118 FAILED after 3 retries. id=2512.02037v1
   title='Statistical Arbitrage in Polish Equities Market Using Deep Learning Techniques'

‚ö†Ô∏è Paper #119 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #119 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  26%|‚ñà‚ñà‚ñã       | 119/451 [1:25:40<58:49, 10.63s/it]


‚ùå Paper #119 FAILED after 3 retries. id=2512.02029v1
   title='HODL Strategy or Fantasy? 480 Million Crypto Market Simulations and the Macro-Sentiment Effect'

‚ö†Ô∏è Paper #120 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #120 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  27%|‚ñà‚ñà‚ñã       | 120/451 [1:25:51<58:35, 10.62s/it]


‚ùå Paper #120 FAILED after 3 retries. id=2511.14408v1
   title='The Hidden Constant of Market Rhythms: How $1-1/e$ Defines Scaling in Intrinsic Time'

‚ö†Ô∏è Paper #121 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #121 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  27%|‚ñà‚ñà‚ñã       | 121/451 [1:26:02<58:15, 10.59s/it]


‚ùå Paper #121 FAILED after 3 retries. id=2511.13959v1
   title='Opportunity Cost in Insurance'

‚ö†Ô∏è Paper #122 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #122 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  27%|‚ñà‚ñà‚ñã       | 122/451 [1:26:12<58:01, 10.58s/it]


‚ùå Paper #122 FAILED after 3 retries. id=2511.13384v4
   title='CBDC Stress Test in a Dual-Currency Setting'

‚ö†Ô∏è Paper #123 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #123 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  27%|‚ñà‚ñà‚ñã       | 123/451 [1:26:23<57:58, 10.60s/it]


‚ùå Paper #123 FAILED after 3 retries. id=2511.12763v2
   title='Impact by design: translating Lead times in flux into an R handbook with code'

‚ö†Ô∏è Paper #124 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #124 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  27%|‚ñà‚ñà‚ñã       | 124/451 [1:26:33<57:51, 10.62s/it]


‚ùå Paper #124 FAILED after 3 retries. id=2511.08658v1
   title='"It Looks All the Same to Me": Cross-index Training for Long-term Financial Series Prediction'

‚ö†Ô∏è Paper #125 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #125 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  28%|‚ñà‚ñà‚ñä       | 125/451 [1:26:44<57:51, 10.65s/it]


‚ùå Paper #125 FAILED after 3 retries. id=2511.06224v1
   title='Bitcoin Forecasting with Classical Time Series Models on Prices and Volatility'

‚ö†Ô∏è Paper #126 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #126 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  28%|‚ñà‚ñà‚ñä       | 126/451 [1:26:55<57:22, 10.59s/it]


‚ùå Paper #126 FAILED after 3 retries. id=2511.08622v1
   title='Multi-period Learning for Financial Time Series Forecasting'

‚ö†Ô∏è Paper #127 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #127 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  28%|‚ñà‚ñà‚ñä       | 127/451 [1:27:05<57:05, 10.57s/it]


‚ùå Paper #127 FAILED after 3 retries. id=2511.08621v1
   title='The LLM Pro Finance Suite: Multilingual Large Language Models for Financial Applications'

‚ö†Ô∏è Paper #128 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #128 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  28%|‚ñà‚ñà‚ñä       | 128/451 [1:27:15<56:14, 10.45s/it]


‚ùå Paper #128 FAILED after 3 retries. id=2511.05030v2
   title='The Shape of Markets: Machine learning modeling and Prediction Using 2-Manifold Geometries'

‚ö†Ô∏è Paper #129 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #129 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  29%|‚ñà‚ñà‚ñä       | 129/451 [1:27:26<56:10, 10.47s/it]


‚ùå Paper #129 FAILED after 3 retries. id=2511.04784v1
   title='Insights into Tail-Based and Order Statistics'

‚ö†Ô∏è Paper #130 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #130 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  29%|‚ñà‚ñà‚ñâ       | 130/451 [1:27:36<56:17, 10.52s/it]


‚ùå Paper #130 FAILED after 3 retries. id=2511.08616v1
   title='Reasoning on Time-Series for Financial Technical Analysis'

‚ö†Ô∏è Paper #131 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #131 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  29%|‚ñà‚ñà‚ñâ       | 131/451 [1:27:47<56:09, 10.53s/it]


‚ùå Paper #131 FAILED after 3 retries. id=2511.03314v1
   title='Multifractality and sample size influence on Bitcoin volatility patterns'

‚ö†Ô∏è Paper #132 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #132 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  29%|‚ñà‚ñà‚ñâ       | 132/451 [1:27:57<55:49, 10.50s/it]


‚ùå Paper #132 FAILED after 3 retries. id=2511.08608v1
   title="When Reasoning Fails: Evaluating 'Thinking' LLMs for Stock Prediction"

‚ö†Ô∏è Paper #133 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #133 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  29%|‚ñà‚ñà‚ñâ       | 133/451 [1:28:08<56:08, 10.59s/it]


‚ùå Paper #133 FAILED after 3 retries. id=2511.01271v1
   title='High-Dimensional Spatial Arbitrage Pricing Theory with Heterogeneous Interactions'

‚ö†Ô∏è Paper #134 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #134 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  30%|‚ñà‚ñà‚ñâ       | 134/451 [1:28:19<56:04, 10.61s/it]


‚ùå Paper #134 FAILED after 3 retries. id=2510.26035v1
   title='Budget Forecasting and Integrated Strategic Planning for Leaders'

‚ö†Ô∏è Paper #135 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #135 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  30%|‚ñà‚ñà‚ñâ       | 135/451 [1:28:30<56:01, 10.64s/it]


‚ùå Paper #135 FAILED after 3 retries. id=2511.05523v1
   title='The Evolution of Probabilistic Price Forecasting Techniques: A Review of the Day-Ahead, Intra-Day, and Balancing Markets'

‚ö†Ô∏è Paper #136 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #136 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  30%|‚ñà‚ñà‚ñà       | 136/451 [1:28:40<55:52, 10.64s/it]


‚ùå Paper #136 FAILED after 3 retries. id=2512.21092v1
   title='Portfolio Optimization for Index Tracking with Constraints on Downside Risk and Carbon Footprint'

‚ö†Ô∏è Paper #137 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #137 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  30%|‚ñà‚ñà‚ñà       | 137/451 [1:28:51<55:21, 10.58s/it]


‚ùå Paper #137 FAILED after 3 retries. id=2512.20515v1
   title='Modeling Bank Systemic Risk of Emerging Markets under Geopolitical Shocks: Empirical Evidence from BRICS Countries'

‚ö†Ô∏è Paper #138 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #138 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  31%|‚ñà‚ñà‚ñà       | 138/451 [1:29:01<55:05, 10.56s/it]


‚ùå Paper #138 FAILED after 3 retries. id=2512.20190v1
   title='Pricing of wrapped Bitcoin and Ethereum on-chain options'

‚ö†Ô∏è Paper #139 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #139 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  31%|‚ñà‚ñà‚ñà       | 139/451 [1:29:12<54:43, 10.52s/it]


‚ùå Paper #139 FAILED after 3 retries. id=2512.19821v1
   title='How to choose my stochastic volatility parameters? A review'

‚ö†Ô∏è Paper #140 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #140 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  31%|‚ñà‚ñà‚ñà       | 140/451 [1:29:22<54:54, 10.59s/it]


‚ùå Paper #140 FAILED after 3 retries. id=2512.19621v1
   title='Counterexamples for FX Options Interpolations -- Part I'

‚ö†Ô∏è Paper #141 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #141 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  31%|‚ñà‚ñà‚ñà‚ñè      | 141/451 [1:29:33<54:39, 10.58s/it]


‚ùå Paper #141 FAILED after 3 retries. id=2512.18790v1
   title='Optimal Catastrophe Risk Pooling'

‚ö†Ô∏è Paper #142 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #142 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  31%|‚ñà‚ñà‚ñà‚ñè      | 142/451 [1:29:44<54:28, 10.58s/it]


‚ùå Paper #142 FAILED after 3 retries. id=2512.17185v1

‚ö†Ô∏è Paper #143 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #143 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  32%|‚ñà‚ñà‚ñà‚ñè      | 143/451 [1:29:54<54:22, 10.59s/it]


‚ùå Paper #143 FAILED after 3 retries. id=2512.13562v1
   title='Disability insurance with collective health claims: A mean-field approach'

‚ö†Ô∏è Paper #144 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #144 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  32%|‚ñà‚ñà‚ñà‚ñè      | 144/451 [1:30:05<54:31, 10.66s/it]


‚ùå Paper #144 FAILED after 3 retries. id=2512.12420v1
   title='Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management'

‚ö†Ô∏è Paper #145 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #145 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  32%|‚ñà‚ñà‚ñà‚ñè      | 145/451 [1:30:15<53:42, 10.53s/it]


‚ùå Paper #145 FAILED after 3 retries. id=2512.11976v1
   title='Institutionalizing risk curation in decentralized credit'

‚ö†Ô∏è Paper #146 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #146 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  32%|‚ñà‚ñà‚ñà‚ñè      | 146/451 [1:30:26<53:15, 10.48s/it]


‚ùå Paper #146 FAILED after 3 retries. id=2512.11666v2
   title='Risk Limited Asset Allocation with a Budget Threshold Utility Function and Leptokurtotic Distributions of Returns'

‚ö†Ô∏è Paper #147 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #147 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  33%|‚ñà‚ñà‚ñà‚ñé      | 147/451 [1:30:36<53:15, 10.51s/it]


‚ùå Paper #147 FAILED after 3 retries. id=2512.11430v1
   title='Pareto-optimal reinsurance under dependence uncertainty'

‚ö†Ô∏è Paper #148 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #148 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  33%|‚ñà‚ñà‚ñà‚ñé      | 148/451 [1:30:47<53:05, 10.51s/it]


‚ùå Paper #148 FAILED after 3 retries. id=2512.08851v1

‚ö†Ô∏è Paper #149 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #149 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  33%|‚ñà‚ñà‚ñà‚ñé      | 149/451 [1:30:57<52:44, 10.48s/it]


‚ùå Paper #149 FAILED after 3 retries. id=2512.07787v2
   title='VaR at Its Extremes: Impossibilities and Conditions for One-Sided Random Variables'

‚ö†Ô∏è Paper #150 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #150 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  33%|‚ñà‚ñà‚ñà‚ñé      | 150/451 [1:31:08<52:46, 10.52s/it]


‚ùå Paper #150 FAILED after 3 retries. id=2512.07526v1
   title='The Suicide Region: Option Games and the Race to Artificial General Intelligence'

‚ö†Ô∏è Paper #151 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #151 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  33%|‚ñà‚ñà‚ñà‚ñé      | 151/451 [1:31:18<52:38, 10.53s/it]


‚ùå Paper #151 FAILED after 3 retries. id=2512.06639v1
   title='Learning to Hedge Swaptions'

‚ö†Ô∏è Paper #152 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #152 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  34%|‚ñà‚ñà‚ñà‚ñé      | 152/451 [1:31:29<52:26, 10.52s/it]


‚ùå Paper #152 FAILED after 3 retries. id=2512.15739v1
   title='Bayesian Modeling for Uncertainty Management in Financial Risk Forecasting and Compliance'

‚ö†Ô∏è Paper #153 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #153 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  34%|‚ñà‚ñà‚ñà‚ñç      | 153/451 [1:31:39<52:27, 10.56s/it]


‚ùå Paper #153 FAILED after 3 retries. id=2512.05661v1
   title='Standard and stressed value at risk forecasting using dynamic Bayesian networks'

‚ö†Ô∏è Paper #154 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #154 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  34%|‚ñà‚ñà‚ñà‚ñç      | 154/451 [1:31:50<51:56, 10.49s/it]


‚ùå Paper #154 FAILED after 3 retries. id=2512.03267v1
   title='Orlicz-Lorentz premia and distortion Haezendonck-Goovaerts risk measures'

‚ö†Ô∏è Paper #155 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #155 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  34%|‚ñà‚ñà‚ñà‚ñç      | 155/451 [1:32:00<51:46, 10.49s/it]


‚ùå Paper #155 FAILED after 3 retries. id=2512.03242v1
   title='A Theoretical Framework Bridging Model Validation and Loss Ratio in Insurance'

‚ö†Ô∏è Paper #156 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #156 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  35%|‚ñà‚ñà‚ñà‚ñç      | 156/451 [1:32:11<51:26, 10.46s/it]


‚ùå Paper #156 FAILED after 3 retries. id=2512.01623v1
   title='Monopoly Pricing of Weather Index Insurance'

‚ö†Ô∏è Paper #157 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #157 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  35%|‚ñà‚ñà‚ñà‚ñç      | 157/451 [1:32:21<51:47, 10.57s/it]


‚ùå Paper #157 FAILED after 3 retries. id=2512.00916v1
   title='An Imbalance-Robust Evaluation Framework for Extreme Risk Forecasts'

‚ö†Ô∏è Paper #158 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #158 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  35%|‚ñà‚ñà‚ñà‚ñå      | 158/451 [1:32:32<51:31, 10.55s/it]


‚ùå Paper #158 FAILED after 3 retries. id=2511.22782v1
   title='Factors Influencing Cryptocurrency Prices: Evidence from Bitcoin, Ethereum, Dash, Litecoin, and Monero'

‚ö†Ô∏è Paper #159 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #159 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  35%|‚ñà‚ñà‚ñà‚ñå      | 159/451 [1:32:42<51:16, 10.54s/it]


‚ùå Paper #159 FAILED after 3 retries. id=2511.22272v1
   title='Statistics of Extremes for the Insurance Industry'

‚ö†Ô∏è Paper #160 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #160 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  35%|‚ñà‚ñà‚ñà‚ñå      | 160/451 [1:32:53<51:25, 10.60s/it]


‚ùå Paper #160 FAILED after 3 retries. id=2511.21975v1
   title='The Risk-Adjusted Intelligence Dividend: A Quantitative Framework for Measuring AI Return on Investment Integrating ISO '

‚ö†Ô∏è Paper #161 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #161 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  36%|‚ñà‚ñà‚ñà‚ñå      | 161/451 [1:33:04<51:18, 10.62s/it]


‚ùå Paper #161 FAILED after 3 retries. id=2511.21929v1
   title='Extended Convolution Bounds on the Fr√©chet Problem: Robust Risk Aggregation and Risk Sharing'

‚ö†Ô∏è Paper #162 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #162 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  36%|‚ñà‚ñà‚ñà‚ñå      | 162/451 [1:33:15<51:13, 10.63s/it]


‚ùå Paper #162 FAILED after 3 retries. id=2511.21901v1
   title='Standardized Threat Taxonomy for AI Security, Governance, and Regulatory Compliance'

‚ö†Ô∏è Paper #163 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #163 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  36%|‚ñà‚ñà‚ñà‚ñå      | 163/451 [1:33:25<51:04, 10.64s/it]


‚ùå Paper #163 FAILED after 3 retries. id=2512.07867v1
   title='LLM-Generated Counterfactual Stress Scenarios for Portfolio Risk Simulation via Hybrid Prompt-RAG Pipeline'

‚ö†Ô∏è Paper #164 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #164 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  36%|‚ñà‚ñà‚ñà‚ñã      | 164/451 [1:33:36<51:19, 10.73s/it]


‚ùå Paper #164 FAILED after 3 retries. id=2511.21556v1
   title='Informative Risk Measures in the Banking Industry: A Proposal based on the Magnitude-Propensity Approach'

‚ö†Ô∏è Paper #165 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #165 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  37%|‚ñà‚ñà‚ñà‚ñã      | 165/451 [1:33:47<51:19, 10.77s/it]


‚ùå Paper #165 FAILED after 3 retries. id=2511.21515v2
   title='The Quantum Network of Assets: A Non-Classical Framework for Market Correlation and Structural Risk'

‚ö†Ô∏è Paper #166 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #166 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  37%|‚ñà‚ñà‚ñà‚ñã      | 166/451 [1:33:58<51:57, 10.94s/it]


‚ùå Paper #166 FAILED after 3 retries. id=2511.19701v1
   title='Optimal dividend and capital injection under self-exciting claims'

‚ö†Ô∏è Paper #167 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #167 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  37%|‚ñà‚ñà‚ñà‚ñã      | 167/451 [1:34:09<51:23, 10.86s/it]


‚ùå Paper #167 FAILED after 3 retries. id=2511.18614v1
   title='A calibrated model of debt recycling with interest costs and tax shields: viability under different fiscal regimes and j'

‚ö†Ô∏è Paper #168 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #168 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  37%|‚ñà‚ñà‚ñà‚ñã      | 168/451 [1:34:20<50:52, 10.78s/it]


‚ùå Paper #168 FAILED after 3 retries. id=2511.18169v1
   title='Superhedging under Proportional Transaction Costs in Continuous Time'

‚ö†Ô∏è Paper #169 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #169 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  37%|‚ñà‚ñà‚ñà‚ñã      | 169/451 [1:34:30<50:41, 10.78s/it]


‚ùå Paper #169 FAILED after 3 retries. id=2511.17954v1
   title='A multi-view contrastive learning framework for spatial embeddings in risk modelling'

‚ö†Ô∏è Paper #170 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #170 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  38%|‚ñà‚ñà‚ñà‚ñä      | 170/451 [1:34:41<50:01, 10.68s/it]


‚ùå Paper #170 FAILED after 3 retries. id=2511.16302v2
   title='Dynamic Risk Assessment of Wildland-Urban Interface Fires'

‚ö†Ô∏è Paper #171 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #171 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  38%|‚ñà‚ñà‚ñà‚ñä      | 171/451 [1:34:51<49:43, 10.66s/it]


‚ùå Paper #171 FAILED after 3 retries. id=2511.13878v1
   title='HSBC 1950 to 2025: Conquering the world from British Hong Kong and London'

‚ö†Ô∏è Paper #172 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #172 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  38%|‚ñà‚ñà‚ñà‚ñä      | 172/451 [1:35:02<49:09, 10.57s/it]


‚ùå Paper #172 FAILED after 3 retries. id=2511.12391v3
   title='Sharpening Shapley Allocation: from Basel 2.5 to FRTB'

‚ö†Ô∏è Paper #173 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #173 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  38%|‚ñà‚ñà‚ñà‚ñä      | 173/451 [1:35:12<48:46, 10.53s/it]


‚ùå Paper #173 FAILED after 3 retries. id=2511.12292v1
   title='Mean Field Analysis of Mutual Insurance Market'

‚ö†Ô∏è Paper #174 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #174 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  39%|‚ñà‚ñà‚ñà‚ñä      | 174/451 [1:35:23<48:47, 10.57s/it]


‚ùå Paper #174 FAILED after 3 retries. id=2511.11416v1
   title='Enhancing Efficiency of Pension Schemes through Effective Risk Governance: A Kenyan Perspective'

‚ö†Ô∏è Paper #175 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #175 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  39%|‚ñà‚ñà‚ñà‚ñâ      | 175/451 [1:35:34<49:00, 10.66s/it]


‚ùå Paper #175 FAILED after 3 retries. id=2511.11364v1
   title='Loss given default after default'

‚ö†Ô∏è Paper #176 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #176 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  39%|‚ñà‚ñà‚ñà‚ñâ      | 176/451 [1:35:45<48:59, 10.69s/it]


‚ùå Paper #176 FAILED after 3 retries. id=2512.20027v1
   title='GIFfluence: A Visual Approach to Investor Sentiment and the Stock Market'

‚ö†Ô∏è Paper #177 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #177 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  39%|‚ñà‚ñà‚ñà‚ñâ      | 177/451 [1:35:55<48:47, 10.68s/it]


‚ùå Paper #177 FAILED after 3 retries. id=2512.19625v1
   title='Counterexamples for FX Options Interpolations -- Part II'

‚ö†Ô∏è Paper #178 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #178 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  39%|‚ñà‚ñà‚ñà‚ñâ      | 178/451 [1:36:06<48:23, 10.64s/it]


‚ùå Paper #178 FAILED after 3 retries. id=2512.16251v2
   title='Interpretable Deep Learning for Stock Returns: A Consensus-Bottleneck Asset Pricing Model'

‚ö†Ô∏è Paper #179 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #179 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  40%|‚ñà‚ñà‚ñà‚ñâ      | 179/451 [1:36:16<48:00, 10.59s/it]


‚ùå Paper #179 FAILED after 3 retries. id=2512.12871v1
   title='CapOptix: An Options-Framework for Capacity Market Pricing'

‚ö†Ô∏è Paper #180 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #180 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  40%|‚ñà‚ñà‚ñà‚ñâ      | 180/451 [1:36:27<48:02, 10.64s/it]


‚ùå Paper #180 FAILED after 3 retries. id=2512.10823v1
   title='Option-Implied Zero-Coupon Yields: Unifying Bond and Equity Markets'

‚ö†Ô∏è Paper #181 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #181 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  40%|‚ñà‚ñà‚ñà‚ñà      | 181/451 [1:36:38<47:49, 10.63s/it]


‚ùå Paper #181 FAILED after 3 retries. id=2512.08890v1
   title='Modelling and valuation of catastrophe bonds across multiple regions'

‚ö†Ô∏è Paper #182 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #182 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  40%|‚ñà‚ñà‚ñà‚ñà      | 182/451 [1:36:48<47:31, 10.60s/it]


‚ùå Paper #182 FAILED after 3 retries. id=2512.05326v1
   title='Convolution-FFT for option pricing in the Heston model'

‚ö†Ô∏è Paper #183 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #183 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  41%|‚ñà‚ñà‚ñà‚ñà      | 183/451 [1:36:58<47:04, 10.54s/it]


‚ùå Paper #183 FAILED after 3 retries. id=2512.05301v1
   title='Differential ML with a Difference'

‚ö†Ô∏è Paper #184 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #184 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  41%|‚ñà‚ñà‚ñà‚ñà      | 184/451 [1:37:09<47:17, 10.63s/it]


‚ùå Paper #184 FAILED after 3 retries. id=2512.03922v1
   title='A Co-evolutionary Approach for Heston Calibration'

‚ö†Ô∏è Paper #185 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #185 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  41%|‚ñà‚ñà‚ñà‚ñà      | 185/451 [1:37:20<47:06, 10.63s/it]


‚ùå Paper #185 FAILED after 3 retries. id=2512.00280v2
   title='Retail Investor Horizon and Earnings Announcements'

‚ö†Ô∏è Paper #186 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #186 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  41%|‚ñà‚ñà‚ñà‚ñà      | 186/451 [1:37:30<46:41, 10.57s/it]


‚ùå Paper #186 FAILED after 3 retries. id=2511.21873v1
   title='A3T-GCN for FTSE100 Components Price Forecasting'

‚ö†Ô∏è Paper #187 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #187 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  41%|‚ñà‚ñà‚ñà‚ñà‚ñè     | 187/451 [1:37:41<46:38, 10.60s/it]


‚ùå Paper #187 FAILED after 3 retries. id=2511.18578v1
   title='Re(Visiting) Time Series Foundation Models in Finance'

‚ö†Ô∏è Paper #188 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #188 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  42%|‚ñà‚ñà‚ñà‚ñà‚ñè     | 188/451 [1:37:52<46:47, 10.67s/it]


‚ùå Paper #188 FAILED after 3 retries. id=2511.09061v1
   title='Generative Pricing of Basket Options via Signature-Conditioned Mixture Density Networks'

‚ö†Ô∏è Paper #189 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #189 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  42%|‚ñà‚ñà‚ñà‚ñà‚ñè     | 189/451 [1:38:02<46:16, 10.60s/it]


‚ùå Paper #189 FAILED after 3 retries. id=2511.06274v1
   title="Why `Fair Market Valuations' are Inappropirate for Employee-Owned Firms and Partnerships"

‚ö†Ô∏è Paper #190 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #190 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  42%|‚ñà‚ñà‚ñà‚ñà‚ñè     | 190/451 [1:38:13<45:43, 10.51s/it]


‚ùå Paper #190 FAILED after 3 retries. id=2511.04412v3
   title='On the Estimation of Own Funds for Life Insurers: A Study of Direct, Indirect, and Control Variate Methods in a Risk-Neu'

‚ö†Ô∏è Paper #191 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #191 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  42%|‚ñà‚ñà‚ñà‚ñà‚ñè     | 191/451 [1:38:23<45:36, 10.53s/it]


‚ùå Paper #191 FAILED after 3 retries. id=2511.00308v1
   title='Asset Pricing in the Presence of Market Microstructure Noise'

‚ö†Ô∏è Paper #192 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #192 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  43%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 192/451 [1:38:34<45:38, 10.57s/it]


‚ùå Paper #192 FAILED after 3 retries. id=2510.27277v1
   title='Black-Scholes Model, comparison between Analytical Solution and Numerical Analysis'

‚ö†Ô∏è Paper #193 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #193 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  43%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 193/451 [1:38:45<45:49, 10.66s/it]


‚ùå Paper #193 FAILED after 3 retries. id=2510.26228v1
   title='ChatGPT in Systematic Investing -- Enhancing Risk-Adjusted Returns with LLMs'

‚ö†Ô∏è Paper #194 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #194 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  43%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 194/451 [1:38:55<45:32, 10.63s/it]


‚ùå Paper #194 FAILED after 3 retries. id=2510.21297v1
   title='Jump risk premia in the presence of clustered jumps'

‚ö†Ô∏è Paper #195 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #195 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  43%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 195/451 [1:39:06<45:24, 10.64s/it]


‚ùå Paper #195 FAILED after 3 retries. id=2510.19126v1
   title='An Efficient Calibration Framework for Volatility Derivatives under Rough Volatility with Jumps'

‚ö†Ô∏è Paper #196 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #196 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  43%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 196/451 [1:39:17<45:20, 10.67s/it]


‚ùå Paper #196 FAILED after 3 retries. id=2510.18159v1
   title='Semi-analytical pricing of American options with hybrid dividends via integral equations and the GIT method'

‚ö†Ô∏è Paper #197 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #197 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  44%|‚ñà‚ñà‚ñà‚ñà‚ñé     | 197/451 [1:39:27<44:45, 10.57s/it]


‚ùå Paper #197 FAILED after 3 retries. id=2510.17221v1
   title='Design and valuation of multi-region CoCoCat bonds'

‚ö†Ô∏è Paper #198 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #198 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  44%|‚ñà‚ñà‚ñà‚ñà‚ñç     | 198/451 [1:39:38<44:45, 10.61s/it]


‚ùå Paper #198 FAILED after 3 retries. id=2510.15984v2
   title='Berms without Calibration'

‚ö†Ô∏è Paper #199 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

‚ö†Ô∏è Paper #199 attempt 2 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Generating QAs:  44%|‚ñà‚ñà‚ñà‚ñà‚ñç     | 199/451 [1:39:48<44:36, 10.62s/it]


‚ùå Paper #199 FAILED after 3 retries. id=2510.10343v2
   title='Learning the Exact SABR Model'


Generating QAs:  44%|‚ñà‚ñà‚ñà‚ñà‚ñç     | 199/451 [1:39:50<2:06:25, 30.10s/it]


‚ö†Ô∏è Paper #200 attempt 1 failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}





KeyboardInterrupt: 

## Ran out of Chatgpt API Balance. The below is an addition

In [None]:
# followup_generate_qas_from_paper92.py
# Continues generation starting at paper index 92 (1-based),
# generates 2 QA pairs per paper, and APPENDS to OUT_PATH (no overwrite).

import os, json, gzip, re, time
from pathlib import Path
from typing import List, Dict, Optional
from openai import OpenAI
from tqdm import tqdm

# -------------------------
# Paths
# -------------------------
CORPUS_DIR = Path("arxiv_finance_corpus")
IN_PATH = CORPUS_DIR / "papers_text_p1_10.jsonl.gz"
OUT_PATH = CORPUS_DIR / "synthetic_finance_qa.chatml.jsonl"

# -------------------------
# Controls
# -------------------------
MODEL = "gpt-5"
TEMPERATURE = 1
RETRIES = 3
BACKOFF = 2
SLEEP_BETWEEN_CALLS = 0.2

MIN_BODY_CHARS = 3000
MAX_CONTEXT_CHARS = 22000

# Continue from paper 92 (1-based). If you want to start at paper #92 exactly,
# set START_PAPER_NUM = 92. If you want to start AFTER paper 91, also 92.
START_PAPER_NUM = 92  # 1-based inclusive

# We generate exactly 2 QAs per paper now.
QAS_PER_PAPER = 2

SYSTEM_PROMPT = (
    "You are a finance and investing expert. You explain concepts clearly, "
    "use correct financial terminology, and avoid speculation. "
    "When answering, you only use information provided in my training knowledge."
)

PRINT_EVERY = 10
SNIPPET_CHARS = 400
FLUSH_EVERY = 20

# -------------------------
# Helpers
# -------------------------
_ws = re.compile(r"\s+")
def clean_text(s: str) -> str:
    s = (s or "").replace("\x00", " ")
    return _ws.sub(" ", s).strip()

def trunc(s: str, max_chars: int) -> str:
    s = s or ""
    if len(s) <= max_chars:
        return s
    cut = s[:max_chars]
    last_space = cut.rfind(" ")
    return (cut[:last_space] if last_space > 0 else cut).rstrip()

def make_generation_prompt(title: str, abstract: str, body: str) -> str:
    source = f"TITLE: {title}\n\nABSTRACT: {abstract}\n\nEXCERPT: {body}"
    source = trunc(source, MAX_CONTEXT_CHARS)
    return f"""
You will create training Q&A pairs to teach a model finance/investing expertise.

Rules:
- Use ONLY my training knowledge. Do not invent facts, datasets, results, or numbers.
- If details are missing, say so explicitly (e.g., "The abstract does not specify ...").
- Questions should be what a serious finance student or analyst would ask:
  mix of conceptual, methodology, implications, limitations, and definitions.
- Answers should be precise, finance-literate, and grounded.
- Return STRICT JSON with exactly {QAS_PER_PAPER} items.

Output JSON schema:
{{
  "qas": [
    {{"question": "...", "answer": "..."}},
    {{"question": "...", "answer": "..."}}
  ]
}}

SOURCE:
\"\"\"{source}\"\"\"
""".strip()

def parse_qas(content: str) -> Optional[List[Dict[str, str]]]:
    try:
        obj = json.loads(content)
    except Exception:
        return None
    if not isinstance(obj, dict) or "qas" not in obj or not isinstance(obj["qas"], list):
        return None

    qas: List[Dict[str, str]] = []
    for item in obj["qas"]:
        if not isinstance(item, dict):
            continue
        q, a = item.get("question"), item.get("answer")
        if isinstance(q, str) and isinstance(a, str):
            qas.append({"question": q.strip(), "answer": a.strip()})

    return qas if len(qas) == QAS_PER_PAPER else None

def chatml_line(system_prompt: str, user_q: str, assistant_a: str) -> Dict[str, str]:
    return {"text": f"<|system|>{system_prompt}<|user|>{user_q}<|assistant|>{assistant_a}"}

# -------------------------
# OpenAI client
# -------------------------
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("Missing OPENAI_API_KEY in environment. Add it to .env or export it.")
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# -------------------------
# Load count for tqdm
# -------------------------
def count_lines_gz(path: Path) -> int:
    n = 0
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for _ in f:
            n += 1
    return n

print(f"IN_PATH: {IN_PATH} (exists={IN_PATH.exists()})")
print(f"OUT_PATH: {OUT_PATH} (append mode, will NOT overwrite)")
print("Counting papers (for progress bar)...")
total_lines = count_lines_gz(IN_PATH)
print(f"Total records in corpus file: {total_lines}")

# Convert desired start paper number (1-based) to zero-based index
start_index = max(0, START_PAPER_NUM - 1)
if start_index >= total_lines:
    raise RuntimeError(f"START_PAPER_NUM={START_PAPER_NUM} is beyond total_lines={total_lines}")

print(f"Starting at paper #{START_PAPER_NUM} (1-based), line index {start_index} (0-based).")
print(f"Will generate {QAS_PER_PAPER} QA pairs per usable paper.\n")

# -------------------------
# Main
# -------------------------
OUT_PATH.parent.mkdir(parents=True, exist_ok=True)

total_papers_read = 0          # papers iterated over after start_index
global_paper_num = 0           # absolute 1-based paper number in corpus
used_papers = 0
total_qas = 0

skipped_extraction = 0
skipped_short = 0
skipped_empty_body_field = 0
model_failures = 0

start_time = time.time()

with gzip.open(IN_PATH, "rt", encoding="utf-8") as fin, open(OUT_PATH, "a", encoding="utf-8") as fout:
    # Advance to start_index without loading all lines
    for i, _ in enumerate(fin):
        if i + 1 >= START_PAPER_NUM:
            # We‚Äôve just consumed START_PAPER_NUM-1 lines; rewind one line by keeping it
            # But gzip streams can't seek back; instead we break and handle by reopening.
            break

# Reopen and skip precisely start_index lines
with gzip.open(IN_PATH, "rt", encoding="utf-8") as fin, open(OUT_PATH, "a", encoding="utf-8") as fout:
    for _ in range(start_index):
        next(fin)

    # tqdm total is remaining lines
    remaining = total_lines - start_index

    for line in tqdm(fin, total=remaining, desc=f"Generating QAs from paper #{START_PAPER_NUM}+"):
        total_papers_read += 1
        global_paper_num = start_index + total_papers_read  # 1-based absolute paper number

        rec = json.loads(line)

        # Body field selection
        body = rec.get("body_p1_10")
        if body is None:
            body = rec.get("body_p1_5", "")

        if not rec.get("extraction_ok", True):
            skipped_extraction += 1
            continue

        if int(rec.get("body_chars", 0)) < MIN_BODY_CHARS:
            skipped_short += 1
            continue

        body = clean_text(body)
        if not body:
            skipped_empty_body_field += 1
            continue

        title = clean_text(rec.get("primary_title", ""))
        abstract = clean_text(rec.get("abstract", ""))

        prompt = make_generation_prompt(title, abstract, body)

        qas = None
        last_raw = None

        for attempt in range(1, RETRIES + 1):
            try:
                resp = client.chat.completions.create(
                    model=MODEL,
                    messages=[{"role": "user", "content": prompt}],
                    temperature=TEMPERATURE,
                    response_format={"type": "json_object"},
                )
                last_raw = resp.choices[0].message.content
                qas = parse_qas(last_raw)
                if qas is None:
                    raise ValueError(f"Invalid JSON or not exactly {QAS_PER_PAPER} qas")
                break
            except Exception as e:
                if attempt < RETRIES:
                    print(f"\n‚ö†Ô∏è Paper #{global_paper_num} attempt {attempt} failed: {e}")
                    if last_raw:
                        snip = clean_text(last_raw)[:SNIPPET_CHARS]
                        print(f"   model snippet: {snip!r}")
                    time.sleep(BACKOFF * attempt)
                else:
                    model_failures += 1
                    print(f"\n‚ùå Paper #{global_paper_num} FAILED after {RETRIES} retries. id={rec.get('id')}")
                    print(f"   title={title[:120]!r}")
                    if last_raw:
                        snip = clean_text(last_raw)[:SNIPPET_CHARS]
                        print(f"   last model snippet: {snip!r}")
                    qas = None

        if qas is None:
            continue

        # Append QAS_PER_PAPER lines to the existing OUT_PATH
        for qa in qas:
            row = chatml_line(SYSTEM_PROMPT, qa["question"], qa["answer"])
            fout.write(json.dumps(row, ensure_ascii=False) + "\n")
            total_qas += 1

        used_papers += 1

        if used_papers % FLUSH_EVERY == 0:
            fout.flush()

        if used_papers % PRINT_EVERY == 0:
            elapsed = time.time() - start_time
            print(
                f"\nüìå Progress: start_paper={START_PAPER_NUM} | current_paper={global_paper_num} | "
                f"used={used_papers} | qas={total_qas} | "
                f"skipped(extraction={skipped_extraction}, short={skipped_short}, empty_body={skipped_empty_body_field}) | "
                f"model_failures={model_failures} | elapsed={elapsed/60:.1f}m"
            )

        time.sleep(SLEEP_BETWEEN_CALLS)

print("\n‚úÖ DONE (follow-up run)")
print(f"Started at paper #: {START_PAPER_NUM}")
print(f"Papers read (from start): {total_papers_read}")
print(f"Used papers: {used_papers}")
print(f"Total QAs appended: {total_qas}")
print(f"Skipped - extraction_ok false: {skipped_extraction}")
print(f"Skipped - body_chars too small: {skipped_short}")
print(f"Skipped - empty body field: {skipped_empty_body_field}")
print(f"Model failures: {model_failures}")
print(f"Output (appended): {OUT_PATH}")


IN_PATH: arxiv_finance_corpus/papers_text_p1_10.jsonl.gz (exists=True)
OUT_PATH: arxiv_finance_corpus/synthetic_finance_qa.chatml.jsonl (append mode, will NOT overwrite)
Counting papers (for progress bar)...
Total records in corpus file: 451
Starting at paper #92 (1-based), line index 91 (0-based).
Will generate 2 QA pairs per usable paper.



Generating QAs from paper #92+:   2%|‚ñé         | 9/360 [07:45<5:06:55, 52.46s/it]


üìå Progress: start_paper=92 | current_paper=102 | used=10 | qas=20 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=8.6m


Generating QAs from paper #92+:   6%|‚ñå         | 21/360 [15:09<3:28:06, 36.83s/it]


üìå Progress: start_paper=92 | current_paper=112 | used=20 | qas=40 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=15.2m


Generating QAs from paper #92+:   9%|‚ñä         | 31/360 [21:57<3:14:48, 35.53s/it]


üìå Progress: start_paper=92 | current_paper=122 | used=30 | qas=60 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=21.9m


Generating QAs from paper #92+:  11%|‚ñà         | 40/360 [29:27<4:12:19, 47.31s/it]


üìå Progress: start_paper=92 | current_paper=132 | used=40 | qas=80 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=30.3m


Generating QAs from paper #92+:  14%|‚ñà‚ñç        | 51/360 [38:33<4:10:08, 48.57s/it]


üìå Progress: start_paper=92 | current_paper=142 | used=50 | qas=100 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=38.6m


Generating QAs from paper #92+:  17%|‚ñà‚ñã        | 61/360 [45:39<3:27:11, 41.58s/it]


üìå Progress: start_paper=92 | current_paper=152 | used=60 | qas=120 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=45.7m


Generating QAs from paper #92+:  19%|‚ñà‚ñâ        | 70/360 [55:09<4:02:04, 50.08s/it]


üìå Progress: start_paper=92 | current_paper=162 | used=70 | qas=140 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=55.8m


Generating QAs from paper #92+:  22%|‚ñà‚ñà‚ñè       | 80/360 [1:01:11<2:51:19, 36.71s/it]


üìå Progress: start_paper=92 | current_paper=172 | used=80 | qas=160 | skipped(extraction=1, short=0, empty_body=0) | model_failures=0 | elapsed=61.6m


Generating QAs from paper #92+:  24%|‚ñà‚ñà‚ñç       | 88/360 [1:06:24<3:25:15, 45.28s/it]


KeyboardInterrupt: 

## Add edge cases

In [19]:
# add_20_edgecase_qas_refusal.py
# Appends 20 edge-case QAs (false-logic / irrelevant) to your ChatML JSONL.
# Each answer explicitly refuses and explains why (not grounded in provided source text).

import json
from pathlib import Path

CORPUS_DIR = Path("arxiv_finance_corpus")
OUT_PATH = CORPUS_DIR / "synthetic_finance_qa.chatml.jsonl"

SYSTEM_PROMPT = (
    "You are a finance and investing expert. You explain concepts clearly, "
    "use correct financial terminology, and avoid speculation. "
    "When answering, you only use information provided in the source text."
)

def chatml_line(system_prompt: str, user_q: str, assistant_a: str) -> dict:
    return {"text": f"<|system|>{system_prompt}<|user|>{user_q}<|assistant|>{assistant_a}"}

# 20 edge-case prompts that are either:
# - logically invalid / contradictory
# - not finance-related
# - require external info not present in any provided source text
# Each answer MUST refuse and state why.
EDGE_CASE_QAS = [
    (
        "If inflation is 2% and 3% at the same time, what is the exact real return?",
        "I can‚Äôt give an exact real return because the question is internally contradictory (inflation can‚Äôt be both 2% and 3% for the same period in a single calculation), and no source text is provided to clarify which value applies."
    ),
    (
        "Prove that every stock that goes down must go back up eventually.",
        "I can‚Äôt answer because the claim is false as stated (there is no guarantee a stock must recover), and there‚Äôs no source text provided that defines assumptions or a model under which such a statement could be evaluated."
    ),
    (
        "What is Warren Buffett‚Äôs current cash balance to the dollar?",
        "I can‚Äôt answer because that requires up-to-date external information that is not included in my training knowledge provided."
    ),
    (
        "What‚Äôs the best crypto to buy tomorrow for guaranteed profit?",
        "I can‚Äôt answer because it asks for guaranteed future profits (which isn‚Äôt possible to guarantee), and it requires speculation and market prediction not supported by any provided source text."
    ),
    (
        "If P/E is negative, does that mean the company is infinitely cheap and therefore a buy?",
        "I can‚Äôt answer as framed because the conclusion doesn‚Äôt follow from the premise (negative P/E usually reflects negative earnings and doesn‚Äôt imply ‚Äúinfinitely cheap‚Äù), and no source text is provided to define the context or metrics being used."
    ),
    (
        "Explain the plot twist in Inception and how it impacts bond yields.",
        "I can‚Äôt answer because the question is not related to the finance source text and mixes an unrelated film discussion with bond yields without any provided evidence or context."
    ),
    (
        "If a portfolio has 0% risk, why not lever it 1000x to become rich?",
        "I can‚Äôt answer because the premise is unrealistic/undefined (0% risk isn‚Äôt established here), and no source text is provided to specify constraints like borrowing costs, margin rules, tail risk, or model assumptions."
    ),
    (
        "What was the exact closing price of AAPL on my birthday?",
        "I can‚Äôt answer because it requires external historical market data and your birthday date, neither of which are included in the provided source text."
    ),
    (
        "Since diversification reduces risk, does adding infinitely many stocks reduce risk to exactly zero?",
        "I can‚Äôt give a definitive numeric answer because the question assumes a specific risk model and correlation structure that are not provided in my training knowledge; in general, diversification reduces idiosyncratic risk but does not eliminate systematic risk."
    ),
    (
        "In the paper, what is the author‚Äôs political opinion about taxation?",
        "I can‚Äôt answer because political opinions are not finance methodology/results, and no relevant statement is provided in my training knowledge to support an answer."
    ),
    (
        "If GDP goes up, does that always mean every stock must rise the same day?",
        "I can‚Äôt answer as stated because the logic is invalid (macro growth doesn‚Äôt force same-day uniform stock moves), and no source text is provided that claims or models such a deterministic relationship."
    ),
    (
        "What is the secret algorithm hedge funds use to beat the market?",
        "I can‚Äôt answer because that asks for unspecified proprietary information and isn‚Äôt grounded in any provided source text."
    ),
    (
        "If interest rates are 5%, how many years until my money doubles exactly (no assumptions needed)?",
        "I can‚Äôt give an exact answer because doubling time depends on compounding conventions (simple vs compound, frequency), and no source text specifies which assumptions to use."
    ),
    (
        "Summarize the entire dataset you trained on and list the top 10 sources.",
        "I can‚Äôt answer because that information isn‚Äôt included in the provided source text, and I don‚Äôt have access to your full training dataset or its sources from this prompt."
    ),
    (
        "Is the statement 'all arbitrage is risk-free' always true? Answer yes or no.",
        "I can‚Äôt answer with a simple yes/no because the statement depends on definitions and market frictions; without source text defining arbitrage conditions (execution risk, latency, funding, settlement), a binary answer would be unsupported."
    ),
    (
        "What is my portfolio‚Äôs Sharpe ratio if I don‚Äôt tell you returns or volatility?",
        "I can‚Äôt compute it because the required inputs (return, risk-free rate, volatility) are not provided in my training knowledge or the question."
    ),
    (
        "If a stablecoin depegs, does that mean it‚Äôs illegal?",
        "I can‚Äôt answer because legality depends on jurisdiction and facts not provided in my training knowledge; the question also mixes a market event (depeg) with legal status without evidence."
    ),
    (
        "How many LPs lost money on Uniswap v3 last week, exactly?",
        "I can‚Äôt answer because it requires specific time-bounded on-chain analysis data not included in the provided source text."
    ),
    (
        "If two assets have correlation 0, does that prove they are independent and safe to pair trade?",
        "I can‚Äôt answer as stated because zero correlation does not prove independence, and the safety of a pair trade depends on assumptions and data not provided in my training knowledge."
    ),
    (
        "What is the capital of France and how does it affect the yield curve?",
        "I can‚Äôt answer because the first part is general trivia unrelated to the finance source text, and the second part asserts a relationship without any provided evidence or context to support it."
    ),
]

def main():
    if not OUT_PATH.exists():
        raise FileNotFoundError(
            f"OUT_PATH does not exist: {OUT_PATH}\n"
            f"Create it first (run your main generator once), or correct CORPUS_DIR/OUT_PATH."
        )

    OUT_PATH.parent.mkdir(parents=True, exist_ok=True)

    with open(OUT_PATH, "a", encoding="utf-8") as f:
        for q, a in EDGE_CASE_QAS:
            row = chatml_line(SYSTEM_PROMPT, q, a)
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

    print(f"‚úÖ Appended {len(EDGE_CASE_QAS)} edge-case refusal QAs to: {OUT_PATH}")

if __name__ == "__main__":
    main()


‚úÖ Appended 20 edge-case refusal QAs to: arxiv_finance_corpus/synthetic_finance_qa.chatml.jsonl


## Clean/ Normalize the dataset.jsonl

In [24]:
import json
import unicodedata

IN  = "arxiv_finance_corpus/synthetic_finance_qa.chatml.jsonl"
OUT = "arxiv_finance_corpus/synthetic_finance_qa.chatml.clean.jsonl"

kept = 0
dropped = 0

with open(IN, "r", encoding="utf-8", errors="replace") as fin, \
     open(OUT, "w", encoding="utf-8") as fout:
    for i, line in enumerate(fin, 1):
        line = line.strip()
        if not line:
            dropped += 1
            continue
        try:
            obj = json.loads(line)
            if "text" not in obj or not isinstance(obj["text"], str):
                dropped += 1
                continue

            # Normalize unicode (this removes ambiguous characters)
            clean_text = unicodedata.normalize("NFKC", obj["text"])
            fout.write(json.dumps({"text": clean_text}, ensure_ascii=False) + "\n")
            kept += 1
        except Exception:
            dropped += 1

print(f"‚úÖ Cleaned dataset written to: {OUT}")
print(f"Kept: {kept}, Dropped: {dropped}")


‚úÖ Cleaned dataset written to: arxiv_finance_corpus/synthetic_finance_qa.chatml.clean.jsonl
Kept: 644, Dropped: 0


## Fine tuning setup

In [2]:
pip install unsloth transformers peft bitsandbytes datasets torch

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting unsloth
  Downloading unsloth-2025.12.9-py3-none-any.whl.metadata (65 kB)
Collecting transformers
  Downloading transformers-4.57.3-py3-none-any.whl.metadata (43 kB)
Collecting peft
  Downloading peft-0.18.0-py3-none-any.whl.metadata (14 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.49.0-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Collecting datasets
  Downloading datasets-4.4.2-py3-none-any.whl.metadata (19 kB)
Collecting unsloth_zoo>=2025.12.7 (from unsloth)
  Downloading unsloth_zoo-2025.12.7-py3-none-any.whl.metadata (32 kB)
Collecting torch
  Downloading torch-2.9.1-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (30 kB)
Collecting tyro (from unsloth)
  Downloading tyro-1.0.3-py3-none-any.whl.metadata (12 kB)
Collecting xformers>=0.0.27.post2 (from unsloth)
  Downloading xformers-0.0.33.post2-cp39-abi3-m

In [34]:
# Finetune Llama-3.1-8B-Instruct (Unsloth 4-bit) on ChatML JSONL
# WITHOUT HuggingFace datasets/pyarrow (fixes DatasetGenerationError).
#
# Input JSONL format (one per line):
# {"text": "<|system|>...<|user|>...<|assistant|>..."}
#
# Output:
# - LoRA adapters saved to OUTPUT_DIR
# - tokenizer saved to OUTPUT_DIR

import os
import json
import torch
from torch.utils.data import Dataset
from transformers import TrainingArguments, Trainer
from unsloth import FastLanguageModel

# -------------------------
# Config
# -------------------------
DATA_PATH = "arxiv_finance_corpus/synthetic_finance_qa.chatml.clean.jsonl"
OUTPUT_DIR = "llama31-8b-finance-qlora-lora"

MODEL_NAME = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
MAX_SEQ_LENGTH = 2048

PER_DEVICE_BATCH = 2
GRAD_ACCUM = 8
EPOCHS = 2
LR = 1e-4
WARMUP_RATIO = 0.03
LOGGING_STEPS = 25
SAVE_STEPS = 50

# LoRA
LORA_R = 16
LORA_ALPHA = 32
LORA_DROPOUT = 0.05

# -------------------------
# Sanity checks
# -------------------------
if not os.path.exists(DATA_PATH):
    raise FileNotFoundError(f"Dataset not found at: {DATA_PATH}")

if not torch.cuda.is_available():
    raise RuntimeError("CUDA GPU not detected. QLoRA fine-tuning requires a GPU.")

print("‚úÖ CUDA:", torch.cuda.get_device_name(0))
print("‚úÖ DATA_PATH:", os.path.abspath(DATA_PATH))

# -------------------------
# Load model + tokenizer (4-bit)
# -------------------------
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=MODEL_NAME,
    max_seq_length=MAX_SEQ_LENGTH,
    dtype=None,
    load_in_4bit=True,
)

# Important: ensure tokenizer has a pad token for padding="max_length"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# -------------------------
# Attach LoRA adapters (QLoRA)
# -------------------------
model = FastLanguageModel.get_peft_model(
    model,
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",
    random_state=42,
)

# -------------------------
# Read JSONL into memory (650-ish rows is fine)
# -------------------------
texts = []
with open(DATA_PATH, "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        t = obj.get("text", None)
        if isinstance(t, str) and t.strip():
            texts.append(t)

if len(texts) == 0:
    raise RuntimeError("Loaded 0 training examples. Check your JSONL content.")

print(f"‚úÖ Loaded {len(texts)} examples")
print("üîé Example snippet:", texts[0][:200].replace("\n", "\\n"))

# -------------------------
# Torch Dataset (no pyarrow)
# -------------------------
class ChatMLJsonlDataset(Dataset):
    def __init__(self, texts, tokenizer, max_length):
        self.texts = texts
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        enc = self.tokenizer(
            self.texts[idx],
            truncation=True,
            max_length=self.max_length,
            padding="max_length",
            return_tensors="pt",
        )
        input_ids = enc["input_ids"][0]
        attention_mask = enc["attention_mask"][0]
        labels = input_ids.clone()  # causal LM labels

        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": labels,
        }

train_dataset = ChatMLJsonlDataset(texts, tokenizer, MAX_SEQ_LENGTH)

# -------------------------
# Training args + Trainer
# -------------------------
args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    per_device_train_batch_size=PER_DEVICE_BATCH,
    gradient_accumulation_steps=GRAD_ACCUM,
    num_train_epochs=EPOCHS,
    learning_rate=LR,
    warmup_ratio=WARMUP_RATIO,
    lr_scheduler_type="cosine",
    logging_steps=LOGGING_STEPS,
    save_steps=SAVE_STEPS,
    save_total_limit=2,
    optim="paged_adamw_8bit",
    weight_decay=0.01,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    report_to="none",
    seed=42,
    remove_unused_columns=False,  # IMPORTANT for custom datasets
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
)

# -------------------------
# Train
# -------------------------
trainer.train()

# -------------------------
# Save LoRA adapters + tokenizer
# -------------------------
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
print(f"‚úÖ Saved LoRA adapters + tokenizer to: {OUTPUT_DIR}")

# -------------------------
# OPTIONAL: Merge LoRA -> 16-bit full model (often OOM; skip if unsure)
# -------------------------
# MERGED_DIR = "llama31-8b-finance-merged"
# model.save_pretrained_merged(MERGED_DIR, tokenizer, save_method="merged_16bit")
# print(f"‚úÖ Saved merged 16-bit model to: {MERGED_DIR}")



‚úÖ CUDA: inference-ai GPU cuda
‚úÖ DATA_PATH: /home/jovyan/work/MLE_in_Gen_AI-Course/class7/Homework_Kevin/arxiv_finance_corpus/synthetic_finance_qa.chatml.clean.jsonl
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
‚úÖ Loaded 644 examples
üîé Example snippet: <|system|>You are a finance and investing expert. You explain concepts clearly, use correct financial terminology, and avoid speculation. When answering, you only use information provided in the sourc


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 644 | Num Epochs = 2 | Total steps = 82
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 8
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 8 x 1) = 16
 "-____-"     Trainable parameters = 41,943,040 of 8,072,204,288 (0.52% trained)


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
25,56.829
50,40.3541
75,39.0338


‚úÖ Saved LoRA adapters + tokenizer to: llama31-8b-finance-qlora-lora


## Evaluation

In [3]:
pip install -U unsloth transformers peft accelerate bitsandbytes torch


Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [None]:
# Finetuning eval: compare your FT LoRA (llama31-8b-finance-qlora-lora)
# vs the original Unsloth base (Meta-Llama-3.1-8B-Instruct-bnb-4bit)
#
# Works in a notebook or a .py file.
#
# If you haven't installed deps:
# pip install -U unsloth transformers peft accelerate bitsandbytes torch

import os
import time
import torch
from unsloth import FastLanguageModel
from peft import PeftModel

# -----------------------------
# Config
# -----------------------------
BASE_MODEL_ID = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
# Set this to your local adapter folder OR a HF repo id containing the LoRA adapter weights
FT_ADAPTER_PATH = "llama31-8b-finance-qlora-lora"  # e.g., "./llama31-8b-finance-qlora-lora"

MAX_NEW_TOKENS = 384
TEMPERATURE = 0.2
TOP_P = 0.9
SEED = 42

torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(SEED)

# -----------------------------
# 5 finance/investing questions (picked from your list)
# -----------------------------
test_questions = [
    "Explain look-ahead bias in backtesting and how to avoid it.",
    "What is survivorship bias and why does it matter in equity datasets?",
    "Compare VaR and Expected Shortfall conceptually. When is ES preferred?",
    "Why might a model have high AUC but poor calibration?",
    "Why can a high Sharpe ratio be misleading in small samples?",
]

# -----------------------------
# Helpers
# -----------------------------
def load_base_model():
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL_ID,
        max_seq_length=4096,
        load_in_4bit=True,
        dtype=None,  # let unsloth choose
    )
    model = FastLanguageModel.for_inference(model)
    return model, tokenizer

def load_ft_model(tokenizer):
    # Load the same base again (clean weights), then attach the LoRA adapter
    model, _tok = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL_ID,
        max_seq_length=4096,
        load_in_4bit=True,
        dtype=None,
    )
    # Attach adapter robustly across PEFT versions:
    # - Prefer PeftModel.from_pretrained (most compatible)
    # - If you have a merged model instead of an adapter, load that as a normal HF model instead.
    model = PeftModel.from_pretrained(model, FT_ADAPTER_PATH)
    model = FastLanguageModel.for_inference(model)
    return model

def format_chat(tokenizer, user_text):
    messages = [
        {"role": "system", "content": "You are a helpful finance and investing assistant. Be accurate, concise, and practical."},
        {"role": "user", "content": user_text},
    ]
    # Unsloth tokenizers support apply_chat_template
    input_ids = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    )
    return input_ids

@torch.no_grad()
def generate_answer(model, tokenizer, question):
    input_ids = format_chat(tokenizer, question).to(model.device)
    attention_mask = torch.ones_like(input_ids, device=model.device)

    gen = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=MAX_NEW_TOKENS,
        temperature=TEMPERATURE,
        do_sample=(TEMPERATURE > 0),
        top_p=TOP_P,
        use_cache=True,
    )

    text = tokenizer.decode(gen[0], skip_special_tokens=False)

    # Extract only the assistant completion if possible
    # (Works for most ChatML-ish templates)
    # Fallback: return full decoded text.
    for marker in ["<|assistant|>", "assistant\n", "### Assistant:"]:
        if marker in text:
            return text.split(marker, 1)[-1].strip()
    return text.strip()

def judge_with_base(base_model, tokenizer, question, base_answer, ft_answer):
    """
    Optional: LLM-as-judge using the BASE model only.
    Produces a score (1-10) for each answer + short rationale.
    """
    judge_prompt = f"""
You are grading two answers to the same finance/investing question.

Question:
{question}

Answer A (BASE):
{base_answer}

Answer B (FT):
{ft_answer}

Rubric (most important first):
1) Factual correctness (no hallucinations)
2) Finance-specific insight / domain depth
3) Practical usefulness (how-to, pitfalls, examples)
4) Clarity and concision

Return STRICT JSON only:
{{
  "winner": "A" or "B" or "tie",
  "score_A": 1-10,
  "score_B": 1-10,
  "rationale": "1-3 sentences"
}}
""".strip()

    input_ids = format_chat(tokenizer, judge_prompt).to(base_model.device)
    attention_mask = torch.ones_like(input_ids, device=base_model.device)
    gen = base_model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=256,
        temperature=0.0,  # deterministic judging
        do_sample=False,
        use_cache=True,
    )
    out = tokenizer.decode(gen[0], skip_special_tokens=False)
    # best-effort extract JSON-ish tail
    if "{" in out and "}" in out:
        out = out[out.find("{"): out.rfind("}") + 1]
    return out.strip()

# -----------------------------
# Run evaluation
# -----------------------------
print("Loading BASE model...")
base_model, tokenizer = load_base_model()

print("Loading FT (LoRA) model...")
ft_model = load_ft_model(tokenizer)

results = []
start = time.time()

for i, q in enumerate(test_questions, 1):
    print(f"\n{'='*90}\nQ{i}: {q}\n{'='*90}")

    base_ans = generate_answer(base_model, tokenizer, q)
    ft_ans   = generate_answer(ft_model, tokenizer, q)

    print("\n--- BASE ANSWER ------------------------------------------------------------")
    print(base_ans)

    print("\n--- FT ANSWER --------------------------------------------------------------")
    print(ft_ans)

    judge_json = judge_with_base(base_model, tokenizer, q, base_ans, ft_ans)
    print("\n--- JUDGE (BASE as judge, JSON) -------------------------------------------")
    print(judge_json)

    results.append({
        "question": q,
        "base_answer": base_ans,
        "ft_answer": ft_ans,
        "judge_json": judge_json,
    })

elapsed = time.time() - start
print(f"\nDone. Evaluated {len(test_questions)} questions in {elapsed:.1f}s.")

# If you want to save results:
import json
with open("eval_results_sft_ft_vs_base.json", "w") as f:
    json.dump(results, f, indent=2)
print("Saved: eval_results_sft_ft_vs_base.json")


NotImplementedError: Unsloth cannot find any torch accelerator? You need a GPU.

In [None]:
# -----------------------------
# NEW 15 QUESTIONS ONLY
# -----------------------------
test_questions = [
    "What is the difference between market impact and slippage?",
    "How does return autocorrelation affect performance evaluation?",
    "What is selection bias in empirical finance studies?",
    "When would you use log returns instead of simple returns, and why?",
    "What does overfitting look like in time-series forecasting models?",
    "Explain data snooping / multiple testing in factor research. How do you control it?",
    "What is walk-forward (rolling) validation and why is it preferred for time series?",
    "Explain leakage in feature engineering for financial ML, with 2 concrete examples.",
    "What does it mean for a strategy to be 'capacity constrained'?",
    "Why can Sharpe be gamed by smoothing returns? What metrics help detect this?",
    "Explain volatility targeting and when it can improve risk-adjusted returns.",
    "What is regime dependence? How can you test if a strategy is regime-dependent?",
    "Compare equal-weight vs value-weight portfolios. When does each make sense?",
    "What is liquidity risk and how can it show up in backtests?",
    "Explain the difference between in-sample, validation, and out-of-sample in quant research.",
]

import time
import json
from pathlib import Path

OUT_FILE = "eval_results_sft_ft_vs_base.json"

print("Loading BASE model...")
base_model, tokenizer = load_base_model()

print("Loading FT model...")
ft_model = load_ft_model(tokenizer)

results = []
start = time.time()

for i, q in enumerate(test_questions, 1):
    print(f"\n{'='*90}\nQ{i}: {q}\n{'='*90}")

    base_ans = generate_answer(base_model, tokenizer, q)
    ft_ans = generate_answer(ft_model, tokenizer, q)

    print("\n--- BASE ANSWER ---")
    print(base_ans)

    print("\n--- FT ANSWER ---")
    print(ft_ans)

    judge_json = judge_with_base(base_model, tokenizer, q, base_ans, ft_ans)
    print("\n--- JUDGE ---")
    print(judge_json)

    results.append({
        "question": q,
        "base_answer": base_ans,
        "ft_answer": ft_ans,
        "judge_json": judge_json,
    })

elapsed = time.time() - start
print(f"\nEvaluated {len(results)} questions in {elapsed:.1f}s.")

# -----------------------------
# Append results to existing JSON
# -----------------------------
if Path(OUT_FILE).exists():
    with open(OUT_FILE, "r") as f:
        existing_results = json.load(f)
else:
    existing_results = []

existing_results.extend(results)

with open(OUT_FILE, "w") as f:
    json.dump(existing_results, f, indent=2)

print(f"Appended {len(results)} entries ‚Üí total {len(existing_results)}")
print(f"Saved to {OUT_FILE}")


Loading BASE model...
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Loading FT model...
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!

Q1: What is the difference between 

## Alignment

In [4]:
!pip install -U unsloth peft accelerate openai python-dotenv


Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython -m pip install --upgrade pip[0m


## Create preference dataset first

In [2]:
questions = [
    "What is the difference between risk and uncertainty in investing?",
    "How does inflation erode real returns over time?",
    "What is the equity risk premium and why does it exist?",
    "How do nominal and real interest rates differ?",
    "What role do expectations play in asset pricing?",
    "What is time value of money and why is it fundamental?",
    "How do opportunity costs affect investment decisions?",
    "What is diversification and when does it fail?",
    "What does risk-adjusted return mean in practice?",
    "How does leverage amplify both gains and losses?",
    "What is modern portfolio theory and its key assumptions?",
    "What is the efficient frontier?",
    "How does correlation affect portfolio risk?",
    "What is beta and how is it used?",
    "When is beta a poor measure of risk?",
    "What is idiosyncratic versus systematic risk?",
    "How does rebalancing improve portfolio discipline?",
    "What are the tradeoffs of frequent rebalancing?",
    "What is downside risk and how is it measured?",
    "How does drawdown differ from volatility?",
    "What does the Sharpe ratio measure?",
    "When can the Sharpe ratio be misleading?",
    "What is the Sortino ratio and when is it preferred?",
    "How does skewness affect performance metrics?",
    "What is maximum drawdown and why does it matter?",
    "What does return autocorrelation imply?",
    "How does serial correlation distort Sharpe ratios?",
    "What is backtest overfitting?",
    "What is walk-forward validation?",
    "Why is out-of-sample testing critical?",
    "What is loss aversion?",
    "How does overconfidence affect investor outcomes?",
    "What is confirmation bias in investing?",
    "How does herd behavior impact markets?",
    "What is mental accounting?",
    "Why do investors chase past performance?",
    "What is anchoring in price decisions?",
    "How does framing change risk perception?",
    "What is the disposition effect?",
    "How can behavioral biases be mitigated?",
    "What is liquidity risk?",
    "How does bid-ask spread affect returns?",
    "What is slippage?",
    "What is market impact?",
    "How do limit orders differ from market orders?",
    "What is price discovery?",
    "How do dark pools affect markets?",
    "What is high-frequency trading?",
    "How does volatility clustering occur?",
    "What is tail risk?",
    "What distinguishes stocks from bonds?",
    "How do bond prices react to interest rate changes?",
    "What is duration?",
    "What is credit risk?",
    "How do equities hedge inflation?",
    "When do bonds fail as diversifiers?",
    "What is real estate‚Äôs role in a portfolio?",
    "How do commodities behave during inflation?",
    "What are REITs?",
    "What risks are unique to private equity?",
    "What is intrinsic value?",
    "How does discounted cash flow work?",
    "What assumptions drive DCF models?",
    "What is multiple expansion?",
    "Why do valuation multiples differ across sectors?",
    "What is margin of safety?",
    "How do growth assumptions affect valuation?",
    "When are relative valuations misleading?",
    "What is terminal value?",
    "How do interest rates affect equity valuations?",
    "What is Value at Risk?",
    "What is Expected Shortfall?",
    "When is Expected Shortfall preferred over VaR?",
    "What is stress testing?",
    "What is scenario analysis?",
    "How do stop-losses help or hurt?",
    "What is position sizing?",
    "How do correlations change in crises?",
    "What is tail hedging?",
    "How does volatility targeting work?",
    "What causes business cycles?",
    "How do interest rate hikes affect equities?",
    "What is yield curve inversion?",
    "Why does the yield curve matter?",
    "How does monetary policy impact asset prices?",
    "What is fiscal dominance?",
    "How does currency risk affect returns?",
    "What is purchasing power parity?",
    "How do capital flows affect markets?",
    "What is stagflation?",
    "What is survivorship bias?",
    "What is look-ahead bias?",
    "What is selection bias?",
    "What is data snooping?",
    "Why is multiple testing dangerous?",
    "What is leakage in financial machine learning?",
    "Why are financial time series hard to model?",
    "What is non-stationarity?",
    "What is regime dependence?",
    "How do structural breaks affect models?",
    "What defines value investing?",
    "What defines growth investing?",
    "What is momentum investing?",
    "Why does momentum persist?",
    "What is factor investing?",
    "What are common equity factors?",
    "When do factors underperform?",
    "What is smart beta?",
    "What is active versus passive investing?",
    "Why do most active managers underperform?",
    "How should emergency funds be sized?",
    "What is the role of insurance in financial planning?",
    "How does risk tolerance differ from risk capacity?",
    "How should young investors allocate assets?",
    "How does human capital affect asset allocation?",
    "When should debt be paid down versus invested?",
    "How does compounding benefit long-term savers?",
    "What mistakes do first-time investors make?",
    "How should low-income households approach investing?",
    "What is the importance of fee minimization?",
    "What is capacity constraint in strategies?",
    "Why do some strategies fail at scale?",
    "How does turnover affect net returns?",
    "What is transaction cost analysis?",
    "How do taxes affect investment outcomes?",
    "What is tax-loss harvesting?",
    "How do ETFs maintain price alignment?",
    "What risks do leveraged ETFs carry?",
    "How does volatility decay work?",
    "Why is liquidity pro-cyclical?",
    "What is ESG investing?",
    "What are critiques of ESG metrics?",
    "How does short-termism harm markets?",
    "What is fiduciary duty?",
    "How should advisors manage conflicts of interest?",
    "What is financial inclusion?",
    "How can investing widen inequality?",
    "How can investing reduce inequality?",
    "What is sustainable investing?",
    "How should uncertainty be communicated to clients?",
    "Why do machine learning models struggle in finance?",
    "What is overfitting in trading models?",
    "How does regularization help?",
    "Why does cross-validation fail for time series?",
    "What is feature leakage?",
    "Why is explainability important in finance machine learning?",
    "How does model drift occur?",
    "What is concept drift?",
    "How should models be monitored post-deployment?",
    "When should models be retrained?",
    "What drives long-term equity returns?",
    "How does demographic change affect markets?",
    "Why does patience matter in investing?",
    "What is sequence-of-returns risk?",
    "How does retirement timing affect risk?",
    "What is glide-path allocation?",
    "How should portfolios change with age?",
    "Why is consistency more important than timing?",
    "What is behavioral alpha?",
    "How do expectations shape long-term outcomes?",
    "Why is humility important in investing?",
    "How should uncertainty be priced?",
    "What distinguishes speculation from investing?",
    "How do narratives influence markets?",
    "Why do bubbles form?",
    "What causes bubbles to burst?",
    "What is reflexivity in markets?",
    "How does leverage create fragility?",
    "Why is simplicity powerful in finance?",
    "How should investors think probabilistically?",
    "What does margin of error mean in forecasts?",
    "Why do forecasts fail?",
    "How can investors prepare for unknown risks?",
    "What is antifragility in portfolios?",
    "How should investors respond to drawdowns?",
    "What role does discipline play in returns?",
    "How do incentives shape financial behavior?",
    "Why is long-term thinking rare?",
    "How should advice differ for rich versus poor households?",
    "What makes a financial explanation truly helpful?",
    "Why does buying a stock guarantee profits over time?",
    "How can investors eliminate all risk completely?",
    "Why do stocks always go up every year?",
    "Is diversification useless since markets are random?",
    "How does astrology improve portfolio performance?",
    "Why should everyone day-trade to get rich quickly?",
    "How does owning more stocks reduce total market risk?",
    "Why is past performance a perfect predictor of future returns?",
    "How can leverage make losses impossible?",
    "Why do interest rates only affect bonds and not stocks?",
    "Is inflation always bad for all investors?",
    "Why should low-income families avoid investing entirely?",
    "How does printing money create wealth without consequences?",
    "Why are financial markets controlled by a single hidden entity?",
    "How does buying at the top maximize long-term returns?",
    "Why do charts work better if you believe in them?",
    "How can technical indicators predict exact future prices?",
    "Why does higher volatility always mean higher returns?",
    "Why should investors ignore risk if returns are high?",
    "How does luck replace skill in all investing outcomes?"
] 

In [4]:
import os
import json
import re
import time
import asyncio
from typing import Dict, Any, Tuple, Optional
from pathlib import Path

from dotenv import load_dotenv
from openai import AsyncOpenAI

import torch
from unsloth import FastLanguageModel
from peft import PeftModel

load_dotenv()

# -----------------------------
# Config (single source of truth)
# -----------------------------
FT_ADAPTER_PATH = "llama31-8b-finance-qlora-lora"
BASE_MODEL_ID = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

OUT_FILE = "DPO_preference_data.jsonl"   # JSONL (1 JSON per line)
DEBUG_EVERY = 1

# Primary judge gates
SKIP_IF_JUDGE_ERROR = True
REQUIRE_JUDGE_CONFIDENCE = True
MIN_JUDGE_CONF = 0.70

# Generation
MAX_NEW_TOKENS = 512
TEMPERATURE = 0.7
TOP_P = 0.9
GEN_ATTEMPTS = 3

SYSTEM_A = "You are a finance expert. Be concise, direct, and effective. Prefer short bullets. Avoid filler."
SYSTEM_B = "You are a finance expert. Be clear, professional, and helpful. Explain simply. Use a brief example if helpful."

# Judge (preference)
JUDGE_MODEL = "gpt-4.1-mini"
JUDGE_CONCURRENCY = 12
JUDGE_RETRIES = 3

# Fact-check gate (targeted)
FACTCHECK_MODEL = "gpt-4.1-mini"
FACTCHECK_RETRIES = 2
FACTCHECK_CONCURRENCY = 12
FACTCHECK_ENABLED = True
FACTCHECK_MIN_CONF = 0.70        # skip if fact-check confidence < this
FACTCHECK_TRIGGER_CONF = 0.80    # only fact-check if judge_conf >= this AND risk flagged

# Quality gates
MIN_CHARS = 180
MAX_SIMILARITY = 0.92
REQUIRE_END_PUNCT = True

# -----------------------------
# Required: questions list
# -----------------------------
# If you already defined questions in another cell, do NOT reassign here.
# questions = [...]

assert isinstance(questions, list) and len(questions) > 0, "questions must be a non-empty list[str]"

# -----------------------------
# Output helpers (create + streaming append)
# -----------------------------
def ensure_outfile(path: str) -> str:
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    if not p.exists():
        p.write_text("", encoding="utf-8")
        print(f"[INIT] Created output file: {p.resolve()}")
    else:
        print(f"[INIT] Using existing output file: {p.resolve()}")
    return str(p)

OUT_FILE = ensure_outfile(OUT_FILE)

def append_jsonl(path: str, record: Dict[str, Any]) -> None:
    line = json.dumps(record, ensure_ascii=False)
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())

# -----------------------------
# Load FT model once
# -----------------------------
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL_ID,
    max_seq_length=4096,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, FT_ADAPTER_PATH)
model = FastLanguageModel.for_inference(model)

def _extract_assistant(text: str) -> str:
    marker = "<|start_header_id|>assistant<|end_header_id|>"
    if marker in text:
        text = text.split(marker, 1)[-1]
    return text.replace("<|eot_id|>", "").strip()

def _looks_truncated(ans: str) -> bool:
    if not ans:
        return True
    s = ans.strip()
    if REQUIRE_END_PUNCT and (s[-1] not in ".!?\"')]}"):
        last_line = s.splitlines()[-1].strip()
        if len(last_line) < 25 or (last_line[-1] not in ".!?\"')]}"):
            return True
    bad_endings = (" To", " And", " Or", " Because", " When", " Which", " With", " Such as", " For example", " e.g.")
    return any(s.endswith(x) for x in bad_endings)

def _simple_similarity(a: str, b: str) -> float:
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    if not ta or not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

@torch.no_grad()
def gen_answer(system_prompt: str, user_prompt: str) -> str:
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    out = model.generate(
        input_ids=input_ids,
        max_new_tokens=MAX_NEW_TOKENS,
        temperature=TEMPERATURE,
        top_p=TOP_P,
        do_sample=True,
        use_cache=True,
    )
    return _extract_assistant(tokenizer.decode(out[0], skip_special_tokens=False))

def generate_pair(question: str) -> Tuple[str, str]:
    last_a, last_b = "", ""
    for _ in range(GEN_ATTEMPTS):
        a = gen_answer(SYSTEM_A, question)
        b = gen_answer(SYSTEM_B, question)
        last_a, last_b = a, b

        if len(a) < MIN_CHARS or len(b) < MIN_CHARS:
            continue
        if _looks_truncated(a) or _looks_truncated(b):
            continue
        sim = _simple_similarity(a, b)
        if sim >= MAX_SIMILARITY:
            continue
        return a, b

    return last_a, last_b

# -----------------------------
# OpenAI client
# -----------------------------
client = AsyncOpenAI()

# -----------------------------
# Preference judge (async) + confidence + rationale
# -----------------------------
JUDGE_SYSTEM = (
    "You are a strict finance/investing expert advisor who helps both wealthy and low-income families. "
    "You prioritize factual correctness and sound reasoning over verbosity. "
    "Penalize math/logic errors, hallucinated claims, and unfinished/truncated answers. "
    "If both are similar, choose the one that is more correct and more actionable with fewer words."
)

async def judge(question: str, a: str, b: str) -> Dict[str, Any]:
    user = f"""
Question:
{question}

Answer A:
{a}

Answer B:
{b}

Choose the better answer by this rubric (in order):
1) Factual correctness / no hallucinations
2) Sound finance reasoning + correct caveats
3) Practical usefulness for both rich and low-income families (actionable, risk-aware)
4) Clarity and concision (prefer shorter if equally correct)

Return ONLY strict JSON:
{{
  "winner": "A" or "B",
  "confidence": 0.0-1.0,
  "rationale": "max 1 sentence"
}}
""".strip()

    last_err = None
    for attempt in range(1, JUDGE_RETRIES + 1):
        try:
            r = await client.chat.completions.create(
                model=JUDGE_MODEL,
                messages=[
                    {"role": "system", "content": JUDGE_SYSTEM},
                    {"role": "user", "content": user},
                ],
                temperature=0,
            )
            raw = (r.choices[0].message.content or "").strip()

            parsed = None
            try:
                parsed = json.loads(raw)
            except Exception:
                m = re.search(r'\{[\s\S]*?\}', raw)
                if m:
                    try:
                        parsed = json.loads(m.group(0))
                    except Exception:
                        parsed = None

            if parsed and parsed.get("winner") in ("A", "B"):
                return {
                    "winner": parsed["winner"],
                    "confidence": parsed.get("confidence", 0.0),
                    "rationale": parsed.get("rationale", ""),
                    "raw": raw,
                }

            last_err = f"Could not parse judge JSON. Raw head: {raw[:200]}"
        except Exception as e:
            last_err = str(e)

        await asyncio.sleep(0.5 * attempt)

    return {"winner": "A", "confidence": 0.0, "rationale": "", "raw": "", "error": last_err}

# -----------------------------
# Fact-check gate (async)
# Returns: {"action": "keep"|"flip"|"skip", "confidence": 0-1, "issues": [...], "raw": "..."}
# -----------------------------
FACTCHECK_SYSTEM = (
    "You are a meticulous finance expert and fact-checker. "
    "Your job is to prevent incorrect preferred answers from entering a DPO dataset. "
    "Be strict about math, definitions, causal claims, and unstable numeric claims. "
    "If both answers are flawed or too speculative, choose SKIP."
)

def _risk_flag(question: str, a: str, b: str, judge_rationale: str) -> bool:
    q = question.lower()
    ra = a.lower()
    rb = b.lower()
    rr = (judge_rationale or "").lower()

    # theory-heavy / easy-to-hallucinate topics
    risky_terms = [
        "risk-neutral", "stochastic discount factor", "sdf", "state price", "no-arbitrage",
        "capm", "consumption", "expectations hypothesis", "term structure", "duration", "convexity",
        "var", "expected shortfall", "cvar", "garch", "newey", "hansen-hodrick", "autocorrelation",
        "sharpe", "sortino", "beta", "alpha", "regression", "leakage", "overfitting", "backtest"
    ]
    if any(t in q for t in risky_terms):
        return True

    # numbers / ranges / % (unstable claims)
    if re.search(r"\b\d+(\.\d+)?\s*%|\b\d+(\.\d+)?\s*(bp|bps)\b", a) or re.search(r"\b\d+(\.\d+)?\s*%|\b\d+(\.\d+)?\s*(bp|bps)\b", b):
        return True

    # judge rationale mentions math/inaccuracy
    if "math" in rr or "incorrect" in rr or "inaccurate" in rr or "formula" in rr:
        return True

    return False

async def fact_check(question: str, a: str, b: str, winner: str) -> Dict[str, Any]:
    chosen = a if winner == "A" else b
    rejected = b if winner == "A" else a

    user = f"""
We are building DPO preference data for a finance expert assistant.

Question:
{question}

Chosen answer (currently preferred):
{chosen}

Rejected answer:
{rejected}

Task:
1) Check the CHOSEN answer for factual correctness and sound finance reasoning.
2) If CHOSEN is clearly better and correct, return action="keep".
3) If REJECTED is actually more correct/helpful, return action="flip".
4) If BOTH are flawed/speculative/contain math mistakes or unstable numeric claims, return action="skip".

Return ONLY strict JSON:
{{
  "action": "keep" or "flip" or "skip",
  "confidence": 0.0-1.0,
  "issues": ["short bullet points of concrete problems, if any (max 3)"]
}}
""".strip()

    last_err = None
    for attempt in range(1, FACTCHECK_RETRIES + 1):
        try:
            r = await client.chat.completions.create(
                model=FACTCHECK_MODEL,
                messages=[
                    {"role": "system", "content": FACTCHECK_SYSTEM},
                    {"role": "user", "content": user},
                ],
                temperature=0,
            )
            raw = (r.choices[0].message.content or "").strip()

            parsed = None
            try:
                parsed = json.loads(raw)
            except Exception:
                m = re.search(r'\{[\s\S]*?\}', raw)
                if m:
                    try:
                        parsed = json.loads(m.group(0))
                    except Exception:
                        parsed = None

            if parsed and parsed.get("action") in ("keep", "flip", "skip"):
                conf = parsed.get("confidence", 0.0)
                try:
                    conf = float(conf) if conf is not None else 0.0
                except Exception:
                    conf = 0.0
                return {
                    "action": parsed["action"],
                    "confidence": conf,
                    "issues": parsed.get("issues", []),
                    "raw": raw,
                }

            last_err = f"Could not parse fact-check JSON. Raw head: {raw[:200]}"
        except Exception as e:
            last_err = str(e)

        await asyncio.sleep(0.4 * attempt)

    return {"action": "skip", "confidence": 0.0, "issues": ["fact_check_parse_failed"], "raw": "", "error": last_err}

# -----------------------------
# Main loop (sequential gen; judge/fact-check async; streaming write)
# -----------------------------
async def run():
    total = len(questions)
    saved = 0
    skipped = 0
    flipped = 0
    factchecked = 0

    # dedup by prompt from existing file
    seen_prompts = set()
    with open(OUT_FILE, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
                if "prompt" in obj:
                    seen_prompts.add(obj["prompt"])
            except Exception:
                continue

    print(f"[START] cwd={Path.cwd().resolve()}")
    print(f"[START] total_questions={total} already_in_file={len(seen_prompts)} out={Path(OUT_FILE).resolve()}")

    judge_sem = asyncio.Semaphore(JUDGE_CONCURRENCY)
    fact_sem = asyncio.Semaphore(FACTCHECK_CONCURRENCY)

    async def judge_with_sem(q, a, b):
        async with judge_sem:
            return await judge(q, a, b)

    async def fact_with_sem(q, a, b, winner):
        async with fact_sem:
            return await fact_check(q, a, b, winner)

    for i, q in enumerate(questions, 1):
        if q in seen_prompts:
            skipped += 1
            if DEBUG_EVERY:
                print(f"Q{i}/{total} ‚Äì SKIP duplicate")
            continue

        # 1) Generate A/B sequentially (CUDA-safe)
        t0 = time.time()
        a, b = generate_pair(q)
        sim = _simple_similarity(a, b)

        reasons = []
        if len(a) < MIN_CHARS or len(b) < MIN_CHARS:
            reasons.append("too_short")
        if _looks_truncated(a) or _looks_truncated(b):
            reasons.append("truncated")
        if sim >= MAX_SIMILARITY:
            reasons.append(f"too_similar({sim:.2f})")

        if reasons:
            skipped += 1
            if DEBUG_EVERY:
                print(f"Q{i}/{total} ‚Äì SKIP gen: {', '.join(reasons)}")
            continue

        # 2) Judge (I/O bound)
        j = await judge_with_sem(q, a, b)
        winner = j.get("winner", "A")
        conf = j.get("confidence", 0.0)
        judge_err = j.get("error", None)

        if SKIP_IF_JUDGE_ERROR and judge_err:
            skipped += 1
            if DEBUG_EVERY:
                print(f"Q{i}/{total} ‚Äì SKIP judge_error: {judge_err}")
            continue

        try:
            conf_val = float(conf) if conf is not None else 0.0
        except Exception:
            conf_val = 0.0

        if REQUIRE_JUDGE_CONFIDENCE and conf_val < MIN_JUDGE_CONF:
            skipped += 1
            if DEBUG_EVERY:
                print(f"Q{i}/{total} ‚Äì SKIP low_confidence: {conf_val:.2f}")
            continue

        # 3) Targeted fact-check (only if risk flagged)
        fc = None
        if FACTCHECK_ENABLED and conf_val >= FACTCHECK_TRIGGER_CONF and _risk_flag(q, a, b, j.get("rationale", "")):
            factchecked += 1
            fc = await fact_with_sem(q, a, b, winner)
            action = fc.get("action", "skip")
            fc_conf = float(fc.get("confidence", 0.0) or 0.0)

            if action == "skip" or fc_conf < FACTCHECK_MIN_CONF:
                skipped += 1
                if DEBUG_EVERY:
                    print(f"Q{i}/{total} ‚Äì SKIP factcheck action={action} conf={fc_conf:.2f} issues={fc.get('issues')}")
                continue

            if action == "flip":
                winner = "B" if winner == "A" else "A"
                flipped += 1

        chosen = a if winner == "A" else b
        rejected = b if winner == "A" else a

        record = {
            "prompt": q,
            "chosen": chosen,
            "rejected": rejected,
            "meta": {
                "winner": winner,
                "confidence": conf_val,
                "rationale": j.get("rationale", ""),
                "system_A": SYSTEM_A,
                "system_B": SYSTEM_B,
                "judge_model": JUDGE_MODEL,
                "judge_raw": j.get("raw", ""),
                "judge_error": judge_err,
                "factchecked": bool(fc),
                "factcheck_action": (fc or {}).get("action", None),
                "factcheck_confidence": (fc or {}).get("confidence", None),
                "factcheck_issues": (fc or {}).get("issues", None),
                "factcheck_raw": (fc or {}).get("raw", None),
                "len_A": len(a),
                "len_B": len(b),
                "similarity": sim,
                "gen_seconds": round(time.time() - t0, 2),
            },
        }

        # 4) STREAM write immediately
        append_jsonl(OUT_FILE, record)
        seen_prompts.add(q)
        saved += 1

        if DEBUG_EVERY:
            extra = ""
            if fc:
                extra = f" fc={record['meta']['factcheck_action']} fc_conf={float(record['meta']['factcheck_confidence'] or 0):.2f}"
            print(
                f"Q{i}/{total} ‚Äì {winner} saved={saved} skipped={skipped} flipped={flipped} "
                f"conf={conf_val:.2f} sim={sim:.2f}{extra}"
            )

    print(
        f"\n[DONE] saved={saved} skipped={skipped} flipped={flipped} "
        f"factchecked={factchecked} out={Path(OUT_FILE).resolve()}"
    )

await run()




[INIT] Using existing output file: /home/jovyan/work/MLE_in_Gen_AI-Course/class7/Homework_Kevin/DPO_preference_data.jsonl
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
[START] cwd=/home/jovyan/work/MLE_in_Gen_AI-Course/class7/Homework_Kevin
[START] total_questions=200 already_in_file=8 out=/home/jovyan/work/MLE_in_Gen_AI-Course/class7/Homework_Kevin/DPO_preference_data.jsonl
Q1/200 ‚Äì SKIP duplicate
Q2/200 ‚Äì SKIP duplicate
Q3/200 ‚Äì SKIP duplicate
Q4/200 ‚Äì SKIP duplicate
Q5/200 ‚Äì SKIP duplicate
Q6/200 ‚Äì SKIP duplicate
Q7/200 ‚Äì SKIP duplicate
Q8/200 ‚Äì SKIP duplicate

In [8]:
# ‚úÖ BEST full solution: Unsloth 4-bit base + your SFT LoRA + TRL DPO (uses DPOConfig)
# - Loads your JSONL preferences: {"prompt","chosen","rejected",...}
# - Policy = base + SFT adapter (trainable)
# - Ref    = base + SFT adapter (frozen)
# - Saves DPO adapter to: llama31-8b-finance-dpo-lora-kevin
#
# If you hit an import error for DPOConfig (older TRL), tell me your `pip show trl` version.

import os
import torch
from datasets import load_dataset
from peft import PeftModel
from trl import DPOTrainer, DPOConfig
from unsloth import FastLanguageModel

# -----------------------------
# Config
# -----------------------------
BASE_MODEL_ID      = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
SFT_ADAPTER_PATH   = "llama31-8b-finance-qlora-lora"
DATA_FILE          = "DPO_preference_data.jsonl"   # <- make sure this is .jsonl
OUT_DIR            = "llama31-8b-finance-dpo-lora-kevin"

MAX_SEQ_LEN        = 4096
MAX_LENGTH         = 2048
MAX_PROMPT_LENGTH  = 1024
BETA               = 0.1

# Training
PER_DEVICE_BS      = 1
GRAD_ACCUM         = 8
LR                 = 5e-6
EPOCHS             = 1
LOGGING_STEPS      = 10
SAVE_STEPS         = 200
WARMUP_RATIO       = 0.03
JIT_COMPILE        = False  # set True only if you know it works in your env

# -----------------------------
# Safety checks
# -----------------------------
assert os.path.exists(DATA_FILE), f"Missing data file: {DATA_FILE}"
assert os.path.exists(SFT_ADAPTER_PATH), f"Missing adapter dir: {SFT_ADAPTER_PATH}"

# -----------------------------
# Load dataset
# -----------------------------
dataset = load_dataset("json", data_files=DATA_FILE)["train"]
required_cols = {"prompt", "chosen", "rejected"}
missing = required_cols - set(dataset.column_names)
assert not missing, f"Dataset missing required columns: {missing}"

# (Optional) light cleanup if you accidentally stored empty lines
dataset = dataset.filter(lambda x: isinstance(x["prompt"], str) and len(x["prompt"].strip()) > 0)

# -----------------------------
# Load tokenizer via Unsloth
# -----------------------------
_, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL_ID,
    max_seq_length=MAX_SEQ_LEN,
    load_in_4bit=True,
)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# -----------------------------
# Policy model (trainable)
# -----------------------------
policy, _ = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL_ID,
    max_seq_length=MAX_SEQ_LEN,
    load_in_4bit=True,
)
policy = PeftModel.from_pretrained(policy, SFT_ADAPTER_PATH, is_trainable=True)

# -----------------------------
# Reference model (frozen)
# -----------------------------
ref, _ = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL_ID,
    max_seq_length=MAX_SEQ_LEN,
    load_in_4bit=True,
)
ref = PeftModel.from_pretrained(ref, SFT_ADAPTER_PATH, is_trainable=False)
ref.eval()
for p in ref.parameters():
    p.requires_grad = False

# -----------------------------
# DPO config (TRL expects this, not TrainingArguments)
# -----------------------------
dpo_args = DPOConfig(
    output_dir=OUT_DIR,
    per_device_train_batch_size=PER_DEVICE_BS,
    gradient_accumulation_steps=GRAD_ACCUM,
    learning_rate=LR,
    num_train_epochs=EPOCHS,
    logging_steps=LOGGING_STEPS,
    save_steps=SAVE_STEPS,
    report_to="none",

    # precision
    bf16=torch.cuda.is_bf16_supported(),
    fp16=not torch.cuda.is_bf16_supported(),

    # optimizer good for 4-bit + LoRA
    optim="paged_adamw_8bit",

    lr_scheduler_type="cosine",
    warmup_ratio=WARMUP_RATIO,

    # DPO-specific
    beta=BETA,
    max_length=MAX_LENGTH,
    max_prompt_length=MAX_PROMPT_LENGTH,
)

# -----------------------------
# Trainer
# -----------------------------
trainer = DPOTrainer(
    model=policy,
    ref_model=ref,
    args=dpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)

# (Optional) compile
if JIT_COMPILE and hasattr(torch, "compile"):
    trainer.model = torch.compile(trainer.model)

# -----------------------------
# Train + save
# -----------------------------
trainer.train()
trainer.save_model(OUT_DIR)
tokenizer.save_pretrained(OUT_DIR)

print("‚úÖ DPO-aligned model saved to:", OUT_DIR)



Filter: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 190/190 [00:00<00:00, 5435.26 examples/s]


==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
  

Extracting prompt in train dataset (num_proc=64): 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 190/190 [00:08<00:00, 22.74 examples/s] 
Applying chat template to train dataset (num_proc=64): 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 190/190 [00:25<00:00,  7.31 examples/s]
Tokenizing train dataset (num_proc=64): 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 190/190 [00:26<00:00,  7.25 examples/s]
The model is already on multiple devices. Skipping the move to device specified in `args`.
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 190 | Num Epochs = 1 | Total steps = 24
O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 8
\        /    Data Parallel GPUs = 1 | Total batch size (1 x 8 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040 of 8,072,204,288 (0.52% trained)


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss,rewards / chosen,rewards / rejected,rewards / accuracies,rewards / margins,logps / chosen,logps / rejected,logits / chosen,logits / rejected,eval_logits / chosen,eval_logits / rejected,nll_loss
10,0.6781,0.104784,0.067752,0.5875,0.037032,-172.858185,-158.079453,-0.568802,-0.586924,0,0,0
20,0.6522,0.26853,0.165223,0.6375,0.103307,-165.902496,-158.191895,-0.564892,-0.57785,No Log,No Log,No Log


‚úÖ DPO-aligned model saved to: llama31-8b-finance-dpo-lora-kevin


In [None]:
# Finetuning eval (3-way):
# Compare:
#  1) ORIGINAL base: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
#  2) QLoRA adapter: llama31-8b-finance-qlora-lora
#  3) DPO LoRA adapter: llama31-8b-finance-dpo-lora-kevin
#
# Outputs:
#  - per-question answers for all 3
#  - pairwise judge results (Base-as-judge): Original vs QLoRA, Original vs DPO, QLoRA vs DPO
#  - simple aggregate win counts + average scores
#  - saves JSON: eval_3way_results.json
#
# Deps:
# pip install -U unsloth transformers peft accelerate bitsandbytes torch

import time, json
import torch
from unsloth import FastLanguageModel
from peft import PeftModel

# -----------------------------
# Config
# -----------------------------
BASE_MODEL_ID = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

ADAPTERS = {
    "qlora": "llama31-8b-finance-qlora-lora",        # local folder OR HF repo id
    "dpo":   "llama31-8b-finance-dpo-lora-kevin",    # local folder OR HF repo id
}

MAX_NEW_TOKENS = 384
TEMPERATURE = 0.2
TOP_P = 0.9
SEED = 42
MAX_SEQ_LEN = 4096

torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(SEED)

# -----------------------------
# Test questions
# -----------------------------
test_questions = [
    "Explain look-ahead bias in backtesting and how to avoid it.",
    "What is survivorship bias and why does it matter in equity datasets?",
    "Compare VaR and Expected Shortfall conceptually. When is ES preferred?",
    "Why might a model have high AUC but poor calibration?",
    "Why can a high Sharpe ratio be misleading in small samples?",
]

# -----------------------------
# Helpers
# -----------------------------
def load_base_model():
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL_ID,
        max_seq_length=MAX_SEQ_LEN,
        load_in_4bit=True,
        dtype=None,
    )
    model = FastLanguageModel.for_inference(model)
    return model, tokenizer

def load_model_with_adapter(adapter_path: str):
    """
    Load a fresh base model instance, attach a PEFT LoRA adapter, then set for inference.
    """
    model, _ = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL_ID,
        max_seq_length=MAX_SEQ_LEN,
        load_in_4bit=True,
        dtype=None,
    )
    model = PeftModel.from_pretrained(model, adapter_path)
    model = FastLanguageModel.for_inference(model)
    return model

def format_chat(tokenizer, user_text: str):
    messages = [
        {"role": "system", "content": "You are a helpful finance and investing assistant. Be accurate, concise, and practical."},
        {"role": "user", "content": user_text},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    )
    return input_ids

@torch.no_grad()
def generate_answer(model, tokenizer, question: str):
    input_ids = format_chat(tokenizer, question).to(model.device)
    attention_mask = torch.ones_like(input_ids, device=model.device)

    gen = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=MAX_NEW_TOKENS,
        temperature=TEMPERATURE,
        do_sample=(TEMPERATURE > 0),
        top_p=TOP_P,
        use_cache=True,
    )

    text = tokenizer.decode(gen[0], skip_special_tokens=False)

    # Best-effort extraction of assistant content
    for marker in ["<|assistant|>", "assistant\n", "### Assistant:"]:
        if marker in text:
            return text.split(marker, 1)[-1].strip()
    return text.strip()

def judge_pair_with_base(judge_model, tokenizer, question: str, ans_A: str, ans_B: str, label_A="A", label_B="B"):
    """
    Base-model-as-judge. Returns a JSON string (best-effort).
    """
    judge_prompt = f"""
You are grading two answers to the same finance/investing question.

Question:
{question}

Answer {label_A}:
{ans_A}

Answer {label_B}:
{ans_B}

Rubric (most important first):
1) Factual correctness (no hallucinations)
2) Finance-specific insight / domain depth
3) Practical usefulness (how-to, pitfalls, examples)
4) Clarity and concision

Return STRICT JSON only:
{{
  "winner": "{label_A}" or "{label_B}" or "tie",
  "score_{label_A}": 1-10,
  "score_{label_B}": 1-10,
  "rationale": "1-3 sentences"
}}
""".strip()

    input_ids = format_chat(tokenizer, judge_prompt).to(judge_model.device)
    attention_mask = torch.ones_like(input_ids, device=judge_model.device)

    gen = judge_model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=256,
        temperature=0.0,
        do_sample=False,
        use_cache=True,
    )

    out = tokenizer.decode(gen[0], skip_special_tokens=False)

    # best-effort extract JSON
    if "{" in out and "}" in out:
        out = out[out.find("{"): out.rfind("}") + 1]
    return out.strip()

def safe_parse_json(s: str):
    try:
        return json.loads(s)
    except Exception:
        return None

# -----------------------------
# Load models
# -----------------------------
print("Loading ORIGINAL base (also used as judge)...")
base_model, tokenizer = load_base_model()

print("Loading QLoRA adapter model...")
qlora_model = load_model_with_adapter(ADAPTERS["qlora"])

print("Loading DPO adapter model...")
dpo_model = load_model_with_adapter(ADAPTERS["dpo"])

models = {
    "original": base_model,
    "qlora": qlora_model,
    "dpo": dpo_model,
}

# -----------------------------
# Run 3-way evaluation
# -----------------------------
results = []
start = time.time()

# aggregate stats
win_counts = {
    "original": 0,
    "qlora": 0,
    "dpo": 0,
    "tie": 0,
}
score_sums = {"original": 0.0, "qlora": 0.0, "dpo": 0.0}
score_counts = {"original": 0, "qlora": 0, "dpo": 0}

pairings = [
    ("original", "qlora"),
    ("original", "dpo"),
    ("qlora", "dpo"),
]

for i, q in enumerate(test_questions, 1):
    print(f"\n{'='*100}\nQ{i}: {q}\n{'='*100}")

    answers = {}
    for name, m in models.items():
        ans = generate_answer(m, tokenizer, q)
        answers[name] = ans
        print(f"\n--- {name.upper()} ANSWER --------------------------------------------------")
        print(ans)

    judges = {}
    for a, b in pairings:
        # Keep labels consistent so parsing is easy
        judge_json = judge_pair_with_base(
            judge_model=base_model,
            tokenizer=tokenizer,
            question=q,
            ans_A=answers[a],
            ans_B=answers[b],
            label_A=a,
            label_B=b,
        )
        judges[f"{a}_vs_{b}"] = judge_json
        print(f"\n--- JUDGE: {a} vs {b} (JSON) ----------------------------------------------")
        print(judge_json)

        parsed = safe_parse_json(judge_json)
        if parsed:
            winner = parsed.get("winner", "tie")
            if winner in win_counts:
                win_counts[winner] += 1
            else:
                win_counts["tie"] += 1

            # accumulate scores if present
            for name in [a, b]:
                k = f"score_{name}"
                if k in parsed:
                    try:
                        score_sums[name] += float(parsed[k])
                        score_counts[name] += 1
                    except Exception:
                        pass

    results.append({
        "question": q,
        "answers": answers,
        "judges": judges,
    })

elapsed = time.time() - start

# -----------------------------
# Summary
# -----------------------------
avg_scores = {}
for name in ["original", "qlora", "dpo"]:
    avg_scores[name] = (score_sums[name] / score_counts[name]) if score_counts[name] else None

print(f"\n\n{'='*100}")
print("SUMMARY")
print(f"Ran {len(test_questions)} questions with {len(pairings)} pairwise judgments each "
      f"({len(test_questions)*len(pairings)} total judge calls) in {elapsed:.1f}s.\n")

print("Win counts (pairwise):")
for k, v in win_counts.items():
    print(f"  {k:>8}: {v}")

print("\nAverage judge scores (if parsable):")
for k, v in avg_scores.items():
    print(f"  {k:>8}: {v if v is not None else 'n/a'}")

# -----------------------------
# Save
# -----------------------------
out = {
    "base_model_id": BASE_MODEL_ID,
    "adapters": ADAPTERS,
    "gen_config": {
        "max_new_tokens": MAX_NEW_TOKENS,
        "temperature": TEMPERATURE,
        "top_p": TOP_P,
        "seed": SEED,
        "max_seq_len": MAX_SEQ_LEN,
    },
    "pairings": pairings,
    "results": results,
    "summary": {
        "elapsed_sec": elapsed,
        "win_counts": win_counts,
        "avg_scores": avg_scores,
        "score_counts": score_counts,
    },
}

with open("eval_results_alignment_ft_vs_base_vs_dpo.json", "w") as f:
    json.dump(out, f, indent=2)

print("\nSaved: eval_results_alignment_ft_vs_base_vs_dpo.json")


Loading ORIGINAL base (also used as judge)...
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Loading QLoRA adapter model...
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
L

In [None]:
# 3-way Finetuning eval (OOM-safe, single-model, adapter swapping)
# Compare:
#  1) ORIGINAL base: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
#  2) QLoRA adapter: llama31-8b-finance-qlora-lora
#  3) DPO LoRA adapter: llama31-8b-finance-dpo-lora-kevin
#
# Judge: GPT-4.1-mini (recommended). No extra local judge model loaded.
#
# Output:
#  - per-question answers for all 3
#  - pairwise judge results: original_vs_qlora, original_vs_dpo, qlora_vs_dpo
#  - aggregates: win counts + avg scores
#  - saves: eval_3way_results.json
#
# pip install -U unsloth transformers peft accelerate bitsandbytes torch openai python-dotenv

import os, json, re, time, math, asyncio
from typing import Dict, Any, Optional, Tuple

import torch
from unsloth import FastLanguageModel
from peft import PeftModel
from dotenv import load_dotenv
from openai import AsyncOpenAI

load_dotenv()

# -----------------------------
# Config
# -----------------------------
BASE_MODEL_ID = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

ADAPTERS = {
    "qlora": "llama31-8b-finance-qlora-lora",        # local folder OR HF repo id
    "dpo":   "llama31-8b-finance-dpo-lora-kevin",    # local folder OR HF repo id
}

OUT_JSON = "eval_results_alignment_ft_vs_base_vs_dpo.json"

MAX_SEQ_LEN = 4096
MAX_NEW_TOKENS_ANS = 384
TEMPERATURE = 0.2
TOP_P = 0.9
SEED = 42

JUDGE_MODEL = "gpt-4.1-mini"
JUDGE_MAX_TOKENS = 220
JUDGE_CONCURRENCY = 8
JUDGE_RETRIES = 3

SYSTEM_ANSWER = "You are a helpful finance and investing assistant. Be accurate, concise, and practical."
JUDGE_SYSTEM = (
    "You are a strict evaluator for finance/investing answers. "
    "Prioritize: (1) factual correctness, (2) finance depth, (3) practical usefulness, (4) clarity. "
    "Return ONLY valid JSON, no extra text."
)

test_questions = [
    "Explain look-ahead bias in backtesting and how to avoid it.",
    "What is survivorship bias and why does it matter in equity datasets?",
    "Compare VaR and Expected Shortfall conceptually. When is ES preferred?",
    "Why might a model have high AUC but poor calibration?",
    "Why can a high Sharpe ratio be misleading in small samples?",
]

torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(SEED)

# -----------------------------
# Model load (ONE TIME)
# -----------------------------
print("Loading base model once...")
base_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL_ID,
    max_seq_length=MAX_SEQ_LEN,
    load_in_4bit=True,
    dtype=None,
)
base_model = FastLanguageModel.for_inference(base_model)

# Turn base into a PEFT-wrapped model once so we can load adapters into it
# (this avoids loading multiple base copies)
model = PeftModel.from_pretrained(base_model, ADAPTERS["qlora"])  # temporary
model = FastLanguageModel.for_inference(model)
model.set_adapter("default")  # adapter name in PEFT is often "default"

# Load second adapter into same model under a different name
# If load_adapter isn't available in your peft version, we'll fallback to loading via from_pretrained into the SAME wrapper.
if hasattr(model, "load_adapter"):
    model.load_adapter(ADAPTERS["dpo"], adapter_name="dpo")
else:
    # Fallback: attach second adapter using peft's loading API
    # This is a safe, common workaround if your peft is older.
    from peft import PeftConfig
    # This will still reuse the same base weights; adapter weights are added.
    model = PeftModel.from_pretrained(model, ADAPTERS["dpo"], adapter_name="dpo")
    model = FastLanguageModel.for_inference(model)

# Rename the first adapter to "qlora" if it's still "default"
# (Some PEFT versions keep it as "default". We'll map both cases.)
available = list(getattr(model, "peft_config", {}).keys())
if "qlora" not in available:
    if "default" in available:
        qlora_adapter_name = "default"
    else:
        # take first adapter as qlora
        qlora_adapter_name = available[0] if available else "default"
else:
    qlora_adapter_name = "qlora"

dpo_adapter_name = "dpo"

print("Adapters available:", list(getattr(model, "peft_config", {}).keys()))
print("Using qlora adapter name:", qlora_adapter_name)
print("Using dpo adapter name:", dpo_adapter_name)

# -----------------------------
# Generation utilities
# -----------------------------
def _chat_input_ids(system: str, user: str) -> torch.Tensor:
    messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
    return tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    )

@torch.no_grad()
def generate_answer_variant(variant: str, question: str) -> str:
    """
    variant: "original" | "qlora" | "dpo"
    Uses the SAME model in memory; swaps adapters.
    """
    # switch adapter
    if variant == "original":
        # disable adapters
        if hasattr(model, "disable_adapter"):
            with model.disable_adapter():
                return _generate(question)
        else:
            # older PEFT: set_adapter(None) may not exist; best-effort by using base_model directly
            return _generate(question, use_base_only=True)

    elif variant == "qlora":
        model.set_adapter(qlora_adapter_name)
        return _generate(question)

    elif variant == "dpo":
        model.set_adapter(dpo_adapter_name)
        return _generate(question)

    else:
        raise ValueError(f"Unknown variant: {variant}")

@torch.no_grad()
def _generate(question: str, use_base_only: bool = False) -> str:
    m = base_model if use_base_only else model
    input_ids = _chat_input_ids(SYSTEM_ANSWER, question).to(m.device)
    in_len = input_ids.shape[-1]

    out = m.generate(
        input_ids=input_ids,
        attention_mask=torch.ones_like(input_ids, device=m.device),
        max_new_tokens=MAX_NEW_TOKENS_ANS,
        temperature=TEMPERATURE,
        do_sample=(TEMPERATURE > 0),
        top_p=TOP_P,
        use_cache=True,
    )
    # decode only new tokens
    new_tokens = out[0, in_len:]
    text = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return text.strip()

# -----------------------------
# Judge (GPT) utilities
# -----------------------------
client = AsyncOpenAI()

def _extract_json(text: str) -> Optional[Dict[str, Any]]:
    if not text:
        return None
    text = text.strip()
    try:
        return json.loads(text)
    except Exception:
        pass
    m = re.search(r"\{[\s\S]*?\}", text)
    if m:
        s = m.group(0)
        s = re.sub(r",\s*([}\]])", r"\1", s)
        try:
            return json.loads(s)
        except Exception:
            return None
    return None

def _coerce_winner(w: Any, a: str, b: str) -> str:
    if w is None:
        return "tie"
    s = str(w).strip().lower()
    if s == "tie":
        return "tie"
    if s in ("a", a.lower()):
        return a
    if s in ("b", b.lower()):
        return b
    return "tie"

def _coerce_score(x: Any) -> Optional[float]:
    try:
        v = float(x)
        if math.isfinite(v):
            return max(1.0, min(10.0, v))
        return None
    except Exception:
        return None

async def judge_pair(question: str, ansA: str, ansB: str, labelA: str, labelB: str) -> Dict[str, Any]:
    prompt = f"""
Grade two answers to the same finance/investing question.

Question:
{question}

Answer {labelA}:
{ansA}

Answer {labelB}:
{ansB}

Return ONLY strict JSON:
{{
  "winner": "{labelA}" or "{labelB}" or "tie",
  "score_{labelA}": 1-10,
  "score_{labelB}": 1-10,
  "rationale": "1-3 sentences"
}}
""".strip()

    last_err = None
    for attempt in range(1, JUDGE_RETRIES + 1):
        try:
            r = await client.chat.completions.create(
                model=JUDGE_MODEL,
                messages=[
                    {"role": "system", "content": JUDGE_SYSTEM},
                    {"role": "user", "content": prompt},
                ],
                temperature=0,
                max_tokens=JUDGE_MAX_TOKENS,
            )
            raw = (r.choices[0].message.content or "").strip()
            obj = _extract_json(raw)
            if isinstance(obj, dict):
                winner = _coerce_winner(obj.get("winner"), labelA, labelB)
                sA = _coerce_score(obj.get(f"score_{labelA}"))
                sB = _coerce_score(obj.get(f"score_{labelB}"))
                rationale = str(obj.get("rationale", "")).strip()
                return {
                    "final": {"winner": winner, f"score_{labelA}": sA, f"score_{labelB}": sB, "rationale": rationale},
                    "raw": raw,
                    "parsed_debug": obj,
                }
            last_err = f"Unparsable judge JSON (head): {raw[:200]}"
        except Exception as e:
            last_err = str(e)
        await asyncio.sleep(0.4 * attempt)

    return {
        "final": {"winner": "tie", f"score_{labelA}": None, f"score_{labelB}": None, "rationale": ""},
        "raw": "",
        "parsed_debug": None,
        "error": last_err,
    }

# -----------------------------
# Main eval
# -----------------------------
async def main():
    pairings = [("original", "qlora"), ("original", "dpo"), ("qlora", "dpo")]

    results = []
    win_counts = {"original": 0, "qlora": 0, "dpo": 0, "tie": 0}
    score_sums  = {"original": 0.0, "qlora": 0.0, "dpo": 0.0}
    score_counts= {"original": 0, "qlora": 0, "dpo": 0}

    start = time.time()
    sem = asyncio.Semaphore(JUDGE_CONCURRENCY)

    async def judge_with_sem(q, a, b, la, lb):
        async with sem:
            return await judge_pair(q, a, b, la, lb)

    for i, q in enumerate(test_questions, 1):
        print(f"\n{'='*100}\nQ{i}: {q}\n{'='*100}")

        # Generate 3 answers (sequential, CUDA-safe)
        answers = {}
        for v in ["original", "qlora", "dpo"]:
            ans = generate_answer_variant(v, q)
            answers[v] = ans
            print(f"\n--- {v.upper()} ANSWER --------------------------------------------------")
            print(ans)

        # Judge pairwise (concurrent across 3 pairings)
        judge_tasks = []
        for a, b in pairings:
            judge_tasks.append(judge_with_sem(q, answers[a], answers[b], a, b))

        judge_outs = await asyncio.gather(*judge_tasks)

        judges = {}
        for (a, b), jr in zip(pairings, judge_outs):
            key = f"{a}_vs_{b}"
            judges[key] = jr
            final = jr["final"]
            print(f"\n--- JUDGE: {a} vs {b} ----------------------------------------------------")
            print(json.dumps(final, indent=2))

            # aggregate
            winner = final.get("winner", "tie")
            win_counts[winner] = win_counts.get(winner, 0) + 1

            for name in (a, b):
                sv = final.get(f"score_{name}", None)
                if sv is not None:
                    score_sums[name] += float(sv)
                    score_counts[name] += 1

        results.append({"question": q, "answers": answers, "judges": judges})

        # free some KV cache between questions
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    elapsed = time.time() - start
    avg_scores = {k: (score_sums[k] / score_counts[k] if score_counts[k] else None) for k in score_sums}

    print(f"\n\n{'='*100}\nSUMMARY")
    print(f"elapsed_sec={elapsed:.1f}")
    print("Win counts:")
    for k, v in win_counts.items():
        print(f"  {k:>8}: {v}")
    print("Avg scores:")
    for k, v in avg_scores.items():
        print(f"  {k:>8}: {v if v is not None else 'n/a'}")

    out = {
        "base_model_id": BASE_MODEL_ID,
        "adapters": ADAPTERS,
        "gen_config": {
            "max_new_tokens_answer": MAX_NEW_TOKENS_ANS,
            "temperature": TEMPERATURE,
            "top_p": TOP_P,
            "seed": SEED,
            "max_seq_len": MAX_SEQ_LEN,
        },
        "judge": {"model": JUDGE_MODEL},
        "pairings": pairings,
        "results": results,
        "summary": {
            "elapsed_sec": elapsed,
            "win_counts": win_counts,
            "avg_scores": avg_scores,
            "score_counts": score_counts,
        },
    }

    with open(OUT_JSON, "w", encoding="utf-8") as f:
        json.dump(out, f, indent=2, ensure_ascii=False)

    print(f"\nSaved: {OUT_JSON}")

await main()


Loading base model once...
==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    inference-ai GPU cuda. Num GPUs = 1. Max memory: 47.988 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.1+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.5.1
\        /    Bfloat16 = TRUE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Adapters available: ['default', 'dpo']
Using qlora adapter name: default
Using dpo adapter name: dpo

Q1: Explain look-ahead bias in backtesting and how to avoid it.

--- ORIGINAL ANSWER --------------------------------------------------
**What is Look-Ahead Bias in Backtesting?**

Look-ahead bias, also known as data snooping or overfitting, is a common issue in backtesting that occurs when a model or strategy is optimized using data that is not available at the time of the investment decision. This means t