# Genex Meta-Study Researcher — Genex Evidence Report

This notebook:
- Screens PDFs in a folder for relevance to a condition
- Extracts structured evidence items (definition / genes / symptoms / treatments / outcomes)
- Generates a **Genex Evidence Report** in Markdown with sections:

1) Executive Summary (definition + genes affected)  
2) Symptoms supported by literature (ranked)  
3) Treatments/interventions supported by literature (ranked)  
4) Outcomes / prognosis (if present)  
5) Conflicting / uncertain areas  
6) Limitations + what to read next  

It also includes **evidence-grounded Q&A** with citations (and uses a **unique session per question** to avoid `AlreadyExistsError`).


In [1]:

# -----------------------
# 0) Imports
# -----------------------
import os
import re
import json
from typing import Any, Dict, List, Optional

import fitz  # PyMuPDF
from pydantic import BaseModel, Field

from collections import defaultdict, Counter
import uuid

# Google ADK
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.models.lite_llm import LiteLlm

# Optional GenAI message types (depends on installed packages)
try:
    from google.genai.types import Content, Part
    _HAS_GENAI_TYPES = True
except Exception:
    Content, Part = None, None
    _HAS_GENAI_TYPES = False


In [11]:

# -----------------------
# 1) Settings
# -----------------------
# TODO: change this to your local folder containing PDFs
PAPERS_DIR = r"C:/Users/T490/Downloads/Genex/docs/papers/serine_deficiency_papers"

MAX_PAGES_PER_PDF = 25
MODEL = "openai/gpt-4o-mini"  # LiteLLM-compatible model id

APP_NAME = "genex_meta_study"
USER_ID = "genex_user"

llm = LiteLlm(model=MODEL)
session_service = InMemorySessionService()


In [12]:

# -----------------------
# 2) Schemas
# -----------------------
class PaperMetadata(BaseModel):
    title: str
    authors: List[str] = Field(default_factory=list)
    year: Optional[int] = None
    journal: Optional[str] = None
    doi: Optional[str] = None

class RelevanceDecision(BaseModel):
    paper_id: str
    title: str
    is_relevant: bool = False
    relevance_score: float = 0.0  # 0..1
    reason: str = ""
    matched_terms: List[str] = Field(default_factory=list)

class ExtractedFinding(BaseModel):
    name: str = Field(..., description="Normalized term")
    category: str = Field(..., description="definition|gene|symptom|treatment|outcome|population|other")
    polarity: str = Field(..., description="supports|refutes|mixed|unclear")
    snippet: str = Field(..., description="<=2 sentences evidence snippet")
    section: str = Field(..., description="methods|results|discussion|abstract|unknown")

class PaperExtraction(BaseModel):
    paper_id: str
    title: str
    authors: List[str] = Field(default_factory=list)
    year: Optional[int] = None
    journal: Optional[str] = None
    doi: Optional[str] = None

    condition: str = ""
    relevance_score: float = 0.0
    summary: str = ""
    key_takeaways: List[str] = Field(default_factory=list)
    findings: List[ExtractedFinding] = Field(default_factory=list)


In [13]:

# -----------------------
# 3) Helper utilities (robust across ADK versions)
# -----------------------
def list_pdfs(folder: str) -> List[str]:
    return sorted(
        os.path.join(folder, f)
        for f in os.listdir(folder)
        if f.lower().endswith(".pdf")
    )

def pdf_to_text(path: str, max_pages: int = 20) -> str:
    doc = fitz.open(path)
    out = []
    for i in range(min(len(doc), max_pages)):
        out.append(doc.load_page(i).get_text("text"))
    doc.close()
    return "\n".join(out)

def safe_json_extract(text: str) -> Optional[Dict[str, Any]]:
    """Extract a JSON object from model output. Returns dict or None."""
    if not text or not isinstance(text, str):
        return None
    text = text.strip()

    # direct JSON
    try:
        obj = json.loads(text)
        if isinstance(obj, dict):
            return obj
    except Exception:
        pass

    # fenced json
    fence = re.search(r"```json\s*(\{[\s\S]*?\})\s*```", text, flags=re.IGNORECASE)
    if fence:
        try:
            obj = json.loads(fence.group(1))
            if isinstance(obj, dict):
                return obj
        except Exception:
            pass

    # first {...}
    m = re.search(r"(\{[\s\S]*\})", text)
    if m:
        try:
            obj = json.loads(m.group(1))
            if isinstance(obj, dict):
                return obj
        except Exception:
            pass

    return None

def _make_message(text: str):
    """Create ADK message payload in a version-tolerant way."""
    if _HAS_GENAI_TYPES and Content is not None and Part is not None:
        return Content(parts=[Part(text=text)])
    return text

async def collect_events(async_gen) -> List[Any]:
    events = []
    async for e in async_gen:
        events.append(e)
    return events

async def run_runner(runner: Runner, user_id: str, session_id: str, text: str) -> List[Any]:
    """
    Works whether runner.run_async returns:
      - an async generator (streaming), or
      - an awaitable
    """
    res = runner.run_async(user_id=user_id, session_id=session_id, new_message=_make_message(text))
    if hasattr(res, "__aiter__"):  # async generator
        return await collect_events(res)
    out = await res                # awaitable
    if isinstance(out, list):
        return out
    return [out]

def _event_to_text(e: Any) -> str:
    """Best-effort conversion of an ADK event to text."""
    if e is None:
        return ""
    if isinstance(e, str):
        return e
    if isinstance(e, dict):
        for k in ("text", "output_text"):
            v = e.get(k)
            if isinstance(v, str):
                return v
        content = e.get("content")
        if isinstance(content, dict):
            parts = content.get("parts") or []
            texts = []
            for p in parts:
                if isinstance(p, dict) and isinstance(p.get("text"), str):
                    texts.append(p["text"])
            return "\n".join(texts)
    for attr in ("text", "output_text"):
        if hasattr(e, attr):
            v = getattr(e, attr)
            if isinstance(v, str):
                return v
    if hasattr(e, "content"):
        c = getattr(e, "content")
        if hasattr(c, "parts"):
            parts = getattr(c, "parts") or []
            texts = []
            for p in parts:
                if hasattr(p, "text") and isinstance(getattr(p, "text"), str):
                    texts.append(getattr(p, "text"))
                elif isinstance(p, dict) and isinstance(p.get("text"), str):
                    texts.append(p["text"])
            if texts:
                return "\n".join(texts)
        if isinstance(c, str):
            return c
    return ""

def extract_last_text(events: List[Any]) -> str:
    for e in reversed(events or []):
        t = _event_to_text(e).strip()
        if t:
            return t
    return ""


In [14]:

# -----------------------
# 4) Aggregation + ranking helpers (for report_markdown)
# -----------------------
def _norm_key(s: str) -> str:
    return re.sub(r"\s+", " ", (s or "").strip().lower())

def _paper_cite(p: dict) -> str:
    title = p.get("title") or "(untitled)"
    authors = ", ".join(p.get("authors") or []) or "UNKNOWN"
    journal = p.get("journal") or "UNKNOWN JOURNAL"
    year = p.get("year") or "n.d."
    doi = p.get("doi")
    doi_txt = f", DOI: {doi}" if doi else ""
    return f"{title} — {authors}. {journal} ({year}){doi_txt}"

def _polarity_weight(polarity: str) -> float:
    polarity = (polarity or "").lower()
    if polarity == "supports":
        return 1.0
    if polarity == "mixed":
        return 0.6
    if polarity == "unclear":
        return 0.35
    if polarity == "refutes":
        return 0.2
    return 0.35

def aggregate_findings(papers: list) -> dict:
    buckets = defaultdict(lambda: defaultdict(lambda: {
        "score": 0.0,
        "count": 0,
        "papers": set(),
        "snippets": [],
        "polarities": Counter(),
    }))

    for p in papers:
        pscore = float(p.get("relevance_score") or 0.0)
        cite = _paper_cite(p)

        for f in (p.get("findings") or []):
            cat = (f.get("category") or "other").lower()
            name = (f.get("name") or "").strip()
            if not name:
                continue
            key = _norm_key(name)

            pol = (f.get("polarity") or "unclear").lower()
            w = _polarity_weight(pol)

            score_add = w * (0.75 + 0.25 * min(1.0, pscore))

            entry = buckets[cat][key]
            entry["score"] += score_add
            entry["count"] += 1
            entry["papers"].add(cite)
            entry["polarities"][pol] += 1

            snip = (f.get("snippet") or "").strip()
            if snip:
                entry["snippets"].append({"snippet": snip, "cite": cite, "polarity": pol})

    ranked = {}
    for cat, items in buckets.items():
        ranked_items = []
        for key, v in items.items():
            ranked_items.append({
                "name": key,
                "score": v["score"],
                "mentions": v["count"],
                "papers": sorted(v["papers"]),
                "polarities": v["polarities"],
                "snippets": v["snippets"][:8],
            })
        ranked_items.sort(key=lambda x: (x["score"], len(x["papers"])), reverse=True)
        ranked[cat] = ranked_items
    return ranked

def pick_genes(ranked: dict, top_n: int = 12) -> list:
    genes = ranked.get("gene", []) + ranked.get("genes", [])
    out = []
    for g in genes:
        token = (g["name"] or "").upper()
        token = re.sub(r"[^A-Z0-9\-]", " ", token).strip()
        token = token.split()[0] if token else ""
        if 2 <= len(token) <= 12:
            out.append({"gene": token, "score": g["score"], "papers": g["papers"]})
    best = {}
    for r in out:
        if r["gene"] not in best or r["score"] > best[r["gene"]]["score"]:
            best[r["gene"]] = r
    return sorted(best.values(), key=lambda x: x["score"], reverse=True)[:top_n]


In [15]:

# -----------------------
# 5) Agents
# -----------------------
METADATA_SYSTEM = """
You extract bibliographic metadata from the first pages of a biomedical paper.

Return ONE valid JSON object ONLY with EXACTLY these keys:
{
  "title": "...",
  "authors": ["Last, First", "..."],
  "year": null,
  "journal": null,
  "doi": null
}

Rules:
- If unsure, use null (or [] for authors).
- Do not hallucinate.
"""

RELEVANCE_SYSTEM = """
You decide whether a paper is relevant to the CONDITION.

Return ONE valid JSON object ONLY with EXACTLY these keys:
{
  "paper_id": "...",
  "title": "...",
  "is_relevant": false,
  "relevance_score": 0.0,
  "reason": "...",
  "matched_terms": ["...", "..."]
}

Rules:
- Use ONLY the provided text.
- relevance_score is 0..1.
- If uncertain, be conservative.
"""

EXTRACTION_SYSTEM = """
You extract structured findings from a paper for the CONDITION.

Return ONE valid JSON object ONLY with EXACTLY these keys:
{
  "paper_id": "...",
  "title": "...",
  "year": null,
  "journal": null,
  "condition": "...",
  "relevance_score": 0.0,
  "summary": "...",
  "key_takeaways": ["...", "..."],
  "findings": [
    {"name":"...", "category":"definition|gene|symptom|treatment|outcome|population|other",
     "polarity":"supports|refutes|mixed|unclear",
     "snippet":"<=2 sentences", "section":"methods|results|discussion|abstract|unknown"}
  ]
}

Rules:
- Use ONLY the provided text.
- Do NOT invent details.
- Keep snippets short (<=2 sentences).
- Put gene names under category "gene" when possible.
"""

QA_SYSTEM = """
You are a biomedical literature Q&A agent.

You will be given:
- a user question
- evidence items (snippets) extracted from papers, each with citation fields

Your job:
- Answer ONLY using the evidence items.
- If the evidence does not contain an answer, say: "Not found in the provided papers."
- Use inline citations like [1], [2] corresponding to evidence item IDs.

Output Markdown:
1) Answer
2) Evidence (bullets with snippet + citation number)
3) References (numbered: title, authors, journal, year; include DOI if present)
Do not invent papers or details.
"""

metadata_agent = LlmAgent(name="PaperMetadataExtractor", model=llm, instruction=METADATA_SYSTEM)
relevance_agent = LlmAgent(name="PaperRelevanceScreener", model=llm, instruction=RELEVANCE_SYSTEM)
extraction_agent = LlmAgent(name="PaperExtractor", model=llm, instruction=EXTRACTION_SYSTEM)
qa_agent = LlmAgent(name="PaperQAAgent", model=llm, instruction=QA_SYSTEM)


In [16]:

# -----------------------
# 6) Pipeline (creates the Genex Evidence Report)
# -----------------------
def infer_meta_from_text(paper_id: str, path: str, raw_text: str) -> Dict[str, Any]:
    title = os.path.splitext(os.path.basename(path))[0]
    year = None
    m = re.search(r"(19\d{2}|20\d{2})", raw_text[:4000])
    if m:
        try:
            year = int(m.group(1))
        except Exception:
            year = None
    return {
        "paper_id": paper_id,
        "path": path,
        "title": title,
        "year": year,
        "journal": None,
        "authors": [],
        "doi": None,
    }

async def extract_metadata(paper_id: str, path: str) -> Dict[str, Any]:
    meta_text = pdf_to_text(path, max_pages=2)[:60000]
    meta = infer_meta_from_text(paper_id, path, meta_text)

    session_id = f"meta-{paper_id}"
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)
    runner = Runner(app_name=APP_NAME, agent=metadata_agent, session_service=session_service)

    msg = f"""PAPER_ID: {paper_id}
FILENAME: {os.path.basename(path)}

TEXT (first pages):
{meta_text}
"""
    events = await run_runner(runner, user_id=USER_ID, session_id=session_id, text=msg)
    meta_json = safe_json_extract(extract_last_text(events))

    if isinstance(meta_json, dict):
        meta["title"] = meta_json.get("title") or meta["title"]
        meta["journal"] = meta_json.get("journal") or meta["journal"]
        meta["year"] = meta_json.get("year") or meta["year"]
        meta["doi"] = meta_json.get("doi") or meta.get("doi")
        meta["authors"] = meta_json.get("authors") or meta.get("authors", [])
    return meta

async def screen_relevance(condition: str, paper_id: str, title: str, text: str) -> RelevanceDecision:
    session_id = f"rel-{paper_id}"
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)
    runner = Runner(app_name=APP_NAME, agent=relevance_agent, session_service=session_service)

    prompt = f"""CONDITION: {condition}
PAPER_ID: {paper_id}
TITLE: {title}

TEXT:
{text[:120000]}
"""
    events = await run_runner(runner, user_id=USER_ID, session_id=session_id, text=prompt)
    obj = safe_json_extract(extract_last_text(events)) or {}
    obj.setdefault("paper_id", paper_id)
    obj.setdefault("title", title)
    return RelevanceDecision(**obj)

async def extract_paper(condition: str, paper_id: str, meta: Dict[str, Any], text: str, relevance_score: float) -> PaperExtraction:
    session_id = f"ext-{paper_id}"
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)
    runner = Runner(app_name=APP_NAME, agent=extraction_agent, session_service=session_service)

    prompt = f"""CONDITION: {condition}
PAPER_ID: {paper_id}
TITLE: {meta.get('title')}
YEAR: {meta.get('year')}
JOURNAL: {meta.get('journal')}
AUTHORS: {", ".join(meta.get("authors") or [])}
DOI: {meta.get('doi')}

TEXT:
{text[:140000]}
"""
    events = await run_runner(runner, user_id=USER_ID, session_id=session_id, text=prompt)
    obj = safe_json_extract(extract_last_text(events)) or {}

    obj["paper_id"] = paper_id
    obj["title"] = meta.get("title") or obj.get("title") or ""
    obj["authors"] = meta.get("authors") or obj.get("authors") or []
    obj["year"] = meta.get("year") or obj.get("year")
    obj["journal"] = meta.get("journal") or obj.get("journal")
    obj["doi"] = meta.get("doi") or obj.get("doi")
    obj["condition"] = condition
    obj["relevance_score"] = float(relevance_score or obj.get("relevance_score") or 0.0)

    if not isinstance(obj.get("findings"), list):
        obj["findings"] = []
    return PaperExtraction(**obj)

async def run_condition_folder(condition: str, folder: str) -> Dict[str, Any]:
    pdfs = list_pdfs(folder)
    papers: List[Dict[str, Any]] = []
    papers_screened = 0
    papers_relevant = 0

    for idx, path in enumerate(pdfs, start=1):
        paper_id = f"paper_{idx:03d}"
        raw_text = pdf_to_text(path, max_pages=MAX_PAGES_PER_PDF)
        papers_screened += 1

        meta = await extract_metadata(paper_id, path)
        rel = await screen_relevance(condition, paper_id, meta.get("title") or paper_id, raw_text)

        if not rel.is_relevant or rel.relevance_score < 0.20:
            continue

        papers_relevant += 1
        extraction = await extract_paper(condition, paper_id, meta, raw_text, rel.relevance_score)
        papers.append(extraction.model_dump())

    ranked = aggregate_findings(papers)

    definitions = ranked.get("definition", [])
    genes = pick_genes(ranked)
    symptoms = ranked.get("symptom", [])
    treatments = ranked.get("treatment", [])
    outcomes = ranked.get("outcome", [])

    def is_conflicted(item):
        pol = item["polarities"]
        total = sum(pol.values()) or 1
        mixed_refutes = (pol.get("mixed", 0) + pol.get("refutes", 0)) / total
        unclear = (pol.get("unclear", 0)) / total
        return mixed_refutes >= 0.4 or unclear >= 0.5

    conflicting = []
    for cat in ("symptom", "treatment", "outcome", "definition", "other"):
        for it in ranked.get(cat, [])[:30]:
            if is_conflicted(it):
                conflicting.append((it["score"], cat, it))
    conflicting.sort(key=lambda x: x[0], reverse=True)
    conflicting = conflicting[:12]

    # up to 3 definition snippets
    def_snips = []
    for it in definitions[:3]:
        if it["snippets"]:
            def_snips.append(it["snippets"][0])

    md = []
    md.append(f"# Genex Evidence Report: {condition}")
    md.append("")
    md.append(f"- Papers screened: **{papers_screened}**")
    md.append(f"- Papers relevant: **{papers_relevant}**")
    md.append("")

    md.append("## 1) Executive Summary")
    if def_snips:
        md.append("**Definition (from literature):**")
        for s in def_snips[:3]:
            md.append(f"- {s['snippet']}  \n  _Source:_ {s['cite']}")
        md.append("")
    else:
        md.append("**Definition (from literature):** Not found in extracted definition snippets.\n")

    if genes:
        md.append("**Genes affected (evidence-weighted):**")
        for g in genes:
            sources = "; ".join(g["papers"][:3]) + (" ..." if len(g["papers"]) > 3 else "")
            md.append(f"- **{g['gene']}** (evidence score: {g['score']:.2f})  \n  _Sources:_ {sources}")
        md.append("")
    else:
        md.append("**Genes affected:** Not found in extracted gene snippets.\n")

    def render_ranked_section(title, items, max_items=15):
        md.append(title)
        if not items:
            md.append("_No evidence items extracted for this section._\n")
            return
        for i, it in enumerate(items[:max_items], start=1):
            md.append(f"### {i}. {it['name']}")
            md.append(f"- Evidence score: **{it['score']:.2f}** | Papers: **{len(it['papers'])}** | Mentions: **{it['mentions']}**")
            for s in it["snippets"][:2]:
                md.append(f"- _Evidence:_ {s['snippet']}  \n  _Source:_ {s['cite']}")
            md.append("")

    render_ranked_section("## 2) Symptoms supported by literature (ranked)", symptoms, max_items=15)
    render_ranked_section("## 3) Treatments / interventions supported by literature (ranked)", treatments, max_items=15)
    render_ranked_section("## 4) Outcomes / prognosis (if present)", outcomes, max_items=12)

    md.append("## 5) Conflicting / uncertain areas")
    if not conflicting:
        md.append("_No major conflicting/uncertain clusters detected from extracted polarities._\n")
    else:
        for i, (_, cat, it) in enumerate(conflicting, start=1):
            md.append(f"### {i}. ({cat}) {it['name']}")
            md.append(f"- Polarity counts: {dict(it['polarities'])} | Papers: **{len(it['papers'])}**")
            for s in it["snippets"][:3]:
                md.append(f"- _Evidence:_ {s['snippet']}  \n  _Source:_ {s['cite']}")
            md.append("")

    md.append("## 6) Limitations + what to read next")
    md.append("- This report is based on **automated extraction from PDFs**; some PDFs may have incomplete text extraction or missing sections.")
    md.append("- Findings are limited to what was present in the provided folder and what the extraction agent captured as snippets.")
    md.append("- Evidence scores are **heuristic** (polarity + relevance) and not a substitute for clinical-grade appraisal (study design, sample size, bias).")
    md.append("")
    md.append("**What to read next (top evidence sources):**")
    for p in sorted(papers, key=lambda p: float(p.get("relevance_score") or 0.0), reverse=True)[:8]:
        md.append(f"- {_paper_cite(p)}")
    md.append("")

    return {
        "condition": condition,
        "papers_screened": papers_screened,
        "papers_relevant": papers_relevant,
        "papers": papers,
        "report_markdown": "\\n".join(md),
    }


In [17]:

# -----------------------
# 7) Q&A over extracted evidence (unique session per question)
# -----------------------
def build_evidence_index(papers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    rows = []
    for p in papers:
        for f in (p.get("findings") or []):
            rows.append({
                "paper_id": p.get("paper_id"),
                "title": p.get("title"),
                "authors": p.get("authors") or [],
                "year": p.get("year"),
                "journal": p.get("journal"),
                "doi": p.get("doi"),
                "condition": p.get("condition"),
                "relevance_score": p.get("relevance_score", 0.0),
                "name": f.get("name"),
                "category": f.get("category"),
                "polarity": f.get("polarity"),
                "section": f.get("section"),
                "snippet": f.get("snippet"),
            })
    return rows

def _tokenize(s: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", (s or "").lower())

def retrieve_evidence(question: str, evidence_rows: List[Dict[str, Any]], top_k: int = 14) -> List[Dict[str, Any]]:
    qtok = set(_tokenize(question))
    if not qtok:
        return evidence_rows[:top_k]

    scored = []
    for r in evidence_rows:
        blob = " ".join([
            str(r.get("name","")),
            str(r.get("category","")),
            str(r.get("snippet","")),
            str(r.get("title","")),
            str(r.get("journal","")),
        ]).lower()
        btok = set(_tokenize(blob))
        overlap = len(qtok & btok)
        boost = 0.0
        if (r.get("polarity") or "").lower() == "supports":
            boost += 0.25
        boost += 0.15 * float(r.get("relevance_score") or 0.0)
        score = overlap + boost
        if overlap > 0:
            scored.append((score, r))

    scored.sort(key=lambda x: x[0], reverse=True)
    return [r for _, r in scored[:top_k]]

async def ask_papers(question: str, result: Dict[str, Any], top_k: int = 14) -> str:
    papers = result.get("papers") or []
    evidence_rows = build_evidence_index(papers)
    top = retrieve_evidence(question, evidence_rows, top_k=top_k)

    lines = []
    for i, r in enumerate(top, start=1):
        authors = ", ".join(r.get("authors") or [])
        lines.append(
            f"""EVIDENCE_ITEM [{i}]
paper_id: {r.get("paper_id")}
title: {r.get("title")}
authors: {authors if authors else "UNKNOWN"}
journal: {r.get("journal")}
year: {r.get("year")}
doi: {r.get("doi")}
category: {r.get("category")}
name: {r.get("name")}
polarity: {r.get("polarity")}
section: {r.get("section")}
snippet: {r.get("snippet")}
"""
        )

    qa_context = "\\n".join(lines) if lines else "NO EVIDENCE ITEMS RETRIEVED."

    session_id = f"qa-{uuid.uuid4().hex[:8]}"
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)
    runner = Runner(app_name=APP_NAME, agent=qa_agent, session_service=session_service)

    prompt = f"""USER QUESTION:
{question}

{qa_context}
"""
    events = await run_runner(runner, user_id=USER_ID, session_id=session_id, text=prompt)
    return extract_last_text(events)


In [18]:

# -----------------------
# 8) Example run
# -----------------------
# If your notebook doesn't support top-level `await`, uncomment:
# import nest_asyncio
# nest_asyncio.apply()

# result = await run_condition_folder("L-serine deficiency disorder", PAPERS_DIR)
# print(result["papers_screened"], result["papers_relevant"])
# print(result["report_markdown"][:2500])
# print(await ask_papers("What genes are linked to serine deficiency disorder?", result))


In [19]:
result = await run_condition_folder("L-serine deficiency disorder", PAPERS_DIR)

In [20]:
print(result["papers_screened"], result["papers_relevant"])

11 8


In [21]:
print(result["report_markdown"][:2500])

# Genex Evidence Report: L-serine deficiency disorder\n\n- Papers screened: **11**\n- Papers relevant: **8**\n\n## 1) Executive Summary\n**Definition (from literature):**\n- Neu-Laxova syndrome is characterized by severe intrauterine growth deficiency, microcephaly, congenital bilateral cataracts, and ichthyosis.  
  _Source:_ Serine Deficiency Disorders — van der Crabben, Saskia N, de Koning, Tom J. GeneReviews (2023)\n- This case represents the first known instance of PSAT1 mutation diagnosed and treated in an adult, causing ichthyosis and progressive neuropathy.  
  _Source:_ Adult diagnosis of congenital serine biosynthesis defect: a treatable cause of progressive neuropathy — Debs, Sarah, Ferreira, Carlos R., Groden, Catherine, Kim, H. Jeffrey, King, Kelly A., King, Monique C., Lehky, Tanya, Cowen, Edward W., Brown, Laura H., Merideth, Melissa, Owen, Carter M., Macnamara, Ellen, Toro, Camilo, Gahl, William A., Soldatos, Ariane. Am J Med Genet A (2021), DOI: 10.1002/ajmg.a.62245\n-

In [22]:
print(result["report_markdown"])

# Genex Evidence Report: L-serine deficiency disorder\n\n- Papers screened: **11**\n- Papers relevant: **8**\n\n## 1) Executive Summary\n**Definition (from literature):**\n- Neu-Laxova syndrome is characterized by severe intrauterine growth deficiency, microcephaly, congenital bilateral cataracts, and ichthyosis.  
  _Source:_ Serine Deficiency Disorders — van der Crabben, Saskia N, de Koning, Tom J. GeneReviews (2023)\n- This case represents the first known instance of PSAT1 mutation diagnosed and treated in an adult, causing ichthyosis and progressive neuropathy.  
  _Source:_ Adult diagnosis of congenital serine biosynthesis defect: a treatable cause of progressive neuropathy — Debs, Sarah, Ferreira, Carlos R., Groden, Catherine, Kim, H. Jeffrey, King, Kelly A., King, Monique C., Lehky, Tanya, Cowen, Edward W., Brown, Laura H., Merideth, Melissa, Owen, Carter M., Macnamara, Ellen, Toro, Camilo, Gahl, William A., Soldatos, Ariane. Am J Med Genet A (2021), DOI: 10.1002/ajmg.a.62245\n-

In [23]:
print(await ask_papers("What genes are linked to serine deficiency disorder?", result))

1) Answer  
The genes linked to serine deficiency disorder include **PHGDH**, **PSAT1**, and **PSPH**. 

2) Evidence  
- "The diagnosis of a serine deficiency disorder is established in a proband with biallelic pathogenic variants in **PHGDH**, **PSAT1**, or **PSPH** identified by molecular genetic testing." [2]  
- "Patient 1 has serine deficiency due to **3-phosphoglycerate dehydrogenase (PHGDH)** deficiency." [4]  
- "Patient 2 has serine deficiency due to **phosphoserine aminotransferase (PSAT1)** deficiency." [5]  
- "WES found the same homozygous variant c.43G > C (p.A15P) in the **PSAT1** gene, which cosegregated in the two families." [14]  
- "All three identified genes have been previously implicated in serine-deficiency disorders." [1]  

3) References  
1. Neu-Laxova Syndrome Is a Heterogeneous Metabolic Disorder Caused by Defects in Enzymes of the L-Serine Biosynthesis Pathway, Acuna-Hidalgo R, et al., The American Journal of Human Genetics, 2014. doi: 10.1016/j.ajhg.2014.0