# Stage 2: Cognitive Stress Test (Semantic Extraction)

This notebook tests the `TextSynthesizer` against specific 'stress scenarios' designed to break standard RAG pipelines.

**Scenarios Tested:**
| Scenario | Financial Nuance | Why Standard RAG Fails | UFL Success Criteria |
|---|---|---|---|
| A | Pro-Forma | Confuses "Adjusted" vs "GAAP" | Distinct Metrics |
| B | Soft Numbers | Ignores text without digits | NaN + Nuance Note |
| C | Relative Time | Misses "Prior Year" context | Metadata Injection (2023) |
| D | Graph Edge | Ignores non-numeric relations | Related_Entity field |
| E | Formulas | Extracts partial fixed rate | NaN + Formula in Note |
| F | Constraints | Confuses Limit for Actual | Metric = "Limit" |
| G | Negation | Hallucinates generic text | Explicit None/False |
| H | Nested Units | Grabs wrong (first) integer | Semantic Disambiguation |


In [1]:
%load_ext autoreload
%autoreload 2

import sys
import os
import nest_asyncio

# Add src to path
sys.path.append(os.path.abspath("../src"))

from venra.synthesis import TextSynthesizer
from venra.models import TextBlock
from venra.logging_config import logger

nest_asyncio.apply()

# Initialize Synthesizer (Simulated Entity)
synthesizer = TextSynthesizer(entity_id="ID_TEST", entity_name_raw="Test Corp")

## Helper Function

In [2]:
async def run_scenario(name, input_text, section_path, expected_desc, context_str=""):
    """Run one stress scenario through the synthesizer and print each extracted fact."""
    print(f"\n--- Scenario {name} ---")
    print(f"Input: '{input_text}'")
    print(f"Expected: {expected_desc}")
    print("-" * 20)
    
    block = TextBlock(
        content=input_text,
        section_path=section_path
    )
    
    # We pass context_str if provided (e.g. for Date resolution)
    rows = await synthesizer.extract_facts(block, context_str=context_str)
    
    if not rows:
        print("RESULT: No facts extracted.")
        return

    for r in rows:
        val_display = r.value if r.value is not None else "NaN"
        print(f"Metric: {r.metric_name} | Value: {val_display} | Unit: {r.unit} | Period: {r.period}")
        if r.nuance_note:
            print(f"  > Nuance: {r.nuance_note}")
        print("")
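
The helper above only prints results, so a pass or fail still has to be judged by eye. As a minimal sketch of how the expected descriptions could become checks, the block below uses a hypothetical `Fact` dataclass as a stand-in for the rows returned by `extract_facts` (only the attribute names `metric_name`, `value`, `unit`, `period`, and `nuance_note` come from the helper; `find_fact` and `value_matches` are illustrative names, not part of `venra`). The sample rows are the two facts recorded in the Scenario A output below.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for a row returned by extract_facts."""
    metric_name: str
    value: Optional[float]
    unit: str = ""
    period: str = "UNKNOWN"
    nuance_note: Optional[str] = None


def find_fact(rows, metric_name):
    """Return the first row whose metric name matches (case-insensitive), else None."""
    for r in rows:
        if r.metric_name.lower() == metric_name.lower():
            return r
    return None


def value_matches(fact, expected):
    """True if the value equals `expected`; expected=None means 'no numeric value'."""
    if expected is None:
        return fact.value is None
    return fact.value is not None and math.isclose(fact.value, expected)


# The two rows recorded in the Scenario A output
rows_a = [
    Fact("Net Income", 50_000_000.0, "USD", "2023"),
    Fact("Adjusted Net Income", 60_000_000.0, "USD", "2023",
         "Adjusted for restructuring costs of $10 million"),
]

print(value_matches(find_fact(rows_a, "Net Income"), 50_000_000.0))
print(find_fact(rows_a, "Restructuring Costs") is None)
```

A check of this kind makes a missing fact, such as the absent restructuring-costs row in Scenario A, show up as an explicit failure rather than something to spot in printed output.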

## Scenario A: The "Pro Forma" vs. "GAAP" Confusion

In [3]:
await run_scenario(
    name="A",
    input_text="Net income was $50 million. Adjusted for restructuring costs of $10 million, Adjusted Net Income was $60 million.",
    section_path=["MD&A", "Results"],
    expected_desc="3 Facts: Net Income (50), Adjusted Net Income (60), Restructuring Costs (10). Distinct metrics.",
)


--- Scenario A ---
Input: 'Net income was $50 million. Adjusted for restructuring costs of $10 million, Adjusted Net Income was $60 million.'
Expected: 3 Facts: Net Income (50), Adjusted Net Income (60), Restructuring Costs (10). Distinct metrics.
--------------------
Metric: Net Income | Value: 50000000.0 | Unit: USD | Period: 2023

Metric: Adjusted Net Income | Value: 60000000.0 | Unit: USD | Period: 2023
  > Nuance: Adjusted for restructuring costs of $10 million



## Scenario B: The "Substantially All" (Vague Quantifier)

In [4]:
await run_scenario(
    name="B",
    input_text="Substantially all of our revenue is denominated in US Dollars.",
    section_path=["Risk Factors", "Currency"],
    expected_desc="Metric: Revenue Denomination, Value: NaN, Nuance: 'Substantially all in USD'"
)


--- Scenario B ---
Input: 'Substantially all of our revenue is denominated in US Dollars.'
Expected: Metric: Revenue Denomination, Value: NaN, Nuance: 'Substantially all in USD'
--------------------
Metric: Revenue Denomination | Value: NaN | Unit: USD | Period: UNKNOWN
  > Nuance: Substantially all



## Scenario C: Implicit Dates (Relative Temporal Resolution)

In [5]:
# We inject the context that the Current Fiscal Year is 2023
await run_scenario(
    name="C",
    input_text="Isn the prior fiscal year, we acquired WidgetCorp.",
    section_path=["Business", "Acquisitions"],
    expected_desc="Metric: Acquisition, Period: 2022 (Resolved from 'prior year' + 2023 context)",
    context_str="Current Fiscal Year: 2023",
)


--- Scenario C ---
Input: 'Isn the prior fiscal year, we acquired WidgetCorp.'
Expected: Metric: Acquisition, Period: 2022 (Resolved from 'prior year' + 2023 context)
--------------------
Metric: Acquisitions | Value: NaN | Unit:  | Period: 2022
  > Nuance: substantially all
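
To make the expected resolution concrete, here is a rule-based sketch of the arithmetic behind "prior fiscal year" plus the injected context. It is only an illustration: `resolve_relative_period` is a hypothetical function, and nothing here is meant to describe how `TextSynthesizer` resolves periods internally.

```python
import re


def resolve_relative_period(text, context_str):
    """Resolve 'prior/current (fiscal) year' against a 'Current Fiscal Year: YYYY' context."""
    m = re.search(r"Current Fiscal Year:\s*(\d{4})", context_str)
    if m is None:
        return "UNKNOWN"  # no anchor year available
    current = int(m.group(1))
    lowered = text.lower()
    if re.search(r"\bprior (fiscal )?year\b", lowered):
        return str(current - 1)
    if re.search(r"\bcurrent (fiscal )?year\b", lowered):
        return str(current)
    return "UNKNOWN"


print(resolve_relative_period(
    "In the prior fiscal year, we acquired WidgetCorp.",
    "Current Fiscal Year: 2023",
))
```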



## Scenario D: Shared Lease Obligations (Graph Dependency)

In [6]:
await run_scenario(
    name="D",
    input_text="We lease our headquarters from a Variable Interest Entity (VIE) owned by our CEO.",
    section_path=["Properties", "Leases"],
    expected_desc="Metric: Lease Counterparty, Value: NaN, Nuance: 'VIE owned by CEO' or similar.",
)


--- Scenario D ---
Input: 'We lease our headquarters from a Variable Interest Entity (VIE) owned by our CEO.'
Expected: Metric: Lease Counterparty, Value: NaN, Nuance: 'VIE owned by CEO' or similar.
--------------------
Metric: Lease | Value: NaN | Unit:  | Period: UNKNOWN
  > Nuance: Variable Interest Entity (VIE) owned by our CEO

Metric: Lease | Value: NaN | Unit:  | Period: UNKNOWN
  > Nuance: Leased from a Variable Interest Entity (VIE) owned by our CEO



In [7]:
await run_scenario(
    name="D1",
    input_text="Revenue increased 15% compared to the prior year period.",
    section_path=["MD&A", "Revenue Analysis"],
    expected_desc="Metric: Revenue Growth, Value: 15.0, Unit: Percent,Nuance: 'increased 15% compared to prior year'"
)



--- Scenario D1 ---
Input: 'Revenue increased 15% compared to the prior year period.'
Expected: Metric: Revenue Growth, Value: 15.0, Unit: Percent,Nuance: 'increased 15% compared to prior year'
--------------------
Metric: Revenue Growth | Value: 15.0 | Unit: Percent | Period: 2022



In [8]:
await run_scenario(
    name="D2",
    input_text="We issued $500 of senior notes due 2030 (1).",
    section_path=["Notes", "Debt"],
    expected_desc="Metric: Senior Notes Issuance, Value: 500000000.0 (Scaled by footnote), Nuance: 'due 2030 (1)'",
    context_str="Footnote (1): Dollars in millions."
)


--- Scenario D2 ---
Input: 'We issued $500 of senior notes due 2030 (1).'
Expected: Metric: Senior Notes Issuance, Value: 500000000.0 (Scaled by footnote), Nuance: 'due 2030 (1)'
--------------------
Metric: Senior Notes Issued | Value: 500000000.0 | Unit: USD | Period: UNKNOWN



## Scenario E: The "Dynamic Formula" Trap (Floating Rates)

Why it fails standard RAG: The text contains a number (2.25%), but the actual interest rate is a variable formula (SOFR + Margin). Standard models eagerly extract 2.25% as the interest rate, which is factually wrong and materially understates the cost of debt.

Goal: Ensure the system recognizes a Formula as a distinct nuance, rather than forcing a partial number.
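
The goal above can be expressed as a simple acceptance rule: when the text describes a benchmark plus a margin, the extracted value should be empty and the benchmark should appear in the nuance note. The sketch below is one way to write that rule; `floating_rate_formula` and `passes_formula_check` are hypothetical names, and the regular expression covers only a few phrasings and benchmarks chosen for illustration.

```python
import re
from typing import Optional

# Illustrative pattern: "<benchmark> plus [an] [applicable] margin of X%"
_FORMULA = re.compile(
    r"\b(SOFR|LIBOR|EURIBOR)\b\s+plus\s+(?:an?\s+)?(?:applicable\s+)?"
    r"margin\s+of\s+(\d+(?:\.\d+)?)%",
    flags=re.IGNORECASE,
)


def floating_rate_formula(text) -> Optional[str]:
    """Return 'BENCHMARK + X%' if the text describes a floating rate, else None."""
    m = _FORMULA.search(text)
    if m is None:
        return None
    return f"{m.group(1).upper()} + {m.group(2)}%"


def passes_formula_check(text, value, nuance_note):
    """A formula-based rate passes only with no numeric value and the benchmark in the note."""
    formula = floating_rate_formula(text)
    if formula is None:
        return True  # not a floating-rate sentence; nothing to enforce
    benchmark = formula.split(" + ")[0]
    return value is None and nuance_note is not None and benchmark in nuance_note.upper()


text = ("The Term Loan bears interest at a floating rate equal to SOFR "
        "plus an applicable margin of 2.25%.")
print(floating_rate_formula(text))
print(passes_formula_check(text, 2.25, None))                     # margin forced into the value
print(passes_formula_check(text, None, "Formula: SOFR + 2.25%"))  # formula kept in the note
```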

In [9]:
await run_scenario(
    name="E",
    input_text="The Term Loan bears interest at a floating rate equal to SOFR plus an applicable margin of 2.25%.",
    section_path=["Debt", "Credit Agreement"],
    expected_desc="Metric: Term Loan Interest Rate, Value: NaN, Nuance: 'Formula: SOFR + 2.25%'",
    # CRITICAL: If the model extracts 2.25, the test MUST fail.
)


--- Scenario E ---
Input: 'The Term Loan bears interest at a floating rate equal to SOFR plus an applicable margin of 2.25%.'
Expected: Metric: Term Loan Interest Rate, Value: NaN, Nuance: 'Formula: SOFR + 2.25%'
--------------------
Metric: Applicable Margin | Value: 2.25 | Unit: Percent | Period: UNKNOWN



## Scenario F: The "Constraint vs. Actual" Hallucination (Covenants)

Why it fails standard RAG: The text says "Leverage ratio not exceeding 3.50." Standard vector search sees "Leverage Ratio" and "3.50" and reports: "Current Leverage Ratio is 3.50."

Reality: The company might be at 1.2x leverage. Reporting the limit as the actual value suggests the company is on the verge of breaching its covenant.

Goal: Distinguish Performance Metrics (Actuals) from Compliance Metrics (Limits).
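
One rough way to picture the distinction is a cue-word check that labels a sentence's number as a limit when it is wrapped in covenant language. The sketch below assumes a small, hand-picked cue list (`LIMIT_CUES`) and a hypothetical `classify_metric` function; it is a keyword heuristic for illustration, not the extraction logic under test.

```python
import re

# Illustrative covenant phrasings that signal a ceiling or floor rather than a reported value
LIMIT_CUES = (
    r"not exceeding",
    r"not to exceed",
    r"no (?:more|greater) than",
    r"at least",
    r"\bmaximum\b",
    r"\bminimum\b",
)


def classify_metric(sentence):
    """Return 'Limit' if the sentence uses covenant-style constraint language, else 'Actual'."""
    lowered = sentence.lower()
    for cue in LIMIT_CUES:
        if re.search(cue, lowered):
            return "Limit"
    return "Actual"


print(classify_metric(
    "We are required to maintain a Net Leverage Ratio not exceeding 3.50 to 1.00."
))
# Hypothetical sentence reporting the actual ratio
print(classify_metric("Our Net Leverage Ratio was 1.20 to 1.00."))
```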

In [10]:
await run_scenario(
    name="F",
    input_text="We are required to maintain a Net Leverage Ratio not exceeding 3.50 to 1.00. As of Dec 31, we were in compliance.",
    section_path=["Liquidity", "Covenants"],
    expected_desc="Metric: Leverage Ratio Limit, Value: 3.50, Nuance: 'Covenant Ceiling'. (Optional: Actual Leverage = NaN or 'Compliant')",
)


--- Scenario F ---
Input: 'We are required to maintain a Net Leverage Ratio not exceeding 3.50 to 1.00. As of Dec 31, we were in compliance.'
Expected: Metric: Leverage Ratio Limit, Value: 3.50, Nuance: 'Covenant Ceiling'. (Optional: Actual Leverage = NaN or 'Compliant')
--------------------
Metric: Net Leverage Ratio | Value: 3.5 | Unit: Ratio | Period: UNKNOWN

Metric: Compliance | Value: NaN | Unit: Compliance Status | Period: UNKNOWN
  > Nuance: As of Dec 31, we were in compliance



## Scenario G: The "Negative Assurance" (The Zero Check)

Why it fails standard RAG: When asked "What are the off-balance sheet arrangements?", a standard LLM often tries to be helpful by hallucinating a generic definition or picking up a nearby irrelevant number.

Reality: "We have no off-balance sheet arrangements" is a distinct, high-value fact (Boolean False or None).

Goal: Verify the system can extract an explicit statement of absence as a positive fact.
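
As a sketch of what "absence as a positive fact" could look like in data, the block below returns a record with `value` set to `False` when the text explicitly denies having the subject. The function name, the record shape, and the short list of negation patterns are assumptions made for illustration; the check does not verify that the negation actually governs the subject, so it would need more care on longer passages.

```python
import re

# Illustrative negation patterns; real filings use many more phrasings
NEGATION = re.compile(
    r"\b(?:do|does|did)\s+not\s+have\s+any\b|\bhave\s+no\b|\bhas\s+no\b",
    flags=re.IGNORECASE,
)


def extract_negative_assurance(text, subject):
    """Return a fact record with value=False if `text` explicitly denies `subject`, else None."""
    if subject.lower() not in text.lower():
        return None  # the subject is not mentioned at all
    if NEGATION.search(text):
        return {
            "metric_name": subject,
            "value": False,
            "nuance_note": "Explicit Negative Assurance",
        }
    return None


text = ("We do not have any off-balance sheet arrangements that have or are "
        "reasonably likely to have a material effect on our financial condition.")
print(extract_negative_assurance(text, "Off-Balance Sheet Arrangements"))
```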

In [11]:
await run_scenario(
    name="G",
    input_text="We do not have any off-balance sheet arrangements that have or are reasonably likely to have a material effect on our financial condition.",
    section_path=["MD&A", "Off-Balance Sheet"],
    expected_desc="Metric: Off-Balance Sheet Arrangements, Value: None (or False), Nuance: 'Explicit Negative Assurance'",
)


--- Scenario G ---
Input: 'We do not have any off-balance sheet arrangements that have or are reasonably likely to have a material effect on our financial condition.'
Expected: Metric: Off-Balance Sheet Arrangements, Value: None (or False), Nuance: 'Explicit Negative Assurance'
--------------------
Metric: Off-Balance Sheet Arrangements | Value: NaN | Unit:  | Period: UNKNOWN
  > Nuance: have or are reasonably likely to have a material effect on our financial condition

Metric: Off-Balance Sheet Arrangements | Value: NaN | Unit:  | Period: UNKNOWN
  > Nuance: We do not have any



## Scenario H: The "Authorized vs. Issued" Collision (Share Count)

Why it fails standard RAG: 10-Ks typically list "Authorized Shares" (e.g., 100M) before "Issued Shares" (e.g., 45M). Simple semantic search often grabs the larger, first number (100M) as the "Share Count."

Reality: Market Cap = Price × Issued Shares. Using "Authorized" inflates the company valuation by 2x-10x.

Goal: Test semantic precision in dense, repetitive text.
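
The size of the error is easy to work through with the share counts used in this scenario's input text. The share price below is a hypothetical value chosen only to make the arithmetic concrete; the overstatement ratio does not depend on it.

```python
def market_cap(price, shares):
    """Market capitalization = price per share x share count."""
    return price * shares


authorized = 100_000_000   # common shares authorized (from the scenario text)
issued = 45_200_000        # common shares issued and outstanding (from the scenario text)
price = 20.0               # hypothetical share price in USD

correct = market_cap(price, issued)       # uses issued shares
inflated = market_cap(price, authorized)  # mistakenly uses authorized shares
overstatement = inflated / correct        # equals authorized / issued

print(f"Correct:  ${correct:,.0f}")
print(f"Inflated: ${inflated:,.0f}")
print(f"Overstatement: {overstatement:.2f}x")
```

With these counts the valuation is overstated by roughly 2.2x, which sits at the low end of the 2x-10x range quoted above.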

In [12]:
await run_scenario(
    name="H",
    input_text="Preferred Stock: 10,000,000 shares authorized, none issued. Common Stock: 100,000,000 shares authorized; 45,200,000 shares issued and outstanding.",
    section_path=["Balance Sheet", "Equity"],
    expected_desc="Fact 1: Common Shares Authorized (100M). Fact 2: Common Shares Issued (45.2M). Fact 3: Preferred Issued (0).",
)


--- Scenario H ---
Input: 'Preferred Stock: 10,000,000 shares authorized, none issued. Common Stock: 100,000,000 shares authorized; 45,200,000 shares issued and outstanding.'
Expected: Fact 1: Common Shares Authorized (100M). Fact 2: Common Shares Issued (45.2M). Fact 3: Preferred Issued (0).
--------------------
Metric: Authorized Preferred Stock | Value: 10000000.0 | Unit: Shares | Period: UNKNOWN

Metric: Authorized Common Stock | Value: 100000000.0 | Unit: Shares | Period: UNKNOWN

Metric: Issued and Outstanding Common Stock | Value: 45200000.0 | Unit: Shares | Period: UNKNOWN

