# Phase 3: 5-Hop Reasoning Pipeline Test

This notebook tests the 5-hop reasoning pipeline implementation.

## Pipeline Overview
1. **Entity Grounding**: Identify financial entities (tickers, currencies)
2. **Financial Aspect Identification**: Identify economic drivers
3. **Implicit Cue Detection**: Detect hedging, euphemisms
4. **Implicit Sentiment Inference**: Classify sentiment (Positive/Negative/Neutral)
5. **Market Implication Inference**: Determine market direction (Bullish/Bearish/Uncertain)
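
The hops listed above run in sequence and share intermediate results. The following is a minimal sketch of that context-passing idea, not the actual `ReasoningPipeline` implementation: `run_hops` and the three toy hop functions are hypothetical stand-ins that use keyword checks instead of LLM calls, and only three of the five hops are shown for brevity.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Hop = Callable[[Context], Dict[str, Any]]


def run_hops(text: str, hops: List[Tuple[str, Hop]]) -> Context:
    """Run each hop in order, storing its output in a shared context."""
    context: Context = {"text": text}
    for name, hop in hops:
        # Each hop sees everything produced so far and adds its own result.
        context[name] = hop(context)
    return context


# Toy stand-ins for the LLM-backed hops (hypothetical, keyword-based).
def entity_grounding(ctx: Context) -> Dict[str, Any]:
    return {"primary_entity": "EUR" if "Euro" in ctx["text"] else None}


def sentiment_inference(ctx: Context) -> Dict[str, Any]:
    # Reads the result of an earlier hop from the shared context.
    entity = ctx["entity_grounding"]["primary_entity"]
    label = "Positive" if entity and "benefit" in ctx["text"] else "Neutral"
    return {"sentiment": label}


def market_implication(ctx: Context) -> Dict[str, Any]:
    sentiment = ctx["sentiment_inference"]["sentiment"]
    mapping = {"Positive": "Bullish", "Negative": "Bearish"}
    return {"market_implication": mapping.get(sentiment, "Uncertain")}


demo_context = run_hops(
    "Euro to benefit from the ECB's pronounced hawkish determination",
    [
        ("entity_grounding", entity_grounding),
        ("sentiment_inference", sentiment_inference),
        ("market_implication", market_implication),
    ],
)
print(demo_context["market_implication"])
```

Keeping every hop's output under its own key means a later hop can read any earlier result without the hops needing to know about each other directly.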

## 1. Setup and Imports

In [1]:
import sys
from pathlib import Path
import json
import pandas as pd
import warnings

warnings.filterwarnings("ignore")

# Add the project root (the parent of the notebook directory) to sys.path so `src` can be imported
sys.path.insert(0, str(Path().resolve().parent))

from src.pipeline import ReasoningPipeline

print("Setup complete!")

Setup complete!


## 2. Initialize Pipeline

**Note**: The pipeline requires an OpenAI API key. Either:
- Set the environment variable: `export OPENAI_API_KEY='your-key'`
- Or pass it directly when initializing (not recommended for production, since the key ends up stored in the notebook)
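
One way to handle the environment-variable option is sketched below. The helpers `resolve_api_key` and `mask_key` are hypothetical names introduced for illustration and are not part of `src.pipeline`. The sketch treats a blank value as missing and masks the key before printing it.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a non-empty OPENAI_API_KEY from the environment, else None."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return key or None


def mask_key(key: str) -> str:
    """Show only the last four characters so the key is safe to print."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


api_key = resolve_api_key()
if api_key is None:
    print("OPENAI_API_KEY is not set")
else:
    print(f"Using key {mask_key(api_key)}")
```

Passing a plain mapping instead of relying on `os.environ` makes the lookup easy to exercise without touching the real environment.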

In [2]:
try:
    import os

    api_key = os.getenv("OPENAI_API_KEY")
    if api_key:
        pipeline = ReasoningPipeline(api_key=api_key)
        print("✓ Pipeline initialized successfully")
    else:
        print("⚠ OPENAI_API_KEY not set. Pipeline will not work without API key.")
        print("   Set it with: export OPENAI_API_KEY='your-key'")
        pipeline = None
except Exception as e:
    print(f"⚠ Error initializing pipeline: {e}")
    pipeline = None

✓ Pipeline initialized successfully


## 3. Test with Example Headlines

Let's test the pipeline with some example financial news headlines.

In [3]:
# Example headlines (from the dataset or similar)
test_cases = [
    {
        "text": "Euro to benefit from the ECB's pronounced hawkish determination",
        "ticker": "EURUSD",
    },
    {"text": "Bitcoin faces headwinds as regulatory concerns persist", "ticker": "BTC"},
    {
        "text": "Apple shares may see gains despite challenging market conditions",
        "ticker": "AAPL",
    },
    {
        "text": "Fed remains cautious about inflation while keeping rates steady",
        "ticker": None,
    },
]

print(f"Prepared {len(test_cases)} test cases")

Prepared 4 test cases


## 4. Run Pipeline on Test Cases

In [4]:
if pipeline is None:
    print("⚠ Skipping pipeline execution - API key not set")
    print("\nTo test the pipeline structure, here's what would happen:")
    print("\n1. Entity Grounding: Identifies tickers/entities")
    print("2. Financial Aspect: Identifies economic drivers")
    print("3. Implicit Cue: Detects hedging/euphemisms")
    print("4. Sentiment Inference: Classifies sentiment")
    print("5. Market Implication: Determines market direction")
else:
    results = []

    for i, test_case in enumerate(test_cases, 1):
        print(f"\n{'=' * 60}")
        print(f"Test Case {i}: {test_case['text']}")
        print(f"{'=' * 60}")

        try:
            # Run pipeline
            context = pipeline.run(
                text=test_case["text"], ticker=test_case.get("ticker")
            )

            # Get final result
            result = pipeline.get_final_result(context)
            results.append(result)

            # Display results
            print("\n📊 Results:")
            print(f"  Entity: {result['entity']}")
            print(f"  Financial Aspect: {result['financial_aspect']}")
            print(f"  Implicit Cues: {result['implicit_cues']}")
            print(f"  Sentiment: {result['sentiment']}")
            print(f"  Market Implication: {result['market_implication']}")

            # Access sentiment_reasoning from nested reasoning dict
            sentiment_reasoning = result.get("reasoning", {}).get("sentiment_reasoning")
            if sentiment_reasoning:
                print("\n💭 Sentiment Reasoning:")
                print(f"  {sentiment_reasoning}")

        except Exception as e:
            print(f"❌ Error: {e}")
            import traceback

            traceback.print_exc()


Test Case 1: Euro to benefit from the ECB's pronounced hawkish determination

📊 Results:
  Entity: EUR
  Financial Aspect: Monetary policy
  Implicit Cues: ['pronounced hawkish determination']
  Sentiment: Positive
  Market Implication: Bullish

💭 Sentiment Reasoning:
  The headline suggests that the Euro will benefit from the European Central Bank's (ECB) hawkish determination. The term 'pronounced hawkish determination' implies that the ECB is taking a strong stance towards tightening monetary policy, which is typically viewed positively by investors. This indicates that the Euro is expected to strengthen as a result of the ECB's actions, leading to a positive sentiment.

Test Case 2: Bitcoin faces headwinds as regulatory concerns persist

📊 Results:
  Entity: Bitcoin
  Financial Aspect: regulatory concerns
  Implicit Cues: ['persist', 'concerns']
  Sentiment: Negative
  Market Implication: Bearish

💭 Sentiment Reasoning:
  The headline indicates that Bitcoin is facing h

## 5. Detailed Results Analysis

In [5]:
if pipeline is not None and "results" in locals() and results:
    # Create DataFrame for easier analysis
    df_results = pd.DataFrame(
        [
            {
                "text": r["text"],
                "entity": r["entity"],
                "financial_aspect": r["financial_aspect"],
                "sentiment": r["sentiment"],
                "market_implication": r["market_implication"],
                "implicit_cues_count": len(r["implicit_cues"])
                if r["implicit_cues"]
                else 0,
            }
            for r in results
        ]
    )

    print("\n📋 Summary Table:")
    display(df_results)

    # Show detailed reasoning for first result
    if results:
        print("\n🔍 Detailed Reasoning (First Case):")
        first_result = results[0]
        print(json.dumps(first_result["reasoning"], indent=2))


📋 Summary Table:


|   | text | entity | financial_aspect | sentiment | market_implication | implicit_cues_count |
|---|------|--------|------------------|-----------|--------------------|---------------------|
| 0 | Euro to benefit from the ECB's pronounced hawk... | EUR | Monetary policy | Positive | Bullish | 1 |
| 1 | Bitcoin faces headwinds as regulatory concerns... | Bitcoin | regulatory concerns | Negative | Bearish | 2 |
| 2 | Apple shares may see gains despite challenging... | AAPL | corporate factors | Positive | Bullish | 2 |
| 3 | Fed remains cautious about inflation while kee... | Fed | inflation | Neutral | Uncertain | 2 |
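
In the four rows above, each sentiment label lines up with one market implication (Positive with Bullish, Negative with Bearish, Neutral with Uncertain). The sketch below flags rows that break that pairing. The pairing is an assumption read off these four rows, not a documented rule of the pipeline, and `find_inconsistent` is a hypothetical helper.

```python
from typing import Dict, List

# Assumed pairing, taken from the four rows above; not a documented rule.
EXPECTED_IMPLICATION = {
    "Positive": "Bullish",
    "Negative": "Bearish",
    "Neutral": "Uncertain",
}


def find_inconsistent(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return rows whose market implication differs from the assumed pairing."""
    return [
        row
        for row in rows
        if EXPECTED_IMPLICATION.get(row["sentiment"]) != row["market_implication"]
    ]


summary_rows = [
    {"entity": "EUR", "sentiment": "Positive", "market_implication": "Bullish"},
    {"entity": "Bitcoin", "sentiment": "Negative", "market_implication": "Bearish"},
    {"entity": "AAPL", "sentiment": "Positive", "market_implication": "Bullish"},
    {"entity": "Fed", "sentiment": "Neutral", "market_implication": "Uncertain"},
]
print(find_inconsistent(summary_rows))  # prints [] for the four rows above
```

A flagged row is not necessarily an error, since hop 5 may legitimately diverge from hop 4, but it is a cheap way to pick cases worth reading in detail.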



🔍 Detailed Reasoning (First Case):
{
  "entity_reasoning": "The headline mentions the Euro (EUR) as the currency to benefit from the European Central Bank's (ECB) hawkish determination.",
  "aspect_reasoning": "The headline mentions the European Central Bank's (ECB) hawkish determination, indicating a potential shift in monetary policy. This could impact the value of the Euro and influence market sentiment towards the currency.",
  "cue_reasoning": "The use of 'pronounced hawkish determination' is a euphemism that suggests a strong and aggressive stance by the ECB, without explicitly stating it. This indirect language implies a positive outlook for the Euro without directly stating it.",
  "sentiment_reasoning": "The headline suggests that the Euro will benefit from the European Central Bank's (ECB) hawkish determination. The term 'pronounced hawkish determination' implies that the ECB is taking a strong stance towards tightening monetary policy, which is typically viewed positivel

## 6. Inspect Individual Hop Results

In [7]:
if pipeline is not None and "results" in locals() and results:
    # Show all hop results for first test case
    first_result = results[0]

    print("Individual Hop Results:")
    print("=" * 60)

    for hop_name, hop_result in first_result["all_hop_results"].items():
        print(f"\n{hop_name.upper().replace('_', ' ')}:")
        print(json.dumps(hop_result, indent=2))

Individual Hop Results:

ENTITY GROUNDING:
{
  "entities": [
    "EUR",
    "ECB"
  ],
  "primary_entity": "EUR",
  "confidence": "high",
  "reasoning": "The headline mentions the Euro (EUR) as the currency to benefit from the European Central Bank's (ECB) hawkish determination."
}

FINANCIAL ASPECT:
{
  "aspects": [
    "Monetary policy"
  ],
  "primary_aspect": "Monetary policy",
  "reasoning": "The headline mentions the European Central Bank's (ECB) hawkish determination, indicating a potential shift in monetary policy. This could impact the value of the Euro and influence market sentiment towards the currency."
}

IMPLICIT CUE:
{
  "cues": [
    "pronounced hawkish determination"
  ],
  "cue_types": [
    "euphemism"
  ],
  "has_implicit_language": true,
  "reasoning": "The use of 'pronounced hawkish determination' is a euphemism that suggests a strong and aggressive stance by the ECB, without explicitly stating it. This indirect language implies a positive outlook for the Euro wit

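The per-hop dictionaries above are dominated by their free-text `reasoning` fields. The sketch below shows one way to get a compact view by dropping that field. `compact_hops` is a hypothetical helper, and the lowercase hop keys (`entity_grounding`, `financial_aspect`) are inferred from the upper-cased headings in the output, so treat them as an assumption.

```python
from typing import Any, Dict


def compact_hops(all_hop_results: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Drop the free-text "reasoning" field from every hop result."""
    return {
        hop_name: {k: v for k, v in hop_result.items() if k != "reasoning"}
        for hop_name, hop_result in all_hop_results.items()
    }


# Shape copied from the output above; reasoning text shortened to "...".
example_hops = {
    "entity_grounding": {
        "entities": ["EUR", "ECB"],
        "primary_entity": "EUR",
        "confidence": "high",
        "reasoning": "...",
    },
    "financial_aspect": {
        "aspects": ["Monetary policy"],
        "primary_aspect": "Monetary policy",
        "reasoning": "...",
    },
}

for hop_name, fields in compact_hops(example_hops).items():
    print(f"{hop_name}: {fields}")
```
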
## 7. Usage Statistics

In [11]:
if pipeline is not None:
    stats = pipeline.get_usage_stats()

    print("LLM Usage Statistics:")
    print("=" * 60)
    print(f"Total API Calls: {stats['total_calls']}")
    print(f"Prompt Tokens: {stats['total_prompt_tokens']:,}")
    print(f"Completion Tokens: {stats['total_completion_tokens']:,}")
    print(f"Total Tokens: {stats['total_tokens']:,}")

    # Estimate cost with hard-coded GPT-3.5-turbo rates (USD per 1K tokens); update these if pricing changes
    input_cost_per_1k = 0.0015
    output_cost_per_1k = 0.002

    estimated_cost = (stats["total_prompt_tokens"] / 1000 * input_cost_per_1k) + (
        stats["total_completion_tokens"] / 1000 * output_cost_per_1k
    )

    print(f"\nEstimated Cost: ${estimated_cost:.4f}")
    print("   (Based on GPT-3.5-turbo pricing)")

LLM Usage Statistics:
Total API Calls: 20
Prompt Tokens: 3,887
Completion Tokens: 1,574
Total Tokens: 5,461

Estimated Cost: $0.0090
   (Based on GPT-3.5-turbo pricing)
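
The estimate above can be reproduced by hand from the reported token counts: 3,887 / 1000 × 0.0015 + 1,574 / 1000 × 0.002 = 0.0058305 + 0.003148 = 0.0089785, which rounds to $0.0090. The sketch below wraps the same arithmetic in a hypothetical `estimate_cost` helper and breaks the total down per headline and per call. The 20 calls for 4 headlines would be consistent with one call per hop, although the notebook does not state this explicitly.

```python
def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_cost_per_1k: float = 0.0015,
    output_cost_per_1k: float = 0.002,
) -> float:
    """Estimate USD cost from token counts and per-1K-token rates."""
    return (
        prompt_tokens / 1000 * input_cost_per_1k
        + completion_tokens / 1000 * output_cost_per_1k
    )


# Token counts taken from the usage statistics printed above.
total = estimate_cost(3887, 1574)
print(f"Total: ${total:.4f}")
print(f"Per headline (4 headlines): ${total / 4:.4f}")
print(f"Per API call (20 calls): ${total / 20:.5f}")
```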


## 8. Test with Real Data

Test the pipeline with actual data from the dataset.

In [15]:
# Load actual data from the dataset
from src.utils.data_loader import load_all_dataframes

base_path = Path().resolve().parent
data = load_all_dataframes(base_path)

if "ground_truth" in data:
    df_gt = data["ground_truth"]

    # Sample a few headlines
    sample_size = 3
    sample_df = df_gt.sample(n=min(sample_size, len(df_gt)), random_state=42)

    print(f"Testing with {len(sample_df)} real headlines from dataset:\n")

    if pipeline is not None:
        real_results = []

        for idx, row in sample_df.iterrows():
            text = row.get("title", row.get("text", ""))
            ticker = row.get("ticker")
            true_sentiment = row.get("true_sentiment", "Unknown")

            print(f"\n{'=' * 60}")
            print(f"Headline: {text}")
            print(f"Ticker: {ticker}")
            print(f"True Sentiment: {true_sentiment}")
            print(f"{'=' * 60}")

            try:
                context = pipeline.run(text=text, ticker=ticker)
                result = pipeline.get_final_result(context)

                print("\n📊 Pipeline Results:")
                print(f"  Predicted Sentiment: {result['sentiment']}")
                print(f"  Market Implication: {result['market_implication']}")

                # Compare with ground truth. Reset the label on every iteration
                # so a value left over from a previous row is never reused, and
                # test against None so a numeric label of 0 is still compared.
                true_sentiment_text = true_sentiment
                if true_sentiment is not None:
                    from src.utils.sentiment_encoding import to_text

                    true_sentiment_text = (
                        to_text(true_sentiment)
                        if isinstance(true_sentiment, (int, float))
                        else str(true_sentiment)
                    )
                    match = "✓" if result["sentiment"] == true_sentiment_text else "✗"
                    print(f"  {match} Ground Truth: {true_sentiment_text}")

                real_results.append(
                    {
                        "text": text,
                        "ticker": ticker,
                        "true_sentiment": true_sentiment_text,
                        "predicted_sentiment": result["sentiment"],
                        "market_implication": result["market_implication"],
                    }
                )

            except Exception as e:
                print(f"❌ Error: {e}")

        if real_results:
            print("\n\n📋 Summary of Real Data Tests:")
            df_real = pd.DataFrame(real_results)
            display(df_real)
    else:
        print("⚠ Pipeline not initialized. Set OPENAI_API_KEY to test.")
else:
    print("⚠ Ground truth data not found.")

Loaded ground truth: 2291 rows
Loaded single article predictions: 2291 rows
Loaded all-day articles: 293 rows
Testing with 3 real headlines from dataset:


Headline: USDJPY Next on the upside comes 13790 – UOB
Ticker: USDJPY
True Sentiment: Positive


KeyboardInterrupt: 
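
The run above was interrupted before any predictions were collected, so no accuracy figure is available. Once `real_results` is populated, the comparison against ground truth could be summarized along the lines of the sketch below. `sentiment_accuracy` is a hypothetical helper, and the rows are made-up placeholders for illustration only, not pipeline output.

```python
from typing import Dict, List


def sentiment_accuracy(rows: List[Dict[str, str]]) -> float:
    """Fraction of rows where predicted and true sentiment agree."""
    if not rows:
        raise ValueError("no results to score")
    hits = sum(
        row["predicted_sentiment"].strip().lower()
        == row["true_sentiment"].strip().lower()
        for row in rows
    )
    return hits / len(rows)


# Hypothetical rows for illustration only; not output from the pipeline.
example_rows = [
    {"true_sentiment": "Positive", "predicted_sentiment": "Positive"},
    {"true_sentiment": "Negative", "predicted_sentiment": "Neutral"},
    {"true_sentiment": "Neutral", "predicted_sentiment": "neutral"},
]
print(f"Accuracy: {sentiment_accuracy(example_rows):.2f}")
```

The comparison is case-insensitive here, which is a design choice of the sketch. The notebook cell itself compares the labels exactly.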

## 9. Pipeline Architecture Overview

Visualize how the pipeline works:

In [12]:
print("""
5-Hop Reasoning Pipeline Architecture
=====================================

Input: Financial News Headline
  │
  ├─→ [Hop 1] Entity Grounding
  │     └─→ Identifies: Tickers, Currencies, Assets
  │
  ├─→ [Hop 2] Financial Aspect Identification
  │     └─→ Identifies: Inflation, Rates, Growth, Risk, etc.
  │
  ├─→ [Hop 3] Implicit Cue Detection
  │     └─→ Detects: Hedging, Euphemisms, Indirect Language
  │
  ├─→ [Hop 4] Implicit Sentiment Inference
  │     └─→ Classifies: Positive / Negative / Neutral
  │
  └─→ [Hop 5] Market Implication Inference
        └─→ Determines: Bullish / Bearish / Uncertain

Output: Complete reasoning with all intermediate steps

Context Passing:
  Each hop receives context from previous hops and adds its own results.
  This allows later hops to use information from earlier hops.
""")



