<a href="https://colab.research.google.com/github/Kvnhooman/AAI520_Final-Project_Group5/blob/main/Financial_Research_AI_Assistant_v_10_15_2400.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


## Group 5

**AAI 520 - Final Project**

Authors: Kevin, Spencer, Tommy

Date: 10/20/25




___
## Project Outline


This is an LLM powered research assistant that finds key financial details on publicly traded companies. The agent is able to plan and execute key tasks leveraging several tools in order to generate a final output.


**Core Agent Functions**


* Planning: The ability to plan each step.

* Tool Use: Integrates APIs (Yahoo Finance, SEC EDGAR, News APIs) and Specialists


* Self-Reflection: Evaluates output quality and iterates


* Memory/Learning: Maintains memory across analysis runs


**Workflow Patterns**
* Prompt Chaining: News ingestion → preprocessing → classification → extraction → summarization


* Routing: Directs content to specialist analyzers (earnings, news, market)


* Evaluator-Optimizer: Generates analysis → evaluates quality → refines using feedback

# Setup

In [1]:
%%capture --no-stderr
%pip install --quiet -U \
    langgraph \
    langchain_openai \
    langgraph_sdk \
    langgraph-prebuilt \
    langchain-perplexity \
    huggingface_hub \
    openai \
    yfinance \
    langchain-google-genai \
    langchain-community \
    transformers \
    datasets \
    bitsandbytes accelerate sentence-transformers \
    faiss-cpu \
    ddgs\
    langchain-huggingface

In [None]:
%import langchain-huggingface

In [3]:
#Import key libraries
import os
import requests
import re
import textwrap

from bs4 import BeautifulSoup

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.chat_models import ChatPerplexity #updated
from langchain_openai import OpenAI, ChatOpenAI

#from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
#from langchain.prompts.chat import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import  AgentExecutor, initialize_agent, Tool, AgentType
from langchain.chains import LLMChain

##from langchain.llms import OpenAI
#from wikipedia import summary
from langgraph.prebuilt import create_react_agent

#Correct import path for YahooFinanceAPI
import yfinance as yf
from langchain_community.tools.yahoo_finance_news import YahooFinanceNewsTool
#from langchain.prompts import PromptTemplate

from transformers import pipeline

#Memory
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Any, Union



In [6]:
from google.colab import userdata

#API Keys from ColAb Secrets
gemini_key = userdata.get('GEMINI')
hf_key = userdata.get('HF_TOKEN')
openai_key = userdata.get('OPENAI')
fin_news = userdata.get('FIN_API_KEY')
tavily_key = userdata.get('TAVILY_API_KEY')
perplexity_key = userdata.get('PERPLEXITY_API_KEY')

In [7]:
from huggingface_hub import login #Connect to HF using API key accessed above
login(token=hf_key)

import warnings

#Suppress length warnings
warnings.filterwarnings("ignore", category=UserWarning, module="huggingface_hub.utils._auth")

llm_hf = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
    device= -1
    )

Device set to use cpu


In [11]:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from langchain_huggingface import HuggingFacePipeline

# Load the model and tokenizer (works offline after first download)
model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Create a local Hugging Face pipeline
hf_pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    temperature=0.7,
    device=-1  # CPU mode; use device_map="auto" for GPU acceleration
)

# Wrap pipeline for LangChain / LangGraph usage
llm_local = HuggingFacePipeline(pipeline=hf_pipe)

Device set to use cpu


In [12]:
# Setup Gemini to use in Agents
llm_gemini = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=gemini_key
)

In [13]:
llm_openai = ChatOpenAI(
    model="gpt-3.5-turbo",  # "gpt-3.5-turbo", "gpt-4-mini", "gpt-5-mini" ...
    openai_api_key=openai_key,  # Your OpenAI API key
    temperature=0.0             # (optional) set as needed
)

In [19]:
llm_perplexity = ChatPerplexity(
    model="sonar-pro", #Other model options
    pplx_api_key = perplexity_key,
    temperature=0.1
)

In [20]:
#Test OpenAI API
#response = llm_openai.invoke("Explain how neural networks work in 10 words or less.")
#response = llm_gemini.invoke("Explain how neural networks work in 10 words or less.")
response = llm_perplexity.invoke("Explain how neural networks work in 10 words or less.")
print(response.content)

Neural networks process data through interconnected nodes.


## User Input Context Extractor Agent

In [21]:
#-----Simply outlining this here, we can build an ingestion agent or function if we have time
user_input = "COST"

In [22]:
#Input extraction output
context = user_input
print(context)

COST


## Tool Agent
Integrates APIs (Yahoo Finance, SEC EDGAR, News APIs)

List of Tools:
- Web Search
- API call - Yahoo Finance
- to be updated...

### Tool Functions


In [23]:
#Yahoo Finance Tool
import yfinance as yf

def get_stock_summary(ticker: str) -> str:

    """Gets detailed financial info for a given stock symbol."""

    try:
        stock = yf.Ticker(ticker)
        data = stock.info

        return (
            f"Company: {data.get('longName', 'N/A')} ({ticker})\n"
            f"Sector: {data.get('sector', 'N/A')}\n"
            f"Price: ${data.get('regularMarketPrice', 'N/A')}\n"
            f"Market Cap: {data.get('marketCap', 'N/A')}\n"
            f"P/E Ratio: {data.get('trailingPE', 'N/A')}\n"
            f"52-Week High: {data.get('fiftyTwoWeekHigh', 'N/A')}\n"
            f"52-Week Low: {data.get('fiftyTwoWeekLow', 'N/A')}\n"
            f"Dividend Yield: {data.get('dividendYield', 'N/A')}\n"
            f"Beta: {data.get('beta', 'N/A')}\n"
            f"Revenue: {data.get('totalRevenue', 'N/A')}\n"
            f"Profit Margin: {data.get('profitMargins', 'N/A')}\n"
        )
    except Exception as e:
        return f"Error retrieving data for {ticker}: {e}"

yahoo_api_tool = Tool(
    name="YahooFinanceAPI",
    func=get_stock_summary,
    description="Retrieves detailed Yahoo Finance company data such as price, valuation metrics, and financial ratios."
)

In [24]:
#For debugging
print(yahoo_api_tool("AAPL"))

  print(yahoo_api_tool("AAPL"))


Company: Apple Inc. (AAPL)
Sector: Technology
Price: $249.34
Market Cap: 3700302807040
P/E Ratio: 37.893616
52-Week High: 260.1
52-Week Low: 169.21
Dividend Yield: 0.42
Beta: 1.094
Revenue: 408624988160
Profit Margin: 0.24295999



In [25]:
import requests
from bs4 import BeautifulSoup
import xml.etree.ElementTree as ET

def summarize_sec_document(doc_url: str) -> str:
    headers = {"User-Agent": "tpoole@sandiego.edu"}
    resp = requests.get(doc_url, headers=headers)
    if resp.status_code != 200:
        return f"Could not access doc: {doc_url}"
    doc_text = resp.text

    # Try XML parsing first (most filings)
    try:
        root = ET.fromstring(doc_text)
        values = []
        for item in root.iter():
            if item.tag.lower() in ['transactionamount', 'transactioncode', 'issuername']:
                values.append(f"{item.tag}: {item.text}")
        return " | ".join(values) if values else "No key XML findings found."
    except ET.ParseError:
        # Fallback: try HTML/text parsing
        soup = BeautifulSoup(doc_text, 'html.parser')
        summary = soup.get_text(separator=' ', strip=True)
        return summary[:400]  # Return first 400 chars as summary

def get_sec_filings(ticker: str) -> str:
    cik_url = "https://www.sec.gov/files/company_tickers.json"
    headers = {"User-Agent": "your_email@example.com"}
    cik_resp = requests.get(cik_url, headers=headers)
    if cik_resp.status_code != 200:
        return f"SEC.gov rejected request: {cik_resp.status_code}"

    try:
        cik_data = cik_resp.json()
        cik_lookup = {item['ticker']: item['cik_str'] for item in cik_data.values()}
        cik = cik_lookup.get(ticker.upper())
    except Exception as e:
        return f"Error parsing CIK data: {e}"

    if not cik:
        return f"CIK for {ticker} not found."

    filings_url = f"https://data.sec.gov/submissions/CIK{cik:0>10}.json"
    filings_resp = requests.get(filings_url, headers=headers)
    try:
        data = filings_resp.json() if filings_resp.status_code == 200 else {}
        filings = data.get('filings', {}).get('recent', {})
        if not filings: return f"No filings found for {ticker}."
        forms = filings.get('form', [])[:3]
        filing_dates = filings.get('filingDate', [])[:3]
        primary_docs = filings.get('primaryDocument', [])[:3]
        accessions = filings.get('accessionNumber', [])[:3]

        result = []
        for form, date, doc, acc in zip(forms, filing_dates, primary_docs, accessions):
            # Build the SEC document URL:
            doc_url = f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{acc.replace('-', '')}/{doc}"
            summary = summarize_sec_document(doc_url)
            result.append(f"{form} on {date}:\n{doc_url}\nSummary: {summary}\n")
        return "Latest filings with summaries:\n" + "\n".join(result)
    except Exception as e:
        return f"Error loading filings for {ticker}: {e}"

sec_api_tool = Tool(
    name="SECEDGARAPI",
    func=get_sec_filings,
    description="Retrieves and summarizes key findings from recent SEC filings by ticker"
)


In [26]:
#For debugging
print(sec_api_tool("AAPL"))

Latest filings with summaries:
4 on 2025-10-03:
https://www.sec.gov/Archives/edgar/data/320193/000121415625000011/xslF345X05/wk-form4_1759530830.xml
Summary: SEC FORM 
            4 SEC Form 4 FORM 4 UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549 STATEMENT OF CHANGES IN BENEFICIAL OWNERSHIP Filed pursuant to Section 16(a) of the Securities Exchange Act of 1934 or Section 30(h) of the Investment Company Act of 1940 OMB APPROVAL OMB Number: 3235-0287 Estimated average burden hours per response: 0.5 Â Â Check this box if no longer

4 on 2025-10-03:
https://www.sec.gov/Archives/edgar/data/320193/000176709425000009/xslF345X05/wk-form4_1759530744.xml
Summary: SEC FORM 
            4 SEC Form 4 FORM 4 UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549 STATEMENT OF CHANGES IN BENEFICIAL OWNERSHIP Filed pursuant to Section 16(a) of the Securities Exchange Act of 1934 or Section 30(h) of the Investment Company Act of 1940 OMB APPROVAL OMB Number: 323

In [27]:
from langchain_community.document_loaders import WebBaseLoader
from langchain.tools import Tool

def visit_webpage(url: str) -> str:
    """Fetches and cleans webpage or filing content for analysis."""
    try:
        loader = WebBaseLoader(web_path=url)
        documents = loader.load()
        # return the first document’s text content
        return documents[0].page_content[:5000]  # limit length for LLM input
    except Exception as e:
        return f"Error fetching {url}: {e}"

visit_webpage_tool = Tool(
    name="VisitWebpage",
    func=visit_webpage,
    description="Visits a webpage or SEC filing URLs, reads HTML or XML content, and returns the main text for summarization. Feed me your URLs to get more information."
)

In [28]:
#For debugging
print(visit_webpage_tool("https://www.sec.gov/Archives/edgar/data/320193/000207847625000005/xslF345X05/wk-form4_1759530690.xml"))




SEC.gov | Your Request Originates from an Undeclared Automated Tool

html {height: 100%}
body {height: 100%; margin:0; padding:0;}
#header {background-color:#003968; color:#fff; padding:15px 20px 10px 20px;font-family:Arial, Helvetica, sans-serif; font-size:20px; border-bottom:solid 5px #000;}
#footer {background-color:#003968; color:#fff; padding:15px 20px;font-family:Arial, Helvetica, sans-serif; font-size:20px;}
#content {max-width:650px;margin:60px auto; padding:0 20px 100px 20px; background-image:url(seal_bw.png);background-repeat:no-repeat;background-position:50% 100%;}
h1 {font-family:Georgia, Times, serif; font-size:20px;}
h2 {text-align:center; font-family:Georgia, Times, serif; font-size:20px; width:100%; border-bottom:solid #999 1px;padding-bottom:10px; margin-bottom:20px;}
h3 {font-family:Georgia, Times, serif; font-size:16px; margin:25px 0 0 0;}
p {font-family:Verdana, Geneva, sans-serif;font-size:14px;line-height:1.3;}
.grey_box {background-color:#eee; padding:5px 40px

In [29]:
#Finanical news lookup
def get_fin_news(symbol: str) -> str:
    api_key = fin_news
    url = "https://newsapi.org/v2/everything"
    params = {
        "q": symbol,
        "apiKey": api_key,
        "sortBy": "publishedAt",
        "language": "en"
    }
    response = requests.get(url, params=params)
    if response.status_code != 200:
        return f"API error {response.status_code}: {response.text[:200]}"
    data = response.json()
    articles = data.get("articles", [])
    if not articles:
        return f"No news found for {symbol}. Full message: {data.get('message', '')}"
    return "\n".join([a["description"] or a["title"] for a in articles[:3]])



news_api_tool = Tool(
    name="NewsAPI",
    func=get_fin_news,  #Assumes you've defined this class
    description="Finds recent financial news on stock symbol"
)

In [30]:
#For debugging
print(get_fin_news("AAPL"))

BelkinがMagSafeとQi2ワイヤレス充電に対応したiPhone Air/17/Pro用ケース「Belkin SheerForce iPhoneクリアケース」を発売しています。詳細は以下から。 　台湾Foxco […]
The post Belkin、MagSafe/Qi2 25Wワイヤレス充電に対応したiPhone Air/17/Pro用の「Belkin SheerForceクリア/カラーケース」を発売。 first appeared on AAPL Ch..
investors weighed hopes for rate cuts and strong results from Wall Street banks as the US-China trade dispute rolled on.
investors weighed hopes for rate cuts and strong results from Wall Street banks as the US-China trade dispute rolled on.


In [31]:
import requests

def get_fin_news_tavily(symbol: str) -> str:
    """
    Searches for and extracts detailed financial news impacting the given stock symbol.
    Performs a deep extraction with Tavily's advanced read (HTML parsing and markdown conversion).
    """
    query = f"News on {symbol}"
    api_url_search = "https://api.tavily.com/search"
    api_url_extract = "https://api.tavily.com/extract"

    # Step 1: Search for related news articles
    payload_search = {
        "query": query,
        "api_key": tavily_key,
        "max_results": 5,
        "search_depth": "advanced",          # more contextually relevant pages
        "include_raw_content": False
    }

    search_response = requests.post(api_url_search, json=payload_search)
    if search_response.status_code != 200:
        return f"Tavily search error: {search_response.status_code} {search_response.text}"

    results = search_response.json().get("results", [])
    links = [r.get("link") for r in results if r.get("link")]

    # Step 2: Extract full article contents using Tavily Extract API
    if not links:
        return "No articles found for this stock symbol."

    payload_extract = {
        "urls": links,
        "api_key": tavily_key,
        "extract_depth": "advanced",         # deeper DOM parsing, better for long articles
        "include_images": False,
        "format": "markdown"                 # returns Markdown-formatted summaries
    }

    extract_response = requests.post(api_url_extract, json=payload_extract)
    if extract_response.status_code != 200:
        return f"Tavily extract error: {extract_response.status_code} {extract_response.text}"

    extracted_articles = extract_response.json().get("results", [])
    if not extracted_articles:
        return "No extractable content retrieved."

    # Step 3: Structure extracted results for downstream summarization or agent reflection
    summaries = []
    for article in extracted_articles:
        link = article.get("url", "")
        content = article.get("content", "No text content")
        summaries

tavily_news_tool = Tool(
    name="TavilyNews",
    func=get_fin_news_tavily,
    description="Searches for and extracts detailed financial news impacting the given stock symbol."
)


In [32]:
#For debugging
print(get_fin_news_tavily(symbol="AAPL"))

No articles found for this stock symbol.


In [33]:
from transformers import pipeline
from langchain.tools import Tool


class HuggingFaceSentimentTool:
    """
    A lightweight sentiment analysis tool that uses a pretrained Hugging Face model.

    This class wraps the Hugging Face `pipeline("sentiment-analysis")` function
    and exposes a simple interface for evaluating sentiment in text.

    Attributes
    ----------
    classifier : transformers.pipelines.Pipeline
        The Hugging Face sentiment-analysis pipeline used for inference.
    """

    def __init__(self):
        """Initializes the Hugging Face sentiment analysis pipeline."""
        self.classifier = pipeline("sentiment-analysis")

    def analyze(self, text: str) -> int:
        """
        Performs sentiment analysis on input text and returns a numeric sentiment label.

        Parameters
        ----------
        text : str
            The input text string to analyze.

        Returns
        -------
        int
            Returns +1 for positive sentiment and -1 for negative sentiment.
        """
        result = self.classifier(text)[0]
        return 1 if result["label"] == "POSITIVE" else -1


sentiment_api_tool = Tool(
    name="HuggingFaceSentiment",
    func=HuggingFaceSentimentTool().analyze,
    description=(
        "Analyzes sentiment of input text using a pretrained Hugging Face transformer "
        "and returns +1 for positive sentiment or -1 for negative."
    )
)


No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


In [34]:
def earnings_analyzer(symbol: str, context: str = "", yahoo_tool=None, sec_tool=None) -> str:
    """
    Analyzes earnings info given a stock symbol, with string-based context/memory.
    - symbol: stock ticker as string
    - context: concatenated text from tools agent's memory (e.g., research_notes or scratchpad)
    - yahoo_tool, sec_tool: callable tools to retrieve more info if not found in context
    Returns: summary string of earnings info
    """
    #Check context/memory for earnings info
    symbol_upper = symbol.upper()
    relevant_snippet = ""
    for line in context.splitlines():
        # crude match: look for earnings and the symbol in context; refine for your structure
        if symbol_upper in line and "earnings" in line.lower():
            relevant_snippet = line.strip()
            break

    #If not found, query tools for latest info
    if not relevant_snippet:
        if yahoo_tool:
            relevant_snippet = yahoo_tool.func(symbol)  # could return a string summary
        elif sec_tool:
            relevant_snippet = sec_tool.func(symbol)
        else:
            relevant_snippet = f"No earnings info found for {symbol}."

    #You can further refine the summary or do LLM chaining as needed
    return f"Earnings analysis for {symbol}:\n{relevant_snippet}"


earnings_specialist = Tool(
    name="EarningsAnalyzer",
    func=earnings_analyzer,
    description="Analyzes earnings context and info for a stock symbol, using memory/context and sources."
)


In [35]:
#For debugging
print(earnings_analyzer("META"))

Earnings analysis for META:
No earnings info found for META.


In [37]:
#Add Duck Duck Go
from langchain.tools import DuckDuckGoSearchRun
search = DuckDuckGoSearchRun()

duck_duck_go_search = Tool(
    name="DuckDuckGoSearch",
    func=search.invoke,
    description="Useful for when you need to answer questions about current events. You should ask targeted questions."
)

In [38]:
duck_duck_go_search('APPL')

"1 day ago - Apple Inc. is an American multinational technology company headquartered in Cupertino, California, in Silicon Valley, best known for its consumer electronics, software and online services. Founded in 1976 as Apple Computer Company by Steve Jobs, Steve Wozniak and Ronald Wayne, the company was ... 2 days ago - An apple is the round, edible fruit of an apple tree (Malus spp.). Fruit trees of the orchard or domestic apple (Malus domestica), the most widely grown in the genus, are cultivated worldwide. The tree originated in Central Asia, where its wild ancestor, Malus sieversii, is still found. Table of ContentsHow to spell “ appl ”How many syllables in “ appl ” Membre Du Comité Directeur De L' appl (association Des Pharmaciens De La Province De Liège). View and download Benjamin Appl Instagram tagged photos and videos without login.Benjamin Appl . @benjaminapplbaritone."

In [39]:
import yfinance as yf
import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime, timedelta
from langchain.tools import Tool

#Define the plotting function
def plot_stock_52_week(symbol: str) -> str:
    """
    Plots the last 52 weeks (1 year) of stock closing prices for the given symbol.
    Returns the filepath to the saved chart image.
    """
    try:
        end_date = datetime.today()
        start_date = end_date - timedelta(weeks=52)

        # Download 1-year historical stock data
        data = yf.download(symbol, start=start_date, end=end_date)

        if data.empty:
            return f"No data found for {symbol}. Please check the symbol or try again later."

        # Plot closing price over time
        plt.figure(figsize=(10, 5))
        plt.plot(data.index, data['Close'], label='Closing Price', color='dodgerblue', linewidth=2)
        plt.title(f"{symbol} Stock Price (Last 52 Weeks)", fontsize=14, pad=15)
        plt.xlabel("Date", fontsize=12)
        plt.ylabel("Price (USD)", fontsize=12)
        plt.grid(True, linestyle='--', alpha=0.6)
        plt.legend()

        # Save plot to file
        filename = f"{symbol}_52week_chart.png"
        plt.savefig(filename, bbox_inches='tight')
        plt.close()

        return f"Stock chart saved as {filename}"
    except Exception as e:
        return f"Error generating chart for {symbol}: {e}"

#Wrap as a LangGraph Tool
stock_chart_tool = Tool(
    name="StockChart52Weeks",
    func=plot_stock_52_week,
    description="Plots and saves a 52-week stock price chart for the given symbol using Yahoo Finance data."
)


In [40]:
from datetime import datetime
from langchain.tools import Tool

def get_timestamp(_: str = "") -> str:
    """
    Returns the current date and time in ISO 8601 format.

    This tool can be used by the agent to log events, timestamp analyses,
    or align data with real-world time context. It supports chaining with
    other workflow nodes that require temporal awareness.
    """
    return datetime.now().isoformat()

timestamp_tool = Tool(
    name="TimestampTool",
    func=get_timestamp,
    description="Returns the current date and time in ISO 8601 format, useful for logging or aligning time-based analyses."
)

In [41]:
#Create the tools list
tools = [
    yahoo_api_tool,
    sec_api_tool,
    news_api_tool,
    tavily_news_tool,
    sentiment_api_tool,
    earnings_specialist,
    duck_duck_go_search,
    stock_chart_tool,
    HuggingFaceSentimentTool,
    visit_webpage_tool,
    timestamp_tool
]

### Tools Agent Function

In [68]:
#Tools Agent Prompt & Function

ANALYSIS_PROMPT = """
System Role: You are a detailed Financial Research Agent designed to analyze companies, securities, and markets with a structured workflow.

**Your goal** For {symbol}, provide a ~50 word investment summary, based on extensive reserach.

Include {context_memory} to improve your answers.

NEVER INCLUDE YOUR THOUGHT PROCESS IN THE RESULTS.


**Core Functions**

1. Planning:
Develop and print out a plan for each given stock symbol. Include a plan for a multi-step research process that defines what data to collect (e.g., news, market data, filings), what tools to use, and how to verify results.

Example steps: “Collect company overview → Analyze financials → Cross-check recent filings → Summarize.”

2. Tool Use:
Use all of your tools by default to hit APIs and data sources dynamically, such as Yahoo Finance, SEC EDGAR, news APIs, and any specialist analyzers. Choose the right tools and calls autonomously based on the goal of each stage. Return structured findings.

TOOL USAGE REQUIREMENTS:
- Refer to the memory for useful data.
- Use at least 4 different tools for each analysis
- Cross-reference information from multiple sources
- If a tool fails, try alternative tools for the same data
- Always explain your reasoning between tool calls

3. Self-Reflection:
Evaluate your output’s completeness, correctness, and coherence at each stage. If key data (e.g., P/E, recent filings, or market indicators) appears missing or uncertain, perform iterative refinement using another reasoning pass or additional tool calls.

4. Memory/Learning:
Retain useful context from prior analyses (observations, missing data, errors, patterns in company performance). Use this "brief memory" to improve future research quality and efficiency.


**Workflow Logic**

Follow a defined sequence for every research request:
* Ingest → Preprocess → Classify → Extract → Summarize

1. Ingest: Gather relevant news, filings, and metrics. Use the webpage visit tool on any URLS.

2. Preprocess: Clean, standardize, and interpret results from multiple tools.

3. Classify: Route data to the correct analysis path (earnings, news, or market).

4. Extract: Pull core signals, events, or values.

5. Summarize: Synthesize insights in your final output.

6. Routing:
  * Choose appropriate specialist analyzers according to data type:

  * NewsAnalyzer for press, sentiment, or event narratives.

  * MarketAnalyzer for indices, performance data, and macro context.

7. Evaluation: Review reasoning and completeness, recall tools again if needed.


**Final Report**
At the end of each session, create a final summary report that contains:

Updated output from each tool as well as the date and time of the inquiry.

Do not include any planning information in the report.

Include citations such as [1], along with a hyperlink at the bottom of the page for further review.

Respond in clear, organized Markdown, including headings, bullet points, and labeled data sources for readability.

**STOPPING RULE:** Once you have basic financials, recent news, and market context, conclude your analysis. Do not seek additional tools or data.

Be thorough - this analysis will inform major investment decisions.

"""

#Define the agent
tools_agent = create_react_agent(
    model=llm_openai,
    tools=tools,
    prompt=ANALYSIS_PROMPT,
    debug=True
)

In [87]:
# Create your formatted prompt
formatted_prompt = ANALYSIS_PROMPT.format(
    symbol=user_input,              # "TGT"
    context_memory=""               # or your stored memory string
)

# Build / run your agent call
tools_output = tools_agent.invoke({
    "messages": [
        {"role": "user", "content": formatted_prompt}
    ]
})

[1m[values][0m {'messages': [HumanMessage(content='\nSystem Role: You are a detailed Financial Research Agent designed to analyze companies, securities, and markets with a structured workflow.\n\n**Your goal** For TGT, provide a ~50 word investment summary, based on extensive reserach.\n\nInclude  to improve your answers.\n\nNEVER INCLUDE YOUR THOUGHT PROCESS IN THE RESULTS.\n\n\n**Core Functions**\n\n1. Planning:\nDevelop and print out a plan for each given stock symbol. Include a plan for a multi-step research process that defines what data to collect (e.g., news, market data, filings), what tools to use, and how to verify results.\n\nExample steps: “Collect company overview → Analyze financials → Cross-check recent filings → Summarize.”\n\n2. Tool Use:\nUse all of your tools by default to hit APIs and data sources dynamically, such as Yahoo Finance, SEC EDGAR, news APIs, and any specialist analyzers. Choose the right tools and calls autonomously based on the goal of each stage. Re

  data = yf.download(symbol, start=start_date, end=end_date)
[*********************100%***********************]  1 of 1 completed

[1m[updates][0m {'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_9OsP7THKmlbWOpyprrFBMlSh', 'function': {'arguments': '{"__arg1": "TGT"}', 'name': 'YahooFinanceAPI'}, 'type': 'function'}, {'id': 'call_WgcFrBq7oET6cGpcpdMQPw0D', 'function': {'arguments': '{"__arg1": "TGT"}', 'name': 'SECEDGARAPI'}, 'type': 'function'}, {'id': 'call_G8fD2oWA8cUnCrtd8HkPGMnu', 'function': {'arguments': '{"__arg1": "TGT"}', 'name': 'NewsAPI'}, 'type': 'function'}, {'id': 'call_KtCsIjyc7fDl4qQC1zyZuc2v', 'function': {'arguments': '{"__arg1": "TGT"}', 'name': 'StockChart52Weeks'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 93, 'prompt_tokens': 1869, 'total_tokens': 1962, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 's




[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Walmart is the latest company to announce a partnership with OpenAI.\nBest Buy could quietly emerge as the big retail winner if U.S.–China tariff relief holds.\nWalmart is the latest company to announce a partnership with OpenAI.', name='NewsAPI', tool_call_id='call_G8fD2oWA8cUnCrtd8HkPGMnu')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Company: Target Corporation (TGT)\nSector: Consumer Defensive\nPrice: $90.91\nMarket Cap: 41309425664\nP/E Ratio: 10.5955715\n52-Week High: 158.97\n52-Week Low: 85.36\nDividend Yield: 5.02\nBeta: 1.177\nRevenue: 105640001536\nProfit Margin: 0.03718\n', name='YahooFinanceAPI', tool_call_id='call_9OsP7THKmlbWOpyprrFBMlSh')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Stock chart saved as TGT_52week_chart.png', name='StockChart52Weeks', tool_call_id='call_KtCsIjyc7fDl4qQC1zyZuc2v')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Latest

In [88]:
from IPython.display import Markdown, display

def clean_answer (text):
    msg = text["messages"][-1]
    return msg.content

In [89]:
display(Markdown(clean_answer(tools_output)))

## Investment Summary for Target Corporation (TGT)

### Company Overview:
- **Sector:** Consumer Defensive
- **Price:** $90.91
- **Market Cap:** $41,309,425,664
- **P/E Ratio:** 10.60
- **Dividend Yield:** 5.02%
- **Beta:** 1.18
- **Revenue:** $105,640,001,536
- **Profit Margin:** 3.72%

### Recent Filings:
- **8-K on 2025-10-09:** [Link to Filing](https://www.sec.gov/Archives/edgar/data/27419/000002741925000121/tgt-20251009.htm)
- **4 on 2025-10-02:** [Link to Filing](https://www.sec.gov/Archives/edgar/data/27419/000194181625000004/xslF345X05/wk-form4_1759437603.xml)
- **10-Q on 2025-08-29:** [Link to Filing](https://www.sec.gov/Archives/edgar/data/27419/000002741925000118/tgt-20250802.htm)

### Recent News:
- Walmart partners with OpenAI.
- Best Buy could benefit from U.S.–China tariff relief.
- Walmart announces a partnership with OpenAI.

### 52-Week Stock Price Chart:
- Stock chart saved as TGT_52week_chart.png

[1] Data retrieved from Yahoo Finance, SEC EDGAR, News API, and StockChart52Weeks on the current date and time.

## Self Reflection & Evaluation Agent
Evaluates output quality

In [92]:
#Define Self Evaluation agent

EVAL_PROMPT = """
You are an expert evaluator. Your primary job is to give feedback on the analysis below, NOT to overwrite or revise it.

Instructions:

- Always display the full analysis/summary input *exactly as received* at the start of your answer, clearly labeled.
- Provide your commentary (improvement, completeness, feedback) **separately after the full input**.
- If human feedback is supplied, include your response to it at the end **without changing the original summary**.

FORMAT STRICTLY LIKE THIS:
---
Original Analysis:
{input}

Evaluator Commentary:
[Your bullet points: Completeness, Succinctness, Accuracy, Clarity, Human Feedback summary, Suggestions, etc.]

---

Never rewrite or summarize the original analysis. Only provide clear, constructive evaluator commentary after reproducing the input in its original form.

--- Human Feedback ---
{human_feedback}


"""

#Define agent
evaluator_agent = create_react_agent(
    model=llm_gemini,
    tools = [],
    prompt=EVAL_PROMPT,
    debug=True
)

## Optimization Loop

In [107]:
#This function runs the agents in an optimization kind of loop, to iterate based on feedback
def optimization_loop(tools_agent,
                      evaluator_agent,
                      user_input,
                      session_memory,
                      PRINT_TOOL_MSGS=True
                     ):
    #Step 1: Run tools agent
    print("Conducting research...")
    tools_output = tools_agent.invoke(user_input)

    # clean print out the tool messages
    if PRINT_TOOL_MSGS:
        for msg in tools_output['messages']:
            if msg.content:
                print(f'{msg.content} |  tool {msg.name}\n')

    final_output = list(tools_output.values())[-1][-1].content
    print("\n--- Analysis Summary ---")
    print(final_output)

    #Step 2: Ask user for feedback
    human_feedback = input("\nPlease enter your feedback (areas to improve, missing info, corrections):\n")

    # grab memory if there is one.
    stock_symbol = user_input
    memory_context = query_memory(stock_symbol, session_memory)


    evaluator_input = f"""
                        {final_output}

                        --- Human Feedback ---
                        {human_feedback if human_feedback else "no human feedback provided"}

                        --- Memory ---
                        {memory_context}
                        """

    #Step 3: Evaluate and revise summary using feedback
    eval_payload = {
        "messages": str(evaluator_input),
    }
    print("\nRunning evaluator with feedback...")
    revised_output = evaluator_agent.invoke(eval_payload)
    final_output = revised_output['messages'][-1].content
    print("\n--- Revised Summary ---")
    #print(revised_output)
    print(revised_output)

    #Optional: Loop for more feedback
    while True:
        more = input("\nWould you like to refine further? (y/n): ")
        if more.lower().startswith("y"):
            human_feedback = input("Enter any further feedback:\n")
            eval_payload = {
                            "messages": str(evaluator_input),
                        }

            revised_output = evaluator_agent.invoke(eval_payload)
            final_output = revised_output['messages'][-1].content
            print("\n--- Revised Summary ---")
            print(final_output)
        else:
            break

    user_question = as_text(user_input)
    symbol = extract_symbol(user_question)
    final_answer = as_text(final_output)   # Use your utility function

    SESSION_MEMORY.remember(symbol, user_question, final_answer)

    print("\nWorkflow complete. Final output above.")
    return revised_output  #Return for pretty print


## Learning/Memory Agent
Maintains memory across analysis runs

In [100]:
#MEMORY AGENT CELL 1
#Session-scoped memory

@dataclass
class MemoryItem:
    symbol: str
    question: str
    answer: str
    created_at: str
    meta: Dict[str, Any]

class SessionMemory:
    def __init__(self, max_items: int = 200, max_per_symbol: int = 10):
        self._store: Dict[str, List[MemoryItem]] = {}
        self.max_items = max_items
        self.max_per_symbol = max_per_symbol

    def remember(self, symbol: str, question: str, answer: str, **meta) -> None:
        symbol = (symbol or "GENERIC").upper().strip()
        item = MemoryItem(
            symbol=symbol,
            question=(question or "").strip(),
            answer=(answer or "").strip(),
            created_at=datetime.utcnow().isoformat(timespec="seconds"),
            meta=meta or {}
        )
        bucket = self._store.setdefault(symbol, [])
        bucket.append(item)
        if len(bucket) > self.max_per_symbol:
            del bucket[0 : len(bucket) - self.max_per_symbol]
        self._cap_global()

    def recall(self, symbol: str, question: Optional[str] = None) -> Optional[str]:
        symbol = (symbol or "GENERIC").upper().strip()
        bucket = self._store.get(symbol, [])
        if not bucket:
            return None
        if not question:
            return bucket[-1].answer
        q = (question or "").strip()
        for item in reversed(bucket):
            if item.question == q:
                return item.answer
        return None

    def latest(self, symbol: str) -> Optional[MemoryItem]:
        symbol = (symbol or "GENERIC").upper().strip()
        bucket = self._store.get(symbol, [])
        return bucket[-1] if bucket else None

    def _cap_global(self):
        all_items = []
        for sym, bucket in self._store.items():
            for it in bucket:
                all_items.append((it.created_at, sym, it))
        if len(all_items) <= self.max_items:
            return
        all_items.sort(key=lambda x: x[0])  # oldest first
        to_drop = len(all_items) - self.max_items
        cutoff = set(id(it) for _, _, it in all_items[:to_drop])
        for sym in list(self._store.keys()):
            self._store[sym] = [it for it in self._store[sym] if id(it) not in cutoff]



def extract_symbol(text: str) -> str:
    """
    Grab a likely ticker from the user_input like 'Analyze the SPY stock ticker'.
    Simple heuristic: first ALL-CAPS token 1-5 chars (e.g., AAPL, MSFT, SPY).
    Falls back to 'GENERIC' if none found.
    """
    if not text:
        return "GENERIC"
    candidates = re.findall(r"\b[A-Z]{1,5}\b", text)
    # Light filter for common English words
    stop = {"THE","AND","FOR","WITH","FROM","THIS","THAT","YOUR","HAVE","HOLD"}
    for c in candidates:
        if c not in stop:
            return c
    return "GENERIC"

def as_text(x: Any) -> str:
    """
    Normalize whatever comes back from planner/tools/evaluator/optimizer into a string.
    Works with LangChain AgentExecutor outputs (dict), AIMessage, or raw str.
    """
    try:
        # AIMessage / ChatMessage
        if hasattr(x, "content"):
            return str(x.content)
        # Agent-like dicts
        if isinstance(x, dict):
            if "output" in x and isinstance(x["output"], str):
                return x["output"]
            if "messages" in x and isinstance(x["messages"], list):
                return "\n\n".join(
                    (m.content if hasattr(m, "content") else str(m))
                    for m in x["messages"]
                )
        # plain string
        if isinstance(x, str):
            return x
        return str(x)
    except Exception:
        return str(x)


def query_memory(stock_symbol, session_memory):
    print('query memory stock symbol = ', stock_symbol)
    if stock_symbol in session_memory._store:
        recent_memory = session_memory._store.get(stock_symbol)[-1].answer
        return recent_memory

## Execution Function, Set The Agent Free

In [96]:
SESSION_MEMORY = SessionMemory()

In [108]:
#Execution function
final_analysis = optimization_loop(tools_agent,
                                   evaluator_agent,
                                   {
    "messages": [
        {"role": "user", "content": formatted_prompt}
                  ]
                                    },
                                   SESSION_MEMORY
                                  )

Conducting research...
[1m[values][0m {'messages': [HumanMessage(content='\nSystem Role: You are a detailed Financial Research Agent designed to analyze companies, securities, and markets with a structured workflow.\n\n**Your goal** For TGT, provide a ~50 word investment summary, based on extensive reserach.\n\nInclude  to improve your answers.\n\nNEVER INCLUDE YOUR THOUGHT PROCESS IN THE RESULTS.\n\n\n**Core Functions**\n\n1. Planning:\nDevelop and print out a plan for each given stock symbol. Include a plan for a multi-step research process that defines what data to collect (e.g., news, market data, filings), what tools to use, and how to verify results.\n\nExample steps: “Collect company overview → Analyze financials → Cross-check recent filings → Summarize.”\n\n2. Tool Use:\nUse all of your tools by default to hit APIs and data sources dynamically, such as Yahoo Finance, SEC EDGAR, news APIs, and any specialist analyzers. Choose the right tools and calls autonomously based on the

  data = yf.download(symbol, start=start_date, end=end_date)
[*********************100%***********************]  1 of 1 completed


[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Walmart is the latest company to announce a partnership with OpenAI.\nBest Buy could quietly emerge as the big retail winner if U.S.–China tariff relief holds.\nWalmart is the latest company to announce a partnership with OpenAI.', name='NewsAPI', tool_call_id='call_sx5mjGuoOPJn6kk2obDCee8U')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Company: Target Corporation (TGT)\nSector: Consumer Defensive\nPrice: $90.91\nMarket Cap: 41309425664\nP/E Ratio: 10.5955715\n52-Week High: 158.97\n52-Week Low: 85.36\nDividend Yield: 5.02\nBeta: 1.177\nRevenue: 105640001536\nProfit Margin: 0.03718\n', name='YahooFinanceAPI', tool_call_id='call_JJR5q3qD5Sn6XquH0eB3JQk9')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Stock chart saved as TGT_52week_chart.png', name='StockChart52Weeks', tool_call_id='call_4cUvkqt9709PHj13fb5FjphB')]}}
[1m[updates][0m {'tools': {'messages': [ToolMessage(content='Latest

KeyError: 'symbol'

In [None]:
print_final_analysis(final_analysis)


Final Analysis Summary
Message 1:
I have gathered some initial information on Apple Inc. (AAPL):

1. **Company Overview**:
   - Stock Price: $245.27
   - Sector: Technology
   - Key Metrics: Apple Inc. (AAPL) is one of the AI Stocks In The Spotlight For Investors.

2. **Financial Data**:
   - Revenue: Not available
   - Profit: Not available
   - Cash Flow: Not available

3. **SEC Filings**:
   - Latest filings for AAPL: 
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530830.xml
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530744.xml
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530690.xml

4. **Recent News**:
   - Apple Inc. is in the spotlight for investors.
   - Recent news includes Apple reportedly acquiring computer vision startup Prompt AI's team and technology.

5. **Market Context**:
   - I couldn't find specific market context information for Apple Inc.

6. **Peer Comparison**:
   - Peer comparison data is not available at the moment.

7. **Technical Analysis**:
   - Te

In [None]:
#Ensure the memory is working as intended
#After running the user query through your whole pipeline:
user_question = user_input
symbol = extract_symbol(user_input)
final_answer = as_text(final_analysis)   # Use your utility function

SESSION_MEMORY.remember(symbol, user_question, final_answer)

Previous answer for AAPL: I have gathered some initial information on Apple Inc. (AAPL):

1. **Company Overview**:
   - Stock Price: $245.27
   - Sector: Technology
   - Key Metrics: Apple Inc. (AAPL) is one of the AI Stocks In The Spotlight For Investors.

2. **Financial Data**:
   - Revenue: Not available
   - Profit: Not available
   - Cash Flow: Not available

3. **SEC Filings**:
   - Latest filings for AAPL: 
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530830.xml
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530744.xml
     - 4 on 2025-10-03: xslF345X05/wk-form4_1759530690.xml

4. **Recent News**:
   - Apple Inc. is in the spotlight for investors.
   - Recent news includes Apple reportedly acquiring computer vision startup Prompt AI's team and technology.

5. **Market Context**:
   - I couldn't find specific market context information for Apple Inc.

6. **Peer Comparison**:
   - Peer comparison data is not available at the moment.

7. **Technical Analysis**:
   - Technical a

In [None]:
# Later, you can recall the latest answer for "AAPL":
prev = SESSION_MEMORY.recall("AAPL")
if prev:
    print("Previous answer for AAPL:", prev)