# InvestIQ

InvestIQ is an agentic AI system that combines stock market analysis, news analysis, and knowledge from document sources to provide insights. It uses four agents, each with a different role.

Project GitHub repository: https://github.com/Nishchal-dl/InvestIQ

## Agents

1. Stock Market Agent
2. News Agent
3. RAG Agent
4. Formatting Agent

All of these agents are part of a hierarchical agent team and are coordinated by a supervisor agent.

### Supervisor Agent

The supervisor agent coordinates the hierarchical agent team and directs the flow of execution. It decides which agent to invoke based on the user query.
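
To make the routing idea concrete, here is a deliberately simplified sketch. It is not how the supervisor in this notebook works: the real supervisor is an LLM that chooses an agent itself. The keyword table `ROUTES` and the function `route` are hypothetical stand-ins for that decision, and only the agent names are taken from this notebook.

```python
from typing import Dict, List, Optional

# Hypothetical keyword rules standing in for the LLM's routing decision.
# The agent names match the agents defined later in this notebook.
ROUTES: Dict[str, List[str]] = {
    "news_agent": ["news", "headline", "sentiment"],
    "rag_agent": ["report", "document", "esg"],
    "stock_agent": ["price", "stock", "ticker", "financial"],
}


def route(query: str) -> Optional[str]:
    """Return the first agent whose keywords appear in the query, else None."""
    text = query.lower()
    for agent_name, keywords in ROUTES.items():
        if any(word in text for word in keywords):
            return agent_name
    return None  # nothing matched: the supervisor would answer on its own


for q in [
    "Get the news about AAPL stock?",
    "Can you compare the stock price of Apple and Microsoft",
]:
    print(q, "->", route(q))
```

An LLM-based supervisor replaces the keyword table with a model call, which lets it handle queries that need several agents in sequence.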

### Stock Market Agent

The stock market agent provides insights about a stock ticker. It uses a tool built on Yahoo Finance (the yfinance package) to fetch stock market data, and returns only the details needed to answer the query.

### News Agent

The news agent provides a summary of news articles related to the stock ticker. It uses a tool (NewsAPI) to fetch news articles and reports their details. It also provides the sentiment of each article.

### RAG Agent

The RAG agent provides data from document sources. This agent uses vector embeddings and similarity search to fetch relevant document chunks when needed.
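
As a rough illustration of what "vector embedding and search" means, the sketch below ranks a few chunks by cosine similarity to a query vector. The three-dimensional vectors and chunk labels are made-up placeholders; real embeddings have far more dimensions and come from an embedding model. The helper names `cosine_similarity` and `top_k` are hypothetical and are not part of the notebook.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Placeholder chunks with toy embeddings (not real data).
chunks = [
    ("chunk about emissions", [0.9, 0.1, 0.0]),
    ("chunk about governance", [0.0, 0.2, 0.9]),
    ("chunk about water use", [0.7, 0.6, 0.1]),
]
result = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(result)
```

The retrieved chunk texts are then passed to the language model as context for answering the question.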

### Formatting Agent

The formatting agent formats the data gathered by the other agents and provides the final response to the user. It does not use any external tools, and it can also summarize the data gathered by the other agents. In this particular use case, it formats the data as JSON.

## Architecture of InvestIQ

The agentic workflow of InvestIQ is shown below.

![Agent Architecture Diagram](../images/agent_architecture.svg)



In [1]:
from dotenv import load_dotenv
from typing import Dict, Any, List
import yfinance as yf
load_dotenv() 

True

## Implementing the tool functions used by the agents

### Get stock data tool
The purpose of this tool is to fetch stock data from yfinance.
Its main use is to get real-time company data and market information.

This uses the [yfinance Python package](https://ranaroussi.github.io/yfinance/).

In [3]:
def get_stock_data(ticker: str) -> Dict[str, Any]:
    """Fetches stock data for a given stock ticker using yfinance.

    Args:
        ticker (str): The stock ticker symbol (e.g., "AAPL", "MSFT").

    Returns:
        Dict[str, Any]: A dictionary containing selected stock information,
        or an error message if the ticker is invalid.
    """
    try:
        stock = yf.Ticker(ticker)
        info = stock.info

        if not info:
            return {"error": f"Could not retrieve data for ticker: {ticker}. It might be invalid."}

        data = {
            "ticker": ticker,
            "shortName": info.get("shortName", "N/A"),
            "currentPrice": info.get("currentPrice", "N/A"),
            "previousClose": info.get("previousClose", "N/A"),
            "open": info.get("open", "N/A"),
            "dayHigh": info.get("dayHigh", "N/A"),
            "dayLow": info.get("dayLow", "N/A"),
            "volume": info.get("volume", "N/A"),
            "marketCap": info.get("marketCap", "N/A"),
            "currency": info.get("currency", "N/A"),
            "exchange": info.get("exchange", "N/A"),
            "sector": info.get("sector", "N/A"),
            "industry": info.get("industry", "N/A"),
            "longBusinessSummary": info.get("longBusinessSummary", "N/A")
        }
        return data
    except Exception as e:
        return {"error": f"An unexpected error occurred while fetching data for {ticker}: {e}"}
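
Because the tool returns either a data dictionary or a dictionary with an `error` key, any caller needs to check which one it received. The following is a small sketch of such a check. `describe_quote` is a hypothetical helper, not part of the notebook, and the sample values are copied from the AAPL tool output shown later in this notebook.

```python
from typing import Any, Dict


def describe_quote(data: Dict[str, Any]) -> str:
    """Turn a get_stock_data-style dictionary into a one-line summary."""
    if "error" in data:
        return f"Lookup failed: {data['error']}"

    name = data.get("shortName", data.get("ticker", "?"))
    price = data.get("currentPrice", "N/A")
    prev = data.get("previousClose", "N/A")
    line = f"{name}: {price} {data.get('currency', '')}".strip()

    # Missing fields default to the string "N/A", so only compute the
    # change when both values are actually numeric.
    if isinstance(price, (int, float)) and isinstance(prev, (int, float)) and prev:
        change_pct = (price - prev) / prev * 100
        line += f" ({change_pct:+.2f}% vs previous close)"
    return line


sample = {
    "ticker": "AAPL",
    "shortName": "Apple Inc.",
    "currentPrice": 247.77,
    "previousClose": 247.66,
    "currency": "USD",
}
print(describe_quote(sample))
print(describe_quote({"error": "ticker not found"}))
```

The numeric check matters because a missing field comes back as the string `"N/A"`, so arithmetic on the raw values would otherwise raise a `TypeError`.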



### Get financial data

The purpose of this tool is to fetch the cash flow, balance sheet, and income statement data, along with one month of historical price data, for a given stock symbol.

This also uses yfinance to fetch real-time data.

In [5]:

def get_financial_data(ticker: str) -> str:
    """Fetches financial statements and one month of price history
    for a given stock ticker using yfinance.

    Args:
        ticker (str): The stock ticker symbol (e.g., "AAPL", "MSFT").

    Returns:
        str: A string containing income_statement, cash_flow,
        balance_sheet, financials, and historical_month_data,
        or an error message if the data could not be retrieved.
    """
    try:
        stock = yf.Ticker(ticker)
        info = stock.info

        if not info:
            # Return a plain string so the result matches the declared return type.
            return f"Error: Could not retrieve data for ticker: {ticker}. It might be invalid."

        data = f"""
            'income_statement':
            {stock.income_stmt}

            'cash_flow':
            {stock.cash_flow}

            'balance_sheet':
            {stock.balance_sheet}

            'financials':
            {stock.financials}

            'historical_month_data':
            {stock.history(period='1mo')}
        """
        return data
    except Exception as e:
        return f"Error: An unexpected error occurred while fetching data for {ticker}: {e}"



### Get news articles

The purpose of this tool is to get recent news headlines and summaries.

This uses an external service, [News API](https://newsapi.org/).

In [6]:
def get_news_articles(query: str) -> List[Dict]:
    """Fetch news articles based on query or ticker symbol.
    
    Args:
        query: Search keywords or phrases (str)
            
    Returns:
        List of news articles with title, description, url, etc.
    """
    try:
        from newsapi import NewsApiClient
        import os
        
        newsapi = NewsApiClient(api_key=os.getenv("NEWS_API_KEY"))
        response = newsapi.get_everything(
            q=query,
            language="en",
            sort_by="publishedAt",
            page_size=10
        )
        return response.get("articles", [])
    except Exception as e:
        # Return a list so the error case matches the declared return type.
        return [{"error": f"Failed to fetch news: {e}"}]
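
Each article returned by News API carries more fields than a summary needs (for example `urlToImage` and `content`, as seen in the tool output later in this notebook). One possible refinement, sketched here under the assumption that `articles` is the list returned on success, is to keep only a few fields before handing the result to the model. `compact_articles` and `KEEP_FIELDS` are hypothetical names, and the sample article uses placeholder values.

```python
from typing import Any, Dict, List

# Fields assumed to be useful for summarization; adjust as needed.
KEEP_FIELDS = ("title", "description", "url", "publishedAt")


def compact_articles(
    articles: List[Dict[str, Any]], limit: int = 5
) -> List[Dict[str, Any]]:
    """Keep a few fields per article and flatten the nested source name."""
    compact = []
    for article in articles[:limit]:
        item = {field: article.get(field) for field in KEEP_FIELDS}
        # "source" is a nested object such as {"id": None, "name": "..."}.
        item["source"] = (article.get("source") or {}).get("name")
        compact.append(item)
    return compact


# Placeholder article shaped like a News API result (not real data).
sample_articles = [
    {
        "source": {"id": None, "name": "Example Source"},
        "author": "Example Author",
        "title": "Example headline",
        "description": "Example description.",
        "url": "https://example.com/article",
        "urlToImage": "https://example.com/image.png",
        "publishedAt": "2025-01-01T00:00:00Z",
        "content": "Example content...",
    }
]
trimmed = compact_articles(sample_articles)
print(trimmed)
```

Trimming the payload this way reduces the amount of text the model has to read for each tool call.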

Utility functions for tracing the messages passed between the agents.

In [7]:
from langchain_core.messages import convert_to_messages


def pretty_print_message(message, indent=False):
    pretty_message = message.pretty_repr(html=True)
    if not indent:
        print(pretty_message)
        return

    indented = "\n".join("\t" + c for c in pretty_message.split("\n"))
    print(indented)


def pretty_print_messages(update, last_message=False):
    is_subgraph = False
    if isinstance(update, tuple):
        ns, update = update
        if len(ns) == 0:
            return

        graph_id = ns[-1].split(":")[0]
        print(f"Update from subgraph {graph_id}:")
        print("\n")
        is_subgraph = True

    for node_name, node_update in update.items():
        update_label = f"Update from node {node_name}:"
        if is_subgraph:
            update_label = "\t" + update_label

        print(update_label)
        print("\n")

        messages = convert_to_messages(node_update["messages"])
        if last_message:
            messages = messages[-1:]

        for m in messages:
            pretty_print_message(m, indent=is_subgraph)
        print("\n")

## Agents

### News Agent
The first agent is the News Agent, which fetches the latest news data from the News API.
The News API is called through the `get_news_articles` tool.

![News API Agent](../images/news_agent.svg)

In [8]:
from langgraph.prebuilt import create_react_agent

news_agent = create_react_agent(
    model="openai:o4-mini",
    tools=[get_news_articles],
    prompt=(
        "You are a news agent.\n\n"
        "INSTRUCTIONS:\n"
        "- Assist ONLY with research-related tasks, DO NOT do any math\n"
        "- After you're done with your tasks, respond to the supervisor directly\n"
        "- Respond ONLY with the results of your work, do NOT include ANY other text."
    ),
    name="news_agent",
)

In [9]:
for chunk in news_agent.stream(
    {"messages": [{"role": "user", "content": "Get the news about AAPL stock?"}]}
):
    pretty_print_messages(chunk)

Update from node agent:


Name: news_agent
Tool Calls:
  get_news_articles (call_2DNt7WKgdApeMmUflgAmHa7j)
 Call ID: call_2DNt7WKgdApeMmUflgAmHa7j
  Args:
    query: AAPL stock


Update from node tools:


Name: get_news_articles

[{"source": {"id": null, "name": "Yahoo Entertainment"}, "author": "Jim Edwards", "title": "Global markets tumble as Beijing imposes new ban on U.S. shipping and Bessent vows China ‘will be hurt the most’ if it doesn’t surrender", "description": "China showed no signs of backing down from President Trump’s trade war.", "url": "https://finance.yahoo.com/news/global-markets-tumble-beijing-imposes-101818085.html", "urlToImage": "https://s.yimg.com/ny/api/res/1.2/AbOODaNKz5V0uPrmU7mHLA--/YXBwaWQ9aGlnaGxhbmRlcjt3PTEyMDA7aD03ODU-/https://media.zenfs.com/en/fortune_175/2ad8b422b81beffbed27abe05923f8d3", "publishedAt": "2025-10-14T10:18:18Z", "content": "<ul><li>Global stock markets fell after China banned certain U.S. shipping firms and Treasury Secretary Scott Besse

### Financial Agent
The second agent is the Financial Agent, which fetches stock data through the [Yahoo Finance](https://ranaroussi.github.io/yfinance/) API.
The stock data is fetched with the `get_stock_data` and `get_financial_data` tools.

![Financial Agent](../images/financial_agent.svg)

In [10]:
stock_agent = create_react_agent(
    model="openai:o4-mini",
    tools=[get_stock_data, get_financial_data],
    prompt=(
        "You are a stock agent.\n\n"
        "INSTRUCTIONS:\n"
        "- Assist ONLY with stock-related tasks\n"
        "- After you're done with your tasks, respond to the supervisor directly\n"
        "- Respond ONLY with the results of your work, do NOT include ANY other text."
    ),
    name="stock_agent",
)

In [11]:
for chunk in stock_agent.stream(
    {"messages": [{"role": "user", "content": "Get the financial data related to AAPL stock"}]}
):
    pretty_print_messages(chunk)

Update from node agent:


Name: stock_agent
Tool Calls:
  get_financial_data (call_YGZwRsjXT5u7MkfzVqzHKfHb)
 Call ID: call_YGZwRsjXT5u7MkfzVqzHKfHb
  Args:
    ticker: AAPL


Update from node tools:


Name: get_financial_data


            'income_statement': 
                                                                  2024-09-30  \
Tax Effect Of Unusual Items                         0.000000e+00   
Tax Rate For Calcs                                  2.410000e-01   
Normalized EBITDA                                   1.346610e+11   
Net Income From Continuing Operation Net Minori...  9.373600e+10   
Reconciled Depreciation                             1.144500e+10   
Reconciled Cost Of Revenue                          2.103520e+11   
EBITDA                                              1.346610e+11   
EBIT                                                1.232160e+11   
Net Interest Income                                          NaN   
Interest Expense                              

In [12]:
for chunk in stock_agent.stream(
    {"messages": [{"role": "user", "content": "Get the stock information for AAPL"}]}
):
    pretty_print_messages(chunk)

Update from node agent:


Name: stock_agent
Tool Calls:
  get_stock_data (call_jNWoAqNsFqc3P7vla0SkXez3)
 Call ID: call_jNWoAqNsFqc3P7vla0SkXez3
  Args:
    ticker: AAPL


Update from node tools:


Name: get_stock_data

{"ticker": "AAPL", "shortName": "Apple Inc.", "currentPrice": 247.77, "previousClose": 247.66, "open": 246.615, "dayHigh": 248.845, "dayLow": 244.7, "volume": 35410253, "marketCap": 3677003448320, "currency": "USD", "exchange": "NMS", "sector": "Technology", "industry": "Consumer Electronics", "longBusinessSummary": "Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, and HomePod. It also provides AppleCare support and cloud services; and operates various platforms, including the A

### Summary Agent
The third agent is the Summary Agent, which summarizes the data fetched across the entire workflow.

In this particular implementation, we transform the summarized data into a JSON format.

![Summary Agent](../images/summary_agent.svg)

In [13]:
# langchain_core.pydantic_v1 is a deprecated shim; import from pydantic directly.
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser

class NewsSentiment(BaseModel):
    news_headline: str = Field(..., description="News Headline")
    time: str = Field(..., description="Time of the news. Friendly format like 1 week ago, 1 month ago, 1year ago")
    sentiment: float = Field(..., description="News rating between -1 and +1")

class SentimentAnalysis(BaseModel):
    key_words: List[str] = Field(..., description="Key words related to finance from recent news")
    news_sentiment: List[NewsSentiment] = Field(..., description="A list of news headlines and its sentiment")
    overall_sentiment_rating: float = Field(..., description="Overall sentiment of the company in the news, rated between -10 and +10")
    reasoning: str = Field(..., description="A one line reason for assigning the overall sentiment rating")

class StockRecommendation(BaseModel):
    recommendation: str = Field(..., description="Recommendation for the stock. Strong Buy, Buy, Hold, Sell, Strong Sell")
    reasoning: List[str] = Field(..., description="3 reasons for assigning the recommendation")
    price_prediction: float = Field(..., description="Predicted price of the stock")
    price_prediction_percentage: float = Field(..., description="Predicted price percentage change of the stock")

class RiskAssessment(BaseModel):
    market_risk: str = Field(..., description="Risk level of the stock. Low, Medium, High")
    volatility: str = Field(..., description="Volatility of the stock. Low, Medium, High")
    growth_potential: str = Field(..., description="Growth potential of the stock. Low, Medium, High")

class StockOverview(BaseModel):
    company_overview: str = Field(..., description="Financial summary of the company in a short paragraph")
    stock_recommendation: StockRecommendation = Field(..., description="Stock recommendation")
    risk_assessment: RiskAssessment = Field(..., description="Risk assessment")
    sentiment_analysis: SentimentAnalysis = Field(..., description="Sentiment analysis")

parser = JsonOutputParser(pydantic_object=StockOverview)

sys_msg = f"""
You are an expert financial analyst. Analyze the stock data and news, 
then respond with a comprehensive analysis in JSON format.

Response must be a valid JSON object matching this schema:
{parser.get_format_instructions()}

Guidelines:
- Be objective and data-driven
- Include both technical and fundamental analysis
- Consider recent news impact
- Provide clear, actionable recommendations
- overall sentiment rating score should reflect certainty (-10 to +10)
"""





In [14]:
summary_agent = create_react_agent(
    model="openai:o4-mini",
    tools=[],
    prompt=(sys_msg),
    name="summary_agent",
)
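
The summary agent is only instructed to follow the schema, so its reply can still deviate from it. A lightweight sanity check on the returned text could look like the sketch below. `check_overview` and `REQUIRED_KEYS` are hypothetical names; the key names and value ranges are taken from the `StockOverview` models defined above.

```python
import json
from typing import Any, Dict, List

REQUIRED_KEYS = (
    "company_overview",
    "stock_recommendation",
    "risk_assessment",
    "sentiment_analysis",
)


def check_overview(raw: str) -> List[str]:
    """Return a list of problems found in a StockOverview-style JSON string."""
    try:
        payload: Dict[str, Any] = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]

    sentiment = payload.get("sentiment_analysis")
    if not isinstance(sentiment, dict):
        sentiment = {}
    overall = sentiment.get("overall_sentiment_rating")
    if overall is not None and not -10 <= overall <= 10:
        problems.append("overall_sentiment_rating outside [-10, 10]")
    for item in sentiment.get("news_sentiment", []):
        if not -1 <= item.get("sentiment", 0) <= 1:
            problems.append(f"sentiment outside [-1, 1]: {item.get('news_headline')}")
    return problems


print(check_overview('{"company_overview": "..."}'))
```

An empty list means the reply passed these basic checks. Validating the reply against the Pydantic model itself would be stricter.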

### RAG Agent
The fourth agent is the RAG agent, which answers questions based on the documents provided by the user.

The dataset for this notebook is taken from Kaggle: [Financial Reports QA Dataset for RAG-based LLM Fin](https://www.kaggle.com/datasets/ahmedsta/data-retreiver). The data contains financial reports in PDF format from various companies.

The dataset also contains a set of questions and expected answers that can be used to evaluate the performance of the RAG agent.

![RAG Agent](../images/rag_agent.svg)


In [23]:
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

import os

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
vector_store = InMemoryVectorStore(embeddings)
data_dir = "./../dataset/Structured data-20250319T105519Z-001/Structured data"
rag_agent = None
documents = []

def load_data():
    pdf_files = [f for f in os.listdir(data_dir) if f.endswith('.pdf')]

    for pdf_file in pdf_files:
        file_path = os.path.join(data_dir, pdf_file)
        print(f"Loading {pdf_file}...")
        try:
            loader = PyPDFLoader(file_path)
            pages = loader.load()
            documents.extend(pages)
            print(f"Loaded {len(pages)} pages from {pdf_file}")
        except Exception as e:
            print(f"Error loading {pdf_file}: {str(e)}")

def chunk_documents(chunk_size=1000, chunk_overlap=200):
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,  
        chunk_overlap=chunk_overlap,  
        add_start_index=True, 
    )

    all_splits = text_splitter.split_documents(documents)
    vector_store.add_documents(documents=all_splits)


from langchain.tools import tool

@tool(response_format="content_and_artifact")
def retrieve_context(query: str):
    """Retrieve information to help answer a query."""
    retrieved_docs = vector_store.similarity_search(query, k=2)
    serialized = "\n\n".join(
        f"Source: {doc.metadata}\nContent: {doc.page_content}"
        for doc in retrieved_docs
    )
    return serialized, retrieved_docs

from langgraph.prebuilt import create_react_agent

tools = [retrieve_context]
prompt = """
    You have access to a tool that retrieves context from financial documents. 
    Use the tool to help answer user queries.
"""

rag_agent = create_react_agent(
    model="openai:o4-mini",
    tools=tools,
    prompt=(prompt),
    name="rag_agent"
)

def query(question: str):
    """Send a single user question to the RAG agent and return its final state."""
    if rag_agent is None:
        raise ValueError("RAG agent has not been created.")

    return rag_agent.invoke(
        {"messages": [{"role": "user", "content": question}]}
    )

        

load_data()
chunk_documents()

Loading 2021ESG.pdf...
Loaded 116 pages from 2021ESG.pdf
Loading 2022-Absa-Group-limited-Environmental-Social-and-Governance-Data-sheet.pdf...
Loaded 26 pages from 2022-Absa-Group-limited-Environmental-Social-and-Governance-Data-sheet.pdf
Loading Clicks-Sustainability-Report-2022.pdf...
Loaded 21 pages from Clicks-Sustainability-Report-2022.pdf
Loading DISTELL ESG Appendix 2022.pdf...
Loaded 17 pages from DISTELL ESG Appendix 2022.pdf
Loading ESG-spreads.pdf...
Loaded 76 pages from ESG-spreads.pdf
Loading picknpay-esg-report-spreads-2023.pdf...
Loaded 24 pages from picknpay-esg-report-spreads-2023.pdf
Loading SASOL Sustainability Report 2023 20-09_0.pdf...
Loaded 66 pages from SASOL Sustainability Report 2023 20-09_0.pdf
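
To show what `chunk_size` and `chunk_overlap` control, here is a simplified fixed-window sketch. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to split on separators such as paragraph and line breaks. It only illustrates how consecutive chunks share `chunk_overlap` characters. `sliding_chunks` is a hypothetical helper.

```python
from typing import List


def sliding_chunks(
    text: str, chunk_size: int = 1000, chunk_overlap: int = 200
) -> List[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Small values make the overlap easy to see.
demo = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo)
```

The overlap means a sentence that straddles a chunk boundary still appears in full in at least one chunk, provided it is shorter than the overlap.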


### Supervisor Agent

The supervisor agent is the top-level agent in the hierarchy. It coordinates workflow execution across the available agents.

In [29]:
from langgraph_supervisor import create_supervisor
from langchain.chat_models import init_chat_model

supervisor = create_supervisor(
    model=init_chat_model("openai:gpt-4.1"),
    agents=[stock_agent, news_agent, summary_agent, rag_agent],
    prompt=(
        "You are a supervisor managing four agents:\n"
        "- a stock agent. Assign stock-related tasks to this agent\n"
        "- a news agent. Assign news-related tasks to this agent\n"
        "- a summary agent. Assign json parsing tasks to this agent\n"
        "- a RAG agent. Assign tasks related to financial lookup from a document to this agent\n"
        "Assign work to one agent at a time, do not call agents in parallel.\n"
        "Do not do any work yourself."
    ),
    add_handoff_back_messages=True,
    output_mode="full_history",
).compile()

In [30]:
from IPython.display import display, Image

mermaid_code = supervisor.get_graph().draw_mermaid()
print(mermaid_code)

---
config:
  flowchart:
    curve: linear
---
graph TD;
	__start__([<p>__start__</p>]):::first
	supervisor(supervisor)
	stock_agent(stock_agent)
	news_agent(news_agent)
	summary_agent(summary_agent)
	rag_agent(rag_agent)
	__end__([<p>__end__</p>]):::last
	__start__ --> supervisor;
	news_agent --> supervisor;
	rag_agent --> supervisor;
	stock_agent --> supervisor;
	summary_agent --> supervisor;
	supervisor -.-> __end__;
	supervisor -.-> news_agent;
	supervisor -.-> rag_agent;
	supervisor -.-> stock_agent;
	supervisor -.-> summary_agent;
	classDef default fill:#f2f0ff,line-height:1.2
	classDef first fill-opacity:0
	classDef last fill:#bfb6fc



![Mermaid](../images/mermaid-diagram.png)

In [31]:
from langchain_core.messages import HumanMessage, SystemMessage

def prepare_human_message(ticker: str) -> str:
    return f"""
Analyze this stock data, financial information, recent news and provide your analysis in the requested JSON format.

STOCK DATA for {ticker.upper()}:

Provide your analysis in the exact JSON format, with no additional text.
"""


for chunk in supervisor.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": prepare_human_message("AAPL"),
            }
        ]
    },
):
    pretty_print_messages(chunk, last_message=True)

Update from node supervisor:


Name: transfer_to_stock_agent

Successfully transferred to stock_agent


Update from node stock_agent:


Name: transfer_back_to_supervisor

Successfully transferred back to supervisor


Update from node supervisor:


Name: transfer_to_news_agent

Successfully transferred to news_agent


Update from node news_agent:


Name: transfer_back_to_supervisor

Successfully transferred back to supervisor


Update from node supervisor:


Name: transfer_to_summary_agent

Successfully transferred to summary_agent


Update from node summary_agent:


Name: transfer_back_to_supervisor

Successfully transferred back to supervisor


Update from node supervisor:


Name: supervisor

{
  "company_overview": "Apple Inc. designs, manufactures, and markets consumer electronics and subscription-based services worldwide. In fiscal 2024, Apple generated $391.0 B in revenue (up 2% year-over-year) and $93.7 B in net income (down 3%), while free cash flow reached $108.8 B (up 9%). The

In [32]:
for chunk in supervisor.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": "Can you compare the stock price of Apple and Microsoft",
            }
        ]
    },
):
    pretty_print_messages(chunk, last_message=True)

Update from node supervisor:


Name: transfer_to_stock_agent

Successfully transferred to stock_agent


Update from node stock_agent:


Name: transfer_back_to_supervisor

Successfully transferred back to supervisor


Update from node supervisor:


Name: supervisor

Here is a comparison of the current stock prices:

Apple (AAPL): $247.77  
Microsoft (MSFT): $513.57  

Microsoft's stock price is $265.80 higher than Apple's, and is approximately 107% above Apple's current price. If you need further analysis or a different time frame, let me know!
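
The arithmetic in this response can be checked independently. The short sketch below recomputes the difference and the percentage from the two prices quoted above. The variable names are illustrative only.

```python
# Prices as quoted in the supervisor's response above.
apple_price = 247.77
microsoft_price = 513.57

difference = microsoft_price - apple_price
percent_above = difference / apple_price * 100

print(f"Difference: ${difference:.2f}")
print(f"Microsoft is {percent_above:.1f}% above Apple")
```

Both figures agree with the model's answer once rounded, which is a useful spot check because language models can make arithmetic mistakes.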


