<a href="https://colab.research.google.com/github/gopaldewoolkar/finance_agents/blob/main/Finance_Agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building Financial AI Agents Session 😀

## Goal of the Session:

1. Build AI Agents to handle various financial tasks

2. Four agents:

- Tool calling: web search
- RAG-based querying
- Deep-research stock analysis
- Evaluation (LLM-as-a-judge)

## Open-Source Tech Stack Used in This Session:

- Fast open-source LLM inference [Groq]: https://groq.com/
- Agentic Framework [Agno]: https://www.agno.com/
- Vector Database [PgVector]: https://pypi.org/project/pgvector/
- Embeddings [Sentence-transformers]: https://huggingface.co/sentence-transformers
- Containerization [Udocker for Colab]: https://github.com/drengskapur/docker-in-colab
- Websearch [DuckDuckGo]: https://github.com/duckduckgo

In [4]:
## Install required packages (each package listed once)
!pip install groq yfinance
!pip install duckduckgo-search newspaper4k lxml_html_clean
!pip install -U sqlalchemy 'psycopg[binary]' pgvector pypdf agno
!pip install sentence-transformers
!pip install ddgs

Collecting ddgs
  Downloading ddgs-9.5.5-py3-none-any.whl.metadata (18 kB)
Collecting lxml>=6.0.0 (from ddgs)
  Downloading lxml-6.0.1-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl.metadata (3.8 kB)
Downloading ddgs-9.5.5-py3-none-any.whl (37 kB)
Downloading lxml-6.0.1-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl (5.3 MB)
Installing collected packages: lxml, ddgs
  Attempting uninstall: lxml
    Found existing installation: lxml 5.4.0
    Uninstalling lxml-5.4.0:
      Successfully uninstalled lxml-5.4.0
Successfully installed ddgs-9.5.5 lxml-6.0.1
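
After the install cell, it can help to confirm that every package is actually present before building the agents. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative assumptions assembled from the install commands above, not part of the original notebook.

```python
from importlib import metadata

def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing

# Assumed list, copied from the pip commands in the install cell.
REQUIRED = [
    "groq", "agno", "yfinance", "ddgs", "duckduckgo-search", "newspaper4k",
    "lxml_html_clean", "sqlalchemy", "psycopg", "pgvector", "pypdf",
    "sentence-transformers",
]

print(missing_packages(REQUIRED) or "All required packages are installed.")
```

If the list printed is non-empty, rerunning the install cell (or restarting the Colab runtime) is usually the first thing to try.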


In [2]:
import os
from google.colab import userdata

os.environ['GROQ_API_KEY'] = userdata.get('groq_api_key')
os.environ['PHI_API_KEY'] = userdata.get('PHI_API_KEY')

print("API keys have been set!")

API keys have been set!
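
The cell above copies secrets into environment variables without checking them. Below is a small sketch of a stricter variant that fails loudly when a secret is empty. The helper `set_api_key` and the name `DEMO_API_KEY` are hypothetical; `lookup` stands for any callable that maps a secret name to its value, so in Colab it could be `userdata.get`, while the demo uses a plain dictionary.

```python
import os

def set_api_key(env_name, secret_name, lookup):
    """Copy a secret into os.environ[env_name], raising if it is missing or empty.

    `lookup` is any callable that takes a secret name and returns its value
    (or None). This helper is an illustration, not part of any library.
    """
    value = lookup(secret_name)
    if not value:
        raise KeyError(f"Secret {secret_name!r} is not set")
    os.environ[env_name] = value
    return env_name

# Demo with a dictionary standing in for the Colab secrets store.
fake_secrets = {"groq_api_key": "placeholder-value"}
print(set_api_key("DEMO_API_KEY", "groq_api_key", fake_secrets.get), "is set")

# In Colab the equivalent call might look like:
# set_api_key("GROQ_API_KEY", "groq_api_key", userdata.get)
```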


## Agent 1:

### Functional Tool Calling Capability: Web Search

In [5]:
from textwrap import dedent
from agno.agent import Agent
from agno.models.groq import Groq
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.newspaper4k import Newspaper4kTools

In [9]:

# Initialize the research agent with advanced journalistic capabilities
research_agent = Agent(
    model=Groq(id="llama-3.1-8b-instant"),
    tools=[DuckDuckGoTools(), Newspaper4kTools()],
    description=dedent("""\
        You are an elite research analyst in the financial services domain.
        Your expertise encompasses:

        - Deep investigative financial research and analysis
        - Fact-checking and source verification
        - Data-driven reporting and visualization
        - Expert interview synthesis
        - Trend analysis and future predictions
        - Complex topic simplification
        - Ethical practices
        - Balanced perspective presentation
        - Global context integration\
    """),
    instructions=dedent("""\
        1. Research Phase
           - Search for 5 authoritative sources on the topic
           - Prioritize recent publications and expert opinions
           - Identify key stakeholders and perspectives

        2. Analysis Phase
           - Extract and verify critical information
           - Cross-reference facts across multiple sources
           - Identify emerging patterns and trends
           - Evaluate conflicting viewpoints

        3. Writing Phase
           - Craft an attention-grabbing headline
           - Structure content in Financial Report style
           - Include relevant quotes and statistics
           - Maintain objectivity and balance
           - Explain complex concepts clearly

        4. Quality Control
           - Verify all facts and attributions
           - Ensure narrative flow and readability
           - Add context where necessary
           - Include future implications
    """),
    expected_output=dedent("""\
        # {Compelling Headline}

        ## Executive Summary
        {Concise overview of key findings and significance}

        ## Background & Context
        {Historical context and importance}
        {Current landscape overview}

        ## Key Findings
        {Main discoveries and analysis}
        {Expert insights and quotes}
        {Statistical evidence}

        ## Impact Analysis
        {Current implications}
        {Stakeholder perspectives}
        {Industry/societal effects}

        ## Future Outlook
        {Emerging trends}
        {Expert predictions}
        {Potential challenges and opportunities}

        ## Expert Insights
        {Notable quotes and analysis from industry leaders}
        {Contrasting viewpoints}

        ## Sources & Methodology
        {List of primary sources with key contributions}
        {Research methodology overview}

        ---
        Research conducted by Financial Agent
        Credit Rating Style Report
        Published: {current_date}
        Last Updated: {current_time}\
    """),
    markdown=True,
    stream_intermediate_steps=True,
    add_datetime_to_context=True,
)

# User Prompt 1
# Adjacent string literals are joined, so no stray indentation ends up in the prompt.
research_agent.print_response(
    "Analyze the current state and future implications "
    "of artificial intelligence in Finance",
    stream=True,
)


Output()

In [10]:
# User Prompt 2
research_agent.print_response("Applications of Gen AI in Financial Services", stream=True)

Output()

In [11]:
# User Prompt 3
research_agent.print_response("AI agents used in Financial Services",stream=True)

Output()
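
To make the idea of tool calling more concrete, here is a library-free sketch of the dispatch step: the model emits a structured request naming a tool and its arguments, and the host program runs the matching function and hands the result back. This is a simplified illustration, not how Agno or Groq implement it internally; `web_search`, `TOOLS`, and `dispatch_tool_call` are hypothetical names, and the search function returns a canned string instead of querying DuckDuckGo.

```python
import json

def web_search(query: str) -> str:
    """Hypothetical stand-in for a real search tool; returns a canned string."""
    return f"[stub results for: {query}]"

# Registry mapping tool names (as the model would refer to them) to functions.
TOOLS = {"web_search": web_search}

def dispatch_tool_call(raw_call: str) -> str:
    """Run the tool named in a JSON tool call and return its result as text."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**call.get("arguments", {}))

# A tool call in the shape a model might emit (assumed format for this sketch).
model_output = '{"name": "web_search", "arguments": {"query": "Gen AI in financial services"}}'
print(dispatch_tool_call(model_output))
```

In a full agent loop, the returned text would be appended to the conversation so the model can write its final answer from it.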

## Agent 2:

### Stock Market Analysis
1. Use Yahoo Finance data to run comparative stock analysis
2. Generate a short summary report

In [23]:
from textwrap import dedent
from agno.agent import Agent
from agno.models.groq import Groq
from agno.tools.yfinance import YFinanceTools

# YFinanceTools is created with its default settings. Passing per-tool flags such as
# get_current_stock_price to the constructor raised a TypeError here, so they were removed.
stock_agent = Agent(
    model=Groq(id="llama-3.3-70b-versatile"),
    tools=[
        YFinanceTools(),
    ],
    instructions=dedent("""\
        You are a seasoned credit rating analyst with deep expertise in market analysis! 📊

        Follow these steps for comprehensive financial analysis:
        1. Market Overview
           - Latest stock price
           - 52-week high and low
        2. Financial Deep Dive
           - Key metrics (P/E, Market Cap, EPS)
        3. Market Context
           - Industry trends and positioning
           - Competitive analysis
           - Market sentiment indicators

        Your reporting style:
        - Begin with an executive summary
        - Use `tables` for data presentation
        - Include clear section headers
        - Highlight key insights with bullet points
        - Compare metrics to industry averages
        - Include technical term explanations
        - End with a forward-looking analysis

        Risk Disclosure:
        - Always highlight potential risk factors
        - Note market uncertainties
        - Mention relevant regulatory concerns
    """),
    stream_intermediate_steps=True,
    add_datetime_to_context=True,
    markdown=True,
)

print("Stock Agent created. Ready to take user queries..")

Stock Agent created. Ready to take user queries..


In [24]:
# User Query 1
stock_agent.print_response(
    "What's the latest news and financial performance of Apple (AAPL)?", stream=True)

Output()

In [26]:
# User Query 2: Semiconductor market analysis
stock_agent.print_response(
    dedent("""\
    Analyze the semiconductor market performance focusing on:
    - NVIDIA (NVDA)
    - AMD (AMD)
    - Intel (INTC)
    - Taiwan Semiconductor (TSM)
    Compare their market positions, growth metrics, and future outlook in terms of AI growth."""),
    stream=True,
)

Output()

In [27]:
# User Query 3: Competitive analysis

stock_agent.print_response("How is Microsoft performing in the age of AI?", stream=True)

Output()
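
The agent instructions above ask for key metrics such as P/E and EPS presented in tables. As a rough sketch of that arithmetic and layout, the snippet below computes a price-to-earnings ratio (price divided by earnings per share) and renders a markdown comparison table. The helper names and the tickers `AAA` and `BBB` are hypothetical, and the numbers are placeholders for illustration only, not real market data.

```python
def pe_ratio(price, eps):
    """Price-to-earnings ratio; None when EPS is zero or negative."""
    if eps <= 0:
        return None
    return price / eps

def comparison_table(rows):
    """Render rows of (ticker, price, eps) as a markdown table."""
    lines = ["| Ticker | Price | EPS | P/E |", "|---|---|---|---|"]
    for ticker, price, eps in rows:
        pe = pe_ratio(price, eps)
        pe_text = f"{pe:.1f}" if pe is not None else "n/a"
        lines.append(f"| {ticker} | {price:.2f} | {eps:.2f} | {pe_text} |")
    return "\n".join(lines)

# Placeholder values for illustration only, not real market data.
sample = [("AAA", 100.0, 5.0), ("BBB", 50.0, -1.0)]
print(comparison_table(sample))
```

Reporting `n/a` for non-positive EPS avoids showing a negative or undefined P/E, which is a common convention but still a design choice of this sketch.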

## Agent 4:

### Evaluation: LLM-as-a-judge

In [28]:
from textwrap import dedent
from agno.agent import Agent
from agno.models.groq import Groq

class RAGEvaluator:
    def __init__(self):
        self.evaluator = self._initialize_evaluator()

    def _initialize_evaluator(self):
        return Agent(
            model=Groq(id="openai/gpt-oss-120b"),  # Judge uses a different model (OpenAI's open-weight gpt-oss-120b) from the agents above
            description=dedent("""\
                You are an expert RAG system evaluator with deep expertise in:
                - Information retrieval quality assessment
                - Response accuracy evaluation
                - Source attribution verification
                - Context relevance analysis
                - Natural language generation evaluation
            """),
            instructions=dedent("""\
                Evaluate the RAG system output based on these key metrics:

                1. Faithfulness (1-5):
                   - How accurately does the response reflect the source documents?
                   - Are there any hallucinations or incorrect statements?
                   - Does it maintain factual consistency?

                2. Context Relevance (1-5):
                   - Are the retrieved passages relevant to the query?
                   - Is important context missing?
                   - Is irrelevant information included?

                3. Answer Completeness (1-5):
                   - Does the response fully address the query?
                   - Are all key aspects covered?
                   - Is the level of detail appropriate?

                4. Source Attribution (1-5):
                   - Are sources properly cited?
                   - Is it clear which information comes from where?
                   - Can claims be traced back to sources?

                5. Response Coherence (1-5):
                   - Is the response well-structured?
                   - Does it flow logically?
                   - Is it easy to understand?

                Provide specific examples and explanations for each score.
            """),
            expected_output=dedent("""\
                # RAG Evaluation Report

                ## Overview
                Query: {query}
                Response Length: {n_chars} characters

                ## Metric Scores

                ### Faithfulness: {score}/5
                - Justification:
                - Examples:
                - Areas for Improvement:

                ### Context Relevance: {score}/5
                - Justification:
                - Examples:
                - Areas for Improvement:

                ### Answer Completeness: {score}/5
                - Justification:
                - Examples:
                - Areas for Improvement:

                ### Source Attribution: {score}/5
                - Justification:
                - Examples:
                - Areas for Improvement:

                ### Response Coherence: {score}/5
                - Justification:
                - Examples:
                - Areas for Improvement:

                ## Overall Score: {total}/25

                ## Key Recommendations
                1. {rec1}
                2. {rec2}
                3. {rec3}

                ## Summary
                {final_assessment}
            """),
            markdown=True,
        )

    def evaluate(self, query: str, response: str, context: list, stream: bool = True):
        """
        Evaluate a RAG system's response

        Args:
            query (str): Original user query
            response (str): RAG system's response
            context (list): Retrieved passages used for the response
            stream (bool): Whether to stream the evaluation output
        """
        evaluation_prompt = f"""
        Please evaluate this RAG system output:

        QUERY:
        {query}

        RETRIEVED CONTEXT:
        {' '.join(context)}

        RESPONSE:
        {response}

        Provide a detailed evaluation following the metrics and format specified.
        """

        return self.evaluator.print_response(evaluation_prompt, stream=stream)


# Initialize evaluator
evaluator = RAGEvaluator()
print("LLM-as-a-Judge Evaluator initialized successfully!")

LLM-as-a-Judge Evaluator initialized successfully!


In [29]:
# Example evaluation with placeholder data. Replace query, context, and response
# with actual financial RAG outputs and rerun.

query = "What are the key features of transformer models?"
context = [
    "Transformer models use self-attention mechanisms to process input sequences.",
    "Key features include parallel processing and handling of long-range dependencies."
]
response = "Transformer models are characterized by their self-attention mechanism..."

# Run evaluation
evaluator.evaluate(query, response, context)

Output()
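
Because the judge writes its scores into a markdown report, they can be pulled back out for tracking across runs. The sketch below assumes the report text is available as a string in the format requested by `expected_output` above (`print_response` only prints, so the text would have to be captured some other way, for example from the result of the agent's `run` method). The `parse_scores` helper and the sample report are hypothetical, and the sample scores are made up.

```python
import re

# Metric names as they appear in the evaluator's expected_output template.
METRICS = [
    "Faithfulness",
    "Context Relevance",
    "Answer Completeness",
    "Source Attribution",
    "Response Coherence",
]

def parse_scores(report: str) -> dict:
    """Extract '<metric>: <n>/5' scores from a markdown evaluation report."""
    scores = {}
    for metric in METRICS:
        match = re.search(rf"{re.escape(metric)}:\s*(\d)/5", report)
        if match:
            scores[metric] = int(match.group(1))
    return scores

# Made-up report fragment, used only to exercise the parser.
sample_report = """\
### Faithfulness: 4/5
### Context Relevance: 5/5
### Answer Completeness: 2/5
### Source Attribution: 1/5
### Response Coherence: 4/5
"""

scores = parse_scores(sample_report)
print(scores, "total:", sum(scores.values()), "/ 25")
```

Metrics the judge omits are simply missing from the returned dictionary, so a caller can detect an incomplete report by comparing its keys against `METRICS`.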

## Papers:
1. The Rise and Potential of Large Language Model Based Agents: A Survey: https://arxiv.org/pdf/2309.07864
2. Self-Reflection in LLM Agents: Effects on Problem-Solving Performance: https://arxiv.org/pdf/2405.06682v3
3. Agent Laboratory: Using LLM Agents as Research Assistants: https://arxiv.org/pdf/2501.04227v1