- **College:** University of San Diego
- **Course:** Natural Language Processing and GenAI (AAI-520)
- **Final Project:** Financial Analysis System Powered by Agentic AI
- **Professor:** Kahila Mokhtari Jadid
- **Team Members:** Pros Loung, Dennis Arpurayil, Divya Kamath 
- **GitHub Repository:** (https://github.com/ploung/AAI_520_NLP_Final_Project.git)



## 1. Project Introduction

This project builds a financial analysis system powered by agentic AI: a system that can reason, plan, and act by coordinating multiple specialized LLM agents to handle complex financial tasks end to end. Agentic AI moves beyond scripted flows by routing tasks, critiquing its own output, and improving iteratively. The multi-agent stack is designed to parse news, earnings, and market signals at scale.

## 2. System Architecture
### 2.1 Agent Design
- Orchestrator Agent  
- Data Collection Agents (Yahoo Finance, NewsAPI, SEC)  
- Analysis Agents (Sentiment, Fundamentals, Quant)  
- Decision Agent  
- Critic / Evaluator Agent  
- Memory Module  
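
One way the agents listed above might be wired together is a small orchestrator that runs them in order over a shared context. This is a minimal sketch, not the project's implementation: the `Orchestrator` class and the three lambda agents are hypothetical stand-ins, and the price values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an agent reads the shared context and returns new keys for it.
AgentFn = Callable[[Dict], Dict]

@dataclass
class Orchestrator:
    # (name, function) pairs, executed in registration order
    agents: List[Tuple[str, AgentFn]] = field(default_factory=list)

    def register(self, name: str, fn: AgentFn) -> None:
        self.agents.append((name, fn))

    def run(self, symbol: str) -> Dict:
        context: Dict = {"symbol": symbol, "trace": []}
        for name, fn in self.agents:
            context.update(fn(context))      # each agent adds its findings
            context["trace"].append(name)    # record which agents ran
        return context

# Stub agents with made-up data, standing in for the real data/analysis/decision agents
orchestrator = Orchestrator()
orchestrator.register("data_collection", lambda ctx: {"prices": [100.0, 101.5]})
orchestrator.register(
    "analysis",
    lambda ctx: {"trend": "up" if ctx["prices"][-1] > ctx["prices"][0] else "down"},
)
orchestrator.register(
    "decision",
    lambda ctx: {"recommendation": "buy" if ctx["trend"] == "up" else "hold"},
)

result = orchestrator.run("AAPL")
print(result["recommendation"], result["trace"])
```

Keeping the context as a plain dictionary makes it easy for a critic or memory module to inspect what each agent contributed.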

### 2.2 Workflow Patterns
- Prompt Chaining  
- Routing  
- Evaluator–Optimizer  

*(Insert diagram of pipeline here)*

# 3. Set Up the Environment

In [None]:
# Uncomment and run the following lines to install necessary packages
#!pip install yfinance
#!pip install openai
#!pip install langchain 
#!pip install pandas 
#!pip install numpy 
#!pip install requests
#!pip install langchain-community

# 4. Import libraries

In [None]:
import yfinance as yf 
import requests
#import pandas as pd
#import matplotlib.pyplot as plt
#from datetime import datetime, timedelta

# 4. Design Financial Data Retrieval Agent

In [None]:
# Define stock list to fetch data for and analyze
stock_list = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA", "NVDA", "META"]

# Create a function to fetch stock data
def fetch_stock_data(symbol):
    stock = yf.Ticker(symbol)
    return stock.history(period="1mo")

# Example usage
stock_result = list(map(fetch_stock_data, stock_list)) # Fetch data for all stocks in the list

# Display stock history data
for stock, stock_data in zip(stock_list, stock_result):
    print(stock, stock_data.head(), "\n")  # Print the first few rows of each stock's data
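
The price history fetched above could feed a simple quantitative signal. The sketch below is an assumption about what such a step might look like, not part of the original notebook: `compute_signals` is a hypothetical helper, the window sizes are arbitrary, and synthetic prices are used so the example runs without network access. With real data, `stock_data["Close"].tolist()` could be passed in instead.

```python
from statistics import mean
from typing import Sequence

def compute_signals(closes: Sequence[float], short_window: int = 5, long_window: int = 20) -> dict:
    """Compare a short and a long moving average of closing prices."""
    if len(closes) < long_window:
        raise ValueError("not enough prices for the long window")
    short_ma = mean(closes[-short_window:])   # average of the most recent closes
    long_ma = mean(closes[-long_window:])     # average over the longer lookback
    total_return = closes[-1] / closes[0] - 1
    return {
        "short_ma": short_ma,
        "long_ma": long_ma,
        "return": total_return,
        "trend": "bullish" if short_ma > long_ma else "bearish",
    }

# Synthetic, steadily rising prices (not real market data)
prices = [100.0 + i for i in range(25)]
signals = compute_signals(prices)
print(signals)
```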


# 5. Design News Retrieval and Processing Agent

In [None]:
import requests

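def get_news(symbol, api_key):
    """Return the content of NewsAPI articles that mention `symbol`."""
    url = "https://newsapi.org/v2/everything"
    response = requests.get(url, params={"q": symbol, "apiKey": api_key}, timeout=10)
    response.raise_for_status()  # Surface HTTP errors such as an invalid API key
    articles = response.json().get("articles", [])
    # Article content can be missing, so skip empty entries
    return [article["content"] for article in articles if article.get("content")]

def preprocess_news(news_list):
    """Lowercase each article and flatten line breaks."""
    return [news.lower().replace("\r", " ").replace("\n", " ") for news in news_list]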

In [None]:
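import os

# Assumption: the NewsAPI key is supplied through a NEWSAPI_KEY environment variable
api_key = os.environ.get("NEWSAPI_KEY", "YOUR_NEWSAPI_KEY")

stock_news_dict = {}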

for symbol in stock_list:
    raw_news = get_news(symbol, api_key)
    cleaned_news = preprocess_news(raw_news)
    stock_news_dict[symbol] = cleaned_news

# Display the first three news articles for each stock symbol
for symbol, news_items in stock_news_dict.items():
    print(f"\nNews for {symbol}:")
    for i, item in enumerate(news_items[:3]):  # Show first 3 articles
        print(f"{i+1}. {item[:100]}...")  # Preview first 100 characters

In [70]:
stock_news_list = []  # List of (symbol, cleaned_news) tuples

for symbol in stock_list:
    raw_news = get_news(symbol, api_key)
    cleaned_news = preprocess_news(raw_news)
    stock_news_list.append((symbol, cleaned_news))

# Display the first three news articles for each stock symbol
for symbol, news_items in stock_news_list:
    print(f"\nNews for {symbol}:")
    for i, item in enumerate(news_items[:3]):  # Show first 3 articles
        print(f"{i+1}. {item[:100]}...")  # Preview first 100 characters


News for AAPL:
1. let's turn now to our stock of the day. apple's new iphone 17 and iphone air are officially availabl...
2. first up, markets are looking to start the week in the gr...
3. yahoo finance's john hyland tracks tuesday's top moving stocks and biggest market stories in this ma...

News for MSFT:
1. big data refers to a vast and diverse collection of structured, unstructured and semi-structured dat...
2. d-wave quantum qbts and rigetti computing rgti are stepping into the spotlight as speculative pure p...
3. shares of voice ai technology company soundhound ai (nasdaq:soun) jumped 7% in the afternoon session...

News for GOOGL:
1. 08 september 2025, bavaria, munich: the google brand logo (logo, symbol, emblem) can be seen at the ...
2. ionq (ionq) is a pioneering quantum computing company driving the industry forward with its advanced...
3. the year 2025 has been tricky for google parent alphabet  (googl) and its patient investors, but the...

News for AMZN:
1. ankara, turkiy

# 6. Prompt Chaining

In [None]:
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
openai_llm = OpenAI(openai_api_key="YOUR_API_KEY")  # Replace with your actual OpenAI API key

template = PromptTemplate(input_variables=["text"], template="Classify this financial news: {text}")
chain = LLMChain(prompt=template, llm=openai_llm)

def classify_news(news_list):
    return [chain.run(text=news) for news in news_list]

#classified_news = classify_news(preprocess_news(get_news("AAPL", "YOUR_NEWSAPI_KEY")))  # Replace with your actual NewsAPI key


# 6. Routing Logic Agent

In [None]:
def route_content(classification):
    label = classification.lower()  # LLM output may vary in capitalization
    if "earnings" in label:
        return "Earnings Analyzer"
    elif "market" in label:
        return "Market Analyzer"
    else:
        return "General News Analyzer"

# 6. Evaluator–Optimizer

In [None]:
def evaluate_output(output):
    score = len(output)  # Placeholder for actual scoring logic
    return score

def optimize_output(output, score):
    if score < 100:
        return output + " [Refined]"
    return output
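
The two placeholder functions above could be combined into a bounded loop that keeps refining until the score clears a threshold. The following is a rough sketch under that assumption: `score_output`, `refine_output`, and `refine_until_good` are hypothetical names, and the length-based score simply mirrors the placeholder metric rather than a real quality measure.

```python
from typing import List, Tuple

def score_output(text: str) -> int:
    # Placeholder metric: text length, as in the placeholder scoring above
    return len(text)

def refine_output(text: str) -> str:
    # Placeholder refinement: a real system would re-prompt the LLM here
    return text + " [Refined]"

def refine_until_good(text: str, target: int = 100, max_rounds: int = 5) -> Tuple[str, List[str]]:
    """Refine until the score reaches `target` or `max_rounds` is exhausted."""
    history = [text]
    for _ in range(max_rounds):
        if score_output(text) >= target:
            break
        text = refine_output(text)
        history.append(text)
    return text, history

final_text, history = refine_until_good("AAPL looks strong.", target=40)
print(len(history) - 1, "refinement rounds ->", final_text)
```

The `max_rounds` cap matters in practice: without it, an evaluator that can never be satisfied would loop forever and keep spending LLM calls.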

# 7. Memory Across Runs

In [None]:
memory_log = []

def store_memory(note):
    memory_log.append(note)

def retrieve_memory():
    return memory_log[-3:]  # Last 3 notes
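
Because `memory_log` lives only in the Python process, its notes disappear when the kernel restarts. One possible way to persist them across runs is a small JSON file. This is a sketch only: the `FileMemory` class and the `memory_log.json` file name are illustrative assumptions, and a temporary directory is used here so the example leaves nothing behind in the project folder.

```python
import json
import os
import tempfile
from pathlib import Path

class FileMemory:
    """Append-only note store persisted as a JSON list (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self) -> list:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return []

    def store(self, note: str) -> None:
        notes = self._load()
        notes.append(note)
        self.path.write_text(json.dumps(notes), encoding="utf-8")

    def retrieve(self, n: int = 3) -> list:
        return self._load()[-n:]  # Last n notes, oldest first

# Illustrative file location inside a temporary directory
memory_path = os.path.join(tempfile.mkdtemp(), "memory_log.json")
memory = FileMemory(memory_path)
for note in ["note 1", "note 2", "note 3", "note 4"]:
    memory.store(note)

# A fresh instance simulates a later run reading the same file
recent = FileMemory(memory_path).retrieve()
print(recent)
```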

## 4. Agent Functions
We implement the four required capabilities:

1. **Planning** – agent outlines research steps.  
2. **Dynamic Tool Use** – APIs, datasets.  
3. **Self-Reflection** – evaluation of outputs.  
4. **Learning Across Runs** – simple memory store.

In [None]:
# Example: Orchestrator planning steps for a ticker
def plan_research(symbol):
    steps = [
        f"Fetch price history for {symbol}",
        f"Fetch latest news for {symbol}",
        "Run sentiment analysis on news",
        "Compute technical signals",
        "Generate investment recommendation",
        "Critique and refine output"
    ]
    return steps

plan_research("AAPL")

## 5. Workflow Pattern 1 – Prompt Chaining
Pipeline: News → Preprocess → Classify → Extract → Summarize

In [None]:
# Example: Fetch and preprocess news
news_sample = [
    {"title": "Apple beats earnings expectations", "text": "Apple reported higher than expected Q2 earnings..."}
]

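# Named differently from preprocess_news in the news agent, which expects plain strings
def preprocess_sample_news(news):
    return [n["text"].lower() for n in news]

processed = preprocess_sample_news(news_sample)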
processed

In [None]:
# Sentiment Classification (placeholder)
def classify_sentiment(text):
    return {"sentiment": "positive", "score": 0.8}

classified = classify_sentiment(processed[0])
classified
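
Until an LLM or a trained model replaces the placeholder, a keyword lexicon can serve as a rough stand-in for sentiment scoring. The sketch below is an assumption, not the project's classifier: `lexicon_sentiment` is a hypothetical function and the two word lists are small, hand-picked examples rather than a validated financial lexicon.

```python
# Illustrative word lists only; a real lexicon would be much larger
POSITIVE = {"beat", "beats", "higher", "strong", "record", "growth"}
NEGATIVE = {"miss", "misses", "lower", "weak", "loss", "decline"}

def lexicon_sentiment(text: str) -> dict:
    """Score text in [-1, 1] from counts of positive and negative keywords."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    if total == 0:
        return {"sentiment": "neutral", "score": 0.0}
    score = (pos - neg) / total
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"sentiment": label, "score": score}

result = lexicon_sentiment("apple reported higher than expected q2 earnings...")
print(result)
```

The return value keeps the same `sentiment` and `score` keys as the placeholder, so it could be swapped in without changing downstream code.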

In [None]:
# Entity extraction (placeholder)
def extract_entities(text):
    return ["Apple", "Q2 earnings"]

entities = extract_entities(processed[0])
entities
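
A rule-based extractor is one simple way to replace the hard-coded entity list while the real model is still pending. This is only a sketch: `extract_entities_simple` and the `KNOWN_COMPANIES` lookup are hypothetical, and the lookup covers just three of the tickers in `stock_list` for illustration.

```python
import re

# Illustrative subset of company-name-to-ticker mappings
KNOWN_COMPANIES = {"apple": "AAPL", "microsoft": "MSFT", "tesla": "TSLA"}

def extract_entities_simple(text: str) -> list:
    """Return tickers for known company names plus any fiscal quarters mentioned."""
    lowered = text.lower()
    entities = [
        ticker
        for name, ticker in KNOWN_COMPANIES.items()
        if re.search(rf"\b{name}\b", lowered)
    ]
    # Fiscal quarters such as q1..q4, reported in upper case
    entities += [quarter.upper() for quarter in re.findall(r"\bq[1-4]\b", lowered)]
    return entities

found = extract_entities_simple("apple reported higher than expected q2 earnings...")
print(found)
```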

In [None]:
# Summarization (placeholder)
def summarize(text):
    return "Apple beat Q2 earnings expectations, likely positive market reaction."

summary = summarize(processed[0])
summary
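
The individual steps above can be tied together so that each stage consumes the output of the previous ones, which is the core idea of prompt chaining. The sketch below assumes a shared state dictionary and uses simple lambdas as stand-ins for LLM calls; `run_chain` and the stage names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Dict], object]]

def run_chain(article: Dict, stages: List[Stage]) -> Dict:
    """Run each (name, function) stage in order, storing results in a shared state."""
    state: Dict = {"raw": article["text"]}
    for name, stage in stages:
        state[name] = stage(state)  # later stages can read earlier results
    return state

# Stub stages standing in for LLM prompts: Preprocess -> Classify -> Extract -> Summarize
stages: List[Stage] = [
    ("clean", lambda s: s["raw"].lower().replace("\n", " ")),
    ("sentiment", lambda s: "positive" if "higher" in s["clean"] else "neutral"),
    ("entities", lambda s: [w for w in ("apple", "q2") if w in s["clean"]]),
    ("summary", lambda s: f"{s['sentiment']} news about {', '.join(s['entities'])}"),
]

chain_result = run_chain(
    {"title": "Apple beats earnings expectations",
     "text": "Apple reported higher than expected Q2 earnings..."},
    stages,
)
print(chain_result["summary"])
```

Because every intermediate result stays in the state, a critic agent could later inspect which stage produced a questionable output.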

## 6. Workflow Pattern 2 – Routing
Route each input to the appropriate specialized agent.

In [None]:
def router(input_type, data):
    if input_type == "news":
        return classify_sentiment(data)
    elif input_type == "price":
        return "Quant Agent output"
    elif input_type == "filing":
        return "Fundamentals Agent output"
    else:
        return "Unknown input type"

router("news", "Apple reports record revenue")
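
As more agents are added, the `if`/`elif` chain could be swapped for a lookup table that maps input types to handlers. This is a sketch of that alternative, not the notebook's router: the three handler functions and the `dispatch` name are hypothetical stubs.

```python
from typing import Callable, Dict

# Stub handlers standing in for the specialized agents
def news_agent(data) -> str:
    return f"Sentiment Agent received: {data}"

def quant_agent(data) -> str:
    return f"Quant Agent received {len(data)} prices"

def fundamentals_agent(data) -> str:
    return f"Fundamentals Agent received filing: {data}"

ROUTES: Dict[str, Callable] = {
    "news": news_agent,
    "price": quant_agent,
    "filing": fundamentals_agent,
}

def dispatch(input_type: str, data):
    """Look up the handler for `input_type` and call it with `data`."""
    handler = ROUTES.get(input_type)
    if handler is None:
        raise ValueError(f"Unknown input type: {input_type!r}")
    return handler(data)

print(dispatch("price", [101.2, 102.8, 103.1]))
```

Raising an error for unknown types, instead of returning a string, makes routing mistakes visible to the orchestrator immediately.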

## 7. Workflow Pattern 3 – Evaluator–Optimizer
Generate analysis → Evaluate → Refine

In [None]:
# Simple evaluator-optimizer loop
def generate_analysis(symbol, signals):
    return f"Analysis for {symbol}: Buy based on signals {signals}"

def evaluate_analysis(analysis):
    if "Buy" in analysis:
        return "Good, but lacks risk discussion"
    return "Needs improvement"

def refine_analysis(analysis, feedback):
    return analysis + f" | Refined: {feedback}"

analysis = generate_analysis("AAPL", {"sentiment": "positive", "quant": "bullish"})
feedback = evaluate_analysis(analysis)
refined = refine_analysis(analysis, feedback)
refined

## 8. Demonstration – End-to-End Run
- Choose a stock symbol (e.g., AAPL, TSLA, AMZN)  
- Run through all agents and workflows  
- Show final investment recommendation with rationale  

In [None]:
symbol = "AAPL"

steps = plan_research(symbol)
print("Planned steps:", steps)

# Price data
price_data = yf.download(symbol, period="6mo", interval="1d")
price_data["Close"].plot(title=f"{symbol} Price Trend")

# Run agents (stub example)
headline = "Apple reported strong iPhone sales this quarter."
news_summary = summarize(headline)
sentiment = classify_sentiment(headline)
# Pass only the sentiment label so the thesis text stays readable
decision = generate_analysis(symbol, {"sentiment": sentiment["sentiment"], "quant": "bullish"})
final_output = refine_analysis(decision, evaluate_analysis(decision))

print("Final Investment Thesis:", final_output)

## 9. Evaluation & Iteration
- Example of system critiquing itself and re-running an agent  
- Show before/after outputs  
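
A before/after comparison for this section might be produced along the following lines. This is a sketch with stubbed logic: `draft_thesis` and `critique` are hypothetical functions, the critic only checks for a risk section, and the risk text is a placeholder rather than real analysis.

```python
def draft_thesis(symbol: str, feedback: str = "") -> str:
    """Stub generator: re-running with critic feedback adds the missing section."""
    thesis = f"{symbol}: Buy on positive sentiment and bullish trend."
    if "risk" in feedback.lower():
        thesis += " Risks: <placeholder risk discussion>."
    return thesis

def critique(thesis: str) -> str:
    """Stub critic: return feedback, or an empty string when nothing is missing."""
    return "" if "Risks:" in thesis else "Add a risk discussion."

before = draft_thesis("AAPL")
feedback = critique(before)
# Re-run the generator only when the critic found a problem
after = draft_thesis("AAPL", feedback) if feedback else before

print("Before:", before)
print("Critic:", feedback)
print("After: ", after)
```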

## 10. Conclusion
- What worked well  
- Challenges  
- Future improvements (better data sources, advanced models, memory persistence)  