# Financial News Analysis Research Assistant

https://github.com/Almudenagarrido/NLP-Multi-Agent-System

⚠️ Important:
This repository includes all modules and scripts developed by the team.
These files are imported into the main notebook and contain the core functionality of the project, representing the majority of the development work.

Team members: Jacobo Banus, Almudena Garrido & Christopher San Filippo

## Multi-Agent System for Real-Time Financial Analysis

### Project Overview:

This system implements a multi-agent architecture that analyzes financial news in real time and answers user queries with context-aware responses. The pipeline chains specialized agents that retrieve, analyze, and synthesize financial information.

### Core Architecture:

- **News Retrieval Agent**: Fetches real-time financial news from multiple sources

- **Sentiment Analysis Agent**: Performs sentiment scoring on news content

- **Topic Classification Agent**: Routes queries to appropriate domain specialists

- **Specialist Agents**: Domain-specific models fine-tuned for financial topics

- **Evaluation & Optimization**: Continuous improvement through feedback loops

- **Memory System**: Maintains conversation history and context

**Technical Stack**: Python, HuggingFace Transformers, PyTorch, Custom Fine-tuned Models

### 1. System Initialization & Agent Imports

*This section imports all specialized agents that form our multi-agent architecture, each responsible for a specific task in the financial analysis pipeline.*

In [1]:
import sys
import os
sys.path.append(os.path.abspath('../'))

from utils.ticker_finder import get_ticker_from_company_name
from memory.memory_agent import MemoryAgent
from agents.topic_classifier_agent import TopicClassifier
from agents.news_retrieval_agent import NewsRetrievalAgent
from agents.sentiment_analysis_agent import SentimentAnalysisAgent
from agents.specialist_agent import Specialist
from agents.evaluator_optimizer_agent import EvaluatorOptimizer



### 2. User Query Processing

**Objective**: Process natural language financial queries from users. In production, this would come from an API endpoint or user interface.

In [2]:
query = "What is NVIDIA's pricing strategy for their new graphics cards?"

### 3. Company Identification & Historical Context

**Function**: Extracts the company name from the user query and converts it to a stock ticker symbol for accurate financial data retrieval.

**Memory**: Loads the past interactions stored for that ticker, so the system can keep conversation context and give consistent, personalized responses.

In [3]:
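companies = ["apple", "microsoft", "google", "tesla", "nvidia", "intel"]
company_name = None
for company in companies:
    if company in query.lower():
        company_name = company
        break  # stop at the first match
if company_name is None:
    raise ValueError("No known company found in the query")

ticker = get_ticker_from_company_name(company_name)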

memory_agent = MemoryAgent("../memory/history.json")
stored_queries = memory_agent.load_entries(ticker)
stored_queries

[]
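
The lookup step can be sketched on its own. This is a minimal sketch, not the project's implementation: the `TICKERS` mapping and the `find_company` helper are hypothetical stand-ins for whatever `get_ticker_from_company_name` does internally, and the whole-word match is an added assumption meant to avoid substring hits.

```python
import re
from typing import Optional

# Hypothetical static mapping used only for illustration; the project's
# get_ticker_from_company_name may resolve tickers differently.
TICKERS = {
    "apple": "AAPL",
    "microsoft": "MSFT",
    "google": "GOOGL",
    "tesla": "TSLA",
    "nvidia": "NVDA",
    "intel": "INTC",
}


def find_company(query: str) -> Optional[str]:
    """Return the first company (in TICKERS order) that appears as a whole word."""
    lowered = query.lower()
    for name in TICKERS:
        # \b keeps "intel" from matching inside "intelligence"
        if re.search(rf"\b{re.escape(name)}\b", lowered):
            return name
    return None


company = find_company("What is NVIDIA's pricing strategy for their new graphics cards?")
ticker = TICKERS[company] if company else None
```

Returning `None` instead of raising lets the caller decide how to handle a query that names no known company.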

### 4. Real-Time News Retrieval

**Capability**: Connects to financial news APIs to collect and filter relevant articles in real-time, ensuring responses are based on current market information.

In [4]:
news_retrieval_agent = NewsRetrievalAgent()

limit = 3
days_back = 7

json_data = news_retrieval_agent.get_news_json(company_names=companies, limit_per_source=limit, days_back=days_back)

if 'data' in json_data and ticker in json_data['data'] and 'articles' in json_data['data'][ticker]:
    news = [article["summary"] for article in json_data['data'][ticker]['articles']]
else:
    news = []
news

['Google, Amazon, Microsoft, and OpenAI are forging ahead with their in-house chip design efforts.',
 "Editor's Note: A typographical error in this story's headline has been corrected. Apple Inc. (NASDAQ:AAPL) CEO Tim Cook lauded Chinese app developers for their significant role in the company’s innovation ecosystem. Cook Praises China's Vibrant App ...",
 'The cryptocurrency market just experienced what traders are calling the worst liquidation event in history—a flash crash so severe it made the FTX collapse look tame by comparison, wiping out over $20 billion in leveraged positions and leaving retail ...',
 '[株式会社カナダグースジャパン]\n[画像1: https://prcdn.freetls.fastly.net/release_image/104161/76/104161-76-e43fb11dc0da592207b72f3a994e0727-3000x2001.jpg?width=536&quality=85%2C75&format=jpeg&auto=webp&fit=bounds&...',
 '[南足柄市]\n[画像1: https://prcdn.freetls.fastly.net/release_image/110226/24/110226-24-06a081824cff4257244dd794178677da-1920x1080.jpg?width=536&quality=85%2C75&format=jpeg&auto=webp&
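
The output above contains summaries that do not mention the queried company. One way to make the filtering step explicit is a keyword check on each summary. This is a sketch under assumptions: `filter_relevant` is a hypothetical helper, not part of `NewsRetrievalAgent`, and plain substring matching is only one possible relevance criterion.

```python
from typing import Iterable, List


def filter_relevant(summaries: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Keep the summaries that mention at least one keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [s for s in summaries if any(k in s.lower() for k in lowered)]


# First summary from the output above
retrieved = [
    "Google, Amazon, Microsoft, and OpenAI are forging ahead with their in-house chip design efforts.",
]
print(filter_relevant(retrieved, ["nvidia", "nvda"]))  # prints []
```

An empty result here would signal that the specialist has no company-specific context to work with.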

### 5. Sentiment Analysis Integration

**Analysis**: Performs sentiment scoring on retrieved content (positive, negative, neutral) to determine market implications of each event.

**Output**: Provides an overall sentiment label that guides the specialist agent's response tone and content.

In [5]:
sentiment_agent = SentimentAnalysisAgent()
news_with_sentiment = []

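# Fall back to an empty list if the response has no articles for this ticker
articles = json_data.get('data', {}).get(ticker, {}).get('articles', [])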

for article in articles:
    sentiment = sentiment_agent.predict_one(article["summary"])
    news_with_sentiment.append({
        "summary": article["summary"],
        "sentiment": sentiment.label,
        "sentiment_score": sentiment.score,
        "explanation": sentiment.explanation
    })

overall_sentiment = "neutral"
positive_scores = [n["sentiment_score"] for n in news_with_sentiment if n["sentiment"] == "positive"]
negative_scores = [n["sentiment_score"] for n in news_with_sentiment if n["sentiment"] == "negative"]

if positive_scores and max(positive_scores) > 0.7:
    overall_sentiment = "positive"
elif negative_scores and max(negative_scores) > 0.7:
    overall_sentiment = "negative"
news_with_sentiment

Device set to use cuda:0
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


[{'summary': 'Google, Amazon, Microsoft, and OpenAI are forging ahead with their in-house chip design efforts.',
  'sentiment': 'neutral',
  'sentiment_score': 0.9955110549926758,
  'explanation': 'Google, Amazon, Microsoft, and OpenAI are forging ahead with their in-house chip design efforts.'},
 {'summary': "Editor's Note: A typographical error in this story's headline has been corrected. Apple Inc. (NASDAQ:AAPL) CEO Tim Cook lauded Chinese app developers for their significant role in the company’s innovation ecosystem. Cook Praises China's Vibrant App ...",
  'sentiment': 'positive',
  'sentiment_score': 0.5184598565101624,
  'explanation': "Cook Praises China's Vibrant App ..."},
 {'summary': 'The cryptocurrency market just experienced what traders are calling the worst liquidation event in history—a flash crash so severe it made the FTX collapse look tame by comparison, wiping out over $20 billion in leveraged positions and leaving retail ...',
  'sentiment': 'negative',
  'sentim
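
The overall-label rule used in the cell above can be isolated in a small function. This is a sketch of the same thresholding logic; the function name `aggregate_sentiment` is hypothetical, and the dictionary keys follow the `news_with_sentiment` entries shown above.

```python
from typing import Dict, List


def aggregate_sentiment(items: List[Dict], threshold: float = 0.7) -> str:
    """Return 'positive' or 'negative' if any article of that label scores
    above the threshold (positive is checked first), otherwise 'neutral'."""

    def strongest(label: str) -> float:
        scores = [i["sentiment_score"] for i in items if i["sentiment"] == label]
        return max(scores, default=0.0)

    if strongest("positive") > threshold:
        return "positive"
    if strongest("negative") > threshold:
        return "negative"
    return "neutral"
```

Note that a high-confidence neutral article never changes the result, and a strong positive article takes precedence over a strong negative one.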

### 6. Intelligent Topic Routing

**Routing Logic**: Analyzes the user query to determine the most appropriate domain specialist, ensuring expert-level responses for different financial topics (corporate_business, market_analysis, earnings, etc.).

In [6]:
topic_classifier = TopicClassifier()
topic = topic_classifier.classify(query)
topic

'corporate_business'
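
Once a topic label is available, routing amounts to selecting the matching specialist model. The sketch below assumes every specialist lives under `../agents/specialized_agents/` in a `<topic>-generator` directory. Only the `corporate_business` path appears in this notebook's logs, so the other topics, the fallback behavior, and the `specialist_model_path` helper are assumptions.

```python
from pathlib import PurePosixPath

# Topics named in this document; the real classifier may know more.
KNOWN_TOPICS = {"corporate_business", "market_analysis", "earnings"}
MODELS_DIR = PurePosixPath("../agents/specialized_agents")


def specialist_model_path(topic: str, default: str = "corporate_business") -> str:
    """Map a topic label to a model directory, falling back to a default topic."""
    chosen = topic if topic in KNOWN_TOPICS else default
    return str(MODELS_DIR / f"{chosen}-generator")
```

Falling back to a default specialist is one design choice; raising an error for unknown topics would be equally reasonable.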

### 7. Domain Specialist Response Generation with Evaluation & Continuous Feedback

**Specialization**: Routes the query to a fine-tuned domain-specific model that generates human-readable answers with appropriate context, explanations, and financial insights.

In [15]:
specialist = Specialist(topic)
evaluator = EvaluatorOptimizer()

max_iterations = 3
target_score = 90
best_answer = None
best_score = 0
feedback = ""


for iteration in range(max_iterations):
    
    answer, prompt = specialist.respond(
        query=query, 
        news_summaries=news, 
        past_queries=stored_queries, 
        SA_label=overall_sentiment,
        feedback=feedback
    )
    
    evaluation = evaluator.evaluate_response(
        original_query=query,
        news_summaries=news,
        past_queries=stored_queries,
        specialist_response=answer
    )
    
    current_score = evaluation["overall_score"]
    print(f"\n🔄 Iteration {iteration + 1}/{max_iterations}")
    print(f"📊 Evaluation Score: {current_score}/100")
    print(f"💡 Feedback: {evaluation['actionable_feedback']}")
    
    if current_score > best_score:
        best_score = current_score
        best_answer = answer
    
    if current_score >= target_score:
        print(f"✅ Target score {target_score} achieved!")
        break
    else:
        feedback = evaluation["actionable_feedback"]
        print("🔄 Attempting improvement with feedback...")
        
        if evaluation["critical_issues"]:
            critical_feedback = ". ".join(evaluation["critical_issues"])
            feedback += f" Focus on: {critical_feedback}"

final_answer = best_answer if best_answer else answer
final_score = best_score if best_answer else current_score

print(f"🎯 Final Score: {final_score}/100")
print(f"💬 Final Answer: {final_answer}")


Loading model for topic: corporate_business from ../agents/specialized_agents/corporate_business-generator
Using GPU

🔄 Iteration 1/3
📊 Evaluation Score: 51/100
💡 Feedback: Consider incorporating more specific data from the news. Make better use of the available context and history.
🔄 Attempting improvement with feedback...

🔄 Iteration 2/3
📊 Evaluation Score: 51/100
💡 Feedback: Consider incorporating more specific data from the news. Make better use of the available context and history.
🔄 Attempting improvement with feedback...

🔄 Iteration 3/3
📊 Evaluation Score: 51/100
💡 Feedback: Consider incorporating more specific data from the news. Make better use of the available context and history.
🔄 Attempting improvement with feedback...
🎯 Final Score: 51/100
💬 Final Answer: NVIDIA's pricing strategy for their new graphics cards is to sell them at a lower price than their predecessors.
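
In the run above, all three iterations received the same feedback and the same score. A variant of the loop can stop early when the feedback stops changing. This is a sketch, not the notebook's implementation: `refine` is a hypothetical helper, `generate` and `evaluate` are placeholder callables standing in for `specialist.respond` and `evaluator.evaluate_response`, and the early-stop rule is an added assumption.

```python
from typing import Callable, Dict, Tuple


def refine(
    generate: Callable[[str], str],
    evaluate: Callable[[str], Dict],
    max_iterations: int = 3,
    target_score: int = 90,
) -> Tuple[str, int]:
    """Generate, score, and retry with feedback; return the best answer and its score."""
    best_answer, best_score = "", -1
    feedback = ""
    for _ in range(max_iterations):
        answer = generate(feedback)
        result = evaluate(answer)
        score = result["overall_score"]
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= target_score:
            break
        new_feedback = result["actionable_feedback"]
        if new_feedback == feedback:
            # Identical feedback would likely produce the same prompt again
            break
        feedback = new_feedback
    return best_answer, best_score
```

Passing the agents in as callables keeps the loop testable without loading any model.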


Let's take a look at the prompt that was used:

In [20]:
print(f"💬 Prompt the Specialized Agent received:\n\n{prompt}")

💬 Prompt the Specialized Agent received:

You are a domain-specific assistant for corporate_business.

Query: What is NVIDIA's pricing strategy for their new graphics cards?

News Summaries:
    Google, Amazon, Microsoft, and OpenAI are forging ahead with their in-house chip design efforts.
Editor's Note: A typographical error in this story's headline has been corrected. Apple Inc. (NASDAQ:AAPL) CEO Tim Cook lauded Chinese app developers for their significant role in the company’s innovation ecosystem. Cook Praises China's Vibrant App ...
The cryptocurrency market just experienced what traders are calling the worst liquidation event in history—a flash crash so severe it made the FTX collapse look tame by comparison, wiping out over $20 billion in leveraged positions and leaving retail ...
[株式会社カナダグースジャパン]
[画像1: https://prcdn.freetls.fastly.net/release_image/104161/76/104161-76-e43fb11dc0da592207b72f3a994e0727-3000x2001.jpg?width=536&quality=85%2C75&format=jpeg&auto=webp&fit=bounds&...


### 8. Knowledge Persistence & Storage

**Memory Management**: Stores all interactions with timestamps and context, building a comprehensive knowledge base for future reference and contextual understanding.

In [8]:
entry = {
    "question": query,
    "articles": news,
    "SA": overall_sentiment,
    "feedback": feedback,
    "answer": final_answer  # best-scoring answer, not just the last iteration
}
memory_agent.save_entry(entry, ticker)
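
A JSON-backed store with the same two calls can be sketched as follows. The `JsonMemory` class is a hypothetical stand-in written for illustration. The file layout (one list of entries per ticker) and the `timestamp` field name are assumptions, not the actual `MemoryAgent` format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


class JsonMemory:
    """Hypothetical stand-in for MemoryAgent: one JSON file, entries grouped by ticker."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def _read(self) -> Dict[str, List[dict]]:
        # A missing file is treated as an empty history
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def load_entries(self, ticker: str) -> List[dict]:
        return self._read().get(ticker, [])

    def save_entry(self, entry: dict, ticker: str) -> None:
        data = self._read()
        # Stamp each entry with the UTC time at which it was saved
        stamped = {**entry, "timestamp": datetime.now(timezone.utc).isoformat()}
        data.setdefault(ticker, []).append(stamped)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(
            json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
        )
```

Writing with `ensure_ascii=False` keeps non-English article text readable in the stored file.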

## System Workflow Summary

1. Input: User submits financial query

2. Processing: System identifies company and retrieves historical context

3. Data Collection: Real-time news retrieval from financial APIs

4. Analysis: Sentiment scoring and topic classification

5. Routing: Query directed to appropriate domain specialist

6. Generation: Context-aware, expert response creation

7. Evaluation: Quality assessment and feedback incorporation

8. Storage: Knowledge persistence for future interactions

This pipeline demonstrates a multi-agent architecture that turns raw financial news into conversational answers for end users.
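
The workflow can also be expressed as a list of stages that pass a shared state object along. This is a structural sketch only: `PipelineState`, `Stage`, and `run_pipeline` are hypothetical names, and the two stages in the usage example are stubs that return fixed values rather than calling the real agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Values accumulated as a query moves through the workflow."""
    query: str
    ticker: str = ""
    news: List[str] = field(default_factory=list)
    sentiment: str = "neutral"
    topic: str = ""
    answer: str = ""


Stage = Callable[[PipelineState], PipelineState]


def run_pipeline(state: PipelineState, stages: List[Stage]) -> PipelineState:
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        state = stage(state)
    return state


# Stub stages for illustration; real stages would wrap the agents.
def identify_company(state: PipelineState) -> PipelineState:
    state.ticker = "NVDA"
    return state


def classify_topic(state: PipelineState) -> PipelineState:
    state.topic = "corporate_business"
    return state


result = run_pipeline(
    PipelineState(query="What is NVIDIA's pricing strategy for their new graphics cards?"),
    [identify_company, classify_topic],
)
```

Keeping each step behind the same signature makes it straightforward to reorder, skip, or test stages individually.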