# Master in Applied Artificial Intelligence

## Course: *Fintech and Digital Innovation in Finance*

### **Course Project – Part 3**
### Retrieval-Augmented Generation with S&P 500 news

---

**Institution:** Tecnológico de Monterrey

**Instructors:** Marie-Ève Malette, Yetnalezi Quintas Ruiz

**Author:** Alejandro Díaz Villagómez | A01276769

**Date:** August 11th, 2025

---

# Introduction to Retrieval-Augmented Generation with S&P 500 news

In this notebook, you will explore how to build a simple Retrieval-Augmented Generation (RAG) pipeline using financial news articles from S&P 500 companies.

We'll start by vectorizing text data, creating a vector store using FAISS, and integrating it with OpenAI's GPT models to answer questions using retrieved information.

This workflow emulates real-world systems in finance where natural language data (news, filings, analyst reports) are used to support decision-making.

# 📌 Objectives

By the end of this notebook, students will be able to:

1. **Perform Semantic Search with Metadata Filtering:**
   - Query the provided FAISS vector store to retrieve relevant financial news articles based on natural language questions.
   - Apply optional filters using metadata such as ticker or publication date to refine search results.

2. **Enrich Data with Company Metadata:**
   - Use the `yfinance` library to retrieve company-level metadata (company name, sector, industry) for tickers in the dataset.
   - Integrate this metadata to support enhanced filtering and analysis of news data.

3. **Build a Retrieval-Augmented Generation (RAG) Pipeline:**
   - Combine retrieved news snippets as context to generate answers using OpenAI’s GPT models.
   - Construct effective prompts that guide the language model to provide concise, context-aware responses.

4. **Evaluate and Analyze RAG Outputs:**
   - Review generated answers alongside the supporting news excerpts.
   - Reflect on the strengths and limitations of the simple RAG pipeline and consider potential improvements, such as adding more filters or refining retrieval strategies.

5. **Incorporate Financial Metadata into Retrieval Context:**
   - Enrich retrieved news snippets with key financial metadata including ticker, company name, sector, and industry.
   - Format prompts that combine both text excerpts and metadata to provide richer context to the language model.

6. **Generate Context-Aware Answers Using OpenAI Models:**
   - Construct and send prompts to an LLM that leverage both news content and metadata to produce concise, informed financial analysis.

7. **Compare Answers With and Without Metadata:**
   - Evaluate the impact of including financial metadata on answer quality using criteria such as clarity, detail, accuracy, and contextual relevance.
   - Summarize findings to reflect on the role of metadata in improving retrieval-augmented generation.

## Install and import the required libraries

First, we install and import the necessary libraries for:
- Text embedding generation (sentence-transformers)
- Efficient similarity search (faiss)
- Data manipulation (pandas, numpy)
- Visualization (matplotlib)

> ℹ️ FAISS computes cosine similarity as an inner product over L2-normalized vectors, which is why the embeddings are normalized before indexing.

In [1]:
# !pip install -q pandas numpy matplotlib faiss-cpu sentence-transformers scikit-learn openai python-dotenv yfinance tqdm
# !pip freeze > ../requirements.txt

In [2]:
from sentence_transformers import SentenceTransformer
import faiss
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from collections import Counter
import matplotlib.pyplot as plt
import yfinance as yf
import time
from openai import OpenAI
import os
from dotenv import load_dotenv
from tqdm import tqdm
import textwrap

load_dotenv()

True

## Load news data
We load a CSV file of financial news, focusing on TITLE and SUMMARY, along with metadata like TICKER and PUBLICATION_DATE.
These will be embedded into vectors and used for semantic retrieval.

In [3]:
K = 25

In [4]:
df_news = pd.read_csv('df_news.csv')
df_news['PUBLICATION_DATE'] = pd.to_datetime(df_news['PUBLICATION_DATE']).dt.date
display(df_news.head())

| | TICKER | TITLE | SUMMARY | PUBLICATION_DATE | PROVIDER | URL |
|---|---|---|---|---|---|---|
| 0 | MMM | 2 Dow Jones Stocks with Promising Prospects an... | The Dow Jones (^DJI) is made up of 30 of the m... | 2025-05-29 | StockStory | https://finance.yahoo.com/news/2-dow-jones-sto... |
| 1 | MMM | 3 S&P 500 Stocks Skating on Thin Ice | The S&P 500 (^GSPC) is often seen as a benchma... | 2025-05-27 | StockStory | https://finance.yahoo.com/news/3-p-500-stocks-... |
| 2 | MMM | 3M Rises 15.8% YTD: Should You Buy the Stock N... | MMM is making strides in the aerospace, indust... | 2025-05-22 | Zacks | https://finance.yahoo.com/news/3m-rises-15-8-y... |
| 3 | MMM | Q1 Earnings Roundup: 3M (NYSE:MMM) And The Res... | Quarterly earnings results are a good time to ... | 2025-05-22 | StockStory | https://finance.yahoo.com/news/q1-earnings-rou... |
| 4 | MMM | 3 Cash-Producing Stocks with Questionable Fund... | While strong cash flow is a key indicator of s... | 2025-05-19 | StockStory | https://finance.yahoo.com/news/3-cash-producin... |


In [5]:
df_news['EMBEDDED_TEXT'] = df_news['TITLE'] + ' : ' + df_news['SUMMARY']

In [6]:
model = SentenceTransformer('all-MiniLM-L6-v2')

## Implement FAISS vector store
We:
- Use a pre-trained sentence transformer (all-MiniLM-L6-v2) to embed documents.
- Normalize vectors to use cosine similarity.
- Create a FAISS index and implement a basic search function.

This will allow us to retrieve relevant news snippets given a natural language question.
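
The normalization step is what makes an inner-product index behave like a cosine-similarity search. The following pure-Python sketch checks that equivalence on two toy vectors; the helper names (`l2_normalize`, `dot`, `cosine`) are illustrative and not part of FAISS or NumPy.

```python
import math

def l2_normalize(vec):
    """Scale a vector so that its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity computed directly from its definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [4.0, 3.0]

cos_ab = cosine(a, b)                          # 24 / (5 * 5)
ip_ab = dot(l2_normalize(a), l2_normalize(b))  # inner product after normalization

print(round(cos_ab, 6), round(ip_ab, 6))  # prints 0.96 0.96
```

Because both values agree, ranking by inner product on normalized embeddings gives the same order as ranking by cosine similarity.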
 

In [7]:
# Load model and compute embeddings
text_embeddings = model.encode(df_news['EMBEDDED_TEXT'].tolist(), convert_to_numpy=True)

# Normalize embeddings to use cosine similarity (via inner product in FAISS)
text_embeddings = text_embeddings / np.linalg.norm(text_embeddings, axis=1, keepdims=True)

# Prepare metadata
documents = df_news['EMBEDDED_TEXT'].tolist()
metadata = [
    {
        'PUBLICATION_DATE': row['PUBLICATION_DATE'], 
        'TICKER': row['TICKER'], 
        'PROVIDER': row['PROVIDER']
    }
    for _, row in df_news.iterrows()
]

In [8]:
embedding_dim = text_embeddings.shape[1]
faiss_index = faiss.IndexFlatIP(embedding_dim)  # Cosine similarity via inner product
faiss_index.add(text_embeddings)
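
`IndexFlatIP` is an exhaustive index: it scores the query against every stored vector and returns the highest inner products. As a rough mental model only, the same behaviour can be sketched in pure Python; `flat_ip_search` is a hypothetical stand-in and not a FAISS function.

```python
import heapq

def flat_ip_search(index_vectors, query, k):
    """Score every stored vector against the query and keep the k best."""
    scores = [
        (sum(q * v for q, v in zip(query, vec)), i)
        for i, vec in enumerate(index_vectors)
    ]
    top = heapq.nlargest(k, scores)  # highest inner product first
    return [(i, score) for score, i in top]

# Toy 2-dimensional "embeddings", already unit length
vectors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
query = [0.8, 0.6]

hits = flat_ip_search(vectors, query, k=2)
print([i for i, _ in hits])  # [2, 0]
```

The cost grows linearly with the number of stored vectors, which is acceptable for a dataset of this size.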

In [10]:
class FaissVectorStore:
    def __init__(self, model, index, embeddings, documents, metadata):
        self.model = model
        self.index = index
        self.embeddings = embeddings
        self.documents = documents
        self.metadata = metadata

    def search(self, query, k=5, metadata_filter=None):
        """Return up to k (document, metadata, similarity) tuples for the query.

        metadata_filter, if given, is a callable that takes a metadata dict and
        returns True for documents that should be searched.
        """
        # Embed and L2-normalize the query so inner product equals cosine similarity
        query_embedding = self.model.encode([query], convert_to_numpy=True)
        query_embedding = query_embedding / np.linalg.norm(query_embedding, axis=1, keepdims=True)
        query_embedding = query_embedding.astype('float32')

        if metadata_filter:
            # Build a temporary index over the documents that pass the filter
            filtered_indices = [i for i, meta in enumerate(self.metadata) if metadata_filter(meta)]
            if not filtered_indices:
                return []
            filtered_embeddings = self.embeddings[filtered_indices].astype('float32')
            temp_index = faiss.IndexFlatIP(filtered_embeddings.shape[1])
            temp_index.add(filtered_embeddings)
            k_eff = min(k, filtered_embeddings.shape[0])
            D, I = temp_index.search(query_embedding, k_eff)
            indices = [filtered_indices[i] for i in I[0]]
        else:
            # Cap k at the index size; FAISS pads missing results with index -1
            k_eff = min(k, self.index.ntotal)
            D, I = self.index.search(query_embedding, k_eff)
            indices = I[0]
        scores = D[0]

        return [
            (self.documents[idx], self.metadata[idx], float(sim))
            for idx, sim in zip(indices, scores)
        ]


In [11]:
# Create FAISS-based store
faiss_store = FaissVectorStore(
    model=model,
    index=faiss_index,
    embeddings=text_embeddings,
    documents=documents,
    metadata=metadata
)
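
The `metadata_filter` argument of `search` accepts any callable that receives a metadata dict and returns `True` or `False`. A small factory such as the one below is one possible way to build ticker and date filters; `make_filter` is a hypothetical helper, and the third sample row is made up for illustration.

```python
from datetime import date

def make_filter(ticker=None, start=None, end=None):
    """Build a predicate over metadata dicts; None means 'no constraint'."""
    def predicate(meta):
        if ticker is not None and meta["TICKER"] != ticker:
            return False
        if start is not None and meta["PUBLICATION_DATE"] < start:
            return False
        if end is not None and meta["PUBLICATION_DATE"] > end:
            return False
        return True
    return predicate

# Sample metadata in the same shape as the `metadata` list built earlier
sample = [
    {"TICKER": "MMM", "PUBLICATION_DATE": date(2025, 5, 29), "PROVIDER": "StockStory"},
    {"TICKER": "MMM", "PUBLICATION_DATE": date(2025, 5, 19), "PROVIDER": "StockStory"},
    {"TICKER": "AOS", "PUBLICATION_DATE": date(2025, 5, 27), "PROVIDER": "Zacks"},  # made-up row
]

recent_mmm = make_filter(ticker="MMM", start=date(2025, 5, 20))
kept = [m for m in sample if recent_mmm(m)]
print(len(kept))  # 1
```

With the store created above, the same predicate could be passed as `faiss_store.search("3M earnings outlook", k=5, metadata_filter=recent_mmm)`.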

### Setup OpenAI Client

👉 **Instructions**:
- Import the `OpenAI` client from the `openai` Python library.
- You will need an **OpenAI API key** to use their models programmatically:
  - Go to [https://platform.openai.com/](https://platform.openai.com/) and sign up or log in.
  - Create an API key from your [API keys dashboard](https://platform.openai.com/account/api-keys).
  - ⚠️ **Keep your API key private** and **do not** share or hardcode it in public notebooks.
- Note that **usage of the OpenAI API is not free**. You will need to:
  - Add a payment method.
  - Monitor your usage to avoid unexpected charges.
  - Optionally set usage limits from your account settings.
- You can refer to the **course’s Study Resources** for a step-by-step guide on creating an OpenAI account and retrieving your API key.
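
One way to keep the key out of the notebook is to read it from an environment variable and fail early when it is missing. The helper below is a minimal sketch; `require_env` is a hypothetical name and not part of the `openai` or `python-dotenv` libraries.

```python
import os

def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file."
        )
    return value
```

After `load_dotenv()` has loaded a `.env` file, the client could be created with `OpenAI(api_key=require_env("API_KEY_OPEN_API"))`, where the variable name is whichever one you chose in that file.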

Then:
- Initialize the client with `OpenAI(api_key=...)`, loading the key from an environment variable instead of typing it into the notebook.
- Send a test request using `.responses.create()` and the `"gpt-4o-mini"` model with a simple prompt:

  ```python
  response = client.responses.create(
      model="gpt-4o-mini",
      input="Write a one-sentence bedtime story about a unicorn."
  )
  print(response.output_text)
  ```


In [12]:
# CODE HERE
# Use as many coding cells as you need

client = OpenAI(
    api_key=os.getenv('API_KEY_OPEN_API')
)

response = client.responses.create(
    model="gpt-4o-mini",
    input="Write a one-sentence bedtime story about a unicorn."
)

print(response.output_text)

As the moonlight danced on the shimmering lake, Luna the unicorn spread her magical wings and soared into the starry sky, promising to sprinkle dreams of joy and wonder to every child asleep below.


## Retrieve Additional Metadata from Yahoo Finance

👉 **Instructions**:
- We will enrich our news dataset by retrieving **company-level metadata** using the `yfinance` library.
- The goal is to map each unique stock ticker (`TICKER`) in the dataset to:
  - `COMPANY_NAME`
  - `SECTOR`
  - `INDUSTRY`

> ℹ️ `yfinance` fetches live data from Yahoo Finance. If you're running this in a cloud environment or during peak hours, expect some tickers to fail or rate limits to apply.

✅ After this step, you will have a new DataFrame (e.g. `df_meta`) with the columns `TICKER`, `COMPANY_NAME`, `SECTOR`, `INDUSTRY` that maps tickers to their company names, sectors, and industries. This metadata will be useful later to add filters and analysis based on sector or industry categories.
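
Once a ticker-to-metadata mapping of this kind exists, it can drive sector-level filters through the vector store's `metadata_filter` argument. The sketch below assumes the rows are available as plain dicts (for instance via `df_meta.to_dict("records")`); the three sample rows and the `sector_filter` helper are illustrative only.

```python
# Sample rows in the shape this step is expected to produce
meta_rows = [
    {"TICKER": "MMM", "COMPANY_NAME": "3M Company",
     "SECTOR": "Industrials", "INDUSTRY": "Conglomerates"},
    {"TICKER": "ABT", "COMPANY_NAME": "Abbott Laboratories",
     "SECTOR": "Healthcare", "INDUSTRY": "Medical Devices"},
    {"TICKER": "ACN", "COMPANY_NAME": "Accenture plc",
     "SECTOR": "Technology", "INDUSTRY": "Information Technology Services"},
]

# Lookup table: ticker -> sector
sector_by_ticker = {row["TICKER"]: row["SECTOR"] for row in meta_rows}

def sector_filter(sector):
    """Build a predicate that keeps news whose ticker belongs to `sector`."""
    def predicate(meta):
        # Tickers without metadata (failed fetches) never match
        return sector_by_ticker.get(meta["TICKER"]) == sector
    return predicate

healthcare_only = sector_filter("Healthcare")
print(healthcare_only({"TICKER": "ABT"}), healthcare_only({"TICKER": "MMM"}))  # True False
```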


In [13]:
# CODE HERE
# Use as many coding cells as you need

def fetch_meta_for_ticker(ticker: str, max_retries: int = 3, base_sleep: float = 0.7):
    """
    Fetch metadata for a given stock ticker from Yahoo Finance.
    Returns a dict with metadata for the ticker.
    Retries with backoff if Yahoo responds with error/rate limit.
    """
    for attempt in range(max_retries):
        try:
            # May take time or error out during peak hours
            info = yf.Ticker(ticker).get_info()
            return {
                "TICKER": ticker,
                "COMPANY_NAME": info.get("longName") or info.get("shortName"),
                "SECTOR": info.get("sector"),
                "INDUSTRY": info.get("industry") or info.get("industryKey"),
            }
        except Exception as e:
            print(f"Error fetching {ticker}: {e}")
            # Exponential backoff
            time.sleep(base_sleep * (2 ** attempt))
    # If no success, return empty (you can review them later)
    return {"TICKER": ticker, "COMPANY_NAME": None, "SECTOR": None, "INDUSTRY": None}
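
To see what the backoff in `fetch_meta_for_ticker` costs, the delay schedule `base_sleep * 2 ** attempt` can be computed on its own. The helper name `backoff_delays` is hypothetical; the defaults mirror the function above.

```python
def backoff_delays(max_retries=3, base_sleep=0.7):
    """Seconds slept after each failed attempt: base_sleep * 2 ** attempt."""
    return [base_sleep * (2 ** attempt) for attempt in range(max_retries)]

delays = backoff_delays()
print(delays)                 # [0.7, 1.4, 2.8]
print(round(sum(delays), 1))  # 4.9
```

A ticker that fails on every attempt therefore adds roughly 4.9 seconds of waiting, since the function sleeps after each failed attempt.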

In [14]:
# Step 1: Fetch unique tickers from the DataFrame
unique_tickers = df_news['TICKER'].unique()
print(f"Total unique tickers: {len(unique_tickers)}")
print("First 10 Unique Tickers:")
print("-" * 30)
print(unique_tickers[:10])  # Print first 10 unique tickers

Total unique tickers: 490
First 10 Unique Tickers:
------------------------------
['MMM' 'AOS' 'ABT' 'ABBV' 'ACN' 'ADBE' 'AMD' 'AES' 'AFL' 'A']


In [15]:
# Step 2: Fetch metadata for each ticker
rows = [
    fetch_meta_for_ticker(t) 
    for t in tqdm(unique_tickers, desc="Fetching Yahoo Finance metadata")
]

Fetching Yahoo Finance metadata: 100%|██████████| 490/490 [02:41<00:00,  3.04it/s]


In [16]:
# Step 3: Create a DataFrame from the metadata
df_meta = pd.DataFrame(rows)
print(f"Metadata DataFrame Head: {df_meta.shape}")
print("-" * 30)
display(df_meta.head())

Metadata DataFrame Head: (490, 4)
------------------------------


| | TICKER | COMPANY_NAME | SECTOR | INDUSTRY |
|---|---|---|---|---|
| 0 | MMM | 3M Company | Industrials | Conglomerates |
| 1 | AOS | A. O. Smith Corporation | Industrials | Specialty Industrial Machinery |
| 2 | ABT | Abbott Laboratories | Healthcare | Medical Devices |
| 3 | ABBV | AbbVie Inc. | Healthcare | Drug Manufacturers - General |
| 4 | ACN | Accenture plc | Technology | Information Technology Services |


## Retrieval-Augmented Generation (RAG): Retrieve Documents and Generate Answers

👉 **Instructions**:

In this part of the assignment, your task is to build a simple Retrieval-Augmented Generation (RAG) pipeline that:

- Takes a user question as input.
- Searches the FAISS vector store to find a set of relevant financial news articles based on semantic similarity.
- Uses the retrieved news articles as context to generate a clear, concise answer to the question by interacting with the OpenAI language model.
- Returns both the generated answer and the underlying news snippets used for context.

### What you need to focus on:

- Implement a retrieval mechanism to query your vector store and obtain the top relevant documents for any question.
- Construct prompts that effectively combine retrieved news content with the user’s question to guide the language model’s response.
- Use the OpenAI API to generate answers grounded in the retrieved context.
- Organize the outputs so that for each question, you have:
  - The generated answer.
  - The collection of news excerpts used to produce that answer.

### What you will be provided:

- Helper functions to display outputs in markdown format.
- Lists of example questions covering topics, companies, and industries to test your implementation.

---

Your solution can take any form or structure you find appropriate, as long as it fulfills these core objectives. This exercise will give you hands-on experience with integrating retrieval and generation for practical applications in finance.


#### Print markdown
You can use the following function to render answers from `gpt-4o-mini` as Markdown.

In [17]:
from IPython.display import Markdown, display

def print_markdown(text):
    display(Markdown(text))

#### Predefined questions

In [18]:
questions_topic = [
"What are the major concerns expressed in financial news about inflation?",
"How is investor sentiment described in recent financial headlines?",
"What role is artificial intelligence playing in recent finance-related news stories?"
]

questions_company = [
"How is Microsoft being portrayed in news stories about artificial intelligence?",
"What financial news headlines connect Amazon with automation or logistics?"
]

questions_industry = [
"What are the main themes emerging in financial news about the semiconductor industry?",
"What trends are being reported in the retail industry?",
"What risks or challenges are discussed in recent news about the energy industry?"
]

In [19]:
# CODE HERE
# Use as many coding cells as you need

def rag_answer(question, k=5):
    # Step 1: retrieve the k most similar news items from FAISS
    results = faiss_store.search(question, k=k)
    context_snippets = [
        f"- {meta['PUBLICATION_DATE']} | {meta['TICKER']} | {doc}"
        for doc, meta, _ in results
    ]

    # Step 2: build the prompt. The template is dedented before the values are
    # filled in: dedenting an f-string that already contains the multi-line,
    # unindented context would find no common indent and strip nothing.
    context_text = "\n".join(context_snippets)
    template = textwrap.dedent("""
    You are a financial analyst assistant.
    Use ONLY the information in the CONTEXT below to answer the QUESTION.
    If you lack enough information, say so.

    CONTEXT:
    {context_text}

    QUESTION:
    {question}

    Answer in 3-5 sentences, concise and factual.
    """).strip()
    prompt = template.format(context_text=context_text, question=question)

    # Step 3: generate the answer with the OpenAI model
    response = client.responses.create(
        model="gpt-4o-mini",
        input=prompt
    )

    # Step 4: return the answer together with the snippets used as context
    return response.output_text, context_snippets

In [20]:
# Questions about a topic
for q in questions_topic:
    answer, ctx = rag_answer(q, k=5)
    print_markdown(f"**Q:** {q}\n\n**Answer:** {answer}\n\n**Context:**\n" + "\n".join(ctx))

**Q:** What are the major concerns expressed in financial news about inflation?

**Answer:** The major concerns expressed in financial news about inflation include mounting apprehension over persistent US inflation, as highlighted in the Federal Reserve’s May policy meeting. This situation raises fears of an economic slowdown and dampens hopes for a rate cut. Additionally, food inflation is specifically noted as a significant factor impacting economic expectations alongside ongoing tariff issues.

**Context:**
- 2025-05-29 | BLK | Bitcoin price slips as Fed minutes flag US inflation risks : The Federal Reserve’s May policy meeting revealed mounting concern over persistent US inflation and the potential for economic slowdown.
- 2025-05-31 | TSLA | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | NVDA | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | LULU | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | AVGO | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead

**Q:** How is investor sentiment described in recent financial headlines?

**Answer:** Investor sentiment in recent financial headlines appears to be a mix of optimism and skepticism. Articles highlight that while many stocks are showing positive performances and analysts have set ambitious price targets, there is caution due to potential institutional pressures leading to overly optimistic forecasts. Additionally, some stocks are under scrutiny for having questionable fundamentals, prompting a need for investors to be wary of bullish ratings. Overall, the environment suggests a dichotomy between bullish outlooks and the need for critical evaluation of analyst recommendations.

**Context:**
- 2025-05-26 | KMX | 3 of Wall Street’s Favorite Stocks Facing Headwinds : Wall Street has set ambitious price targets for the stocks in this article. While this suggests attractive upside potential, it’s important to remain skeptical because analysts face institutional pressures that can sometimes lead to overly optimistic forecasts.
- 2025-05-20 | MCHP | 3 Hyped Up  Stocks Facing Headwinds : Great things are happening to the stocks in this article. They’re all outperforming the market over the last month because of positive catalysts such as a new product line, constructive news flow, or even a loyal Reddit fanbase.
- 2025-05-06 | MPWR | 1 of Wall Street’s Favorite Stock with Impressive Fundamentals and 2 to Think Twice About : The stocks in this article have caught Wall Street’s attention in a big way, with price targets implying returns above 20%. But investors should take these forecasts with a grain of salt because analysts typically say nice things about companies so their firms can win business in other product lines like M&A advisory.
- 2025-05-21 | DRI | 1 Unpopular Stock that Should Get More Attention and 2 to Steer Clear Of : When Wall Street turns bearish on a stock, it’s worth paying attention. These calls stand out because analysts rarely issue grim ratings on companies for fear their firms will lose out in other business lines such as M&A advisory.
- 2025-05-23 | RVTY | 3 of Wall Street’s Favorite Stocks with Questionable Fundamentals : Wall Street is overwhelmingly bullish on the stocks in this article, with price targets suggesting significant upside potential. However, it’s worth remembering that analysts rarely issue sell ratings, partly because their firms often seek other business from the same companies they cover.

**Q:** What role is artificial intelligence playing in recent finance-related news stories?

**Answer:** Artificial intelligence is increasingly influencing various sectors, including finance, by enhancing productivity and reducing human error. Companies like Jack Henry & Associates are integrating AI-driven technologies to improve lending processes. Meta Platforms is investing in AI applications, but its stock is currently valued based on its legacy business. Additionally, firms like Palantir and Upstart are utilizing AI to meet demand from government and commercial clients, focusing on credit risk assessment and generative AI capabilities. These developments suggest a growing interest in AI as a tool for financial innovation and operational efficiency.

**Context:**
- 2025-03-17 | JKHY | Jack Henry (JKHY) Integrates AI-Driven Lending Tech With Algebrik : We recently published a list of 12 AI News Investors Should Not Miss This Week. In this article, we are going to take a look at where Jack Henry & Associates, Inc. (NASDAQ:JKHY) stands against other AI news Investors should not miss this week. Artificial Intelligence (AI) is known to increase productivity, decrease human error, […]
- 2025-05-31 | META | This "Magnificent Seven" Stock Is Set to Skyrocket If Its AI Investments Pay Off : Meta Platforms has investments in several AI applications.  The tech giant's stock is only valued on its legacy business.  Over the past two-and-a-half years, investors have heard about various artificial intelligence (AI) investments that tech companies are making.
- 2025-05-31 | PLTR | Billionaires Are Buying 2 Artificial Intelligence (AI) Stocks That Wall Street Analysts Say Can Soar Up to 240% : Several billionaire hedge fund managers bought shares of Palantir and/or Upstart in the first quarter -- stocks where certain analysts anticipate substantial upside.  Palantir is successfully tapping demand for artificial intelligence (AI) with government and commercial customers, but the stock trades at a very expensive valuation.  Upstart is generating attractive returns for lenders by helping them quantify credit risk with artificial intelligence, and the stock trades at a very reasonable valuation.
- 2025-05-31 | PLTR | Better Artificial Intelligence (AI) Stock: Palantir vs. Snowflake : Shares of both Palantir and Snowflake have delivered healthy gains in 2025 despite the broader stock market weakness.  Palantir stock has shot up 63% this year despite bouts of volatility.  Palantir Technologies helps commercial and government clients integrate generative AI capabilities into their operations with its Artificial Intelligence Platform (AIP), which was launched roughly two years ago.
- 2025-05-29 | NFLX | 2 Underrated Artificial Intelligence (AI) Stocks to Buy and Hold : Generative AI can simplify and speed up many tasks, including content production.  It's easy to see the potential for Netflix, whose content strategy is integral to its success.  Netflix's creations have attracted millions of viewers and won many awards.
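
In the inflation question above, four of the five retrieved snippets are the same article indexed under different tickers (TSLA, NVDA, LULU, AVGO), so the model sees only two distinct sources. One possible refinement, sketched here under the assumption that duplicates share identical text, is to drop repeated documents before building the context. `dedupe_results` is a hypothetical helper and the similarity scores in the sample are made up.

```python
def dedupe_results(results):
    """Keep the first (highest-ranked) hit for each distinct document text."""
    seen = set()
    unique = []
    for doc, meta, score in results:
        if doc in seen:
            continue
        seen.add(doc)
        unique.append((doc, meta, score))
    return unique

# Tuples in the (document, metadata, similarity) shape returned by search()
sample_results = [
    ("Bitcoin price slips as Fed minutes flag US inflation risks", {"TICKER": "BLK"}, 0.61),
    ("The Weekend: Food inflation dampens hopes of a rate cut", {"TICKER": "TSLA"}, 0.58),
    ("The Weekend: Food inflation dampens hopes of a rate cut", {"TICKER": "NVDA"}, 0.58),
]

unique = dedupe_results(sample_results)
print([meta["TICKER"] for _, meta, _ in unique])  # ['BLK', 'TSLA']
```

To still end up with `k` distinct snippets, the search could request more candidates than needed (for example `3 * k`) and truncate the list after deduplication.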

In [21]:
# Questions about a company
for q in questions_company:
    answer, ctx = rag_answer(q, k=5)
    print_markdown(f"**Q:** {q}\n\n**Answer:** {answer}\n\n**Context:**\n" + "\n".join(ctx))

**Q:** How is Microsoft being portrayed in news stories about artificial intelligence?

**Answer:** The provided context does not include any information about Microsoft or its portrayal in news stories related to artificial intelligence. Therefore, I cannot answer your question based on the given context.

**Context:**
- 2025-05-31 | META | This "Magnificent Seven" Stock Is Set to Skyrocket If Its AI Investments Pay Off : Meta Platforms has investments in several AI applications.  The tech giant's stock is only valued on its legacy business.  Over the past two-and-a-half years, investors have heard about various artificial intelligence (AI) investments that tech companies are making.
- 2025-05-29 | CRM | How Salesforce has 'overcorrected' by leaning into AI : D.A. Davidson head of technology research Gil Luria joins Market Domination to discuss Salesforce (CRM) earnings and the company's trajectory. Luria says Salesforce is "too focused" on artificial intelligence (AI), as the other parts of its business "rapidly" decelerate and the company loses market share to competitors. Luria has the equivalent of a Sell rating on the stock. To watch more expert insights and analysis on the latest market action, check out more Market Domination here.
- 2025-03-17 | JKHY | Jack Henry (JKHY) Integrates AI-Driven Lending Tech With Algebrik : We recently published a list of 12 AI News Investors Should Not Miss This Week. In this article, we are going to take a look at where Jack Henry & Associates, Inc. (NASDAQ:JKHY) stands against other AI news Investors should not miss this week. Artificial Intelligence (AI) is known to increase productivity, decrease human error, […]
- 2025-05-29 | NFLX | 2 Underrated Artificial Intelligence (AI) Stocks to Buy and Hold : Generative AI can simplify and speed up many tasks, including content production.  It's easy to see the potential for Netflix, whose content strategy is integral to its success.  Netflix's creations have attracted millions of viewers and won many awards.
- 2025-05-30 | META | Meta (META) AI Reaches 1 Billion Users, Eyes Paid Features and Subscriptions : We recently published a list of 10 AI Stocks on Wall Street’s Radar. In this article, we are going to take a look at where Meta Platforms, Inc. (NASDAQ:META) stands against other AI stocks on Wall Street’s radar. Meta Platforms, Inc. (NASDAQ:META) is a global technology company. On May 28, CNBC reported that Meta Platforms, Inc. (NASDAQ:META)’s artificial […]

**Q:** What financial news headlines connect Amazon with automation or logistics?

**Answer:** The financial news headlines connecting Amazon with automation and logistics include remarks by Matt Garman, CEO of Amazon Web Services, highlighting that every aspect of Amazon is leveraging artificial intelligence, which often intersects with automation technologies. Additionally, Truist's report discusses Woodward's increasing volumes and automation as drivers for its earnings growth, emphasizing the importance of these trends in the aerospace industry. Although it doesn't directly link to Amazon’s logistics, the mention of automation in relation to operational efficiencies ties back to Amazon's logistics operations in e-commerce and cloud services.

**Context:**
- 2025-05-25 | TFC | Truist Reiterates Buy on Amazon.com (AMZN) as Q2 Revenue Tracks Ahead : We recently published a list of 10 AI Stocks on Wall Street’s Radar. In this article, we are going to take a look at where Amazon.com Inc. (NASDAQ:AMZN) stands against other AI stocks on Wall Street’s radar. Amazon.com Inc. (NASDAQ:AMZN) is an American technology company offering e-commerce, cloud computing, and other services, including digital streaming […]
- 2025-05-30 | AMZN | Amazon's AI Roadmap With AWS CEO Garman : Every aspect of Amazon is leveraging artificial intelligence, says Matt Garman, CEO of Amazon Web Services. Garman discusses Amazon's AI roadmap and reflects on his first year in the role with Ed Ludlow on "Bloomberg Technology."
- 2025-05-23 | TFC | Woodward's Volumes, Automation to Drive Earnings Growth, Truist Says : Woodward's (WWD) increasing volumes, pricing, automation, and products will push its aerospace margi
- 2025-04-30 | AON | Top Stock Reports for Amazon.com, Johnson & Johnson & Cisco Systems : Today's Research Daily features new research reports on 16 major stocks, including Amazon.com, Inc. (AMZN), Johnson & Johnson (JNJ) and Cisco Systems, Inc. (CSCO), as well as a micro-cap NeurAxis, Inc. (NRXS).
- 2025-05-23 | CHRW | Winners And Losers Of Q1: C.H. Robinson Worldwide (NASDAQ:CHRW) Vs The Rest Of The Air Freight and Logistics Stocks : As the craze of earnings season draws to a close, here’s a look back at some of the most exciting (and some less so) results from Q1. Today, we are looking at air freight and logistics stocks, starting with C.H. Robinson Worldwide (NASDAQ:CHRW).
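
The Microsoft question retrieved no article tagged with Microsoft's ticker, which suggests that company questions may benefit from a ticker filter. A minimal sketch, assuming a hand-written alias table (in practice it could be derived from the `COMPANY_NAME` column of `df_meta`), is to detect company names in the question and turn them into a filter. `ALIASES` and `detect_tickers` are hypothetical.

```python
import re

# Hypothetical, hand-written alias table: lowercase company name -> ticker
ALIASES = {"microsoft": "MSFT", "amazon": "AMZN"}

def detect_tickers(question, aliases=ALIASES):
    """Return the sorted tickers whose alias appears as a word in the question."""
    words = set(re.findall(r"[a-z0-9&.]+", question.lower()))
    return sorted({ticker for name, ticker in aliases.items() if name in words})

q = "How is Microsoft being portrayed in news stories about artificial intelligence?"
tickers = detect_tickers(q)
print(tickers)  # ['MSFT']
```

The result could then be used as `metadata_filter=lambda m: m["TICKER"] in tickers`. A strict ticker filter has a cost, though: the top Amazon hit above is tagged `TFC`, so it would be excluded.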

In [22]:
# Questions about an industry
for q in questions_industry:
    answer, ctx = rag_answer(q, k=5)
    print_markdown(f"**Q:** {q}\n\n**Answer:** {answer}\n\n**Context:**\n" + "\n".join(ctx))

**Q:** What are the main themes emerging in financial news about the semiconductor industry?

**Answer:** The main themes emerging in the financial news about the semiconductor industry include significant investor attention on key companies like ON Semiconductor Corp., highlighted by discussions on their international revenue trends and how these metrics influence forecasts. Despite soft earnings, shareholder optimism seems to persist, indicating confidence in the stock's potential. Additionally, industry comparisons are being made with other semiconductor companies, showcasing a broader interest in assessing stocks that may offer substantial upside potential in the current market environment.

**Context:**
- 2025-05-13 | ON | Investing in ON Semiconductor Corp. (ON)? Don't Miss Assessing Its International Revenue Trends : Explore ON Semiconductor Corp.'s (ON) international revenue trends and how these numbers impact Wall Street's forecasts and what's ahead for the stock.
- 2025-05-21 | ON | ON Semiconductor Corporation (ON) is Attracting Investor Attention: Here is What You Should Know : Recently, Zacks.com users have been paying close attention to ON Semiconductor Corp. (ON). This makes it worthwhile to examine what the stock has in store.
- 2025-05-12 | ON | Some May Be Optimistic About ON Semiconductor's (NASDAQ:ON) Earnings : Soft earnings didn't appear to concern ON Semiconductor Corporation's ( NASDAQ:ON ) shareholders over the last week...
- 2025-05-29 | ADI | Spotting Winners: Vishay Intertechnology (NYSE:VSH) And Analog Semiconductors Stocks In Q1 : The end of an earnings season can be a great time to discover new stocks and assess how companies are handling the current business environment. Let’s take a look at how Vishay Intertechnology (NYSE:VSH) and the rest of the analog semiconductors stocks fared in Q1.
- 2025-05-11 | ON | ON Semiconductor (ON): Among Billionaire Glenn Russell Dubin’s Stock Picks with Huge Upside Potential : We recently published a list of Billionaire Glenn Russell Dubin’s 10 Stock Picks with Huge Upside Potential. In this article, we are going to take a look at where ON Semiconductor Corporation (NASDAQ:ON) stands against Billionaire Glenn Russell Dubin’s other stock picks with huge upside potential. Glenn Russell Dubin is one of the industry’s most […]

**Q:** What trends are being reported in the retail industry?

**Answer:** Recent trends in the retail industry indicate a significant decline in stock performance, with retail stocks dropping by 13.7% over the past six months, which is worse than the S&P 500’s 5.5% loss. This decline is attributed to volatility in consumer spending and the ongoing adaptation of retailers' business models to technological changes. Additionally, executives are adjusting supply chains and implementing price increases in response to legal rulings on tariffs, indicating challenges in navigating external economic factors. Overall, the performance has been adversely influenced by negative demand trends linked to economic cycles.

**Context:**
- 2025-05-12 | KMX | 3 Consumer Stocks That Concern Us : Retailers are adapting their business models as technology changes how people shop. Still, demand can be volatile as the industry is exposed to the ups and downs of consumer spending. This has stirred some uncertainty lately as retail stocks have tumbled by 13.7% over the past six months. This performance was worse than the S&P 500’s 5.5% loss.
- 2025-05-29 | BBY | Retailers, Ducking Trade-War Curveballs, Stick to Their Plans : As legal rulings roll in on Trump’s tariff policies, retail executives say they have shifted their supply chains and many price increases already have hit shelves.
- 2025-05-22 | HLT | 3 Consumer Stocks Skating on Thin Ice : The performance of consumer discretionary businesses is closely linked to economic cycles. Over the past six months, it seems like demand trends are working against their favor as the industry has tumbled by 12.3%. This drop was significantly worse than the S&P 500’s 2.1% decline.
- 2025-04-26 | PKG | Packaging Corporation of America (NYSE:PKG) Hasn't Managed To Accelerate Its Returns : What trends should we look for it we want to identify stocks that can multiply in value over the long term? Amongst...
- 2025-05-14 | APD | Air Products and Chemicals (NYSE:APD) Will Be Hoping To Turn Its Returns On Capital Around : To find a multi-bagger stock, what are the underlying trends we should look for in a business? Firstly, we'll want to...

**Q:** What risks or challenges are discussed in recent news about the energy industry?

**Answer:** Recent news highlights significant challenges in the energy industry, particularly for renewable energy stocks. A bill advancing in Congress could repeal crucial subsidies, potentially making renewable projects uneconomical as companies ramp up production. Additionally, the oilfield service sector faces a difficult future due to sliding oil prices, rising tariffs, and shrinking drilling budgets, raising concerns about whether LNG and AI demand can provide sufficient support.

**Context:**
- 2025-05-23 | NEE | Renewable Energy Stocks Crash as U.S. Advances Bill That Could Decimate the Industry : Congress is pushing forward a bill that could upend the renewable energy industry.  Just as companies have ramped up production and renewable electricity generation in the U.S., those projects may become uneconomical.  The news was about as bad as it could get for renewable energy stocks this week as the U.S. House of Representatives early Thursday passed a bill that will repeal some of the most important subsidies for the industry if it becomes law.
- 2025-05-23 | ENPH | Renewable Energy Stocks Crash as U.S. Advances Bill That Could Decimate the Industry : Congress is pushing forward a bill that could upend the renewable energy industry.  Just as companies have ramped up production and renewable electricity generation in the U.S., those projects may become uneconomical.  The news was about as bad as it could get for renewable energy stocks this week as the U.S. House of Representatives early Thursday passed a bill that will repeal some of the most important subsidies for the industry if it becomes law.
- 2025-05-21 | HAL | Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- 2025-05-21 | BKR | Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- 2025-05-21 | FCX | 3 American Companies Investors Need to Know Amid Trump's Tariff Wars : Copper is a critical metal for the U.S. industrial economy.  This American appliance maker expects the Trump administration to close loopholes that will improve its competitive positioning.  It's difficult to predict precisely what the tariff landscape will look like when the dust settles on the trade conflict, but we can say some things with a high degree of certainty.

## Analysis & Questions - Section 1

### Analysis and Reflection on Retrieval and Generation Results
After running the RAG pipeline and obtaining answers along with their supporting news excerpts, take some time to carefully review both the generated responses and the retrieved contexts.

- **For each question, read the answer and then the corresponding news snippets used as context.**

- Reflect on the following points and document your observations:
1. **Relevance** 
2. **Completeness**  
3. **Bias or Noise** 
4. **Consistency**  
5. **Improvement Ideas**   

Then answer the questions below:

#### **Question 1.** How well do the retrieved news snippets support the generated answer? Are the key facts or themes in the answer clearly grounded in the context?

**RESPONSE:**

For most of the questions, the retrieved news snippets provide a solid foundation for the generated answers, with key facts and themes traceable to the contextual excerpts. In the inflation question, for instance, the answer’s focus on persistent U.S. inflation risks, the Federal Reserve’s concerns, and the impact of food inflation is clearly grounded in multiple snippets referencing Fed minutes and tariff-related pressures. Similarly, the AI in finance response directly aligns with retrieved articles on Jack Henry, Palantir, Upstart, and Meta, reinforcing the connection between the stated themes and the source material.

However, there are cases where grounding is weaker. For example, in the Microsoft AI question, the retrieved context did not contain any Microsoft-specific snippets, leading to a correct model disclaimer about insufficient information. This indicates that retrieval precision can vary, and when relevant documents are missing, the generated content cannot be meaningfully anchored to the context. In general, when relevant matches are present, the answers remain well-supported and faithful to the retrieved material, but performance depends heavily on retrieval accuracy and coverage for the given query.
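The Microsoft case suggests a simple check before generation: confirm that at least one retrieved snippet carries the target ticker. The following is a minimal sketch, assuming results are `(doc, meta, score)` tuples with a `TICKER` key in `meta`, as returned by `faiss_store.search` in this notebook. The helper name `ticker_coverage` is hypothetical and the sample data is made up for illustration.

```python
def ticker_coverage(results, ticker):
    """Return the fraction of retrieved snippets whose metadata matches `ticker`."""
    if not results:
        return 0.0
    hits = sum(1 for _doc, meta, _score in results if meta.get("TICKER") == ticker)
    return hits / len(results)


# Illustrative (made-up) retrieval results, not real notebook output
sample_results = [
    ("Chipmaker reports quarterly earnings", {"TICKER": "NVDA"}, 0.81),
    ("Retailer adjusts its supply chain", {"TICKER": "BBY"}, 0.78),
]

coverage = ticker_coverage(sample_results, "MSFT")
if coverage == 0.0:
    print("No MSFT snippets retrieved; a company-specific answer cannot be grounded.")
```

A coverage of zero could be used to trigger a filtered retry (for example, restricting retrieval to the ticker) instead of sending an unrelated context to the model.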

#### **Question 2.** Does the answer fully address the question, or does it leave important aspects out? Consider if the retrieved context provided enough information to generate a thorough response.

**RESPONSE:**

In most cases, the answers adequately address the main intent of the questions, but, once again, the level of completeness varies depending on the richness of the retrieved context. For the inflation and AI in finance topics, the context contained multiple, diverse snippets covering different angles of the issue—policy concerns, economic impacts, and company-level applications—which allowed the model to produce comprehensive, multi-faceted responses. These answers not only reflected the central themes but also incorporated secondary details that added depth.

Conversely, some answers reveal clear gaps tied to retrieval limitations. The Microsoft AI question is a prime example: the absence of any Microsoft-related articles in the retrieved set meant that the model could not elaborate on the company’s portrayal and instead provided a non-answer. Similarly, certain industry-level questions, such as those on the semiconductor and retail sectors, offered relevant but somewhat narrow perspectives, omitting potential macroeconomic or competitive dynamics that might have been captured with broader retrieval. Overall, completeness is strong when retrieval returns diverse, directly relevant sources, but noticeably constrained when the context set is limited or too homogeneous.

#### **Question 3.** Are there any irrelevant or misleading snippets retrieved that may have influenced the answer? How might this affect the quality of the output?

**RESPONSE:**

Yes, there are instances where irrelevant or tangential snippets appear in the retrieved context, which can subtly dilute the precision of the generated answers. For example, in the Amazon automation/logistics question, some retrieved articles referenced broader industry players or unrelated companies (e.g., Woodward or C.H. Robinson) without a direct link to Amazon’s automation strategy. While these may provide peripheral industry context, they risk shifting the model’s focus away from the target entity and introducing speculative connections, as seen when the answer loosely associated logistics sector trends with Amazon’s operations.

Similarly, in a few industry-level queries, certain snippets addressed companies or events only indirectly tied to the question’s scope. Although the model often filters out clearly irrelevant details, the presence of loosely related material increases the chance of subtle bias (where the generated text emphasizes secondary themes or overgeneralizes trends not strongly supported by the question’s intent). This underscores the importance of refining retrieval filters (e.g., by ticker, sector, or keyword constraints) to ensure that the context remains tightly aligned with the query, thereby enhancing factual grounding and minimizing noise in the output.
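A keyword constraint of the kind mentioned above can be sketched as a post-retrieval filter. This is only an illustration under stated assumptions: results are `(doc, meta, score)` tuples, `keyword_filter` is a hypothetical helper, the sample snippets are invented, and plain substring matching is a crude stand-in for proper entity matching.

```python
def mentions_any(text, keywords):
    """True if any keyword occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(kw.lower() in lowered for kw in keywords)


def keyword_filter(results, keywords):
    """Keep only results whose snippet text mentions at least one keyword."""
    return [r for r in results if mentions_any(r[0], keywords)]


# Invented snippets for illustration only
sample_results = [
    ("Amazon expands warehouse robotics program", {"TICKER": "AMZN"}, 0.88),
    ("Woodward posts results for aerospace segment", {"TICKER": "WWD"}, 0.80),
    ("C.H. Robinson comments on freight demand", {"TICKER": "CHRW"}, 0.79),
]

filtered = keyword_filter(sample_results, ["Amazon"])
print([meta["TICKER"] for _doc, meta, _score in filtered])
```

Because filtering can leave fewer than `k` snippets, it would probably need to be paired with retrieving more candidates up front.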

#### **Question 4.**  Do the news snippets show consistent information, or are there conflicting viewpoints? How does the LLM handle potential contradictions in the context?

**RESPONSE:**

Overall, the retrieved news snippets tend to present consistent information within each query’s scope, with minimal direct contradictions. For example, in the inflation and AI in finance questions, the excerpts largely reinforce one another: multiple sources echo concerns about persistent inflationary pressures or highlight the strategic role of AI in specific companies, which allows the model to synthesize a coherent narrative without reconciling opposing claims.

In cases where slight variations in emphasis occur, such as differing levels of optimism in investor sentiment headlines or varying degrees of risk in energy industry coverage, the model appears to smooth these differences into a balanced response. It typically blends perspectives, acknowledging both positive and negative elements rather than explicitly flagging disagreement. This approach maintains a unified tone but can obscure the existence of genuine divergences in viewpoint.

While the absence of sharp contradictions simplifies generation, it also means the LLM has limited opportunity to demonstrate conflict resolution or source attribution skills. If future retrieval includes more polarized or dissenting coverage, prompt instructions could be adjusted to encourage the model to explicitly highlight and compare contrasting perspectives, thus enhancing analytical depth.
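One way such a prompt adjustment might look is sketched below. The rule wording, the `build_prompt` helper, and the placeholder snippet text are all assumptions made for illustration; they are not the prompt used elsewhere in this notebook.

```python
import textwrap

# Hypothetical rule wording; adjust as needed
CONTRADICTION_RULE = (
    "- If snippets disagree, name the conflicting items (date and ticker) "
    "and summarize each position separately instead of blending them."
)


def build_prompt(question, context_lines, extra_rules=()):
    """Assemble a grounded prompt with optional extra rules."""
    rules = ["- Use ONLY the CONTEXT below to answer the QUESTION.", *extra_rules]
    template = textwrap.dedent("""\
        Rules:
        {rules}

        CONTEXT:
        {context}

        QUESTION:
        {question}""")
    return template.format(
        rules="\n".join(rules),
        context="\n".join(context_lines),
        question=question,
    )


prompt = build_prompt(
    "How is investor sentiment described in recent financial headlines?",
    ["- 2025-05-26 | KMX | (snippet text)", "- 2025-05-20 | MCHP | (snippet text)"],
    extra_rules=[CONTRADICTION_RULE],
)
print(prompt)
```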

#### **Question 5.**  Based on your observations, suggest ways the retrieval or generation process could be improved (e.g., better filtering, adjusting `k`, refining prompt design).

**RESPONSE:**

Several targeted improvements could enhance both retrieval precision and generation quality.

* *Retrieval enhancements*

    * *Metadata filtering*: Apply constraints by TICKER, SECTOR, or INDUSTRY when the question clearly targets a company or industry. This would reduce noise from tangential entities (e.g., avoiding unrelated companies in the Amazon automation example).

    * *Hybrid search*: Combine semantic similarity with keyword or entity matching to ensure that high-similarity but irrelevant documents are deprioritized.

    * *Dynamic k adjustment*: Use a smaller k for narrow, entity-focused queries to limit noise, and a larger k for broad thematic queries to capture diversity of viewpoints.

* *Context preparation*

    * *Deduplication*: Remove near-identical snippets, particularly those from the same date and source, to avoid overweighting a single perspective.

    * *Snippet selection logic*: Prioritize diversity of dates, tickers, and subtopics to ensure context breadth.

* *Prompt refinement*

    * *Explicit grounding instruction*: Reinforce in the prompt that answers must cite or clearly reference specific context points to increase factual traceability.

    * *Contradiction handling*: Instruct the model to highlight conflicting viewpoints when present, rather than smoothing them into a single narrative.

    * *Context structuring*: Present retrieved items in a structured format (e.g., grouped by source or theme) to help the model organize reasoning.

Implementing these changes would likely improve relevance, reduce the inclusion of peripheral material, and strengthen the factual grounding and analytical richness of generated responses.
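As an illustration of the deduplication idea, the sketch below keeps only the highest-ranked copy of each snippet text. It assumes results arrive as `(doc, meta, score)` tuples sorted by relevance; `dedupe_results` is a hypothetical helper, the headlines are shortened from contexts in this notebook, and the scores are made up.

```python
def dedupe_results(results):
    """Keep the first (highest-ranked) result for each distinct snippet text."""
    seen = set()
    unique = []
    for doc, meta, score in results:
        key = " ".join(doc.lower().split())  # normalize case and whitespace
        if key not in seen:
            seen.add(key)
            unique.append((doc, meta, score))
    return unique


# The same article indexed under several tickers (scores are invented)
sample_results = [
    ("Bitcoin price slips as Fed minutes flag US inflation risks", {"TICKER": "BLK"}, 0.90),
    ("The Weekend: Food inflation dampens hopes of a rate cut", {"TICKER": "TSLA"}, 0.85),
    ("The Weekend: Food inflation dampens hopes of a rate cut", {"TICKER": "NVDA"}, 0.85),
    ("The Weekend:  Food inflation dampens hopes of a rate cut", {"TICKER": "LULU"}, 0.85),
]

deduped = dedupe_results(sample_results)
print([meta["TICKER"] for _doc, meta, _score in deduped])
```

Exact-text matching only catches verbatim repeats; near-duplicates with edited wording would need a similarity threshold instead.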

## 🧠 Retrieval-Augmented Generation (RAG) v2: Adding Financial Metadata to Improve Generation

👉 **Instructions**:

In this part of the assignment, you’ll enhance your Retrieval-Augmented Generation (RAG) pipeline by incorporating *financial metadata* to provide more contextually rich answers.

Your goal is to evaluate whether metadata such as **company name**, **sector**, and **industry** helps the LLM generate **more accurate and grounded answers** to financial questions.

---

### ✅ What your updated pipeline should do:

- Retrieve relevant financial news articles using semantic similarity with FAISS.
- Enrich each retrieved document with financial metadata:
  - Ticker symbol
  - Full company name
  - Sector (e.g., Technology, Energy)
  - Industry (e.g., Semiconductors, Retail)
- Construct prompts that include both:
  - Retrieved news text
  - Associated metadata
- Send the prompt to the OpenAI model to generate an informed response.
- Return:
  - The final answer
  - The exact set of contextual documents used to produce that answer

---

### 🧪 Evaluation and Comparison:

You will test your improved RAG pipeline on the same three types of questions provided earlier:
- **Topic-focused** (e.g., inflation, interest rates)
- **Company-focused** (e.g., questions about Tesla, Nvidia)
- **Industry-focused** (e.g., semiconductors, utilities)


In [23]:
# CODE HERE
# Use as many coding cells as you need

# Create an index for O(1)-like lookups
df_meta_idx = (
    df_meta
    .drop_duplicates(subset=["TICKER"])
    .set_index("TICKER")[["COMPANY_NAME", "SECTOR", "INDUSTRY"]]
)

print("Metadata index created for O(1) lookups.")
print("-" * 30)
display(df_meta_idx.head())

Metadata index created for O(1) lookups.
------------------------------


| TICKER | COMPANY_NAME | SECTOR | INDUSTRY |
|---|---|---|---|
| MMM | 3M Company | Industrials | Conglomerates |
| AOS | A. O. Smith Corporation | Industrials | Specialty Industrial Machinery |
| ABT | Abbott Laboratories | Healthcare | Medical Devices |
| ABBV | AbbVie Inc. | Healthcare | Drug Manufacturers - General |
| ACN | Accenture plc | Technology | Information Technology Services |


In [24]:
import pandas as pd  # needed for pd.isna; harmless if already imported


def lookup_meta_df(ticker: str):
    """Return (company, sector, industry) from df_meta; safe for missing tickers and values."""
    if ticker not in df_meta_idx.index:
        return ("Unknown", "Unknown", "Unknown")
    row = df_meta_idx.loc[ticker]

    def _clean(value):
        # NaN is truthy, so `value or "Unknown"` alone would let it through
        return "Unknown" if pd.isna(value) or value == "" else value

    return tuple(_clean(row[col]) for col in ("COMPANY_NAME", "SECTOR", "INDUSTRY"))
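One detail worth illustrating for lookups like this is how missing values behave. In Python, `NaN` is truthy, so an `or "Unknown"` fallback alone does not replace it. The sketch below uses only the standard library; `clean` is a hypothetical helper, and `pd.isna` would be a natural alternative when pandas is available.

```python
import math


def clean(value, default="Unknown"):
    """Replace None, NaN, or empty strings with a default label."""
    if value is None or value == "":
        return default
    if isinstance(value, float) and math.isnan(value):
        return default
    return value


missing = float("nan")
naive = missing or "Unknown"   # NaN is truthy, so the fallback is skipped
safe = clean(missing)

print(repr(naive), repr(safe))
```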

In [25]:
def build_enriched_snippets_df(results):
    """
    results: list of tuples (doc, meta, score) from faiss_store.search
    Returns a list of markdown lines that include metadata from df_meta.
    """
    enriched_lines = []
    for doc, m, score in results:
        ticker = str(m.get("TICKER", "") or "")
        company, sector, industry = lookup_meta_df(ticker)
        enriched_lines.append(
            f"- DATE: {m.get('PUBLICATION_DATE')} | "
            f"TICKER: {ticker} | COMPANY: {company} | "
            f"SECTOR: {sector} | INDUSTRY: {industry}\n"
            f"  SNIPPET: {doc}"
        )
    return enriched_lines

In [26]:
def make_metadata_filter_df(sector=None, industry=None, tickers=None):
    """
    Returns a function compatible with FaissVectorStore.metadata_filter
    that consults df_meta (via df_meta_idx) at retrieval time.
    """
    # Normalize inputs
    tickers = set(tickers) if tickers else None

    def _flt(meta):
        t = str(meta.get("TICKER", "") or "")
        # ticker filter
        if tickers is not None and t not in tickers:
            return False
        # sector/industry filters
        if (sector is not None) or (industry is not None):
            _, s, i = lookup_meta_df(t)  # company name is not needed for filtering
            if sector is not None and s != sector:
                return False
            if industry is not None and i != industry:
                return False
        return True

    # If no constraints provided, return None to avoid the temp index
    return _flt if any([sector, industry, tickers]) else None
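To make the filter's behavior concrete, here is a self-contained toy version of the same closure pattern. It is a sketch under assumptions: `make_ticker_sector_filter` and the `sector_by_ticker` dictionary are hypothetical stand-ins for the notebook's `df_meta_idx` lookup, and `ZZZZ` is an invented ticker used to show how unknown tickers are treated.

```python
def make_ticker_sector_filter(sector_by_ticker, sector=None, tickers=None):
    """Build a predicate over metadata dicts, mirroring the closure pattern above."""
    allowed = set(tickers) if tickers else None

    def _flt(meta):
        t = str(meta.get("TICKER", "") or "")
        if allowed is not None and t not in allowed:
            return False
        # Tickers without a known sector fall back to "Unknown" and are filtered out
        if sector is not None and sector_by_ticker.get(t, "Unknown") != sector:
            return False
        return True

    return _flt


# Sector labels as shown in this notebook's metadata; ZZZZ is a made-up ticker
sector_by_ticker = {
    "BLK": "Financial Services",
    "ACN": "Technology",
    "MMM": "Industrials",
}
metas = [{"TICKER": "BLK"}, {"TICKER": "ACN"}, {"TICKER": "MMM"}, {"TICKER": "ZZZZ"}]

only_financials = make_ticker_sector_filter(sector_by_ticker, sector="Financial Services")
print([m["TICKER"] for m in metas if only_financials(m)])
```

Note that a sector filter silently excludes any ticker missing from the metadata, which may or may not be the desired behavior.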

In [27]:
def rag_answer_v2_df(question, k=5, sector=None, industry=None, tickers=None):
    """
    Retrieval-Augmented Generation with financial metadata from df_meta.
    - Retrieves top-K with faiss_store
    - Enriches each snippet with Company/Sector/Industry (from df_meta)
    - Builds a grounded prompt and queries OpenAI
    """
    mf = make_metadata_filter_df(sector=sector, industry=industry, tickers=tickers)
    results = faiss_store.search(question, k=k, metadata_filter=mf)
    enriched = build_enriched_snippets_df(results)
    context_block = "\n".join(enriched) if enriched else "(no context retrieved)"

    # Dedent the template before substituting values: the multi-line context block
    # contains lines that start at column 0, which would otherwise defeat dedent.
    prompt_template = textwrap.dedent("""
    You are a financial analyst assistant.
    Use ONLY the CONTEXT below to answer the QUESTION.
    The CONTEXT includes news snippets and financial metadata (Company, Sector, Industry).
    Rules:
    - Ground every claim in the provided context.
    - If the context is insufficient, say so explicitly.
    - Prefer mentioning the relevant tickers/companies/sectors when synthesizing.
    - Keep the answer concise (3–6 sentences), factual, and finance-oriented.

    CONTEXT:
    {context_block}

    QUESTION:
    {question}

    Provide a single, well-structured paragraph.
    """).strip()
    prompt = prompt_template.format(context_block=context_block, question=question)

    response = client.responses.create(
        model="gpt-4o-mini",
        input=prompt
    )
    return response.output_text, enriched
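A detail worth illustrating when building prompts with `textwrap.dedent`: if multi-line text is interpolated before dedenting, any interpolated line that starts at column 0 removes the common indentation, so the template lines stay indented. The sketch below compares the two orderings on invented context entries.

```python
import textwrap

# Two context entries, joined the same way as the enriched snippets
context_block = "\n".join([
    "- DATE: 2025-05-29 | TICKER: BLK\n  SNIPPET: first example",
    "- DATE: 2025-05-29 | TICKER: GS\n  SNIPPET: second example",
])

# Interpolate first, dedent second: the second entry starts at column 0,
# so dedent finds no common indentation and the template stays indented.
interpolate_first = textwrap.dedent(f"""
    CONTEXT:
    {context_block}

    QUESTION:
    example question
    """).strip()

# Dedent first, interpolate second: the template is flush-left regardless of content.
template = textwrap.dedent("""
    CONTEXT:
    {context_block}

    QUESTION:
    {question}
    """).strip()
dedent_first = template.format(context_block=context_block, question="example question")

print(repr(interpolate_first.splitlines()[-1]))
print(repr(dedent_first.splitlines()[-1]))
```

The stray indentation is unlikely to break a request, but it makes the prompt layout depend on how many snippets were retrieved.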

In [28]:
def compare_v1_v2_df(question, k=5, sector=None, industry=None, tickers=None):
    """
    Compare RAG v1 (no metadata) vs RAG v2 (with financial metadata).
    - RAG v1 uses rag_answer
    - RAG v2 uses rag_answer_v2_df with optional filters
    """
    a1, ctx1 = rag_answer(question, k=k)
    a2, ctx2 = rag_answer_v2_df(question, k=k, sector=sector, industry=industry, tickers=tickers)

    display(Markdown(f"### Question: {question}"))
    display(Markdown(f"#### RAG v1 (no metadata)\n{a1}\n\n**Context (top-{k}):**\n" + "\n".join(ctx1)))
    display(Markdown(f"#### RAG v2 (with financial metadata)\n{a2}\n\n**Context (top-{k} enriched):**\n" + "\n".join(ctx2)))
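Alongside the side-by-side display, a rough numeric signal of how much the two pipelines' contexts differ can be useful. The sketch below computes the Jaccard overlap between ticker sets; `jaccard` is a hypothetical helper, and the ticker lists are copied from the inflation question's contexts reported in this notebook.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Tickers from the inflation question's v1 and v2 contexts
v1_tickers = ["BLK", "TSLA", "NVDA", "LULU", "AVGO"]
v2_tickers = ["BLK", "GS", "C", "APO", "IVZ"]

overlap = jaccard(v1_tickers, v2_tickers)
print(f"Context ticker overlap: {overlap:.2f}")
```

A low overlap indicates that the sector filter replaced most of the retrieved set, so differences between the answers reflect different evidence as well as the added metadata.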

### a) Topic-focused

In [29]:
# Topic-focused questions (the comparison below optionally forces sector relevance, e.g., Financial Services)
print(f"Number of topic-focused questions: {len(questions_topic)}")
print("-" * 30)
for i, q in enumerate(questions_topic):
    print(f"Q{i + 1}. {q}")

Number of topic-focused questions: 3
------------------------------
Q1. What are the major concerns expressed in financial news about inflation?
Q2. How is investor sentiment described in recent financial headlines?
Q3. What role is artificial intelligence playing in recent finance-related news stories?


In [30]:
unique_sectors = df_meta["SECTOR"].dropna().unique()
unique_sectors.sort()
print(f"Unique sectors ({len(unique_sectors)}):")
print("-" * 30)
print(unique_sectors)

Unique sectors (11):
------------------------------
['Basic Materials' 'Communication Services' 'Consumer Cyclical'
 'Consumer Defensive' 'Energy' 'Financial Services' 'Healthcare'
 'Industrials' 'Real Estate' 'Technology' 'Utilities']
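Because the sector filter compares strings exactly, a misspelled sector name would silently return no results. A small guard like the one below could catch that early. This is a sketch: `validate_sector` is a hypothetical helper, and the list simply restates the sectors printed above.

```python
import difflib

# Sector names as printed above
VALID_SECTORS = [
    "Basic Materials", "Communication Services", "Consumer Cyclical",
    "Consumer Defensive", "Energy", "Financial Services", "Healthcare",
    "Industrials", "Real Estate", "Technology", "Utilities",
]


def validate_sector(sector, valid_sectors=VALID_SECTORS):
    """Return the sector unchanged if valid (or None); raise ValueError otherwise."""
    if sector is None or sector in valid_sectors:
        return sector
    suggestions = difflib.get_close_matches(sector, valid_sectors, n=1)
    hint = f" Did you mean {suggestions[0]!r}?" if suggestions else ""
    raise ValueError(f"Unknown sector {sector!r}.{hint}")


print(validate_sector("Financial Services"))
try:
    validate_sector("Financials")
except ValueError as err:
    print(err)
```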


In [31]:
# Topic-focused (all questions related to finance topics)
for q in questions_topic:
    compare_v1_v2_df(q, k=5, sector="Financial Services")

### Question: What are the major concerns expressed in financial news about inflation?

#### RAG v1 (no metadata)
The major concerns expressed in financial news about inflation include the persistent nature of US inflation, which is raising alarm among policymakers, as indicated by the Federal Reserve's May meeting minutes. Additionally, food inflation is dampening expectations for potential interest rate cuts, suggesting that inflationary pressures are affecting economic forecasts and financial strategies. There is also mention of ongoing tariff issues, which may further complicate the inflation landscape. Overall, these elements contribute to worries about a potential economic slowdown.

**Context (top-5):**
- 2025-05-29 | BLK | Bitcoin price slips as Fed minutes flag US inflation risks : The Federal Reserve’s May policy meeting revealed mounting concern over persistent US inflation and the potential for economic slowdown.
- 2025-05-31 | TSLA | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | NVDA | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | LULU | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead
- 2025-05-31 | AVGO | The Weekend: Food inflation dampens hopes of a rate cut as tariff twists and turns continue : Key moments from the last seven days, plus a glimpse at the week ahead

#### RAG v2 (with financial metadata)
Major concerns about inflation are articulated in two key financial news snippets. Firstly, the Federal Reserve's May policy meeting highlighted anxieties regarding persistent US inflation and the potential for an economic slowdown (BlackRock, BLK). This indicates that inflation risks are being closely monitored by central authorities, raising questions about future monetary policy adjustments. Additionally, broader economic uncertainties, fueled by underwhelming US economic data and renewed trade issues, have contributed to a weakened dollar, further complicating the inflation landscape (Goldman Sachs, GS; Citigroup, C). These elements collectively suggest an evolving environment where inflationary pressures could significantly impact financial markets and economic growth.

**Context (top-5 enriched):**
- DATE: 2025-05-29 | TICKER: BLK | COMPANY: BlackRock, Inc. | SECTOR: Financial Services | INDUSTRY: Asset Management
  SNIPPET: Bitcoin price slips as Fed minutes flag US inflation risks : The Federal Reserve’s May policy meeting revealed mounting concern over persistent US inflation and the potential for economic slowdown.
- DATE: 2025-05-29 | TICKER: GS | COMPANY: The Goldman Sachs Group, Inc. | SECTOR: Financial Services | INDUSTRY: Capital Markets
  SNIPPET: Dollar Drops on Renewed Trade Uncertainty, Soft Economic Data : (Bloomberg) -- Underwhelming US economic data weighed on the dollar on Thursday, amplifying investor uncertainty about the outlook for growth as a federal appeals court temporarily stayed a ruling that had blocked the Trump administration’s global tariffs.Most Read from BloombergNYC Congestion Toll Brings In $216 Million in First Four MonthsThe Economic Benefits of Paying Workers to MoveNow With Colorful Blocks, Tirana’s Pyramid Represents a Changing AlbaniaNY Wins Order Against US Funding Freez
- DATE: 2025-05-29 | TICKER: C | COMPANY: Citigroup Inc. | SECTOR: Financial Services | INDUSTRY: Banks - Diversified
  SNIPPET: Dollar Drops on Renewed Trade Uncertainty, Soft Economic Data : (Bloomberg) -- Underwhelming US economic data weighed on the dollar on Thursday, amplifying investor uncertainty about the outlook for growth as a federal appeals court temporarily stayed a ruling that had blocked the Trump administration’s global tariffs.Most Read from BloombergNYC Congestion Toll Brings In $216 Million in First Four MonthsThe Economic Benefits of Paying Workers to MoveNow With Colorful Blocks, Tirana’s Pyramid Represents a Changing AlbaniaNY Wins Order Against US Funding Freez
- DATE: 2025-05-27 | TICKER: APO | COMPANY: Apollo Global Management, Inc. | SECTOR: Financial Services | INDUSTRY: Asset Management
  SNIPPET: Magnificent 7's AI Froth May Deflate on Rates : High borrowing costs and low risk appetite could puncture future-focused tech valuations
- DATE: 2025-05-05 | TICKER: IVZ | COMPANY: Invesco Ltd. | SECTOR: Financial Services | INDUSTRY: Asset Management
  SNIPPET: Stocks Fall After Historic Run as Trade Risks Loom: Markets Wrap : (Bloomberg) -- A historic stock-market run came to a halt as President Donald Trump’s latest tariff remarks provided little relief to investors bracing for the impacts of his trade war on the economy and corporate earnings.Most Read from BloombergThe Battle Over the Fate of Detroit’s Renaissance CenterNYC Real Estate Industry Asks Judge to Block New Broker Fee LawNJ Transit Strike Would Be ‘Disaster’ for Region, Sherrill SaysIceland Plans for a More Volcanic FutureVail to Borrow Muni Debt to Eas

### Question: How is investor sentiment described in recent financial headlines?

#### RAG v1 (no metadata)
Investor sentiment in recent financial headlines appears to be cautiously optimistic yet skeptical. While many stocks are highlighted for their potential upside and positive catalysts, there is a consistent reminder to take analysts' forecasts with a grain of salt, as there may be institutional pressures influencing overly optimistic predictions. Additionally, bearish assessments are notable, particularly when analysts express concerns about stocks, which are rare occurrences in their reviews. Overall, there is a conflicting dynamic between bullish forecasts and caution regarding fundamental strength and market realities.

**Context (top-5):**
- 2025-05-26 | KMX | 3 of Wall Street’s Favorite Stocks Facing Headwinds : Wall Street has set ambitious price targets for the stocks in this article. While this suggests attractive upside potential, it’s important to remain skeptical because analysts face institutional pressures that can sometimes lead to overly optimistic forecasts.
- 2025-05-20 | MCHP | 3 Hyped Up  Stocks Facing Headwinds : Great things are happening to the stocks in this article. They’re all outperforming the market over the last month because of positive catalysts such as a new product line, constructive news flow, or even a loyal Reddit fanbase.
- 2025-05-06 | MPWR | 1 of Wall Street’s Favorite Stock with Impressive Fundamentals and 2 to Think Twice About : The stocks in this article have caught Wall Street’s attention in a big way, with price targets implying returns above 20%. But investors should take these forecasts with a grain of salt because analysts typically say nice things about companies so their firms can win business in other product lines like M&A advisory.
- 2025-05-21 | DRI | 1 Unpopular Stock that Should Get More Attention and 2 to Steer Clear Of : When Wall Street turns bearish on a stock, it’s worth paying attention. These calls stand out because analysts rarely issue grim ratings on companies for fear their firms will lose out in other business lines such as M&A advisory.
- 2025-05-23 | RVTY | 3 of Wall Street’s Favorite Stocks with Questionable Fundamentals : Wall Street is overwhelmingly bullish on the stocks in this article, with price targets suggesting significant upside potential. However, it’s worth remembering that analysts rarely issue sell ratings, partly because their firms often seek other business from the same companies they cover.

#### RAG v2 (with financial metadata)
Investor sentiment in recent financial headlines appears to be mixed. The Allstate Corporation (NYSE:ALL) captures attention amidst a search for potential growth, hinting at a speculative or cautious approach among investors. Conversely, the Bank of New York Mellon's (BK) CEO noted a disconnect between market sentiment and factual conditions, indicating uncertainty. Meanwhile, MarketAxess (MKTX) highlights heightened market volatility influencing bond trading, reflecting growing investor engagement amid fluctuating market dynamics. With Morgan Stanley (MS) showcasing strong returns over five years, there is a contrasting sense of optimism in certain segments of the financial market. Overall, the recent narrative underscores a complex landscape of investor sentiment influenced by varying perceptions and market conditions.

**Context (top-5 enriched):**
- DATE: 2025-05-21 | TICKER: ALL | COMPANY: The Allstate Corporation | SECTOR: Financial Services | INDUSTRY: Insurance - Property & Casualty
  SNIPPET: Here's Why We Think Allstate (NYSE:ALL) Might Deserve Your Attention Today : Investors are often guided by the idea of discovering 'the next big thing', even if that means buying 'story stocks...
- DATE: 2025-05-07 | TICKER: MKTX | COMPANY: MarketAxess Holdings Inc. | SECTOR: Financial Services | INDUSTRY: Capital Markets
  SNIPPET: MarketAxess tops profit expectations as market volatility fuels record trading : Bond trading platform MarketAxess beat Wall Street estimates for first-quarter profit on Wednesday, as heightened market volatility sparked record trading results. WHY IT'S IMPORTANT Sweeping changes by U.S. President Donald Trump affecting global trade policy have injected volatility in the bond markets and spurred investor engagement. MarketAxess' results offer an insight into the bond market, widely viewed as a more reliable indicator of recession than the stock market.
- DATE: 2025-05-06 | TICKER: BK | COMPANY: The Bank of New York Mellon Corporation | SECTOR: Financial Services | INDUSTRY: Banks - Diversified
  SNIPPET: BNY CEO explains why there is a disconnect in the markets : Markets can move on many things: data, commentary, or even emotions. Speaking with Yahoo Finance Executive Editor Brian Sozzi from the Milken Institute Global Conference, BNY CEO Robin Vince (BK) explains why there is a disconnect between sentiment and facts in the markets right now.&nbsp; To watch more expert insights and analysis on the latest market action, check out more Market Domination Overtime here.
- DATE: 2025-05-31 | TICKER: MS | COMPANY: Morgan Stanley | SECTOR: Financial Services | INDUSTRY: Capital Markets
  SNIPPET: Morgan Stanley's (NYSE:MS) investors will be pleased with their splendid 204% return over the last five years : When you buy shares in a company, it's worth keeping in mind the possibility that it could fail, and you could lose...
- DATE: 2025-05-15 | TICKER: AMP | COMPANY: Ameriprise Financial, Inc. | SECTOR: Financial Services | INDUSTRY: Asset Management
  SNIPPET: Do Options Traders Know Something About Ameriprise Stock We Don't? : Investors need to pay close attention to AMP stock based on the movements in the options market lately.

### Question: What role is artificial intelligence playing in recent finance-related news stories?

#### RAG v1 (no metadata)
Artificial intelligence (AI) is playing a significant role in recent finance-related news stories by driving productivity and enabling companies to streamline operations. For example, Jack Henry integrates AI-driven lending technology to enhance efficiency and reduce human error. Additionally, Palantir utilizes AI to support government and commercial clients in managing generative AI capabilities, reflecting a growing demand in both sectors. Meanwhile, Meta Platforms is focusing on AI investments to potentially increase its stock value beyond its legacy business. Overall, AI is seen as a transformative tool that could significantly impact financial performance in the technology sector.

**Context (top-5):**
- 2025-03-17 | JKHY | Jack Henry (JKHY) Integrates AI-Driven Lending Tech With Algebrik : We recently published a list of 12 AI News Investors Should Not Miss This Week. In this article, we are going to take a look at where Jack Henry & Associates, Inc. (NASDAQ:JKHY) stands against other AI news Investors should not miss this week. Artificial Intelligence (AI) is known to increase productivity, decrease human error, […]
- 2025-05-31 | META | This "Magnificent Seven" Stock Is Set to Skyrocket If Its AI Investments Pay Off : Meta Platforms has investments in several AI applications.  The tech giant's stock is only valued on its legacy business.  Over the past two-and-a-half years, investors have heard about various artificial intelligence (AI) investments that tech companies are making.
- 2025-05-31 | PLTR | Billionaires Are Buying 2 Artificial Intelligence (AI) Stocks That Wall Street Analysts Say Can Soar Up to 240% : Several billionaire hedge fund managers bought shares of Palantir and/or Upstart in the first quarter -- stocks where certain analysts anticipate substantial upside.  Palantir is successfully tapping demand for artificial intelligence (AI) with government and commercial customers, but the stock trades at a very expensive valuation.  Upstart is generating attractive returns for lenders by helping them quantify credit risk with artificial intelligence, and the stock trades at a very reasonable valuation.
- 2025-05-31 | PLTR | Better Artificial Intelligence (AI) Stock: Palantir vs. Snowflake : Shares of both Palantir and Snowflake have delivered healthy gains in 2025 despite the broader stock market weakness.  Palantir stock has shot up 63% this year despite bouts of volatility.  Palantir Technologies helps commercial and government clients integrate generative AI capabilities into their operations with its Artificial Intelligence Platform (AIP), which was launched roughly two years ago.
- 2025-05-29 | NFLX | 2 Underrated Artificial Intelligence (AI) Stocks to Buy and Hold : Generative AI can simplify and speed up many tasks, including content production.  It's easy to see the potential for Netflix, whose content strategy is integral to its success.  Netflix's creations have attracted millions of viewers and won many awards.

#### RAG v2 (with financial metadata)
Artificial intelligence (AI) is emerging as a significant driver of innovation within the financial services sector, as evidenced by various recent news stories. For instance, the fintech company BILL is leveraging AI-driven automation and partnerships to enhance its platform adoption, despite facing market challenges and competition. In contrast, companies like Palantir and BigBear.ai are exploring AI applications beyond government contracts, reflecting a broader trend of integrating advanced technologies in finance management and operations. Additionally, the impact of high borrowing costs on AI-driven tech valuations suggests that while AI offers potential growth, external economic conditions may temper investor enthusiasm in sectors like asset management, as noted with Apollo Global Management. Overall, AI is positioned as a pivotal factor in shaping competitive strategies and market perceptions across different segments of the financial services industry.

**Context (top-5 enriched):**
- DATE: 2025-05-27 | TICKER: RF | COMPANY: Regions Financial Corporation | SECTOR: Financial Services | INDUSTRY: Banks - Regional
  SNIPPET: BILL Holdings Plunges 47% Year to Date: Should You Buy the Stock on Dip? : BILL stock suffers from market challenges and competition, but AI-driven automation, partnerships, and growing platform adoption drive its fintech momentum.
- DATE: 2025-05-31 | TICKER: PYPL | COMPANY: PayPal Holdings, Inc. | SECTOR: Financial Services | INDUSTRY: Credit Services
  SNIPPET: Better AI Stock: Palantir vs. BigBear.ai : Palantir and BigBear.ai are artificial intelligence (AI) stocks involved in the defense industry.  Both companies are also working to move beyond the U.S. government.  Two of the leading artificial intelligence (AI) stocks over the past year are Palantir Technologies (NASDAQ: PLTR) and BigBear.ai (NYSE: BBAI).
- DATE: 2025-05-20 | TICKER: FDS | COMPANY: FactSet Research Systems Inc. | SECTOR: Financial Services | INDUSTRY: Financial Data & Stock Exchanges
  SNIPPET: Do FactSet Research Systems' (NYSE:FDS) Earnings Warrant Your Attention? : It's common for many investors, especially those who are inexperienced, to buy shares in companies with a good story...
- DATE: 2025-05-27 | TICKER: APO | COMPANY: Apollo Global Management, Inc. | SECTOR: Financial Services | INDUSTRY: Asset Management
  SNIPPET: Magnificent 7's AI Froth May Deflate on Rates : High borrowing costs and low risk appetite could puncture future-focused tech valuations
- DATE: 2025-05-21 | TICKER: ALL | COMPANY: The Allstate Corporation | SECTOR: Financial Services | INDUSTRY: Insurance - Property & Casualty
  SNIPPET: Here's Why We Think Allstate (NYSE:ALL) Might Deserve Your Attention Today : Investors are often guided by the idea of discovering 'the next big thing', even if that means buying 'story stocks...

### b) Company-focused

In [32]:
# Company-focused: optionally force ticker relevance (e.g., MSFT, AMZN)
print(f"Number of company-focused questions: {len(questions_company)}")
print("-" * 30)
for i, q in enumerate(questions_company):
    print(f"Q{i + 1}. {q}")

Number of company-focused questions: 2
------------------------------
Q1. How is Microsoft being portrayed in news stories about artificial intelligence?
Q2. What financial news headlines connect Amazon with automation or logistics?


In [33]:
# Question 1 is about MSFT
compare_v1_v2_df(questions_company[0], k=5, tickers={"MSFT"})

### Question: How is Microsoft being portrayed in news stories about artificial intelligence?

#### RAG v1 (no metadata)
The information provided does not mention Microsoft at all. Therefore, I cannot determine how Microsoft is being portrayed in news stories about artificial intelligence based on the given context.

**Context (top-5):**
- 2025-05-31 | META | This "Magnificent Seven" Stock Is Set to Skyrocket If Its AI Investments Pay Off : Meta Platforms has investments in several AI applications.  The tech giant's stock is only valued on its legacy business.  Over the past two-and-a-half years, investors have heard about various artificial intelligence (AI) investments that tech companies are making.
- 2025-05-29 | CRM | How Salesforce has 'overcorrected' by leaning into AI : D.A. Davidson head of technology research Gil Luria joins Market Domination to discuss Salesforce (CRM) earnings and the company's trajectory. Luria says Salesforce is "too focused" on artificial intelligence (AI), as the other parts of its business "rapidly" decelerate and the company loses market share to competitors. Luria has the equivalent of a Sell rating on the stock. To watch more expert insights and analysis on the latest market action, check out more Market Domination here.
- 2025-03-17 | JKHY | Jack Henry (JKHY) Integrates AI-Driven Lending Tech With Algebrik : We recently published a list of 12 AI News Investors Should Not Miss This Week. In this article, we are going to take a look at where Jack Henry & Associates, Inc. (NASDAQ:JKHY) stands against other AI news Investors should not miss this week. Artificial Intelligence (AI) is known to increase productivity, decrease human error, […]
- 2025-05-29 | NFLX | 2 Underrated Artificial Intelligence (AI) Stocks to Buy and Hold : Generative AI can simplify and speed up many tasks, including content production.  It's easy to see the potential for Netflix, whose content strategy is integral to its success.  Netflix's creations have attracted millions of viewers and won many awards.
- 2025-05-30 | META | Meta (META) AI Reaches 1 Billion Users, Eyes Paid Features and Subscriptions : We recently published a list of 10 AI Stocks on Wall Street’s Radar. In this article, we are going to take a look at where Meta Platforms, Inc. (NASDAQ:META) stands against other AI stocks on Wall Street’s radar. Meta Platforms, Inc. (NASDAQ:META) is a global technology company. On May 28, CNBC reported that Meta Platforms, Inc. (NASDAQ:META)’s artificial […]

#### RAG v2 (with financial metadata)
Microsoft (MSFT) is portrayed positively in news stories regarding artificial intelligence, highlighting its robust partnerships and significant market position. The company is noted for its collaboration with AI giants like Amazon, which has helped boost ServiceNow's stock, indicating Microsoft's influence in the AI sector. Additionally, following its most recent earnings report, Microsoft’s stock has risen by 7.8%, suggesting investor confidence in its growth prospects related to AI technologies. However, amidst these positives, Microsoft also faces challenges, as seen in layoffs within LinkedIn and broader tech sector cutbacks. Overall, Microsoft's role in AI appears essential as it navigates both opportunities and hurdles in the industry.

**Context (top-5 enriched):**
- DATE: 2025-05-30 | TICKER: MSFT | COMPANY: Microsoft Corporation | SECTOR: Technology | INDUSTRY: Software - Infrastructure
  SNIPPET: Marvell Stock Slides. Why It Could Be the Cheap AI Chip Play. : The company’s earnings didn’t dispel concerns it might lose out on designing Amazon’s Trainium AI chips. Still, analysts are upbeat.
- DATE: 2025-05-30 | TICKER: MSFT | COMPANY: Microsoft Corporation | SECTOR: Technology | INDUSTRY: Software - Infrastructure
  SNIPPET: ServiceNow Regenerates On Swarm Of AI Deals With Amazon, Microsoft And More : Teaming up with AI giants like Amazon, Microsoft and others, ServiceNow stock has rebounded and stands poised to break out.
- DATE: 2025-05-30 | TICKER: MSFT | COMPANY: Microsoft Corporation | SECTOR: Technology | INDUSTRY: Software - Infrastructure
  SNIPPET: Why Is Microsoft (MSFT) Up 7.8% Since Last Earnings Report? : Microsoft (MSFT) reported earnings 30 days ago. What's next for the stock? We take a look at earnings estimates for some clues.
- DATE: 2025-05-30 | TICKER: MSFT | COMPANY: Microsoft Corporation | SECTOR: Technology | INDUSTRY: Software - Infrastructure
  SNIPPET: LinkedIn cuts 281 workers in California as tech layoffs continue : The professional social network is owned by Microsoft, which announced earlier this month that it was slashing 3% of its global workforce.
- DATE: 2025-05-30 | TICKER: MSFT | COMPANY: Microsoft Corporation | SECTOR: Technology | INDUSTRY: Software - Infrastructure
  SNIPPET: Judge Weighs Big Changes to Google, Including Breakup, AI Limits : (Bloomberg) -- The federal judge who will decide how to limit Google’s monopoly in search is considering its advantage in artificial intelligence, and aiming to minimize harm to the other players in the market with any resolution.

In [34]:
# Question 2 is about AMZN
compare_v1_v2_df(questions_company[1], k=5, tickers={"AMZN"})

### Question: What financial news headlines connect Amazon with automation or logistics?

#### RAG v1 (no metadata)
The financial news headlines that connect Amazon with automation or logistics include a mention of Amazon's integration of artificial intelligence across all aspects of its services, as discussed by Matt Garman, CEO of Amazon Web Services, on May 30, 2025. Additionally, on May 23, 2025, there was an analysis featuring C.H. Robinson Worldwide, which looked at the logistics sector, although it did not directly link Amazon to the logistics discussion. However, given Amazon's significant role in e-commerce and logistics, it remains a key player in the conversation about automation in these sectors. Overall, the context highlights Amazon's commitment to leveraging AI, which includes enhancing its logistics and operations.

**Context (top-5):**
- 2025-05-25 | TFC | Truist Reiterates Buy on Amazon.com (AMZN) as Q2 Revenue Tracks Ahead : We recently published a list of 10 AI Stocks on Wall Street’s Radar. In this article, we are going to take a look at where Amazon.com Inc. (NASDAQ:AMZN) stands against other AI stocks on Wall Street’s radar. Amazon.com Inc. (NASDAQ:AMZN) is an American technology company offering e-commerce, cloud computing, and other services, including digital streaming […]
- 2025-05-30 | AMZN | Amazon's AI Roadmap With AWS CEO Garman : Every aspect of Amazon is leveraging artificial intelligence, says Matt Garman, CEO of Amazon Web Services. Garman discusses Amazon's AI roadmap and reflects on his first year in the role with Ed Ludlow on "Bloomberg Technology."
- 2025-05-23 | TFC | Woodward's Volumes, Automation to Drive Earnings Growth, Truist Says : Woodward's (WWD) increasing volumes, pricing, automation, and products will push its aerospace margi
- 2025-04-30 | AON | Top Stock Reports for Amazon.com, Johnson & Johnson & Cisco Systems : Today's Research Daily features new research reports on 16 major stocks, including Amazon.com, Inc. (AMZN), Johnson & Johnson (JNJ) and Cisco Systems, Inc. (CSCO), as well as a micro-cap NeurAxis, Inc. (NRXS).
- 2025-05-23 | CHRW | Winners And Losers Of Q1: C.H. Robinson Worldwide (NASDAQ:CHRW) Vs The Rest Of The Air Freight and Logistics Stocks : As the craze of earnings season draws to a close, here’s a look back at some of the most exciting (and some less so) results from Q1. Today, we are looking at air freight and logistics stocks, starting with C.H. Robinson Worldwide (NASDAQ:CHRW).

#### RAG v2 (with financial metadata)
The financial news connecting Amazon (AMZN) with automation and AI highlights the company's significant investment in artificial intelligence, exceeding $100 billion this year to enhance efficiency and revenue growth within its operations. AWS CEO Matt Garman emphasized that every aspect of Amazon is leveraging AI, indicating a substantial focus on automation across the company's logistics and retail sectors. Additionally, Amazon's collaborations with other tech giants in AI development, such as ServiceNow and Microsoft, underscore the strategic importance of automation in enhancing operational efficiency and driving future growth, positioning Amazon favorably in the competitive Internet Retail industry.

**Context (top-5 enriched):**
- DATE: 2025-05-30 | TICKER: AMZN | COMPANY: Amazon.com, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: Amazon's AI Roadmap With AWS CEO Garman : Every aspect of Amazon is leveraging artificial intelligence, says Matt Garman, CEO of Amazon Web Services. Garman discusses Amazon's AI roadmap and reflects on his first year in the role with Ed Ludlow on "Bloomberg Technology."
- DATE: 2025-05-30 | TICKER: AMZN | COMPANY: Amazon.com, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: ServiceNow Regenerates On Swarm Of AI Deals With Amazon, Microsoft And More : Teaming up with AI giants like Amazon, Microsoft and others, ServiceNow stock has rebounded and stands poised to break out.
- DATE: 2025-05-31 | TICKER: AMZN | COMPANY: Amazon.com, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: 3 Soaring Stocks I'd Buy Now With No Hesitation : Amazon is seeing strong revenue growth and efficiency gains coming from AI.  Dutch Bros has a two huge growth drivers in front of it.  Philip Morris' growth is being powered by its smokeless portfolio.
- DATE: 2025-05-31 | TICKER: AMZN | COMPANY: Amazon.com, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: Marvell price target raised to $70 from $60 at TD Cowen : TD Cowen raised the firm’s price target on Marvell (MRVL) to $70 from $60 and keeps a Buy rating on the shares. The firm said an in-line print/guide with strong language on 3nm engagement with Amazon (AMZN), but “multiple paths” commentary is likely to continue to concern investors who will be hoping for more detail at the June AI webinar. Long-term momentum is there, but lack of “upside” in a strong spending environment, and inherent limited visibility in custom is likely to keep the stock a ba
- DATE: 2025-05-31 | TICKER: AMZN | COMPANY: Amazon.com, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: 2 Best Stocks to Buy With $1,000 Right Now : Taiwan Semiconductor expects demand to double in 2025.  Amazon is investing more than $100 billion in its artificial intelligence (AI) business this year alone.  If you're looking for reliable, low-risk stocks that could deliver outstanding returns over time, I recommend Taiwan Semiconductor (NYSE: TSM) and Amazon (NASDAQ: AMZN).
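
The enriched contexts above show that a ticker filter guarantees the tag, not the topic: several hits tagged MSFT are mainly about Marvell or Google. A minimal sketch of a ticker post-filter followed by a simple mention check is given below. It is not the implementation behind `compare_v1_v2_df`; the helper names and the `TEXT` field are hypothetical, and the sample hits are abbreviated from the output above.

```python
from typing import Dict, Iterable, List, Set


def filter_by_ticker(hits: Iterable[Dict], tickers: Set[str]) -> List[Dict]:
    """Keep only hits whose TICKER tag is in `tickers` (case-insensitive)."""
    wanted = {t.upper() for t in tickers}
    return [h for h in hits if str(h.get("TICKER", "")).upper() in wanted]


def mentions_company(hit: Dict, aliases: Iterable[str]) -> bool:
    """Return True if any alias appears in the hit text (case-insensitive)."""
    text = str(hit.get("TEXT", "")).lower()  # "TEXT" is an assumed field name
    return any(alias.lower() in text for alias in aliases)


# Toy hits, abbreviated from the contexts printed above
hits = [
    {"TICKER": "MSFT", "TEXT": "Why Is Microsoft (MSFT) Up 7.8% Since Last Earnings Report?"},
    {"TICKER": "MSFT", "TEXT": "Marvell Stock Slides. Why It Could Be the Cheap AI Chip Play."},
    {"TICKER": "META", "TEXT": "Meta (META) AI Reaches 1 Billion Users"},
]

msft_hits = filter_by_ticker(hits, {"MSFT"})
on_topic = [h for h in msft_hits if mentions_company(h, ["Microsoft", "MSFT"])]
print(len(msft_hits), len(on_topic))  # prints: 2 1
```

A substring check like this is crude (it would miss articles that refer to a company only by product name), so it is better read as a diagnostic for how many tagged hits are actually on topic than as a production filter.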

### c) Industry-focused

In [35]:
# Industry-focused: optionally force sector/industry relevance (e.g., Technology/Semiconductors)
print(f"Number of industry-focused questions: {len(questions_industry)}")
print("-" * 30)
for i, q in enumerate(questions_industry):
    print(f"Q{i + 1}. {q}")

Number of industry-focused questions: 3
------------------------------
Q1. What are the main themes emerging in financial news about the semiconductor industry?
Q2. What trends are being reported in the retail industry?
Q3. What risks or challenges are discussed in recent news about the energy industry?


In [36]:
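# List the distinct industry labels so that the filters below can use exact metadata values
unique_industries = df_meta["INDUSTRY"].dropna().unique()
unique_industries.sort()  # in-place sort of the array returned by unique()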
print(f"Unique industries ({len(unique_industries)}):")
print("-" * 30)
print(unique_industries)

Unique industries (113):
------------------------------
['Advertising Agencies' 'Aerospace & Defense' 'Agricultural Inputs'
 'Airlines' 'Apparel Manufacturing' 'Apparel Retail' 'Asset Management'
 'Auto & Truck Dealerships' 'Auto Manufacturers' 'Auto Parts'
 'Banks - Diversified' 'Banks - Regional' 'Beverages - Brewers'
 'Beverages - Non-Alcoholic' 'Biotechnology' 'Building Materials'
 'Building Products & Equipment' 'Capital Markets' 'Chemicals'
 'Communication Equipment' 'Computer Hardware' 'Confectioners'
 'Conglomerates' 'Consulting Services' 'Consumer Electronics' 'Copper'
 'Credit Services' 'Diagnostics & Research' 'Discount Stores'
 'Drug Manufacturers - General' 'Drug Manufacturers - Specialty & Generic'
 'Electrical Equipment & Parts' 'Electronic Components'
 'Electronic Gaming & Multimedia' 'Engineering & Construction'
 'Entertainment' 'Farm & Heavy Construction Machinery' 'Farm Products'
 'Financial Data & Stock Exchanges' 'Food Distribution'
 'Footwear & Accessories' 'Furni
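
With more than a hundred labels, it can help to search the list by keyword before picking a filter value, since a broad term such as "retail" may match several industries. The sketch below is a hypothetical helper (`find_labels` is not part of the notebook) that works on any iterable of labels, including the `unique_industries` array printed above; the sample labels are a small subset taken from this notebook's output.

```python
def find_labels(labels, keyword):
    """Return sorted, de-duplicated labels that contain `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return sorted({str(label) for label in labels if needle in str(label).lower()})


# A few labels that appear in the notebook output (not the full list)
sample_labels = [
    "Apparel Retail",
    "Asset Management",
    "Discount Stores",
    "Internet Retail",
    "Semiconductors",
    "Internet Retail",  # duplicates are collapsed
]

print(find_labels(sample_labels, "retail"))
# prints: ['Apparel Retail', 'Internet Retail']
```

In the notebook itself the call would presumably be `find_labels(unique_industries, "retail")`. Note that a keyword search still misses related labels that do not contain the word, such as "Discount Stores".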

In [37]:
# Question 1 is about semiconductors
compare_v1_v2_df(questions_industry[0], k=5, industry="Semiconductors")

### Question: What are the main themes emerging in financial news about the semiconductor industry?

#### RAG v1 (no metadata)
The financial news surrounding the semiconductor industry highlights a strong focus on international revenue trends, which are critical for companies like ON Semiconductor Corp. as they impact Wall Street forecasts and investor sentiment. Additionally, there's notable attention from investors and analysts, particularly regarding ON Semiconductor and its earnings performance, despite some optimism despite soft earnings. Furthermore, the recent analysis of stocks within the sector, including considerations of potential upside and strategic evaluations of various semiconductor companies, suggests a keen interest in identifying promising investment opportunities.

**Context (top-5):**
- 2025-05-13 | ON | Investing in ON Semiconductor Corp. (ON)? Don't Miss Assessing Its International Revenue Trends : Explore ON Semiconductor Corp.'s (ON) international revenue trends and how these numbers impact Wall Street's forecasts and what's ahead for the stock.
- 2025-05-21 | ON | ON Semiconductor Corporation (ON) is Attracting Investor Attention: Here is What You Should Know : Recently, Zacks.com users have been paying close attention to ON Semiconductor Corp. (ON). This makes it worthwhile to examine what the stock has in store.
- 2025-05-12 | ON | Some May Be Optimistic About ON Semiconductor's (NASDAQ:ON) Earnings : Soft earnings didn't appear to concern ON Semiconductor Corporation's ( NASDAQ:ON ) shareholders over the last week...
- 2025-05-29 | ADI | Spotting Winners: Vishay Intertechnology (NYSE:VSH) And Analog Semiconductors Stocks In Q1 : The end of an earnings season can be a great time to discover new stocks and assess how companies are handling the current business environment. Let’s take a look at how Vishay Intertechnology (NYSE:VSH) and the rest of the analog semiconductors stocks fared in Q1.
- 2025-05-11 | ON | ON Semiconductor (ON): Among Billionaire Glenn Russell Dubin’s Stock Picks with Huge Upside Potential : We recently published a list of Billionaire Glenn Russell Dubin’s 10 Stock Picks with Huge Upside Potential. In this article, we are going to take a look at where ON Semiconductor Corporation (NASDAQ:ON) stands against Billionaire Glenn Russell Dubin’s other stock picks with huge upside potential. Glenn Russell Dubin is one of the industry’s most […]

#### RAG v2 (with financial metadata)
The financial news about the semiconductor industry, particularly regarding ON Semiconductor Corporation (ON), highlights several key themes. Firstly, there is a focus on international revenue trends and their potential impact on Wall Street forecasts, suggesting that geographic performance may influence investor sentiment significantly. Additionally, ON has garnered increased attention from investors, especially following discussions of its earnings outcomes, which, despite appearing soft, did not deter shareholder optimism. Furthermore, industry insights suggest that high-profile investors, such as billionaire Glenn Russell Dubin, view ON as a stock with substantial upside potential, reflecting a broader interest in identifying promising companies within the semiconductor sector. Overall, the themes indicate a mix of cautious optimism and scrutiny over financial performance and market positioning.

**Context (top-5 enriched):**
- DATE: 2025-05-13 | TICKER: ON | COMPANY: ON Semiconductor Corporation | SECTOR: Technology | INDUSTRY: Semiconductors
  SNIPPET: Investing in ON Semiconductor Corp. (ON)? Don't Miss Assessing Its International Revenue Trends : Explore ON Semiconductor Corp.'s (ON) international revenue trends and how these numbers impact Wall Street's forecasts and what's ahead for the stock.
- DATE: 2025-05-21 | TICKER: ON | COMPANY: ON Semiconductor Corporation | SECTOR: Technology | INDUSTRY: Semiconductors
  SNIPPET: ON Semiconductor Corporation (ON) is Attracting Investor Attention: Here is What You Should Know : Recently, Zacks.com users have been paying close attention to ON Semiconductor Corp. (ON). This makes it worthwhile to examine what the stock has in store.
- DATE: 2025-05-12 | TICKER: ON | COMPANY: ON Semiconductor Corporation | SECTOR: Technology | INDUSTRY: Semiconductors
  SNIPPET: Some May Be Optimistic About ON Semiconductor's (NASDAQ:ON) Earnings : Soft earnings didn't appear to concern ON Semiconductor Corporation's ( NASDAQ:ON ) shareholders over the last week...
- DATE: 2025-05-29 | TICKER: ADI | COMPANY: Analog Devices, Inc. | SECTOR: Technology | INDUSTRY: Semiconductors
  SNIPPET: Spotting Winners: Vishay Intertechnology (NYSE:VSH) And Analog Semiconductors Stocks In Q1 : The end of an earnings season can be a great time to discover new stocks and assess how companies are handling the current business environment. Let’s take a look at how Vishay Intertechnology (NYSE:VSH) and the rest of the analog semiconductors stocks fared in Q1.
- DATE: 2025-05-11 | TICKER: ON | COMPANY: ON Semiconductor Corporation | SECTOR: Technology | INDUSTRY: Semiconductors
  SNIPPET: ON Semiconductor (ON): Among Billionaire Glenn Russell Dubin’s Stock Picks with Huge Upside Potential : We recently published a list of Billionaire Glenn Russell Dubin’s 10 Stock Picks with Huge Upside Potential. In this article, we are going to take a look at where ON Semiconductor Corporation (NASDAQ:ON) stands against Billionaire Glenn Russell Dubin’s other stock picks with huge upside potential. Glenn Russell Dubin is one of the industry’s most […]
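
For the semiconductor question, both pipelines retrieved the same five articles, whereas for the Microsoft question they shared none. One way to quantify this is the Jaccard overlap between the two context sets. The sketch below is an illustrative helper, not part of `compare_v1_v2_df`; keying each hit by `(date, ticker)` is an assumption, and a headline-based key would be stricter.

```python
def context_overlap(v1_keys, v2_keys):
    """Jaccard overlap between two collections of hit keys (1.0 = identical sets)."""
    a, b = set(v1_keys), set(v2_keys)
    if not a and not b:
        return 1.0  # two empty contexts are treated as identical
    return len(a & b) / len(a | b)


# (date, ticker) keys copied from the contexts printed above
semis_v1 = [("2025-05-13", "ON"), ("2025-05-21", "ON"), ("2025-05-12", "ON"),
            ("2025-05-29", "ADI"), ("2025-05-11", "ON")]
semis_v2 = list(semis_v1)  # the enriched run returned the same five hits

msft_v1 = [("2025-05-31", "META"), ("2025-05-29", "CRM"), ("2025-03-17", "JKHY"),
           ("2025-05-29", "NFLX"), ("2025-05-30", "META")]
msft_v2 = [("2025-05-30", "MSFT")] * 5  # all five hits share date and ticker

print(context_overlap(semis_v1, semis_v2))  # prints: 1.0
print(context_overlap(msft_v1, msft_v2))    # prints: 0.0
```

An overlap of 1.0 suggests that any difference between the two answers comes from the added metadata in the prompt rather than from retrieval, while an overlap of 0.0 means the filter changed the evidence entirely.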

In [38]:
# Question 2 is about the retail industry; the filter is restricted to the "Internet Retail" label
compare_v1_v2_df(questions_industry[1], k=5, industry="Internet Retail")

### Question: What trends are being reported in the retail industry?

#### RAG v1 (no metadata)
The retail industry is currently experiencing volatility, with demand showing sensitivity to changes in consumer spending. Recent reports indicate that retail stocks have fallen by 13.7% over the past six months, a performance significantly worse than the S&P 500’s 5.5% loss. Furthermore, retailers are adapting to external pressures, such as evolving shopping behaviors and trade-war implications, by adjusting their supply chains and implementing price increases. Overall, the industry's performance is linked closely to economic cycles, which have contributed to a recent downturn of 12.3% in consumer discretionary businesses.

**Context (top-5):**
- 2025-05-12 | KMX | 3 Consumer Stocks That Concern Us : Retailers are adapting their business models as technology changes how people shop. Still, demand can be volatile as the industry is exposed to the ups and downs of consumer spending. This has stirred some uncertainty lately as retail stocks have tumbled by 13.7% over the past six months. This performance was worse than the S&P 500’s 5.5% loss.
- 2025-05-29 | BBY | Retailers, Ducking Trade-War Curveballs, Stick to Their Plans : As legal rulings roll in on Trump’s tariff policies, retail executives say they have shifted their supply chains and many price increases already have hit shelves.
- 2025-05-22 | HLT | 3 Consumer Stocks Skating on Thin Ice : The performance of consumer discretionary businesses is closely linked to economic cycles. Over the past six months, it seems like demand trends are working against their favor as the industry has tumbled by 12.3%. This drop was significantly worse than the S&P 500’s 2.1% decline.
- 2025-04-26 | PKG | Packaging Corporation of America (NYSE:PKG) Hasn't Managed To Accelerate Its Returns : What trends should we look for it we want to identify stocks that can multiply in value over the long term? Amongst...
- 2025-05-14 | APD | Air Products and Chemicals (NYSE:APD) Will Be Hoping To Turn Its Returns On Capital Around : To find a multi-bagger stock, what are the underlying trends we should look for in a business? Firstly, we'll want to...

#### RAG v2 (with financial metadata)
The retail industry, particularly in the Internet Retail segment, is experiencing notable trends amid a fluctuating market environment. Companies like eBay (EBAY) are reaching new highs as the market benefits from the easing of trade tensions, particularly related to tariffs under President Trump. Meanwhile, DoorDash (DASH) is seeing an impressive earnings projection of a 647% increase, reflecting strong consumer demand, although it faces challenges with a recent stock drop despite reporting significant quarterly profits. Additionally, a rise in credit card delinquencies and Buy Now Pay Later (BNPL) usage indicates shifting consumer purchasing behaviors, hinting at potential financial stress among consumers. Overall, the retail sector is adapting to these dynamics while seeking growth opportunities amidst the evolving landscape.

**Context (top-5 enriched):**
- DATE: 2025-05-23 | TICKER: DASH | COMPANY: DoorDash, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: This Online Retail Stock Sprints To Entry; Earnings Are Seen Soaring 647% : This online retail stock is offering an opportunity as it sprints toward an entry. Strong earnings are also seen ahead for the equity, which is up around 20% already this year.
- DATE: 2025-05-16 | TICKER: EBAY | COMPANY: eBay Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: Retail Leaders Savor Trump Tariff Unwinding. Two Stocks Hit Milestones. : Ebay and Urban Outfitters are hitting new highs as the stock market rallies on President Trump's tariff pause, China trade deal.
- DATE: 2025-05-23 | TICKER: DASH | COMPANY: DoorDash, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: More consumers are buying now and paying never, a new warning sign : Alongside credit card delinquency, BNPL has become a novel gauge to track.
- DATE: 2025-05-14 | TICKER: DASH | COMPANY: DoorDash, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: May 2025's Top Stock Selections Estimated Below Intrinsic Value : The United States market has experienced a notable upswing, rising 5.3% over the last week and 12% over the past year, with earnings projected to grow by 14% annually. In this environment, identifying stocks that are potentially undervalued can be an effective strategy for investors seeking opportunities that may offer growth potential relative to their intrinsic value.
- DATE: 2025-05-13 | TICKER: DASH | COMPANY: DoorDash, Inc. | SECTOR: Consumer Cyclical | INDUSTRY: Internet Retail
  SNIPPET: DoorDash Stock (DASH) Falls as Record Profits Meet Aggressive Expansion : Delivery service DoorDash (DASH) reported its largest quarterly profit to date at $193 million for Q1 2025 — a remarkable swing from a $23 million loss this time last year. Yet, shares have tumbled over 10% as the company also announced two major acquisitions totaling $5 billion, leaving investors wondering whether the food delivery giant is growing too fast for its own good. The answer isn’t immediately clear. Therefore, investors might want to hold off on this one for now and let the dust sett

In [39]:
# Question 3 is about the energy industry; "Energy" is a sector label in the metadata, so filter by sector
compare_v1_v2_df(questions_industry[2], k=5, sector="Energy")

### Question: What risks or challenges are discussed in recent news about the energy industry?

#### RAG v1 (no metadata)
Recent news highlights significant challenges for the energy industry, particularly in the renewable sector. A bill advancing in Congress threatens to repeal crucial subsidies, potentially making renewable energy projects uneconomical and leading to a crash in renewable energy stocks. Additionally, the oilfield service sector faces pressures from sliding oil prices, rising tariffs, and shrinking drilling budgets, raising concerns about future profitability. Companies in this space are left to ponder whether new demands for LNG and AI can provide sufficient support amidst these headwinds.

**Context (top-5):**
- 2025-05-23 | NEE | Renewable Energy Stocks Crash as U.S. Advances Bill That Could Decimate the Industry : Congress is pushing forward a bill that could upend the renewable energy industry.  Just as companies have ramped up production and renewable electricity generation in the U.S., those projects may become uneconomical.  The news was about as bad as it could get for renewable energy stocks this week as the U.S. House of Representatives early Thursday passed a bill that will repeal some of the most important subsidies for the industry if it becomes law.
- 2025-05-23 | ENPH | Renewable Energy Stocks Crash as U.S. Advances Bill That Could Decimate the Industry : Congress is pushing forward a bill that could upend the renewable energy industry.  Just as companies have ramped up production and renewable electricity generation in the U.S., those projects may become uneconomical.  The news was about as bad as it could get for renewable energy stocks this week as the U.S. House of Representatives early Thursday passed a bill that will repeal some of the most important subsidies for the industry if it becomes law.
- 2025-05-21 | HAL | Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- 2025-05-21 | BKR | Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- 2025-05-21 | FCX | 3 American Companies Investors Need to Know Amid Trump's Tariff Wars : Copper is a critical metal for the U.S. industrial economy.  This American appliance maker expects the Trump administration to close loopholes that will improve its competitive positioning.  It's difficult to predict precisely what the tariff landscape will look like when the dust settles on the trade conflict, but we can say some things with a high degree of certainty.

#### RAG v2 (with financial metadata)
Recent news highlights several significant challenges facing the energy industry, particularly for oilfield service companies like Halliburton (HAL), Baker Hughes (BKR), and Schlumberger (SLB). These firms are grappling with declining oil prices, rising tariffs, and shrinking drilling budgets, which threaten their profitability and operational stability. Additionally, the trade tensions exacerbated by government policies, including tariffs, pose an ongoing risk to the sector. The question remains whether emerging demand from liquefied natural gas (LNG) and artificial intelligence (AI) applications can sufficiently offset these challenges in the near future.

**Context (top-5 enriched):**
- DATE: 2025-05-21 | TICKER: HAL | COMPANY: Halliburton Company | SECTOR: Energy | INDUSTRY: Oil & Gas Equipment & Services
  SNIPPET: Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- DATE: 2025-05-21 | TICKER: BKR | COMPANY: Baker Hughes Company | SECTOR: Energy | INDUSTRY: Oil & Gas Equipment & Services
  SNIPPET: Tariffs, Prices, and Pain: What's Next for Oilfield Service? : The likes of SLB, HAL and BKR face a tough future as oil prices slide, tariffs rise and drilling budgets shrink - can LNG and AI demand offer enough support?
- DATE: 2025-05-10 | TICKER: SLB | COMPANY: Schlumberger Limited | SECTOR: Energy | INDUSTRY: Oil & Gas Equipment & Services
  SNIPPET: Schlumberger Limited (SLB): Among the Best Energy Stocks to Buy Right Now : We recently published a list of the 13 Best Energy Stocks to Buy Right Now. In this article, we are going to take a look at where Schlumberger Limited (NYSE:SLB) stands against other best energy stocks. The worldwide energy industry has recently been rattled by a combination of factors, including the trade war sparked by President […]
- DATE: 2025-05-29 | TICKER: WMB | COMPANY: The Williams Companies, Inc. | SECTOR: Energy | INDUSTRY: Oil & Gas Midstream
  SNIPPET: Trump’s New York Pipeline Dreams Inch Closer to Reality. Big Hurdles Remain. : The projects would be a boon for Pennsylvania gas producers such as Expand Energy and Coterra Energy.
- DATE: 2025-05-15 | TICKER: FANG | COMPANY: Diamondback Energy, Inc. | SECTOR: Energy | INDUSTRY: Oil & Gas E&P
  SNIPPET: A Trade Made for Buffett: Energy Stocks Priced Below Book Value : (Bloomberg) -- Here’s something you don’t see in the market too often: A third of all mid- and small-cap oil and gas stocks in the US are now trading below their book values.Most Read from BloombergAs Coastline Erodes, One California City Considers ‘Retreat Now’How a Highway Became San Francisco’s Newest ParkPower-Hungry Data Centers Are Warming Homes in the NordicsMaryland’s Credit Rating Gets Downgraded as Governor Blames Trump NYC Commuters Brace for Chaos as NJ Transit Strike LoomsThat’s the

## Analysis & Questions - Section 2

### Instructions: Evaluate Answers With and Without Metadata

For each question, compare the two answers provided:
- One generated **without** metadata
- One generated **with** metadata

---

### Steps:

1. Use the following evaluation criteria:
   - Clarity
   - Detail & Depth
   - Use of Context
   - Accuracy & Grounding
   - Relevance
   - Narrative Flow

2. For each criterion, write brief notes comparing how the answer **without metadata** performs versus the answer **with metadata**.

3. Summarize your evaluation in a markdown table with the following columns:

| Criteria             | WITHOUT METADATA       | WITH METADATA          |
|----------------------|------------------------|------------------------|
| Clarity              | [Your brief note here] | [Your brief note here] |
| Detail & Depth       | [Your brief note here] | [Your brief note here] |
| Use of Context       | [Your brief note here] | [Your brief note here] |
| Accuracy & Grounding | [Your brief note here] | [Your brief note here] |
| Relevance            | [Your brief note here] | [Your brief note here] |
| Narrative Flow       | [Your brief note here] | [Your brief note here] |

---

**Note:** Keep comments short and clear for easy comparison.



**RESPONSE:**

| Criteria                 | WITHOUT METADATA                                                                                                                                                                                                             | WITH METADATA                                                                                                                                                                                  |
| ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Clarity**              | The answers are generally clear but lack specific company context, making them less precise.                                                                                                                                 | The answers are highly clear, as they directly tie information to specific companies, sectors, and industries.                                                                                 |
| **Detail & Depth**       | Provides a broad, generalized summary of financial trends, often citing news without linking it to specific entities. The depth is limited to the provided news snippets.                                                    | Offers a richer, more detailed answer by explicitly mentioning companies (e.g., BlackRock, Goldman Sachs, Microsoft) and their respective sectors, providing a more granular view.             |
| **Use of Context**       | The model struggles with company-specific questions (like Microsoft), returning a "lack of information" response because its retrieval mechanism isn't explicitly tied to tickers.                                           | The model effectively uses the enriched context. For example, it correctly identifies Microsoft's partnerships and Amazon's AI investments, directly answering the questions.                  |
| **Accuracy & Grounding** | Answers are accurate relative to the retrieved snippets, but can be misleading due to the lack of specific entity-based grounding. For example, it might mention a trend without identifying which companies are driving it. | The answers are highly grounded and accurate. By including metadata, the model can explicitly reference companies and their roles, making the information more verifiable and reliable.        |
| **Relevance**            | The answers are relevant to the general topic but may not be the most precise. For example, for the Amazon question, the answer is a little generic and pulls in irrelevant context.                                         | The answers are highly relevant and focused. The metadata ensures that the retrieved context is more specific to the query, leading to a more targeted response.                               |
| **Narrative Flow**       | The text often reads as a collection of loosely connected facts, as it combines snippets from various companies and topics without a unifying thread.                                                                        | The narrative flows more cohesively. The inclusion of company names and sectors allows the model to build a more logical and structured narrative that directly addresses the user's question. |

This qualitative comparison indicates that enriching a Retrieval-Augmented Generation (RAG) pipeline with structured financial metadata (company name, ticker symbol, sector, and industry) improved the answers on every criterion in the table above. Compared to the baseline, which relies on semantic similarity alone, the metadata-enhanced pipeline produced answers that were clearer, more accurate, and better grounded in verifiable context, while also keeping a coherent narrative structure.
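
One way to tie retrieval to specific entities is to filter the retrieved hits by their metadata before building the prompt. The snippet below is a minimal sketch of that idea, not the exact code used in this notebook: the function name `filter_hits`, the hit layout (`score` plus a `metadata` dict), and the similarity scores are assumptions made for illustration. Only the tickers, dates, and sectors come from the output shown earlier.

```python
from typing import Optional


def filter_hits(
    hits: list[dict],
    ticker: Optional[str] = None,
    sector: Optional[str] = None,
    top_k: int = 5,
) -> list[dict]:
    """Keep the best-scoring hits whose metadata matches the given filters.

    Assumes each hit looks like {"score": float, "metadata": {...}} and that
    a higher score means a closer match (both are illustrative assumptions).
    """
    kept = []
    for hit in hits:
        meta = hit["metadata"]
        # A hit without the requested field never matches the filter.
        if ticker is not None and meta.get("ticker") != ticker:
            continue
        if sector is not None and meta.get("sector") != sector:
            continue
        kept.append(hit)
    kept.sort(key=lambda h: h["score"], reverse=True)
    return kept[:top_k]


# Hypothetical scores; tickers, dates, and sectors are taken from the output above.
hits = [
    {"score": 0.85, "metadata": {"ticker": "ENPH", "date": "2025-05-23"}},
    {"score": 0.82, "metadata": {"ticker": "HAL", "date": "2025-05-21", "sector": "Energy"}},
    {"score": 0.81, "metadata": {"ticker": "BKR", "date": "2025-05-21", "sector": "Energy"}},
    {"score": 0.60, "metadata": {"ticker": "WMB", "date": "2025-05-29", "sector": "Energy"}},
]

energy_hits = filter_hits(hits, sector="Energy", top_k=2)
print([h["metadata"]["ticker"] for h in energy_hits])  # ['HAL', 'BKR']
```

Filtering after retrieval is simple, but it can return fewer than `top_k` results when few hits match, so in practice it is common to retrieve a larger candidate set first and filter afterwards.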

The strategic integration of metadata acts as a contextual anchor for the Large Language Model (LLM), reducing ambiguity and ensuring that generated outputs remain closely tied to relevant entities and their roles. This not only mitigates the risk of vague or misleading responses but also enhances trustworthiness, an essential requirement in high-stakes domains like finance.
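
The "contextual anchor" can be made concrete by looking at how each retrieved document is rendered into the prompt. The sketch below reproduces the enriched context layout printed in the RAG v2 output (`DATE | TICKER | COMPANY | SECTOR | INDUSTRY` followed by a `SNIPPET` line). The function name `format_enriched_context`, the dictionary keys, and the `N/A` fallback are assumptions for illustration and may differ from the notebook's actual implementation.

```python
def format_enriched_context(docs: list[dict]) -> str:
    """Render retrieved documents as metadata-labeled context for the prompt.

    Each doc is assumed to be a dict with 'date', 'ticker', and 'snippet',
    plus optional 'company', 'sector', and 'industry' keys.
    """
    lines = []
    for doc in docs:
        header = " | ".join(
            [
                f"DATE: {doc['date']}",
                f"TICKER: {doc['ticker']}",
                # Fall back to a placeholder when a metadata field is missing.
                f"COMPANY: {doc.get('company', 'N/A')}",
                f"SECTOR: {doc.get('sector', 'N/A')}",
                f"INDUSTRY: {doc.get('industry', 'N/A')}",
            ]
        )
        lines.append(f"- {header}")
        lines.append(f"  SNIPPET: {doc['snippet']}")
    return "\n".join(lines)


# Example document taken from the enriched context shown earlier (snippet shortened).
docs = [
    {
        "date": "2025-05-21",
        "ticker": "HAL",
        "company": "Halliburton Company",
        "sector": "Energy",
        "industry": "Oil & Gas Equipment & Services",
        "snippet": "Tariffs, Prices, and Pain: What's Next for Oilfield Service?",
    }
]

print(format_enriched_context(docs))
```

Labeling every field explicitly gives the model named entities to cite, which is what lets the answer refer to "Halliburton (HAL)" rather than to an unattributed trend.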

From a broader perspective, this approach exemplifies the principles of context engineering: the deliberate structuring and enrichment of information to optimize LLM performance. Because it only requires entity-level metadata, the same strategy should carry over to other sectors where such metadata is available, supporting more precise, reliable, and explainable AI systems.