# MARKET DATA AGENT

This notebook tests the market data tools in isolation from the full graph to confirm that they are implemented correctly.

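**NOTE**: You must have your `GOOGLE_API_KEY` defined in your `.env` file. You can get this API key here: https://aistudio.google.com/app/apikey. The news tools use `GoogleSerperAPIWrapper`, which also needs `SERPER_API_KEY` in the environment.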
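
A missing key tends to surface only later as an authentication error, so a quick check right after loading `.env` can save a confusing traceback. The sketch below is one way to do that; `missing_env_vars` is a hypothetical helper that is not part of this notebook, and the two key names are the ones this notebook's setup refers to.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Key names taken from this notebook's setup; adjust if your project uses others
required = ["GOOGLE_API_KEY", "SERPER_API_KEY"]
missing = missing_env_vars(required)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.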

In [1]:
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import HumanMessage, AIMessage
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent
from langchain_core.runnables import RunnableLambda
from langchain.tools import tool
from langchain_community.utilities import GoogleSerperAPIWrapper
import os
import pprint
from currensee.core import get_model, settings
from dotenv import load_dotenv
import pandas_datareader.data as web
from datetime import datetime, timedelta
import yfinance as yf
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt


# Load environment variables
load_dotenv()

# SERPER_API_KEY is read from the environment (.env); never hard-code a real key here
#os.environ["SERPER_API_KEY"] = "<your-serper-api-key>"


#%pip install --upgrade --quiet  langchain-community
#%pip install pandas_datareader


True

In [2]:
from dotenv import load_dotenv

load_dotenv()

True

## Agent

An agent requires:
* the model
* tools
* a name
* a prompt describing the purpose of the agent


In [3]:
# Initialize the model
model = get_model(settings.DEFAULT_MODEL)

# Initialize Google Serper API Wrapper for data retrieval
serper_api = GoogleSerperAPIWrapper()


In [4]:
# Function to summarize the outputs from any number of tools
def summarize_outputs(tool_outputs: list) -> str:
    """
    Summarizes the outputs from all provided tools into one coherent summary.
    
    Parameters:
    - tool_outputs: A list of strings (outputs from different tools)
    
    Returns:
    - A summarized string with key points from all the tool outputs.
    """
    # Combine all outputs into a formatted prompt
    combined_prompt = "\n\n".join(
        [f"Tool {i+1} Output:\n{output}" for i, output in enumerate(tool_outputs)]
    )
    combined_prompt += "\n\nPlease summarize the key points from all the outputs into one concise but comprehensive summary. Include specific numbers where applicable."
    
    # Create the messages to pass to the model
    messages = [
        HumanMessage(content=combined_prompt)
    ]
    
    # Use the 'invoke' method for summarization
    summary = model.invoke(messages)
    
    # Access the message content correctly
    return summary.content  # Return the content of the AIMessage

# Function for a generic tool to retrieve information
def query_tool(query: str, tool_func) -> str:
    """Generic function to query a tool and retrieve the output."""
    return tool_func(query)

# Initialize MemorySaver to keep track of the state if needed
memory_saver = MemorySaver()

# Run each query through its paired tool and summarize the results
# (a plain sequential pipeline, not a LangGraph ReAct agent)
def agent_function(queries: list, tool_functions: list) -> str:
    """
    This function accepts any number of queries and corresponding tool functions,
    retrieves results, and returns a summarized output.
    
    Parameters:
    - queries: List of queries to be passed to tools
    - tool_functions: List of tool functions to process the queries
    
    Returns:
    - A summarized string of all tool outputs
    """
    # Collect outputs from all tools
    tool_outputs = []
    for query, tool_func in zip(queries, tool_functions):
        #output = query_tool(query, tool_func)
        output = tool_func.invoke(query)
        tool_outputs.append(output)
    
    # Summarize the outputs from all tools
    final_summary = summarize_outputs(tool_outputs)
    return final_summary
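
The flow of `agent_function` can be exercised without a model or an API key by swapping in stand-ins. The sketch below uses a hypothetical `StubTool` class and a `fake_summarize` function (neither exists in this notebook) to show how queries are paired with tools. It also adds an explicit length check, because plain `zip` silently drops the extra items when the two lists differ in length.

```python
class StubTool:
    """Hypothetical stand-in exposing the same .invoke() method the real tools have."""
    def __init__(self, name):
        self.name = name

    def invoke(self, query):
        return f"[{self.name}] results for: {query}"

def fake_summarize(tool_outputs):
    """Stand-in for summarize_outputs: joins outputs instead of calling the model."""
    return " | ".join(tool_outputs)

def run_pipeline(queries, tool_functions, summarize=fake_summarize):
    # Guard against mismatched lists, which zip() would truncate without warning
    if len(queries) != len(tool_functions):
        raise ValueError("queries and tool_functions must have the same length")
    outputs = [tool.invoke(q) for q, tool in zip(queries, tool_functions)]
    return summarize(outputs)

print(run_pipeline(["q1", "q2"], [StubTool("news"), StubTool("macro")]))
# [news] results for: q1 | [macro] results for: q2
```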




### FINANCIAL TOOLS

In [5]:
#initial synthesized prompt?

# Parameters
client_name = "Walmart"
start_date = "1/31/2025"
end_date = "4/30/2025"
industry = "retail"
largest_holdings = ["Microsoft","Netflix","Netflix","tech sector"]
#Future: portfolio tickers(for data and news)
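
Because `largest_holdings` is later interpolated into a search query, duplicate entries and list punctuation end up in the query text. A minimal sketch of one way to clean the list first; `format_holdings` is a hypothetical helper, not part of this notebook.

```python
def format_holdings(holdings):
    """Drop duplicates (keeping first-seen order) and join into a comma-separated string."""
    unique = list(dict.fromkeys(holdings))
    return ", ".join(unique)

holdings = ["Microsoft", "Netflix", "Netflix", "tech sector"]
print(format_holdings(holdings))
# Microsoft, Netflix, tech sector
```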


In [6]:
#definitions

keywords_client = ["announces", "acquires", "launches", "earnings", "report", 
               "profit", "CEO", "crisis", "disaster","recession","recovery", "red flag", 
               "urgent","challenge","emergency", "tumble","drop","opportunity","slowdown"]

keywords_econ = ["recovery","crisis", "disaster","recession","red flag", 
               "urgent","challenge","emergency", "tumble","drop","slowdown"]

trusted_sources = ["reuters.com", "bloomberg.com", "cnn.com", "forbes.com", 
                "finance.yahoo.com","marketwatch.com","morningstar.com", "https://www.wsj.com","www.ft.com"]

# Define trusted sources
allowed_sites = trusted_sources
site_filter = " OR ".join(f"site:{site}" for site in allowed_sites)
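
Two entries in `trusted_sources` carry a scheme or a `www.` prefix, so the generated filter contains terms such as `site:https://www.wsj.com`. Whether the search backend tolerates that form is not verified in this notebook; a cautious option is to normalize every entry to a bare domain first. The `normalize_domain` helper below is hypothetical.

```python
from urllib.parse import urlparse

def normalize_domain(source):
    """Reduce a URL or host string to a bare domain such as 'wsj.com'."""
    # urlparse only fills netloc when a scheme separator is present
    host = urlparse(source).netloc if "://" in source else source
    host = host.split("/")[0].lower()
    return host[4:] if host.startswith("www.") else host

sources = ["reuters.com", "https://www.wsj.com", "www.ft.com", "finance.yahoo.com"]
domains = [normalize_domain(s) for s in sources]
normalized_filter = " OR ".join(f"site:{d}" for d in domains)
print(normalized_filter)
# site:reuters.com OR site:wsj.com OR site:ft.com OR site:finance.yahoo.com
```

Bare domains also keep the `site in link` check in the scoring functions from depending on whether a result link uses `http`, `https`, or a `www.` prefix.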

**TOOL: CLIENT AND INDUSTRY - Search for top k news articles about the client and its industry (title, snippet, date, source)**

In [7]:
# Function to retrieve news about client and its industry (Client and Industry)
@tool
def client_and_industry(query_ci: str) -> str:
    """Return the most relevant news about the client and its industry."""
    def format_google_date(date_str):
        parts = date_str.split("/")
        return f"{parts[2]}{parts[0].zfill(2)}{parts[1].zfill(2)}"

    google_start = format_google_date(start_date)
    google_end = format_google_date(end_date)

    sort_param = f"date:r:{google_start}:{google_end}"  # Google's date range format

    # Search with date filter
    search = GoogleSerperAPIWrapper(k=30, sort=sort_param)  # Pass sort parameter
    results = search.results(query_ci)

    def score_result(result):
        score = 0
        keywords = keywords_client  # Assume this variable is defined elsewhere
        link = result.get("link", "")
        title = result.get("title", "").lower()
        snippet = result.get("snippet", "").lower()

        if any(site in link for site in allowed_sites):
            score += 3

        # Lowercase each keyword so entries like "CEO" can match the lowercased text
        if any(word.lower() in title or word.lower() in snippet for word in keywords):
            score += 2

        if "date" in result:
            score += 1

        return score

    if results.get("organic"):
        sorted_results_client = sorted(results.get("organic", []), key=score_result, reverse=True)
        return sorted_results_client
    else:
        return "No results found for client or industry news."
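
To make the ranking concrete: a result earns 3 points for a trusted source, 2 for a keyword hit in the title or snippet, and 1 for carrying a date, so scores range from 0 to 6. The sketch below replays that rule on invented sample results (the titles and links are made up for illustration). The standalone `score` function follows `score_result`, with keywords lowercased before matching so that an entry like "CEO" can match the lowercased text.

```python
def score(result, sites, keywords):
    """Same rule as score_result: +3 trusted source, +2 keyword match, +1 has a date."""
    link = result.get("link", "")
    title = result.get("title", "").lower()
    snippet = result.get("snippet", "").lower()
    points = 0
    if any(site in link for site in sites):
        points += 3
    if any(w.lower() in title or w.lower() in snippet for w in keywords):
        points += 2
    if "date" in result:
        points += 1
    return points

sites = ["reuters.com", "bloomberg.com"]
keywords = ["earnings", "CEO"]
# Invented results, for illustration only
samples = [
    {"title": "Local store opens", "link": "https://example.com/a"},
    {"title": "Retailer CEO comments on outlook",
     "link": "https://www.reuters.com/b", "date": "Mar 3, 2025"},
    {"title": "Quarterly earnings preview", "link": "https://example.org/c"},
]
ranked = sorted(samples, key=lambda r: score(r, sites, keywords), reverse=True)
for r in ranked:
    print(score(r, sites, keywords), r["title"])
# 6 Retailer CEO comments on outlook
# 2 Quarterly earnings preview
# 0 Local store opens
```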



**TOOL: MACRO NEWS - Search for relevant articles about the economy (title, snippet, date, source)**

In [8]:
# Function to retrieve macroeconomic events news (Tool MACRO NEWS)
@tool
def macro_news(query_mn: str) -> str:
    """Return the most relevant macroeconomic news based on the query."""
    def format_google_date(date_str):
        parts = date_str.split("/")
        return f"{parts[2]}{parts[0].zfill(2)}{parts[1].zfill(2)}"

    google_start = format_google_date(start_date)
    google_end = format_google_date(end_date)

    sort_param = f"date:r:{google_start}:{google_end}"

    search = GoogleSerperAPIWrapper(k=30, sort=sort_param)  # Pass sort parameter
    results = search.results(query_mn)

    def score_result(result):
        score = 0
        keywords = keywords_econ  # Assume this variable is defined elsewhere
        link = result.get("link", "")
        title = result.get("title", "").lower()
        snippet = result.get("snippet", "").lower()

        if any(site in link for site in allowed_sites):
            score += 3

        if any(word in title or word in snippet for word in keywords):
            score += 2

        if "date" in result:
            score += 1

        return score

    if results.get("organic"):
        sorted_results_econ = sorted(results.get("organic", []), key=score_result, reverse=True)
        return sorted_results_econ
    else:
        return "No results found for macroeconomic events."

**TOOL: CLIENT HOLDINGS - Search for top k news articles about the client's highest exposure holdings (title, snippet, date, source)**

In [9]:
# Function to retrieve news about the client's largest holdings (Tool HOLDINGS NEWS)
@tool
def holdings_news(query_hn: str) -> str:
    """Return the most relevant news based on each major holding."""
    def format_google_date(date_str):
        parts = date_str.split("/")
        return f"{parts[2]}{parts[0].zfill(2)}{parts[1].zfill(2)}"

    google_start = format_google_date(start_date)
    google_end = format_google_date(end_date)

    sort_param = f"date:r:{google_start}:{google_end}"

    search = GoogleSerperAPIWrapper(k=30, sort=sort_param)  # Pass sort parameter
    results = search.results(query_hn)

    def score_result(result):
        score = 0
        keywords = keywords_client
        link = result.get("link", "")
        title = result.get("title", "").lower()
        snippet = result.get("snippet", "").lower()

        if any(site in link for site in allowed_sites):
            score += 3

        # Lowercase each keyword so entries like "CEO" can match the lowercased text
        if any(word.lower() in title or word.lower() in snippet for word in keywords):
            score += 2

        if "date" in result:
            score += 1

        return score

    if results.get("organic"):
        sorted_results_holdings = sorted(results.get("organic", []), key=score_result, reverse=True)
        return sorted_results_holdings
    else:
        return "No results found for holdings."
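
All three tools repeat the same `format_google_date` helper, which splits on `/`. That approach raises an `IndexError` on input without slashes and accepts out-of-range values such as month 13. A minimal sketch of a shared, validating alternative, assuming dates always arrive as month/day/year; `to_compact_date` is a hypothetical name.

```python
from datetime import datetime

def to_compact_date(date_str):
    """Convert 'M/D/YYYY' to 'YYYYMMDD', raising ValueError on malformed input."""
    return datetime.strptime(date_str, "%m/%d/%Y").strftime("%Y%m%d")

print(to_compact_date("1/31/2025"))  # 20250131
print(to_compact_date("4/30/2025"))  # 20250430
```

Defining the conversion once at module level would also let the three tools share it instead of each nesting its own copy.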

**TOOL: CLIENT HOLDINGS DATA - Provide data on the biggest movers of client portfolio**

**TOOL: MACROFIN DATA - Provide a general market update on the latest macro/financial indicators**

In [16]:

# === Fetch functions ===

def fetch_fred_data(series_id):
    """Fetch latest value from FRED."""
    data = web.DataReader(series_id, 'fred')
    return round(data.iloc[-1, 0], 2)

def fetch_cpi_levels():
    """Fetch CPI levels and calculate changes."""
    cpi = web.DataReader('CPIAUCSL', 'fred')
    latest = cpi.iloc[-1, 0]
    one_month_ago = cpi.iloc[-2, 0]
    one_year_ago = cpi.iloc[-13, 0]
    two_years_ago = cpi.iloc[-25, 0]
    return {
        'level': round(latest, 2),
        '1mo': round(((latest - one_month_ago) / one_month_ago) * 100, 2),
        '1yr': round(((latest - one_year_ago) / one_year_ago) * 100, 2),
        '2yr': round(((latest - two_years_ago) / two_years_ago) * 100, 2)
    }

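def fetch_yf_data(ticker, period='1d', interval='1d'):
    """Fetch the latest close for a ticker from Yahoo Finance."""
    data = yf.download(ticker, period=period, interval=interval)
    # 'Close' may be a one-column DataFrame; flatten so we index a scalar, not a Series
    close = data['Close'].to_numpy().ravel()
    return round(float(close[-1]), 2)

def fetch_yf_change(ticker, period):
    """Percentage change between the first and last close in the period."""
    data = yf.download(ticker, period=period, interval='1d')
    close = data['Close'].to_numpy().ravel()
    start, end = float(close[0]), float(close[-1])
    return round((end - start) / start * 100, 2)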

# === Indicators ===

indicators = {
    "S&P 500": {"fetch_func": fetch_yf_data, "source": "^GSPC"},
    "NASDAQ": {"fetch_func": fetch_yf_data, "source": "^IXIC"},
    "WTI Crude Oil Price": {"fetch_func": fetch_yf_data, "source": "CL=F"},
    "US GDP Growth Rate": {"fetch_func": fetch_fred_data, "source": "A191RL1Q225SBEA"},
    "US Unemployment Rate": {"fetch_func": fetch_fred_data, "source": "UNRATE"},
    "10-Year Treasury Yield": {"fetch_func": fetch_fred_data, "source": "GS10"},
    "Fed Funds Rate": {"fetch_func": fetch_fred_data, "source": "FEDFUNDS"},
    "US Dollar Index (DXY)": {"fetch_func": fetch_yf_data, "source": "DX-Y.NYB"},
}

# === Fetch Level Data ===

data = {}
for indicator, details in indicators.items():
    func = details["fetch_func"]
    source = details.get("source")
    value = func(source)
    data[indicator] = value

# === Fetch CPI changes ===
cpi_data = fetch_cpi_levels()

# === Fetch % changes for market indicators ===

for ticker, label in {
    "^GSPC": "S&P 500",
    "^IXIC": "NASDAQ",
    "CL=F": "WTI Crude Oil Price",
    "DX-Y.NYB": "US Dollar Index (DXY)"
}.items():
    data[f"{label} 1-Month Change (%)"] = fetch_yf_change(ticker, '1mo')
    data[f"{label} 1-Year Change (%)"] = fetch_yf_change(ticker, '1y')
    data[f"{label} 2-Year Change (%)"] = fetch_yf_change(ticker, '2y')

# === Build final DataFrame ===

df_macrofin = pd.DataFrame({
    'Indicator': [
        'S&P 500',
        'NASDAQ',
        'WTI Crude Oil Price',
        'US CPI',
        'US Dollar Index (DXY)',
        'US GDP Growth Rate',
        'US Unemployment Rate',
        '10-Year Treasury Yield',
        'Fed Funds Rate'
    ],
    'Level': [
        data['S&P 500'],
        data['NASDAQ'],
        data['WTI Crude Oil Price'],
        cpi_data['level'],
        data['US Dollar Index (DXY)'],
        data['US GDP Growth Rate'],
        data['US Unemployment Rate'],
        data['10-Year Treasury Yield'],
        data['Fed Funds Rate']
    ],
    '1-Month Change (%)': [
        data['S&P 500 1-Month Change (%)'],
        data['NASDAQ 1-Month Change (%)'],
        data['WTI Crude Oil Price 1-Month Change (%)'],
        cpi_data['1mo'],
        data['US Dollar Index (DXY) 1-Month Change (%)'],
        '', '', '', ''
    ],
    '1-Year Change (%)': [
        data['S&P 500 1-Year Change (%)'],
        data['NASDAQ 1-Year Change (%)'],
        data['WTI Crude Oil Price 1-Year Change (%)'],
        cpi_data['1yr'],
        data['US Dollar Index (DXY) 1-Year Change (%)'],
        '', '', '', ''
    ],
    '2-Year Change (%)': [
        data['S&P 500 2-Year Change (%)'],
        data['NASDAQ 2-Year Change (%)'],
        data['WTI Crude Oil Price 2-Year Change (%)'],
        cpi_data['2yr'],
        data['US Dollar Index (DXY) 2-Year Change (%)'],
        '', '', '', ''
    ]
})




### SUMMARY OF MAJOR ECONOMIC AND FINANCIAL INDICATORS

In [17]:
df_macrofin.set_index('Indicator', inplace=True)
df_macrofin

| Indicator | Level | 1-Month Change (%) | 1-Year Change (%) | 2-Year Change (%) |
|---|---|---|---|---|
| S&P 500 | 5528.75 | -1.48 | 8.06 | 32.65 |
| NASDAQ | 17366.13 | 0.39 | 8.65 | 42.2 |
| WTI Crude Oil Price | 61.85 | -10.83 | -25.15 | -19.45 |
| US CPI | 319.62 | -0.05 | 2.41 | 5.96 |
| US Dollar Index (DXY) | 99.06 | -4.79 | -6.18 | -2.57 |
| US GDP Growth Rate | 2.4 | | | |
| US Unemployment Rate | 4.2 | | | |
| 10-Year Treasury Yield | 4.28 | | | |
| Fed Funds Rate | 4.33 | | | |
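
Each change column in the table is a simple percentage change between an earlier level and the latest one: `(latest - earlier) / earlier * 100`. For CPI the earlier value is picked by row offset (1, 12, and 24 monthly observations back), while for the market tickers it is the first close in the downloaded window. A small sketch with made-up numbers; `pct_change` is a hypothetical helper that mirrors the arithmetic in `fetch_cpi_levels` and `fetch_yf_change`.

```python
def pct_change(earlier, latest):
    """Percentage change from earlier to latest, rounded to two decimals."""
    if earlier == 0:
        raise ValueError("earlier level must be non-zero")
    return round((latest - earlier) / earlier * 100, 2)

# Illustrative levels only, not real market data
print(pct_change(100.0, 110.0))  # 10.0
print(pct_change(80.0, 60.0))    # -25.0
```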


### SUMMARY - CLIENT, ITS SECTOR AND MACRO NEWS

In [18]:
# Query for stock market and industry-related news
query_ci = f"{site_filter} news about {client_name} and about {industry} industry"

# Query for macroeconomic events
query_mn = f"{site_filter} news about relevant macro events and the economy."


In [19]:
# Example queries
queries = [query_ci,query_mn]

# Tool functions list (can be extended with more tools)
tool_functions = [client_and_industry, macro_news]

# Run the agent and get the summary
final_summary = agent_function(queries, tool_functions)

# Print the result
print(final_summary)

The combined outputs reveal a complex economic picture dominated by the impact of President Trump's tariffs and their ripple effects on various sectors.  Walmart, a central focus of the first output, experienced strong sales growth in 2024, with revenue reaching $680.47 billion (LSEG estimate) and a 78% stock surge (Bloomberg).  However, 2025 presented a more uncertain outlook, with Walmart's online business projected to achieve profitability but overall profit forecasts being lower than expected.  The company's actions reflect this uncertainty: they are pushing Chinese suppliers to cut prices to offset tariffs, cutting roles and closing an office, and investing heavily ($4.51 billion in Canada) in expansion and supply chain improvements.  Despite its strong performance, Walmart's position as the largest retailer was overtaken by Amazon in February 2025.



### SUMMARY - CLIENT HOLDINGS

In [20]:
# Query for holdings
query_ch = f"{site_filter} news about any of these top holdings: {', '.join(largest_holdings)}"

queries = [query_ch]

# Tool functions list (can be extended with more tools)
tool_functions = [holdings_news]

# Run the agent and get the summary
final_summary = agent_function(queries, tool_functions)

# Print the result
print(final_summary)

Several recent financial news articles highlight the strong performance and investment potential within the technology sector, particularly in artificial intelligence (AI).  Multiple sources point to significant investments in AI infrastructure by major tech companies like Microsoft and Meta, involving billions of dollars in areas such as Nvidia chips, custom silicon, and data center expansion.  This investment is driving growth and fueling the sector's overall positive performance.

Specific companies frequently mentioned as strong performers or attractive investments include Microsoft (MSFT), Nvidia (NVDA),  Tesla, Netflix (NFLX), Amazon (AMZN), and Apple (AAPL).  One article notes that Microsoft's stock has gained 342% over the last decade.  Another highlights Tesla's 23.8% surge and Netflix's 11.9% jump in a single week.  The technology sector as a whole is noted for its strong performance, with the S&P 500 reaching record highs partly due to positive tech results and AI investment