#### @prompt example

## LangChain with GeminiAI
In this notebook we will build two agents for analyzing financial information.
The first agent will use two tools to perform the following functionality#
- Extract the Sector which has performed better over the past week, month, quarter and year

- Once this sector is found , we will ask our agent to retrieve few companies from this sector, and produce a table where those companies are sorted by P/E descending

The second agent will use a Chroma store which has been populated with a document containing Altria Group Q424 earning transcripts, and we  will ask our agent to summarize the call.

#### Installing dependencies

In [9]:
!pip install langchain
!pip install -U langchain-google-genai
!pip install -U -q "google-genai==1.7.0"
!pip install langchain_community
!pip install docx2txt
!pip install chromadb
!pip install wikipedia
!pip install finvizfinance

Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<7.0.0,>=3.20.2 (from google-ai-generativelanguage<0.7.0,>=0.6.16->langchain-google-genai)
  Downloading protobuf-3.20.3-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (679 bytes)
Downloading protobuf-3.20.3-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.1/1.1 MB[0m [31m24.1 MB/s[0m eta [36m0:00:00[0m00:01[0m
[?25hInstalling collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 5.29.4
    Uninstalling protobuf-5.29.4:
      Successfully uninstalled protobuf-5.29.4
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-cloud-bigtable 2.27.0 requires google-api-core[grpc]<3.0.0dev,>=2.16.0, but you have google-api-core 1.34.1 

#### Loading Keys

In [10]:
import os
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

### Importing Libraries


### Creating our external functions
* get_sectors_performance will retrieve data from finvizfinance containing last week, month, quarter and year peformance. I had to pass a dummy parameter as google API prevented me from using as a tool a function with no parameters. Same function works though with OpenAI LLM
* get companies for sector accepts a sector as input and will retrieve from finvizfinance companies for a specific sector.More filters have been added in the query to make sure that 'enough' data is returned

In [None]:
from langchain_core.tools import tool

@tool
def get_sectors_performance(sector:str = None):
    """ Useful for getting performance of each  sector for last week, last month, last quarter, last half year and last year. **This tool does not require any input from the user.**"""
    
    from finvizfinance.group import Performance

    try:
        performance = Performance()
        # Get the performance data
        return performance.screener_view().to_dict('records')
    except Exception as e:
        print(f"An error occurred: {e}")
        print("Please ensure the finvizfinance library is installed correctly.")
        print("You can install it using: pip install finvizfinance")
        print("Also, check your internet connection as the library fetches data from Finviz.")

@tool
def get_companies_for_sector(sectorName:str):
    """ Return a subset of companies for the given sector"""
    from finvizfinance.screener.overview import Overview
    foverview = Overview()
    filters_dict = {'Sector': sectorName,
                    'Market Cap.': '+Small (over $300mln)',
                    'Average Volume': 'Over 200K',
                    'Current Ratio': 'Over 1',
                    'Debt/Equity': 'Under 1',
                    'InstitutionalOwnership': 'Under 60%',
                    'Price': 'Over $10'}
    foverview.set_filter(filters_dict=filters_dict)
    df = foverview.screener_view(order='Company')
    return df.head(5).to_dict('records')


### Chat Memory

In [None]:

MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful stock recommendation assistant , but dont know current events so you should use your tools as much as you can.",
        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate


# Initialize Gemini Pro model
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=GOOGLE_API_KEY,
    temperature=0.7
)


# Define the tools and create a "tools" node.

tools = [get_sectors_performance, get_companies_for_sector ]

# Attach the tools to the model so that it knows what it can call.
#llm_with_tools = llm.bind_tools(tools)

In [None]:
chat_history = []
chat_history.append(HumanMessage(content="Your question here"))
chat_history.append(AIMessage(content="AI response here"))
# Define your prompt
prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        MessagesPlaceholder(variable_name="chat_history"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True, max_token_limit=18000)

# Initialize the agent
agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    prompt=prompt,
    verbose=True,
    memory=memory,
)

### Finding the best performing sector across week, month, quarter, and year

In [None]:
input1 = '''Find the sector which across week, month, quarter and year has shown the best performance and summarize it.
'''
result = agent_chain.invoke({"input": input1, "chat_history": chat_history})
chat_history.extend(
    [
        HumanMessage(content=input1),
        AIMessage(content=result["output"]),
    ]
)
print(result['output'])

### Now we are going to find few companies for the best performing sector. We'll output a table sorted by P/E

In [None]:
input1 = "Now find me some companies for the sector you found in the previous step. Create  a table with Ticker, Company,  P/E and change"
result = agent_chain.invoke({"input": input1, "chat_history": chat_history})
print(result['output'])

### Now we will play around with Chroma
### We will upload an extract from Altria Group latest Earning calls and ask our llm to extract some information

In [3]:
!pip install -U -q "google-genai==1.7.0"

In [12]:
from langchain.agents import initialize_agent, AgentType
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.tools import Tool
from langchain.vectorstores import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.memory import ConversationBufferMemory
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import create_retriever_tool

import os

# --- Configure Google API Key (if needed) ---
# os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# --- Define the LLM and Embeddings ---
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# --- Load and Index the Text Document ---
def load_document_into_chroma(file_path, persist_directory="chroma_db_single"):
    """Loads a text document, creates embeddings, and stores it in ChromaDB."""
    loader = TextLoader(file_path)
    documents = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = text_splitter.split_documents(documents)
    vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory=persist_directory)
    vectorstore.persist()
    return vectorstore

# --- Define the Retrieval Tool ---
def query_chroma_document(query):
    """Queries the ChromaDB for the loaded document."""
    vectorstore = Chroma(persist_directory="chroma_db_single", embedding_function=embeddings)
    retriever = vectorstore.as_retriever()
    results = retriever.get_relevant_documents(query)
    return "\n\n".join([doc.page_content for doc in results])

vectorstore = load_document_into_chroma('/kaggle/input/altria-q424-earning-call/Altria_Q424EarningCall.txt')

retriever = vectorstore.as_retriever(search_type='mmr', search_kwargs = {'k' : 3, 'lambda_mult' : 0.7})
retriever_tool = create_retriever_tool(retriever= retriever,
                                        name="Altria Q424 Earning calls",
                                        description="For any questions regarding Altria latest earning call you must use this tool")

# --- Create the Retrieval Tool ---
retrieval_tool = Tool(
    name="Document Retriever",
    func=query_chroma_document,
    description="Useful for retrieving specific content from the loaded document.",
)

# --- Define the Prompt Template for the Agent ---
MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that can retrieve information from a specific document based on user queries.",
        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        ("user", "Use the 'Document Retriever' tool to find relevant information in the document."),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

# --- Initialize Memory for the Agent ---
memory = ConversationBufferMemory(memory_key=MEMORY_KEY, return_messages=True)

# --- Initialize the Agent ---
chromaagent = initialize_agent(
    [retrieval_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
    prompt=prompt,
)



  memory = ConversationBufferMemory(memory_key=MEMORY_KEY, return_messages=True)
  chromaagent = initialize_agent(


### Now we'll query to find highlights from Altria latest earning calls

In [None]:
input1 = '''Please provide the main higlights from latest earning call transcript'''
result = chromaagent.invoke({"input": input1})
print(result['output'])

### Checking Columbia Annual Report

In [4]:
!pip install pypdf



In [13]:

from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings  # Or any other embedding model
from langchain.vectorstores import Chroma
from langchain_core.tools import create_retriever_tool


In [14]:
from langchain.agents import initialize_agent, AgentType
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.tools import Tool
from langchain.vectorstores import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.memory import ConversationBufferMemory
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

import os

# --- Configure Google API Key (if needed) ---
# os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# --- Define the LLM and Embeddings ---
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# --- Load and Index the Text Document ---
def load_document_into_chroma(file_path, persist_directory="chroma_db_single"):
    """Loads a text document, creates embeddings, and stores it in ChromaDB."""
    loader = PyPDFLoader(file_path)
    documents = loader.load()
    # 3. Create the Chroma vector store
    vectorstore = Chroma.from_documents(documents, embeddings, persist_directory=persist_directory)
    vectorstore.persist()
    return vectorstore

# --- Define the Retrieval Tool ---
def query_chroma_document(query):
    """Queries the ChromaDB for the loaded document."""
    vectorstore = Chroma(persist_directory="chroma_db_single", embedding_function=embeddings)
    retriever = vectorstore.as_retriever()
    results = retriever.get_relevant_documents(query)
    return "\n\n".join([doc.page_content for doc in results])

vectorstore = load_document_into_chroma('/kaggle/input/columbia-annual-report/columbia_sportswear_company_ar_25.pdf')

retriever = vectorstore.as_retriever(search_type='mmr', search_kwargs = {'k' : 3, 'lambda_mult' : 0.7})
retriever_tool = create_retriever_tool(retriever= retriever,
                                        name="Columbina Annual Report",
                                        description="For any questions regarding Columbia Annual Report")

# --- Create the Retrieval Tool ---
retrieval_tool = Tool(
    name="Document Retriever",
    func=query_chroma_document,
    description="Useful for retrieving specific content from the loaded document.",
)

# --- Define the Prompt Template for the Agent ---
MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that can retrieve information from a specific document based on user queries.",
        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        ("user", "Use the 'Document Retriever' tool to find relevant information in the document."),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

# --- Initialize Memory for the Agent ---
memory = ConversationBufferMemory(memory_key=MEMORY_KEY, return_messages=True)

# --- Initialize the Agent ---
chromaagent = initialize_agent(
    [retrieval_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
    prompt=prompt,
)



In [16]:
input1 = '''What were the key financial highlights of 2024 from Columbia annual report'''
result = chromaagent.invoke({"input": input1})
print(result['output'])



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find the key financial highlights of 2024 from Columbia's annual report. I should use the document retriever to search for this information.
Action: Document Retriever
Action Input: "Columbia annual report 2024 financial highlights"[0m
Observation: [36;1m[1;3mCONSOLIDATED STATEMENTS OF OPERATIONS 
Year Ended December 31,
(in thousands, except per share amounts) 2024 2023 2022
Net sales $ 3,368,582 $ 3,487,203 $ 3,464,152 
Cost of sales  1,677,497  1,757,271  1,753,074 
Gross profit  1,691,085  1,729,932  1,711,078 
Selling, general and administrative expenses  1,443,906  1,416,313  1,304,394 
Impairment of goodwill and intangible assets  —  25,000  35,600 
Net licensing income  23,562  21,665  22,020 
Operating income  270,741  310,284  393,104 
Interest income, net  27,703  13,687  2,713 
Other non-operating income (expense), net  (257)  2,221  1,593 
Income before income tax  298,187  326,192  397,410 
Income t

In [21]:
input2 = '''Extract excerpts discussing the company's revenue growth (or decline) in 2025 and the factors contributing to it.'''
result = chromaagent.invoke({"input": input2})
print(result['output'])



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find information about the company's revenue growth or decline in 2025 and the reasons behind it. I will use the document retriever to search for relevant excerpts.
Action: Document Retriever
Action Input: "revenue growth 2025"[0m
Observation: [36;1m[1;3mtowards our PMTA and accelerated our work on an PMTA submission for Ploom. We now expect to make a combined submission in mid-year 2025. And in December, we commenced a small scale test launch of SWIC, our heated tobacco capsule product, through e-commerce in Great Britain. We expect to use consumer insights from the test to further inform our strategies. Turning to our 2025 financial outlook, we remain committed to tobacco harm reduction in the U.S. and continue to believe there is a significant opportunity to shift millions of smokers to FDA-authorized smoke-free alternatives. Our planned investment areas include market activities in support of our smoke-free p

ValueError: An output parsing error occurred. In order to pass this error back to the agent and have it try again, pass `handle_parsing_errors=True` to the AgentExecutor. This is the error: Could not parse LLM output: `The company expects to deliver 2025 full year adjusted diluted EPS in a range of $5.22 to $5.37, representing an adjusted diluted EPS growth rate of 2% to 5% from a $5.12 base in 2024. Factors influencing this outlook include:

*   One fewer shipping day in 2025 (occurring in the first quarter).
*   A limited impact on combustible and e-vapor product volumes from enforcement efforts in the illicit e-vapor market.
*   Reinvestment of anticipated cost savings related to the "optimize and accelerate" initiative.
*   Lower expected net periodic benefit income.
*   The external environment, including the cumulative impact of inflation, tobacco consumer dynamics (purchasing patterns and adoption of smoke-free products), illicit product enforcement, and regulatory litigation and legislative developments.
*   Planned investment in market activities supporting smoke-free products and continued smoke-free product research, development, and regulatory preparations.`
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE 

In [19]:
input3 = '''"What is the company's outlook or guidance for the next financial year, as stated in the report?
Extract any forward-looking statements regarding revenue, earnings, or other key metrics.'''

result = chromaagent.invoke({"input": input3})
print(result['output'])



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find the company's outlook or guidance for the next financial year. I will use the document retriever to search for keywords like "outlook", "guidance", "next financial year", "revenue", "earnings", and "forward-looking statements".
Action: [Document Retriever]
Action Input: "outlook guidance next financial year revenue earnings forward-looking statements"[0m
Observation: [Document Retriever] is not a valid tool, try one of [Document Retriever].
Thought:[32;1m[1;3mI made a mistake in calling the tool. I should use the correct tool name "Document Retriever".
Action: [Document Retriever]
Action Input: "outlook guidance next financial year revenue earnings forward-looking statements"[0m
Observation: [Document Retriever] is not a valid tool, try one of [Document Retriever].
Thought:[32;1m[1;3mI seem to be having trouble using the tool. I will try a simpler query to see if I can get any results. I will focus on "ou

In [20]:
input4 = '''What are the key drivers expected to influence the company's performance in the future, according to the report?
"Are there any specific targets or goals mentioned for the coming years?'''

result = chromaagent.invoke({"input": input4})
print(result['output'])



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find the key drivers influencing future performance and any specific targets or goals for the coming years. I will use the document retriever to search for these topics.
Action: [Document Retriever]
Action Input: "key drivers future performance targets goals"[0m
Observation: [Document Retriever] is not a valid tool, try one of [Document Retriever].
Thought:[32;1m[1;3mI apologize for the error. I need to use the correct tool.

Thought:I need to find the key drivers influencing future performance and any specific targets or goals for the coming years. I will use the document retriever to search for these topics.
Action: [Document Retriever]
Action Input: "key drivers future performance targets goals"[0m
Observation: [Document Retriever] is not a valid tool, try one of [Document Retriever].
Thought:[32;1m[1;3mI apologize for the repeated errors. I seem to be having trouble accessing the document retriever tool. I