### Guide to LangChain agents: available agents and custom agent creation

#### This notebook covers the pre-built agents available in the langchain module and shows how to build a custom agent

#### Link to all agents available with langchain. Refer to the toolkits section of the documentation: https://python.langchain.com/v0.1/docs/integrations/toolkits/

#### Link to langchain custom agent creation. Go through all subsections under agents, and refer to the section on tool calling: https://python.langchain.com/v0.1/docs/modules/agents/

#### Link to a short course on agentic AI: https://learn.deeplearning.ai/courses/functions-tools-agents-langchain/lesson/1/introduction

In [1]:
### Initialize the LLM object using langchain. A separate guide covers this in detail.
# from langchain.embeddings import AzureOpenAIEmbeddings
# from langchain_openai import AzureOpenAIEmbeddings
# from langchain_community.chat_models import AzureChatOpenAI

## FAISS is used to create and store the embeddings of the chunks
from langchain_community.vectorstores import FAISS

# OpenAIEmbeddings is imported from langchain_openai; the langchain_community import path is deprecated
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# The following parameters are required; update them before running. Make sure the embedding model and the LLM are deployed within the same resource.
openai_key="<Enter OpenAI Key Here>"
llm = ChatOpenAI(api_key=openai_key,model="gpt-4o-mini")

embedding = OpenAIEmbeddings(api_key=openai_key)


# llm = AzureChatOpenAI(azure_deployment=api_deployment,
#                       api_version=api_version,
#                       azure_endpoint=base_url,
#                       api_key=api_key,
#                       model=model_name)


# embedding = AzureOpenAIEmbeddings(deployment=embedding_deployment,
#                                         openai_api_key=api_key,
#                                         openai_api_version=api_version,
#                                         azure_endpoint=base_url,
#                                         openai_api_type="azure")



### CSV Agent

##### Agent that takes a pandas dataframe and a user prompt as input. It generates and executes Python code using the python_repl_ast tool.
##### Refer to the detailed documentation: https://python.langchain.com/v0.1/docs/integrations/toolkits/csv/


In [2]:
import pandas as pd
from langchain_experimental.agents import create_pandas_dataframe_agent
from langchain.callbacks import get_openai_callback

In [3]:
class csv_agent_res:
    
    def __init__(self,query,llm,kpi_df):
        '''
        Initiate csv agent class.
        query: User Prompt
        llm: Langchain llm object
        kpi_df: Dataframe to query.
        '''
        self.query=query
        self.llm=llm
        self.kpi_df=kpi_df
        self.kpi_df_head = kpi_df.head()  
        self.prefix="""
                        You are working with python dataframe. The name of dataframe is `df`.
                        Do not create your own dataframe.
                        Do not create your own question.

                        You should use the 'python_repl_ast' tool to answer the question posed to you. Use the format given below:
                            Question: Input question you must answer. Do not create your own question and dataframe. Use provided df and question to answer.
                            Thought: You should always think about what to do and what action to take. Do not repeat the same thought.
                            Action: python_repl_ast
                            Action Input: Input to the action, never add backticks "`" around the action
                            Observation: The result of the action. Memorize the Observation for each thought. Do not repeat the same thought.
                            ...(Different Thought/Action/Action Input/Observation can repeat N times step by step and stop once you get the Final Answer.)
                            Thought: I know the Final Answer. Generate Final Answer. Stop when you know Final Answer and generate Final Answer.
                            Final Answer: The final answer to the original input question
            """
        self.suffix="""
                    This is result of `print(df.head())` to understand columns content:
                    {df_head}
                    
                    Customer column is customer or company who signed the contract with the legal entity.
                    
                    Begin!
                    Question: {input}
                    {agent_scratchpad}
                    """
        
        
        self.cost_details={}
        self.cost_details['prompt_tokens']=0
        self.cost_details['completion_tokens']=0
        self.cost_details['api_calls']=0
        self.cost_details['llm_cost']=0
        
    def csv_response(self):
        
        '''
        Generate response using the csv agent.
        '''
        
        try:
            #Initialize the agent
            self.pd_agent = create_pandas_dataframe_agent(llm=self.llm,
                                                            df=self.kpi_df,
                                                            suffix=self.suffix,
                                                            prefix=self.prefix,
                                                            include_df_in_prompt=None,
                                                            verbose=True,
                                                            max_iterations=5,
                                                            reduce_k_below_max_tokens=True, 
                                                            number_of_head_rows=2,
                                                            allow_dangerous_code=True,                                   
                                                            # 'Input' is not an AgentExecutor option; the query is passed when the agent is run
                                                            agent_executor_kwargs={'handle_parsing_errors':True}
                                                            )
            ## Generate cost details
            with get_openai_callback() as cb:
                self.csv_agent_response=self.pd_agent.run(self.query)
                self.cost_details['prompt_tokens']+=cb.prompt_tokens
                self.cost_details['completion_tokens']+=cb.completion_tokens
                self.cost_details['api_calls']+=cb.successful_requests
                # total_cost is the callback's cost estimate in USD
                self.cost_details['llm_cost']+=cb.total_cost
            return self.csv_agent_response,self.cost_details
        except Exception as e:
            print("Exception while generating csv_response >>> ", str(e))
            return "",self.cost_details

In [21]:
df=pd.read_csv("titanic.csv")
# user_prompt="How many passengers survived?"
# user_prompt="How many passengers survived who were travelling on class 1 tickets?"
user_prompt="How many females survived who were travelling on class 1 tickets?"

In [22]:
csv_agent = csv_agent_res(user_prompt,llm,df)

In [23]:
response = csv_agent.csv_response()

> Entering new AgentExecutor chain...
Thought: To find out how many females survived while traveling in class 1, I need to filter the dataframe for females who survived and were in Pclass 1, then count the resulting entries. 
Action: python_repl_ast
Action Input: df[(df['Sex'] == 'female') & (df['Survived'] == 1) & (df['Pclass'] == 1)].shape[0]
91
I now know the final answer.
Final Answer: 91

> Finished chain.


In [24]:
response

('91',
 {'prompt_tokens': 0, 'completion_tokens': 0, 'api_calls': 0, 'llm_cost': 0})
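
The `cost_details` dictionary reserves an `llm_cost` entry alongside the token counts. As a rough illustration of how a dollar figure can be derived from token counts, here is a minimal sketch. The function name and the per-1K-token prices are hypothetical placeholders, not actual OpenAI pricing; check the provider's current price list before relying on any number.

```python
def estimate_llm_cost(prompt_tokens, completion_tokens,
                      prompt_price_per_1k, completion_price_per_1k):
    """Estimate the cost of a run from token counts and per-1K-token prices."""
    prompt_cost = (prompt_tokens / 1000) * prompt_price_per_1k
    completion_cost = (completion_tokens / 1000) * completion_price_per_1k
    return prompt_cost + completion_cost


# Hypothetical token counts and prices, for illustration only
example_cost = estimate_llm_cost(prompt_tokens=1200,
                                 completion_tokens=300,
                                 prompt_price_per_1k=0.5,
                                 completion_price_per_1k=1.5)
print(round(example_cost, 4))
```

Prompt and completion tokens are priced separately because providers typically charge different rates for input and output.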

## Custom Agents

#### Build a custom agent, which provides the flexibility to use custom tools for execution.
#### This notebook demonstrates a custom agent that retrieves information from the vector database and performs calculations using a maths tool. Make sure the Index folder created during the Step 2 execution is present; otherwise, recreate the vector embeddings using FAISS and store them in the Index folder.
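
The reason for a separate maths tool is that arithmetic is evaluated by code rather than by the language model. The sketch below illustrates that idea with a small, restricted expression evaluator built on Python's `ast` module. It is an illustration only and is not how `LLMMathChain` is implemented; the name `evaluate_expression` is hypothetical.

```python
import ast
import operator

# Only these arithmetic operators are permitted
_BINARY_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPERATORS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_expression(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is excluded explicitly)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPERATORS:
            return _BINARY_OPERATORS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPERATORS:
            return _UNARY_OPERATORS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access and everything else are rejected
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))


print(evaluate_expression("20755 + 25890"))
```

Because only numeric literals and a fixed set of operators are accepted, anything else in the input (function calls, names, attribute access) raises `ValueError` instead of being executed.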

In [9]:
from langchain.tools import tool
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.chains import LLMMathChain


In [10]:
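try:
    # Older langchain versions load the index without an explicit opt-in
    db=FAISS.load_local("./Index",embedding)
except ValueError:
    # Newer versions raise ValueError unless pickle deserialization is explicitly allowed.
    # Enable it only for index files you created yourself.
    db=FAISS.load_local("./Index",embedding,allow_dangerous_deserialization=True)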
    
user_query="What is total liability of assests as of december 2023 and september 2023."

#Retrieve the most similar content to pass for simple QA tool.
most_similar_chunks = db.similarity_search_with_score(user_query,k=10)
aggregated_content = "\n".join([chunk[0].page_content for chunk in most_similar_chunks])

In [11]:
# There are multiple ways to add a prompt (e.g. a system prompt or chat history) when calling LLMs through langchain.
# The most commonly used approach is shown here.
from langchain_core.prompts import ChatPromptTemplate
    
# Create the tools that the custom agent runs. This notebook creates 2 tools.
# 1. Tool 1: generates a summary to answer the user question.
# 2. Tool 2: executes maths expressions using LLMMathChain.

# The objective is to make calculations more robust. LLMs have been observed to make mistakes in calculations, so calculations are delegated to LLMMathChain, which builds executable maths expressions.


## The docstring of each tool serves as the prompt for tool selection.

@tool
def extract_information(query: str):
    """
    1. Use this tool to retrieve information in the form of mathematical expressions to answer the user question.
    2. Refer to these example questions for selecting this tool:
      e.g. 1. What is the sum of revenue for the years 2023 and 2024 for the Tibco company?
           2. What is the total liability for 2024 and 2020?
    """
    

    # Add a custom prompt that produces the response as a maths expression. The retrieved content is provided here.
    
    template = ChatPromptTemplate.from_messages([
        ("system", """
                    You are an assistant to a financial firm. 
                    Generate the mathematical expression that answers the question based on the provided content. 
                    Make sure the mathematical expression is capable of answering the user question.
                    
                    Response sample: Sum the number 2000 and 3000. 
                    """),
         ("human", "Question: {question} \n Content: {content}")])

    # All template variables must be supplied when invoking the prompt
    prompt_value = template.invoke({"question": query,"content":aggregated_content})
    messages = prompt_value.to_messages()
    response = llm.invoke(messages)
    return response.content

@tool
def maths_chain(query: str):
    """
    1. Use this tool to answer questions that contain mathematical expressions.
    2. Do not use this tool to retrieve information.
    3. Refer to these examples:
       1. Sum the 30000 and 4000.
       2. Multiply the 300 by 200.
    """
    llm_math = LLMMathChain.from_llm(llm)
    response = llm_math.invoke(query)
    return response
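
Since the docstring of each tool is what the agent sees when choosing a tool, it helps to picture how a decorator can capture a function's name and docstring. The sketch below is a simplified stand-in, not LangChain's `@tool` implementation; `simple_tool`, `TOOL_REGISTRY`, `add_numbers`, and `render_tool_descriptions` are hypothetical names used only for illustration.

```python
import inspect

# Hypothetical registry mapping tool name -> callable and description
TOOL_REGISTRY = {}


def simple_tool(func):
    """Register a function as a tool, using its docstring as the description."""
    TOOL_REGISTRY[func.__name__] = {
        "func": func,
        "description": inspect.getdoc(func) or "",
    }
    return func


@simple_tool
def add_numbers(query: str) -> str:
    """Use this tool to add two whitespace-separated numbers."""
    first, second = query.split()
    return str(float(first) + float(second))


def render_tool_descriptions():
    """Build the text block a prompt could show to the model for tool selection."""
    return "\n".join(
        f"{name}: {entry['description']}" for name, entry in TOOL_REGISTRY.items()
    )


print(render_tool_descriptions())
```

A vague or overlapping docstring gives the model little to distinguish one tool from another, which is why the tools above spell out when to use them and include example questions.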

In [12]:
outer_agent_PREFIX = """Answer the following questions as best as you can. Do not modify the original question; use it exactly as given. You have access to the following tools. You must select at least one of the two tools mentioned below to generate the response to the question:"""

outer_agent_FORMAT_INSTRUCTIONS = """
Do not validate the response generated with one tool with the other tool.
Do not use the other tool if the tool that was initially selected is not able to generate a response. If you are not able to find the answer, then respond with the final observation.

Use the following format strictly:
    Question: Input question you must answer. Do not create your own question. Use provided question to answer.
    Thought: You should always think about what to do and what action to take. Do not repeat the same thought. Memorize the Observation for each thought.
    Action: The action to take, should be one of [{tool_names}]. Strictly select one of the tools.
    Action Input: Input to the action, never add backticks "`" around the action. Do not select any tool which is not in list [{tool_names}]. 
    Observation: The result of the action. Memorize the Observation for each thought. Do not repeat the same thought.
    ...(Different Thought/Action/Action Input/Observation can repeat N times step by step and stop once you get the Final Answer.)
    Thought: I know the Final Answer. Generate Final Answer. Stop when you know Final Answer and generate Final Answer. You must stop execution once final answer is generated.
    Final Answer: Respond with the final answer to the original input question.
"""
                    
                                                            
outer_agent_SUFFIX = """
    Begin!

    Question: {input}
    Thought:{agent_scratchpad}
    """

In [13]:
tools = load_tools([])
tools = tools + [extract_information,maths_chain]

agent = initialize_agent(tools,
                        llm,
                        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                        verbose=True,
                        max_iterations=5,
                        handle_parsing_errors=True,
                        agent_kwargs={'prefix':outer_agent_PREFIX,
                                    'format_instructions':outer_agent_FORMAT_INSTRUCTIONS,
                                    'suffix':outer_agent_SUFFIX,
                                    "handle_parsing_errors":True})



In [14]:
answer = agent.invoke(user_query)


> Entering new AgentExecutor chain...
Question: What is total liability of assests as of december 2023 and september 2023.
Thought: I need to find the total liability of assets for two specific dates, December 2023 and September 2023. This requires retrieving the respective values for those dates.
Action: extract_information
Action Input: What is total liability of assets as of December 2023 and September 2023?
Observation: Total liability for December 2023 and September 2023 can be expressed as:

Total Liabilities = Liabilities (December 31, 2023) + Liabilities (September 30, 2023)

Thus, 

Total Liabilities = 20,755 + 25,890
Thought: I can now calculate the total liabilities using the values obtained from the previous observation. 

Action: maths_chain
Action Input: Sum the 20755 and 25890.
Observation: {'question': 'Sum the 20755 and 25890.', 'answer': 'Answer: 46645'}
Thought: I know the Fi

In [15]:
answer

{'input': 'What is total liability of assests as of december 2023 and september 2023.',
 'output': '46645'}
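
The agent run relies on the model emitting text in the `Action:` / `Action Input:` / `Final Answer:` layout requested by the format instructions, and this is why `handle_parsing_errors` matters when the model drifts from it. The sketch below shows, in simplified form, how such text could be split into a tool call or a final answer. It is an assumption-laden illustration and not LangChain's own output parser; `parse_react_step` is a hypothetical name.

```python
import re

# "Action: <tool>" followed on a later line by "Action Input: <input>"
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(?P<tool>.+?)\s*\n\s*Action Input\s*:\s*(?P<tool_input>.+)",
    re.DOTALL,
)
FINAL_PATTERN = re.compile(r"Final Answer\s*:\s*(?P<answer>.+)", re.DOTALL)


def parse_react_step(text: str) -> dict:
    """Return either a tool call or a final answer parsed from model output."""
    final = FINAL_PATTERN.search(text)
    if final:
        return {"type": "final", "answer": final.group("answer").strip()}
    action = ACTION_PATTERN.search(text)
    if action:
        return {
            "type": "action",
            "tool": action.group("tool").strip(),
            "tool_input": action.group("tool_input").strip(),
        }
    # Output that matches neither layout is a parsing error
    raise ValueError("Could not parse model output")


step = parse_react_step(
    "Thought: I can now calculate the total.\n"
    "Action: maths_chain\n"
    "Action Input: Sum the 20755 and 25890."
)
print(step)
```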

In [16]:
### Workaround: update the prompt with more reliable few-shot examples. There is no limit on adding new tools to the agent. It also supports nested agent workflows.
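
One way to add few-shot examples is to generate the agent suffix from a list of worked examples. The sketch below assumes the suffix keeps the `{input}` and `{agent_scratchpad}` placeholders used earlier in this notebook; the example questions and the helper name `build_few_shot_suffix` are hypothetical and should be replaced with examples from your own data.

```python
# Hypothetical few-shot examples showing which tool suits which question
FEW_SHOT_EXAMPLES = [
    {
        "question": "What is the sum of revenue for 2023 and 2024?",
        "tool": "extract_information",
        "tool_input": "What is the sum of revenue for 2023 and 2024?",
    },
    {
        "question": "Sum the 30000 and 4000.",
        "tool": "maths_chain",
        "tool_input": "Sum the 30000 and 4000.",
    },
]


def build_few_shot_suffix(examples):
    """Build an agent suffix that lists worked examples before the real question."""
    lines = ["Examples of tool selection:"]
    for number, example in enumerate(examples, start=1):
        lines.append(f"Example {number}:")
        lines.append(f"    Question: {example['question']}")
        lines.append(f"    Action: {example['tool']}")
        lines.append(f"    Action Input: {example['tool_input']}")
    # The placeholders are left as literal text for the prompt template to fill
    lines.extend(["", "Begin!", "", "Question: {input}", "Thought:{agent_scratchpad}"])
    return "\n".join(lines)


few_shot_suffix = build_few_shot_suffix(FEW_SHOT_EXAMPLES)
print(few_shot_suffix)
```

The resulting string could be passed as the `suffix` entry of `agent_kwargs`. Example text must not contain curly braces, since the prompt template would treat them as variables.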