# Price Discovery Project Using Agents

This notebook contains experiments in price discovery using agents.

## Import Libraries

In [92]:
import os
import json
from typing import Any, Union


import PIL
from dotenv import load_dotenv

from langchain_community.tools.tavily_search import TavilySearchResults
from langchain.agents import AgentExecutor, Tool, create_react_agent, create_self_ask_with_search_agent
from langchain.agents.agent import AgentAction
from langchain_core.callbacks.base import BaseCallbackHandler
from langchain_core.messages import HumanMessage
from langchain_community.llms import HuggingFaceEndpoint
from langchain_community.utilities import SerpAPIWrapper
from langchain.prompts import PromptTemplate
from langchain.tools.retriever import create_retriever_tool
from langchain_core.output_parsers import JsonOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI


## Specify Variables

In [2]:

# Load environment variables
load_dotenv()

True
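
The later cells depend on API keys being available in the environment once `load_dotenv()` has run. A minimal sketch of a sanity check follows. The helper name `missing_env_vars` is hypothetical, and apart from `GOOGLE_API_KEY` (which this notebook reads explicitly) the key names are assumptions based on the usual defaults for Tavily and the Hugging Face Hub.

```python
import os
from typing import Iterable, List


def missing_env_vars(names: Iterable[str]) -> List[str]:
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.getenv(name)]


# Assumed key names; adjust them to whatever your .env file actually defines.
REQUIRED_KEYS = ["GOOGLE_API_KEY", "TAVILY_API_KEY", "HUGGINGFACEHUB_API_TOKEN"]

missing = missing_env_vars(REQUIRED_KEYS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running a check like this right after loading the `.env` file surfaces a missing key before any model or tool is instantiated.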

## Create agent for retrieval

**Instantiate the LLMs**

In [105]:
# Mixtral

repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
repo_id_two = "WizardLM/WizardCoder-33B-V1.1"
repo_id_three = "Phind/Phind-CodeLlama-34B-v2"

mixtral = HuggingFaceEndpoint(
    repo_id=repo_id,
    temperature=0.3
)



Token has not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\Spectra\.cache\huggingface\token
Login successful


In [93]:

# Set up the Google API
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")

gemini = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY, temperature=0.3)
# gemini_vision = ChatGoogleGenerativeAI(model="gemini-pro-vision", google_api_key=GOOGLE_API_KEY, temperature=0.1)

**Instantiate prompts**

In [106]:

# Create prompts
prompt = """
You are an expert software developer proficient at writing code and building reliable software. Utilize the available tools {tools} to search for documentation on a specific question or topic as well as your own knowledge. 
Then, synthesize the information gathered to provide a detailed step-by-step guide with code examples that one can easily follow.


Use the following format:

Question: the input question you must answer.
Thought: you should always think about what to do. 
Action: the action to take, should be one of [{tool_names}].
Action Input: the input to the action. It SHOULD be a string.
Action: choose another tool if a tool fails or handle its error.
Observation: the result of the action, including insights gained or findings.
... (this Thought/Action/Action Input/Observation can repeat N times). If no result is found, use your own knowledge.
Thought: I should recheck my answer and make sure it meets the specifications. If not, try again before coming up with the final answer.
Thought: I now know the final answer, which is based on an in-depth analysis of my findings and observations.
Final Answer: the final answer to the original input question, which must be as accurate as possible. NEVER say you can't answer. 
Begin!

Question: {input}
Thought: {agent_scratchpad}

"""
prompt_two = """

You are an expert developer assistant trained to help programmers build software. You can access and process information from the web using {tools}.
When a developer asks a question about a specific technology or framework, use your knowledge and search capabilities to:

1. Find relevant documentation and tutorials on that framework.
2. Identify and extract code examples from the retrieved resources.
3. Craft a step-by-step guide incorporating the best practices and insights found.
4. Ensure the guide is detailed and answers the question asked.
5. Present the information in a detailed step-by-step guide with code examples that one can easily follow.


Use the following format:

Question: the input question you must answer.
Thought: you should always think about what to do. 
Action: the action to take, should be one of [{tool_names}].
Action Input: the input to the action.
Action: choose another tool if a tool fails or handle its error.
Observation: the result of the action, including insights gained or findings.
... (this Thought/Action/Action Input/Observation can repeat N times). If no result is found, use your own knowledge.
Thought: I should recheck my answer and make sure it meets the specifications. If not, try again before coming up with the final answer.
Thought: I now know the final answer, which is based on an in-depth analysis of my findings and observations.
Final Answer: the final answer to the original input question, which must be as accurate as possible. NEVER say you can't answer. 
Begin!

Question: {input}
Thought: {agent_scratchpad}

"""

prompt_three ="""
You are an expert developer assistant trained to help programmers build software. You can access and process information from the web using {tools}.
When a developer asks a question about a specific technology or framework, use your knowledge and search capabilities to:

1. Analyze the question.
2. Find relevant documentation and tutorials needed to fully answer that question.
3. Identify and extract code examples from the retrieved resources.
4. Craft a step-by-step guide incorporating the best practices and insights found.
5. Ensure the guide is detailed and answers the question asked.
6. Present the information in a detailed step-by-step guide with code examples that one can easily follow.


Use the following format:

Question: the input question you must answer.
Thought: you should always think about what to do. 
Action: the action to take, should be one of [{tool_names}].
Action Input: the input to the action.
Action: choose another tool if a tool fails or handle its error.
Observation: the result of the action, including insights gained or findings.
... (this Thought/Action/Action Input/Observation can repeat N times). If no result is found, use your own knowledge.
Thought: I now know the final answer, which is based on an in-depth analysis of my findings and observations.
Final Answer: the final answer to the original input question, which must be as accurate as possible. NEVER say you can't answer. 
Begin!

Question: {input}
Thought: {agent_scratchpad}
"""


In [115]:
prompt_template = PromptTemplate.from_template(prompt_three)
prompt_template_two = PromptTemplate.from_template(prompt_two)
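
Each prompt above relies on four placeholders: `{tools}`, `{tool_names}`, `{input}` and `{agent_scratchpad}`. As a rough sketch, assuming plain format-style templates like the ones in this notebook, the standard library can confirm that a template string exposes all of them before it is handed to an agent. The helper names `template_variables` and `missing_variables` are hypothetical.

```python
from string import Formatter
from typing import Set

# Placeholders used by the ReAct-style prompts defined in this notebook.
REQUIRED_VARS = {"tools", "tool_names", "input", "agent_scratchpad"}


def template_variables(template: str) -> Set[str]:
    """Collect the {placeholder} names used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_variables(template: str) -> Set[str]:
    """Return the required placeholders that the template does not contain."""
    return REQUIRED_VARS - template_variables(template)


example = (
    "Use {tools}. Pick one of [{tool_names}].\n"
    "Question: {input}\n"
    "Thought: {agent_scratchpad}"
)
print(missing_variables(example))  # set()
```

Calling `missing_variables(prompt_two)` on any of the prompt strings above would report the placeholders that were dropped while editing the text.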

**Instantiate Agents**

In [108]:
# Instantiate error handler
class RetryCallbackHandler(BaseCallbackHandler):
    """Report tool errors as they happen."""

    def on_tool_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> Any:
        # Callbacks only observe a run: their return values are not used by the
        # executor, and AgentAction has no RERUN or CHANGE_TOOL members, so the
        # handler cannot re-run or swap a tool. It just logs the failure.
        print(f"Tool error ({type(error).__name__}): {error}")
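
The retry-then-fall-back idea behind this handler can also be sketched in plain Python by wrapping the function a tool calls. This is an illustration under stated assumptions, not a LangChain API: `with_retries` is a hypothetical helper, and it assumes the wrapped function takes a single string and returns a string.

```python
import time
from typing import Callable


def with_retries(
    func: Callable[[str], str],
    max_retries: int = 1,
    delay_seconds: float = 0.0,
) -> Callable[[str], str]:
    """Wrap func so that a failing call is retried up to max_retries times."""

    def wrapper(query: str) -> str:
        last_error = None
        for attempt in range(max_retries + 1):
            try:
                return func(query)
            except Exception as error:
                last_error = error
                if attempt < max_retries:
                    print(f"Retrying tool execution (Attempt {attempt + 1})")
                    time.sleep(delay_seconds)
        # Return the failure as text so the caller can choose another action.
        return f"Tool failed after {max_retries + 1} attempts: {last_error}"

    return wrapper
```

A wrapped callable such as `with_retries(search.run)` could then be passed as the `func` of a tool, so that retries happen inside the tool call and the agent only sees the final result or the failure message.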

In [109]:
# The error handler
error_handler = RetryCallbackHandler()

In [110]:
# Create tools
search = TavilySearchResults()

tool_search_item = Tool(
    name="search internet",
    description="Searches the internet for the documentation and examples in order to answer a query",
    func=search.run,
    handle_tool_error=True
)

# List of all tools
tools = [tool_search_item]

In [116]:

# Instantiate agents
agent = create_react_agent(mixtral, tools, prompt_template_two)

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    return_intermediate_steps=True,
    handle_parsing_errors=True,
    callbacks=[error_handler],
)


**Run the pipeline**

In [113]:
query = "How do I build a retrieval augmented system over mmy documents"

In [117]:
agent_executor.invoke({"input": query})




> Entering new AgentExecutor chain...
To build a retrieval augmented system over my documents, I need to understand the basics of retrieval augmented systems and how they work with documents. I will search the internet for relevant documentation and tutorials on retrieval augmented systems and their application to documents.

Action: search internet
Action Input: "retrieval augmented systems over documents"

Observation: [{'url': 'https://mallahyari.github.io/rag-ebook/03_prepare_data.html', 'content': 'Step 1: Install necessary libraries\nStep 2: Load the pdf file and extract the text from it\nCode below will iterate over the pages of the pdf file, extract the text and add it to the documents list object, see Figure\xa03.1.\n Here’s the code demonstraing the implementation, Figure\xa03.10:\nFigure\xa03.11 shows how to use load_qa_with_sources_chain:\nSimilarly, if we use RetrievalQA, we will have Figure\xa03.12:\nAnd here’s the code when we use Re

{'input': 'How do I build a retrieval augmented system over mmy documents',
 'output': "To build a retrieval augmented system over your documents, follow these steps:\n\n1. Install necessary libraries:\n\nBefore you can harness the power of Large Language Models (LLMs) and particularly RAG method for question answering over PDF documents, it's essential to prepare your data. To ensure the accuracy of question answering, perform text cleanup and normalization.\n\n2. Load the pdf file and extract the text from it:\n\nThe following code will iterate over the pages of the pdf file, extract the text and add it to the documents list object:\n\n```python\ndocuments = []\nwith pdfplumber.open(pdf_file) as pdf:\n    for page in pdf.pages:\n        documents.append(page.extract_text())\n```\n\n3. Perform text cleanup and normalization:\n\nThis step involves cleaning and normalizing the text extracted from the PDF documents.\n\n4. Implement language detection algorithms to identify the language u
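
Because the executor is created with `return_intermediate_steps=True`, the returned dictionary also carries an `intermediate_steps` list of (action, observation) pairs next to `input` and `output`. The sketch below shows one way to summarize such a result. It runs on a hypothetical stand-in: `FakeAction`, `summarize_run` and the `demo` data are illustrative and only mirror the `tool` and `tool_input` attributes of a real agent action.

```python
from typing import Any, Dict, List, NamedTuple


class FakeAction(NamedTuple):
    """Stand-in exposing the `tool` and `tool_input` attributes of an agent action."""
    tool: str
    tool_input: str


def summarize_run(result: Dict[str, Any], max_chars: int = 80) -> List[str]:
    """List each tool call with a truncated observation, then the final answer."""
    lines = []
    for action, observation in result.get("intermediate_steps", []):
        preview = str(observation)[:max_chars]
        lines.append(f"{action.tool}({action.tool_input!r}) -> {preview}")
    lines.append(f"Final answer: {result['output'][:max_chars]}")
    return lines


# Hypothetical result, shaped like the dictionary returned by the executor.
demo = {
    "input": "How do I build a retrieval augmented system?",
    "output": "To build a retrieval augmented system over your documents, follow these steps: ...",
    "intermediate_steps": [
        (
            FakeAction("search internet", "retrieval augmented systems over documents"),
            [{"url": "https://example.com", "content": "Step 1: Install necessary libraries"}],
        ),
    ],
}

for line in summarize_run(demo):
    print(line)
```

With a real run, the same function could be applied to the value returned by `agent_executor.invoke(...)` to get a compact trace instead of the full verbose log.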