# [GPT-Assig10] "Sequential Multi Agent"
---------------
### Lecture12: Agents
* In a new Jupyter notebook, create a research AI agent and give it custom tools.
* The agent should be able to do the following tasks:
    * Search in Wikipedia
    * Search in DuckDuckGo
    * Scrape and extract the text of any website.
    * Save the research to a .txt file
> Run the agent with this query: "Research about the XZ backdoor". The agent should try to search in Wikipedia or DuckDuckGo. If it finds a website in DuckDuckGo, it should open the website and extract its content, then finish by saving the research to a .txt file.



In [1]:
import time
from typing import Type
from langchain.chat_models import ChatOpenAI
from langchain.schema import SystemMessage
from langchain.tools import BaseTool
from pydantic import BaseModel, Field
from langchain.agents import initialize_agent, AgentType
from langchain.utilities import DuckDuckGoSearchAPIWrapper, WikipediaAPIWrapper
from langchain.tools import DuckDuckGoSearchResults
from langchain.document_loaders import AsyncHtmlLoader
from langchain.document_transformers import Html2TextTransformer
from langchain.prompts import PromptTemplate
from langchain.schema.runnable import RunnablePassthrough

llm = ChatOpenAI(temperature=0.1, model_name="gpt-3.5-turbo-0125")


# Search
class SearchQueryArgsSchema(BaseModel):
    query: str = Field(
        description="The query you will search for. Example query: Research about the XZ backdoor"
    )


class DuckDuckGoSearchTool(BaseTool):
    name = "DuckDuckGoSearchTool"
    description = """
    Use this tool to search and find THE MOST relevant website in DuckDuckGo.
    You should return ONLY ONE link.
    It takes a query as an argument.
    """
    args_schema: Type[SearchQueryArgsSchema] = SearchQueryArgsSchema

    def _run(self, query):
        wrapper = DuckDuckGoSearchAPIWrapper(max_results=1)
        search = DuckDuckGoSearchResults(api_wrapper=wrapper)
        return search.run(query)


class WikipediaSearchTool(BaseTool):
    name = "WikipediaSearchTool"
    description = """
    Use this tool to research a topic on Wikipedia.
    It takes a query as an argument.
    """
    args_schema: Type[SearchQueryArgsSchema] = SearchQueryArgsSchema

    def _run(self, query):
        wkp = WikipediaAPIWrapper()
        return wkp.run(query)




# Web scraping
class ScrappingWebsiteToolArgsSchema(BaseModel):
    link: str = Field(
        description="Just one link to scrape. Example link: https://duckduckgo.com/c/XZ_backdoor"
    )

class ScrappingWebsiteTool(BaseTool):
    name = "ScrappingWebsiteTool"
    description = """
    Use this tool to enter the website link and extract its content.
    It takes a link as an argument.
    """
    args_schema: Type[ScrappingWebsiteToolArgsSchema] = ScrappingWebsiteToolArgsSchema

    def _run(self, link):
        loader = AsyncHtmlLoader(link)
        docs = loader.load()
        html2text = Html2TextTransformer()
        docs = html2text.transform_documents(docs)
        return docs



# Save to .txt file
class SaveToTextToolArgsSchema(BaseModel):
    docs: str = Field(
        description="The extracted document you will save to a .txt file."
    )

class SaveToTextFileTool(BaseTool):
    name = "SaveToTextTool"
    description = """
    Use this tool to save the result of the research to a .txt file.
    It takes the document text as an argument.
    """
    args_schema: Type[SaveToTextToolArgsSchema] = SaveToTextToolArgsSchema

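    def _run(self, docs):
        # The file name is the current time (HHMMSS); the target directory must already exist.
        file_path = f"/workspaces/FULLSTACK-GPT/files/{time.strftime('%H%M%S')}.txt"
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(docs)
        # Return a confirmation so the agent can tell that the save succeeded.
        return f"Research saved to {file_path}"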


agent = initialize_agent(
    llm=llm,
    verbose=True,
    agent=AgentType.OPENAI_FUNCTIONS,
    handle_parsing_errors=True,
    tools=[
        DuckDuckGoSearchTool(),
        WikipediaSearchTool(),
        ScrappingWebsiteTool(),
        SaveToTextFileTool()
    ],
    agent_kwargs={
        "system_message": SystemMessage(
            content="""
            You are a helpful research assistant.

            Try to search in Wikipedia or DuckDuckGo.
            If you find a relevant website in DuckDuckGo, open ONLY ONE website and extract its content.
            After that, finish by saving the research to a .txt file.
            
        """)
    }
)
prompt = "Give me the Research about the XZ backdoor. "

agent.invoke(prompt)




> Entering new AgentExecutor chain...
Invoking: `DuckDuckGoSearchTool` with `{'query': 'Research about the XZ backdoor'}`


[0m[36;1m[1;3m[snippet: Getty Images. 207. On Friday, a lone Microsoft developer rocked the world when he revealed a backdoor had been intentionally planted in xz Utils, an open source data compression utility available ..., title: What we know about the xz Utils backdoor that almost infected the world ..., link: https://arstechnica.com/security/2024/04/what-we-know-about-the-xz-utils-backdoor-that-almost-infected-the-world/], [snippet: That operation matches the style of the XZ Utils backdoor far more than the cruder supply chain attacks of APT41 or Lazarus, by comparison. "It could very well be someone else," says Aitel., title: The Mystery of 'Jia Tan,' the XZ Backdoor Mastermind | WIRED, link: https://www.wired.com/story/jia-tan-xz-backdoor/], [snippet: 2024-02-24: Jia Tan tags and builds v5.6.0 and publishes an xz-5.6..tar.gz distri

Fetching pages: 100%|##########| 1/1 [00:01<00:00,  1.40s/it]


[38;5;200m[1;3m[Document(page_content='Skip to main content\n\n  * Biz & IT\n  * Tech\n  * Science\n  * Policy\n  * Cars\n  * Gaming & Culture\n  * Store\n  * Forums\n\nSubscribe\n\nClose\n\n###  Navigate\n\n  * Store\n  * Subscribe\n  * Videos\n  * Features\n  * Reviews\n\n  * RSS Feeds\n  * Mobile Site\n\n  * About Ars\n  * Staff Directory\n  * Contact Us\n\n  * Advertise with Ars\n  * Reprints\n\n###  Filter by topic\n\n  * Biz & IT\n  * Tech\n  * Science\n  * Policy\n  * Cars\n  * Gaming & Culture\n  * Store\n  * Forums\n\n###  Settings\n\nFront page layout\n\n  \nGrid\n\n  \nList\n\nSite theme\n\nlight\n\ndark\n\nSign in\n\n####  NIGHTMARE SUPPLY CHAIN ATTACK SCENARIO --\n\n# What we know about the xz Utils backdoor that almost infected the world\n\n## Malicious updates made to a ubiquitous tool were a few weeks away from\ngoing mainstream.\n\nDan Goodin \\- Apr 1, 2024 6:55 am UTC\n\nEnlarge\n\nGetty Images\n\n#### reader comments\n\n209\n\nOn Friday, a lone Microsoft developer

{'input': 'Give me the Research about the XZ backdoor. ',
 'output': 'I have extracted the research about the XZ backdoor from the website. The document has been saved successfully. If you would like to access the information, please let me know.'}
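
In the log above, the DuckDuckGo tool returns one string of bracketed `snippet / title / link` entries, and the model is left to pick a URL out of it. The sketch below shows one way to pull the links out deterministically before scraping. It assumes the output keeps the format seen in this log; `extract_links` and `first_link` are hypothetical helpers, not part of LangChain.

```python
import re
from typing import List, Optional

# Assumed format, taken from the log above:
# "[snippet: ..., title: ..., link: https://...], [snippet: ..., ...]"
# The URL is the last field of each entry, so it ends at "]" or whitespace.
LINK_PATTERN = re.compile(r"link:\s*(https?://[^\]\s]+)")


def extract_links(search_output: str) -> List[str]:
    """Return every URL that follows a 'link:' label, in order of appearance."""
    return LINK_PATTERN.findall(search_output)


def first_link(search_output: str) -> Optional[str]:
    """Return the first URL in the search output, or None if there is none."""
    links = extract_links(search_output)
    return links[0] if links else None
```

A helper like this could be called inside `DuckDuckGoSearchTool._run` so the tool hands back a single URL instead of the full result string, which would match the "ONLY ONE website" instruction in the system message more reliably.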

In [53]:
# prompt = PromptTemplate.from_template(
#         """    
#         The agent should try to search in Wikipedia or DuckDuckGo. 
#         If it finds a website in DuckDuckGo it should enter the website and extract its content.
#         After that, it should finish by saving the research to a .txt file.
        
#         query: {query}    
#         """,
#     )


# chain = {"query": RunnablePassthrough()} | prompt | agent

# query = "Research about the XZ backdoor"

# agent.invoke(query)
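
The save step used by `SaveToTextFileTool` in the first cell can also be written as a plain function, which makes it easy to try without running the agent. This is a minimal sketch: `save_research` and its `directory` parameter are hypothetical names, and creating the directory on demand is an assumption rather than something the notebook does.

```python
import time
from pathlib import Path


def save_research(text: str, directory: str) -> str:
    """Write text to <directory>/<HHMMSS>.txt and return a confirmation message."""
    target_dir = Path(directory)
    # Assumption: create the directory if it is missing instead of failing.
    target_dir.mkdir(parents=True, exist_ok=True)
    # Same naming scheme as the notebook: the current time as HHMMSS.
    file_path = target_dir / f"{time.strftime('%H%M%S')}.txt"
    file_path.write_text(text, encoding="utf-8")
    return f"Research saved to {file_path}"
```

Returning the message gives the agent an observation to base its final answer on. Note that two saves within the same second would write to the same file name.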