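In a new Jupyter notebook, build a research AI agent and give it custom tools.
The agent should be able to carry out the following:

- Create the agent
- Steps when the agent runs
    - Query: Research about the XZ backdoor
    - Search Wikipedia
    - Scrape the website and extract its text
    - Save the research results to a .txt file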

## researchGPT design
- Create the agent: initialize_agent
    - verbose=True,
    - handle_parsing_errors=True,
    - llm=llm,
    - agent=AgentType.OPENAI_FUNCTIONS,
    - tools=[
        query: Research about the XZ backdoor
        1) Search for the query on DuckDuckGo
        2) Scrape the resulting page
        3) Save the text to a file
    ]
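
The three tool steps in the design can be sketched as a plain function pipeline, which makes the intended order (find a URL, scrape it, save the text) easy to check before an LLM is involved. This is only an illustrative sketch: `run_research_pipeline` and the stand-in functions are hypothetical names, not part of the notebook or of LangChain, and the real agent decides the call order itself.

```python
from typing import Callable, Optional


def run_research_pipeline(
    query: str,
    find_url: Callable[[str], Optional[str]],
    scrape: Callable[[str], str],
    save: Callable[[str], str],
) -> str:
    """Chain the three steps: find a URL, scrape it, save the text."""
    url = find_url(query)
    if url is None:
        # Nothing to scrape, so report that instead of passing None along.
        return f"No page found for query: {query!r}"
    content = scrape(url)
    return save(content)


# Hypothetical stand-ins so the sketch runs without network or disk access.
saved = []


def fake_save(text: str) -> str:
    saved.append(text)
    return "saved"


result = run_research_pipeline(
    "XZ backdoor",
    find_url=lambda q: "https://en.wikipedia.org/wiki/XZ_Utils_backdoor",
    scrape=lambda url: f"text scraped from {url}",
    save=fake_save,
)
print(result)
```

Passing the steps in as callables keeps the order testable in isolation; in the notebook each step is instead exposed to the agent as a separate tool.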

In [44]:
# Custom tools: look up the query on Wikipedia, scrape the page, save the text
from typing import Type
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langchain.tools import BaseTool
from pydantic import BaseModel, Field
import requests
from bs4 import BeautifulSoup as bs
import wikipediaapi


class QueryReasearchUrlArgsSchema(BaseModel):
    query: str = Field(description="Wikipedia page title or topic to look up, e.g. 'XZ backdoor'")

class QueryReasearchUrlTool(BaseTool):
    name = "QueryResearchUrl"
    description = "Tool to find the Wikipedia URL for a query"
    args_schema: Type[QueryReasearchUrlArgsSchema] = QueryReasearchUrlArgsSchema

    def _run(self, query):        
        wiki = wikipediaapi.Wikipedia(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"
        )
        page = wiki.page(query)
        if page.exists():
            return page.fullurl
        else:
            return None

class UrlContentScrappingArgsSchema(BaseModel):
    url:str=Field(description="URL from QueryReasearchUrlTool")

class UrlContentScrappingTool(BaseTool):
    name = "UrlContentScrappingTool"
    description = "A tool that accesses URLs and scrapes content"
    args_schema: Type[UrlContentScrappingArgsSchema]=UrlContentScrappingArgsSchema
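    def _run(self, url):
        # Fetch the page; the timeout keeps a stalled request from hanging the agent.
        res = requests.get(url, timeout=10)
        soup = bs(res.text, 'html.parser')
        # Keep only the paragraph text and join it with newlines.
        paragraphs = soup.find_all('p')
        content = "\n".join([para.text for para in paragraphs])
        return content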
    
class SaveToTextFileArgsSchema(BaseModel):
    content:str
    
class SaveToTextFileTool(BaseTool):
    name = "SaveToFileTool"
    description = "Save the content returned with UrlContentScrappingTool as a file."
    args_schema:Type[SaveToTextFileArgsSchema]=SaveToTextFileArgsSchema
    
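    def _run(self, content):
        filename = 'research.txt'
        # The ./files directory must already exist; UTF-8 keeps non-ASCII text intact.
        with open(f'./files/{filename}', 'w', encoding='utf-8') as f:
            f.write(content)
        return f"Saved to {filename}"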
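
`SaveToTextFileTool` writes to `./files/research.txt`, so `open` raises `FileNotFoundError` when the `files` directory is missing. One possible way to guard against that is sketched below with `pathlib`; `save_text` is a hypothetical helper name, not something the notebook defines, and the temporary directory is used only to keep the demonstration self-contained.

```python
import tempfile
from pathlib import Path


def save_text(content: str, directory: str = "./files", filename: str = "research.txt") -> Path:
    """Write content to directory/filename, creating the directory if needed."""
    target_dir = Path(directory)
    # parents=True builds intermediate folders; exist_ok=True tolerates reruns.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_text(content, encoding="utf-8")
    return target


# Demonstrate inside a throwaway directory so no real files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    path = save_text("XZ backdoor notes", directory=f"{tmp}/files")
    round_trip = path.read_text(encoding="utf-8")
```

The body of the tool's `_run` method could call a helper like this and return a short confirmation string for the agent.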

In [45]:
# Define the agent
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI


llm = ChatOpenAI(temperature=0.1)

research_agent = initialize_agent(
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    handle_parsing_errors=True,
    tools=[
        QueryReasearchUrlTool(),
        UrlContentScrappingTool(),
        SaveToTextFileTool()
    ]
)
prompt = """
    Research about the XZ backdoor.
"""
research_agent.invoke(prompt)



> Entering new AgentExecutor chain...

Invoking: `QueryResearchUrl` with `{'query': 'XZ backdoor'}`

https://en.wikipedia.org/wiki/XZ_Utils_backdoor

Invoking: `UrlContentScrappingTool` with `{'url': 'https://en.wikipedia.org/wiki/XZ_Utils_backdoor'}`

In February 2024, a malicious backdoor was introduced to the Linux utility xz within the liblzma library in versions 5.6.0 and 5.6.1 by an account using the name "Jia Tan".[b][2] The backdoor gives an attacker who possesses a specific Ed448 private key remote code execution capabilities on the affected Linux system. The issue has been given the Common Vulnerabilities and Exposures number CVE-2024-3094 and has been assigned a CVSS score of 10.0, the highest possible score.[3][4]

While xz is commonly present in most Linux distributions, at the time of discovery the backdoored version had not yet been widely deployed to production systems, but was present in devel

{'input': '\n    Research about the XZ backdoor.\n',
 'output': 'I have saved the research about the XZ backdoor to a file named "research.txt". If you would like to access the content, please let me know!'}
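
The scraped text in the output above still carries Wikipedia citation markers such as `[b][2]` and `[3][4]`, which end up in `research.txt`. If cleaner text is wanted, a small post-processing step could strip them before saving. The sketch below is one assumed approach using a regular expression; `strip_citation_markers` is a hypothetical helper, not part of the notebook.

```python
import re

# Matches bracketed numeric or single-lowercase-letter markers such as [2] or [b].
CITATION_MARKER = re.compile(r"\[(?:\d+|[a-z])\]")


def strip_citation_markers(text: str) -> str:
    """Remove Wikipedia-style footnote markers from scraped text."""
    return CITATION_MARKER.sub("", text)


sample = 'by an account using the name "Jia Tan".[b][2] The backdoor'
cleaned = strip_citation_markers(sample)
print(cleaned)
```

The pattern is deliberately narrow, but it would also remove legitimate bracketed text of the same shape (for example `[i]` in a code snippet), so it suits prose pages better than pages with inline code.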