In [1]:
"""
1. Search Wikipedia
2. Search DuckDuckGo
3. Scrape a website and extract its text
4. Save the research results to a .txt file
Run the agent with the query "Research about the XZ backdoor". The agent should try searching Wikipedia or DuckDuckGo; if it finds a website through DuckDuckGo, it should open that website and extract its content, then finish by saving the research to a .txt file.
"""

from typing import Type
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import WikipediaQueryRun, BaseTool, DuckDuckGoSearchResults
from langchain.utilities import WikipediaAPIWrapper
from langchain.schema import SystemMessage
from pydantic import BaseModel, Field


class SearchToolArgsSchema(BaseModel):
    keyword: str = Field(description="The keyword you will search for.")


class CrawlToolArgsSchema(BaseModel):
    url: str = Field(description="The URL of the website to crawl.")


class WriteTxtToolArgsSchema(BaseModel):
    information: str = Field(description="The information to be written into the txt file.")


class WikipediaSearchTool(BaseTool):
    name = "WikipediaSearch"
    description = """
    Use this tool to find information about a keyword on Wikipedia.
    It takes a keyword as an argument.
    """
    args_schema: Type[SearchToolArgsSchema] = SearchToolArgsSchema

    def _run(self, keyword):
        wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
        return wikipedia.run(keyword)


class DuckDuckGoSearchTool(BaseTool):
    name = "DuckDuckGoSearch"
    description = """
    Use this tool to find websites about a keyword on DuckDuckGo.
    It takes a keyword as an argument.
    """
    args_schema: Type[SearchToolArgsSchema] = SearchToolArgsSchema

    def _run(self, keyword):
        ddg = DuckDuckGoSearchResults()
        return ddg.run(keyword)

class WebsiteCrawlTool(BaseTool):
    name = "WebsiteCrawl"
    description = """
    Use this tool to find information on a website.
    It takes the address of the website as an argument.
    """
    args_schema: Type[CrawlToolArgsSchema] = CrawlToolArgsSchema

    def _run(self, url):
        # Placeholder: returns the URL unchanged; no page content is fetched.
        return url

class WriteTxtTool(BaseTool):
    name = "WriteTxt"
    description = """
    Use this tool to write information into a txt file.
    It takes the text to save as the "information" argument.
    """
    # BaseTool reads args_schema to validate the tool's input.
    args_schema: Type[WriteTxtToolArgsSchema] = WriteTxtToolArgsSchema

    def _run(self, information):
        # The output path is fixed, so each call overwrites the previous result.
        path = "result.txt"
        with open(path, "w", encoding="utf-8") as f:
            f.write(information)
        return path

llm = ChatOpenAI(temperature=0.1)


agent = initialize_agent(
    llm=llm,
    verbose=True,
    agent=AgentType.OPENAI_FUNCTIONS,
    handle_parsing_errors=True,
    tools=[
        WikipediaSearchTool(),
        DuckDuckGoSearchTool(),
        WebsiteCrawlTool(),
        WriteTxtTool(),
    ],
    agent_kwargs={
        "system_message": SystemMessage(
            content="""
            You are an information collector.

            Find information about the given keyword.

            Search Wikipedia or DuckDuckGo.

            If you used DuckDuckGo, crawl the information from each website.

            Finally, write the information into a txt file named after the keyword.
            """
        )
    }
)

agent.invoke("Research about the XZ backdoor")



> Entering new AgentExecutor chain...
Invoking: `WikipediaSearch` with `{'keyword': 'XZ backdoor'}`


[0m[36;1m[1;3mPage: XZ Utils backdoor
Summary: On 29 March 2024, software developer Andres Freund reported that he had found a maliciously introduced backdoor in the Linux utility xz within the liblzma library in versions 5.6.0 and 5.6.1 released in February 2024.While xz is commonly present in most Linux distributions, the backdoor only targeted Debian- and RPM-based systems running on the x86-64 architecture. At the time of discovery the backdoored version had not yet been widely deployed to production systems, but was present in development versions of major distributions.The backdoor gives an attacker who possesses a specific Ed448 private key remote code execution capabilities on the affected Linux system. The issue has been assigned a CVSS score of 10.0, the highest possible score.

Page: XZ Utils
Summary: XZ Utils (previously LZMA Utils) is a set of fr

{'input': 'Research about the XZ backdoor',
 'output': 'I have gathered information about the XZ backdoor and related topics. The information has been saved in a text file named "XZ backdoor.txt".'}

In [2]:
!cat result.txt

XZ backdoor

On 29 March 2024, software developer Andres Freund reported that he had found a maliciously introduced backdoor in the Linux utility xz within the liblzma library in versions 5.6.0 and 5.6.1 released in February 2024. While xz is commonly present in most Linux distributions, the backdoor only targeted Debian- and RPM-based systems running on the x86-64 architecture. At the time of discovery, the backdoored version had not yet been widely deployed to production systems but was present in development versions of major distributions. The backdoor gives an attacker who possesses a specific Ed448 private key remote code execution capabilities on the affected Linux system. The issue has been assigned a CVSS score of 10.0, the highest possible score.

XZ Utils (previously LZMA Utils) is a set of free software command-line lossless data compressors, including the programs lzma and xz, for Unix-like operating systems and, from version 5.0 onwards, Microsoft Windows. For compression
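Note that the agent's final answer mentions a file named "XZ backdoor.txt", while WriteTxtTool always writes to result.txt. A minimal sketch of how a file name could be derived from the keyword follows. The helpers `keyword_to_filename` and `write_research` are hypothetical and not part of the notebook or of LangChain, and the rule for replacing characters is just one possible choice.

```python
import re
from pathlib import Path


def keyword_to_filename(keyword: str) -> str:
    """Turn a search keyword into a safe .txt file name (hypothetical helper)."""
    # Replace every run of characters other than letters, digits, '-' and '_'
    # with a single underscore, then trim underscores from both ends.
    slug = re.sub(r"[^A-Za-z0-9_-]+", "_", keyword).strip("_")
    # Fall back to the notebook's fixed name when nothing usable is left.
    return f"{slug or 'result'}.txt"


def write_research(keyword: str, information: str, directory: str = ".") -> str:
    """Write the text to a keyword-derived file and return its path."""
    path = Path(directory) / keyword_to_filename(keyword)
    path.write_text(information, encoding="utf-8")
    return str(path)
```

To use something like this from the agent, the tool's argument schema would also need a keyword field so the model can pass it along with the information.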

In [5]:
agent.invoke("Research about the XZ backdoor in DuckDuckGo")




> Entering new AgentExecutor chain...
Invoking: `DuckDuckGoSearch` with `{'keyword': 'XZ backdoor'}`


[0m[33;1m[1;3m[snippet: In February of this year, Tan issued commits for versions 5.6.0 and 5.6.1 of XZ Utils. The updates implemented the backdoor. In the following weeks, Tan or others appealed to developers of Ubuntu ..., title: The XZ Backdoor: Everything You Need to Know | WIRED, link: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/], [snippet: On March 28, 2024 a backdoor was identified in XZ Utils. This vulnerability, CVE-2024-3094 with a CVSS score of 10 is a result of a software supply chain compromise impacting versions 5.6.0 and 5.6.1 of XZ Utils. The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has recommended organizations to downgrade to a previous non-compromised XZ Utils version., title: Microsoft FAQ and guidance for XZ Utils backdoor, link: https://techcommunity.microsoft.com/t5/microsoft-defender-vulner

{'input': 'Research about the XZ backdoor in DuckDuckGo',
 'output': 'The information about the XZ backdoor has been successfully written to a text file named "XZ backdoor.txt".'}
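The DuckDuckGo output above is a single string of `[snippet: ..., title: ..., link: ...]` entries, so the URLs have to be pulled out before any page can be crawled. The sketch below does this with a regular expression. It assumes the string format shown in this run (other versions of the tool may format results differently), and `extract_links` is a hypothetical helper, not a LangChain function.

```python
import re
from typing import List


def extract_links(results: str) -> List[str]:
    """Return the URLs that follow 'link: ' in the search results string."""
    # Each entry looks like "[snippet: ..., title: ..., link: https://...]".
    # A URL is taken to end at whitespace, a comma, or the closing bracket.
    return re.findall(r"link:\s*(https?://[^\s\],]+)", results)


sample = (
    "[snippet: In February of this year, Tan issued commits ..., "
    "title: The XZ Backdoor: Everything You Need to Know | WIRED, "
    "link: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/]"
)
print(extract_links(sample))
```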

In [6]:
!cat result.txt

XZ backdoor: In February of this year, Tan issued commits for versions 5.6.0 and 5.6.1 of XZ Utils. The updates implemented the backdoor. In the following weeks, Tan or others appealed to developers of Ubuntu ... Source: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/
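The saved file contains only the DuckDuckGo snippet because `WebsiteCrawlTool._run` returns the URL it receives rather than the page text. As a rough sketch of what text extraction could look like, the code below uses only the Python standard library. The names `html_to_text` and `fetch_text` are hypothetical, the User-Agent header is an arbitrary choice, and real pages may need a more robust parser.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one text chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its visible text (performs a network request)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))
```

A crawl tool could return `fetch_text(url)` from `_run`. Long pages would probably need truncation before being handed back to the model.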